Compare commits

...

9134 Commits

Author SHA1 Message Date
683c54c999 Git 2.49
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-14 09:19:41 -07:00
c9d3534de3 Merge tag 'l10n-2.49.0-rnd1' of https://github.com/git-l10n/git-po
l10n-2.49.0-rnd1

* tag 'l10n-2.49.0-rnd1' of https://github.com/git-l10n/git-po:
  l10n: zh_TW: Git 2.49.0 round 1
  l10n: update German translation
  l10n: po-id for 2.49
  l10n: zh_CN: updated translation for 2.49
  l10n: uk: add 2.49 translation
  l10n: tr: Update Turkish translations for 2.49.0
  l10n: ko: fix minor typo in Korean translation
  l10n: it: fix spelling of "sorgente" (Italian for "source")
  l10n: sv.po: Fix Swedish typos
  l10n: sv.po: Update Swedish translation
  l10n: fr: 2.49 round 2
  l10n: bg.po: Updated Bulgarian translation (5836t)
  l10n: Updated translation for vi-2.49
2025-03-13 10:20:33 -07:00
ab7cb7e263 Merge branch 'l10n/zh-TW/2025-03-09' of github.com:l10n-tw/git-po
* 'l10n/zh-TW/2025-03-09' of github.com:l10n-tw/git-po:
  l10n: zh_TW: Git 2.49.0 round 1
2025-03-13 21:57:56 +08:00
7bc205bec2 l10n: zh_TW: Git 2.49.0 round 1
Co-authored-by: Lumynous <lumynou5.tw@gmail.com>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2025-03-13 21:53:11 +08:00
c64eec3400 Merge branch 'l10n-de-2.49' of github.com:ralfth/git
* 'l10n-de-2.49' of github.com:ralfth/git:
  l10n: update German translation
2025-03-13 14:15:38 +08:00
9db5ab6f6c l10n: update German translation
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2025-03-13 07:03:42 +01:00
ab00724389 l10n: po-id for 2.49
Update following components:

  * builtin/clone.c
  * builtin/commit.c
  * builtin/fetch.c
  * builtin/index-pack.c
  * builtin/pack-objects.c
  * builtin/refs.c
  * builtin/repack.c
  * builtin/unpack-objects.c
  * command-list.h
  * diff.c
  * object-file.c
  * parse-options.c
  * promisor-remote.c
  * refspec.c
  * remote.c

Translate following new components:

  * path-walk.c
  * builtin/backfill.c
  * t/helper/test-path-walk.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2025-03-13 08:21:11 +08:00
4b68faf6b9 A bit more updates after -rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12 12:06:58 -07:00
a867909543 Merge branch 'pb/doc-follow-remote-head'
Doc updates.

* pb/doc-follow-remote-head:
  config/remote.txt: improve wording for 'remote.<name>.followRemoteHEAD'
  config/remote.txt: reunite 'severOption' description paragraphs
2025-03-12 12:06:58 -07:00
870c74987b Merge branch 'tc/zlib-ng-fix'
"git version --build-options" stopped showing zlib version by
mistake due to recent refactoring, which has been corrected.

* tc/zlib-ng-fix:
  help: print zlib-ng version number
  help: include git-zlib.h to print zlib version
2025-03-12 12:06:58 -07:00
066590497e Merge branch 'ma/clone-doc-markup-fix'
Doc markup fix.

* ma/clone-doc-markup-fix:
  git-clone doc: fix indentation
2025-03-12 12:06:57 -07:00
4d53aae14b Merge branch 'tl/zh_CN_2.49.0_rnd' of github.com:dyrone/git
* 'tl/zh_CN_2.49.0_rnd' of github.com:dyrone/git:
  l10n: zh_CN: updated translation for 2.49
2025-03-12 19:36:40 +08:00
ed99a5d9b8 l10n: zh_CN: updated translation for 2.49
Helped-by: 依云 <lilydjwg@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
2025-03-12 14:52:52 +08:00
2bd71e1c16 Merge branch '2.49-uk-update' of github.com:arkid15r
* '2.49-uk-update' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: uk: add 2.49 translation
2025-03-12 11:10:40 +08:00
5b75ad9ee8 l10n: uk: add 2.49 translation
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Co-authored-by: Mikhail T. <Mikhail.Teterin@BNY.com>
Co-authored-by: Tamara Lazerka <lazerkatamara@gmail.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Mikhail T. <Mikhail.Teterin@BNY.com>
Signed-off-by: Tamara Lazerka <lazerkatamara@gmail.com>
2025-03-11 19:48:31 -07:00
f17f45f387 l10n: tr: Update Turkish translations for 2.49.0
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2025-03-11 15:05:57 +03:00
00cbbbe90a Merge branch 'vi-2.49' of github.com:Nekosha/git-po
* 'vi-2.49' of github.com:Nekosha/git-po:
  l10n: Updated translation for vi-2.49
2025-03-11 07:35:07 +08:00
b50b68dfd4 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5836t)
2025-03-11 07:33:18 +08:00
aa77e3afef Merge branch 'fr_v2.49' of github.com:jnavila/git
* 'fr_v2.49' of github.com:jnavila/git:
  l10n: fr: 2.49 round 2
2025-03-11 07:23:32 +08:00
2d8902bb24 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Fix Swedish typos
  l10n: sv.po: Update Swedish translation
2025-03-11 07:22:07 +08:00
ec507acbfd l10n: ko: fix minor typo in Korean translation
Signed-off-by: seoyeon-kwon <seoyeon.kwon@navercorp.com>
2025-03-11 07:20:03 +08:00
ee01097f28 l10n: it: fix spelling of "sorgente" (Italian for "source")
Signed-off-by: Ruggero Turra <ruggero.turra@cern.ch>
2025-03-11 07:16:20 +08:00
83b278ef74 git-clone doc: fix indentation
Commit bc26f7690a (clone: make it possible to specify --tags,
2025-02-06) added a new paragraph in the middle of this list item. By
adding an empty line rather than using a list continuation, we broke the
list continuation, with the new paragraph ending up funnily indented.

Restore the chain of list continuations.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 09:55:00 -07:00
7d5a9e6b99 l10n: sv.po: Fix Swedish typos
Signed-off-by: Tuomas Ahola <taahol@utu.fi>
2025-03-10 17:48:42 +01:00
6167370b87 l10n: sv.po: Update Swedish translation
- Update for 2.49.0.
- Fix numerous typos found by spelling checker.
- Fix more straight quotes.
- Harmonize translation of "blob" (to "blob", not "blobb").
- Harmonize translation of "reflog" (to "referenslogg").

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2025-03-10 17:48:34 +01:00
87a0bdbf0f Git 2.49-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 08:47:08 -07:00
5d55ad01f5 Merge branch 'tb/fetch-follow-tags-fix'
* tb/fetch-follow-tags-fix:
  fetch: fix following tags when fetching specific OID
2025-03-10 08:45:58 -07:00
bd52d9a058 fetch: fix following tags when fetching specific OID
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22),
unconditionally adds "HEAD" to the list of ref prefixes we send to the
server.

This breaks a core assumption that the list of prefixes we send to the
server is complete. We must either send all prefixes we care about, or
none at all (in the latter case the server then advertises everything).

The tag following code is careful to only add "refs/tags/" to the list
of prefixes if there are already entries in the prefix list. But because
the new code from 3f763ddf28 runs after the tag code, and because it
unconditionally adds to the prefix list, we may end up with a prefix
list that _should_ have "refs/tags/" in it, but doesn't.

When that is the case, the server does not advertise any tags, and our
auto-following breaks because we never learned about any tags in the
first place.

Fix this by only adding "HEAD" to the ref prefixes when we know that we
are already limiting the advertisement. In either case we'll learn about
HEAD (either through the limited advertisement, or implicitly through a
full advertisement).

Reported-by: Igor Todorovski <itodorov@ca.ibm.com>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07 16:15:18 -08:00
2b1e0f8cd5 help: print zlib-ng version number
When building against zlib-ng, the header file `zlib.h` is not included,
but `zlib-ng.h` is included instead. It's `zlib.h` that defines
`ZLIB_VERSION` and that macro is used to print out zlib version in
`git-version(1)` with `--build-options`. But when it's not defined, no
version is printed.

`zlib-ng.h` defines another macro: `ZLIBNG_VERSION`. Use that macro to
print the zlib-ng version in `git version --build-options` when it's
set. Otherwise fallback to `ZLIB_VERSION`.

Signed-off-by: Toon Claes <toon@iotcl.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07 12:23:30 -08:00
49d9cd8dea help: include git-zlib.h to print zlib version
In 41f1a8435a (git-compat-util: move include of "compat/zlib.h" into
"git-zlib.h", 2025-01-28) some code was refactored to enable easier
linking against zlib-ng.

This removed `zlib.h` being indirectly included in `help.c`. As this
file uses `ZLIB_VERSION` to print the version number of zlib when
running git-version(1) with `--build-options`, this resulted in a
regression.

Include `git-zlib.h` directly into `help.c` to print zlib version
information. This brings back the zlib version in the output of
`git version --build-options`.

Signed-off-by: Toon Claes <toon@iotcl.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07 12:23:29 -08:00
a36e024e98 Merge branch 'js/win-2.49-build-fixes'
Hotfix to help building Git-for-Windows.

* js/win-2.49-build-fixes:
  cmake: generalize the handling of the `CLAR_TEST_OBJS` list
  meson: fix sorting
  ident: stop assuming that `gw_gecos` is writable
2025-03-06 14:06:32 -08:00
bc86ef104a Merge branch 'pw/repo-layout-doc-update'
Some future breaking changes would remove certain parts of the
default repository, which were still described even when the
documents were built for the future with WITH_BREAKING_CHANGES.

* pw/repo-layout-doc-update:
  docs: fix repository-layout when building with breaking changes
2025-03-06 14:06:31 -08:00
62c58891e1 Merge branch 'tz/doc-txt-to-adoc-fixes'
Fallouts from recent renaming of documentation files from .txt
suffix to the new .adoc suffix have been corrected.

* tz/doc-txt-to-adoc-fixes: (38 commits)
  xdiff: *.txt -> *.adoc fixes
  unpack-trees.c: *.txt -> *.adoc fixes
  transport.h: *.txt -> *.adoc fixes
  trace2/tr2_sysenv.c: *.txt -> *.adoc fixes
  trace2.h: *.txt -> *.adoc fixes
  t6434: *.txt -> *.adoc fixes
  t6012: *.txt -> *.adoc fixes
  t/helper/test-rot13-filter.c: *.txt -> *.adoc fixes
  simple-ipc.h: *.txt -> *.adoc fixes
  setup.c: *.txt -> *.adoc fixes
  refs.h: *.txt -> *.adoc fixes
  pseudo-merge.h: *.txt -> *.adoc fixes
  parse-options.h: *.txt -> *.adoc fixes
  object-name.c: *.txt -> *.adoc fixes
  list-objects-filter-options.h: *.txt -> *.adoc fixes
  fsck.h: *.txt -> *.adoc fixes
  diffcore.h: *.txt -> *.adoc fixes
  diff.h: *.txt -> *.adoc fixes
  contrib/long-running-filter: *.txt -> *.adoc fixes
  config.c: *.txt -> *.adoc fixes
  ...
2025-03-06 14:06:31 -08:00
9709163687 cmake: generalize the handling of the CLAR_TEST_OBJS list
A late-comer to the v2.49.0 party, `sk/unit-test-oid`, added yet another
array item to `CLAR_TEST_OBJS`, causing the `win+VS build` job to fail
with symptoms like this one:

  unit-tests-lib.lib(u-oid-array.obj) : error LNK2019: unresolved
  external symbol cl_parse_any_oid referenced in function fill_array

This is a similar scenario to the one that forced me to write
8afda42fce (cmake: generalize the handling of the `UNIT_TEST_OBJS`
list, 2024-09-18): The hard-coded echo of `CLAR_TEST_OBJS` in
`CMakeLists.txt` that recapitulates faithfully what was already
hard-coded in `Makefile` would either have to be updated whack-a-mole
style, or generalized.

Just like I chose the latter option for `UNIT_TEST_OBJS`, I now do the
same for `CLAR_TEST_OBJS`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-06 08:35:08 -08:00
31761f3911 meson: fix sorting
In 904339edbd (Introduce support for the Meson build system,
2024-12-06) the `meson.build` file was introduced, adding also a
Windows-specific list of source files. This list was obviously meant to
be sorted alphabetically, but there is one mistake. Let's fix that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-06 08:35:07 -08:00
4478ad37a7 ident: stop assuming that gw_gecos is writable
In 590e081dea (ident: add NO_GECOS_IN_PWENT for systems without
pw_gecos in struct passwd, 2011-05-19), code was introduced to iterate
over the `gw_gecos` field; The loop variable is of type `char *`, which
assumes that `gw_gecos` is writable.

However, it is not necessarily writable (and it is a bad idea to have it
writable in the first place), so let's switch the loop variable type to
`const char *`.

This is not a new problem, but what is new is the Meson build. While it
does not trigger in CI builds, imitating the commands of
`ci/run-build-and-tests.sh` in a regular Git for Windows SDK (`meson
setup build . --fatal-meson-warnings --warnlevel 2 --werror --wrap-mode
nofallback -Dfuzzers=true` followed by `meson compile -C build --`
results in this beautiful error:

  "cc" [...] -o libgit.a.p/ident.c.obj "-c" ../ident.c
  ../ident.c: In function 'copy_gecos':
  ../ident.c:68:18: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
     68 |         for (src = get_gecos(w); *src && *src != ','; src++) {
        |                  ^
  cc1.exe: all warnings being treated as errors

Now, why does this not trigger in CI? The answer is as simple as it is
puzzling: The `win+Meson` job completely side-steps Git for Windows'
development environment, opting instead to use the GCC that is on the
`PATH` in GitHub-hosted `windows-latest` runners. That GCC is pinned to
v12.2.0 and targets the UCRT (unlikely to change any time soon, see
https://github.com/actions/runner-images/blob/win25/20250303.1/images/windows/toolsets/toolset-2022.json#L132-L141).
That is in stark contrast to Git for Windows, which uses GCC v14.2.0 and
targets MSVCRT. Git for Windows' `Makefile`-based build also obviously
uses different compiler flags, otherwise this compile error would have
had plenty of opportunity in almost 14 years to surface.

In other words, contrary to my expectations, the `win+Meson` job is
ill-equipped to replace the `win build` job because it exercises a
completely different tool version/compiler flags vector than what Git
for Windows needs.

Nevertheless, there is currently this huge push, including breaking
changes after -rc1 and all, for switching to Meson. Therefore, we need
to make it work, somehow, even in Git for Windows' SDK, hence this
patch, at this point in time.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-06 08:35:07 -08:00
60b9a254b6 l10n: fr: 2.49 round 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2025-03-06 16:46:25 +01:00
cc2eb7ece2 l10n: bg.po: Updated Bulgarian translation (5836t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2025-03-06 09:18:38 +01:00
a97ca95160 l10n: Updated translation for vi-2.49
Signed-off-by: Vũ Tiến Hưng <newcomerminecraft@gmail.com>
2025-03-06 12:41:39 +07:00
e969bc8759 A few more after -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-05 10:37:53 -08:00
3dea2ad17d Merge branch 'rs/reftable-reader-new-leakfix'
Leakfix.

* rs/reftable-reader-new-leakfix:
  reftable: release name on reftable_reader_new() error
2025-03-05 10:37:46 -08:00
22fab08fb8 Merge branch 'pw/build-meson-technical-and-howto-docs'
Meson-based build procedure forgot to build some docs, which has
been corrected.

* pw/build-meson-technical-and-howto-docs:
  meson: fix building technical and howto docs
2025-03-05 10:37:45 -08:00
cdf458c60e Merge branch 'kn/ref-migrate-skip-reflog'
Usage string of "git refs" has been corrected.

* kn/ref-migrate-skip-reflog:
  refs: show --no-reflog in the help text
2025-03-05 10:37:45 -08:00
e2334d2f35 Merge branch 'jc/breaking-changes-early-adopter-option'
Doc update.

* jc/breaking-changes-early-adopter-option:
  BreakingChanges: clarify the procedure
2025-03-05 10:37:45 -08:00
3334de6494 Merge branch 'dm/editorconfig-bash-is-like-sh'
The editorconfig file is updated to tell us that bash scripts are
similar to general Bourne shell scripts.

* dm/editorconfig-bash-is-like-sh:
  editorconfig: add .bash extension
2025-03-05 10:37:44 -08:00
2c6fd30198 Merge branch 'cc/lop-remote'
Large-object promisor protocol extension.

* cc/lop-remote:
  doc: add technical design doc for large object promisors
  promisor-remote: check advertised name or URL
  Add 'promisor-remote' capability to protocol v2
2025-03-05 10:37:44 -08:00
6024f321d4 Merge branch 'sk/unit-test-oid'
Convert a few unit tests to the clar framework.

* sk/unit-test-oid:
  t/unit-tests: convert oidtree test to use clar test framework
  t/unit-tests: convert oidmap test to use clar test framework
  t/unit-tests: convert oid-array test to use clar test framework
  t/unit-tests: implement clar specific oid helper functions
2025-03-05 10:37:43 -08:00
feffb34257 Merge branch 'ps/path-sans-the-repository'
The path.[ch] API takes an explicit repository parameter passed
throughout the callchain, instead of relying on the_repository
singleton instance.

* ps/path-sans-the-repository:
  path: adjust last remaining users of `the_repository`
  environment: move access to "core.sharedRepository" into repo settings
  environment: move access to "core.hooksPath" into repo settings
  repo-settings: introduce function to clear struct
  path: drop `git_path()` in favor of `repo_git_path()`
  rerere: let `rerere_path()` write paths into a caller-provided buffer
  path: drop `git_common_path()` in favor of `repo_common_path()`
  worktree: return allocated string from `get_worktree_git_dir()`
  path: drop `git_path_buf()` in favor of `repo_git_path_replace()`
  path: drop `git_pathdup()` in favor of `repo_git_path()`
  path: drop unused `strbuf_git_path()` function
  path: refactor `repo_submodule_path()` family of functions
  submodule: refactor `submodule_to_gitdir()` to accept a repo
  path: refactor `repo_worktree_path()` family of functions
  path: refactor `repo_git_path()` family of functions
  path: refactor `repo_common_path()` family of functions
2025-03-05 10:37:43 -08:00
92f8da8de3 docs: fix repository-layout when building with breaking changes
Since commit 8ccc75c245 (remote: announce removal of "branches/" and
"remotes/", 2025-01-22) enabling WITH_BREAKING_CHANGES when building git
removes support for reading branches from ".git/branches" and remotes
from ".git/remotes". However those locations are still documented in
gitrepository-layout.adoc even though the build does not support them.

Rectify this by adding a new document attribute "with-breaking-changes"
and use it to make the inclusion of those sections of the documentation
conditional. Note that the name of the attribute does not match the test
prerequisite WITHOUT_BREAKING_CHANGES added in c5bc9a7f94 (Makefile:
wire up build option for deprecated features, 2025-01-22). This is to
avoid the awkward double negative ifndef::without_breaking_changes for
documentation that should be included when WITH_BREAKING_CHANGES is
enabled. The test prerequisite will be renamed to match the
documentation attribute in a future patch series.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-05 07:25:11 -08:00
bad7910399 reftable: release name on reftable_reader_new() error
If block_source_read_block() or parse_footer() fail, we leak the "name"
member of struct reftable_reader in reftable_reader_new().  Release it.

Reported by: H Z <shiyuyuranzh@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-04 09:21:39 -08:00
6a64ac7b01 Git 2.49-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-04 08:19:20 -08:00
6dff5de1da refs: show --no-reflog in the help text
We forgot that we must keep the documentation and help text in sync.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 14:51:29 -08:00
61cd812130 xdiff: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:27 -08:00
d6b67cefb5 unpack-trees.c: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:27 -08:00
ee00ef41f2 transport.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:27 -08:00
15db9a895d trace2/tr2_sysenv.c: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:26 -08:00
508cf7f5d8 trace2.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:26 -08:00
366074dc18 t6434: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:26 -08:00
8ea7d41f17 t6012: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:26 -08:00
e680c62542 t/helper/test-rot13-filter.c: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:25 -08:00
9f04cd7c61 simple-ipc.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:25 -08:00
0543300b59 setup.c: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:25 -08:00
72d385824a refs.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:24 -08:00
dc657d5625 pseudo-merge.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:24 -08:00
458f8b0eab parse-options.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:24 -08:00
550fac1d13 object-name.c: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:24 -08:00
02ed88f6a2 list-objects-filter-options.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:23 -08:00
c09c29b430 fsck.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:23 -08:00
3936e95a7f diffcore.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:23 -08:00
5c03752665 diff.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:23 -08:00
87e0910fb8 contrib/long-running-filter: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:22 -08:00
bbd6174b25 config.c: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:22 -08:00
e8015223c7 builtin.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:22 -08:00
08ce333d36 apply.c: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:22 -08:00
d795c65b3a advice.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:21 -08:00
97350e18e2 doc: *.txt -> *.adoc fixes
Update a few more instances of Documentation/*.txt files which have been
renamed to *.adoc.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:21 -08:00
59d9280908 technical/partial-clone: update reference to rev-list-options.adoc
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:21 -08:00
9100c91cd4 howto/new-command: update reference to builtin docs
Commit ec14d4ecb5 (builtin.h: take over documentation from
api-builtin.txt, 2017-08-02) deleted api-builtin.txt and moved the
contents into builtin.h.  Most of the references were fixed in
d85e9448dd (new-command.txt: update reference to builtin docs,
2023-02-04), but one remained.  Fix it.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:20 -08:00
5ac2c61b55 MyFirstObjectWalk: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:20 -08:00
8b4b41aefb MyFirstContribution: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:20 -08:00
7c78c599bb CodingGuidelines: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:20 -08:00
c50fbb2dd2 README: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:19 -08:00
d40da0bd4b Makefile: update reference to technical/racy-git.adoc
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:19 -08:00
7d90a272ac doc: remove unneeded .gitattributes
The top-level .gitattributes file contains entries for the Documentation
tree.  Documentation/.gitattributes has not been touched since it was
added in 14f9e128d3 (Define the project whitespace policy, 2008-02-10).

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:19 -08:00
33af5a3334 .gitattributes: more *.txt -> *.adoc updates
All Documentation files now end in .adoc.  Update the entries for
git-merge.adoc, gitk.adoc, and user-manual.adoc to properly set the
conflict-marker-size attribute.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:19 -08:00
82deaae3b9 t0450: *.txt -> *.adoc fixes
After 1f010d6bdf (doc: use .adoc extension for AsciiDoc files,
2025-01-20), we no longer matched any files in this test.  The result is
that we did not test for mismatches in the documentation and --help
output.

Adjust the test to look at the renamed *.adoc files.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:18 -08:00
c268e3285d BreakingChanges: clarify the procedure
The point behind a compile-time switch is to ensure that we have a
mechanism to hide myriad of backward incompatible changes that may
be prepared and accumulated over time, yet make them available for
testing any time during the development toward the big version
boundary.  Add a few words to stress that point.

Since the document was first written, we have added the CI job that
the document anticipated us to have.  Rephrase to state the current
status.

The discussion in [*1*] made us abandon the "feature.git3" based
runtime switching of behaviour and instead adopt the compile-time
switching mechanism, but a stray sentence about runtime switching
still remained in the final text by mistake.  Remove it.

[Reference]

 *1* https://lore.kernel.org/git/xmqqldzel6ug.fsf@gitster.g/

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 10:07:03 -08:00
5040f9f164 doc: add technical design doc for large object promisors
Let's add a design doc about how we could improve handling liarge blobs
using "Large Object Promisors" (LOPs). It's a set of features with the
goal of using special dedicated promisor remotes to store large blobs,
and having them accessed directly by main remotes and clients.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 08:57:40 -08:00
db91954e18 A few more before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 08:53:03 -08:00
aa0ba82319 Merge branch 'ps/build-meson-fixes'
CI fix.

* ps/build-meson-fixes:
  gitlab-ci: fix "msvc-meson" test job succeeding despite test failures
2025-03-03 08:53:03 -08:00
ca39da6997 Merge branch 'ps/meson-contrib-bits'
Update meson-based build procedure to cover contrib/ and other
places as well.

* ps/meson-contrib-bits:
  ci: exercise credential helpers
  ci: fix propagating UTF-8 test locale in musl-based Meson job
  meson: wire up static analysis via Coccinelle
  meson: wire up git-contacts(1)
  meson: wire up credential helpers
  contrib/credential: fix compilation of "osxkeychain" helper
  contrib/credential: fix compiling "libsecret" helper
  contrib/credential: fix compilation of wincred helper with MSVC
  contrib/credential: fix "netrc" tests with out-of-tree builds
  GIT-BUILD-OPTIONS: propagate project's source directory
2025-03-03 08:53:03 -08:00
85e342adbd Merge branch 'ms/merge-recursive-string-list-micro-optimization'
Rename processing in the recursive merge backend has seen a micro
optimization.

* ms/merge-recursive-string-list-micro-optimization:
  merge-recursive: optimize time complexity for process_renames
2025-03-03 08:53:02 -08:00
238c8d3984 Merge branch 'lo/doc-merge-submodule-update'
What happens to submodules during merge has been documented in a
bit more detail.

* lo/doc-merge-submodule-update:
  merge-strategies.adoc: detail submodule merge
2025-03-03 08:53:02 -08:00
ab09eddf60 Merge branch 'ps/build-meson-fixes-0130'
Assorted fixes and improvements to the build procedure based on
meson.

* ps/build-meson-fixes-0130:
  gitlab-ci: restrict maximum number of link jobs on Windows
  meson: consistently use custom program paths to resolve programs
  meson: fix overwritten `git` variable
  meson: prevent finding sed(1) in a loop
  meson: improve handling of `sane_tool_path` option
  meson: improve PATH handling
  meson: drop separate version library
  meson: stop linking libcurl into all executables
  meson: introduce `libgit_curl` dependency
  meson: simplify use of the common-main library
  meson: inline the static 'git' library
  meson: fix OpenSSL fallback when not explicitly required
  meson: fix exec path with enabled runtime prefix
2025-03-03 08:53:02 -08:00
1aabec0b48 Merge branch 'dk/test-aggregate-results-paste-fix'
The use of "paste" command for aggregating the test results have
been corrected.

* dk/test-aggregate-results-paste-fix:
  t/aggregate-results: fix paste(1) invocation
2025-03-03 08:53:01 -08:00
c84209a8fd editorconfig: add .bash extension
Both files in the command below appear to be indented with tabs, and I'd
expect .bash files to have roughly the same style as .sh files.

$ find . -name \*.bash
./contrib/completion/git-completion.bash
./ci/check-directional-formatting.bash

Signed-off-by: David Mandelberg <david@mandelberg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 08:39:47 -08:00
87eccc3a81 meson: fix building technical and howto docs
When our asciidoc files were renamed from "*.txt" to "*.adoc" in
1f010d6bdf (doc: use .adoc extension for AsciiDoc files, 2025-01-20)
the "meson.build" file in "Documentation" was updated but the
"meson.build" files in the "technical" and "howto" subdirectories were
not. This causes the meson build to fail when configured with
-Ddocs=html. Fix this by updating the relevant "meson.build" files.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 08:38:05 -08:00
06d9252bcc doc: fix build-docdep.perl
We renamed from .txt to .adoc all the asciidoc source files and
necessary includes.  We also need to adjust the build-docdep tool to
work on files whose suffix is .adoc when computing the documentation
dependencies.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-01 10:26:15 -08:00
561de07b57 contrib/subtree: rename .txt to .adoc
The .txt extensions were changed to .adoc in 1f010d6bdf (doc: use .adoc
extension for AsciiDoc files, 2025-01-20).

Do the same for contrib/subtree.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-01 10:00:52 -08:00
fa779fa88d contrib/contacts: rename .txt to .adoc
The .txt extensions were changed to .adoc in 1f010d6bdf (doc: use .adoc
extension for AsciiDoc files, 2025-01-20).

Do the same for contrib/contacts.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-01 10:00:51 -08:00
41c793eae9 doc: update howto-index.sh for .adoc extensions
The .txt extensions were changed to .adoc in 1f010d6bdf (doc: use .adoc
extension for AsciiDoc files, 2025-01-20).  This left broken links in
the generated howto-index.html.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-01 10:00:51 -08:00
2a1530a953 Merge branch 'ps/meson-contrib-bits' into tz/doc-txt-to-adoc-fixes
* ps/meson-contrib-bits:
  ci: exercise credential helpers
  ci: fix propagating UTF-8 test locale in musl-based Meson job
  meson: wire up static analysis via Coccinelle
  meson: wire up git-contacts(1)
  meson: wire up credential helpers
  contrib/credential: fix compilation of "osxkeychain" helper
  contrib/credential: fix compiling "libsecret" helper
  contrib/credential: fix compilation of wincred helper with MSVC
  contrib/credential: fix "netrc" tests with out-of-tree builds
  GIT-BUILD-OPTIONS: propagate project's source directory
2025-03-01 10:00:45 -08:00
028f618658 path: adjust last remaining users of the_repository
With the preceding refactorings we now only have a couple of implicit
users of `the_repository` left in the "path" subsystem, all of which
depend on global state via `calc_shared_perm()`. Make the dependency on
`the_repository` explicit by passing the repo as a parameter instead and
adjust callers accordingly.

Note that this change bubbles up into a couple of subsystems that were
previously declared as free from `the_repository`. Instead of marking
all of them as `the_repository`-dependent again, we instead use the
repository that is available in the calling context. There are three
exceptions though with "copy.c", "pack-write.c" and "tempfile.c".
Adjusting these would require us to adapt callsites all over the place,
so this is left for a future iteration.

Mark "path.c" as free from `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
f1ce861c34 environment: move access to "core.sharedRepository" into repo settings
Similar as with the preceding commit, we track "core.sharedRepository"
via a pair of global variables. Move them into `struct repo_settings` so
that we can instead track them per-repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
6f3fbed8ed environment: move access to "core.hooksPath" into repo settings
The "core.hooksPath" setting is stored in a global variable and
populated via the `git_default_core_config`. This may cause issues in
the case where one is handling multiple different repositories in a
single process with different values for that config key, as we may or
may not see the correct value in that case. Furthermore, global state
blocks our path towards libification.

Refactor the code so that we instead store the value in `struct
repo_settings`. The value is computed as-needed and cached. The result
should be functionally the same as there aren't ever any code paths
where we'd execute hooks outside the context of a repository.

Note that this requires us to change the passed-in repository in the
`repo_git_path()` family of functions to be non-constant, as we call
`adjust_git_path()` there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
b411ed60c7 repo-settings: introduce function to clear struct
We don't provide a way to clear a `struct repo_settings`, and instead
open-code this in `repo_clear()`. This is mixing up concerns and means
that developers have to touch multiple files whenever they add a new
field to the structure in case the associated resources need to be
released.

Provide a new `repo_settings_clear()` function to improve this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
88dd321cfe path: drop git_path() in favor of repo_git_path()
Remove `git_path()` in favor of the `repo_git_path()` family of
functions, which makes the implicit dependency on `the_repository` go
away.

Note that `git_path()` returned a string allocated via `get_pathname()`,
which uses a rotating set of statically allocated buffers. Consequently,
callers didn't have to free the returned string. The same isn't true for
`repo_common_path()`, so we also have to add logic to free the returned
strings.

This refactoring also allows us to remove `repo_common_pathv()` as well
as `get_pathname()` from the public interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
8ee018d863 rerere: let rerere_path() write paths into a caller-provided buffer
Same as with `get_worktree_git_dir()` a couple of commits ago, the
`rerere_path()` function returns paths that need not be free'd by the
caller because `git_path()` internally uses `get_pathname()`.

Refactor the function to instead accept a caller-provided buffer that
the path will be written into, passing on ownership to the caller. This
refactoring prepares us for the removal of `git_path()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
cb0ae672ae A bit more post -rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-27 15:23:01 -08:00
9f280bea98 Merge branch 'jc/3.0-branches-remotes-update'
Removal of ".git/branches" and ".git/remotes" support in the
BreakingChanges document has been further clarified.

* jc/3.0-branches-remotes-update:
  BreakingChanges: clarify branches/ and remotes/
2025-02-27 15:23:01 -08:00
68c3be61fc Merge branch 'bc/http-push-auth-netrc-fix'
The netrc support (via the cURL library) for the HTTP transport has
been re-enabled.

* bc/http-push-auth-netrc-fix:
  http: allow using netrc for WebDAV-based HTTP protocol
2025-02-27 15:23:01 -08:00
16b2e579f9 Merge branch 'rs/clear-commit-marks-optim'
A micro-optimization.

* rs/clear-commit-marks-optim:
  commit: avoid parent list buildup in clear_commit_marks_many()
2025-02-27 15:23:00 -08:00
c51a0b47c9 Merge branch 'pw/rebase-i-ff-empty-commit'
"git rebase -i" failed to allow rewording an empty commit that has
been fast-forwarded.

* pw/rebase-i-ff-empty-commit:
  rebase -i: reword empty commit after fast-forward
2025-02-27 15:23:00 -08:00
3c0f4abaf5 Merge branch 'kn/ref-migrate-skip-reflog'
"git refs migrate" can optionally be told not to migrate the reflog.

* kn/ref-migrate-skip-reflog:
  builtin/refs: add '--no-reflog' flag to drop reflogs
2025-02-27 15:23:00 -08:00
9d8cce051a Merge branch 'ua/os-version-capability'
The value of "uname -s" is by default sent over the wire as a part
of the "version" capability.

* ua/os-version-capability:
  agent: advertise OS name via agent capability
  t5701: add setup test to remove side-effect dependency
  version: extend get_uname_info() to hide system details
  version: refactor get_uname_info()
  version: refactor redact_non_printables()
  version: replace manual ASCII checks with isprint() for clarity
2025-02-27 15:23:00 -08:00
aea7c185be gitlab-ci: fix "msvc-meson" test job succeeding despite test failures
We have recently noticed that the "msvc-meson" test job in GitLab CI
succeeds even if there are failures. This is somewhat puzzling because
we use exactly the same command as we do on GitHub Actions, and there
the jobs fail as exected.

As it turns out, this is another weirdness of the GitLab CI hosted
runner for Windows [1]: by default, even successful commands will not
make the job fail. Interestingly though, this depends on what exactly
the command is that you're running -- the MinGW-based job for example
works alright and does fail as expected.

The root cause here seems to be specific behaviour of PowerShell. The
invocation of `ForEach-Object` does not bubble up any errors in case the
invocation of `meson test` fails, and thus we don't notice the error.
This is specific to executing the command in a loop: other build steps
where we execute commands directly fail as expected.

This is because the specific version of PowerShell that we use in the
runner does not know about `PSNativeCommandUseErrorActionPreference`
yet, which controls whether native commands like "meson.exe" honor the
`ErrorActionPreference` variable. The preference has been introduced
with PowerShell 7.3 and is default-enabled since PowerShell 7.4, but
GitLab's hosted runners still seem to use PowerShell 5.1. Consequently,
when tests fail, we won't bubble up the error at all from the loop and
thus the job doesn't fail. This isn't an issue in other cases though
where we execute native commands directly, as the GitLab runner knows to
check the last error code after every command.

The same thing doesn't seem to be an issue on GitHub Actions, most
likely because it uses PowerShell 7.4. Curiously, the preference for
`PSNativeCommandUseErrorActionPreference` is disabled there, but the
jobs fail as expected regardless of that. It's puzzling, but I do not
have enough PowerShell expertise to give a definitive answer as to why
it works there.

In any case, Meson 1.8 will likely get support for slicing tests [1], so
we can eventually get rid of the whole PowerShell script. For now, work
around the issue by explicitly exiting out of the loop with a non-zero
error code if we see that Meson has failed.

[1]: https://github.com/mesonbuild/meson/pull/14092

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-27 10:42:31 -08:00
9350423982 gitlab-ci: restrict maximum number of link jobs on Windows
The hosted Windows runners on GitLab.com only have 7.5GB of RAM. Given
that "link.exe" provided by Microsoft Visual Studio is multi-threaded by
itself already and thus quite memory hungry this can quickly lead to
memory starvation, out-of-memory situations and thus failed CI jobs.

Fix the issue by limiting the number of concurrent linker jobs. The same
issue hasn't been observed on GitHub Actions yet, probably because it
got more than twice the amount of RAM with 16GB.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:37 -08:00
2c374ea4bb meson: consistently use custom program paths to resolve programs
The calls to `find_program()` in our documentation don't use our custom
program path. This variable gets populated on Windows with the location
of Git for Windows so that we can use it to provide our build tools.
Consequently, we may not be able to find all necessary binaries on
Windows.

Adapt the calls to use the program path to fix this. While at it, drop
`required: true` arguments, which are the default anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:37 -08:00
3ee3a6eb52 meson: fix overwritten git variable
We're assigning the `git` variable in three places:

  - In "meson.build" to store the external Git executable.

  - In "meson.build" to store the compiled Git executable.

  - In "Documentation/meson.build" to store the external Git executable,
    a second time.

The last case is only needed because we overwrite the original variable
with the built version. Rename the variable used for the built Git
executable so that we don't have to resolve the external Git executable
multiple times.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:37 -08:00
16c89dcf80 meson: prevent finding sed(1) in a loop
We're searching for the sed(1) executable in a loop, which will make us
try to find it multiple times. Starting with the preceding commit we
already declare a variable for that program in the top-level build file.
Use it so that we only need to search for the program once.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:36 -08:00
42846efc3b meson: improve handling of sane_tool_path option
The `sane_tool_path` option can be used to override the PATH variable
from which the build process, tests and ultimately Git will end up
picking programs from. It is currently lacking though because we only
use it to populate the PATH environment variable for executed scripts
and for the `BROKEN_PATH_FIX` mechanism, but we don't use it to find
programs used in the build process itself.

Fix this issue by treating it similar to the Windows-specific paths,
which will make us use it both to find programs and to populate the PATH
environment variable.

To help with this fix, change the type of the option to be an array of
paths, which makes the handling a bit easier for us. It's also the
correct thing to do as the input indeed is a list of paths.

Furthermore, the option now overrides the default behaviour on Windows,
which si to pick up tools from Git for Windows. This is done so that it
becomes easier to override that default behaviour in case it's not
desired.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:36 -08:00
454d79b61b meson: improve PATH handling
When locating programs required for the build we give some special
treatment to Windows systems so that we know to also look up tools
provided by a Git for Windows installation. This ensures that the build
doesn't have any prerequisites other than Microsoft Visual Studio, Meson
and Git for Windows.

Consequently, some of the programs returned by `find_program()` may not
be found via PATH, but via these extra directories. But while Meson can
use these tools directly without any special treatment, any scripts that
we execute may not be able to find those programs. To help them we thus
prepend the directories of a subset of the found programs to PATH.

This doesn't make much sense though: we don't need to prepend PATH for
any program that was found via PATH, but we really only need to do so
for programs located via the extraneous Windows-specific paths. So
instead of prepending all programs paths, we really only need to prepend
the Windows-specific paths.

Adapt the code accordingly by only prepeding Windows-specific paths to
PATH, which both simplifies the code and clarifies intent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:36 -08:00
eee25bbd84 meson: drop separate version library
When building `libgit.a` we link it against a `libgit_version.a` library
that contains the version information that we inject at build time. The
intent of this is to avoid rebuilding all of `libgit.a` whenever the
version changes. But that wouldn't happen in the first place, as we know
to just rebuild the files that depend on the generated "version-def.h"
file.

This is an artifact of an earlier version of the Meson build infra that
didn't ultimately land. We didn't yet have "version-def.h", and instead
injected the version via preprocessor directives. And here we would have
rebuilt all of `libgit.a` indeed in case the version changes, because
the preprocessor directive applied to all files.

Stop building the separate library and instead add "version-def.h" to
the list of source files directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:36 -08:00
f5fac42e07 meson: stop linking libcurl into all executables
We set up libcurl via the `libgit_dependencies` variable, which gets
propagated into every user of the `libgit` dependency. This is not
necessary though, as most of our executables aren't even supposed to
link against libcurl.

Fix this by only propagating include directories as a libgit dependency
and propagating the full curl dependency via `libgit_curl`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:35 -08:00
dfc88bd647 meson: introduce libgit_curl dependency
We've got a set of common source files that we use for those executables
that link against libcurl. The setup is somewhat repetitive though.
Simplify it by declaring a `libgit_curl` dependency that bundles all of
it together.

Note that we don't include curl itself as a dependency. This is because
we already pull it in transitively via the libgit dependency, which is
unfortunate because libgit itself shouldn't actually link against curl
in the first place. This will get fixed in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:35 -08:00
ebb35369f1 meson: simplify use of the common-main library
The "common-main.c" file is used by multiple executables. In order to
make it easy to set it up we have created a separate library that these
executables can link against. All of these executables also want to link
against `libgit.a` though, which makes it necessary to specify both of
these as dependencies for every executable.

Simplify this a bit by declaring the library as a source dependency:
instead of creating a static library, we now instead compile the common
set of files into each executable separately.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:35 -08:00
ce9432889c meson: inline the static 'git' library
When setting up `libgit.a` we first create the static library itself,
and then declare it as part of a dependency such that compile arguments,
include directories and transitive dependencies get propagated to the
users of that library. As such, the static library isn't expected to be
used by anything but the declared dependency.

Inline the static library so that we don't even use a separate variable
for it. This avoids any kind of confusion that may arise and clarifies
how the library is supposed to be used.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:35 -08:00
6128301075 meson: fix OpenSSL fallback when not explicitly required
When OpenSSL isn't provided by the system we know to fall back to the
subproject wrapper. This is especially helpful on Windows systems, where
you typically don't have OpenSSL available, in order to reduce the
number of required dependencies.

The fallback is broken though when the OpenSSL backend is set to 'auto'
as we end up calling `dependency('openssl', required: false)` in that
case, which implicitly disables falling back to the wrapper.

Fix the issue by re-allowing the fallback in case either OpenSSL is
required or in case the backend is set to 'auto'. While at it, fix
reporting of the backend in case the user asked us to pick no HTTPS
backend at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:35 -08:00
bd262d07b6 meson: fix exec path with enabled runtime prefix
When the runtime prefix option is enabled, Git is built such that it
knows to locate its binaries relative to the directory a binary is being
executed from. This requires us to figure out relative paths, which is
handled in `system_prefix()` by trying to strip a couple of well-known
paths.

One of these paths, GIT_EXEC_PATH, is expected to be absolute when
runtime prefixes are enabled, but relative otherwise. And while our
Makefile gets this correctly, in Meson we always wire up the absolute
path, which may result in us not being able to find binaries.

Fix this by conditionally injecting the paths depending on whether or
not the `runtime_prefix` option is enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 09:09:34 -08:00
08bdfd4535 Git 2.49-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26 08:55:18 -08:00
e24570b0a3 Merge branch 'jk/check-mailmap-wo-name-fix'
"git check-mailmap" segfault fix.

* jk/check-mailmap-wo-name-fix:
  mailmap: fix check-mailmap with full mailmap line
2025-02-26 08:51:00 -08:00
bbca240cbf Merge branch 'ek/mingw-rename-symlink'
Symlink renaming fix.

* ek/mingw-rename-symlink:
  compat/mingw: rename the symlink, not the target
2025-02-26 08:50:37 -08:00
4ebba56419 merge-strategies.adoc: detail submodule merge
Submodule merges are, in general, similar to other merges based on oid
three-way-merge. When a conflict happens, however, Git has two special
cases (introduced in 68d03e4a6e) on handling the conflict before
yielding it to the user. From the merge-ort and merge-recursive sources:

- "Case #1: a is contained in b or vice versa": both strategies try to
perform a fast-forward in the submodules if the commit referred by the
conflicted submodule is descendant of another;

- "Case #2: There are one or more merges that contain a and b in the
submodule.  If there is only one, then present it as a suggestion to the
user, but leave it marked unmerged so the user needs to confirm the
resolution."

Add a small paragraph on merge-strategies.adoc describing this behavior.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 16:06:06 -08:00
887758c998 BreakingChanges: clarify branches/ and remotes/
As we have created an empty .git/branches/ hierarchy until fairly
recently, these directories may be found in modern repositories, but
it is highly unlikely that they are being used.

Reported-by: Jakub Wilk <jwilk@jwilk.net>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 15:48:16 -08:00
5a526e5e18 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 14:19:37 -08:00
f65d9cfd3f Merge branch 'po/meson-perl-fix'
Upgrade the minimum Perl version enforced by meson-based build to
match what Makefile-based build uses.

* po/meson-perl-fix:
  meson: fix Perl version check for Meson versions before 1.7.0
  meson: bump minimum required Perl version to 5.26.0
2025-02-25 14:19:37 -08:00
2ebbe2b2db Merge branch 'ms/rename-match-name-with-pattern'
Code renaming.

* ms/rename-match-name-with-pattern:
  refspec: clarify function naming and documentation
2025-02-25 14:19:37 -08:00
092180990d Merge branch 'ad/set-default-target-in-makefiles'
Correct the default target in Documentation/Makefile, and
future-proof all Makefiles from similar breakages by declaring the
default target (which happens to be "all") upfront.

* ad/set-default-target-in-makefiles:
  Makefile: set default goals in makefiles
2025-02-25 14:19:36 -08:00
9b07c152df Merge branch 'pw/merge-tree-stdin-deadlock-fix'
"git merge-tree --stdin" has been improved (including a workaround
for a deadlock).

* pw/merge-tree-stdin-deadlock-fix:
  merge-tree: fix link formatting in html docs
  merge-tree: improve docs for --stdin
  merge-tree: only use basic merge config
  merge-tree: remove redundant code
  merge-tree --stdin: flush stdout to avoid deadlock
2025-02-25 14:19:36 -08:00
37b34c4e99 Merge branch 'mh/doc-commit-title-not-subject'
The documentation of "git commit" and "git rebase" now refer to
commit titles as such, not "subject".

* mh/doc-commit-title-not-subject:
  doc: use 'title' consistently
2025-02-25 14:19:36 -08:00
a8a5bb1f78 Merge branch 'bc/diff-reject-empty-arg-to-pickaxe'
The -G/-S options to the "diff" family of commands caused us to hit
a BUG() when they get no values; they have been corrected.

* bc/diff-reject-empty-arg-to-pickaxe:
  diff: don't crash with empty argument to -G or -S
2025-02-25 14:19:35 -08:00
5ce6e0e242 Merge branch 'tb/new-make-fix'
Workaround the overly picky HT/SP rule in newer GNU Make.

* tb/new-make-fix:
  Makefile: remove accidental recipe prefix in conditional
2025-02-25 14:19:35 -08:00
f52abcda95 Merge branch 'da/xdiff-w-sign-compare-workaround'
Noises from "-Wsign-compare" in the borrowed xdiff code has been
squelched.

* da/xdiff-w-sign-compare-workaround:
  xdiff: avoid signed vs. unsigned comparisons in xutils.c
  xdiff: avoid signed vs. unsigned comparisons in xpatience.c
  xdiff: avoid signed vs. unsigned comparisons in xhistogram.c
  xdiff: avoid signed vs. unsigned comparisons in xemit.c
  xdiff: avoid signed vs. unsigned comparisons in xdiffi.c
  xdiff: move sign comparison warning guard into each file
2025-02-25 14:19:35 -08:00
149585079f t/unit-tests: convert oidtree test to use clar test framework
Adapt oidtree test script to clar framework by using clar assertions
where necessary. `cl_parse_any_oid()` ensures the hash algorithm is set
before parsing. This prevents issues from an uninitialized or invalid
hash algorithm.

Introduce 'test_oidtree__initialize` handles the to set up of the global
oidtree variable and `test_oidtree__cleanup` frees the oidtree when all
tests are completed.

With this change, `check_each` stops at the first error encountered,
making it easier to address it.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:23 -08:00
69bc044def t/unit-tests: convert oidmap test to use clar test framework
Adapt oidmap test script to clar framework by using clar assertions
where necessary. `cl_parse_any_oid()` ensures the hash algorithm is set
before parsing. This prevents issues from an uninitialized or invalid
hash algorithm.

Introduce 'test_oidmap__initialize` handles the to set up of the global
oidmap map with predefined key-value pairs, and `test_oidmap__cleanup`
frees the oidmap and its entries when all tests are completed.

The test loops through all entries to detect multiple errors. With this
change, it stops at the first error encountered, making it easier to
address it.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
869a1edf44 t/unit-tests: convert oid-array test to use clar test framework
Adapt oid-array test script to clar framework by using clar assertions
where necessary. Remove descriptions from macros to reduce
redundancy, and move test input arrays to global scope for reuse across
multiple test functions. Introduce `test_oid_array__initialize()` to
explicitly initialize the hash algorithm.

These changes streamline the test suite, making individual tests
self-contained and reducing redundant code.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
a16a2ee312 t/unit-tests: implement clar specific oid helper functions
`get_oid_arbitrary_hex()` and `init_hash_algo()` are both required for
oid-related tests to run without errors. In the current implementation,
both functions are defined and declared in the
`t/unit-tests/lib-oid.{c,h}` which is utilized by oid-related tests in
the homegrown unit tests structure.

Adapt functions in lib-oid.{c,h} to use clar. Both these functions
become available for oid-related test files implemented using the clar
testing framework, which requires them. This will be used by subsequent
commits.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
ce98863204 t/aggregate-results: fix paste(1) invocation
When running `make test`, when missing prereqs the following is emitted:

    make aggregate-results
    usage: paste [-s] [-d delimiters] file ...
    fixed   1
    success 30066
    failed  0
    broken  218
    total   31274

POSIX says that `paste` requires a file operand; stdin was clearly
intended by 49da404070 (test-lib: show missing prereq summary,
2021-11-20). Use it.

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-24 12:24:16 -08:00
1ca727f230 commit: avoid parent list buildup in clear_commit_marks_many()
clear_commit_marks_1() clears the marks of the first parent and its
first parent and so on, and saves the higher numbered parents in a list
for later.  There is no benefit in keeping that list growing with each
handled commit.  Clear it after each run to reduce peak memory usage.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-24 08:51:18 -08:00
3306edb380 http: allow using netrc for WebDAV-based HTTP protocol
For an extended period of time, we've enabled libcurl's netrc
functionality, which will read credentials from the netrc file if none
are provided.  Unfortunately, we have also not documented this fact or
written any tests for it, but people have come to rely on it.

In 610cbc1dfb ("http: allow authenticating proactively", 2024-07-10), we
accidentally broke the ability of users to use the netrc file for the
WebDAV-based HTTP protocol.  Notably, it works on the initial request
but does not work on subsequent requests, which causes failures because
that version of the protocol will necessarily make multiple requests.

This happens because curl_empty_auth_enabled never returns -1, only 0 or
1, and so if http.proactiveAuth is not enabled, the username and
password are always set to empty credentials, which prevents libcurl's
fallback to netrc from working.  However, in other cases, the server
continues to get a 401 response and the credential helper is invoked,
which is the normal behavior, so this was not noticed earlier.

To fix this, change the condition to check for enabling empty auth and
also not having proactive auth enabled, which should result in the
username and password not being set to a single colon in the typical
case, and thus the netrc file being used.

Reported-by: Peter Georg <peter.georg@physik.uni-regensburg.de>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-24 08:49:10 -08:00
bb60c52131 mailmap: fix check-mailmap with full mailmap line
I recently had reported to me a crash from a coworker using the recently
added sendemail mailmap support:

  3724814 Segmentation fault      (core dumped) git check-mailmap "bugs@company.xx"

This appears to happen because of the NULL pointer name passed into
map_user(). Fix this by passing "" instead of NULL so that we have a
valid pointer.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21 18:27:16 -08:00
2d2a71ce85 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21 10:35:54 -08:00
84a5ce3f03 Merge branch 'ac/doc-http-ssl-type-config'
Two configuration variables about SSL authentication material that
weren't mentioned in the documentations are now mentioned.

* ac/doc-http-ssl-type-config:
  docs: indicate http.sslCertType and sslKeyType
2025-02-21 10:35:54 -08:00
55b5ba87f1 Merge branch 'en/doc-renormalize'
Doc updates.

* en/doc-renormalize:
  doc: clarify the intent of the renormalize option in the merge machinery
2025-02-21 10:35:53 -08:00
0fbe93b36c Merge branch 'jc/doc-boolean-synonyms'
Doc updates.

* jc/doc-boolean-synonyms:
  doc: centrally document various ways tospell `true` and `false`
2025-02-21 10:35:53 -08:00
ee8020ff40 Merge branch 'ua/update-server-info-sans-the-repository'
Code clean-up.

* ua/update-server-info-sans-the-repository:
  builtin/update-server-info: remove the_repository global variable
2025-02-21 10:35:53 -08:00
975fc0471a compat/mingw: rename the symlink, not the target
Since 183ea3ea (Merge branch 'ps/mingw-rename', 2024-11-13),
a new technique is used on Windows to rename files, where supported.
The first step of this technique is to open the file with
`CreateFileW`. At that time, `FILE_ATTRIBUTE_NORMAL` was passed as
the value of the `dwFlagsAndAttributes` argument. In b30404df [2], this
was improved by passing `FILE_FLAG_BACKUP_SEMANTICS`, to support
directories as well as regular files.

However, neither value of `dwFlagsAndAttributes` is sufficient to open
a symbolic link with the correct semantics to rename it. Symlinks on
Windows are reparse points. Attempting to open a reparse point with
`CreateFileW` dereferences the reparse point and opens the target
instead, unless `FILE_FLAG_OPEN_REPARSE_POINT` is included in
`dwFlagsAndAttributes`. This is documented for that flag and in the
"Symbolic Link Behavior" section of the `CreateFileW` docs [3].

This produces a regression where attempting to rename a symlink on
Windows renames its target to the intended new name and location of the
symlink. For example, if `symlink` points to `file`, then running

    git mv symlink symlink-renamed

leaves `symlink` in place and unchanged, but renames `file` to
`symlink-renamed` [4].

This regression is detectable by existing tests in `t7001-mv.sh`, but
the tests must be run by a Windows user with the ability to create
symlinks, and the `ln -s` command used to create the initial symlink
must also be able to create a real symlink (such as by setting the
`MSYS` environment variable to `winsymlinks:nativestrict`). Then
these two tests fail if the regression is present, and pass otherwise:

    38 - git mv should overwrite file with a symlink
    39 - check moved symlink

Let's fix this, so that renaming a symlink again renames the symlink
itself and leaves the target unchanged, by passing

    FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT

as the `dwFlagsAndAttributes` argument. This is sufficient (and safe)
because including `FILE_FLAG_OPEN_REPARSE_POINT` causes no harm even
when used to open a file or directory that is not a reparse point. In
that case, as noted in [3], this flag is simply ignored.

[1]: 183ea3eabf
[2]: b30404dfc0
[3]: https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew
[4]: https://github.com/git-for-windows/git/issues/5436

Signed-off-by: Eliah Kagan <eliah.kagan@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21 10:24:43 -08:00
89be7d2774 builtin/refs: add '--no-reflog' flag to drop reflogs
The "git refs migrate" subcommand converts the backend used for ref
storage. It always migrates reflog data as well as refs. Introduce an
option to exclude reflogs from migration, allowing them to be discarded
when they are unnecessary.

This is particularly useful in server-side repositories, where reflogs
are typically not expected. However, some repositories may still have
them due to historical reasons, such as bugs, misconfigurations, or
administrative decisions to enable reflogs for debugging. In such
repositories, it would be optimal to drop reflogs during the migration.

To address this, introduce the '--no-reflog' flag, which prevents reflog
migration. When this flag is used, reflogs from the original reference
backend are migrated. Since only the new reference backend remains in
the repository, all previous reflogs are permanently discarded.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21 09:55:02 -08:00
63a597dd94 ci: exercise credential helpers
Wire up credential helpers in our CI runs so that we can rest assured
that they compile and (if tests are available) function correctly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-20 07:49:52 -08:00
235fe77c29 ci: fix propagating UTF-8 test locale in musl-based Meson job
The musl-based Meson job is supposed to explicitly specify the UTF-8
locale used for testing, which has been introduced with 84bb5eeace (ci:
switch linux-musl to use Meson, 2025-01-28). That commit had two issues
though:

  - We continue to refer to "linux-musl", even though the job has been
    renamed in the same commit to "linux-musl-meson".

  - We use the wrong option name to specify the locale. This was not
    noticed though due to the first issue.

Fix both of these issues by fixing both the job and option naems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-20 07:49:52 -08:00
b838bf1938 Merge branch 'master' of https://github.com/j6t/gitk
* 'master' of https://github.com/j6t/gitk:
  gitk: introduce support for the Meson build system
  gitk: extract script to build executable
  gitk: make the "list references" default window width wider
  gitk: fix arrow keys in input fields with Tcl/Tk >= 8.6
  gitk: Use an external icon file on Windows
  gitk: Unicode file name support
  gitk(Windows): avoid inadvertently calling executables in the worktree
2025-02-20 05:59:56 -08:00
4a6cc6a20e Merge branch 'pks-meson-support' of https://github.com/pks-t/gitk
* 'pks-meson-support' of https://github.com/pks-t/gitk:
  gitk: introduce support for the Meson build system
  gitk: extract script to build executable

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-02-20 10:54:37 +01:00
9990b581fa Merge branch 'g4w-gitk' of https://github.com/dscho/gitk
* 'g4w-gitk' of https://github.com/dscho/gitk:
  gitk: make the "list references" default window width wider
  gitk: fix arrow keys in input fields with Tcl/Tk >= 8.6
  gitk: Use an external icon file on Windows
  gitk: Unicode file name support
  gitk(Windows): avoid inadvertently calling executables in the worktree

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-02-20 10:53:53 +01:00
b4c06f7c4d gitk: introduce support for the Meson build system
Upstream Git has introduced support for the Meson build system.
Introduce support for Meson into gitk, as well, so that Git can easily
build its vendored copy of Gitk via a `subproject()` directive. The
instructions can be set up as follows:

  $ meson setup build
  $ meson compile -C build
  $ meson install -C build

Specific options, like for example where Gitk shall be installed to, can
be specified at setup time via `-D`. Available options can be discovered
by running `meson configure` either in the source or build directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-02-20 08:52:15 +01:00
0d4fe3047f gitk: extract script to build executable
Extract the scrip that "builds" Gitk from our Makefile so that we can
reuse it in Meson.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-02-20 08:52:07 +01:00
cf7ee48190 agent: advertise OS name via agent capability
As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os)
i.e in the form "package/version-os" (e.g., "git/1.8.3.1-Linux").

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality. The operating system name is retrieved using the 'sysname'
field of the `uname(2)` system call or its equivalent.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

On Windows, uname(2) is not actually system-supplied but is instead
already faked up by Git itself. We could have overcome the test issue
on Windows by implementing a new `uname` subcommand in `test-tool`
using uname(2), but except uname(2), which would be tested against
itself, there would be nothing platform specific, so it's just simpler
to disable the tests on Windows.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-19 09:48:37 -08:00
0bf8d1b395 meson: fix Perl version check for Meson versions before 1.7.0
Command `perl --version` says, e.g., “This is perl 5, version 26,
subversion 0 (v5.26.0)”, which older versions of Meson interpret as
version 26.

This will be fixed in Meson 1.7.0, but at the time of writing that isn’t
yet released.

If we run `perl -V:version` we get the unambiguous response
“version='5.26.0';”, but we need at least Meson 1.5.0 to be able to do that.

Note that Perl are seriously considering dropping the leading 5 entirely
in the near future (https://perl.github.io/PPCs/ppc0025-perl-version/),
but that shouldn’t affect us.

Signed-off-by: Peter Oliver <git@mavit.org.uk>
Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-19 08:04:26 -08:00
d874d37837 meson: bump minimum required Perl version to 5.26.0
Commit 702d8c1f3b (Require Perl 5.26.0, 2024-10-23) dropped support
for Perl versions older than 5.26.0. The Meson build system, which
has been developed in parallel to that commit, hasn't been bumped
accordingly and thus still requires Perl 5.8.1 or newer.

Fix this by requiring Perl 5.26.0 or newer with Meson.

Signed-off-by: Peter Oliver <git@mavit.org.uk>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-19 08:04:11 -08:00
a554262210 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 15:30:33 -08:00
6d385fe277 Merge branch 'bc/contrib-thunderbird-patch-inline-fix'
A thunderbird helper script lost its bashism.

* bc/contrib-thunderbird-patch-inline-fix:
  thunderbird-patch-inline: avoid bashism
2025-02-18 15:30:33 -08:00
5dd710cb62 Merge branch 'lo/t7603-path-is-file-update'
Test clean-up.

* lo/t7603-path-is-file-update:
  t7603: replace test -f by test_path_is_file
2025-02-18 15:30:33 -08:00
716b00e6e9 Merge branch 'da/difftool-sans-the-repository'
"git difftool" code clean-up.

* da/difftool-sans-the-repository:
  difftool: eliminate use of USE_THE_REPOSITORY_VARIABLE
  difftool: eliminate use of the_repository
  difftool: eliminate use of global variables
2025-02-18 15:30:32 -08:00
7722b997c6 Merge branch 'jt/rev-list-missing-print-info'
"git rev-list --missing=" learned to accept "print-info" that gives
known details expected of the missing objects, like path and type.

* jt/rev-list-missing-print-info:
  rev-list: extend print-info to print missing object type
  rev-list: add print-info action to print missing object path
2025-02-18 15:30:32 -08:00
345aaf3976 Merge branch 'ps/send-pack-unhide-error-in-atomic-push'
"git push --atomic --porcelain" used to ignore failures from the
other side, losing the error status from the child process, which
has been corrected.

* ps/send-pack-unhide-error-in-atomic-push:
  send-pack: gracefully close the connection for atomic push
  t5543: atomic push reports exit code failure
  send-pack: new return code "ERROR_SEND_PACK_BAD_REF_STATUS"
  t5548: add porcelain push test cases for dry-run mode
  t5548: add new porcelain test cases
  t5548: refactor test cases by resetting upstream
  t5548: refactor to reuse setup_upstream() function
  t5504: modernize test by moving heredocs into test bodies
2025-02-18 15:30:32 -08:00
e565f37553 Merge branch 'ds/backfill'
Lazy-loading missing files in a blobless clone on demand is costly
as it tends to be one-blob-at-a-time.  "git backfill" is introduced
to help bulk-download necessary files beforehand.

* ds/backfill:
  backfill: assume --sparse when sparse-checkout is enabled
  backfill: add --sparse option
  backfill: add --min-batch-size=<n> option
  backfill: basic functionality and tests
  backfill: add builtin boilerplate
2025-02-18 15:30:31 -08:00
c1d6628c94 meson: wire up static analysis via Coccinelle
Wire up static analysis via Coccinelle via a new test target
"coccicheck". This target can be executed via `meson compile coccicheck`
and generates the semantic patch for us.

Note that we don't hardcode the list of source and header files that
shall be analyzed, and instead use git-ls-files(1) to find them for us.
This is because we also want to analyze files that may not get built on
the current platform, so finding all sources at configure time is easier
than introducing a new variable that tracks all sources, including those
which aren't being built.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:04 -08:00
e9e924e581 meson: wire up git-contacts(1)
Wire up the build for git-contacts(1) in Meson.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:04 -08:00
1cee22ebff meson: wire up credential helpers
We've got a couple of credential helpers in "contrib/credential", all
of which aren't yet wired up via Meson. Do so.

Note that ideally, we'd also wire up t0303 to be executed with each of
the credential helpers to verify their functionality. Unfortunately
though, none of them pass the test suite right now, so this is left for
a future change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:03 -08:00
3f22889276 contrib/credential: fix compilation of "osxkeychain" helper
The "osxkeychain" helper does not compile due to a warning generated by
the unused `argc` parameter. Fix the warning by checking for the minimum
number of required arguments explicitly in the least restrictive way
possible.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:03 -08:00
a47b8733b3 contrib/credential: fix compiling "libsecret" helper
The "libsecret" credential helper does not compile when developer
warnings are enabled due to three warnings:

    - contrib/credential/libsecret/git-credential-libsecret.c:78:1:
      missing initializer for field ‘reserved’ of ‘SecretSchema’
      [-Werror=missing-field-initializers]. This issue is fixed by using
      designated initializers.

    - contrib/credential/libsecret/git-credential-libsecret.c:171:43:
      comparison of integer expressions of different signedness: ‘int’
      and ‘guint’ {aka ‘unsigned int’} [-Werror=sign-compare]. This
      issue is fixed by using an unsigned variable to iterate through
      the string vector.

    - contrib/credential/libsecret/git-credential-libsecret.c:420:14:
      unused parameter ‘argc’ [-Werror=unused-parameter]. This issue is
      fixed by checking the number of arguments, but in the least
      restrictive way possible.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:03 -08:00
f8d95a323a contrib/credential: fix compilation of wincred helper with MSVC
The git-credential-wincred helper does not compile on Windows with
Microsoft Visual Studio because of our use of `__attribute__()`, which
its compiler doesn't support. While the rest of our codebase would know
to handle this because we redefine the macro in "compat/msvc.h", this
stub isn't available here because we don't include "git-compat-util.h"
in the first place.

Fix the issue by making the attribute depend on the `_MSC_VER`
preprocessor macro.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:03 -08:00
fd21e6e447 contrib/credential: fix "netrc" tests with out-of-tree builds
Tests of the "netrc" credential helper aren't prepared to handle
out-of-tree builds:

  - They expect the "test.pl" script to be located relative to the build
    directory, even though it is located in the source directory.

  - They expect the built "git-credential-netrc" helper to be located
    relative to the "test.pl" file, evne though it is loated in the
    build directory.

This works alright as long as source and build directories are the same,
but starts to break apart with Meson.

Fix these first issue by using the new "GIT_SOURCE_DIR" variable to
locate the test script itself. And fix the second issue by introducing a
new environment variable "CREDENTIAL_NETRC_PATH" that can be set for
out-of-tree builds to locate the built credential helper.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:03 -08:00
c5823641a6 GIT-BUILD-OPTIONS: propagate project's source directory
A couple of our tests require knowledge around where to find the
project's source directory in order to locate files required for the
test itself. Until now we have been wiring these up ad-hoc via new,
specialized variables catered to the specific usecase. This is quite
awkward though, as every test that potentially needs to locate paths
relative to the source directory needs to grow another variable.

Introduce a new "GIT_SOURCE_DIR" variable into GIT-BUILD-OPTIONS to stop
this proliferation. Remove existing variables that can be derived from
it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:02 -08:00
36463e32df promisor-remote: check advertised name or URL
A previous commit introduced a "promisor.acceptFromServer" configuration
variable with only "None" or "All" as valid values.

Let's introduce "KnownName" and "KnownUrl" as valid values for this
configuration option to give more choice to a client about which
promisor remotes it might accept among those that the server advertised.

In case of "KnownName", the client will accept promisor remotes which
are already configured on the client and have the same name as those
advertised by the client. This could be useful in a corporate setup
where servers and clients are trusted to not switch names and URLs, but
where some kind of control is still useful.

In case of "KnownUrl", the client will accept promisor remotes which
have both the same name and the same URL configured on the client as the
name and URL advertised by the server. This is the most secure option,
so it should be used if possible.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:05:37 -08:00
d460267613 Add 'promisor-remote' capability to protocol v2
When a server S knows that some objects from a repository are available
from a promisor remote X, S might want to suggest to a client C cloning
or fetching the repo from S that C may use X directly instead of S for
these objects.

Note that this could happen both in the case S itself doesn't have the
objects and borrows them from X, and in the case S has the objects but
knows that X is better connected to the world (e.g., it is in a
$LARGEINTERNETCOMPANY datacenter with petabit/s backbone connections)
than S. Implementation of the latter case, which would require S to
omit in its response the objects available on X, is left for future
improvement though.

Then C might or might not, want to get the objects from X. If S and C
can agree on C using X directly, S can then omit objects that can be
obtained from X when answering C's request.

To allow S and C to agree and let each other know about C using X or
not, let's introduce a new "promisor-remote" capability in the
protocol v2, as well as a few new configuration variables:

  - "promisor.advertise" on the server side, and:
  - "promisor.acceptFromServer" on the client side.

By default, or if "promisor.advertise" is set to 'false', a server S will
not advertise the "promisor-remote" capability.

If S doesn't advertise the "promisor-remote" capability, then a client C
replying to S shouldn't advertise the "promisor-remote" capability
either.

If "promisor.advertise" is set to 'true', S will advertise its promisor
remotes with a string like:

  promisor-remote=<pr-info>[;<pr-info>]...

where each <pr-info> element contains information about a single
promisor remote in the form:

  name=<pr-name>[,url=<pr-url>]

where <pr-name> is the urlencoded name of a promisor remote and
<pr-url> is the urlencoded URL of the promisor remote named <pr-name>.

For now, the URL is passed in addition to the name. In the future, it
might be possible to pass other information like a filter-spec that the
client may use when cloning from S, or a token that the client may use
when retrieving objects from X.

It is C's responsibility to arrange how it can reach X though, so pieces
of information that are usually outside Git's concern, like proxy
configuration, must not be distributed over this protocol.

It might also be possible in the future for "promisor.advertise" to have
other values. For example a value like "onlyName" could prevent S from
advertising URLs, which could help in case C should use a different URL
for X than the URL S is using. (The URL S is using might be an internal
one on the server side for example.)

By default or if "promisor.acceptFromServer" is set to "None", C will
not accept to use the promisor remotes that might have been advertised
by S. In this case, C will not advertise any "promisor-remote"
capability in its reply to S.

If "promisor.acceptFromServer" is set to "All" and S advertised some
promisor remotes, then on the contrary, C will accept to use all the
promisor remotes that S advertised and C will reply with a string like:

  promisor-remote=<pr-name>[;<pr-name>]...

where the <pr-name> elements are the urlencoded names of all the
promisor remotes S advertised.

In a following commit, other values for "promisor.acceptFromServer" will
be implemented, so that C will be able to decide the promisor remotes it
accepts depending on the name and URL it received from S. So even if
that name and URL information is not used much right now, it will be
needed soon.

Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:05:37 -08:00
a620046b29 diff: don't crash with empty argument to -G or -S
The pickaxe options, -G and -S, need either a regex or a string to look
through the history for.  An empty value isn't very useful since it
would either match everything or nothing, and what's worse, we presently
crash with a BUG like so when the user provides one:

    BUG: diffcore-pickaxe.c:241: should have needle under -G or -S

Since it's not very nice of us to crash and this wouldn't do anything
useful anyway, let's simply inform the user that they must provide a
non-empty argument and exit with an error if they provide an empty one
instead.

Reported-by: Jared Van Bortel <cebtenzzre@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 10:17:02 -08:00
c2d96bc42c doc: use 'title' consistently
The first line of a commit message is variously called 'title' or
'subject'.

Prefer 'title' unless discussing email.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:56:00 -08:00
6a9ae81015 merge-tree: fix link formatting in html docs
In the html documentation the link to the "OUTPUT" section is surrounded
by square brackets. Fix this by adding explicit link text to the cross
reference.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:40 -08:00
3e681a7ccc merge-tree: improve docs for --stdin
Add a section for --stdin in the list of options and document that it
implies -z so readers know how to parse the output. Also correct the
merge status documentation for --stdin as if the status is less than
zero "git merge-tree" dies before printing it.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:40 -08:00
54cf5d2da8 merge-tree: only use basic merge config
Commit 9c93ba4d0a (merge-recursive: honor diff.algorithm, 2024-07-13)
replaced init_merge_options() with init_basic_merge_config() for use in
plumbing commands and init_ui_merge_config() for use in porcelain
commands. As "git merge-tree" is a plumbing command it should call
init_basic_merge_config() rather than init_ui_merge_config(). The merge
ort machinery ignores "diff.algorithm" so the behavior is unchanged by
this commit but it future proofs us against any future changes to
init_ui_merge_config().

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:39 -08:00
1b0e5f4499 merge-tree: remove redundant code
real_merge() only ever returns "0" or "1" as it dies if the merge status
is less than zero. Therefore the check for "result < 0" is redundant and
the result variable is not needed. The return value of real_merge() is
ignored because exit status of "git merge-tree --stdin" is "0" for both
successful and conflicted merges (the status of each merge is written to
stdout). The return type of real_merge() is not changed as it is used
for the program's exit status when "--stdin" is not given.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:39 -08:00
344a107b55 merge-tree --stdin: flush stdout to avoid deadlock
If a process tries to read the output from "git merge-tree --stdin"
before it closes merge-tree's stdin then it deadlocks. This happens
because merge-tree does not flush its output before trying to read
another line of input and means that it is not possible to cherry-pick a
sequence of commits using "git merge-tree --stdin". Fix this by calling
maybe_flush_or_die() before trying to read the next line of
input. Flushing the output after each merge does not seem to affect the
performance, any difference is lost in the noise even after increasing
the number of runs.

$ git rev-list --merges --parents -n100 origin/master |
	sed 's/^[^ ]* //' >/tmp/merges
$ hyperfine -L flush 0,1 --warmup 1 --runs 30 \
	'GIT_FLUSH={flush} ./git merge-tree --stdin </tmp/merges'
Benchmark 1: GIT_FLUSH=0 ./git merge-tree --stdin </tmp/merges
  Time (mean ± σ):     546.6 ms ±  11.7 ms    [User: 503.2 ms, System: 40.9 ms]
  Range (min … max):   535.9 ms … 567.7 ms    30 runs

Benchmark 2: GIT_FLUSH=1 ./git merge-tree --stdin </tmp/merges
  Time (mean ± σ):     546.9 ms ±  12.0 ms    [User: 505.9 ms, System: 38.9 ms]
  Range (min … max):   529.8 ms … 570.0 ms    30 runs

Summary
  'GIT_FLUSH=0 ./git merge-tree --stdin </tmp/merges' ran
    1.00 ± 0.03 times faster than 'GIT_FLUSH=1 ./git merge-tree --stdin </tmp/merges'

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:39 -08:00
044b6f04f2 refspec: clarify function naming and documentation
Rename `match_name_with_pattern()` to `match_refname_with_pattern()` to
better reflect its purpose and improve documentation comment clarity.
The previous function name and parameter names were inconsistent, making
it harder to understand their roles in refspec matching.

- Rename parameters:
  - `key` -> `pattern` (globbing pattern to match)
  - `name` -> `refname` (refname to check)
  - `value` -> `replacement` (replacement mapping pattern)

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:44:27 -08:00
15ff206863 t5701: add setup test to remove side-effect dependency
Currently, the "test capability advertisement" test creates some files
with expected content which are used by other tests below it.

To remove that side-effect from this test, let's split up part of
it into a "setup"-type test which creates the files with expected content
which gets reused by multiple tests. This will be useful in a following
commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:05:13 -08:00
6aa09fd872 version: extend get_uname_info() to hide system details
Currently, get_uname_info() function provides the full OS information.
In a following commit, we will need it to provide only the OS name.

Let's extend it to accept a "full" flag that makes it switch between
providing full OS information and providing only the OS name.

We may need to refactor this function in the future if an
`osVersion.format` is added.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:05:12 -08:00
0a78d61247 version: refactor get_uname_info()
Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:05:12 -08:00
cdfd081df6 version: refactor redact_non_printables()
The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up with the protocol or with the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's use strbuf_detach() to explicitly detach the string
contained by the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:05:12 -08:00
0c124cba54 version: replace manual ASCII checks with isprint() for clarity
Since the isprint() function checks for printable characters, let's
replace the existing hardcoded ASCII checks with it. However, since
the original checks also handled spaces, we need to account for spaces
explicitly in the new check.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:05:12 -08:00
5309c1e9fb Makefile: set default goals in makefiles
Explicitly set the default goal at the very top of various makefiles.
This is already present in some makefiles, but not all of them.

In particular, this corrects a regression introduced in a38edab7c8
(Makefile: generate doc versions via GIT-VERSION-GEN, 2024-12-06).  That
commit added some config files as build targets for the Documentation
directory, and put the target configuration in a sensible place.
Unfortunately, that sensible place was above any other build target
definitions, meaning the default goal changed to being those
configuration files only, rather than the HTML and man page
documentation.

Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Helped-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:02:26 -08:00
0394451348 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-14 17:53:49 -08:00
60cb8e79cb Merge branch 'ps/doc-http-upload-archive-service'
Doc update.

* ps/doc-http-upload-archive-service:
  doc: documentation for http.uploadarchive config option
2025-02-14 17:53:49 -08:00
82522a9e2c Merge branch 'kn/reflog-migration-fix-followup'
Code clean-up.

* kn/reflog-migration-fix-followup:
  reftable: prevent 'update_index' changes after adding records
  refs: use 'uint64_t' for 'ref_update.index'
  refs: mark `ref_transaction_update_reflog()` as static
2025-02-14 17:53:48 -08:00
c3fffcfe8e Merge branch 'bf/fetch-set-head-fix'
Fetching into a bare repository incorrectly assumed it always used
a mirror layout when deciding to update remote-tracking HEAD, which
has been corrected.

* bf/fetch-set-head-fix:
  fetch set_head: fix non-mirror remotes in bare repositories
  fetch set_head: refactor to use remote directly
2025-02-14 17:53:48 -08:00
09e74b06ea Merge branch 'op/worktree-is-main-bare-fix'
Going into a secondary worktree and asking "is the main worktree
bare?" did not work correctly when per-worktree configuration
option was in use, which has been corrected.

* op/worktree-is-main-bare-fix:
  worktree: detect from secondary worktree if main worktree is bare
2025-02-14 17:53:48 -08:00
5785d9143b Merge branch 'tc/clone-single-revision'
"git clone" learned to make a shallow clone for a single commit
that is not necessarily be at the tip of any branch.

* tc/clone-single-revision:
  builtin/clone: teach git-clone(1) the --revision= option
  parse-options: introduce die_for_incompatible_opt2()
  clone: introduce struct clone_opts in builtin/clone.c
  clone: add tags refspec earlier to fetch refspec
  clone: refactor wanted_peer_refs()
  clone: make it possible to specify --tags
  clone: cut down on global variables in clone.c
2025-02-14 17:53:48 -08:00
0cc13007e5 Merge branch 'bc/doc-adoc-not-txt'
All the documentation .txt files have been renamed to .adoc to help
content aware editors.

* bc/doc-adoc-not-txt:
  Remove obsolete ".txt" extensions for AsciiDoc files
  doc: use .adoc extension for AsciiDoc files
  gitattributes: mark AsciiDoc files as LF-only
  editorconfig: add .adoc extension
  doc: update gitignore for .adoc extension
2025-02-14 17:53:47 -08:00
0d03fda6a5 config/remote.txt: improve wording for 'remote.<name>.followRemoteHEAD'
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-14 14:09:36 -08:00
aaf8f79c67 config/remote.txt: reunite 'severOption' description paragraphs
When 'remote.<name>.followRemoteHEAD' was added in b7f7d16562 (fetch:
add configuration for set_head behaviour, 2024-11-29), its description
was added to remote.txt in between the two paragraphs describing
'remote.<name>.serverOption'. Reunite these two paragraphs.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-14 14:09:36 -08:00
b07dd9078b merge-recursive: optimize time complexity for process_renames
Avoid O(n^2) complexity in `process_renames()` when building a sorted
`string_list` by constructing it unsorted and sorting it afterward,
reducing the complexity to O(n log n).

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-13 21:33:00 -08:00
f23179924b Makefile: remove accidental recipe prefix in conditional
Back in 728b9ac0c3 (Makefile(s): avoid recipe prefix in conditional
statements, 2024-04-08), we prepared our Makefiles for a forthcoming
change in upstream Make that would ban the recipe prefix within a
conditional statement by replacing tabs (the prefix) with eight spaces.

In b9d6f64393 (compat/zlib: allow use of zlib-ng as backend,
2025-01-28), a handful of recipe prefix characters were introduced in a
conditional statement ('ifdef ZLIB_NG'), causing 'make' to fail on my
system, which uses GNU Make 4.4.90.

Remove the recipe prefix characters by replacing them with the same
script as is mentioned in 728b9ac0c3.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-13 13:16:34 -08:00
e2067b49ec The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-12 10:09:08 -08:00
2d7a874493 Merge branch 'da/help-autocorrect-one-fix'
"git -c help.autocorrect=0 psuh" shows the suggested typofix,
unlike the previous attempt in the base topic.

* da/help-autocorrect-one-fix:
  help: add "show" as a valid configuration value
  help: show the suggested command when help.autocorrect is false
2025-02-12 10:08:55 -08:00
39de0ffbe3 Merge branch 'sc/help-autocorrect-one'
"[help] autocorrect = 1" used to be a way to say "please wait for
0.1 second after suggesting a typofix of the command name before
running that command"; now it means "yes, if there is a plausible
typofix for the command name, please run it immediately".

* sc/help-autocorrect-one:
  help: interpret boolean string values for help.autocorrect
2025-02-12 10:08:55 -08:00
0a99ffb4d6 Merge branch 'ms/remote-valid-remote-name'
Code shuffling.

* ms/remote-valid-remote-name:
  remote: relocate valid_remote_name
2025-02-12 10:08:54 -08:00
998c5f0c75 Merge branch 'ms/refspec-cleanup'
Code clean-up.  cf. <Z6G-toOJjMmK8iJG@pks.im>

* ms/refspec-cleanup:
  refspec: relocate apply_refspecs and related funtions
  refspec: relocate matching related functions
  remote: rename query_refspecs functions
  refspec: relocate refname_matches_negative_refspec_item
  remote: rename function omit_name_by_refspec
2025-02-12 10:08:54 -08:00
791677a5dd Merge branch 'jp/doc-trailer-config'
Documentaiton updates.

* jp/doc-trailer-config:
  config.txt: add trailer.* variables
2025-02-12 10:08:54 -08:00
5b9d01bc4d Merge branch 'zh/gc-expire-to'
"git gc" learned the "--expire-to" option and passes it down to
underlying "git repack".

* zh/gc-expire-to:
  gc: add `--expire-to` option
2025-02-12 10:08:53 -08:00
a4af0b6288 Merge branch 'js/libgit-rust'
Foreign language interface for Rust into our code base has been added.

* js/libgit-rust:
  libgit: add higher-level libgit crate
  libgit-sys: also export some config_set functions
  libgit-sys: introduce Rust wrapper for libgit.a
  common-main: split init and exit code into new files
2025-02-12 10:08:53 -08:00
3f3fd0f346 Merge branch 'ac/t5401-use-test-path-is-file'
Test clean-up.

* ac/t5401-use-test-path-is-file:
  t5401: prefer test_path_is_* helper function
2025-02-12 10:08:52 -08:00
9865ef2457 Merge branch 'ac/t6423-unhide-git-exit-status'
Test clean-up.

* ac/t6423-unhide-git-exit-status:
  t6423: fix suppression of Git’s exit code in tests
2025-02-12 10:08:52 -08:00
07c401d392 Merge branch 'ps/repack-keep-unreachable-in-unpacked-repo'
"git repack --keep-unreachable" to send unreachable objects to the
main pack "git repack -ad" produces did not work when there is no
existing packs, which has been corrected.

* ps/repack-keep-unreachable-in-unpacked-repo:
  builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-12 10:08:52 -08:00
aae91a86fb Merge branch 'ds/name-hash-tweaks'
"git pack-objects" and its wrapper "git repack" learned an option
to use an alternative path-hash function to improve delta-base
selection to produce a packfile with deeper history than window
size.

* ds/name-hash-tweaks:
  pack-objects: prevent name hash version change
  test-tool: add helper for name-hash values
  p5313: add size comparison test
  pack-objects: add GIT_TEST_NAME_HASH_VERSION
  repack: add --name-hash-version option
  pack-objects: add --name-hash-version option
  pack-objects: create new name-hash function version
2025-02-12 10:08:51 -08:00
a3b56f5f43 xdiff: avoid signed vs. unsigned comparisons in xutils.c
The comparisons all involve comparisons against unsigned values.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-12 09:41:17 -08:00
13b67f15c1 xdiff: avoid signed vs. unsigned comparisons in xpatience.c
The loop iteration variable is non-negative and used in comparisons
against a size_t value. Use size_t to eliminate the mismatch.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-12 09:41:17 -08:00
2dc6cf247e xdiff: avoid signed vs. unsigned comparisons in xhistogram.c
The comparisons all involve unsigned variables. Cast the comparison
to unsigned to eliminate the mismatch.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-12 09:41:16 -08:00
46fb084353 xdiff: avoid signed vs. unsigned comparisons in xemit.c
The unsigned `ignored` variable causes expressions to promote to
unsigned. Use a signed value to make comparisons use the same types.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-12 09:41:16 -08:00
0d31bab479 xdiff: avoid signed vs. unsigned comparisons in xdiffi.c
The loop iteration variable is non-negative and only used in comparisons
against other size_t values.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-12 09:41:16 -08:00
9d16f89584 xdiff: move sign comparison warning guard into each file
Allow each file to fix the warnings guarded by the macro separately by
moving the definition from the shared xinclude.h into each file that
needs it.

xmerge.c and xprepare.c do not contain any signed vs. unsigned
comparisons so the definition was not included in these files.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-12 09:41:15 -08:00
45761988ac doc: clarify the intent of the renormalize option in the merge machinery
The -X renormalize (or merge.renormalize config) option is intended to
reduce conflicts due to normalization of newer versions of history.  It
does so by renormalizing files that it is about to do a three-way
content merge on.  Some folks thought it would renormalize all files
throughout the tree, and the previous wording wasn't clear enough to
dispell that misconception.  Update the docs to make it clear that the
merge machinery will only apply renormalization to files which need a
three-way content merge.

(Technically, the merge machinery also does renormalization on
modify/delete conflicts, in order to see if the modification was merely
a normalization; if so, it can accept the delete and not report a
conflict.  But it's not clear that this piece needs to be explained to
users, and trying to distinguish it might feel like splitting hairs and
overcomplicating the explanation, so we leave it out.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-11 13:34:36 -08:00
832f56f06a doc: centrally document various ways tospell true and false
We do not seem to centrally document exhaustively ways to spell
Boolean values.

The description in the Environment Variables of git(1) section
assumes that the reader is already familiar with how "Boolean valued
configuration variables" are specified, without referring to
anything, so there is no way for the readers to find out more.

The description of `bool` in the section on "--type
<type>" in "git config --help" might be the place to do so, but it
is not telling us all that much.

The description of Boolean valued placeholders in the pretty formats
section of "git log --help" enumerates the possible values with "etc."
implying there may be other synonyms; shrink the list of samples and
instead refer to the canonical and authoritative source of truth, which
now is git-config(1).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-11 10:12:04 -08:00
af8fc7be10 rebase -i: reword empty commit after fast-forward
When rebase rewords a commit it picks the commit and then runs "git
commit --amend" to reword it. When the commit is picked the sequencer
tries to reuse existing commits by fast-forwarding if the parents are
unchanged. Rewording an empty commit that has been fast-forwarded fails
because "git commit --amend" is called without "--allow-empty". This
happens because when a commit is fast-forwarded the logic that checks
whether we should pass "--allow-empty" is skipped. Fix this by always
passing "--allow-empty" when rewording a commit. This is safe because we
are amending a commit that has already been picked so if it had become
empty when it was picked we'd have already returned an error.

As "git commit" will happily create empty merge commits without
"--allow-empty" we do not need to pass that flag when rewording merge
commits.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-11 09:50:53 -08:00
62898b8f5e builtin/update-server-info: remove the_repository global variable
Remove the_repository global variable in favor of the repository
argument that gets passed in "builtin/update-server-info.c".

When `-h` is passed to the command outside a Git repository, the
`run_builtin()` will call the `cmd_update_server_info()` function
with `repo` set to NULL and then early in the function, "parse_options()"
call will give the options help and exit, without having to consult much
of the configuration file. So it is safe to omit reading the config when
`repo` argument the caller gave us is NULL.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-10 16:20:21 -08:00
59d26bd961 thunderbird-patch-inline: avoid bashism
The use of "echo -e" is not portable and not specified by POSIX.  dash
does not support any options except "-n", and so this script will not
work on operating systems which use that as /bin/sh.

Fortunately, the solution is easy: switch to printf(1), which is
specified by POSIX and allows the escape sequences we want to use.  This
will allow the script to work with any POSIX shell.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-10 16:16:19 -08:00
388218fac7 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-10 10:18:32 -08:00
50e1821529 Merge branch 'jk/ci-coverity-update'
CI update to make Coverity job work again.

* jk/ci-coverity-update:
  ci: set CI_JOB_IMAGE for coverity job
2025-02-10 10:18:31 -08:00
6f0b72205d Merge branch 'sk/unit-tests-0130'
Convert a handful of unit tests to work with the clar framework.

* sk/unit-tests-0130:
  t/unit-tests: convert strcmp-offset test to use clar test framework
  t/unit-tests: convert strbuf test to use clar test framework
  t/unit-tests: adapt example decorate test to use clar test framework
  t/unit-tests: convert hashmap test to use clar test framework
2025-02-10 10:18:31 -08:00
246569bf83 Merge branch 'ps/hash-cleanup'
Further code clean-up on the use of hash functions.  Now the
context object knows what hash function it is working with.

* ps/hash-cleanup:
  global: adapt callers to use generic hash context helpers
  hash: provide generic wrappers to update hash contexts
  hash: stop typedeffing the hash context
  hash: convert hashing context to a structure
2025-02-10 10:18:31 -08:00
0ca6b46d7c Merge branch 'jt/gitlab-ci-base-fix'
Two CI tasks, whitespace check and style check, work on the
difference from the base version and the version being checked, but
the base was computed incorrectly in GitLab CI in some cases, which
has been corrected.

* jt/gitlab-ci-base-fix:
  ci: fix base commit fallback for check-whitespace and check-style
2025-02-10 10:18:30 -08:00
34736ff48e Merge branch 'pw/apply-ulong-overflow-check'
"git apply" internally uses unsigned long for line numbers and uses
strtoul() to parse numbers on the hunk headers.  It however forgot
to check parse errors.

* pw/apply-ulong-overflow-check:
  apply: detect overflow when parsing hunk header
2025-02-10 10:18:30 -08:00
442b7e0018 Merge branch 'ps/setup-reinit-fixes'
"git init" to reinitialize a repository that already exists cannot
change the hash function and ref backends; such a request is
silently ignored now.

* ps/setup-reinit-fixes:
  setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH
  setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
  t0001: remove duplicate test
2025-02-10 10:18:29 -08:00
f1cc562b77 t7603: replace test -f by test_path_is_file
`test_path_is_file` provides a better output when asserting whether a
file exists. Replace the occurrences of `test -f` in t7603 with it,
facilitating the trace of possible test failures.

Signed-off-by: Lucas Oshiro <lucasseikioshiro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-10 08:03:30 -08:00
07242c2a5a path: drop git_common_path() in favor of repo_common_path()
Remove `git_common_path()` in favor of the `repo_common_path()` family
of functions, which makes the implicit dependency on `the_repository` go
away.

Note that `git_common_path()` used to return a string allocated via
`get_pathname()`, which uses a rotating set of statically allocated
buffers. Consequently, callers didn't have to free the returned string.
The same isn't true for `repo_common_path()`, so we also have to add
logic to free the returned strings.

This refactoring also allows us to remove `repo_common_pathv()` from the
public interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:23 -08:00
8e4710f011 worktree: return allocated string from get_worktree_git_dir()
The `get_worktree_git_dir()` function returns a string constant that
does not need to be free'd by the caller. This string is computed for
three different cases:

  - If we don't have a worktree we return a path into the Git directory.
    The returned string is owned by `the_repository`, so there is no
    need for the caller to free it.

  - If we have a worktree, but no worktree ID then the caller requests
    the main worktree. In this case we return a path into the common
    directory, which again is owned by `the_repository` and thus does
    not need to be free'd.

  - In the third case, where we have an actual worktree, we compute the
    path relative to "$GIT_COMMON_DIR/worktrees/". This string does not
    need to be released either, even though `git_common_path()` ends up
    allocating memory. But this doesn't result in a memory leak either
    because we write into a buffer returned by `get_pathname()`, which
    returns one out of four static buffers.

We're about to drop `git_common_path()` in favor of `repo_common_path()`,
which doesn't use the same mechanism but instead returns an allocated
string owned by the caller. While we could adapt `get_worktree_git_dir()`
to also use `get_pathname()` and print the derived common path into that
buffer, the whole schema feels a lot like premature optimization in this
context. There are some callsites where we call `get_worktree_git_dir()`
in a loop that iterates through all worktrees. But none of these loops
seem to be even remotely in the hot path, so saving a single allocation
there does not feel worth it.

Refactor the function to instead consistently return an allocated path
so that we can start using `repo_common_path()` in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:23 -08:00
3859e39659 path: drop git_path_buf() in favor of repo_git_path_replace()
Remove `git_path_buf()` in favor of `repo_git_path_replace()`. The
latter does essentially the same, with the only exception that it does
not rely on `the_repository` but takes the repo as separate parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:22 -08:00
bba59f58a4 path: drop git_pathdup() in favor of repo_git_path()
Remove `git_pathdup()` in favor of `repo_git_path()`. The latter does
essentially the same, with the only exception that it does not rely on
`the_repository` but takes the repo as separate parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:22 -08:00
7f17900b5b path: drop unused strbuf_git_path() function
The `strbuf_git_path()` function isn't used anywhere, and neither should
it grow any callers because it depends on `the_repository`. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:22 -08:00
f5c714e2a7 path: refactor repo_submodule_path() family of functions
As explained in an earlier commit, we're refactoring path-related
functions to provide a consistent interface for computing paths into the
commondir, gitdir and worktree. Refactor the "submodule" family of
functions accordingly.

Note that in contrast to the other `repo_*_path()` families, we have to
pass in the repository as a non-constant pointer. This is because we end
up calling `repo_read_gitmodules()` deep down in the callstack, which
may end up modifying the repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:22 -08:00
f9467895d8 submodule: refactor submodule_to_gitdir() to accept a repo
The `submodule_to_gitdir()` function implicitly uses `the_repository` to
resolve submodule paths. Refactor the function to instead accept a repo
as parameter to remove the dependency on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:21 -08:00
93a8cfaf3c path: refactor repo_worktree_path() family of functions
As explained in an earlier commit, we're refactoring path-related
functions to provide a consistent interface for computing paths into the
commondir, gitdir and worktree. Refactor the "worktree" family of
functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:21 -08:00
bdfc07bfdf path: refactor repo_git_path() family of functions
As explained in an earlier commit, we're refactoring path-related
functions to provide a consistent interface for computing paths into the
commondir, gitdir and worktree. Refactor the "gitdir" family of
functions accordingly.

Note that the `repo_git_pathv()` function is converted into an internal
implementation detail. It is only used to implement `the_repository`
compatibility shims and will eventually be removed from the public
interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:21 -08:00
70a16ff8a1 path: refactor repo_common_path() family of functions
The functions provided by the "path" subsystem to derive repository
paths for the commondir, gitdir, worktrees and submodules are quite
inconsistent. Some functions have a `strbuf_` prefix, others have
different return values, some don't provide a variant working on top of
`strbuf`s.

We're thus about to refactor all of these family of functions so that
they follow a common pattern:

  - `repo_*_path()` returns an allocated string.

  - `repo_*_path_append()` appends the path to the caller-provided
    buffer while returning a constant pointer to the buffer. This
    clarifies whether the buffer is being appended to or rewritten,
    which otherwise wasn't immediately obvious.

  - `repo_*_path_replace()` replaces contents of the buffer with the
    computed path, again returning a pointer to the buffer contents.

The returned constant pointer isn't being used anywhere yet, but it will
be used in subsequent commits. Its intent is to allow calling patterns
like the following somewhat contrived example:

    if (!stat(&st, repo_common_path_replace(repo, &buf, ...)) &&
        !unlink(repo_common_path_replace(repo, &buf, ...)))
            ...

Refactor the commondir family of functions accordingly and adapt all
callers.

Note that `repo_common_pathv()` is converted into an internal
implementation detail. It is only used to implement `the_repository`
compatibility shims and will eventually be removed from the public
interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:21 -08:00
9520f7d998 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 14:56:45 -08:00
5f338eae76 Merge branch 'ps/leakfixes-0129'
A few more leakfixes.

* ps/leakfixes-0129:
  scalar: free result of `remote_default_branch()`
  unix-socket: fix memory leak when chdir(3p) fails
2025-02-06 14:56:45 -08:00
9d0e81e2ae Merge branch 'ps/zlib-ng'
The code paths to interact with zlib has been cleaned up in
preparation for building with zlib-ng.

* ps/zlib-ng:
  ci: make "linux-musl" job use zlib-ng
  ci: switch linux-musl to use Meson
  compat/zlib: allow use of zlib-ng as backend
  git-zlib: cast away potential constness of `next_in` pointer
  compat/zlib: provide stubs for `deflateSetHeader()`
  compat/zlib: provide `deflateBound()` shim centrally
  git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"
  compat: introduce new "zlib.h" header
  git-compat-util: drop `z_const` define
  compat: drop `uncompress2()` compatibility shim
2025-02-06 14:56:45 -08:00
9fad473fae Merge branch 'js/bundle-unbundle-fd-reuse-fix'
The code path used when "git fetch" fetches from a bundle file
closed the same file descriptor twice, which sometimes broke things
unexpectedly when the file descriptor was reused, which has been
corrected.

* js/bundle-unbundle-fd-reuse-fix:
  bundle: avoid closing file descriptor twice
2025-02-06 14:56:44 -08:00
2bf3c7fab1 Merge branch 'ps/ci-misc-updates'
CI updates (containerization, dropping stale ones, etc.).

* ps/ci-misc-updates:
  ci: remove stale code for Azure Pipelines
  ci: use latest Ubuntu release
  ci: stop special-casing for Ubuntu 16.04
  gitlab-ci: add linux32 job testing against i386
  gitlab-ci: remove the "linux-old" job
  github: simplify computation of the job's distro
  github: convert all Linux jobs to be containerized
  github: adapt containerized jobs to be rootless
  t7422: fix flaky test caused by buffered stdout
  t0060: fix EBUSY in MinGW when setting up runtime prefix
2025-02-06 14:56:44 -08:00
7c2f291943 difftool: eliminate use of USE_THE_REPOSITORY_VARIABLE
Remove the USE_THE_REPOSITORY_VARIABLE #define now that all
state is passed to each function from callers.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 13:00:21 -08:00
a24953f3df difftool: eliminate use of the_repository
Make callers pass a repository struct into each function instead
of relying on the global the_repository variable.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 13:00:21 -08:00
8241ae63d8 difftool: eliminate use of global variables
Move difftool's global variables into a difftools_option struct
in preparation for removal of USE_THE_REPOSITORY_VARIABLE.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 13:00:21 -08:00
dd1eb665ef doc: documentation for http.uploadarchive config option
In Git v2.44.0 support for 'git archive' over HTTP protocol
was added, but it was nowhere documented how it should be
enabled in git-http-backend.

Add missing documentation.

Signed-off-by: Piotr Szlazak <piotr.szlazak@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:33:14 -08:00
337855629f builtin/clone: teach git-clone(1) the --revision= option
The git-clone(1) command has the option `--branch` that allows the user
to select the branch they want HEAD to point to. In a non-bare
repository this also checks out that branch.

Option `--branch` also accepts a tag. When a tag name is provided, the
commit this tag points to is checked out and HEAD is detached. Thus
`--branch` can be used to clone a repository and check out a ref kept
under `refs/heads` or `refs/tags`. But some other refs might be in use
as well. For example Git forges might use refs like `refs/pull/<id>` and
`refs/merge-requests/<id>` to track pull/merge requests. These refs
cannot be selected upon git-clone(1).

Add option `--revision` to git-clone(1). This option accepts a fully
qualified reference, or a hexadecimal commit ID. This enables the user
to clone and check out any revision they want. `--revision` can be used
in conjunction with `--depth` to do a minimal clone that only contains
the blob and tree for a single revision. This can be useful for
automated tests running in CI systems.

Using option `--branch` and `--single-branch` together is a similar
scenario, but serves a different purpose. Using these two options, a
singlet remote tracking branch is created and the fetch refspec is set
up so git-fetch(1) will receive updates on that branch from the remote.
This allows the user work on that single branch.

Option `--revision` on contrary detaches HEAD, creates no tracking
branches, and writes no fetch refspec.

Signed-off-by: Toon Claes <toon@iotcl.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
[jc: removed unnecessary TEST_PASSES_SANITIZE_LEAK from the test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:26:42 -08:00
9144b9362b parse-options: introduce die_for_incompatible_opt2()
The functions die_for_incompatible_opt3() and
die_for_incompatible_opt4() already exist to die whenever a user
specifies three or four options respectively that are not compatible.

Introduce die_for_incompatible_opt2() which dies when two options that
are incompatible are set.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:54 -08:00
7a52a8c7d8 clone: introduce struct clone_opts in builtin/clone.c
There is a lot of state stored in global variables in builtin/clone.c.
In the long run we'd like to remove many of those.

Introduce `struct clone_opts` in this file. This struct will be used to
contain all details needed to perform the clone. The struct object can
be thrown around to all the functions that need these details.

The first field we're adding is `wants_head`. In some scenarios
(specifically when both `--single-branch` and `--branch` are given) we
are not interested in `HEAD` on the remote. The field `wants_head` in
`struct clone_opts` will hold this information. We could have put
`option_branch` and `option_single_branch` into that struct instead, but
in a following commit we'll be using `wants_head` as well.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:54 -08:00
bc26f7690a clone: make it possible to specify --tags
Option --no-tags was added in 0dab2468ee (clone: add a --no-tags option
to clone without tags, 2017-04-26). At the time there was no need to
support --tags as well, although there was some conversation about
it[1].

To simplify the code and to prepare for future commits, invert the flag
internally. Functionally there is no change, because the flag is
default-enabled passing `--tags` has no effect, so there's no need to
add tests for this.

[1]: https://lore.kernel.org/git/CAGZ79kbHuMpiavJ90kQLEL_AR0BEyArcZoEWAjPPhOFacN16YQ@mail.gmail.com/

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
2ca67c6f14 clone: add tags refspec earlier to fetch refspec
In clone.c we call refspec_ref_prefixes() to copy the fetch refspecs
from the `remote->fetch` refspec into `ref_prefixes` of
`transport_ls_refs_options`. Afterwards we add the tags prefix
`refs/tags/` prefix as well. At a later point, in wanted_peer_refs() we
process refs using both `remote->fetch` and `TAG_REFSPEC`.

Simplify the code by appending `TAG_REFSPEC` to `remote->fetch` before
calling refspec_ref_prefixes().

To be able to do this, we set `option_tags` to 0 when --mirror is given.
This is because --mirror mirrors (hence the name) all the refs,
including tags and they do not need to be treated separately.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
7f420a6bda clone: cut down on global variables in clone.c
In clone.c the `struct option` which is used to parse the input options
for git-clone(1) is a global variable. Due to this, many variables that
are used to parse the value into, are also global.

Make `builtin_clone_options` a local variable in cmd_clone() and carry
along all variables that are only used in that function.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
879780f9a1 clone: refactor wanted_peer_refs()
The function wanted_peer_refs() is used to map the refs returned by the
server to refs we will save in our clone.

Over time this function grown to be very complex. Refactor it.

Previously, there was a separate code path for when
`option_single_branch` was set. It resulted in duplicated code and
deeper nested conditions. After this refactor the code path for when
`option_single_branch` is truthy modifies `refs` and then falls through
to the common code path. This approach relies on the `refspec` being set
correctly and thus only mapping refs that are relevant.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
78a95e0d80 worktree: detect from secondary worktree if main worktree is bare
When extensions.worktreeConfig is true and the main worktree is
bare -- that is, its config.worktree file contains core.bare=true
-- commands run from secondary worktrees incorrectly see the main
worktree as not bare. As such, those commands incorrectly think
that the repository's default branch (typically "main" or
"master") is checked out in the bare repository even though it's
not. This makes it impossible, for instance, to checkout or delete
the default branch from a secondary worktree, among other
shortcomings.

This problem occurs because, when extensions.worktreeConfig is
true, commands run in secondary worktrees only consult
$commondir/config and $commondir/worktrees/<id>/config.worktree,
thus they never see the main worktree's core.bare=true setting in
$commondir/config.worktree.

Fix this problem by consulting the main worktree's config.worktree
file when checking whether it is bare. (This extra work is
performed only when running from a secondary worktree.)

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Olga Pilipenco <olga.pilipenco@shopify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-05 09:46:23 -08:00
3eeed876a9 docs: indicate http.sslCertType and sslKeyType
0a01d41ee4 (http: add support for different sslcert and sslkey types.,
2023-03-20) added useful SSL config options, but did not document them.

Signed-off-by: Andrew Carter <andrew@emailcarter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-05 09:43:38 -08:00
3295c35398 rev-list: extend print-info to print missing object type
Additional information about missing objects found in git-rev-list(1)
can be printed by specifying the `print-info` missing action for the
`--missing` option. Extend this action to also print missing object type
information inferred from its containing object. This token follows the
form `type=<type>` and specifies the expected object type of the missing
object.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-05 09:32:01 -08:00
c6d896bcfd rev-list: add print-info action to print missing object path
Missing objects identified through git-rev-list(1) can be printed by
setting the `--missing=print` option. Additional information about the
missing object, such as its path and type, may be present in its
containing object.

Add the `print-info` missing action for the `--missing` option that,
when set, prints additional insight about the missing object inferred
from its containing object. Each line of output for a missing object is
in the form: `?<oid> [<token>=<value>]...`. The `<token>=<value>` pairs
containing additional information are separated from each other by a SP.
The value is encoded in a token specific fashion, but SP or LF contained
in value are always expected to be represented in such a way that the
resulting encoded value does not have either of these two problematic
bytes. This format is kept generic so it can be extended in the future
to support additional information.

For now, only a missing object path info is implemented. It follows the
form `path=<path>` and specifies the full path to the object from the
top-level tree. A path containing SP or special characters is enclosed
in double-quotes in the C style as needed. In a subsequent commit,
missing object type info will also be added.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-05 09:32:01 -08:00
414c82300a builtin/repack: fix --keep-unreachable when there are no packs
The "--keep-unreachable" flag is supposed to append any unreachable
objects to the newly written pack. This flag is explicitly documented as
appending both packed and loose unreachable objects to the new packfile.
And while this works alright when repacking with preexisting packfiles,
it stops working when the repository does not have any packfiles at all.

The root cause are the conditions used to decide whether or not we want
to append "--pack-loose-unreachable" to git-pack-objects(1). There are
a couple of conditions here:

  - `has_existing_non_kept_packs()` checks whether there are existing
    packfiles. This condition makes sense to guard "--keep-pack=",
    "--unpack-unreachable" and "--keep-unreachable", because all of
    these flags only make sense in combination with existing packfiles.
    But it does not make sense to disable `--pack-loose-unreachable`
    when there aren't any preexisting packfiles, as loose objects can be
    packed into the new packfile regardless of that.

  - `delete_redundant` checks whether we want to delete any objects or
    packs that are about to become redundant. The documentation of
    `--keep-unreachable` explicitly says that `git repack -ad` needs to
    be executed for the flag to have an effect.

    It is not immediately obvious why such redundant objects need to be
    deleted in order for "--pack-unreachable-objects" to be effective.
    But as things are working as documented this is nothing we'll change
    for now.

  - `pack_everything & PACK_CRUFT` checks that we're not creating a
    cruft pack. This condition makes sense in the context of
    "--pack-loose-unreachable", as unreachable objects would end up in
    the cruft pack anyway.

So while the second and third condition are sensible, it does not make
any sense to condition `--pack-loose-unreachable` on the existence of
packfiles.

Fix the bug by splitting out the "--pack-loose-unreachable" and only
making it depend on the second and third condition. Like this, loose
unreachable objects will be packed regardless of any preexisting
packfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04 09:58:02 -08:00
f21ea69d94 remote: relocate valid_remote_name
Move the `valid_remote_name()` function from the refspec subsystem to
the remote subsystem to better align with the separation of concerns.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04 09:55:59 -08:00
d549b6c9ff refspec: relocate apply_refspecs and related funtions
Move the functions `apply_refspecs()` and `apply_negative_refspecs()`
from `remote.c` to `refspec.c`. These functions focus on applying
refspecs, so centralizing them in `refspec.c` improves code organization
by keeping refspec-related logic in one place.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04 09:51:42 -08:00
7b24a170d2 refspec: relocate matching related functions
Move the functions `refspec_find_match()`, `refspec_find_all_matches()`
and `refspec_find_negative_match()` from `remote.c` to `refspec.c`.
These functions focus on matching refspecs, so centralizing them in
`refspec.c` improves code organization by keeping refspec-related logic
in one place.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04 09:51:41 -08:00
be0905fed1 remote: rename query_refspecs functions
Rename functions related to handling refspecs in preparation for their
move from `remote.c` to `refspec.c`. Update their names to better
reflect their intent:

    - `query_refspecs()` -> `refspec_find_match()` for clarity, as it
      finds a single matching refspec.

    - `query_refspecs_multiple()` -> `refspec_find_all_matches()` to
      better reflect that it collects all matching refspecs instead of
      returning just the first match.

    - `query_matches_negative_refspec()` ->
      `refspec_find_negative_match()` for consistency with the
      updated naming convention, even though this static function
      didn't strictly require renaming.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04 09:51:41 -08:00
230d022fe3 refspec: relocate refname_matches_negative_refspec_item
Move the functions `refname_matches_negative_refspec_item()`,
`refspec_match()`, and `match_name_with_pattern()` from `remote.c` to
`refspec.c`. These functions focus on refspec matching, so placing them
in `refspec.c` aligns with the separation of concerns. Keep
refspec-related logic in `refspec.c` and remote-specific logic in
`remote.c` for better code organization.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04 09:51:41 -08:00
e4f6ab0085 remote: rename function omit_name_by_refspec
Rename the function `omit_name_by_refspec()` to
`refname_matches_negative_refspec_item()` to provide clearer intent.
The previous function name was vague and did not accurately describe its
purpose. By using `refname_matches_negative_refspec_item`, make the
function's purpose more intuitive, clarifying that it checks if a
reference name matches any negative refspec.

Rename function parameters for consistency with existing naming
conventions. Use `refname` instead of `name` to align with terminology
in `refs.h`.

Remove the redundant doc comment since the function name is now
self-explanatory.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04 09:51:41 -08:00
85127bcdea backfill: assume --sparse when sparse-checkout is enabled
The previous change introduced the '--[no-]sparse' option for the 'git
backfill' command, but did not assume it as enabled by default. However,
this is likely the behavior that users will most often want to happen.
Without this default, users with a small sparse-checkout may be confused
when 'git backfill' downloads every version of every object in the full
history.

However, this is left as a separate change so this decision can be reviewed
independently of the value of the '--[no-]sparse' option.

Add a test of adding the '--sparse' option to a repo without sparse-checkout
to make it clear that supplying it without a sparse-checkout is an error.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 16:12:42 -08:00
bff4555767 backfill: add --sparse option
One way to significantly reduce the cost of a Git clone and later fetches is
to use a blobless partial clone and combine that with a sparse-checkout that
reduces the paths that need to be populated in the working directory. Not
only does this reduce the cost of clones and fetches, the sparse-checkout
reduces the number of objects needed to download from a promisor remote.

However, history investigations can be expensive as computing blob diffs
will trigger promisor remote requests for one object at a time. This can be
avoided by downloading the blobs needed for the given sparse-checkout using
'git backfill' and its new '--sparse' mode, at a time that the user is
willing to pay that extra cost.

Note that this is distinctly different from the '--filter=sparse:<oid>'
option, as this assumes that the partial clone has all reachable trees and
we are using client-side logic to avoid downloading blobs outside of the
sparse-checkout cone. This avoids the server-side cost of walking trees
while also achieving a similar goal. It also downloads in batches based on
similar path names, presenting a resumable download if things are
interrupted.

This augments the path-walk API to have a possibly-NULL 'pl' member that may
point to a 'struct pattern_list'. This could be more general than the
sparse-checkout definition at HEAD, but 'git backfill --sparse' is currently
the only consumer.

Be sure to test this in both cone mode and not cone mode. Cone mode has the
benefit that the path-walk can skip certain paths once they would expand
beyond the sparse-checkout. Non-cone mode can describe the included files
using both positive and negative patterns, which changes the possible return
values of path_matches_pattern_list(). Test both kinds of matches for
increased coverage.

To test this, we can create a blobless sparse clone, expand the
sparse-checkout slightly, and then run 'git backfill --sparse' to see
how much data is downloaded. The general steps are

 1. git clone --filter=blob:none --sparse <url>
 2. git sparse-checkout set <dir1> ... <dirN>
 3. git backfill --sparse

For the Git repository with the 'builtin' directory in the
sparse-checkout, we get these results for various batch sizes:

| Batch Size      | Pack Count | Pack Size | Time  |
|-----------------|------------|-----------|-------|
| (Initial clone) | 3          | 110 MB    |       |
| 10K             | 12         | 192 MB    | 17.2s |
| 15K             | 9          | 192 MB    | 15.5s |
| 20K             | 8          | 192 MB    | 15.5s |
| 25K             | 7          | 192 MB    | 14.7s |

This case matters less because a full clone of the Git repository from
GitHub is currently at 277 MB.

Using a copy of the Linux repository with the 'kernel/' directory in the
sparse-checkout, we get these results:

| Batch Size      | Pack Count | Pack Size | Time |
|-----------------|------------|-----------|------|
| (Initial clone) | 2          | 1,876 MB  |      |
| 10K             | 11         | 2,187 MB  | 46s  |
| 25K             | 7          | 2,188 MB  | 43s  |
| 50K             | 5          | 2,194 MB  | 44s  |
| 100K            | 4          | 2,194 MB  | 48s  |

This case is more meaningful because a full clone of the Linux
repository is currently over 6 GB, so this is a valuable way to download
a fraction of the repository and no longer need network access for all
reachable objects within the sparse-checkout.

Choosing a batch size will depend on a lot of factors, including the
user's network speed or reliability, the repository's file structure,
and how many versions there are of the file within the sparse-checkout
scope. There will not be a one-size-fits-all solution.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 16:12:42 -08:00
6840fe9ee2 backfill: add --min-batch-size=<n> option
Users may want to specify a minimum batch size for their needs. This is only
a minimum: the path-walk API provides a list of OIDs that correspond to the
same path, and thus it is optimal to allow delta compression across those
objects in a single server request.

We could consider limiting the request to have a maximum batch size in the
future. For now, we let the path-walk API batches determine the
boundaries.

To get a feeling for the value of specifying the --min-batch-size parameter,
I tested a number of open source repositories available on GitHub. The
procedure was generally:

 1. git clone --filter=blob:none <url>
 2. git backfill

Checking the number of packfiles and the size of the .git/objects/pack
directory helps to identify the effects of different batch sizes.

For the Git repository, we get these results:

| Batch Size      | Pack Count | Pack Size | Time  |
|-----------------|------------|-----------|-------|
| (Initial clone) | 2          | 119 MB    |       |
| 25K             | 8          | 290 MB    | 24s   |
| 50K             | 5          | 290 MB    | 24s   |
| 100K            | 4          | 290 MB    | 29s   |

Other than the packfile counts decreasing as we need fewer batches, the
size and time required is not changing much for this small example.

For the nodejs/node repository, we see these results:

| Batch Size      | Pack Count | Pack Size | Time   |
|-----------------|------------|-----------|--------|
| (Initial clone) | 2          | 330 MB    |        |
| 25K             | 19         | 1,222 MB  | 1m 22s |
| 50K             | 11         | 1,221 MB  | 1m 24s |
| 100K            | 7          | 1,223 MB  | 1m 40s |
| 250K            | 4          | 1,224 MB  | 2m 23s |
| 500K            | 3          | 1,216 MB  | 4m 38s |

Here, we don't have much difference in the size of the repo, though the
500K batch size results in a few MB gained. That comes at a cost of a
much longer time. This extra time is due to server-side delta
compression happening as the on-disk deltas don't appear to be reusable
all the time. But for smaller batch sizes, the server is able to find
reasonable deltas partly because we are asking for objects that appear
in the same region of the directory tree and include all versions of a
file at a specific path.

To contrast this example, I tested the microsoft/fluentui repo, which
has been known to have inefficient packing due to name hash collisions.
These results are found before GitHub had the opportunity to repack the
server with more advanced name hash versions:

| Batch Size      | Pack Count | Pack Size | Time   |
|-----------------|------------|-----------|--------|
| (Initial clone) | 2          | 105 MB    |        |
| 5K              | 53         | 348 MB    | 2m 26s |
| 10K             | 28         | 365 MB    | 2m 22s |
| 15K             | 19         | 407 MB    | 2m 21s |
| 20K             | 15         | 393 MB    | 2m 28s |
| 25K             | 13         | 417 MB    | 2m 06s |
| 50K             | 8          | 509 MB    | 1m 34s |
| 100K            | 5          | 535 MB    | 1m 56s |
| 250K            | 4          | 698 MB    | 1m 33s |
| 500K            | 3          | 696 MB    | 1m 42s |

Here, a larger variety of batch sizes were chosen because of the great
variation in results. By asking the server to download small batches
corresponding to fewer paths at a time, the server is able to provide
better compression for these batches than it would for a regular clone.
A typical full clone for this repository would require 738 MB.

This example justifies the choice to batch requests by path name,
leading to improved communication with a server that is not optimally
packed.

Finally, the same experiment for the Linux repository had these results:

| Batch Size      | Pack Count | Pack Size | Time    |
|-----------------|------------|-----------|---------|
| (Initial clone) | 2          | 2,153 MB  |         |
| 25K             | 63         | 6,380 MB  | 14m 08s |
| 50K             | 58         | 6,126 MB  | 15m 11s |
| 100K            | 30         | 6,135 MB  | 18m 11s |
| 250K            | 14         | 6,146 MB  | 18m 22s |
| 500K            | 8          | 6,143 MB  | 33m 29s |

Even in this example, where the default name hash algorithm leads to
decent compression of the Linux kernel repository, there is value for
selecting a smaller batch size, to a limit. The 25K batch size has the
fastest time, but uses 250 MB more than the 50K batch size. The 500K
batch size took much more time due to server compression time and thus
we should avoid large batch sizes like this.

Based on these experiments, a batch size of 50,000 was chosen as the
default value.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 16:12:42 -08:00
1e72e889e7 backfill: basic functionality and tests
The default behavior of 'git backfill' is to fetch all missing blobs that
are reachable from HEAD. Document and test this behavior.

The implementation is a very simple use of the path-walk API, initializing
the revision walk at HEAD to start the path-walk from all commits reachable
from HEAD. Ignore the object arrays that correspond to tree entries,
assuming that they are all present already.

The path-walk API provides lists of objects in batches according to a
common path, but that list could be very small. We want to balance the
number of requests to the server with the ability to have the process
interrupted with minimal repeated work to catch up in the next run.
Based on some experiments (detailed in the next change) a minimum batch
size of 50,000 is selected for the default.

This batch size is a _minimum_. As the path-walk API emits lists of blob
IDs, they are collected into a list of objects for a request to the
server. When that list is at least the minimum batch size, then the
request is sent to the server for the new objects. However, the list of
blob IDs from the path-walk API could be much longer than the batch
size. At this moment, it is unclear if there is a benefit to split the
list when there are too many objects at the same path.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 16:12:41 -08:00
a3f79e9abd backfill: add builtin boilerplate
In anticipation of implementing 'git backfill', populate the necessary files
with the boilerplate of a new builtin. Mark the builtin as experimental at
this time, allowing breaking changes in the near future, if necessary.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 16:12:41 -08:00
e5a0d5d8bb Merge branch 'master' into ds/backfill
* master: (446 commits)
  The seventh batch
  The sixth batch
  The fifth batch
  The fourth batch
  refs/reftable: fix uninitialized memory access of `max_index`
  remote: announce removal of "branches/" and "remotes/"
  The third batch
  hash.h: drop unsafe_ function variants
  csum-file: introduce hashfile_checkpoint_init()
  t/helper/test-hash.c: use unsafe_hash_algo()
  csum-file.c: use unsafe_hash_algo()
  hash.h: introduce `unsafe_hash_algo()`
  csum-file.c: extract algop from hashfile_checksum_valid()
  csum-file: store the hash algorithm as a struct field
  t/helper/test-tool: implement sha1-unsafe helper
  trace2: prevent segfault on config collection with valueless true
  refs: fix creation of reflog entries for symrefs
  ci: wire up Visual Studio build with Meson
  ci: raise error when Meson generates warnings
  meson: fix compilation with Visual Studio
  ...
2025-02-03 16:12:33 -08:00
b81f8c8dd3 send-pack: gracefully close the connection for atomic push
Patrick reported an issue that the exit code of git-receive-pack(1) is
ignored during atomic push with "--porcelain" flag, and added new test
cases in t5543.

This issue originated from commit 7dcbeaa0df (send-pack: fix
inconsistent porcelain output, 2020-04-17). At that time, I chose to
ignore the exit code of "finish_connect()" without investigating the
root cause of the abnormal termination of git-receive-pack. That was an
incorrect solution.

The root cause is that an atomic push operation terminates early without
sending a flush packet to git-receive-pack. As a result,
git-receive-pack continues waiting for commands without exiting. By
sending a flush packet at the appropriate location in "send_pack()", we
ensure that the git-receive-pack process closes properly, avoiding an
erroneous exit code for git-push. At the same time, revert the changes
to the "transport.c" file made in commit 7dcbeaa0df.

Reported-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:58 -08:00
60c208db58 t5543: atomic push reports exit code failure
Add new test cases in t5543 to avoid ignoring the exit code of
git-receive-pack(1) during atomic push with "--porcelain" flag.

We'd typically notice this case because the refs would have their error
message set. But there is an edge case when pushing refs succeeds, but
git-receive-pack(1) exits with a non-zero exit code at a later point in
time due to another error. An atomic git-push(1) would ignore that error
code, and consequently it would return successfully and not print any
error message at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:58 -08:00
3028db4af2 send-pack: new return code "ERROR_SEND_PACK_BAD_REF_STATUS"
The "push_refs" function in the transport_vtable is the handler for
git-push operation. All the "push_refs" functions for different
transports (protocols) should have the same behavior, but the behavior
of "git_transport_push()" function for builtin_smart_vtable in
"transport.c" (which calls "send_pack()" in "send-pack.c") differs from
the handler of the HTTP protocol.

The "push_refs()" function for the HTTP protocol which calls the
"push_refs_with_push()" function in "transport-helper.c" will return 0
even when a bad REF_STATUS (such as REF_STATUS_REJECT_NONFASTFORWARD)
was found. But "send_pack()" for Git smart protocol will return -1 for
a bad REF_STATUS.

We cannot ignore bad REF_STATUS directly in the "send_pack()" function,
because the function is also used in "builtin/send-pack.c". So we add a
new non-zero error code "SEND_PACK_ERROR_REF_STATUS" for "send_pack()".

Ignore the specific error code in the "git_transport_push()" function to
have the same behavior as "push_refs()" for HTTP protocol. Note that
even though we ignore the error here, we'll ultimately still end up
detecting that a subset of refs was not pushed in `transport_push()`
because we eventually call `push_had_errors()` on the remote refs.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:57 -08:00
dd69a12e6a t5548: add porcelain push test cases for dry-run mode
New dry-run test cases:

 - git push --porcelain --dry-run
 - git push --porcelain --dry-run --force
 - git push --porcelain --dry-run --atomic
 - git push --porcelain --dry-run --atomic --force

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:57 -08:00
2329b6b461 t5548: add new porcelain test cases
Add two more test cases exercising git-push(1) with `--procelain`, one
exercising a non-atomic and one exercising an atomic push.

Based-on-patch-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:56 -08:00
bc0f5939a5 t5548: refactor test cases by resetting upstream
Refactor the test cases with the following changes:

 - Calling setup_upstream() to reset upstream after running each test
   case.

 - Change the initial branch tips of the workspace to reduce the branch
   setup operations in the workspace.

 - Reduced the two steps of setting up and cleaning up the pre-receive
   hook by moving the operations into the corresponding test case,

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:56 -08:00
12ad6b8fea t5548: refactor to reuse setup_upstream() function
Refactor the function setup_upstream_and_workbench(), extracting
create_upstream_template() and setup_upstream() from it. The former is
used to create the upstream repository template, while the latter is
used to rebuild the upstream repository and will be reused in subsequent
commits.

To ensure that setup_upstream() works properly in both local and HTTP
protocols, the HTTP settings have been moved to the setup_upstream() and
setup_upstream_and_workbench() functions.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:56 -08:00
b1be3953e5 t5504: modernize test by moving heredocs into test bodies
We have several heredocs in t5504 located outside of any particular test
bodies. Move these into the test bodies to match our modern coding
style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:24:55 -08:00
7c1d34fe5d t6423: fix suppression of Git’s exit code in tests
Some test in t6423 supress Git's exit code, which can cause test
failures go unnoticed. Specifically using git <subcommand> |
<other-command> masks potential failures of the Git command.

This commit ensures that Git's exit status is correctly propogated by:
- Avoiding pipes that suppress exit codes.

Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:23:15 -08:00
e4542d8b35 help: add "show" as a valid configuration value
Add a literal value for showing the suggested autocorrection
for consistency with the rest of the help.autocorrect options.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:22:05 -08:00
e21bf2c431 help: show the suggested command when help.autocorrect is false
Make the handling of false boolean values for help.autocorrect
consistent with the handling of value 0 by showing the suggested
commands but not running them.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 15:22:03 -08:00
a0fc18f042 Merge branch 'sc/help-autocorrect-one' into da/help-autocorrect-one-fix
* sc/help-autocorrect-one:
  help: interpret boolean string values for help.autocorrect
2025-02-03 15:21:57 -08:00
318f4c9827 t5401: prefer test_path_is_* helper function
"test -f" does not provide a nice error message when we hit test
failures, so use test_path_is_file instead.

Signed-off-by: ambar chakravartty <amch9605@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 14:11:19 -08:00
bc204b7427 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 10:23:35 -08:00
1f124f3024 Merge branch 'kn/reflog-migration-fix-fix'
Fix bugs in an earlier attempt to fix "git refs migration".

* kn/reflog-migration-fix-fix:
  refs/reftable: fix uninitialized memory access of `max_index`
  reftable: write correct max_update_index to header
2025-02-03 10:23:35 -08:00
b83a2f9006 Merge branch 'kn/pack-write-with-reduced-globals'
Code clean-up.

* kn/pack-write-with-reduced-globals:
  pack-write: pass hash_algo to internal functions
  pack-write: pass hash_algo to `write_rev_file()`
  pack-write: pass hash_algo to `write_idx_file()`
  pack-write: pass repository to `index_pack_lockfile()`
  pack-write: pass hash_algo to `fixup_pack_header_footer()`
2025-02-03 10:23:34 -08:00
f49905d47d Merge branch 'ps/build-meson-fixes'
More build fixes and enhancements on meson based build procedure.

* ps/build-meson-fixes:
  ci: wire up Visual Studio build with Meson
  ci: raise error when Meson generates warnings
  meson: fix compilation with Visual Studio
  meson: make the CSPRNG backend configurable
  meson: wire up fuzzers
  meson: wire up generation of distribution archive
  meson: wire up development environments
  meson: fix dependencies for generated headers
  meson: populate project version via GIT-VERSION-GEN
  GIT-VERSION-GEN: allow running without input and output files
  GIT-VERSION-GEN: simplify computing the dirty marker
2025-02-03 10:23:34 -08:00
803b5acaa7 Merge branch 'ps/3.0-remote-deprecation'
Following the procedure we established to introduce breaking
changes for Git 3.0, allow an early opt-in for removing support of
$GIT_DIR/branches/ and $GIT_DIR/remotes/ directories to configure
remotes.

* ps/3.0-remote-deprecation:
  remote: announce removal of "branches/" and "remotes/"
  builtin/pack-redundant: remove subcommand with breaking changes
  ci: repurpose "linux-gcc" job for deprecations
  ci: merge linux-gcc-default into linux-gcc
  Makefile: wire up build option for deprecated features
2025-02-03 10:23:33 -08:00
c43136d67b Merge branch 'jk/combine-diff-cleanup'
Code clean-up for code paths around combined diff.

* jk/combine-diff-cleanup:
  tree-diff: make list tail-passing more explicit
  tree-diff: simplify emit_path() list management
  tree-diff: use the name "tail" to refer to list tail
  tree-diff: drop list-tail argument to diff_tree_paths()
  combine-diff: drop public declaration of combine_diff_path_size()
  tree-diff: inline path_appendnew()
  tree-diff: pass whole path string to path_appendnew()
  tree-diff: drop path_appendnew() alloc optimization
  run_diff_files(): de-mystify the size of combine_diff_path struct
  diff: add a comment about combine_diff_path.parent.path
  combine-diff: use pointer for parent paths
  tree-diff: clear parent array in path_appendnew()
  combine-diff: add combine_diff_path_new()
  run_diff_files(): delay allocation of combine_diff_path
2025-02-03 10:23:33 -08:00
caf17423d3 Merge branch 'tb/unsafe-hash-cleanup'
The API around choosing to use unsafe variant of SHA-1
implementation has been updated in an attempt to make it harder to
abuse.

* tb/unsafe-hash-cleanup:
  hash.h: drop unsafe_ function variants
  csum-file: introduce hashfile_checkpoint_init()
  t/helper/test-hash.c: use unsafe_hash_algo()
  csum-file.c: use unsafe_hash_algo()
  hash.h: introduce `unsafe_hash_algo()`
  csum-file.c: extract algop from hashfile_checksum_valid()
  csum-file: store the hash algorithm as a struct field
  t/helper/test-tool: implement sha1-unsafe helper
2025-02-03 10:23:32 -08:00
14ddc393b1 ci: set CI_JOB_IMAGE for coverity job
The main GitHub Actions workflow switched away from the "$distro"
variable in b133d3071a (github: simplify computation of the job's
distro, 2025-01-10). Since the Coverity job also depends on our
ci/install-dependencies.sh script, it needs to likewise set CI_JOB_IMAGE
to find the correct dependencies (without this patch, we don't install
curl and the build fails).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 09:24:42 -08:00
f9d4bb7b9a Merge branch 'ps/ci-misc-updates' into jk/ci-coverity-update
* ps/ci-misc-updates:
  ci: remove stale code for Azure Pipelines
  ci: use latest Ubuntu release
  ci: stop special-casing for Ubuntu 16.04
  gitlab-ci: add linux32 job testing against i386
  gitlab-ci: remove the "linux-old" job
  github: simplify computation of the job's distro
  github: convert all Linux jobs to be containerized
  github: adapt containerized jobs to be rootless
  t7422: fix flaky test caused by buffered stdout
  t0060: fix EBUSY in MinGW when setting up runtime prefix
2025-02-03 09:24:25 -08:00
af8bf677c1 t/unit-tests: convert strcmp-offset test to use clar test framework
Adapt strcmp-offset test script to clar framework by using clar
assertions where necessary. Introduce `test_strcmp_offset__empty()` to
verify `check_strcmp_offset()` behavior when both input strings are
empty. This ensures the function correctly handles edge cases and
returns expected values.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:45 -08:00
4b995465b2 t/unit-tests: convert strbuf test to use clar test framework
Adapt strbuf test script to clar framework by using clar assertions
where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:45 -08:00
e0f807bdad t/unit-tests: adapt example decorate test to use clar test framework
Introduce `test_example_decorate__initialize()` to explicitly set up
object IDs and retrieve corresponding objects before tests run. This
ensures a consistent and predictable test state without relying on data
from previous tests.

Add `test_example_decorate__cleanup()` to clear decorations after each
test, preventing interference between tests and ensuring each runs in
isolation.

Adapt example decorate test script to clar framework by using clar
assertions where necessary. Previously, tests relied on data written by
earlier tests, leading to unintended dependencies between them. This
explicitly initializes the necessary state within
`test_example_decorate__readd`, ensuring it does not depend on prior
test executions.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:44 -08:00
38b066ee76 t/unit-tests: convert hashmap test to use clar test framework
Adapts hashmap test script to clar framework by using clar assertions
where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:44 -08:00
acc4fb302b ci: fix base commit fallback for check-whitespace and check-style
The check-whitespace and check-style CI scripts require a base commit.
In GitLab CI, the base commit can be provided by several different
predefined CI variables depending on the type of pipeline being
performed.

In 30c4f7e350 (check-whitespace: detect if no base_commit is provided,
2024-07-23), the GitLab check-whitespace CI job was modified to support
CI_MERGE_REQUEST_DIFF_BASE_SHA as a fallback base commit if
CI_MERGE_REQUEST_TARGET_BRANCH_SHA was not provided. The same fallback
strategy was also implemented for the GitLab check-style CI job in
bce7e52d4e (ci: run style check on GitHub and GitLab, 2024-07-23).

The base commit fallback is implemented using shell parameter expansion
where, if the first variable is unset, the second variable is used as
fallback. In GitLab CI, these variables can be set but null. This has
the unintended effect of selecting an empty first variable which results
in CI jobs providing an invalid base commit and failing.

Fix the issue by defaulting to the fallback variable if the first is
unset or null.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 10:08:46 -08:00
0578f1e66a global: adapt callers to use generic hash context helpers
Adapt callers to use generic hash context helpers instead of using the
hash algorithm to update them. This makes the callsites easier to reason
about and removes the possibility that the wrong hash algorithm is used
to update the hash context's state. And as a nice side effect this also
gets rid of a bunch of users of `the_hash_algo`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 10:06:11 -08:00
b2755c15e2 hash: provide generic wrappers to update hash contexts
The hash context is supposed to be updated via the `git_hash_algo`
structure, which contains a list of function pointers to update, clone
or finalize a hashing context. This requires the callers to track which
algorithm was used to initialize the context and continue to use the
exact same algorithm. If they fail to do that correctly, it can happen
that we start to access context state of one hash algorithm with
functions of a different hash algorithm. The result would typically be a
segfault, as could be seen e.g. in the patches part of 98422943f0 (Merge
branch 'ps/weak-sha1-for-tail-sum-fix', 2025-01-01).

The situation was significantly improved starting with 04292c3796
(hash.h: drop unsafe_ function variants, 2025-01-23) and its parent
commits. These refactorings ensure that it is not possible to mix up
safe and unsafe variants of the same hash algorithm anymore. But in
theory, it is still possible to mix up different hash algorithms with
each other, even though this is a lot less likely to happen.

But still, we can do better: instead of asking the caller to remember
the hash algorithm used to initialize a context, we can instead make the
context itself remember which algorithm it has been initialized with. If
we do so, callers can use a set of generic helpers to update the context
and don't need to be aware of the hash algorithm at all anymore.

Adapt the context initialization functions to store the hash algorithm
in the hashing context and introduce these generic helpers. Callers will
be adapted in the subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 10:06:11 -08:00
7346e340f1 hash: stop typedeffing the hash context
We generally avoid using `typedef` in the Git codebase. One exception
though is the `git_hash_ctx`, likely because it used to be a union
rather than a struct until the preceding commit refactored it. But now
that it is a normal `struct` there isn't really a need for a typedef
anymore.

Drop the typedef and adapt all callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 10:06:10 -08:00
52eef501e1 hash: convert hashing context to a structure
The `git_hash_context` is a union containing the different hash-specific
states for SHA1, its unsafe variant as well as SHA256. We know that only
one of these states will ever be in use at the same time because hash
contexts cannot be used for multiple different hashes at the same point
in time.

We're about to extend the structure though to keep track of the hash
algorithm used to initialize the context, which is impossible to do
while the context is a union. Refactor it to instead be a structure that
contains the union of context states.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 10:06:10 -08:00
0cbcba5455 Merge branch 'tb/unsafe-hash-cleanup' into ps/hash-cleanup
* tb/unsafe-hash-cleanup:
  hash.h: drop unsafe_ function variants
  csum-file: introduce hashfile_checkpoint_init()
  t/helper/test-hash.c: use unsafe_hash_algo()
  csum-file.c: use unsafe_hash_algo()
  hash.h: introduce `unsafe_hash_algo()`
  csum-file.c: extract algop from hashfile_checksum_valid()
  csum-file: store the hash algorithm as a struct field
  t/helper/test-tool: implement sha1-unsafe helper
2025-01-31 10:05:46 -08:00
58b5801aa9 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 09:44:16 -08:00
81309f424b Merge branch 'jc/show-index-h-update'
Doc and short-help text for "show-index" has been clarified to
stress that the command reads its data from the standard input.

* jc/show-index-h-update:
  show-index: the short help should say the command reads from its input
2025-01-31 09:44:16 -08:00
bdd1988eb3 Merge branch 'ja/doc-notes-markup-updates'
Doc mark-up updates.

* ja/doc-notes-markup-updates:
  doc: convert git-notes to new documentation format
2025-01-31 09:44:15 -08:00
ecba2c181c Merge branch 'sk/strlen-returns-size_t'
Code clean-up.

* sk/strlen-returns-size_t:
  date.c: Fix type missmatch warings from msvc
2025-01-31 09:44:15 -08:00
dccd9c5cf2 Merge branch 'ja/doc-restore-markup-update'
Doc mark-up updates.

* ja/doc-restore-markup-update:
  doc: convert git-restore to new style format
2025-01-31 09:44:15 -08:00
72f1ddfbc9 Merge branch 'ps/build-meson-fixes' into ps/build-meson-fixes-0130
* ps/build-meson-fixes:
  ci: wire up Visual Studio build with Meson
  ci: raise error when Meson generates warnings
  meson: fix compilation with Visual Studio
  meson: make the CSPRNG backend configurable
  meson: wire up fuzzers
  meson: wire up generation of distribution archive
  meson: wire up development environments
  meson: fix dependencies for generated headers
  meson: populate project version via GIT-VERSION-GEN
  GIT-VERSION-GEN: allow running without input and output files
  GIT-VERSION-GEN: simplify computing the dirty marker
2025-01-30 14:53:50 -08:00
7e88640cd1 setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH
The exact same issue as described in the preceding commit also exists
for GIT_DEFAULT_HASH. Thus, reinitializing a repository that e.g. uses
SHA1 with `GIT_DEFAULT_HASH=sha256 git init` will cause the object
format of that repository to change to SHA256. This is of course bogus
as any existing objects and refs will not be converted, thus causing
repository corruption:

    $ git init repo
    Initialized empty Git repository in /tmp/repo/.git/
    $ cd repo/
    $ git commit --allow-empty -m message
    [main (root-commit) 35a7344] message
    $ GIT_DEFAULT_HASH=sha256 git init
    Reinitialized existing Git repository in /tmp/repo/.git/
    $ git show
    fatal: your current branch appears to be broken

Fix the issue by ignoring the environment variable in case the repo has
already been initialized with an object hash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30 14:36:41 -08:00
796fda3f78 setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
The GIT_DEFAULT_REF_FORMAT environment variable can be set to influence
the default ref format that new repostiories shall be initialized with.
While this is the expected behaviour when creating a new repository, it
is not when reinitializing a repository: we should retain the ref format
currently used by it in that case.

This doesn't work correctly right now:

    $ git init --ref-format=files repo
    Initialized empty Git repository in /tmp/repo/.git/
    $ GIT_DEFAULT_REF_FORMAT=reftable git init repo
    fatal: could not open '/tmp/repo/.git/refs/heads' for writing: Is a directory

Instead of retaining the current ref format, the reinitialization tries
to reinitialize the repository with the different format. This action
fails when git-init(1) tries to write the ".git/refs/heads" stub, which
in the context of the reftable backend is always written as a file so
that we can detect clients which inadvertently try to access the repo
with the wrong ref format. Seems like the protection mechanism works for
this case, as well.

Fix the issue by ignoring the environment variable in case the repo has
already been initialized with a ref storage format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30 14:36:40 -08:00
150c31bf88 t0001: remove duplicate test
The test in question is an exact copy of the testcase preceding it.
Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30 14:36:40 -08:00
a206058fda apply: detect overflow when parsing hunk header
"git apply" uses strtoul() to parse the numbers in the hunk header but
silently ignores overflows. As LONG_MAX is a legitimate return value for
strtoul() we need to set errno to zero before the call to strtoul() and
check that it is still zero afterwards. The error message we display is
not particularly helpful as it does not say what was wrong.  However, it
seems pretty unlikely that users are going to trigger this error in
practice and we can always improve it later if needed.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30 14:18:12 -08:00
087740d65a scalar: free result of remote_default_branch()
We don't free the result of `remote_default_branch()`, leading to a
memory leak. This leak is exposed by t9211, but only when run with Meson
with the `-Db_sanitize=leak` option:

    Direct leak of 5 byte(s) in 1 object(s) allocated from:
        #0 0x5555555cfb93 in malloc (scalar+0x7bb93)
        #1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8
        #2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8
        #3 0x5555556b0656 in xmallocz ../wrapper.c:97:9
        #4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16
        #5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9
        #6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14
        #7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28
        #8 0x5555555d196b in cmd_main ../scalar.c:992:14
        #9 0x5555557c4059 in main ../common-main.c:64:11
        #10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #12 0x555555592054 in _start (scalar+0x3e054)

    DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s).

As the `branch` variable may contain a string constant obtained from
parsing command line arguments we cannot free the leaking variable
directly. Instead, introduce a new `branch_to_free` variable that only
ever gets assigned the allocated string and free that one to plug the
leak.

It is unclear why the leak isn't flagged when running the test via our
Makefile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30 11:07:22 -08:00
c5fe29f696 unix-socket: fix memory leak when chdir(3p) fails
When trying to create a Unix socket in a path that exceeds the maximum
socket name length we try to first change the directory into the parent
folder before creating the socket to reduce the length of the name. When
this fails we error out of `unix_sockaddr_init()` with an error code,
which indicates to the caller that the context has not been initialized.
Consequently, they don't release that context.

This leads to a memory leak: when we have already populated the context
with the original directory that we need to chdir(3p) back into, but
then the chdir(3p) into the socket's parent directory fails, then we
won't release the original directory's path. The leak is exposed by
t0301, but only when running tests in a directory hierarchy whose path
is long enough to make the socket name length exceed the maximum socket
name length:

    Direct leak of 129 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8
        #2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2
        #3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3
        #4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7
        #5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6
        #6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11
        #7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6
        #8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3
        #9 0x555555700547 in run_builtin ../git.c:480:11
        #10 0x5555556ff0e0 in handle_builtin ../git.c:740:9
        #11 0x5555556ffee8 in run_argv ../git.c:807:4
        #12 0x5555556fee6b in cmd_main ../git.c:947:19
        #13 0x55555593f689 in main ../common-main.c:64:11
        #14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #16 0x5555555ad1d4 in _start (git+0x591d4)

    DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s).

Fix this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30 11:07:22 -08:00
65c10aa8d5 libgit: add higher-level libgit crate
The C functions exported by libgit-sys do not provide an idiomatic Rust
interface. To make it easier to use these functions via Rust, add a
higher-level "libgit" crate, that wraps the lower-level configset API
with an interface that is more Rust-y.

This combination of $X and $X-sys crates is a common pattern for FFI in
Rust, as documented in "The Cargo Book" [1].

[1] https://doc.rust-lang.org/cargo/reference/build-scripts.html#-sys-packages

Co-authored-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29 15:06:50 -08:00
d76eb0dccc libgit-sys: also export some config_set functions
In preparation for implementing a higher-level Rust API for accessing
Git configs, export some of the upstream configset API via libgitpub and
libgit-sys. Since this will be exercised as part of the higher-level API
in the next commit, no tests have been added for libgit-sys.

While we're at it, add git_configset_alloc() and git_configset_free()
functions in libgitpub so that callers can manage config_set structs on
the heap. This also allows non-C external consumers to treat config_sets
as opaque structs.

Co-authored-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29 15:06:50 -08:00
3b0d05c4a7 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29 14:05:10 -08:00
296cf82f93 Merge branch 'ps/reflog-migration-with-logall-fix'
The "git refs migrate" command did not migrate the reflog for
refs/stash, which is the contents of the stashes, which has been
corrected.

* ps/reflog-migration-with-logall-fix:
  refs: fix migration of reflogs respecting "core.logAllRefUpdates"
2025-01-29 14:05:10 -08:00
c5216a1bc6 Merge branch 'am/trace2-with-valueless-true'
The trace2 code was not prepared to show a configuration variable
that is set to true using the valueless true syntax, which has been
corrected.

* am/trace2-with-valueless-true:
  trace2: prevent segfault on config collection with valueless true
2025-01-29 14:05:10 -08:00
d205f06ae0 Merge branch 'kn/reflog-symref-fix'
reflog entries for symbolic ref updates were broken, which has been
corrected.

* kn/reflog-symref-fix:
  refs: fix creation of reflog entries for symrefs
2025-01-29 14:05:10 -08:00
8d6240d4c6 Merge branch 'rs/ref-fitler-used-atoms-value-fix'
"git branch --sort=..." and "git for-each-ref --format=... --sort=..."
did not work as expected with some atoms, which has been corrected.

* rs/ref-fitler-used-atoms-value-fix:
  ref-filter: remove ref_format_clear()
  ref-filter: move is-base tip to used_atom
  ref-filter: move ahead-behind bases into used_atom
2025-01-29 14:05:09 -08:00
de56e1d746 Merge branch 'ja/doc-commit-markup-updates'
Doc updates.

* ja/doc-commit-markup-updates:
  doc: migrate git-commit manpage secondary files to new format
  doc: convert git commit config to new format
  doc: make more direct explanations in git commit options
  doc: the mode param of -u of git commit is optional
  doc: apply new documentation guidelines to git commit
2025-01-29 14:05:09 -08:00
f046ab2dd4 Merge branch 'ds/path-walk-1'
Introduce a new API to visit objects in batches based on a common
path, or by type.

* ds/path-walk-1:
  path-walk: drop redundant parse_tree() call
  path-walk: reorder object visits
  path-walk: mark trees and blobs as UNINTERESTING
  path-walk: visit tags and cached objects
  path-walk: allow consumer to specify object types
  t6601: add helper for testing path-walk API
  test-lib-functions: add test_cmp_sorted
  path-walk: introduce an object walk by path
2025-01-29 14:05:09 -08:00
e7f8bf125c libgit-sys: introduce Rust wrapper for libgit.a
Introduce libgit-sys, a Rust wrapper crate that allows Rust code to call
functions in libgit.a. This initial patch defines build rules and an
interface that exposes user agent string getter functions as a proof of
concept. This library can be tested with `cargo test`. In later commits,
a higher-level library containing a more Rust-friendly interface will be
added at `contrib/libgit-rs`.

Symbols in libgit can collide with symbols from other libraries such as
libgit2. We avoid this by first exposing library symbols in
public_symbol_export.[ch]. These symbols are prepended with "libgit_" to
avoid collisions and set to visible using a visibility pragma. In
build.rs, Rust builds contrib/libgit-rs/libgit-sys/libgitpub.a, which also
contains libgit.a and other dependent libraries, with
-fvisibility=hidden to hide all symbols within those libraries that
haven't been exposed with a visibility pragma.

Co-authored-by: Kyle Lippincott <spectral@google.com>
Co-authored-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 14:45:47 -08:00
3f8f2abe05 common-main: split init and exit code into new files
Currently, object files in libgit.a reference common_exit(), which is
contained in common-main.o. However, common-main.o also includes main(),
which references cmd_main() in git.o, which in turn depends on all the
builtin/*.o objects.

We would like to allow external users to link libgit.a without needing
to include so many extra objects. Enable this by splitting common_exit()
and check_bug_if_BUG() into a new file common-exit.c, and add
common-exit.o to LIB_OBJS so that these are included in libgit.a.

This split has previously been proposed ([1], [2]) to support fuzz tests
and unit tests by avoiding conflicting definitions for main(). However,
both of those issues were resolved by other methods of avoiding symbol
conflicts. Now we are trying to make libgit.a more self-contained, so
hopefully we can revisit this approach.

Additionally, move the initialization code out of main() into a new
init_git() function in its own file. Include this in libgit.a as well,
so that external users can share our setup code without calling our
main().

[1] https://lore.kernel.org/git/Yp+wjCPhqieTku3X@google.com/
[2] https://lore.kernel.org/git/20230517-unit-tests-v2-v2-1-21b5b60f4b32@google.com/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 14:39:16 -08:00
78cdeed4c7 ci: make "linux-musl" job use zlib-ng
We don't yet have any test coverage for the new zlib-ng backend as part
of our CI. Add it by installing zlib-ng in Alpine Linux, which causes
Meson to pick it up automatically.

Note that we are somewhat limited with regards to where we run that job:
Debian-based distributions don't have zlib-ng in their repositories,
Fedora has it but doesn't run tests, and Alma Linux doesn't have the
package either. Alpine Linux does have it available and is running our
test suite, which is why it was picked.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:23 -08:00
84bb5eeace ci: switch linux-musl to use Meson
Switch over the "linux-musl" job to use Meson instead of Makefiles. This
is done due to multiple reasons:

  - It simplifies our CI infrastructure a bit as we don't have to
    manually specify a couple of build options anymore.

  - It verifies that Meson detects and sets those build options
    automatically.

  - It makes it easier for us to wire up a new CI job using zlib-ng as
    backend.

One platform compatibility that Meson cannot easily detect automatically
is the `GIT_TEST_UTF8_LOCALE` variable used in tests. Wire up a build
option for it, which we set via a new "MESONFLAGS" environment variable.

Note that we also drop the CC variable, which is set to "gcc". We
already default to GCC when CC is unset in "ci/lib.sh", so this is not
needed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:23 -08:00
b9d6f64393 compat/zlib: allow use of zlib-ng as backend
The zlib-ng library is a hard fork of the old and venerable zlib
library. It describes itself as zlib replacement with optimizations for
"next generation" systems. As such, it contains several implementations
of central algorithms using for example SSE2, AVX2 and other vectorized
CPU intrinsics that supposedly speed up in- and deflating data.

And indeed, compiling Git against zlib-ng leads to a significant speedup
when reading objects. The following benchmark uses git-cat-file(1) with
`--batch --batch-all-objects` in the Git repository:

    Benchmark 1: zlib
      Time (mean ± σ):     52.085 s ±  0.141 s    [User: 51.500 s, System: 0.456 s]
      Range (min … max):   52.004 s … 52.335 s    5 runs

    Benchmark 2: zlib-ng
      Time (mean ± σ):     40.324 s ±  0.134 s    [User: 39.731 s, System: 0.490 s]
      Range (min … max):   40.135 s … 40.484 s    5 runs

    Summary
      zlib-ng ran
        1.29 ± 0.01 times faster than zlib

So we're looking at a ~25% speedup compared to zlib. This is of course
an extreme example, as it makes us read through all objects in the
repository. But regardless, it should be possible to see some sort of
speedup in most commands that end up accessing the object database.

The zlib-ng library provides a compatibility layer that makes it a
proper drop-in replacement for zlib: nothing needs to change in the
build system to support it. Unfortunately though, this mode isn't easy
to use on most systems because distributions do not allow you to install
zlib-ng in that way, as that would mean that the zlib library would be
globally replaced. Instead, many distributions provide a package that
installs zlib-ng without the compatibility layer. This version does
provide effectively the same APIs like zlib does, but all of the symbols
are prefixed with `zng_` to avoid symbol collisions.

Implement a new build option that allows us to link against zlib-ng
directly. If set, we redefine zlib symbols so that we use the `zng_`
prefixed versions thereof provided by that library. Like this, it
becomes possible to install both zlib and zlib-ng (without the compat
layer) and then pick whichever library one wants to link against for
Git.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:23 -08:00
9170c03fd7 git-zlib: cast away potential constness of next_in pointer
The `struct git_zstream::next_in` variable points to the input data and
is used in combination with `struct z_stream::next_in`. While that
latter field is not marked as a constant in zlib, it is marked as such
in zlib-ng. This causes a couple of compiler errors when we try to
assign these fields to one another due to mismatching constness.

Fix the issue by casting away the potential constness of `next_in`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:23 -08:00
bb5d35c1a8 compat/zlib: provide stubs for deflateSetHeader()
The function `deflateSetHeader()` has been introduced with zlib v1.2.2.1,
so we don't use it when linking against an older version of it. Refactor
the code to instead provide a central stub via "compat/zlib.h" so that
we can adapt it based on whether or not we use zlib-ng in a subsequent
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:23 -08:00
a2dcb69998 compat/zlib: provide deflateBound() shim centrally
The `deflateBound()` function has only been introduced with zlib 1.2.0.
When linking against a zlib version older than that we thus provide our
own compatibility shim. Move this shim into "compat/zlib.h" so that we
can adapt it based on whether or not we use zlib-ng in a subsequent
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:22 -08:00
41f1a8435a git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"
We include "compat/zlib.h" in "git-compat-util.h", which is
unnecessarily broad given that we only have a small handful of files
that use the zlib library. Move the header into "git-zlib.h" instead and
adapt users of zlib to include that header.

One exception is the reftable library, as we don't want to use the
Git-specific wrapper of zlib there, so we include "compat/zlib.h"
instead. Furthermore, we move the include into "reftable/system.h" so
that users of the library other than Git can wire up zlib themselves.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:22 -08:00
629188ede7 compat: introduce new "zlib.h" header
Introduce a new "compat/zlib-compat.h" header that we include instead of
including <zlib.h> directly. This will allow us to wire up zlib-ng as an
alternative backend for zlib compression in a subsequent commit.

Note that we cannot just call the file "compat/zlib.h", as that may
otherwise cause us to include that file instead of <zlib.h>.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:22 -08:00
27e8960037 git-compat-util: drop z_const define
Before including <zlib.h> we explicitly define `z_const` to an empty
value. This has the effect that the `z_const` macro in "zconf.h" itself
will remain empty instead of being defined as `const`, which effectively
adapts a couple of APIs so that their parameters are not marked as being
constants.

It is dubious though whether this is something we actually want: not
marking a parameter as a constant doesn't make it any less constant than
it was. The define was added via 07564773c2 (compat: auto-detect if zlib
has uncompress2(), 2022-01-24), where it was seemingly carried over from
our internal compatibility shim for `uncompress2()` that was removed in
the preceding commit. The commit message doesn't mention why we carry
over the define and make it public, either, and I cannot think of any
reason for why we would want to have it.

Drop the define.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:22 -08:00
3656d57bbf compat: drop uncompress2() compatibility shim
Our compat library has an implementation of zlib's `uncompress2()`
function that gets used when linking against an old version of zlib
that doesn't yet have it. The last user of `uncompress2()` got removed
in 15a60b747e (reftable/block: open-code call to `uncompress2()`,
2024-04-08), so the compatibility code is not required anymore. Drop it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:22 -08:00
da898a5c64 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:02:25 -08:00
b09b10ad26 Merge branch 'jp/t8002-printf-fix'
Test fix.

* jp/t8002-printf-fix:
  t8002: fix ambiguous printf conversion specifications
2025-01-28 13:02:24 -08:00
a17fd7dd3a Merge branch 'ps/reftable-sign-compare'
The reftable/ library code has been made -Wsign-compare clean.

* ps/reftable-sign-compare:
  reftable: address trivial -Wsign-compare warnings
  reftable/blocksource: adjust `read_block()` to return `ssize_t`
  reftable/blocksource: adjust type of the block length
  reftable/block: adjust type of the restart length
  reftable/block: adapt header and footer size to return a `size_t`
  reftable/basics: adjust `hash_size()` to return `uint32_t`
  reftable/basics: adjust `common_prefix_size()` to return `size_t`
  reftable/record: handle overflows when decoding varints
  reftable/record: drop unused `print` function pointer
  meson: stop disabling -Wsign-compare
2025-01-28 13:02:24 -08:00
73e055d71e Merge branch 'mh/credential-cache-authtype-request-fix'
The "cache" credential back-end did not handle authtype correctly,
which has been corrected.

* mh/credential-cache-authtype-request-fix:
  credential-cache: respect authtype capability
2025-01-28 13:02:24 -08:00
f8b9821f7d Merge branch 'jk/pack-header-parse-alignment-fix'
It was possible for "git unpack-objects" and "git index-pack" to
make an unaligned access, which has been corrected.

* jk/pack-header-parse-alignment-fix:
  index-pack, unpack-objects: use skip_prefix to avoid magic number
  index-pack, unpack-objects: use get_be32() for reading pack header
  parse_pack_header_option(): avoid unaligned memory writes
  packfile: factor out --pack_header argument parsing
  bswap.h: squelch potential sparse -Wcast-truncate warnings
2025-01-28 13:02:23 -08:00
3ddeb7f337 Merge branch 'ps/build-meson-subtree'
The meson-driven build is now aware of "git-subtree" housed in
contrib/subtree hierarchy.

* ps/build-meson-subtree:
  meson: wire up the git-subtree(1) command
  meson: introduce build option for contrib
  contrib/subtree: fix building docs
2025-01-28 13:02:23 -08:00
63d555a2dc Merge branch 'mh/connect-sign-compare'
The code in connect.c has been updated to work around complaints
from -Wsign-compare.

* mh/connect-sign-compare:
  connect: address -Wsign-compare warnings
2025-01-28 13:02:23 -08:00
8d335468ec Merge branch 'sk/unit-tests'
Move a few more unit tests to the clar test framework.

* sk/unit-tests:
  t/unit-tests: convert reftable tree test to use clar test framework
  t/unit-tests: adapt priority queue test to use clar test framework
  t/unit-tests: convert mem-pool test to use clar test framework
  t/unit-tests: handle dashes in test suite filenames
2025-01-28 13:02:22 -08:00
f0a371a39d Merge branch 'jc/show-usage-help'
The help text from "git $cmd -h" appear on the standard output for
some $cmd and the standard error for others.  The built-in commands
have been fixed to show them on the standard output consistently.

* jc/show-usage-help:
  builtin: send usage() help text to standard output
  oddballs: send usage() help text to standard output
  builtins: send usage_with_options() help text to standard output
  usage: add show_usage_if_asked()
  parse-options: add show_usage_with_options_if_asked()
  t0012: optionally check that "-h" output goes to stdout
2025-01-28 13:02:22 -08:00
b4cf68476a pack-objects: prevent name hash version change
When the --name-hash-version option is used in 'git pack-objects', it
can change from the initial assignment to when it is used based on
interactions with other arguments. Specifically, when writing or reading
bitmaps, we must force version 1 for now. This could change in the
future when the bitmap format can store a name hash version value,
indicating which was used during the writing of the packfile.

Protect the 'git pack-objects' process from getting confused by failing
with a BUG() statement if the value of the name hash version changes
between calls to pack_name_hash_fn().

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:43 -08:00
7f9870794f test-tool: add helper for name-hash values
Add a new test-tool helper, name-hash, to output the value of the
name-hash algorithms for the input list of strings, one per line.

Since the name-hash values can be stored in the .bitmap files, it is
important that these hash functions do not change across Git versions.
Add a simple test to t5310-pack-bitmaps.sh to provide some testing of
the current values. Due to how these functions are implemented, it would
be difficult to change them without disturbing these values. The paths
used for this test are carefully selected to demonstrate some of the
behavior differences of the two current name hash versions, including
which conditions will cause them to collide.

Create a performance test that uses test_size to demonstrate how
collisions occur for these hash algorithms. This test helps inform
someone as to the behavior of the name-hash algorithms for their repo
based on the paths at HEAD.

My copy of the Git repository shows modest statistics around the
collisions of the default name-hash algorithm:

Test                               this tree
--------------------------------------------------
5314.1: paths at head                         4.5K
5314.2: distinct hash value: v1               4.1K
5314.3: maximum multiplicity: v1                13
5314.4: distinct hash value: v2               4.2K
5314.5: maximum multiplicity: v2                 9

Here, the maximum collision multiplicity is 13, but around 10% of paths
have a collision with another path.

In a more interesting example, the microsoft/fluentui [1] repo had these
statistics at time of committing:

Test                               this tree
--------------------------------------------------
5314.1: paths at head                        19.5K
5314.2: distinct hash value: v1               8.2K
5314.3: maximum multiplicity: v1               279
5314.4: distinct hash value: v2              17.8K
5314.5: maximum multiplicity: v2                44

[1] https://github.com/microsoft/fluentui

That demonstrates that of the nearly twenty thousand path names, they
are assigned around eight thousand distinct values. 279 paths are
assigned to a single value, leading the packing algorithm to sort
objects from those paths together, by size.

With the v2 name hash function, the maximum multiplicity lowers to 44,
leaving some room for further improvement.

In a more extreme example, an internal monorepo had a much worse
collision rate:

Test                               this tree
--------------------------------------------------
5314.1: paths at head                       227.3K
5314.2: distinct hash value: v1              72.3K
5314.3: maximum multiplicity: v1             14.4K
5314.4: distinct hash value: v2             166.5K
5314.5: maximum multiplicity: v2               138

Here, we can see that the v2 name hash function provides somem
improvements, but there are still a number of collisions that could lead
to repacking problems at this scale.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:43 -08:00
30696be71f p5313: add size comparison test
As custom options are added to 'git pack-objects' and 'git repack' to
adjust how compression is done, use this new performance test script to
demonstrate their effectiveness in performance and size.

The recently-added --name-hash-version option allows for testing
different name hash functions. Version 2 intends to preserve some of the
locality of version 1 while more often breaking collisions due to long
filenames.

Distinguishing objects by more of the path is critical when there are
many name hash collisions and several versions of the same path in the
full history, giving a significant boost to the full repack case. The
locality of the hash function is critical to compressing something like
a shallow clone or a thin pack representing a push of a single commit.

This can be seen by running pt5313 on the open source fluentui
repository [1]. Most commits will have this kind of output for the thin
and big pack cases, though certain commits (such as [2]) will have
problematic thin pack size for other reasons.

[1] https://github.com/microsoft/fluentui
[2] a637a06df05360ce5ff21420803f64608226a875

Checked out at the parent of [2], I see the following statistics:

Test                                         HEAD
---------------------------------------------------------------
5313.2: thin pack with version 1             0.37(0.44+0.02)
5313.3: thin pack size with version 1                   1.2M
5313.4: big pack with version 1              2.04(7.77+0.23)
5313.5: big pack size with version 1                   20.4M
5313.6: shallow fetch pack with version 1    1.41(2.94+0.11)
5313.7: shallow pack size with version 1               34.4M
5313.8: repack with version 1                95.70(676.41+2.87)
5313.9: repack size with version 1                    439.3M
5313.10: thin pack with version 2            0.12(0.12+0.06)
5313.11: thin pack size with version 2                 22.0K
5313.12: big pack with version 2             2.80(5.43+0.34)
5313.13: big pack size with version 2                  25.9M
5313.14: shallow fetch pack with version 2   1.77(2.80+0.19)
5313.15: shallow pack size with version 2              33.7M
5313.16: repack with version 2               33.68(139.52+2.58)
5313.17: repack size with version 2                   160.5M

To make comparisons easier, I will reformat this output into a different
table style:

| Test         | V1 Time | V2 Time | V1 Size | V2 Size |
|--------------|---------|---------|---------|---------|
| Thin Pack    |  0.37 s |  0.12 s |   1.2 M |  22.0 K |
| Big Pack     |  2.04 s |  2.80 s |  20.4 M |  25.9 M |
| Shallow Pack |  1.41 s |  1.77 s |  34.4 M |  33.7 M |
| Repack       | 95.70 s | 33.68 s | 439.3 M | 160.5 M |

The v2 hash function successfully differentiates the CHANGELOG.md files
from each other, which leads to significant improvements in the thin
pack (simulating a push of this commit) and the full repack. There is
some bloat in the "big pack" scenario and essentially the same results
for the shallow pack.

In the case of the Git repository, these numbers show some of the issues
with this approach:

| Test         | V1 Time | V2 Time | V1 Size | V2 Size |
|--------------|---------|---------|---------|---------|
| Thin Pack    |  0.02 s |  0.02 s |   1.1 K |   1.1 K |
| Big Pack     |  1.69 s |  1.95 s |  13.5 M |  14.5 M |
| Shallow Pack |  1.26 s |  1.29 s |  12.0 M |  12.2 M |
| Repack       | 29.51 s | 29.01 s | 237.7 M | 238.2 M |

Here, the attempts to remove conflicts in the v2 function seem to cause
slight bloat to these sizes. This shows that the Git repository benefits
a lot from cross-path delta pairs.

The results are similar with the nodejs/node repo:

| Test         | V1 Time | V2 Time | V1 Size | V2 Size |
|--------------|---------|---------|---------|---------|
| Thin Pack    |  0.02 s |  0.02 s |   1.6 K |   1.6 K |
| Big Pack     |  4.61 s |  3.26 s |  56.0 M |  52.8 M |
| Shallow Pack |  7.82 s |  7.51 s | 104.6 M | 107.0 M |
| Repack       | 88.90 s | 73.75 s | 740.1 M | 764.5 M |

Here, the v2 name-hash causes some size bloat more often than it reduces
the size, but it also universally improves performance time, which is an
interesting reversal. This must mean that it is helping to short-circuit
some delta computations even if it is not finding the most efficient
ones. The performance improvement cannot be explained only due to the
I/O cost of writing the resulting packfile.

The Linux kernel repository was the initial target of the default name
hash value, and its naming conventions are practically build to take the
most advantage of the default name hash values:

| Test         | V1 Time  | V2 Time  | V1 Size | V2 Size |
|--------------|----------|----------|---------|---------|
| Thin Pack    |   0.17 s |   0.07 s |   4.6 K |   4.6 K |
| Big Pack     |  17.88 s |  12.35 s | 201.1 M | 159.1 M |
| Shallow Pack |  11.05 s |  22.94 s | 269.2 M | 273.8 M |
| Repack       | 727.39 s | 566.95 s |   2.5 G |   2.5 G |

Here, the thin and big packs gain some performance boosts in time, with
a modest gain in the size of the big pack. The shallow pack, however, is
more expensive to compute, likely because similarly-named files across
different directories are farther apart in the name hash ordering in v2.
The repack also gains benefits in computation time but no meaningful
change to the full size.

Finally, an internal Javascript repo of moderate size shows significant
gains when repacking with --name-hash-version=2 due to it having many name
hash collisions. However, it's worth noting that only the full repack
case has significant differences from the v1 name hash:

| Test      | V1 Time   | V2 Time  | V1 Size | V2 Size |
|-----------|-----------|----------|---------|---------|
| Thin Pack |    8.28 s |   7.28 s |  16.8 K |  16.8 K |
| Big Pack  |   12.81 s |  11.66 s |  29.1 M |  29.1 M |
| Shallow   |    4.86 s |   4.06 s |  42.5 M |  44.1 M |
| Repack    | 3126.50 s | 496.33 s |   6.2 G | 855.6 M |

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:43 -08:00
ce961135cc pack-objects: add GIT_TEST_NAME_HASH_VERSION
Add a new environment variable to opt-in to different values of the
--name-hash-version=<n> option in 'git pack-objects'. This allows for
extra testing of the feature without repeating all of the test
scenarios. Unlike many GIT_TEST_* variables, we are choosing to not add
this to the linux-TEST-vars CI build as that test run is already
overloaded. The behavior exposed by this test variable is of low risk
and should be sufficient to allow manual testing when an issue arises.

But this option isn't free. There are a few tests that change behavior
with the variable enabled.

First, there are a few tests that are very sensitive to certain delta
bases being picked. These are both involving the generation of thin
bundles and then counting their objects via 'git index-pack --fix-thin'
which pulls the delta base into the new packfile. For these tests,
disable the option as a decent long-term option.

Second, there are some tests that compare the exact output of a 'git
pack-objects' process when using bitmaps. The warning that ignores the
--name-hash-version=2 and forces version 1 causes these tests to fail.
Disable the environment variable to get around this issue.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:43 -08:00
928ef41dd8 repack: add --name-hash-version option
The new '--name-hash-version' option for 'git repack' is a simple
pass-through to the underlying 'git pack-objects' subcommand. However,
this subcommand may have other options and a temporary filename as part
of the subcommand execution that may not be predictable or could change
over time.

The existing test_subcommand method requires an exact list of arguments
for the subcommand. This is too rigid for our needs here, so create a
new method, test_subcommand_flex. Use it to check that the
--name-hash-version option is passing through.

Since we are modifying the 'git repack' command, let's bring its usage
in line with the Documentation's synopsis. This removes it from the
allow list in t0450 so it will remain in sync in the future.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:43 -08:00
fc62e033cd pack-objects: add --name-hash-version option
The previous change introduced a new pack_name_hash_v2() function that
intends to satisfy much of the hash locality features of the existing
pack_name_hash() function while also distinguishing paths with similar
final components of their paths.

This change adds a new --name-hash-version option for 'git pack-objects'
to allow users to select their preferred function version. This use of
an integer version allows for future expansion and a direct way to later
store a name hash version in the .bitmap format.

For now, let's consider how effective this mechanism is when repacking a
repository with different name hash versions. Specifically, we will
execute 'git pack-objects' the same way a 'git repack -adf' process
would, except we include --name-hash-version=<n> for testing.

On the Git repository, we do not expect much difference. All path names
are short. This is backed by our results:

| Stage                 | Pack Size | Repack Time |
|-----------------------|-----------|-------------|
| After clone           | 260 MB    | N/A         |
| --name-hash-version=1 | 127 MB    | 129s        |
| --name-hash-version=2 | 127 MB    | 112s        |

This example demonstrates how there is some natural overhead coming from
the cloned copy because the server is hosting many forks and has not
optimized for exactly this set of reachable objects. But the full repack
has similar characteristics for both versions.

Let's consider some repositories that are hitting too many collisions
with version 1. First, let's explore the kinds of paths that are
commonly causing these collisions:

 * "/CHANGELOG.json" is 15 characters, and is created by the beachball
   [1] tool. Only the final character of the parent directory can
   differentiate different versions of this file, but also only the two
   most-significant digits. If that character is a letter, then this is
   always a collision. Similar issues occur with the similar
   "/CHANGELOG.md" path, though there is more opportunity for
   differences In the parent directory.

 * Localization files frequently have common filenames but
   differentiates via parent directories. In C#, the name
   "/strings.resx.lcl" is used for these localization files and they
   will all collide in name-hash.

[1] https://github.com/microsoft/beachball

I've come across many other examples where some internal tool uses a
common name across multiple directories and is causing Git to repack
poorly due to name-hash collisions.

One open-source example is the fluentui [2] repo, which  uses beachball
to generate CHANGELOG.json and CHANGELOG.md files, and these files have
very poor delta characteristics when comparing against versions across
parent directories.

| Stage                 | Pack Size | Repack Time |
|-----------------------|-----------|-------------|
| After clone           | 694 MB    | N/A         |
| --name-hash-version=1 | 438 MB    | 728s        |
| --name-hash-version=2 | 168 MB    | 142s        |

[2] https://github.com/microsoft/fluentui

In this example, we see significant gains in the compressed packfile
size as well as the time taken to compute the packfile.

Using a collection of repositories that use the beachball tool, I was
able to make similar comparisions with dramatic results. While the
fluentui repo is public, the others are private so cannot be shared for
reproduction. The results are so significant that I find it important to
share here:

| Repo     | --name-hash-version=1 | --name-hash-version=2 |
|----------|-----------------------|-----------------------|
| fluentui |               440 MB  |               161 MB  |
| Repo B   |             6,248 MB  |               856 MB  |
| Repo C   |            37,278 MB  |             6,755 MB  |
| Repo D   |           131,204 MB  |             7,463 MB  |

Future changes could include making --name-hash-version implied by a config
value or even implied by default during a full repack.

It is important to point out that the name hash value is stored in the
.bitmap file format, so we must force --name-hash-version=1 when bitmaps
are being read or written. Later, the bitmap format could be updated to
be aware of the name hash version so deltas can be quickly computed
across the bitmapped/not-bitmapped boundary. To promote the safety of
this parameter, the validate_name_hash_version() method will die() if
the given name-hash version is incorrect and will disable newer versions
if not yet compatible with other features, such as --write-bitmap-index.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:41 -08:00
dca924b450 pack-objects: create new name-hash function version
As we will explore in later changes, the default name-hash function used
in 'git pack-objects' has a tendency to cause collisions and cause poor
delta selection. This change creates an alternative that avoids some
collisions while preserving some amount of hash locality.

The pack_name_hash() method has not been materially changed since it was
introduced in ce0bd64 (pack-objects: improve path grouping
heuristics., 2006-06-05). The intention here is to group objects by path
name, but also attempt to group similar file types together by making
the most-significant digits of the hash be focused on the final
characters.

Here's the crux of the implementation:

	/*
	 * This effectively just creates a sortable number from the
	 * last sixteen non-whitespace characters. Last characters
	 * count "most", so things that end in ".c" sort together.
	 */
	while ((c = *name++) != 0) {
		if (isspace(c))
			continue;
		hash = (hash >> 2) + (c << 24);
	}

As the comment mentions, this only cares about the last sixteen
non-whitespace characters. This cause some filenames to collide more than
others. This collision is somewhat by design in order to promote hash
locality for files that have similar types (.c, .h, .json) or could be the
same file across a directory rename (a/foo.txt to b/foo.txt). This leads to
decent cross-path deltas in cases like shallow clones or packing a
repository with very few historical versions of files that share common data
with other similarly-named files.

However, when the name-hash instead leads to a large number of name-hash
collisions for otherwise unrelated files, this can lead to confusing the
delta calculation to prefer cross-path deltas over previous versions of the
same file.

The new pack_name_hash_v2() function attempts to fix this issue by
taking more of the directory path into account through its hash
function. Its naming implies that we will later wire up details for
choosing a name-hash function by version.

The first change is to be more careful about paths using non-ASCII
characters. With these characters in mind, reverse the bits in the byte
as the least-significant bits have the highest entropy and we want to
maximize their influence. This is done with some bit manipulation that
swaps the two halves, then the quarters within those halves, and then
the bits within those quarters.

The second change is to perform hash composition operations at every
level of the path. This is done by storing a 'base' hash value that
contains the hash of the parent directory. When reaching a directory
boundary, we XOR the current level's name-hash value with a downshift of
the previous level's hash. This perturbation intends to create low-bit
distinctions for paths with the same final 16 bytes but distinct parent
directory structures.

The collision rate and effectiveness of this hash function will be
explored in later changes as the function is integrated with 'git
pack-objects' and 'git repack'.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:05 -08:00
f11f0a5a2d refs/reftable: fix uninitialized memory access of max_index
When migrating reflogs between reference backends, maintaining the
original order of the reflog entries is crucial. To achieve this, an
`index` field is stored within the `ref_update` struct that encodes the
relative order of reflog entries. This field is used by the reftable
backend as update index for the respective reflog entries to maintain
that ordering.

These update indices must be respected when writing table headers, which
encode the minimum and maximum update index of contained records in the
header and footer. This logic was added in commit bc67b4ab5f (reftable:
write correct max_update_index to header, 2025-01-15), which started to
use `reftable_writer_set_limits()` to propagate the mininum and maximum
update index of all records contained in a ref transaction.

However, we only set the maximum update index for the first transaction
argument, even though there can be multiple such arguments. This is the
case when we write to multiple stacks in a single transaction, e.g. when
updating references in two different worktrees at once. Consequently,
the update index for all but the first argument remain uninitialized,
which may cause undefined behaviour.

Fix this by moving the assignment of the maximum update index in
`reftable_be_transaction_finish()` inside the loop, which ensures that
all elements of the array are correctly initialized.

Furthermore, initialize the `max_index` field to 0 when queueing a new
transaction argument. This is not strictly necessary, as all elements of
`write_transaction_table_arg.max_index` are now assigned correctly.
However, this initialization is added for consistency and to safeguard
against potential future changes that might inadvertently introduce
uninitialized memory access.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 08:21:41 -08:00
93dc16483a fetch set_head: fix non-mirror remotes in bare repositories
In b1b713f722 (fetch set_head: handle mirrored bare repositories,
2024-11-22) it was implicitly assumed that all remotes will be mirrors
in a bare repository, thus fetching a non-mirrored remote could lead to
HEAD pointing to a non-existent reference. Make sure we only overwrite
HEAD if we are in a bare repository and fetching from a mirror.
Otherwise, proceed as normally, and create
refs/remotes/<nonmirrorremote>/HEAD instead.

Reported-by: Christian Hesse <list@eworm.de>
Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 08:16:47 -08:00
638060dcb9 fetch set_head: refactor to use remote directly
As a preparatory step to use even more properties from the remote
struct, refactor set_head to take the entire struct as a parameter,
instead of the necessary bits. This also allows consolidating the use of
gtransport->remote in set_head, making the access of the remote's
properties consistent in the function.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 08:16:45 -08:00
9a84794ad8 bundle: avoid closing file descriptor twice
Already when introduced in c7a8a16239 (Add bundle transport,
2007-09-10), the `bundle` transport had a bug where it would open a file
descriptor to the bundle file and then close it _twice_: First, the file
descriptor (`data->fd`) is passed to `unbundle()`, which would use it as
the `stdin` of the `index-pack` process, which as a consequence would
close it via `start_command()`. However, `data->fd` would still hold the
numerical value of the file descriptor, and `close_bundle()` would see
that and happily close it again.

This seems not to have caused too many problems in almost two decades,
but I encountered a situation today where it _does_ cause problems: In
i686 variants of Git for Windows, it seems that file descriptors are
reused quickly after they have been closed.

In the particular scenario I faced, `git fetch <bundle> <ref>` gets the
same file descriptor value when opening the bundle file and importing
its embedded packfile (which implicitly closes the file descriptor) and
then when opening a pack file in `fetch_and_consume_refs()` while
looking up an object's header.

Later on, after the bundle has been imported (and the `close_bundle()`
function erroneously closes the file descriptor that has _already_ been
closed when using it as `stdin` for `git index-pack`), the same file
descriptor value has now been reused via `use_pack()`. Now, when either
the recursive fetch (which defaults to "on", unfortunately) or a
commit-graph update needs to `mmap()` the packfile, it fails due to a
now-invalid file descriptor that _should_ point to the pack file but
doesn't anymore.

To fix that, let's invalidate `data->fd` after calling `unbundle()`.
That way, `close_bundle()` does not close a file descriptor that may
have been reused for something different. While at it, document that
`unbundle()` closes the file descriptor, and ensure that it also does
that when failing to verify the bundle.

Luckily, this bug does not affect the bundle URI feature, it only
affects the `git fetch <bundle>` code path.

Note that this patch does not _completely_ clarifies who is responsible
to close that file descriptor, as `run_command()` may fail _without_
closing `cmd->in`. Addressing this issue thoroughly, however, would
require a rather thorough re-design of the `start_command()` and
`finish_command()` functionality to make it a lot less murky who is
responsible for what file descriptors.

At least this here patch is relatively easy to reason about, and
addresses a hard failure (`fatal: mmap: could not determine filesize`)
at the expense of leaking a file descriptor under very rare
circumstances in which `git fetch` would error out anyway.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-25 18:38:11 -08:00
08032fa30f gc: add --expire-to option
This commit extends the functionality of `git gc`
by adding a new option, `--expire-to=<dir>`. Previously,
this feature was implemented in 91badeba32 (builtin/repack.c:
implement `--expire-to` for storing pruned objects, 2022-10-24),
which allowing users to specify a directory where unreachable
and expired cruft packs are stored during garbage collection.
However, users had to run `git repack --cruft --expire-to=<dir>`
followed by `git prune` to achieve similar results within `git gc`.

By introducing `--expire-to=<dir>` directly into `git gc`,
we simplify the process for users who wish to manage their
repository's cleanup more efficiently. This change involves
passing the `--expire-to=<dir>` parameter through to `git repack`,
making it easier for users to set up a backup location for cruft
packs that will be pruned.

Due to the original `git gc --prune=now` deleting all unreachable
objects by passing the `-a` parameter to git repack. With the
addition of the `--cruft` and `--expire-to` options, it is necessary
to modify this default behavior: instead of deleting these
unreachable objects, they should be merged into a cruft pack and
collected in a specified directory. Therefore, we do not pass `-a`
to the repack command but instead pass `--cruft`, `--expire-to`,
and `--cruft-expiration=now` to repack.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-24 14:32:28 -08:00
6bba6f604b config.txt: add trailer.* variables
The trailer.* configuration variables are currently only described in
git-interpret-trailers(1) but affect git-commit and git-tag as well.
Move that section into its own config/trailer.txt file and also include
it in git-config(1).

Signed-off-by: Julian Prein <julian@druckdev.xyz>
Acked-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-24 12:37:43 -08:00
8ccc75c245 remote: announce removal of "branches/" and "remotes/"
Back when Git was in its infancy, remotes were configured via separate
files in "branches/" (back in 2005). This mechanism was replaced later
that year with the "remotes/" directory. Both mechanisms have eventually
been replaced by config-based remotes, and it is very unlikely that
anybody still uses these directories to configure their remotes.

Both of these directories have been marked as deprecated, one in 2005
and the other one in 2011. Follow through with the deprecation and
finally announce the removal of these features in Git 3.0.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: with a small tweak to the help message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-24 08:08:56 -08:00
5f8f7081f7 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 15:07:03 -08:00
39ba2e8e56 Merge branch 'jc/cli-doc-option-and-config'
Doc update.

* jc/cli-doc-option-and-config:
  gitcli: document that command line trumps config and env
2025-01-23 15:07:02 -08:00
6ecb4fc149 Merge branch 'mh/doc-credential-helpers-with-pat'
Document that it is insecure to use Personal Access Tokens, which
some hosting providers take as username/password, embedded in URLs.

* mh/doc-credential-helpers-with-pat:
  docs: discuss caching personal access tokens
  docs: list popular credential helpers
2025-01-23 15:07:02 -08:00
294673a17e Merge branch 'ak/instaweb-python-port-binding-fix'
The "instaweb" bound only to local IP address without "--local" and
to all addresses with "--local", which was the other way around, when
using Python's http.server class, which has been corrected.

* ak/instaweb-python-port-binding-fix:
  instaweb: fix ip binding for the python http.server
2025-01-23 15:07:02 -08:00
aa31820d9d Merge branch 'sj/meson-doc-technical-dependency-fix'
The meson build procedure for Documentation/technical/ hierarchy was
missing necessary dependencies, which has been corrected.

* sj/meson-doc-technical-dependency-fix:
  meson: fix missing deps for technical articles
2025-01-23 15:07:02 -08:00
d8093fd6c1 Merge branch 'tc/meson-use-our-version-def-h'
The meson build procedure looked for the 'version-def.h' file in a
wrong directory, which has been corrected.

* tc/meson-use-our-version-def-h:
  meson: ensure correct version-def.h is used
2025-01-23 15:07:01 -08:00
7e3cb2e515 Merge branch 'en/object-name-with-funny-refname-fix'
Extended SHA-1 expression parser did not work well when a branch
with an unusual name (e.g. "foo{bar") is involved.

* en/object-name-with-funny-refname-fix:
  object-name: be more strict in parsing describe-like output
  object-name: fix resolution of object names containing curly braces
2025-01-23 15:07:01 -08:00
0cb454c072 Merge branch 'ds/path-walk-1' into ds/backfill
* ds/path-walk-1:
  path-walk: drop redundant parse_tree() call
  path-walk: reorder object visits
  path-walk: mark trees and blobs as UNINTERESTING
  path-walk: visit tags and cached objects
  path-walk: allow consumer to specify object types
  t6601: add helper for testing path-walk API
  test-lib-functions: add test_cmp_sorted
  path-walk: introduce an object walk by path
2025-01-23 12:00:40 -08:00
04292c3796 hash.h: drop unsafe_ function variants
Now that all callers have been converted from:

    the_hash_algo->unsafe_init_fn();

to

    unsafe_hash_algo(the_hash_algo)->init_fn();

and similar, we can remove the scaffolding for the unsafe_ function
variants and force callers to use the new unsafe_hash_algo() mechanic
instead.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:17 -08:00
a8dd3821fe csum-file: introduce hashfile_checkpoint_init()
In 106140a99f (builtin/fast-import: fix segfault with unsafe SHA1
backend, 2024-12-30) and 9218c0bfe1 (bulk-checkin: fix segfault with
unsafe SHA1 backend, 2024-12-30), we observed the effects of failing to
initialize a hashfile_checkpoint with the same hash function
implementation as is used by the hashfile it is used to checkpoint.

While both 106140a99f and 9218c0bfe1 work around the immediate crash,
changing the hash function implementation within the hashfile API to,
for example, the non-unsafe variant would re-introduce the crash. This
is a result of the tight coupling between initializing hashfiles and
hashfile_checkpoints.

Introduce and use a new function which ensures that both parts of a
hashfile and hashfile_checkpoint pair use the same hash function
implementation to avoid such crashes.

A few things worth noting:

  - In the change to builtin/fast-import.c::stream_blob(), we can see
    that by removing the explicit reference to
    'the_hash_algo->unsafe_init_fn()', we are hardened against the
    hashfile API changing away from the_hash_algo (or its unsafe
    variant) in the future.

  - The bulk-checkin code no longer needs to explicitly zero-initialize
    the hashfile_checkpoint, since it is now done as a result of calling
    'hashfile_checkpoint_init()'.

  - Also in the bulk-checkin code, we add an additional call to
    prepare_to_stream() outside of the main loop in order to initialize
    'state->f' so we know which hash function implementation to use when
    calling 'hashfile_checkpoint_init()'.

    This is OK, since subsequent 'prepare_to_stream()' calls are noops.
    However, we only need to call 'prepare_to_stream()' when we have the
    HASH_WRITE_OBJECT bit set in our flags. Without that bit, calling
    'prepare_to_stream()' does not assign 'state->f', so we have nothing
    to initialize.

  - Other uses of the 'checkpoint' in 'deflate_blob_to_pack()' are
    appropriately guarded.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:17 -08:00
3339180b28 t/helper/test-hash.c: use unsafe_hash_algo()
Remove a series of conditionals within the shared cmd_hash_impl() helper
that powers the 'sha1' and 'sha1-unsafe' helpers.

Instead, replace them with a single conditional that transforms the
specified hash algorithm into its unsafe variant. Then all subsequent
calls can directly use whatever function it wants to call without having
to decide between the safe and unsafe variants.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:17 -08:00
f0c266af4e csum-file.c: use unsafe_hash_algo()
Instead of calling the unsafe_ hash function variants directly, make use
of the shared 'algop' pointer by initializing it to:

    f->algop = unsafe_hash_algo(the_hash_algo);

, thus making all calls use the unsafe variants directly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:17 -08:00
7b081d2f70 hash.h: introduce unsafe_hash_algo()
In 253ed9ecff (hash.h: scaffolding for _unsafe hashing variants,
2024-09-26), we introduced "unsafe" variants of the SHA-1 hashing
functions by introducing new functions like "unsafe_init_fn()" and so
on.

This approach has a major shortcoming that callers must remember to
consistently use one variant or the other. Failing to consistently use
(or not use) the unsafe variants can lead to crashes at best, or subtle
memory corruption issues at worst.

In the hashfile API, this isn't difficult to achieve, but verifying that
all callers consistently use the unsafe variants is somewhat of a chore
given how spread out all of the callers are. In the sha1 and sha1-unsafe
test helpers, all of the calls to various hash functions are guarded by
an "if (unsafe)" conditional, which is repetitive and cumbersome.

Address these issues by introducing a new pattern whereby one
'git_hash_algo' can return a pointer to another 'git_hash_algo' that
represents the unsafe version of itself. So instead of having something
like:

    if (unsafe)
      the_hash_algo->init_fn(...);
      the_hash_algo->update_fn(...);
      the_hash_algo->final_fn(...);
    else
      the_hash_algo->unsafe_init_fn(...);
      the_hash_algo->unsafe_update_fn(...);
      the_hash_algo->unsafe_final_fn(...);

we can instead write:

    struct git_hash_algo *algop = the_hash_algo;
    if (unsafe)
      algop = unsafe_hash_algo(algop);

    algop->init_fn(...);
    algop->update_fn(...);
    algop->final_fn(...);

This removes the existing shortcoming by no longer forcing the caller to
"remember" which variant of the hash functions it wants to call, only to
hold onto a 'struct git_hash_algo' pointer that is initialized once.

Similarly, while there currently is still a way to "mix" safe and unsafe
functions, this too will go away after subsequent commits remove all
direct calls to the unsafe_ variants.

Note that hash_algo_by_ptr() needs an adjustment to allow passing in the
unsafe variant of a hash function. All other query functions on the
hash_algos array will continue to return the safe variants of any
function.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:16 -08:00
5fcc683338 csum-file.c: extract algop from hashfile_checksum_valid()
Perform a similar transformation as in the previous commit, but focused
instead on hashfile_checksum_valid(). This function does not work with a
hashfile structure itself, and instead validates the raw contents of a
file written using the hashfile API.

We'll want to be prepared for a similar change to this function in the
future, so prepare ourselves for that by extracting 'the_hash_algo' into
its own field for use within this function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:16 -08:00
48524fac64 csum-file: store the hash algorithm as a struct field
Throughout the hashfile API, we rely on a reference to 'the_hash_algo',
and call its _unsafe function variants directly.

Prepare for a future change where we may use a different 'git_hash_algo'
pointer (instead of just relying on 'the_hash_algo' throughout) by
making the 'git_hash_algo' pointer a member of the 'hashfile' structure
itself.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:16 -08:00
d9213e4716 t/helper/test-tool: implement sha1-unsafe helper
With the new "unsafe" SHA-1 build knob, it is convenient to have a
test-tool that can exercise Git's unsafe SHA-1 wrappers for testing,
similar to 't/helper/test-tool sha1'.

Implement that helper by altering the implementation of that test-tool
(in cmd_hash_impl(), which is generic and parameterized over different
hash functions) to conditionally run the unsafe variants of the chosen
hash function, and expose the new behavior via a new 'sha1-unsafe' test
helper.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:28:16 -08:00
2fd367cf63 trace2: prevent segfault on config collection with valueless true
When TRACE2 analytics is enabled, a configuration variable set to
"valueless true" causes a segfault.

Steps to Reproduce

    GIT_TRACE2=true GIT_TRACE2_CONFIG_PARAMS=status.*
    git -c status.relativePaths version
    Expected Result
    git version 2.46.0
    Actual Result
    zsh: segmentation fault GIT_TRACE2=true

Add checks to prevent the segfault and instead show that the
variable without value.

Signed-off-by: Adam Murray <ad@canva.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 10:01:56 -08:00
3519492430 refs: fix creation of reflog entries for symrefs
The commit 297c09eabb (refs: allow multiple reflog entries for the
same refname, 2024-12-16) added logic to exit early in
`lock_ref_for_update()` after obtaining the required lock. This was
added as a performance optimization on a false assumption that no
further processing was required for reflog-only updates.

However the assumption was wrong.  For a symref's reflog entry, the
update needs to be populated with the old_oid value, but the early
exit skipped this necessary step.

This caused a bug in Git 2.48 in the files backend where target
references of symrefs being updated would create a corrupted reflog
entry for the symref since the old_oid is not populated.

Everything the early exit skipped in the code path is necessary for
both regular and symbolic ref, so eliminate the mistaken
optimization, and also add a test to ensure that such an issue
doesn't arise in the future.

Reported-by: Nika Layzell <nika@thelayzells.com>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23 09:56:22 -08:00
b224e8e36c path-walk: drop redundant parse_tree() call
This call to parse_tree() was flagged by Coverity for ignoring the
return value. But if we look a little further up the function, we can
see that there is already a call to parse_tree_gently(), and we'll
return early if that fails. So by this point the tree will always be
parsed, and the call is redundant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 17:52:44 -08:00
ee109848cf Merge branch 'ps/build-meson-fixes' into ps/zlib-ng
* ps/build-meson-fixes:
  ci: wire up Visual Studio build with Meson
  ci: raise error when Meson generates warnings
  meson: fix compilation with Visual Studio
  meson: make the CSPRNG backend configurable
  meson: wire up fuzzers
  meson: wire up generation of distribution archive
  meson: wire up development environments
  meson: fix dependencies for generated headers
  meson: populate project version via GIT-VERSION-GEN
  GIT-VERSION-GEN: allow running without input and output files
  GIT-VERSION-GEN: simplify computing the dirty marker
2025-01-22 13:39:42 -08:00
7304bd2bc3 ci: wire up Visual Studio build with Meson
Add a new job to GitHub Actions and GitLab CI that builds and tests
Meson-based builds with Visual Studio.

A couple notes:

  - While the build job is mandatory, the test job is marked as "manual"
    on GitLab so that it doesn't run by default. We already have a bunch
    of Windows-based jobs, and the computational overhead that these
    cause is simply out of proportion to run the test suite twice.

    The same isn't true for GitHub as I could not find a way to make a
    subset of jobs manually triggered.

  - We disable Perl. This is because we pick up Perl from Git for
    Windows, which outputs different paths ("/c/" instead of "C:\") than
    what we expect in our tests.

  - We don't use the Git for Windows SDK. Instead, the build only
    depends on Visual Studio, Meson and Git for Windows. All the other
    dependencies like curl, pcre2 and zlib get pulled in and compiled
    automatically by Meson and thus do not have to be provided by the
    system.

  - We open-code "ci/run-test-slice.sh". This is because we only have
    direct access to PowerShell, so we manually implement the logic.
    There is an upstream pull request for the Meson build system [1] to
    implement test slicing in Meson directly.

  - We don't process test artifacts for failed CI jobs. This is done to
    keep down prerequisites to a minimum.

All tests are passing.

[1]: https://github.com/mesonbuild/meson/pull/14092

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:35 -08:00
a8179952e1 ci: raise error when Meson generates warnings
Meson prints warnings in several cases, like for example when using a
feature supported by the current version of Meson, but not yet supported
by the minimum required version as declared by the project. These
warnings will not cause the setup to fail by default, which makes it
quite easy to miss them.

Improve this by passing `--fatal-meson-warnings` to `meson setup` so
that our CI jobs will fail on warnings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:34 -08:00
13cb20fc46 meson: fix compilation with Visual Studio
The Visual Studio compiler defaults to C89 unless explicitly asked to
use a different version of the C standard. We don't specify any C
standard at all though in our Meson build, and consequently compiling
Git fails:

    ...\git\git-compat-util.h(14): fatal error C1189: #error:  "Required C99 support is in a test phase.  Please see git-compat-util.h for more details."

Fix the issue by specifying the project's C standard. Funny enough,
specifying C99 does not work because apparently, `__STDC_VERSION__` is
not getting defined in that version at all. Instead, we have to specify
C11 as the project's C standard, which is also done in our CMake build
instructions.

We don't want to generally enforce C11 though, as our requiremets only
state that a C99 compiler is required. In fact, we don't even require
plain C99, but rather the GNU variant thereof.

Meson allows us to handle this case rather easily by specifying
"gnu99,c11", which will cause it to fall back to C11 in case GNU C99 is
unsupported. This feature has only been introduced with Meson 1.3.0
though, and we support 0.61.0 and newer. In case we use such an oldish
version though we fall back to requiring GNU99 unconditionally. This
means that Windows essentially requires Meson 1.3.0 and newer when using
Visual Studio, but I doubt that this is ever going to be a real problem.

Tested-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:34 -08:00
ef8c3a1b8a meson: make the CSPRNG backend configurable
The CSPRNG backend is not configurable in Meson and isn't quite
discoverable, either. Make it configurable and add the actual backend
used to the summary.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:34 -08:00
28911f7dca meson: wire up fuzzers
Meson does not yet know to build our fuzzers. Introduce a new build
option "fuzzers" and wire up the fuzzers in case it is enabled. Adapt
our CI jobs so that they build the fuzzers by default.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:33 -08:00
88d4bff8c3 meson: wire up generation of distribution archive
Meson knows to generate distribution archives via `meson dist`. In
addition to generating the archive itself, this target also knows to
compile and execute tests from that archive, which helps to ensure that
the result is an adequate drop-in replacement for the versioned project.

While this already works as-is, one omission is that we don't propagate
the commit that this is built from into the resulting archive. This can
be fixed though by adding a distribution script that propagates the
version into the "version" file, which GIT-VERSION-GEN knows to read if
present.

Use GIT-VERSION-GEN to populate that file. As the script is executed in
the build directory, not in the directory where we generate the archive,
we have to use a shell to resolve the "MESON_DIST_ROOT" environment
variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:33 -08:00
5d0cf6bb3a meson: wire up development environments
The Meson build system is able to wire up development environments. The
intent is to make build artifacts of the project available. This is
typically used to export e.g. paths to linkable libraries, which isn't
all that interesting in our context given that we don't have an official
library interface.

But what we can use this mechanism for is to expose the built Git
executables as well as the build directory. This allows users to play
around with the built Git version in the devenv, and allows them to
execute our test scripts directly with the built distribution.

Wire up this feature, which can then be used via `meson devenv` in the
build directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:33 -08:00
53d75bd3e4 meson: fix dependencies for generated headers
We generate a couple of headers from our documentation. These headers
are added to the libgit sources, but two of them aren't used by the
library, but instead by our builtins. This can cause parallel builds to
fail because the builtin object may be compiled before the header was
generated.

Fix the issue by adding both "config-list.h" and "hook-list.h" to the
list of builtin sources. While "command-list.h" is generated similarly,
it is used by "help.c" and thus part of the libgit sources indeed.

Reported-by: Evan Martin <evan.martin@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:33 -08:00
6ff99174d1 meson: populate project version via GIT-VERSION-GEN
The Git version for Meson is currently wired up manually. It can thus
grow (and already has grown) stale quite easily, as having multiple
sources of truth is never a good idea. This issue is mostly of cosmetic
nature as we don't use the project version anywhere, and instead use the
GIT-VERSION-GEN script to propagate the correct version into our build.
But it is somewhat puzzling when `meson setup` announces to build an old
Git release.

There are a couple of alternatives for how to solve this:

  - We can keep the version undefined, but this makes Meson output
    "undefined" for the version, as well.

  - We can use GIT-VERSION-GEN to generate the version for us. At the
    point of configuring the project we haven't yet figured out host
    details though, and thus we didn't yet set up the shell environment.
    While not an issue for Unix-based systems, this would be an issue in
    Windows, where the shell typically gets provided via Git for Windows
    and thus requires some special setup.

  - We can pull the default version out of GIT-VERSION-GEN and move it
    into its own file. This likely requires some adjustments for scripts
    that bump the version, but allows Meson to read the version from
    that file trivially.

Pick the second option and use GIT-VERSION-GEN as it gives us the most
accurate version. In order to fix the bootstrapping issue on Windows
systems we simply set the version to 'unknown' in case no shell was
found. As the version is only of cosmetic value this isn't really much
of an issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:32 -08:00
f6a2efdc9b GIT-VERSION-GEN: allow running without input and output files
The GIT-VERSION-GEN script requires an input file containing formatting
directives to be replaced as well as an output file that will get
overwritten in case the file contents have changed. When computing the
project version for Meson we don't want to have either though:

  - We only want to compute the version without anything else, but don't
    have an input file that would match that exact format. While we
    could of course introduce a new file just for that usecase, it feels
    suboptimal to add another file every time we want to have a slightly
    different format for versioned data.

  - The computed version needs to be read from stdout so that Meson can
    wire it up for the project.

Extend the script to handle both usecases by recognizing `--format=` as
alternative to providing an input path and by writing to stdout in case
no output file was given.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:32 -08:00
e40622a60b GIT-VERSION-GEN: simplify computing the dirty marker
The GIT-VERSION-GEN script computes the version that Git is being built
from. When building from a commit with an unclean worktree it knows to
append "-dirty" to that version to indicate that there were custom
changes applied and that it isn't the exact same as that commit.

The dirtiness check is done manually via git-diff-index(1), which is
somewhat puzzling though: we already use git-describe(1) to compute the
version, which also knows to compute dirtiness via the "--dirty" flag.
But digging back in history explains why: the "-dirty" suffix was added
in 31e0b2ca81 (GIT 1.5.4.3, 2008-02-23), and git-describe(1) didn't yet
have support for "--dirty" back then.

Refactor the script to use git-describe(1). Despite being simpler, it
also results in a small speedup:

    Benchmark 1: git describe --dirty --match "v[0-9]*"
      Time (mean ± σ):      12.5 ms ±   0.3 ms    [User: 6.3 ms, System: 8.8 ms]
      Range (min … max):    12.0 ms …  13.5 ms    200 runs

    Benchmark 2: git describe --match "v[0-9]*" HEAD && git update-index -q --refresh && git diff-index --name-only HEAD --
      Time (mean ± σ):      17.9 ms ±   1.1 ms    [User: 8.8 ms, System: 14.4 ms]
      Range (min … max):    17.0 ms …  30.6 ms    148 runs

    Summary
      git describe --dirty --match "v[0-9]*" ran
        1.43 ± 0.09 times faster than git describe --match "v[0-9]*" && git update-index -q --refresh && git diff-index --name-only HEAD --

While the speedup doesn't really matter on Unix-based systems, where
filesystem operations are typically fast, they do matter on Windows
where the commands take a couple hundred milliseconds. A quick and dirty
check on that system shows a speedup from ~800ms to ~400ms.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:37:32 -08:00
68f51871df builtin/pack-redundant: remove subcommand with breaking changes
The git-pack-redundant(1) subcommand has been castrated to require
the "--i-still-use-this" option to do anything since 4406522b
(pack-redundant: escalate deprecation warning to an error,
2023-03-23), which appeared in Git 2.41 and was announced for
removal with 53a92c9552 (Documentation/BreakingChanges: announce
removal of git-pack-redundant(1), 2024-09-02). Stop compiling the
subcommand in case the `WITH_BREAKING_CHANGES` build flag is set.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:34:05 -08:00
4b5073c64b ci: repurpose "linux-gcc" job for deprecations
The "linux-gcc" job isn't all that interesting by itself and can be
considered more or less the "standard" job: it is running with a
reasonably up-to-date image and uses GCC as a compiler, both of which we
already cover in other jobs.

There is one exception though: we change the default branch to be "main"
instead of "master", so it is forging ahead a bit into the future to
make sure that this change does not cause havoc. So let's expand on this
a bit and also add the new "WITH_BREAKING_CHANGES" flag to the mix.

Rename the job to "linux-breaking-changes" accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:28:28 -08:00
04c29bdea0 ci: merge linux-gcc-default into linux-gcc
The "linux-gcc-default" job is mostly doing the same as the "linux-gcc"
job, except for a couple of minor differences:

  - We use an explicit GCC version instead of the default version
    provided by the distribution. We have other jobs that test with
    "gcc-8", making this distinction pointless.

  - We don't set up the Python version explicitly, and instead use the
    default Python version. Python 2 has been end-of-life for quite a
    while now though, making this distinction less interesting.

  - We set up the default branch name to be "main" in "linux-gcc". We
    have other testcases that don't and also some that explicitly use
    "master".

  - We use "ubuntu:20.04" in one job and "ubuntu:latest" in another. We
    already have a couple other jobs testing these respectively.

So overall, the job does not add much to our test coverage.

Drop the "linux-gcc-default" job and adapt "linux-gcc" to start using
the default GCC compiler, effectively merging those two jobs into one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:28:27 -08:00
c5bc9a7f94 Makefile: wire up build option for deprecated features
With 57ec9254eb (docs: introduce document to announce breaking changes,
2024-06-14), we have introduced a new document that tracks upcoming
breaking changes in the Git project. In 2454970930 (BreakingChanges:
early adopter option, 2024-10-11) we have amended the document a bit to
mention that any introduced breaking changes must be accompanied by
logic that allows us to enable the breaking change at compile-time.
While we already have two breaking changes lined up, neither of them has
such a switch because they predate those instructions.

Introduce the proposed `WITH_BREAKING_CHANGES` preprocessor macro and
wire it up with both our Makefiles and Meson. This does not yet wire up
the build flag for existing deprecations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:28:27 -08:00
a0bea0978f refs: fix migration of reflogs respecting "core.logAllRefUpdates"
In 246cebe320 (refs: add support for migrating reflogs, 2024-12-16) we
have added support to git-refs(1) to migrate reflogs between reference
backends. It was reported [1] though that not we don't migrate reflogs
for a subset of references, most importantly "refs/stash".

This issue is caused by us still honoring "core.logAllRefUpdates" when
trying to migrate reflogs: we do queue the updates, but depending on the
value of that config we may decide to just skip writing the reflog entry
altogether. And given that:

  - The default for "core.logAllRefUpdates" is to only create reflogs
    for branches, remotes, note refs and "HEAD"

  - "refs/stash" is neither of these ref types.

We end up skipping the reflog creation for that particular reference.

Fix the bug by setting `REF_FORCE_CREATE_REFLOG`, which instructs the
ref backends to create the reflog entry regardless of the config or any
preexisting state.

[1]: <Z5BTQRlsOj1sygun@tapette.crustytoothpaste.net>

Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 10:00:33 -08:00
017bd89239 reftable: prevent 'update_index' changes after adding records
The function `reftable_writer_set_limits()` allows updating the
'min_update_index' and 'max_update_index' of a reftable writer. These
values are written to both the writer's header and footer.

Since the header is written during the first block write, any subsequent
changes to the update index would create a mismatch between the header
and footer values. The footer would contain the newer values while the
header retained the original ones.

To protect against this bug, prevent callers from updating these values
after any record is written. To do this, modify the function to return
an error whenever the limits are modified after any record adds. Check
for record adds within `reftable_writer_set_limits()` by checking the
`last_key` and `next` variable. The former is updated after each record
added, but is reset at certain points. The latter is set after writing
the first block.

Modify all callers of the function to anticipate a return type and
handle it accordingly. Add a unit test to also ensure the function
returns the error as expected.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 09:51:36 -08:00
e7c1b9f123 refs: use 'uint64_t' for 'ref_update.index'
The 'ref_update.index' variable is used to store an index for a given
reference update. This index is used to order the updates in a
predetermined order, while the default ordering is alphabetical as per
the refname.

For large repositories with millions of references, it should be safer
to use 'uint64_t'. Let's do that. This also is applied for all other
code sections where we store 'index' and pass it around.

Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 09:51:36 -08:00
af47976cc0 refs: mark ref_transaction_update_reflog() as static
The `ref_transaction_update_reflog()` function is only used within
'refs.c', so mark it as static.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 09:51:35 -08:00
33319b0976 reftable: address trivial -Wsign-compare warnings
Address the last couple of trivial -Wsign-compare warnings in the
reftable library and remove the DISABLE_SIGN_COMPARE_WARNINGS macro that
we have in "reftable/system.h".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:30 -08:00
7c4c1cbc0b reftable/blocksource: adjust read_block() to return ssize_t
The `block_source_read_block()` function and its implementations return
an integer as a result that reflects either the number of bytes read, or
an error. As such its return type, a signed integer, isn't wrong, but it
doesn't give the reader a good hint what it actually returns.

Refactor the function to return an `ssize_t` instead, which is typical
for functions similar to read(3p) and should thus give readers a better
signal what they can expect as a result.

Adjust callers to better handle the returned value to avoid warnings
with -Wsign-compare. One of these callers is `reader_get_block()`, whose
return value is only ever used by its callers to figure out whether or
not the read was successful. So instead of bubbling up the `ssize_t`
there, too, we adapt it to only indicate success or errors.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:30 -08:00
1f054af72f reftable/blocksource: adjust type of the block length
The block length is used to track the number of bytes available in a
specific block. As such, it is never set to a negative value, but is
still represented by a signed integer.

Adjust the type of the variable to be `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:30 -08:00
b1e4b6f4dc reftable/block: adjust type of the restart length
The restart length is tracked as a positive integer even though it
cannot ever be negative. Furthermore, it is effectively capped via the
MAX_RESTARTS variable.

Adjust the type of the variable to be `uint32_t`. While this type is
excessive given that MAX_RESTARTS fits into an `uint16_t`, other places
already use 32 bit integers for restarts, so this type is being more
consistent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:30 -08:00
ffe6643668 reftable/block: adapt header and footer size to return a size_t
The functions `header_size()` and `footer_size()` return a positive
integer representing the size of the header and footer, respectively,
dependent on the version of the reftable format. Similar to the
preceding commit, these functions return a signed integer though, which
is nonsensical given that there is no way for these functions to return
negative.

Adapt the functions to return a `size_t` instead to fix a couple of sign
comparison warnings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:29 -08:00
57adf71b93 reftable/basics: adjust hash_size() to return uint32_t
The `hash_size()` function returns the number of bytes used by the hash
function. Weirdly enough though, it returns a signed integer for its
size even though the size obviously cannot ever be negative. The only
case where it could be negative is if the function returned an error
when asked for an unknown hash, but we assert(3p) instead.

Adjust the type of `hash_size()` to be `uint32_t` and adapt all places
that use signed integers for the hash size to follow suit. This also
allows us to get rid of a couple asserts that we had which verified that
the size was indeed positive, which further stresses the point that this
refactoring makes sense.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:29 -08:00
5ac65f0d6b reftable/basics: adjust common_prefix_size() to return size_t
The `common_prefix_size()` function computes the length of the common
prefix between two buffers. As such its return value will always be an
unsigned integer, as the length cannot be negative. Regardless of that,
the function returns a signed integer, which is nonsensical and causes a
couple of -Wsign-compare warnings all over the place.

Adjust the function to return a `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:29 -08:00
072e3aa3a5 reftable/record: handle overflows when decoding varints
The logic to decode varints isn't able to detect integer overflows: as
long as the buffer still has more data available, and as long as the
current byte has its 0x80 bit set, we'll continue to add up these values
to the result. This will eventually cause the `uint64_t` to overflow, at
which point we'll return an invalid result.

Refactor the function so that it is able to detect such overflows. The
implementation is basically copied from Git's own `decode_varint()`,
which already knows to handle overflows. The only adjustment is that we
also take into account the string view's length in order to not overrun
it. The reftable documentation explicitly notes that those two encoding
schemas are supposed to be the same:

    Varint encoding
    ^^^^^^^^^^^^^^^

    Varint encoding is identical to the ofs-delta encoding method used
    within pack files.

    Decoder works as follows:

    ....
    val = buf[ptr] & 0x7f
    while (buf[ptr] & 0x80) {
      ptr++
      val = ((val + 1) << 7) | (buf[ptr] & 0x7f)
    }
    ....

While at it, refactor `put_var_int()` in the same way by copying over
the implementation of `encode_varint()`. While `put_var_int()` doesn't
have an issue with overflows, it generates warnings with -Wsign-compare.
The implementation of `encode_varint()` doesn't, is battle-tested and at
the same time way simpler than what we currently have.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:28 -08:00
a204f92d1c reftable/record: drop unused print function pointer
In 42c424d69d (t/helper: inline printing of reftable records,
2024-08-22) we stopped using the `print` function of the reftable record
vtable and instead moved its implementation into the single user of it.
We didn't remove the function itself from the vtable though. Drop it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:28 -08:00
eb8728d88a meson: stop disabling -Wsign-compare
In 4f9264b0cd (config.mak.dev: drop `-Wno-sign-compare`, 2024-12-06) we
have started an effort to make our codebase compile with -Wsign-compare.
But while we removed the -Wno-sign-compare flag from "config.mak.dev",
we didn't adjust the Meson build instructions in the same way.

Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:28 -08:00
2d0ff147e5 t8002: fix ambiguous printf conversion specifications
In e7fb2ca945 (builtin/blame: fix out-of-bounds write with blank
boundary commits, 2025-01-10), we have introduced two new tests that
expect a certain amount of padding. This padding is generated via
printf using the "%0.s" conversion specification. That directive is
ambiguous because it might be interpreted as field width (most shells)
or 0-padding flag for numeric fields (coreutils).

Fix this issue by using "%${N}s" instead, which is already being
used in other tests (i.e. t5300, t0450) and is unambiguous.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jan Palus <jpalus@fastmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:04:26 -08:00
dd98f54f30 Remove obsolete ".txt" extensions for AsciiDoc files
Since we no longer have any AsciiDoc files that end in ".txt", don't
modify them with .gitattributes or ignore them with .gitignore.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:56:10 -08:00
1f010d6bdf doc: use .adoc extension for AsciiDoc files
We presently use the ".txt" extension for our AsciiDoc files.  While not
wrong, most editors do not associate this extension with AsciiDoc,
meaning that contributors don't get automatic editor functionality that
could be useful, such as syntax highlighting and prose linting.

It is much more common to use the ".adoc" extension for AsciiDoc files,
since this helps editors automatically detect files and also allows
various forges to provide rich (HTML-like) rendering.  Let's do that
here, renaming all of the files and updating the includes where
relevant.  Adjust the various build scripts and makefiles to use the new
extension as well.

Note that this should not result in any user-visible changes to the
documentation.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:56:06 -08:00
ed4cf6e8e2 gitattributes: mark AsciiDoc files as LF-only
In a future commit, we'll move the AsciiDoc documentation files to the
".adoc" extension rather than the extension ".txt".  We need these files
to use only LF because they are read by generate-cmdlist.sh using the
read builtin.

If we allow CRLF here, the CR at the end of the line is treated as part
of the synopsis, since a POSIX shell doesn't consider it special like
LF.  In that case, we generate synopsis strings in C that contain a CR,
which the compiler does not like because it believes that the double
quote string terminator is missing, and as a consequence, compilation
fails.

Because we rely on LF-only endings here to compile successfully and we
want Git to continue to be able to compile on Windows, mark these files
as LF-only in the .gitattributes file.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:56:05 -08:00
97343c8c2f editorconfig: add .adoc extension
The .adoc extension is commonly used for AsciiDoc files.  In a future
commit, we'll update some files to switch from the .txt extension to the
.adoc extension, so update the EditorConfig file to use the same
configuration for both extensions, since we want the files to be
formatted completely identically whether they're using the older or
newer extension.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:56:05 -08:00
89cdbffa86 doc: update gitignore for .adoc extension
We presently use the ".txt" extension for our AsciiDoc files.  While not
wrong, most editors do not associate this extension with AsciiDoc,
meaning that contributors don't get automatic editor functionality that
could be useful, such as syntax highlighting and prose linting.

Instead, in a future commit, we're going to move to using the more
common ".adoc" extension for these files, which many editors
intrinsically recognize as an AsciiDoc file.  To avoid contributors
accidentally checking in generated files, ignore the new extension for
generated files in the documentation .gitignore files.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:56:05 -08:00
8705c9bd13 pack-write: pass hash_algo to internal functions
The internal functions `write_rev_trailer()`, `write_rev_trailer()`,
`write_mtimes_header()` and write_mtimes_trailer()` use the global
`the_hash_algo` variable to access the repository's hash function. Pass
the hash_algo down from callers, all of which already have access to the
variable.

This removes all global variables from the 'pack-write.c' file, so
remove the 'USE_THE_REPOSITORY_VARIABLE' macro.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:36:35 -08:00
6b2aa7fd37 pack-write: pass hash_algo to write_rev_file()
The `write_rev_file()` function uses the global `the_hash_algo` variable
to access the repository's hash_algo. To avoid global variable usage,
pass a hash_algo from the layers above. Also modify children functions
`write_rev_file_order()` and `write_rev_header()` to accept
'the_hash_algo'.

Altough the layers above could have access to the hash_algo internally,
simply pass in `the_hash_algo`. This avoids any compatibility issues and
bubbles up global variable usage to upper layers which can be eventually
resolved.

However, in `midx-write.c`, since all usage of global variables is
removed, don't reintroduce them and instead use the `repo` available in
the context.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:36:34 -08:00
7653e9af9b pack-write: pass hash_algo to write_idx_file()
The `write_idx_file()` function uses the global `the_hash_algo` variable
to access the repository's hash_algo. To avoid global variable usage,
pass a hash_algo from the layers above.

Since `stage_tmp_packfiles()` also resides in 'pack-write.c' and calls
`write_idx_file()`, update it to accept a `struct git_hash_algo` as a
parameter and pass it through to the callee.

Altough the layers above could have access to the hash_algo internally,
simply pass in `the_hash_algo`. This avoids any compatibility issues and
bubbles up global variable usage to upper layers which can be eventually
resolved.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:36:34 -08:00
e2f6f76585 pack-write: pass repository to index_pack_lockfile()
The `index_pack_lockfile()` function uses the global `the_repository`
variable to access the repository. To avoid global variable usage, pass
the repository from the layers above.

Altough the layers above could have access to the repository internally,
simply pass in `the_repository`. This avoids any compatibility issues
and bubbles up global variable usage to upper layers which can be
eventually resolved.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:36:34 -08:00
8244d01de6 pack-write: pass hash_algo to fixup_pack_header_footer()
The `fixup_pack_header_footer()` function uses the global
`the_hash_algo` variable to access the repository's hash function. To
avoid global variable usage, pass a hash_algo from the layers above.

Altough the layers above could have access to the hash_algo internally,
simply pass in `the_hash_algo`. This avoids any compatibility issues and
bubbles up global variable usage to upper layers which can be eventually
resolved.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:36:34 -08:00
c5490ce9d1 ref-filter: remove ref_format_clear()
Now that ref_format_clear() no longer releases any memory we don't need
it anymore.  Remove it and its counterpart, ref_format_init().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 09:06:24 -08:00
7ee4fd18ac ref-filter: move is-base tip to used_atom
The string_list "is_base_tips" in struct ref_format stores the
committish part of "is-base:<committish>".  It has the same problems
that its sibling string_list "bases" had.  Fix them the same way as the
previous commit did for the latter, by replacing the string_list with
fields in "used_atom".

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 09:06:20 -08:00
5e58db6575 ref-filter: move ahead-behind bases into used_atom
verify_ref_format() parses a ref-filter format string and stores
recognized items in the static array "used_atom".  For
"ahead-behind:<committish>" it stores the committish part in a
string_list member "bases" of struct ref_format.

ref_sorting_options() also parses bare ref-filter format items and
stores stores recognized ones in "used_atom" as well.  The committish
parts go to a dummy struct ref_format in parse_sorting_atom(), though,
and are leaked and forgotten.

If verify_ref_format() is called before ref_sorting_options(), like in
git for-each-ref, then all works well if the sort key is included in the
format string.  If it isn't then sorting cannot work as the committishes
are missing.

If ref_sorting_options() is called first, like in git branch, then we
have the additional issue that if the sort key is included in the format
string then filter_ahead_behind() can't see its committish, will not
generate any results for it and thus it will be expanded to an empty
string.

Fix those issues by replacing the string_list with a field in used_atom
for storing the committish.  This way it can be shared for handling both
ref-filter format strings and sorting options in the same command.

Reported-by: Ross Goldberg <ross.goldberg@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 09:06:15 -08:00
4e746b1a31 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 08:44:55 -08:00
85cf8801c8 Merge branch 'sk/unit-test-hash'
Test update.

* sk/unit-test-hash:
  t/unit-tests: convert hash to use clar test framework
2025-01-21 08:44:55 -08:00
73c152e610 Merge branch 'mh/gitattr-doc-markup-fix'
Doc markup fix.

* mh/gitattr-doc-markup-fix:
  docs: fix typesetting of merge driver placeholders
2025-01-21 08:44:55 -08:00
c032b1d8bc Merge branch 'dk/zsh-config-completion-fix'
Completion script updates for zsh

* dk/zsh-config-completion-fix:
  completion: repair config completion for Zsh
2025-01-21 08:44:55 -08:00
780f7fdaa3 Merge branch 'aj/difftool-config-doc-fix'
Docfix.

* aj/difftool-config-doc-fix:
  difftool docs: restore correct position of tool list
2025-01-21 08:44:54 -08:00
7b39a128c8 Merge branch 'ps/the-repository'
More code paths have a repository passed through the callchain,
instead of assuming the primary the_repository object.

* ps/the-repository:
  match-trees: stop using `the_repository`
  graph: stop using `the_repository`
  add-interactive: stop using `the_repository`
  tmp-objdir: stop using `the_repository`
  resolve-undo: stop using `the_repository`
  credential: stop using `the_repository`
  mailinfo: stop using `the_repository`
  diagnose: stop using `the_repository`
  server-info: stop using `the_repository`
  send-pack: stop using `the_repository`
  serve: stop using `the_repository`
  trace: stop using `the_repository`
  pager: stop using `the_repository`
  progress: stop using `the_repository`
2025-01-21 08:44:54 -08:00
d6a7cace21 Merge branch 'jt/fsck-skiplist-parse-fix'
A misconfigured "fsck.skiplist" configuration variable was not
diagnosed as an error, which has been corrected.

* jt/fsck-skiplist-parse-fix:
  fsck: reject misconfigured fsck.skipList
2025-01-21 08:44:53 -08:00
cb441e1ec3 Merge branch 'ps/reftable-get-random-fix'
The code to compute "unique" name used git_rand() which can fail or
get stuck; the callsite does not require cryptographic security.
Introduce the "insecure" mode and use it appropriately.

* ps/reftable-get-random-fix:
  reftable/stack: accept insecure random bytes
  wrapper: allow generating insecure random bytes
2025-01-21 08:44:53 -08:00
57ebdd5af4 Merge branch 'jk/t7407-use-test-grep'
Test clean-up.

* jk/t7407-use-test-grep:
  t7407: use test_grep
2025-01-21 08:44:53 -08:00
5a59d1e1a0 Merge branch 'jk/lsan-race-ignore-false-positive'
The code to check LSan results has been simplified and made more
robust.

* jk/lsan-race-ignore-false-positive:
  test-lib: add a few comments to LSan log checking
  test-lib: simplify lsan results check
  test-lib: invert return value of check_test_results_san_file_empty
2025-01-21 08:44:52 -08:00
98046591b9 index-pack, unpack-objects: use skip_prefix to avoid magic number
When parsing --pack_header=, we manually skip 14 bytes to the data.
Let's use skip_prefix() to do this automatically.

Note that we overwrite our pointer to the front of the string, so we
have to add more context to the error message. We could avoid this by
declaring an extra pointer to hold the value, but I think the modified
message is actually preferable; it should give translators a bit more
context.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 08:42:56 -08:00
f1299bff26 index-pack, unpack-objects: use get_be32() for reading pack header
Both of these commands read the incoming pack into a static unsigned
char buffer in BSS, and then parse it by casting the start of the buffer
to a struct pack_header. This can result in SIGBUS on some platforms if
the compiler doesn't place the buffer in a position that is properly
aligned for 4-byte integers.

This reportedly happens with unpack-objects (but not index-pack) on
sparc64 when compiled with clang (but not gcc). But we are definitely in
the wrong in both spots; since the buffer's type is unsigned char, we
can't depend on larger alignment. When it works it is only because we
are lucky.

We'll fix this by switching to get_be32() to read the headers (just like
the last few commits similarly switched us to put_be32() for writing
into the same buffer).

It would be nice to factor this out into a common helper function, but
the interface ends up quite awkward. Either the caller needs to hardcode
how many bytes we'll need, or it needs to pass us its fill()/use()
functions as pointers. So I've just fixed both spots in the same way;
this is not code that is likely to be repeated a third time (most of the
pack reading code uses an mmap'd buffer, which should be properly
aligned).

I did make one tweak to the shared code: our pack_version_ok() macro
expects us to pass the big-endian value we'd get by casting. We can
introduce a "native" variant which uses the host integer ordering.

Reported-by: Koakuma <koachan@protonmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 08:42:56 -08:00
4f02f4d68d parse_pack_header_option(): avoid unaligned memory writes
In order to recreate a pack header in our in-memory buffer, we cast the
buffer to a "struct pack_header" and assign the individual fields. This
is reported to cause SIGBUS on sparc64 due to alignment issues.

We can work around this by using put_be32() which will write individual
bytes into the buffer.

Reported-by: Koakuma <koachan@protonmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 08:42:55 -08:00
798e0f4516 packfile: factor out --pack_header argument parsing
Both index-pack and unpack-objects accept a --pack_header argument. This
is an undocumented internal argument used by receive-pack and fetch to
pass along information about the header of the pack, which they've
already read from the incoming stream.

In preparation for a bugfix, let's factor the duplicated code into a
common helper.

The callers are still responsible for identifying the option. While this
could likewise be factored out, it is more flexible this way (e.g., if
they ever started using parse-options and wanted to handle both the
stuck and unstuck forms).

Likewise, the callers are responsible for reporting errors, though they
both just call die(). I've tweaked unpack-objects to match index-pack in
marking the error for translation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 08:42:55 -08:00
2105064b10 bswap.h: squelch potential sparse -Wcast-truncate warnings
In put_be32(), we right-shift a uint32_t value various amounts and then
assign the low 8-bits to individual "unsigned char" bytes, throwing away
the high bits. For shifts smaller than 24 bits, those thrown away bits
will be arbitrary bits from the original uint32_t.

This works exactly as we want, but if you feed a constant, then sparse
complains. For example if we write this (which we plan to do in a future
patch):

  put_be32(hdr, PACK_SIGNATURE);

then "make sparse" produces:

  compat/bswap.h:175:22: error: cast truncates bits from constant value (5041 becomes 41)
  compat/bswap.h:176:22: error: cast truncates bits from constant value (504143 becomes 43)
  compat/bswap.h:177:22: error: cast truncates bits from constant value (5041434b becomes 4b)

And the same issue exists in the other put_be*() functions, when used
with a constant.

We can silence this warning by explicitly masking off the truncated
bits. The compiler is smart enough to know the result is the same, and
the asm generated by gcc (with both -O0 and -O2) is identical.

Curiously this line already exists:

	put_be32(&hdr_version, INDEX_EXTENSION_VERSION2);

in the fsmonitor.c file, but it does not get flagged because the CPP
macro expands to a small integer (2).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 08:42:55 -08:00
0f3d8e2e46 Merge branch 'kn/reflog-migration-fix' into kn/reflog-migration-fix-followup
* kn/reflog-migration-fix:
  reftable: write correct max_update_index to header
2025-01-17 15:42:58 -08:00
ffbd3f98f9 t/unit-tests: convert reftable tree test to use clar test framework
Adapts reftable tree test script to clar framework by using clar
assertions where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 14:35:12 -08:00
8b702f93dd t/unit-tests: adapt priority queue test to use clar test framework
Convert the prio-queue test script to clar framework by using clar
assertions where necessary. Test functions are created as a standalone
to test different cases.

update the type of the variable `j` from int to `size_t`, this ensures
compatibility with the type used for result_size, which is also size_t,
preventing a potential warning or error caused by comparisons between
signed and unsigned integers.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 14:35:12 -08:00
c143dfa7ed t/unit-tests: convert mem-pool test to use clar test framework
Adapt the mem-pool test script to use clar framework by using clar
assertions where necessary.Test functions are created as a standalone to
test different test cases.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 14:35:11 -08:00
aae2b431b0 t/unit-tests: handle dashes in test suite filenames
"generate-clar-decls.sh" script is designed to extract function
signatures that match a specific pattern derived from the unit test
file's name. The script does not know to massage file names with dashes,
which will make it search for functions that look like, for example,
`test_mem-pool_*`. Having dashes in function names is not allowed
though, so these patterns won't ever match a legal function name.

Adapt script to translate dashes (`-`) in test suite filenames to
underscores (`_`) to correctly extract the function signatures and run
the corresponding tests. This will be used by subsequent commits which
follows the same construct.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 14:35:11 -08:00
f66d1423f5 builtin: send usage() help text to standard output
Using the show_usage_and_exit_if_asked() helper we introduced
earlier, fix callers of usage() that want to show the help text when
explicitly asked by the end-user.  The help text now goes to the
standard output stream for them.

These are the bog standard "if we got only '-h', then that is a
request for help" callers.  Their

	if (argc == 2 && !strcmp(argv[1], "-h"))
		usage(message);

are simply replaced with

	show_usage_and_exit_if_asked(argc, argv, message);

With this, the built-ins tested by t0012 all send their help text to
their standard output stream, so the check in t0012 that was half
tightened earlier is now fully tightened to insist on standard error
stream being empty.

Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 13:30:03 -08:00
a36a822d7d oddballs: send usage() help text to standard output
Using the show_usage_if_asked() helper we introduced earlier, fix
callers of usage() that want to show the help text when explicitly
asked by the end-user.  The help text now goes to the standard
output stream for them.

The callers in this step are oddballs in that their invocations of
usage() are *not* guarded by

	if (argc == 2 && !strcmp(argv[1], "-h")
		usage(...);

There are (unnecessarily) being clever ones that do things like

	if (argc != 2 || !strcmp(argv[1], "-h")
		usage(...);

to say "I know I take only one argument, so argc != 2 is always an
error regardless of what is in argv[].  Ah, by the way, even if argc
is 2, "-h" is a request for usage text, so we do the same".

Some like "git var -h" just do not treat "-h" any specially, and let
it take the same error code paths as a parameter error.

Now we cannot do the same, so these callers are rewrittin to do the
show_usage_and_exit_if_asked() first and then handle the usage error
the way they used to.

Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 13:30:03 -08:00
b821c999ca builtins: send usage_with_options() help text to standard output
Using the show_usage_with_options_if_asked() helper we introduced
earlier, fix callers of usage_with_options() that want to show the
help text when explicitly asked by the end-user.  The help text now
goes to the standard output stream for them.

The test in t7600 for "git merge -h" may want to be retired, as the
same is covered by t0012 already, but it is specifically testing that
the "-h" option gets a response even with a corrupt index file, so
for now let's leave it there.

Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 13:30:03 -08:00
0148fd836a usage: add show_usage_if_asked()
Some commands call usage() when they are asked to give the help
message with "git cmd -h", but this has the same problem as we
fixed with callers of usage_with_options() for the same purpose.

Introduce a helper function that captures the common pattern

	if (argc == 2 && !strcmp(argv[1], "-h"))
		usage(usage);

and replaces it with

	show_usage_if_asked(argc, argv, usage);

to help correct these code paths.

Note that this helper function still exits with status 129, and
t0012 insists on it.  After converting all the mistaken callers of
usage_with_options() to call this new helper, we may want to address
it---the end user is asking us to give the help text, and we are
doing exactly as asked, so there is no reason to exit with non-zero
status.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 13:30:02 -08:00
1782abd773 parse-options: add show_usage_with_options_if_asked()
Many commands call usage_with_options() when they are asked to give
the help message, but it sends the help text to the standard error
stream.  When the user asked for it with "git cmd -h", the help
message is the primary output from the command, hence we should send
it to the standard output stream, instead.

Introduce a helper function that captures the common pattern

	if (argc == 2 && !strcmp(argv[1], "-h"))
		usage_with_options(usage, options);

and replaces it with

	show_usage_with_options_if_asked(argc, argv, usage, options);

to help correct code paths.

Note that this helper function still exits with status 129, and
t0012 insists on it.  After converting all the mistaken callers of
usage_with_options() to call this new helper, we may want to address
it---the end user is asking us to give the help text, and we are
doing exactly as asked, so there is no reason to exit with non-zero
status.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 13:30:02 -08:00
e4c0a1499c t0012: optionally check that "-h" output goes to stdout
For most commands, "git foo -h" will send the help output to stdout, as
this is what parse-options.c does. But some commands send it to stderr
instead. This is usually because they call usage_with_options(), and
should be switched to show_usage_help_and_exit_if_asked().

Currently t0012 is permissive and allows either behavior. We'd like it
to eventually enforce that help goes to stdout, and teaching it to do so
identifies the commands that need to be changed. But during the
transition period, we don't want to enforce that for most test runs.

So let's introduce a flag that will let most test runs use the
permissive behavior, and people interested in converting commands can
run:

  GIT_TEST_HELP_MUST_BE_STDOUT=1 ./t0012-help.sh

to see the failures. Eventually (when all builtins have been converted)
we'll remove this flag entirely and always check the strict behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 13:30:02 -08:00
4ad47d2de3 gitcli: document that command line trumps config and env
We centrally explain that "--no-whatever" is the way to countermand
the "--whatever" option.  Explain that a configured default and the
value specified by an environment variable can be overridden by the
corresponding command line option, too.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 10:08:58 -08:00
8454b42f94 meson: wire up the git-subtree(1) command
Wire up the git-subtree(1) command, which is part of "contrib/". Note
that we have to move around the exact location where we include the
"contrib/" subdirectory so that it comes after building the docs so that
we have access to some of the common functionality.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 09:56:38 -08:00
07892da045 meson: introduce build option for contrib
We unconditionally wire up building command completion present in the
"contrib/" directory. This may or may not be what users want, and we
don't provide a way to disable it.

Introduce a new "contrib" build option. This option is introduced as an
array so that users can manually pick which exact features they want to
include from the "contrib" directory. By default, we build and install
shell completions, which is a commonly used feature and also the current
default.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 09:56:38 -08:00
d4cd75f6bd contrib/subtree: fix building docs
In a38edab7c8 (Makefile: generate doc versions via GIT-VERSION-GEN,
2024-12-06), we have refactored how we build our documentation by
injecting the Git version into the Asciidoc and AsciiDoctor config
files instead of doing so via arguments. As such, the original config
files were removed, where the expectation is that they get generated via
`GIT-VERSION-GEN` now.

Whie the git-subtree(1) command part of "contrib/" also builds docs
using these same config files, its Makefile wasn't adjusted accordingly
and thus building the docs is broken.

Fix this by using `GIT-VERSION-GEN` to generate those files.

Reported-by: Renato Botelho <garga@FreeBSD.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 09:56:37 -08:00
49b299215d connect: address -Wsign-compare warnings
Most of the warnings were about loop variables being declared as ints
with a condition using a size_t, whereby switching the variable to
size_t fixes the warning.

One other case was comparing the result of strlen to an int passed
as an argument, which turns out could just as well be passed as a
size_t, albeit trickling to other functions.

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 09:27:42 -08:00
efff4a85a4 The first batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-16 16:35:14 -08:00
b9a6830836 Merge branch 'mb/t7110-use-test-path-helper'
Test modernization.

* mb/t7110-use-test-path-helper:
  t7110: replace `test -f` with `test_path_is_*` helpers
2025-01-16 16:35:14 -08:00
3902b083e7 Merge branch 'ps/meson-weak-sha1-build'
meson-based build now supports the unsafe-sha1 build knob.

* ps/meson-weak-sha1-build:
  meson: provide a summary of configured backends
  meson: wire up unsafe SHA1 backend
  meson: add missing dots for build options
  meson: simplify conditions for HTTPS and SHA1 dependencies
  meson: require SecurityFramework when it's used as SHA1 backend
  meson: deduplicate access to SHA1/SHA256 backend options
  meson: consistenlty spell 'CommonCrypto'
2025-01-16 16:35:14 -08:00
564b907c8a Merge branch 'ps/more-sign-compare'
More -Wsign-compare fixes.

* ps/more-sign-compare:
  sign-compare: avoid comparing ptrdiff with an int/unsigned
  commit-reach: use `size_t` to track indices when computing merge bases
  shallow: fix -Wsign-compare warnings
  builtin/log: fix remaining -Wsign-compare warnings
  builtin/log: use `size_t` to track indices
  commit-reach: use `size_t` to track indices in `get_reachable_subset()`
  commit-reach: use `size_t` to track indices in `remove_redundant()`
  commit-reach: fix type of `min_commit_date`
  commit-reach: fix index used to loop through unsigned integer
  prio-queue: fix type of `insertion_ctr`
2025-01-16 16:35:14 -08:00
66e01e510a Merge branch 'ps/object-collision-check'
CI jobs gave sporadic failures, which turns out that that the
object finalization code was giving an error when it did not have
to.

* ps/object-collision-check:
  object-file: retry linking file into place when occluding file vanishes
  object-file: don't special-case missing source file in collision check
  object-file: rename variables in `check_collision()`
  object-file: fix race in object collision check
2025-01-16 16:35:13 -08:00
f8f5af2952 Merge branch 'as/long-option-help-i18n'
Tweak the help text used for the option value placeholders by
parse-options API so that translations can customize the "<>"
placeholder signal (e.g. "--option=<value>").

* as/long-option-help-i18n:
  parse-options: localize mark-up of placeholder text in the short help
2025-01-16 16:35:13 -08:00
637fb90228 Merge branch 're/submodule-parse-opt'
"git submodule" learned various ways to spell the same option,
e.g. "--branch=B" can be spelled "--branch B" or "-bB".

* re/submodule-parse-opt:
  git-submodule.sh: rename some variables
  git-submodule.sh: improve variables readability
  git-submodule.sh: add some comments
  git-submodule.sh: get rid of unused variable
  git-submodule.sh: get rid of isnumber
  git-submodule.sh: improve parsing of short options
  git-submodule.sh: improve parsing of some long options
2025-01-16 16:35:13 -08:00
2a13745101 doc: migrate git-commit manpage secondary files to new format
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-15 14:43:36 -08:00
819fdd6e76 doc: convert git commit config to new format
Also prevent git-commit manpage to refer to itself in the config
description by using a variable.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-15 14:43:36 -08:00
01b9465440 doc: make more direct explanations in git commit options
- Use imperative mood
- make use of the placeholder format to simplify style

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-15 14:43:36 -08:00
d533c10697 doc: the mode param of -u of git commit is optional
Fix the synopsis to reflect the option description.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-15 14:43:36 -08:00
be2ea674cc doc: apply new documentation guidelines to git commit
- switch the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- use _<placeholder>_ instead of <placeholder> in the description
- use `backticks for keywords and more complex option
descriptions`. The new rendering engine will apply synopsis rules to
these spans.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-15 14:43:36 -08:00
bc67b4ab5f reftable: write correct max_update_index to header
In 297c09eabb (refs: allow multiple reflog entries for the same refname,
2024-12-16), the reftable backend learned to handle multiple reflog
entries within the same transaction. This was done modifying the
`update_index` for reflogs with multiple indices. During writing the
logs, the `max_update_index` of the writer was modified to ensure the
limits were raised to the modified `update_index`s.

However, since ref entries are written before the modification to the
`max_update_index`, if there are multiple blocks to be written, the
reftable backend writes the header with the old `max_update_index`. When
all logs are finally written, the footer will be written with the new
`min_update_index`. This causes a mismatch between the header and the
footer and causes the reftable file to be corrupted. The existing tests
only spawn a single block and since headers are lazily written with the
first block, the tests didn't capture this bug.

To fix the issue, the appropriate `max_update_index` limit must be set
even before the first block is written. Add a `max_index` field to the
transaction which holds the `max_index` within all its updates, then
propagate this value to the reftable backend, wherein this is used to
the set the `max_update_index` correctly.

Add a test which creates a few thousand reference updates with multiple
reflog entries, which should trigger the bug.

Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-15 09:12:09 -08:00
1dca492edd meson: fix missing deps for technical articles
We need an explicit `depends: documentation_deps` so that all of our
Documentation targets know they require asciidoc.conf. This shows up
as parallel build failures with it not yet being available.

Other targets look OK already.

Signed-off-by: Sam James <sam@gentoo.org>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-14 11:17:35 -08:00
4771501c0a meson: ensure correct version-def.h is used
To build the libgit-version library, Meson first generates
`version-def.h` in the build directory. Then it compiles `version.c`
into a library. During compilation, Meson tells to include both the
build directory and the project root directory.

However, when the user previously has compiled Git using Make, they will
have a `version-def.h` file in project root directory as well. Because
`version-def.h` is included in `version.c` using the #include directive
with double quotes, some preprocessors will look for the header file in
the same directory as the source file. This will cause compilation of
`version.c` ran by Meson to include `version-def.h` previously made by
Make, which might be out of date.

To explicitly tell the preprocessor which `version-def.h` to use, pass
the absolute path of this file as macro GIT_VERSION_H to the
preprocessor using option `-D` and have `version.c` `#include
GIT_VERSION_H`. To remain working with other build systems than Meson,
include "version-def.h" if that macro is not defined.

Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-14 11:14:54 -08:00
757161efcc Sync with Git 2.48.1 2025-01-13 13:02:01 -08:00
46afc2ba91 Start the Git 2.49 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13 13:00:48 -08:00
f93ff170b9 Git 2.48.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13 12:57:19 -08:00
65faad6d84 Sync with Git 2.47.2
Git 2.47.2

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmdkT1sACgkQsLXohpav
# 5svdhRAAq0WoZIg+33vYNNVSTm3Ux9RJslmXs3lQuhuUJ61hK/28drSLU29GH7x7
# 3nmmjp1cegnXRVLBAfoYDdzPprNNrQFQEHQEzgG/GDZw0OXn+WTZuNyrrUYoa+sd
# QSLlElRj2qrpHIMOsMIBKBSNB+qjJHOMGdxcBAS768TfnQpGIpc1KJa24TxsVBzC
# ScP4uvrFfPyQrqFUgiUhCeqLnO/6T5i/QAn/8cS5a1+zor5ZHSlw28TZTOxN2odo
# Rulp/FtehiDEzmRowgD3M4fImAPY6Ib6VORCYASqpJFFla30tu2bQqEi6raOMTec
# hg5Ibkmj6fHFONaYvoTMRkYHmtUnNgIPU/CYPwswNk8w1+PPQfJ+TYjBXOQgdTLW
# F0azHBHh7NRmEHVydiF9CqjgNVRzjO4IEZfGqXNFPPMvR6UUzDaIkrpYbwXBFMin
# GNPV3QISeXj9ROjJoCv0nclXETwWemykjZlD6b5krXn5TaJlFb+69qJvXrCLq5WY
# EoevSqKkB9HVK9si7P8Sh1cPGOr3kfiFPmMNKFVI8l0+iDFgBywOomWNS/JEzqu1
# nN142DKdL1W/rkeMUhbX2h11CZNvHKIOy3iaA4MTOing8/eMzyUUQ73Ck7odYs4f
# rZ0tTXKJhxojPvBpTxYe9SxM0bDLREiOv0zX76+sIuhbAQCmk0o=
# =MNNf
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Dec 2024 08:52:43 AM PST
# gpg:                using RSA key E1F036B1FEE7221FC778ECEFB0B5E88696AFE6CB
# gpg: Good signature from "Junio C Hamano <gitster@pobox.com>" [ultimate]
# gpg:                 aka "Junio C Hamano <junio@pobox.com>" [ultimate]
# gpg:                 aka "Junio C Hamano <jch@google.com>" [ultimate]

* tag 'v2.47.2':
  Git 2.47.2
  Git 2.46.3
  Git 2.45.3
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2025-01-13 12:55:26 -08:00
191f0c8db2 object-name: be more strict in parsing describe-like output
From Documentation/revisions.txt:
    '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
      Output from `git describe`; i.e. a closest tag, optionally
      followed by a dash and a number of commits, followed by a dash, a
      'g', and an abbreviated object name.
which means that output of the format
    ${REFNAME}-${INTEGER}-g${HASH}
should parse to fully expanded ${HASH}.  This is fine.  However, we
currently don't validate any of ${REFNAME}-${INTEGER}, we only parse
-g${HASH} and assume the rest is valid.  That is problematic, since it
breaks things like

    git cat-file -p branchname:path/to/file/named/i-gaffed

which, when commit (or tree or blob) affed exists, will not return us
information about the file we are looking for but will instead
erroneously tell us about object affed.

A few additional notes:
  - This is a slight backward incompatibility break, because we used
    to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}.  However,
    a backward incompatible break is necessary, because there is no
    other way for someone to be more specific and disambiguate that they
    want the blob master:path/to/who-gabbed instead of the object abbed.
  - There is a possibility that check_refname_format() rules change in
    the future.  However, we can only realistically loosen the rules
    for what that function accepts rather than tighten.  If we were to
    tighten the rules, some real world repositories may already have
    refnames that suddenly become unacceptable and we break those
    repositories.  As such, any describe-like syntax of the form
    ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the
    changes in this commit will remain valid in the future.
  - The fact that check_refname_format() rules could loosen in the
    future is probably also an important reason to make this change.  If
    the rules loosen, there might be additional cases within
    ${GARBAGE}-g${HASH} that become ambiguous in the future.  While
    abbreviated hashes can be disambiguated by abbreviating less, it may
    well be that these alternative object names have no way of being
    disambiguated (much like pathnames cannot be).  Accepting all random
    ${GARBAGE} thus makes it difficult for us to allow future
    extensions to object naming.

So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
present in the string, and would be considered a valid ref and
non-negative integer.

Also, add a few tests for git describe using object names of the form
    ${REVISION_NAME}${MODIFIERS}
since an early version of this patch failed on constructs like
    git describe v2.48.0-rc2-161-g6c2274cdbc^0

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13 11:48:43 -08:00
71e19a0031 object-name: fix resolution of object names containing curly braces
Given a branch name of 'foo{bar', commands like

    git cat-file -p foo{bar:README.md

should succeed (assuming that branch had a README.md file, of course).
However, the change in cce91a2cae (Change 'master@noon' syntax to
'master@{noon}'., 2006-05-19) presumed that curly braces would always
come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md'
to entirely miss the ':' and assume there's no object being referenced.
In short, git would report:

    fatal: Not a valid object name foo{bar:README.md

Change the parsing to only make the assumption of paired curly braces
immediately after either a '@' or '^' character appears.

Add tests for this, as well as for a few other test cases that initial
versions of this patch broke:
  * 'foo@@{...}'
  * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}'

Note that we'd prefer not duplicating the special logic for "@^" characters
here, because if get_oid_basic() or interpret_nth_prior_checkout() or
get_oid_basic() or similar gain extra methods of using curly braces,
then the logic in get_oid_with_context_1() would need to be updated as
well.  But it's not clear how to refactor all of these to have a simple
common callpoint with the specialized logic.

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Helped-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13 11:48:28 -08:00
b569cbf2c6 Merge branch 'ps/meson-weak-sha1-build' into ps/build-meson-fixes
* ps/meson-weak-sha1-build:
  meson: provide a summary of configured backends
  meson: wire up unsafe SHA1 backend
  meson: add missing dots for build options
  meson: simplify conditions for HTTPS and SHA1 dependencies
  meson: require SecurityFramework when it's used as SHA1 backend
  meson: deduplicate access to SHA1/SHA256 backend options
  meson: consistenlty spell 'CommonCrypto'
2025-01-13 09:34:31 -08:00
4e3dd47c9d help: interpret boolean string values for help.autocorrect
A help.autocorrect value of 1 is currently interpreted as "wait 1
decisecond", which can be confusing to users who believe they are setting a
boolean value to turn the autocorrect feature on.

Interpret the value of help.autocorrect as either one of the accepted list
of special values ("never", "immediate", ...), a boolean or an integer. If
the value is 1, it is no longer interpreted as a decisecond value of 0.1s
but as a true boolean, the equivalent of "immediate". If the value is 2 or
more, continue treating it as a decisecond wait time.

False boolean string values ("off", "false", "no") are now equivalent to
"never", meaning that guessed values are still shown but nothing is
executed. True boolean string values are interpreted as "immediate".

Signed-off-by: Scott Chacon <schacon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13 08:20:01 -08:00
18a7e19846 gitk: make the "list references" default window width wider
When using remotes (with git-flow especially), the remote reference names
are almost always wordwrapped in the "list references" window because it's
somewhat narrow by default. It's possible to resize it with a mouse,
but it's annoying to have to do this every time, especially on Windows 10,
where the window border seems to be only one (1) pixel wide, thus making
the grabbing of the window border tricky.

Signed-off-by: James J. Raden <james.raden@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-01-11 18:17:42 +01:00
ac75b4c265 gitk: fix arrow keys in input fields with Tcl/Tk >= 8.6
Tcl/Tk 8.6 introduced new events for the cursor left/right keys and
apparently changed the behavior of the previous event.

Let's work around that by using the new events when we are running with
Tcl/Tk 8.6 or later.

This fixes https://github.com/git-for-windows/git/issues/495

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-01-11 18:17:42 +01:00
baaa9d6d86 gitk: Use an external icon file on Windows
Git for Windows now ships with the new Git icon from git-scm.com. Use that
icon file if it exists instead of the old procedurally drawn one.

This patch was sent upstream but so far no decision on its inclusion was
made, so commit it to our fork.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-01-11 18:17:42 +01:00
5eb02dd8f0 gitk: Unicode file name support
Assumes file names in git tree objects are UTF-8 encoded.

On most unix systems, the system encoding (and thus the TCL system
encoding) will be UTF-8, so file names will be displayed correctly.

On Windows, it is impossible to set the system encoding to UTF-8.
Changing the TCL system encoding (via 'encoding system ...', e.g. in the
startup code) is explicitly discouraged by the TCL docs.

Change gitk functions dealing with file names to always convert
from and to UTF-8.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-01-11 18:17:42 +01:00
4cbe9e0e21 gitk(Windows): avoid inadvertently calling executables in the worktree
Just like CVE-2022-41953 for Git GUI, there exists a vulnerability of
`gitk` where it looks for `taskkill.exe` in the current directory before
searching `PATH`.

Note that the many `exec git` calls are unaffected, due to an obscure
quirk in Tcl's `exec` function. Typically, `git.exe` lives next to
`wish.exe` (i.e. the program that is run to execute `gitk` or Git GUI)
in Git for Windows, and that is the saving grace for `git.exe because
`exec` searches the directory where `wish.exe` lives even before the
current directory, according to
https://www.tcl-lang.org/man/tcl/TclCmd/exec.htm#M24:

	If a directory name was not specified as part of the application
	name, the following directories are automatically searched in
	order when attempting to locate the application:

	    The directory from which the Tcl executable was loaded.

	    The current directory.

	    The Windows 32-bit system directory.

	    The Windows home directory.

	    The directories listed in the path.

The same is not true, however, for `taskkill.exe`: it lives in the
Windows system directory (never mind the 32-bit, Tcl's documentation is
outdated on that point, it really means `C:\Windows\system32`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-01-11 18:17:42 +01:00
76baf97fa1 instaweb: fix ip binding for the python http.server
`git instaweb -d python` should bind the server to 0.0.0.0, while
`git instaweb -d python -l` should bind the server to 127.0.0.1.

The code had them backwards by mistake since 2eb14bb2d4
(git-instaweb: add Python builtin http.server support, 2019-01-28).

Signed-off-by: Alecs King <alecsk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 15:27:52 -08:00
69666e6746 doc: convert git-restore to new style format
- Switch the synopsis to a 'synopsis' block which will
  automatically format placeholders in italics and keywords in
  monospace

- Use _<placeholder>_ instead of <placeholder> in the description

- Use backticks for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.

While at it, also convert an option description to imperative mood.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 15:21:21 -08:00
77b2d29e91 doc: convert git-notes to new documentation format
- Switch the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 15:19:52 -08:00
64156589d9 Merge branch 'ps/meson-weak-sha1-build' into ps/zlib-ng
* ps/meson-weak-sha1-build:
  meson: provide a summary of configured backends
  meson: wire up unsafe SHA1 backend
  meson: add missing dots for build options
  meson: simplify conditions for HTTPS and SHA1 dependencies
  meson: require SecurityFramework when it's used as SHA1 backend
  meson: deduplicate access to SHA1/SHA256 backend options
  meson: consistenlty spell 'CommonCrypto'
2025-01-10 15:18:56 -08:00
a90ff409f0 docs: discuss caching personal access tokens
Describe problems storing personal access tokens in git-credential-cache
and suggest alternatives.

Research suggests that many users are confused about this:

> the point of passwords is that (ideally) you memorise them [so]
> they're never stored anywhere in plain text. Yet GitHub's personal
> access token system seems to basically force you to store the token in
> plain text?

https://stackoverflow.com/questions/46645843/where-to-store-my-git-personal-access-token#comment89963004_46645843
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 15:10:00 -08:00
cf5b8276dc docs: list popular credential helpers
git-credential-store saves credentials unencrypted on disk. It is the
least secure choice of credential helper. Nevertheless, it appears
several times more popular than any other credential helper [1].

Inform users about more secure alternatives.

[1] https://stackoverflow.com/questions/35942754/how-can-i-save-username-and-password-in-git

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 15:10:00 -08:00
fbe8d3079d Git 2.48
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:20:20 -08:00
b28fb93e51 Merge branch 'ps/build-sign-compare'
Last-minute fix for a regression in "git blame --abbrev=<length>"
when insane <length> is specified; we used to correctly cap it to
the hash output length but broke it during the cycle.

* ps/build-sign-compare:
  builtin/blame: fix out-of-bounds write with blank boundary commits
  builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-10 09:19:34 -08:00
3ae35648bf Merge branch 'js/git-version-gen-update'
Build regression fix.

* js/git-version-gen-update:
  GIT-VERSION-GEN: allow it to be run in parallel
2025-01-10 09:19:33 -08:00
e39e332e50 ci: remove stale code for Azure Pipelines
Support for Azure Pipelines has been retired in 6081d3898f (ci: retire
the Azure Pipelines definition, 2020-04-11) in favor of GitHub Actions.
Our CI library still has some infrastructure left for Azure though that
is now unused. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:39 -08:00
6bc06e8f20 ci: use latest Ubuntu release
Both GitHub Actions and GitLab CI use the "ubuntu:latest" tag as the
default image for most jobs. This tag is somewhat misleading though, as
it does not refer to the latest release of Ubuntu, but to the latest LTS
release thereof. But as we already have a couple of jobs exercising the
oldest LTS release of Ubuntu that Git still supports, it would make more
sense to test the oldest and youngest versions of Ubuntu.

Adapt these jobs to instead use the "ubuntu:rolling" tag, which refers
to the actual latest release, which currently is Ubuntu 24.10.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:38 -08:00
678b22f528 ci: stop special-casing for Ubuntu 16.04
With c85bcb5de1 (gitlab-ci: switch from Ubuntu 16.04 to 20.04,
2024-10-31) we have adapted the last CI job to stop using Ubuntu 16.04
in favor of Ubuntu 20.04. Remove the special-casing we still have in our
CI scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:38 -08:00
4ad71b16cd gitlab-ci: add linux32 job testing against i386
Add another job to GitLab CI that tests against the i386 architecture.
This job is equivalent to the same job in GitHub Workflows.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:38 -08:00
5aea4ff36c gitlab-ci: remove the "linux-old" job
The "linux-old" job was historically testing against the oldest
supported LTS release of Ubuntu. But with c85bcb5de1 (gitlab-ci: switch
from Ubuntu 16.04 to 20.04, 2024-10-31) it has been converted to test
against Ubuntu 20.04, which already gets exercised in a couple of other
CI jobs. It's thus not adding any significant test coverage.

Drop the job.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:38 -08:00
b133d3071a github: simplify computation of the job's distro
We explicitly list the distro of Linux-based jobs, but it is equivalent
to the name of the image in almost all cases, except that colons are
replaced with dashes. Drop the redundant information and massage it in
our CI scripts, which is equivalent to how we do it in GitLab CI.

There are a couple of exceptions:

  - The "linux32" job, whose distro name is different than the image
    name. This is handled by adapting all sites to use the new name.

  - The "alpine" and "fedora" jobs, neither of which specify a tag for
    their image. This is handled by adding the "latest" tag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:37 -08:00
9548e0478e github: convert all Linux jobs to be containerized
We have split the CI jobs in GitHub Workflows into two categories:

  - Those running on a machine pool directly.

  - Those running in a container on the machine pool.

The latter is more flexible because it allows us to freely pick whatever
container image we want to use for a specific job, while the former only
allows us to pick from a handful of different distros. The containerized
jobs do not have any significant downsides to the best of my knowledge:

  - They aren't significantly slower to start up. A quick comparison by
    Peff shows that the difference is mostly lost in the noise:

            job             |  old | new
        --------------------|------|------
        linux-TEST-vars      11m30s 10m54s
        linux-asan-ubsan     30m26s 31m14s
        linux-gcc             9m47s 10m6s
        linux-gcc-default     9m47s  9m41s
        linux-leaks          25m50s 25m21s
        linux-meson          10m36s 10m41s
        linux-reftable       10m25s 10m23s
        linux-reftable-leaks 27m18s 27m28s
        linux-sha256          9m54s 10m31s

    Some jobs are a bit faster, some are a bit slower, but there does
    not seem to be any significant change.

  - Containerized jobs run as root, which keeps a couple of tests from
    running. This has been addressed in the preceding commit though,
    where we now use setpriv(1) to run tests as a separate user.

  - GitHub injects a Node binary into containerized jobs, which is
    dynamically linked. This has led to some issues in the past [1], but
    only for our 32 bit jobs. The issues have since been resolved.

Overall there seem to be no downsides, but the upside is that we have
more control over the exact image that these jobs use. Convert the Linux
jobs accordingly.

[1]: https://lore.kernel.org/git/20240912094841.GD589828@coredump.intra.peff.net/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:37 -08:00
2a21098b98 github: adapt containerized jobs to be rootless
The containerized jobs in GitHub Actions run as root, giving them
special permissions to for example delete files even when the user
shouldn't be able to due to file permissions. This limitation keeps us
from using containerized jobs for most of our Ubuntu-based jobs as it
causes a number of tests to fail.

Adapt the jobs to create a separate user that executes the test suite.
This follows similar infrastructure that we already have in GitLab CI.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:37 -08:00
65f586132b t7422: fix flaky test caused by buffered stdout
One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by a race between git-submodule(1) and grep(1):

  1. git-submodule(1) (or its child process) writes the first X/S line
     we're trying to match.

  2. grep(1) matches the line.

  3a. grep(1) exits, closing the pipe.

  3b. git-submodule(1) (or its child process) writes the rest of its
  lines.

Steps 3a and 3b happen at the same time without any guarantees. If 3a
happens first, we get SIGPIPE. Otherwise, we don't and the test fails.

Fix the issue by generating a couple thousand nested submodules and
matching on the first nested submodule. This ensures that the recursive
git-submodule(1) process completely fills its stdout buffer, which makes
subsequent writes block until the downstream consumer of the pipe either
reads more or closes it.

To verify that this works as expected one can apply the following patch
to the preimage of this commit, which used to reliably trigger the race:

    diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
    index 3c5177cc30..df6001f8a0 100755
    --- a/t/t7422-submodule-output.sh
    +++ b/t/t7422-submodule-output.sh
    @@ -202,7 +202,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
     		cd repo &&
     		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
     		{ git submodule status --recursive 2>err; echo $?>status; } |
    -			grep -q recursive-submodule-path-1 &&
    +			{ sleep 1 && grep -q recursive-submodule-path-1 && sleep 1; } &&
     		test_must_be_empty err &&
     		test_match_signal 13 "$(cat status)"
     	)

With the pipe-stuffing workaround the test runs successfully.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:37 -08:00
b537af720e t0060: fix EBUSY in MinGW when setting up runtime prefix
Two of our tests in t0060 verify that the runtime prefix functionality
works as expected by creating a separate directory hierarchy, copying
the Git executable in there and then creating scripts relative to that
executable.

These tests fail quite regularly in GitLab CI with the following error:

    expecting success of 0060.218 '%(prefix)/ works':
            mkdir -p pretend/bin &&
            cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
            git config yes.path "%(prefix)/yes" &&
            GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
            echo "$(pwd)/pretend/yes" >expect &&
            test_cmp expect actual
    ++ mkdir -p pretend/bin
    ++ cp /c/GitLab-Runner/builds/gitlab-org/git/git.exe pretend/bin/
    cp: cannot create regular file 'pretend/bin/git.exe': Device or resource busy
    error: last command exited with $?=1
    not ok 218 - %(prefix)/ works

Seemingly, the "git.exe" binary we are trying to overwrite is still
being held open. It is somewhat puzzling why exactly that is: while the
preceding test _does_ write to and execute the same path, it should have
exited and shouldn't keep any backgrounded processes around. So it must
be held open by something else, either in MinGW or in Windows itself.

While the root cause is puzzling, the workaround is trivial enough:
instead of writing the file twice we simply pull the common setup into a
separate test case so that we won't observe EBUSY in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 09:15:36 -08:00
64f3ff3ffc GIT-VERSION-GEN: allow it to be run in parallel
"Why would one want to run it in parallel?" I hear you ask. I am glad
you are curious, because a curious story is what it is, indeed.

The `GIT-VERSION-GEN` script is quite a pillar of Git's source code,
with most lines being unchanged for the past 15 years. Until the v2.48.0
release candidate cycle.

Its original purpose was to generate the version string and store it in
the `GIT-VERSION-FILE`.

This paradigm changed quite dramatically when support for building with
Meson was introduced. Most crucially, a38edab7c8 (Makefile: generate
doc versions via GIT-VERSION-GEN, 2024-12-06) changed the way the
documentation is built by using the `GIT-VERSION-GEN` file to write out
the `asciidocor-extensions.rb` and `asciidoc.conf` files with now
hard-coded version strings.

Crucially, the Makefile rule to generate those files needs to be run in
every build because `GIT_VERSION` could have been specified in the
`make` command-line, which would require these files to be modified.

This introduced a surprising race condition!

And this is how that race surfaces: When calling `make -j2 html man`
from the top-level directory (a variant of which is invoked in Git for
Windows' release process), two sub-processes are spawned, a `make -C
Documentation html` one and a `make -C Documentation man` one. Both run
the rule to (re-)generate `asciidoctor-extensions.rb` or
`asciidoc.conf`, invoking `GIT-VERSION-GEN` to do so. That script first
generates a temporary file (appending the `+` character to the
filename), then looks whether it contains something different than the
already existing file (if it exists, that is), and either replaces it if
needed, or removes the temporary file. If one of the two parallel
invocations removes that temporary file before the other can compare it,
or even worse: if one tries to replace the target file just after the
other _started_ writing the temporary file (but did not finish writing
it yet), that race condition now causes bad builds.

This may sound highly theoretical, but due to the design of Git's build
process, Git for Windows is forced to use a (slow) POSIX emulation layer
to run that script and in the blink of an eye it becomes very much not
theoretical at all. See Exhibit A: These GitHub workflow runs failed
because one of the two competing `make` processes tried to remove the
temporary file when the other process had already done so:

https://github.com/git-for-windows/git-sdk-32/actions/runs/12663456654
https://github.com/git-for-windows/git-sdk-32/actions/runs/12683174970
https://github.com/git-for-windows/git-sdk-64/actions/runs/12649348496

While it is undesirable to run this script over and over again,
certainly when this involves above-mentioned slow POSIX emulation layer,
the stage of the release cycle in which we are presently finding
ourselves does not lend itself to a re-design where this script could be
run once, and once only, but instead dictates that a quick and reliable
work-around be implemented that prevents the race condition without
changing the overall architecture of the build process.

This patch does that: By using a filename suffix for the temporary file
which is based on the currently-executing script's process ID, We
guarantee that the two competing invocations cannot overwrite or remove
each others' temporary files.

The filename suffix still ends in `+` to ensure that the temporary
artifacts are matched by the `*+` pattern in `.gitignore` that was added
in f9bbaa384e (Add intermediate build products to .gitignore,
2009-11-08).

Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 08:50:53 -08:00
e7fb2ca945 builtin/blame: fix out-of-bounds write with blank boundary commits
When passing the `-b` flag to git-blame(1), then any blamed boundary
commits which were marked as uninteresting will not get their actual
commit ID printed, but will instead be replaced by a couple of spaces.

The flag can lead to an out-of-bounds write as though when combined with
`--abbrev=` when the abbreviation length is longer than `GIT_MAX_HEXSZ`
as we simply use memset(3p) on that array with the user-provided length
directly. The result is most likely that we segfault.

An obvious fix would be to cull `length` to `GIT_MAX_HEXSZ` many bytes.
But when the underlying object ID is SHA1, and if the abbreviated length
exceeds the SHA1 length, it would cause us to print more bytes than
desired, and the result would be misaligned.

Instead, fix the bug by computing the length via strlen(3p). This makes
us write as many bytes as the formatted object ID requires and thus
effectively limits the length of what we may end up printing to the
length of its hash. If `--abbrev=` asks us to abbreviate to something
shorter than the full length of the underlying hash function it would be
handled by the call to printf(3p) correctly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 06:56:55 -08:00
1fbb8d7ecb builtin/blame: fix out-of-bounds read with excessive --abbrev
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.

It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, whereas printf(3p) does.

Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 06:56:54 -08:00
0b43274850 credential-cache: respect authtype capability
Previously, credential-cache populated authtype regardless whether
"get" request had authtype capability. As documented in
git-credential.txt, authtype "should not be sent unless the appropriate
capability ... is provided".

Add test. Without this change, the test failed because "credential fill"
printed an incomplete credential with only protocol and host attributes
(the unexpected authtype attribute was discarded by credential.c).

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 15:04:15 -08:00
6979bf6f8f tree-diff: make list tail-passing more explicit
The ll_diff_tree_paths() function and its helpers all take a pointer to
a list tail, possibly add to it, and then return the new tail. This
works but has two downsides:

  - The top-level caller (diff_tree_paths() in this case) has to make a
    fake combine_diff_path struct to act as the list head. This is
    especially weird here, as it's a flexible-sized struct which will
    have an empty FLEX_ARRAY field. That used to be a portability
    problem, though these days it is legal because our FLEX_ARRAY macro
    over-allocates if necessary. It's still kind of ugly, though.

  - Besides the name "tail", it's not immediately obvious that the entry
    we pass around will not be examined by each function. Using a
    pointer-to-pointer or similar makes it more obvious we only care
    about the pointer itself, not its contents.

We can solve both by passing around a pointer to the tail instead. That
gets rid of the return value entirely, though note that because of the
recursion we actually need a three-star pointer for this to work.

The result is fairly readable, as we only need to dereference the tail
in one spot. If we wanted to make it simpler we could wrap the tail in a
struct, which we pass around.

Another option is to convert combine_diff to use our generic list_head
API. I tried that and found the result became much harder to read
overall. It means that _all_ code that looks at combine_diff_path
structs needs to be modified, since the "next" pointer is now inside a
list_head which has to be dereferenced with list_entry(). And we lose
some type safety, since we're just passing around a list_head struct
everywhere, and everybody who looks at it has to specify the type to
list_entry themselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:27 -08:00
6632bcba51 tree-diff: simplify emit_path() list management
In emit_path() we may append a new combine_diff_path entry to our list,
decide that we don't want it (because opt->pathchange() told us so) and
then roll it back.

Between the addition and the rollback, it doesn't matter if it's in the
list or not (no functions can even tell, since it's a singly-linked list
and we pass around just the tail entry).

So it's much simpler to just wait until opt->pathchange() tells us
whether to keep it, and either attach it (or free it) then. We do still
have to allocate it up front since it's that struct itself which is
passed to the pathchange callback.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:26 -08:00
d8baf083c5 tree-diff: use the name "tail" to refer to list tail
The ll_diff_tree_paths() function and its helpers all append to a
running list by taking in a pointer to the old tail and returning the
new tail. But they just call this argument "p", which is not very
descriptive.

It gets particularly confusing in emit_path(), where we actually add to
the list, because "p" does double-duty: it is the tail of the list, but
it is also the entry which we add. Except that in some cases we _don't_
add a new entry (or we might even add it and roll it back) if the path
isn't interesting. At first glance, this makes it look like a bug that
we pass "p" on to ll_diff_tree_paths() to recurse; sometimes it is
getting the new entry we made and sometimes not!

But it's not a bug, because ll_diff_tree_paths() does not care about the
entry itself at all. It is only using its "next" pointer as the tail of
the list.

Let's swap out "p" for "tail" to make this obvious. And then in
emit_path() we'll continue to use "p" for our newly allocated entry.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:26 -08:00
a5c4e31af9 tree-diff: drop list-tail argument to diff_tree_paths()
The internals of the path diffing code, including ll_diff_tree_paths(),
all take an extra combine_diff_path parameter which they use as the tail
of a list of results, appending any new entries to it.

The public-facing diff_tree_paths() takes the same argument, but it just
makes the callers more awkward. They always start with a clean list, and
have to set up a fake head struct to pass in.

Let's keep the public API clean by always returning a new list. That
keeps the fake struct as an implementation detail of tree-diff.c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:26 -08:00
69f6dea44c combine-diff: drop public declaration of combine_diff_path_size()
We want callers to use combine_diff_path_new() to allocate structs,
rather than using combine_diff_path_size() and xmalloc(). That gives us
more consistency over the initialization of the fields.

Now that the final external user of combine_diff_path_size() is gone, we
can stop declaring it publicly. And since our constructor is the only
caller, we can just inline it there.

Breaking the size computation into two parts also lets us reuse the
intermediate multiplication result of the parent length, since we need
to know it to perform our memset(). The result is a little easier to
read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:26 -08:00
b20d7d348c tree-diff: inline path_appendnew()
Our path_appendnew() has been simplified to the point that it is mostly
just implementing combine_diff_path_new(), plus setting the "next"
pointer. Since there's only one caller, let's replace it completely with
a call to that helper function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:25 -08:00
8c53354658 tree-diff: pass whole path string to path_appendnew()
When diffing trees, we'll have a strbuf "base" containing the
slash-separted names of our parent trees, and a "path" string
representing an entry name from the current tree. We pass these
separately to path_appendnew(), which combines them to form a single
path string in the combine_diff_path struct.

Instead, let's append the path string to our base strbuf ourselves, pass
in the result, and then roll it back with strbuf_setlen(). This lets us
simplify path_appendnew() a bit, enabling further refactoring.

And while it might seem like this causes extra wasted allocations, it
does not in practice. We reuse the same strbuf for each tree entry, so
we only have to allocate it to match the largest name. Plus, in a
recursive diff we'll end up doing this same operation to extend the base
for the next level of recursion. So we're really just incurring a small
memcpy().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:25 -08:00
a8dda1af6a tree-diff: drop path_appendnew() alloc optimization
When we're diffing trees, we create a list of combine_diff_path structs
that represent changed paths. We allocate each struct and add it to the
list with path_appendnew(), which we then feed to opt->pathchange().
That function tells us whether the path is of interest or not; if not,
then we can throw away the struct we allocated.

So there's an optimization to avoid extra allocations: instead of
throwing away the new entry, we try to reuse it. If it was large enough
to store the next path we care about, we can do so. And if not, we fall
back to freeing and re-allocating a new struct.

This comes from 72441af7c4 (tree-diff: rework diff_tree() to generate
diffs for multiparent cases as well, 2014-04-07), where the goal was to
have even the 2-parent diff code use the combine-diff infrastructure,
but without taking a performance hit.

The implementation causes some complexities in the interface (as we
store the allocation length inside the "next" pointer), and prevents us
from using the regular combine_diff_path_new() constructor. The
complexity is mostly contained inside two functions, but it's worth
re-evaluating how much it's helping.

That commit claims it helps ~1% on generating two-parent diffs in
linux.git. Here are the timings I get on the same command today ("old"
is the current tip of master, and "new" has this patch applied):

  Benchmark 1: ./git.old log --raw --no-abbrev --no-renames v3.10..v3.11
    Time (mean ± σ):     532.9 ms ±   5.8 ms    [User: 472.7 ms, System: 59.6 ms]
    Range (min … max):   525.9 ms … 543.3 ms    10 runs

  Benchmark 2: ./git.new log --raw --no-abbrev --no-renames v3.10..v3.11
    Time (mean ± σ):     538.3 ms ±   5.7 ms    [User: 478.0 ms, System: 59.7 ms]
    Range (min … max):   528.5 ms … 545.3 ms    10 runs

  Summary
    ./git.old log --raw --no-abbrev --no-renames v3.10..v3.11 ran
    1.01 ± 0.02 times faster than ./git.new log --raw --no-abbrev --no-renames v3.10..v3.11

So we do end up on average 1% faster, but with 2% of noise. I tried to
focus more on diff performance by running the commit traversal
separately, like:

  git rev-list v3.10..v3.11 >in

and then timing just the diffs:

  Benchmark 1: ./git.old diff-tree --stdin -r <in
    Time (mean ± σ):     415.7 ms ±   5.8 ms    [User: 357.7 ms, System: 58.0 ms]
    Range (min … max):   410.9 ms … 430.3 ms    10 runs

  Benchmark 2: ./git.new diff-tree --stdin -r <in
    Time (mean ± σ):     418.5 ms ±   2.1 ms    [User: 361.7 ms, System: 56.6 ms]
    Range (min … max):   414.9 ms … 421.3 ms    10 runs

  Summary
    ./git.old diff-tree --stdin -r <in ran
      1.01 ± 0.02 times faster than ./git.new diff-tree --stdin -r <in

That gets roughly the same result.

Adding in "-c" to do multi-parent diffs doesn't change much:

  Benchmark 1: ./git.old diff-tree --stdin -r -c <in
    Time (mean ± σ):     525.3 ms ±   6.6 ms    [User: 470.0 ms, System: 55.1 ms]
    Range (min … max):   508.4 ms … 531.0 ms    10 runs

  Benchmark 2: ./git.new diff-tree --stdin -r -c <in
    Time (mean ± σ):     532.3 ms ±   6.2 ms    [User: 469.0 ms, System: 63.1 ms]
    Range (min … max):   520.3 ms … 539.4 ms    10 runs

  Summary
    ./git.old diff-tree --stdin -r -c <in ran
      1.01 ± 0.02 times faster than ./git.new diff-tree --stdin -r -c <in

And of course if you add in a lot more work by doing actual
content-level diffs, any difference is lost entirely (here the newer
version is actually faster, but that's really just noise):

  Benchmark 1: ./git.old diff-tree --stdin -r --cc <in
    Time (mean ± σ):     11.571 s ±  0.064 s    [User: 11.287 s, System: 0.283 s]
    Range (min … max):   11.497 s … 11.615 s    3 runs

  Benchmark 2: ./git.new diff-tree --stdin -r --cc <in
    Time (mean ± σ):     11.466 s ±  0.109 s    [User: 11.108 s, System: 0.357 s]
    Range (min … max):   11.346 s … 11.560 s    3 runs

  Summary
    ./git.new diff-tree --stdin -r --cc <in ran
      1.01 ± 0.01 times faster than ./git.old diff-tree --stdin -r --cc <in

So my conclusion is that it probably does help a little, but it's mostly
lost in the noise. I could see an argument for keeping it, as the
complexity is hidden away in functions that do not often need to be
touched. But it does make them more confusing than necessary (despite
some detailed explanations from the author of that commit; it just took
me a while to wrap my head around what was going on) and prevents
further refactoring of the combine_diff_path struct. So let's drop it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:25 -08:00
ca3abe41d7 run_diff_files(): de-mystify the size of combine_diff_path struct
We allocate a combine_diff_path struct with space for 5 parents. Why 5?

The history is not particularly enlightening. The allocation comes from
b4b1550315 (Don't instantiate structures with FAMs., 2006-06-18), which
just switched to xmalloc from a stack struct with 5 elements. That
struct changed to 5 from 4 in 2454c962fb (combine-diff: show mode
changes as well., 2006-02-06), when we also moved from storing raw sha1
bytes to the combine_diff_parent struct. But no explanation is given.
That 4 comes from the earliest code in ea726d02e9 (diff-files: -c and
--cc options., 2006-01-28).

One might guess it is for the 4 stages we can store in the index. But
this code path only ever diffs the current state against stages 2 and 3.
So we only need two slots.

And it's easy to see this is still the case. We fill the parent slots by
subtracting 2 from the ce_stage() values, ignoring values below 2. And
since ce_stage() is only 2 bits, there are 4 values, and thus we need 2
slots.

Let's use the correct value (saving a tiny bit of memory) and add a
comment explaining what's going on (saving a tiny bit of programmer
brain power).

Arguably we could use:

  1 + (STAGEMASK >> STAGESHIFT) - 2

which lets the compiler enforce that we will not go out-of-bounds if we
see an unexpected value from ce_stage(). But that is more confusing to
explain, and the constant "2" is baked into other parts of the function.
It is a fundamental constant, not something where somebody might bump a
macro and forget to update this code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:24 -08:00
30f7414ca1 diff: add a comment about combine_diff_path.parent.path
We only fill in the per-parent "path" field when it differs from what's
in combine_diff_path.path (and even then only when the option is
appropriate). Let's document that.

Suggested-by: Wink Saville <wink@saville.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 12:24:24 -08:00
3a0599788f combine-diff: use pointer for parent paths
Commit d76ce4f734 (log,diff-tree: add --combined-all-paths option,
2019-02-07) added a "path" field to each combine_diff_parent struct.
It's defined as a strbuf, but this is overkill. We never manipulate the
buffer beyond inserting a single string into it.

And in fact there's a small bug: we zero the parent structs, including
the path strbufs. For the 0th parent, we strbuf_init() the strbuf before
adding to it. But for subsequent parents, we never do the init. This is
technically violating the strbuf API, though the code there is resilient
enough to handle this zero'd state.

This patch switches us to just store an allocated string pointer.
Zeroing it is enough to properly initialize it there (modulo the usual
assumption we make that a NULL pointer is all-zeroes).

And as a bonus, we can just check for a non-NULL value to see if it is
present, rather than repeating the combined_all_paths logic at each
site.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 10:31:23 -08:00
5173099aae tree-diff: clear parent array in path_appendnew()
All of the other functions which allocate a combine_diff_path struct
zero out the parent array, but this code path does not. There's no bug,
since our caller will fill in most of the fields. But leaving the unused
fields (like combine_diff_parent.path) uninitialized makes working with
the struct more error-prone than it needs to be.

Let's just zero the parent field to be consistent with the
combine_diff_path_new() allocator.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 10:05:50 -08:00
7067793441 combine-diff: add combine_diff_path_new()
The combine_diff_path struct has variable size, since it embeds both the
memory allocation for the path field as well as a variable-sized parent
array. This makes allocating one a bit tricky.

We have a helper to compute the required size, but it's up to individual
sites to actually initialize all of the fields. Let's provide a
constructor function to make that a little nicer. Besides being shorter,
it also hides away tricky bits like the computation of the "path"
pointer (which is right after the "parent" flex array).

As a bonus, using the same constructor everywhere means that we'll
consistently initialize all parts of the struct. A few code paths left
the parent array unitialized. This didn't cause any bugs, but we'll be
able to simplify some code in the next few patches knowing that the
parent fields have all been zero'd.

This also gets rid of some questionable uses of "int" to store buffer
lengths. Though we do use them to allocate, I don't think there are any
integer overflow vulnerabilities here (the allocation helper promotes
them to size_t and checks arithmetic for overflow, and the actual memcpy
of the bytes is done using the possibly-truncated "int" value).

Sadly we can't use the FLEX_* macros to simplify the allocation here,
because there are two variable-sized parts to the struct (and those
macros only handle one).

Nor can we get stop publicly declaring combine_diff_path_size(). This
patch does not touch the code in path_appendnew() at all, which is not
ready to be moved to our new constructor for a few reasons:

  - path_appendnew() has a memory-reuse optimization where it tries to
    reuse combine_diff_path structs rather than freeing and
    reallocating.

  - path_appendnew() does not create the struct from a single path
    string, but rather allocates and copies into the buffer from
    multiple sources.

These can be addressed by some refactoring, but let's leave it as-is for
now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 09:57:44 -08:00
949bb8f74f run_diff_files(): delay allocation of combine_diff_path
While looping over the index entries, when we see a higher level stage
the first thing we do is allocate a combine_diff_path struct for it. But
this can leak; if check_removed() returns an error, we'll continue to
the next iteration of the loop without cleaning up.

We can fix this by just delaying the allocation by a few lines.

I don't think this leak is triggered in the test suite, but it's pretty
easy to see by inspection. My ulterior motive here is that the delayed
allocation means we have all of the data needed to initialize "dpath" at
the time of malloc, making it easier to factor out a constructor
function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 09:56:28 -08:00
21e1b44865 difftool docs: restore correct position of tool list
2a9dfdf260 (difftool docs: de-duplicate configuration sections, 2022-09-07)
moved the difftool documentation, but missed moving this "include" line that
includes the generated list of diff tools, as referenced in the moved text.

Restore the correct position of the included list.

Signed-off-by: Adam Johnson <me@adamj.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 08:46:53 -08:00
43850dcf9c t/unit-tests: convert hash to use clar test framework
Adapt the hash test functions to clar framework by using clar
assertions where necessary. Following the consensus to convert
the unit-tests scripts found in the t/unit-tests folder to clar driven by
Patrick Steinhardt. Test functions are structured as a standalone to
test individual hash string and literal case.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 07:55:00 -08:00
a60673e925 Merge branch 'js/reftable-realloc-errors-fix'
Last-minute fix to a recent update.

* js/reftable-realloc-errors-fix:
  t-reftable-basics: allow for `malloc` to be `#define`d
2025-01-08 14:10:27 -08:00
e05e111feb Merge branch 'sj/meson-perl-build-fix'
The build procedure in "meson" for the "perl/" hierarchy lacked
necessary dependencies, which has been corrected.

* sj/meson-perl-build-fix:
  meson: fix perl dependencies
2025-01-08 14:10:26 -08:00
d02c37c3e6 t-reftable-basics: allow for malloc to be #defined
As indicated by the `#undef malloc` line in `reftable/basics.h`, it is
quite common to use allocators other than the default one by defining
`malloc` constants and friends.

This pattern is used e.g. in Git for Windows, which uses the powerful
and performant `mimalloc` allocator.

Furthermore, in `reftable/basics.c` this `#undef malloc` is
_specifically_ disabled by virtue of defining the
`REFTABLE_ALLOW_BANNED_ALLOCATORS` constant before including
`reftable/basic.h`, to ensure that such a custom allocator is also used
in the reftable code.

However, in 8db127d43f (reftable: avoid leaks on realloc error,
2024-12-28) and in 2cca185e85 (reftable: fix allocation count on
realloc error, 2024-12-28), `reftable_set_alloc()` function calls were
introduced that pass `malloc`, `realloc` and `free` function pointers as
parameters _after_ `reftable/basics.h` ensured that they were no longer
`#define`d. This would override the custom allocator and re-set it to
the default allocator provided by, say, libc or MSVCRT.

This causes problems because those calls happen after the initial
allocator has already been used to initialize an array, which is
subsequently resized using the overridden default `realloc()` allocator.

You cannot mix and match allocators like that, which leads to a
`STATUS_HEAP_CORRUPTION` (C0000374) on Windows, and when running this
unit test through shell and/or `prove` (which only support 7-bit status
codes), it surfaces as exit code 127.

It is actually unnecessary to use those function pointers to
`malloc`/`realloc`/`free`, though: The `reftable` code goes out of its
way to fall back to the initial allocator when passing `NULL` parameters
instead. So let's do that instead of causing heap corruptions.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-08 09:41:52 -08:00
45c0897204 meson: fix perl dependencies
`generate_perl_command` needs `depends: [git_version_file]` and the uses
in top-level meson.build were fine, but the ones in perl/ weren't, causing
parallel build failures in some cases as GIT-BUILD-OPTIONS wasn't yet
available.

Signed-off-by: Sam James <sam@gentoo.org>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-08 08:05:39 -08:00
6a63995335 docs: fix typesetting of merge driver placeholders
Following the `CodingGuidlines`, since these placeholders are literal
they should be typeset verbatim, so fix some that aren't.

Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 15:11:36 -08:00
14650065b7 RelNotes/2.48.0: fix typos etc.
Correct verb tense, add missing words, avoid double blank lines,
and rephrase things that don’t read well to me like “Turn this linkage
to relative paths”.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 10:46:18 -08:00
ca7158076f fsck: reject misconfigured fsck.skipList
In Git, fsck operations can ignore known broken objects via the
`fsck.skipList` configuration. This option expects a path to a file with
the list of object names. When the configuration is specified without a
path, an error message is printed, but the command continues as if the
configuration was not set. Configuring `fsck.skipList` without a value
is a misconfiguration so config parsing should be more strict and reject
it.

Update `git_fsck_config()` to no longer ignore misconfiguration of
`fsck.skipList`. The same behavior is also present for
`fetch.fsck.skipList` and `receive.fsck.skipList` so the configuration
parsers for these are updated to ensure the related operations remain
consistent.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 09:22:25 -08:00
0b4f8afef6 reftable/stack: accept insecure random bytes
The reftable library uses randomness in two call paths:

  - When reading a stack in case some of the referenced tables
    disappears. The randomness is used to delay the next read by a
    couple of milliseconds.

  - When writing a new table, where the randomness gets appended to the
    table name (e.g. "0x000000000001-0x000000000002-0b1d8ddf.ref").

In neither of these cases do we need strong randomness.

Unfortunately though, we have observed test failures caused by the
former case. In t0610 we have a test that spawns a 100 processes at
once, all of which try to write a new table to the stack. And given that
all of the processes will require randomness, it can happen that these
processes make the entropy pool run dry, which will then cause us to
die:

    + test_seq 100
    + printf %s commit\trefs/heads/branch-%s\n
    68d032e9edd3481ac96382786ececc37ec28709e 1
    + printf %s commit\trefs/heads/branch-%s\n
    68d032e9edd3481ac96382786ececc37ec28709e 2
    ...
    + git update-ref refs/heads/branch-98 HEAD
    + git update-ref refs/heads/branch-97 HEAD
    + git update-ref refs/heads/branch-99 HEAD
    + git update-ref refs/heads/branch-100 HEAD
    fatal: unable to get random bytes
    fatal: unable to get random bytes
    fatal: unable to get random bytes
    fatal: unable to get random bytes
    fatal: unable to get random bytes
    fatal: unable to get random bytes
    fatal: unable to get random bytes

The report was for NonStop, which uses OpenSSL as the backend for
randomness. In the preceding commit we have adapted that backend to also
return randomness in case the entropy pool is empty and the caller
passes the `CSPRNG_BYTES_INSECURE` flag. Do so to fix the issue.

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 09:04:18 -08:00
1568d1562e wrapper: allow generating insecure random bytes
The `csprng_bytes()` function generates randomness and writes it into a
caller-provided buffer. It abstracts over a couple of implementations,
where the exact one that is used depends on the platform.

These implementations have different guarantees: while some guarantee to
never fail (arc4random(3)), others may fail. There are two significant
failures to distinguish from one another:

  - Systemic failure, where e.g. opening "/dev/urandom" fails or when
    OpenSSL doesn't have a provider configured.

  - Entropy failure, where the entropy pool is exhausted, and thus the
    function cannot guarantee strong cryptographic randomness.

While we cannot do anything about the former, the latter failure can be
acceptable in some situations where we don't care whether or not the
randomness can be predicted.

Introduce a new `CSPRNG_BYTES_INSECURE` flag that allows callers to opt
into weak cryptographic randomness. The exact behaviour of the flag
depends on the underlying implementation:

    - `arc4random_buf()` never returns an error, so it doesn't change.

    - `getrandom()` pulls from "/dev/urandom" by default, which never
      blocks on modern systems even when the entropy pool is empty.

    - `getentropy()` seems to block when there is not enough randomness
      available, and there is no way of changing that behaviour.

    - `GtlGenRandom()` doesn't mention anything about its specific
      failure mode.

    - The fallback reads from "/dev/urandom", which also returns bytes in
      case the entropy pool is drained in modern Linux systems.

That only leaves OpenSSL with `RAND_bytes()`, which returns an error in
case the returned data wouldn't be cryptographically safe. This function
is replaced with a call to `RAND_pseudo_bytes()`, which can indicate
whether or not the returned data is cryptographically secure via its
return value. If it is insecure, and if the `CSPRNG_BYTES_INSECURE` flag
is set, then we ignore the insecurity and return the data regardless.

It is somewhat questionable whether we really need the flag in the first
place, or whether we wouldn't just ignore the potentially-insecure data.
But the risk of doing that is that we might have or grow callsites that
aren't aware of the potential insecureness of the data in places where
it really matters. So using a flag to opt-in to that behaviour feels
like the more secure choice.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 09:04:18 -08:00
4a2b3df546 Merge tag 'l10n-2.48.0-rnd1' of https://github.com/git-l10n/git-po
l10n-2.48.0-rnd1

* tag 'l10n-2.48.0-rnd1' of https://github.com/git-l10n/git-po:
  l10n: po-id for 2.48
  l10n: zh_CN: updated translation for 2.48
  l10n: uk: v2.48 update
  l10n: sv.po, fixed swedish typos
  l10n: vi: Updated translation for 2.48
  l10n: Update German translation
  l10n: tr: Update Turkish translations for 2.48
  l10n: sv.po: Update Swedish translation
  l10n: fr: v2.48.0
  l10n: zh_TW: Git 2.48 round 2
  l10n: zh_TW: Git 2.48
  l10n: bg.po: Updated Bulgarian translation (5804t)
  l10n: fr.po: Minor improvements
2025-01-07 08:53:02 -08:00
ddb5287894 t7407: use test_grep
There are a few grep calls here that can benefit from test_grep, which
produces more user-friendly output when it fails.

One of these calls also passes "-sq", which is curious. The "-q" option
suppresses the matched output. But test output is either already
redirected to /dev/null in non-verbose mode, and in verbose mode it's
better to see the output. The "-s" option suppresses errors opening
files, but we are just grepping in the "expected" file we just
generated, so it should not be needed. Neither of these was really
hurting anything, but they are not a style we'd like to see emulated. So
get rid of them.

(It is also curious to grep in the expected file in the first place, but
that is because we are auto-generating the expectation from a Git
command. So this is double-checking it did what we wanted).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:31:45 -08:00
164a2516eb test-lib: add a few comments to LSan log checking
Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
code, 2025-01-01) added code to suppress a false positive in the leak
checker. But if you're just reading the code, the obscure grep call is a
bit of a head-scratcher. Let's add a brief comment explaining what's
going on (and anybody digging further can find this commit or that one
for all the details).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:18:15 -08:00
b9a9df93a3 test-lib: simplify lsan results check
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:

  - it's one fewer process to run

  - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
    checked at the beginning of the function, and is the same glob used
    to show the logs in check_test_results_san_file_

  - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
    space in it. For example doing:

       mkdir "/tmp/foo bar"
       TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test

    would yield a lot of:

      grep: /tmp/foo: No such file or directory
      grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory

    when there are leaks. We could do the same thing with "xargs
    --null", but that isn't portable.

We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.

We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:17:54 -08:00
8d24d56ce1 test-lib: invert return value of check_test_results_san_file_empty
We have a function to check whether LSan logged any leaks. It returns
success for no leaks, and non-zero otherwise. This is the simplest thing
for its callers, who want to say "if no leaks then return early". But
because it's implemented as a shell pipeline, you end up with the
awkward:

  ! find ... |
  xargs grep leaks |
  grep -v false-positives

where the "!" is actually negating the final grep. Switch the return
value (and name) to return success when there are leaks. This should
make the code a little easier to read, and the negation in the callers
still reads pretty naturally.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:09:14 -08:00
fc613c01d4 Merge branch '2.48-uk-update' of github.com:arkid15r/git-ukrainian-l10n
* '2.48-uk-update' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: uk: v2.48 update
2025-01-07 15:45:43 +08:00
56610beac2 Merge branch 'vi-2.48' of github.com:Nekosha/git-po
* 'vi-2.48' of github.com:Nekosha/git-po:
  l10n: vi: Updated translation for 2.48
2025-01-07 15:45:21 +08:00
12bcb4d4d0 Merge branch 'l10n-de-2.48' of github.com:ralfth/git
* 'l10n-de-2.48' of github.com:ralfth/git:
  l10n: Update German translation
2025-01-07 15:44:49 +08:00
111a9d51d2 Merge branch 'tl/zh_CN_2.48.0_rnd' of github.com:dyrone/git
* 'tl/zh_CN_2.48.0_rnd' of github.com:dyrone/git:
  l10n: zh_CN: updated translation for 2.48
2025-01-07 15:44:11 +08:00
97bfea6377 Merge branch 'fr_v2.48.0' of github.com:jnavila/git
* 'fr_v2.48.0' of github.com:jnavila/git:
  l10n: fr: v2.48.0
  l10n: fr.po: Minor improvements
2025-01-07 15:39:11 +08:00
b987f159e3 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5804t)
2025-01-07 15:38:39 +08:00
8ddca35c13 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.48
2025-01-07 15:37:51 +08:00
ac8fe418a6 Merge branch 'tr-l10n' of github.com:bitigchi/git-po
* 'tr-l10n' of github.com:bitigchi/git-po:
  l10n: tr: Update Turkish translations for 2.48
2025-01-07 15:36:40 +08:00
02b355f546 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po, fixed swedish typos
  l10n: sv.po: Update Swedish translation
2025-01-07 15:35:49 +08:00
15341c8499 Merge branch 'l10n/zh-TW/2024-12-17' of github.com:l10n-tw/git-po
* 'l10n/zh-TW/2024-12-17' of github.com:l10n-tw/git-po:
  l10n: zh_TW: Git 2.48 round 2
  l10n: zh_TW: Git 2.48
2025-01-07 15:34:24 +08:00
8776470cf3 completion: repair config completion for Zsh
Commit 1e0ee4087e (completion: add and use
__git_compute_first_level_config_vars_for_section, 2024-02-10) uses an
indirect variable syntax that is only valid for Bash, but the Zsh
completion code relies on the Bash completion code to function. Zsh
supports a different indirect variable expansion using ${(P)var}, but in
`emulate ksh` mode does not support Bash's ${!var}.

This manifests as completing strange config options like
"__git_first_level_config_vars_for_section_remote" as a choice for the
command line

    git config set remote.

Using Zsh's C-x ? _complete_debug widget with the cursor at the end of
that command line captures a trace, in which we see (some details
elided):

    +__git_complete_config_variable_name:7> __git_compute_first_level_config_vars_for_section remote
     +__git_compute_first_level_config_vars_for_section:7> local section=remote
     +__git_compute_first_level_config_vars_for_section:7> __git_compute_config_vars
      +__git_compute_config_vars:7> test -n $'add.ignoreErrors\nadvice.addEmbeddedRepo\nadvice.addEmptyPathspec\nadvice.addIgnoredFile[…]'
     +__git_compute_first_level_config_vars_for_section:7> local this_section=__git_first_level_config_vars_for_section_remote
     +__git_compute_first_level_config_vars_for_section:7> test -n __git_first_level_config_vars_for_section_remote
    +__git_complete_config_variable_name:7> local this_section=__git_first_level_config_vars_for_section_remote
    +__git_complete_config_variable_name:7> __gitcomp_nl_append __git_first_level_config_vars_for_section_remote remote. '' ' '
     +__gitcomp_nl_append:7> __gitcomp_nl __git_first_level_config_vars_for_section_remote remote. '' ' '
      +__gitcomp_nl:7> emulate -L zsh
      +__gitcomp_nl:7> compset -P '*[=:]'
      +__gitcomp_nl:7> compadd -Q -S ' ' -p remote. -- __git_first_level_config_vars_for_section_remote

We perform the test for __git_compute_config_vars correctly, but the
${!this_section} references are not expanded as expected.

Instead, portably expand indirect references through the new
__git_indirect. Contrary to some versions you might find online [1],
this version avoids echo non-portabilities [2] [3] and correctly quotes
the indirect expansion after eval (so that the result is not split or
globbed before being handed to printf).

[1]: https://unix.stackexchange.com/a/41409/301073
[2]: https://askubuntu.com/questions/715765/mysterious-behavior-of-echo-command#comment1056038_715769
[3]: https://mywiki.wooledge.org/CatEchoLs

The following demo program demonstrates how this works:

    b=1
    indirect() {
      eval printf '%s' "\"\$$1\""
    }
    f() {
      # Comment this out to see that it works for globals, too. Or, use
      # a value with spaces like '2 3 4' to see how it handles those.
      local b=2
      local a=b
      test -n "$(indirect $a)" && echo nice
    }
    f

When placed in a file "demo", then both
    bash -x demo
and
    zsh -xc 'emulate ksh -c ". ./demo"' |& tail
provide traces showing that "$(indirect $a)" produces 2 (or 1, with the
global, or "2 3 4" as a single string, etc.).

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Acked-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-06 14:21:26 -08:00
a41e394e21 Merge branch 'bf/fetch-set-head-config'
A hotfix on an advice messagge added during this cycle.

* bf/fetch-set-head-config:
  fetch: fix erroneous set_head advice message
2025-01-06 12:02:21 -08:00
b74ff38af5 Git 2.48-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-06 08:24:43 -08:00
ee0e3bbc8d Merge branch 'jc/doc-opt-tilde-expand'
Docfix.

* jc/doc-opt-tilde-expand:
  gitcli.txt: typeset pathnames as monospace
2025-01-06 08:23:29 -08:00
1fa37a0608 Merge branch 'mh/doc-windows-home-env'
Docfix.

* mh/doc-windows-home-env:
  git.txt: fix heading line of tildes
2025-01-06 08:23:29 -08:00
d7fcbe2c56 object-file: retry linking file into place when occluding file vanishes
Prior to 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30), callers could expect that a successful return from
`finalize_object_file()` means that either the file was moved into
place, or the identical bytes were already present. If neither of those
happens, we'd return an error.

Since that commit, if the destination file disappears between our
link(3p) call and the collision check, we'd return success without
actually checking the contents, and without retrying the link. This
solves the common case that the files were indeed the same, but it means
that we may corrupt the repository if they weren't (this implies a hash
collision, but the whole point of this function is protecting against
hash collisions).

We can't be pessimistic and assume they're different; that hurts the
common case that the mentioned commit was trying to fix. But after
seeing that the destination file went away, we can retry linking again.
Adapt the code to do so when we see that the destination file has racily
vanished. This should generally succeed as we have just observed that
the destination file does not exist anymore, except in the very unlikely
event that it gets recreated by another concurrent process again.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-06 07:57:17 -08:00
cfae50e40e object-file: don't special-case missing source file in collision check
In 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30) we have started to ignore ENOENT when opening either the
source or destination file of the collision check. This was done to
handle races more gracefully in case either of the potentially-colliding
disappears.

The fix is overly broad though: while the destination file may indeed
vanish racily, this shouldn't ever happen for the source file, which is
a temporary object file (either loose or in packfile format) that we
have just created. So if any concurrent process would have removed that
temporary file it would indicate an actual issue.

Stop treating ENOENT specially for the source file so that we always
bubble up this error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-06 07:57:17 -08:00
c1acf1a317 object-file: rename variables in check_collision()
Rename variables used in `check_collision()` to clearly identify which
file is the source and which is the destination. This will make the next
step easier to reason about when we start to treat those files different
from one another.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-06 07:57:17 -08:00
e63e62171b Merge branch 'master' of https://github.com/j6t/gitk
* 'master' of https://github.com/j6t/gitk:
  gitk: Update Bulgarian translation (327t)
2025-01-06 06:52:05 -08:00
bac67e1370 Merge branch 'master' of https://github.com/j6t/git-gui
* 'master' of https://github.com/j6t/git-gui:
  git-gui i18n: Updated Bulgarian translation (579t)
2025-01-06 06:51:37 -08:00
233d48f5de fetch: fix erroneous set_head advice message
9e2b7005be (fetch set_head: add warn-if-not-$branch option, 2024-12-05)
tried to expand the advice message for set_head with the new option, but
unfortunately did not manage to add the right incantation. Fix the
advice message with the correct usage of warn-if-not-$branch.

Reported-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-06 06:50:03 -08:00
238c0c095f l10n: po-id for 2.48
Update following components:

  * advice.c
  * archive.c
  * builtin/checkout.c
  * builtin/clone.c
  * builtin/config.c
  * builtin/describe.c
  * builtin/fetch.c
  * builtin/gc.c
  * builtin/index-pack.c
  * builtin/notes.c
  * builtin/pack-objects.c
  * builtin/remote.c
  * builtin/worktree.c
  * commit.c
  * fetch-pack.c
  * hook.c
  * object-name.c
  * refs.c
  * refs/files-backend.c
  * remote.c
  * worktree.c

Translate following new components:

  * cache-tree.c
  * daemon.c
  * merge-ll.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2025-01-06 15:55:13 +07:00
bba9dd6a96 l10n: zh_CN: updated translation for 2.48
Signed-off-by: Teng Long <dyroneteng@gmail.com>
2025-01-05 19:04:34 +08:00
ae6336b617 Merge branch 'as/translations-bg'
* as/translations-bg:
  git-gui i18n: Updated Bulgarian translation (579t)

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-01-05 10:44:35 +01:00
10fd0e1203 l10n: uk: v2.48 update
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2025-01-04 19:26:33 -08:00
087ac48674 l10n: sv.po, fixed swedish typos
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2025-01-04 22:56:39 +01:00
a2df58fb15 l10n: vi: Updated translation for 2.48
Signed-off-by: Vũ Tiến Hưng <newcomerminecraft@gmail.com>
2025-01-05 01:58:03 +07:00
866ea87703 t7110: replace test -f with test_path_is_* helpers
`test -f` and `! test -f` do not provide clear error messages when they fail.
To enhance debuggability, use `test_path_is_file` and `test_path_is_missing`,
which instead provide more informative error messages.

Note that `! test -f` checks if a path is not a file, while
`test_path_is_missing` verifies that a path does not exist. In this specific
case the tests are meant to check the absence of the path, making
`test_path_is_missing` a valid replacement.

Signed-off-by: Matteo Bagnolini <matteobagnolini2003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-03 10:35:13 -08:00
b1dbc87686 l10n: Update German translation
Reviewed-by: Matthias Rüster <matthias.ruester@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2025-01-03 17:41:26 +01:00
b67a603f63 gitcli.txt: typeset pathnames as monospace
Commit 1bc1e94091 (doc: option value may be separate for valid reasons,
2024-11-25) added a paragraph discussing tilde-expansion of, e.g.,
~/directory/file.

The tilde character has a special meaning to asciidoc tools. In this
particular case, AsciiDoc matches up the two tildes in "e.g.
~/directory/file or ~u/d/f" and sets the text between them using
subscript. In the manpage, where subscripting is not possible, this
renders as "e.g.  /directory/file oru/d/f".

These paths are literal values, which our coding guidelines want typeset
as verbatim using backticks. Do that. One effect of this is indeed that
the asciidoc tools stop interpreting tilde and other special characters.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-03 08:23:59 -08:00
38d7016891 git.txt: fix heading line of tildes
The two-line heading added in 8525e92886 (Document HOME environment
variable, 2024-12-09) uses too many tilde characters, so the heading
isn't detected as such. Both AsciiDoc and Asciidoctor end up
misrendering this in different ways.

Use the correct number of tilde characters to fix this.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-03 08:23:10 -08:00
1b4e9a5f8b Merge branch 'ps/build-meson-html'
The build procedure based on meson learned to generate HTML
documention pages.

* ps/build-meson-html:
  Documentation: wire up sanity checks for Meson
  t/Makefile: make "check-meson" work with Dash
  meson: install static files for HTML documentation
  meson: generate articles
  Documentation: refactor "howto-index.sh" for out-of-tree builds
  Documentation: refactor "api-index.sh" for out-of-tree builds
  meson: generate user manual
  Documentation: inline user-manual.conf
  meson: generate HTML pages for all man page categories
  meson: fix generation of merge tools
  meson: properly wire up dependencies for our docs
  meson: wire up support for AsciiDoctor
2025-01-02 13:37:08 -08:00
effbef2beb Merge branch 'jk/lsan-race-ignore-false-positive'
CI jobs that run threaded programs under LSan has been giving false
positives from time to time, which has been worked around.

This is an alternative to the jk/lsan-race-with-barrier topic with
much smaller change to the production code.

* jk/lsan-race-ignore-false-positive:
  test-lib: ignore leaks in the sanitizer's thread code
  test-lib: check leak logs for presence of DEDUP_TOKEN
  test-lib: simplify leak-log checking
  test-lib: rely on logs to detect leaks
  Revert barrier-based LSan threading race workaround
2025-01-02 13:37:08 -08:00
b119a687d4 test-lib: ignore leaks in the sanitizer's thread code
Our CI jobs sometimes see false positive leaks like this:

        =================================================================
        ==3904583==ERROR: LeakSanitizer: detected memory leaks

        Direct leak of 32 byte(s) in 1 object(s) allocated from:
            #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
            #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
            #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
            #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
            #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
            #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
            #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
            #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

This is not a leak in our code, but appears to be a race between one
thread calling exit() while another one is in LSan's stack setup code.
You can reproduce it easily by running t0003 or t5309 with --stress
(these trigger it because of the threading in git-grep and index-pack
respectively).

This may be a bug in LSan, but regardless of whether it is eventually
fixed, it is useful to work around it so that we stop seeing these false
positives.

We can recognize it by the mention of the sanitizer functions in the
DEDUP_TOKEN line. With this patch, the scripts mentioned above should
run with --stress indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
6fb8cb3d68 test-lib: check leak logs for presence of DEDUP_TOKEN
When we check the leak logs, our original strategy was to check for any
non-empty log file produced by LSan. We later amended that to ignore
noisy lines in 370ef7e40d (test-lib: ignore uninteresting LSan output,
2023-08-28).

This makes it hard to ignore noise which is more than a single line;
we'd have to actually parse the file to determine the meaning of each
line.

But there's an easy line-oriented solution. Because we always pass the
dedup_token_length option, the output will contain a DEDUP_TOKEN line
for each leak that has been found. So if we invert our strategy to stop
ignoring useless lines and only look for useful ones, we can just count
the number of DEDUP_TOKEN lines. If it's non-zero, then we found at
least one leak (it would even give us a count of unique leaks, but we
really only care if it is non-zero).

This should yield the same outcome, but will help us build more false
positive detection on top.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
373a432696 test-lib: simplify leak-log checking
We have a function to count the number of leaks found (actually, it is
the number of processes which produced a log file). Once upon a time we
cared about seeing if this number increased between runs. But we
simplified that away in 95c679ad86 (test-lib: stop showing old leak
logs, 2024-09-24), and now we only care if it returns any results or
not.

In preparation for refactoring it further, let's drop the counting
function entirely, and roll it into the "is it empty" check. The outcome
should be the same, but we'll be free to return a boolean "did we find
anything" without worrying about somebody adding a new call to the
counting function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
5fa0c4dd29 test-lib: rely on logs to detect leaks
When we run with sanitizers, we set abort_on_error=1 so that the tests
themselves can detect problems directly (when the buggy program exits
with SIGABRT). This has one blind spot, though: we don't always check
the exit codes for all programs (e.g., helpers like upload-pack invoked
behind the scenes).

For ASan and UBSan this is mostly fine; they exit as soon as they see an
error, so the unexpected abort of the program causes the test to fail
anyway.

But for LSan, the program runs to completion, since we can only check
for leaks at the end. And in that case we could miss leak reports. And
thus we started checking LSan logs in faececa53f (test-lib: have the
"check" mode for SANITIZE=leak consider leak logs, 2022-07-28).
Originally the logs were optional, but logs are generated (and checked)
always as of 8c1d6691bc (test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by
default, 2024-07-11). And we even check them for each test snippet, as
of cf1464331b (test-lib: check for leak logs after every test,
2024-09-24).

So now aborting on error is superfluous for LSan! We can get everything
we need by checking the logs. And checking the logs is actually
preferable, since it gives us more control over silencing false
positives (something we do not yet do, but will soon).

So let's tell LSan to just exit normally, even if it finds leaks. We can
do so with exitcode=0, which also suppresses the abort_on_error flag.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
fc89d14c63 Revert barrier-based LSan threading race workaround
The extra "barrier" approach was too much code whose sole purpose
was to work around a race that is not even ours (i.e. in LSan's
teardown code).

In preparation for queuing a solution taking a much-less-invasive
approach, let's revert them.
2025-01-01 14:13:01 -08:00
d062ccf4c3 A bit more post Git 2.48-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 09:21:15 -08:00
d893741e02 Merge branch 'jk/lsan-race-with-barrier'
CI jobs that run threaded programs under LSan has been giving false
positives from time to time, which has been worked around.

* jk/lsan-race-with-barrier:
  grep: work around LSan threading race with barrier
  index-pack: work around LSan threading race with barrier
  thread-utils: introduce optional barrier type
  Revert "index-pack: spawn threads atomically"
  test-lib: use individual lsan dir for --stress runs
2025-01-01 09:21:15 -08:00
98422943f0 Merge branch 'ps/weak-sha1-for-tail-sum-fix'
An earlier "csum-file checksum does not have to be computed with
sha1dc" topic had a few code paths that had initialized an
implementation of a hash function to be used by an unmatching hash
by mistake, which have been corrected.

* ps/weak-sha1-for-tail-sum-fix:
  ci: exercise unsafe OpenSSL backend
  builtin/fast-import: fix segfault with unsafe SHA1 backend
  bulk-checkin: fix segfault with unsafe SHA1 backend
2025-01-01 09:21:14 -08:00
73e35b172a Merge branch 'rs/reftable-realloc-errors'
The custom allocator code in the reftable library did not handle
failing realloc() very well, which has been addressed.

* rs/reftable-realloc-errors:
  t-reftable-merged: handle realloc errors
  reftable: handle realloc error in parse_names()
  reftable: fix allocation count on realloc error
  reftable: avoid leaks on realloc error
2025-01-01 09:21:13 -08:00
1a18bf3a5b l10n: tr: Update Turkish translations for 2.48
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2025-01-01 15:29:51 +03:00
bc2c65770d Git 2.48-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:58:28 -08:00
e1d34f36ea Merge branch 'ms/t7611-test-path-is-file'
Test modernization.

* ms/t7611-test-path-is-file:
  t7611: replace test -f with test_path_is* helpers
2024-12-30 06:56:28 -08:00
5b34dd08d0 parse-options: localize mark-up of placeholder text in the short help
i18n: expose substitution hint chars in functions and macros to
translators

For example (based on builtin/commit.c and shortened): the "--author"
option takes a name.  In source this can be represented as:

  OPT_STRING(0, "author", &force_author, N_("author"), N_("override author")),

When the command is run with "-h" (short help) option (git commit -h),
the above definition is displayed as:

  --[no-]author <author>    override author

Git does not use translated option names so the first part of the
above, "--[no-]author", is given as-is (it is based on the 2nd
argument of OPT_STRING).  However the string "author" in the pair of
"<>", and the explanation "override author for commit" may be
translated into user's language.

The user's language may use a convention to mark a replaceable part of
the command line (called a "placeholder string") differently from
enclosing it inside a pair of "<>", but the implementation in
parse-options.c hardcodes "<%s>".

Allow translators to specify the presentation of a placeholder string
for their languages by overriding the "<%s>".

In case the translator's writing system is sufficiently different than
Latin the "<>" characters can be substituted by an empty string thus
effectively skipping them in the output.  For example languages with
uppercase versions of characters can use that to deliniate
replaceability.

Alternatively a translator can decide to use characters that are
visually close to "<>" but are not interpreted by the shell.

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:55:24 -08:00
6a0ee54f9a meson: provide a summary of configured backends
There are a couple of backends from which the user can choose for HTTPS,
SHA1, its unsafe variant as well as SHA256. Provide a summary of the
configured values to make these more discoverable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:50:49 -08:00
d2c0b6a86c meson: wire up unsafe SHA1 backend
In 06c92dafb8 (Makefile: allow specifying a SHA-1 for non-cryptographic
uses, 2024-09-26), we have introduced a cryptographically-insecure
backend for SHA1 that can optionally be used in some contexts where the
processed data is not security relevant. This effort was in-flight with
the effort to introduce Meson, so we don't have an equivalent here.

Wire up a new build option that lets users pick an unsafe SHA1 backend.

Note that for simplicity's sake we have to drop the error condition
around an unhandled SHA1 backend. This should be fine though given that
Meson verifies the value for combo-options for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:50:49 -08:00
12068bd4de meson: add missing dots for build options
Most of our Meson build options end with a trailing dot, but those for
our SHA1 and SHA256 backends don't. Add it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:50:49 -08:00
6d8aa2aec8 meson: simplify conditions for HTTPS and SHA1 dependencies
The conditions used to figure out whteher the Security framework or
OpenSSL library is required are a bit convoluted because they can be
pulled in via the HTTPS, SHA1 or SHA256 backends. Refactor them to be
easier to read.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:50:46 -08:00
d6787d9751 meson: require SecurityFramework when it's used as SHA1 backend
The Security framework is required when we use CommonCrypto either as
HTTPS or SHA1 backend, but we only require it in case it is set up as
HTTPS backend. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:50:45 -08:00
31eb6d7cf0 meson: deduplicate access to SHA1/SHA256 backend options
We've got a couple of repeated calls to `get_option()` for the SHA1 and
SHA256 backend options. While not an issue, it makes the code needlessly
verbose.

Fix this by consistently using a local variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:50:45 -08:00
8214e27d27 meson: consistenlty spell 'CommonCrypto'
The 'CommonCrypto' backend can be specified as HTTPS and SHA1 backends,
but the value that one needs to use is inconsistent across those two
build options. Unify it to 'CommonCrypto'.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:50:45 -08:00
cade724b52 Merge branch 'ps/weak-sha1-for-tail-sum-fix' into ps/meson-weak-sha1-build
* ps/weak-sha1-for-tail-sum-fix:
  ci: exercise unsafe OpenSSL backend
  builtin/fast-import: fix segfault with unsafe SHA1 backend
  bulk-checkin: fix segfault with unsafe SHA1 backend
2024-12-30 06:50:28 -08:00
599a63409b ci: exercise unsafe OpenSSL backend
In the preceding commit we have fixed a segfault when using an unsafe
SHA1 backend that is different from the safe one. This segfault only
went by unnoticed because we never set up an unsafe backend in our CI
systems. Fix this ommission by setting `OPENSSL_SHA1_UNSAFE` in our
TEST-vars job.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:46:30 -08:00
106140a99f builtin/fast-import: fix segfault with unsafe SHA1 backend
Same as with the preceding commit, git-fast-import(1) is using the safe
variant to initialize a hashfile checkpoint. This leads to a segfault
when passing the checkpoint into the hashfile subsystem because it would
use the unsafe variants instead:

    ++ git --git-dir=R/.git fast-import --big-file-threshold=1
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0)
    ==577126==The signal is caused by a READ memory access.
    ==577126==Hint: address points to the zero page.
        #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
        #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
        #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
        #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
        #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2
        #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3
        #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5
        #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4
        #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4
        #9 0x555555b1f493 in run_builtin ../git.c:480:11
        #10 0x555555b1bfef in handle_builtin ../git.c:740:9
        #11 0x555555b1e6f4 in run_argv ../git.c:807:4
        #12 0x555555b1b87a in cmd_main ../git.c:947:19
        #13 0x5555561649e6 in main ../common-main.c:64:11
        #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #16 0x555555772c84 in _start (git+0x21ec84)

    ==577126==Register values:
    rax = 0x0000511000000cc0  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
    rdi = 0x0000000000000000  rsi = 0x00005070000009c0  rbp = 0x00005070000009c0  rsp = 0x00007fffffff5b30
     r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
    r12 = 0x0000000000000000  r13 = 0x00007fffffff6b60  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
    ==577126==ABORTING
    ./test-lib.sh: line 1039: 577126 Aborted                 git --git-dir=R/.git fast-import --big-file-threshold=1 < input
    error: last command exited with $?=134
    not ok 167 - R: blob bigger than threshold

The segfault is only exposed in case the unsafe and safe backends are
different from one another.

Fix the issue by initializing the context with the unsafe SHA1 variant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:46:30 -08:00
9218c0bfe1 bulk-checkin: fix segfault with unsafe SHA1 backend
In 1b9e9be8b4 (csum-file.c: use unsafe SHA-1 implementation when
available, 2024-09-26) we have converted our `struct hashfile` to use
the unsafe SHA1 backend, which results in a significant speedup. One
needs to be careful with how to use that structure now though because
callers need to consistently use either the safe or unsafe variants of
SHA1, as otherwise one can easily trigger corruption.

As it turns out, we have one inconsistent usage in our tree because we
directly initialize `struct hashfile_checkpoint::ctx` with the safe
variant of SHA1, but end up writing to that context with the unsafe
ones. This went unnoticed so far because our CI systems do not exercise
different hash functions for these two backends, and consequently safe
and unsafe variants are equivalent. But when using SHA1DC as safe and
OpenSSL as unsafe backend this leads to a crash an t1050:

    ++ git -c core.compression=0 add large1
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==1367==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x507000000db0 sp 0x7fffffff5690 T0)
    ==1367==The signal is caused by a READ memory access.
    ==1367==Hint: address points to the zero page.
        #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
        #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
        #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
        #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
        #4 0x555555b9905d in deflate_blob_to_pack ../bulk-checkin.c:286:4
        #5 0x555555b98ae9 in index_blob_bulk_checkin ../bulk-checkin.c:362:15
        #6 0x555555ddab62 in index_blob_stream ../object-file.c:2756:9
        #7 0x555555dda420 in index_fd ../object-file.c:2778:9
        #8 0x555555ddad76 in index_path ../object-file.c:2796:7
        #9 0x555555e947f3 in add_to_index ../read-cache.c:771:7
        #10 0x555555e954a4 in add_file_to_index ../read-cache.c:804:9
        #11 0x5555558b5c39 in add_files ../builtin/add.c:355:7
        #12 0x5555558b412e in cmd_add ../builtin/add.c:578:18
        #13 0x555555b1f493 in run_builtin ../git.c:480:11
        #14 0x555555b1bfef in handle_builtin ../git.c:740:9
        #15 0x555555b1e6f4 in run_argv ../git.c:807:4
        #16 0x555555b1b87a in cmd_main ../git.c:947:19
        #17 0x5555561649e6 in main ../common-main.c:64:11
        #18 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #19 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #20 0x555555772c84 in _start (git+0x21ec84)

    ==1367==Register values:
    rax = 0x0000511000001080  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
    rdi = 0x0000000000000000  rsi = 0x0000507000000db0  rbp = 0x0000507000000db0  rsp = 0x00007fffffff5690
     r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
    r12 = 0x0000000000000000  r13 = 0x00007fffffff6b38  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
    ==1367==ABORTING
    ./test-lib.sh: line 1023:  1367 Aborted                 git $config add large1
    error: last command exited with $?=134
    not ok 4 - add with -c core.compression=0

Fix the issue by using the unsafe variant instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:46:29 -08:00
0ad3d65652 object-file: fix race in object collision check
One of the tests in t5616 asserts that git-fetch(1) with `--refetch`
triggers repository maintenance with the correct set of arguments. This
test is flaky and causes us to fail sometimes:

    ++ git -c protocol.version=0 -c gc.autoPackLimit=0 -c maintenance.incremental-repack.auto=1234 -C pc1 fetch --refetch origin
    error: unable to open .git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack: No such file or directory
    fatal: unable to rename temporary file to '.git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack'
    fatal: could not finish pack-objects to repack local links
    fatal: index-pack failed
    error: last command exited with $?=128

The error message is quite confusing as it talks about trying to rename
a temporary packfile. A first hunch would thus be that this packfile
gets written by git-fetch(1), but removed by git-maintenance(1) while it
hasn't yet been finalized, which shouldn't ever happen. And indeed, when
looking closer one notices that the file that is supposedly of temporary
nature does not have the typical `tmp_pack_` prefix.

As it turns out, the "unable to rename temporary file" fatal error is a
red herring and the real error is "unable to open". That error is raised
by `check_collision()`, which is called by `finalize_object_file()` when
moving the new packfile into place. Because t5616 re-fetches objects, we
end up with the exact same pack as we already have in the repository. So
when the concurrent git-maintenance(1) process rewrites the preexisting
pack and unlinks it exactly at the point in time where git-fetch(1)
wants to check the old and new packfiles for equality we will see ENOENT
and thus `check_collision()` returns an error, which gets bubbled up by
`finalize_object_file()` and is then handled by `rename_tmp_packfile()`.
That function does not know about the exact root cause of the error and
instead just claims that the rename has failed.

This race is thus caused by b1b8dfde69 (finalize_object_file():
implement collision check, 2024-09-26), where we have newly introduced
the collision check.

By definition, two files cannot collide with each other when one of them
has been removed. We can thus trivially fix the issue by ignoring ENOENT
when opening either of the files we're about to check for collision.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:35:50 -08:00
7a8d9efc26 grep: work around LSan threading race with barrier
There's a race with LSan when spawning threads and one of the threads
calls die(). We worked around one such problem with index-pack in the
previous commit, but it exists in git-grep, too. You can see it with:

  make SANITIZE=leak THREAD_BARRIER_PTHREAD=YesOnLinux
  cd t
  ./t0003-attributes.sh --stress

which fails pretty quickly with:

  ==git==4096424==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 32 byte(s) in 1 object(s) allocated from:
      #0 0x7f906de14556 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
      #1 0x7f906dc9d2c1 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
      #2 0x7f906de2500d in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
      #3 0x7f906de25187 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:614
      #4 0x7f906de17d18 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:53
      #5 0x7f906de143a9 in ThreadStartFunc<false> ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:431
      #6 0x7f906dc9bf51 in start_thread nptl/pthread_create.c:447
      #7 0x7f906dd1a677 in __clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

As with the previous commit, we can fix this by inserting a barrier that
makes sure all threads have finished their setup before continuing. But
there's one twist in this case: the thread which calls die() is not one
of the worker threads, but the main thread itself!

So we need the main thread to wait in the barrier, too, until all
threads have gotten to it. And thus we initialize the barrier for
num_threads+1, to account for all of the worker threads plus the main
one.

If we then test as above, t0003 should run indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:18:58 -08:00
526c0a851b index-pack: work around LSan threading race with barrier
We sometimes get false positives from our linux-leaks CI job because of
a race in LSan itself. The problem is that one thread is still
initializing its stack in LSan's code (and allocating memory to do so)
while anothe thread calls die(), taking down the whole process and
triggering a leak check.

The problem is described in more detail in 993d38a066 (index-pack: spawn
threads atomically, 2024-01-05), which tried to fix it by pausing worker
threads until all calls to pthread_create() had completed. But that's
not enough to fix the problem, because the LSan setup code runs in the
threads themselves. So even though pthread_create() has returned, we
have no idea if all threads actually finished their setup before letting
any of them do real work.

We can fix that by using a barrier inside the threads themselves,
waiting for all of them to hit the start of their main function before
any of them proceed.

You can test for the race by running:

  make SANITIZE=leak THREAD_BARRIER_PTHREAD=YesOnLinux
  cd t
  ./t5309-pack-delta-cycles.sh --stress

which fails quickly before this patch, and should run indefinitely
without it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:18:58 -08:00
7d0037b59a thread-utils: introduce optional barrier type
One thread primitive we don't yet support is a barrier: it waits for all
threads to reach a synchronization point before letting any of them
continue. This would be useful for avoiding the LSan race we see in
index-pack (and other places) by having all threads complete their
initialization before any of them start to do real work.

POSIX introduced a pthread_barrier_t in 2004, which does what we want.
But if we want to rely on it:

  1. Our Windows pthread emulation would need a new set of wrapper
     functions. There's a Synchronization Barrier primitive there, which
     was introduced in Windows 8 (which is old enough for us to depend
     on).

  2. macOS (and possibly other systems) has pthreads but not
     pthread_barrier_t. So there we'd have to implement our own barrier
     based on the mutex and cond primitives.

Those are do-able, but since we only care about avoiding races in our
LSan builds, there's an easier way: make it a noop on systems without a
native pthread barrier.

This patch introduces a "maybe_thread_barrier" API. The clunky name
(rather than just using pthread_barrier directly) should hopefully clue
people in that on some systems it will do nothing. It's wired to a
Makefile knob which has to be triggered manually, and we enable it for
the linux-leaks CI jobs (since we know we'll have it there).

There are some other possible options:

  - we could turn it on all the time for Linux systems based on uname.
    But we really only care about it for LSan builds, and there is no
    need to add extra code to regular builds.

  - we could turn it on only for LSan builds. But that would break
    builds on non-Linux platforms (like macOS) that otherwise should
    support sanitizers.

  - we could trigger only on the combination of Linux and LSan together.
    This isn't too hard to do, but the uname check isn't completely
    accurate. It is really about what your libc supports, and non-glibc
    systems might not have it (though at least musl seems to).

    So we'd risk breaking builds on those systems, which would need to
    add a new knob. Though the upside would be that running local "make
    SANITIZE=leak test" would be protected automatically.

And of course none of this protects LSan runs from races on systems
without pthread barriers. It's probably OK in practice to protect only
our CI jobs, though. The race is rare-ish and most leak-checking happens
through CI.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:18:57 -08:00
ca9d60f246 Revert "index-pack: spawn threads atomically"
This reverts commit 993d38a066.

That commit was trying to solve a race between LSan setting up the
threads stack and another thread calling exit(), by making sure that all
pthread_create() calls have finished before doing any work that might
trigger the exit().

But that isn't sufficient. The setup code actually runs in the
individual threads themselves, not in the spawning thread's call to
pthread_create(). So while it may have improved the race a bit, you can
still trigger it pretty quickly with:

  make SANITIZE=leak
  cd t
  ./t5309-pack-delta-cycles.sh --stress

Let's back out that failed attempt so we can try again.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:18:57 -08:00
d601aee605 test-lib: use individual lsan dir for --stress runs
When storing output in test-results/, we usually give each numbered run
in a --stress set its own output file. But we don't do that for storing
LSan logs, so something like:

  ./t0003-attributes.sh --stress

will have many scripts simultaneously creating, writing to, and deleting
the test-results/t0003-attributes.leak directory. This can cause logs
from one run to be attributed to another, spurious failures when
creation and deletion race, and so on.

This has always been broken, but nobody noticed because it's rare to do
a --stress run with LSan (since the point is for the code to run quickly
many times in order to hit races). But if you're trying to find a race
in the leak sanitizing code, it makes sense to use these together.

We can fix it by using $TEST_RESULTS_BASE, which already incorporates
the stress job suffix.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:18:57 -08:00
956b486cac l10n: sv.po: Update Swedish translation
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2024-12-30 12:04:46 +01:00
31f5549c28 l10n: fr: v2.48.0
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2024-12-29 19:47:30 +01:00
306ab352f4 Merge branch 'ps/meson-test-wo-gitweb'
meson-based build without GitWeb failed the self tests.

* ps/meson-test-wo-gitweb:
  meson: enable auto-discovered "gitweb"
  GIT-BUILD-OPTIONS: wire up NO_GITWEB option
  GIT-BUILD-OPTIONS: sort variables alphabetically
2024-12-28 12:20:35 -08:00
df2faf1a65 Merge branch 'as/gitk-git-gui-repo-update'
The developer documentation has been updated to give the latest
info on gitk and git-gui maintainer.

* as/gitk-git-gui-repo-update:
  Update the official repo of gitk
2024-12-28 10:11:42 -08:00
1e78120928 t-reftable-merged: handle realloc errors
Check reallocation errors in unit tests, like everywhere else.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28 08:00:45 -08:00
e4981ed1e7 reftable: handle realloc error in parse_names()
Check the final reallocation for adding the terminating NULL and handle
it just like those in the loop.  Simply use REFTABLE_ALLOC_GROW instead
of keeping the REFTABLE_REALLOC_ARRAY call and adding code to preserve
the original pointer value around it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28 08:00:44 -08:00
2cca185e85 reftable: fix allocation count on realloc error
When realloc(3) fails, it returns NULL and keeps the original allocation
intact.  REFTABLE_ALLOC_GROW overwrites both the original pointer and
the allocation count variable in that case, simultaneously leaking the
original allocation and misrepresenting the number of storable items.

parse_names() avoids the leak by keeping the original pointer if
reallocation fails, but still increase the allocation count in such a
case as if it succeeded.  That's OK, because the error handling code
just frees everything and doesn't look at names_cap anymore.

reftable_buf_add() does the same, but here it is a problem as it leaves
the reftable_buf in a broken state, with ->alloc being roughly twice as
big as the actually allocated memory, allowing out-of-bounds writes in
subsequent calls.

Reimplement REFTABLE_ALLOC_GROW to avoid leaks, keep allocation counts
in sync and still signal failures to callers while avoiding code
duplication in callers.  Make it an expression that evaluates to 0 if no
reallocation is needed or it succeeded and 1 on failure while keeping
the original pointer and allocation counter values.

Adjust REFTABLE_ALLOC_GROW_OR_NULL to the new calling convention for
REFTABLE_ALLOC_GROW, but keep its support for non-size_t alloc variables
for now.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28 08:00:44 -08:00
8db127d43f reftable: avoid leaks on realloc error
When realloc(3) fails, it returns NULL and keeps the original allocation
intact.  REFTABLE_ALLOC_GROW overwrites both the original pointer and
the allocation count variable in that case, simultaneously leaking the
original allocation and misrepresenting the number of storable items.

parse_names() and reftable_buf_add() avoid leaking by restoring the
original pointer value on failure, but all other callers seem to be OK
with losing the old allocation.  Add a new variant of the macro,
REFTABLE_ALLOC_GROW_OR_NULL, which plugs the leak and zeros the
allocation counter.  Use it for those callers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28 08:00:44 -08:00
2c3ca00b48 l10n: zh_TW: Git 2.48 round 2
Co-authored-by: Lumynous <lumynou5.tw@gmail.com>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2024-12-28 13:24:48 +08:00
ffbd89cbb7 l10n: zh_TW: Git 2.48
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2024-12-28 13:15:42 +08:00
40fdd46b7f l10n: bg.po: Updated Bulgarian translation (5804t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2024-12-27 22:42:29 +01:00
24027256aa sign-compare: avoid comparing ptrdiff with an int/unsigned
Instead, offset the base pointer with integer and compare it with
the other pointer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 12:25:30 -08:00
5419445b4d Documentation: wire up sanity checks for Meson
Wire up sanity checks for Meson to verify that no man pages are missing.
This check is similar to the same check we already have for our tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:12 -08:00
d8af27d309 t/Makefile: make "check-meson" work with Dash
The "check-meson" target uses process substitution to check whether
extracted contents from "meson.build" match expected contents. Process
substitution is unportable though and thus the target will fail when
using for example Dash.

Fix this by writing data into a temporary directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:11 -08:00
7a3136e5c7 meson: install static files for HTML documentation
Now that we generate man pages, articles and user manual with Meson the
only thing that is still missing in an installation of HTML documents is
a couple of static files. Wire these up to finalize Meson's support for
generating HTML documentation.

Diffing an installation that uses our Makefile with an installation that
uses Meson only surfaces a couple of discepancies now:

  - Meson doesn't install "everyday.html" and "git-remote-helpers.html".
    These files are marked as obsolete and don't contain any useful
    information anymore: they simply point to their modern equivalents.

  - Meson doesn't install "*.txt" files when asking for HTML docs. I'm
    not sure why our Makefiles do this in the first place, and it does
    seem like the resulting installation is fully functional even
    without those files.

Other than that, both layout and file contents are the exact same.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:11 -08:00
bcf7edee09 meson: generate articles
While the Meson build system already knows to generate man pages and our
user manual, it does not yet generate the random assortment of articles
that we have. Plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:11 -08:00
8922506cb2 Documentation: refactor "howto-index.sh" for out-of-tree builds
The "howto-index.sh" is used to generate an index of our how-to docs. It
receives as input the paths to these documents, which would typically be
relative to the "Documentation/" directory in Makefile-based builds. In
an out-of-tree build though it will get relative that may be rooted
somewhere else entirely.

The file paths do end up in the generated index, and the expectation is
that they should always start with "howto/". But for out-of-tree builds
we would populate it with the paths relative to the build directory,
which is wrong.

Fix the issue by using `$(basename "$file")` to generate the path. While
at it, move the script into "howto/" to align it with the location of
the comparable "api-index.sh" script.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:11 -08:00
88e08b92e9 Documentation: refactor "api-index.sh" for out-of-tree builds
The "api-index.sh" script generates an index of API-related
documentation. The script does not handle out-of-tree builds and thus
cannot be used easily by Meson.

Refactor it to be independent of locations by both accepting a source
directory where the API docs live as well as a path to an output file.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:10 -08:00
ae0b33939d meson: generate user manual
Our documentation contains a user manual that gives people a short
introduction to Git. Our Makefile knows to generate the manual into
three different formats: an HTML page, a PDF and an info page. The Meson
build instructions don't yet generate any of these.

While wiring up all these formats I hit a couple of road blocks with how
we generate our info pages. Even though I eventually resolved these, it
made me question whether anybody actually uses info pages in the first
place. Checking through a couple of downstream consumers I couldn't find
a single user of either the info pages nor of our PDF manual in Arch
Linux, Debian, Fedora, Ubuntu, FreeBSD or OpenBSDFedora. So it's rather
safe to assume that there aren't really any users out there, and thus
the added complexity does not seem worth it.

Wire up support for building the user manual in HTML format and
conciously skip over the other two formats. This is basically a form of
silent deprecation: if people out there use the other two formats they
will eventually complain about them missing in Meson, which means we can
wire them up at a later point. If they don't we can phase out these
formats eventually.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:10 -08:00
851ecc4290 Documentation: inline user-manual.conf
When generating our user manual we set up a bit of extra configuration
compared to our normal configuration. This is done by having an extra
"user-manual.conf" file that Asciidoc seems to pull in automatically due
to matching filenames with "user-manual.txt". This dependency is quite
hidden though and thus easy to miss. Furthermore, it seems that Asciidoc
does not know to pull it in for out-of-tree builds where we use relative
paths.

The setup in AsciiDoctor is somewhat different: instead of having two
sets of configuration, we condition the use of manual-specific configs
based on whether the document type is "book". And as we only build our
user manual with that type this is sufficient.

Use the same trick for our user manual by inlining the configuration
into "asciidoc.conf.in" and making it conditional on whether or not
"doctype-book" is defined.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:10 -08:00
0696ebe9ce meson: generate HTML pages for all man page categories
When generating HTML pages for our man pages we only generate them for
category 1 in Meson, which are the pages corresponding to our built-in
commands. I cannot tell why I added this filter though: our Makefile
installs all man pages, so a Meson-based build misses out on many of
them.

Fix this by removing the filter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:10 -08:00
b88540045c meson: fix generation of merge tools
Our buildsystems generate a list of diff and merge tools that ultimately
end up in our documentation. And while Meson does wire up the logic, it
tries to use the TOOL_MODE environment variable to set up the mode. This
is wrong though: the mode is set via an argument that we have fixed to
'diff' mode by accident.

Fix this such that merge tools are properly generated.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:09 -08:00
2a8bd34c55 meson: properly wire up dependencies for our docs
A couple of Meson documentation targets use `meson.current_source_dir()`
to resolve inputs. This has the downside that it does not automagically
make Meson track these inputs as a dependency. After all, string
arguments really can be anything, even if they happen to match an actual
filesystem path.

Adapt these build targets to instead use inputs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:09 -08:00
d838d821c9 meson: wire up support for AsciiDoctor
While our Makefile supports both Asciidoc and AsciiDoctor, our Meson
build instructions only support the former. Wire up support for the
latter, as well.

Our Makefile always favors Asciidoc, but Meson will automatically figure
out which of both to use based on whether they are installed or not. To
keep compatibility with our Makefile it favors Asciidoc over Asciidoctor
in case both are available.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:09 -08:00
d963ac98ec meson: enable auto-discovered "gitweb"
In 7d549fe317 (meson: skip gitweb build when Perl is disabled,
2024-12-20) we have started to conditionally enable "gitweb" based on
whether or not Perl is enabled. By accident though that change causes us
to not build gitweb in case its feature flag is set to "auto" even if
autoconfiguration determines that it could be built. This is because we
use "gitweb_option.enabled()", which only checks whether the feature has
been explicitly enabled.

Fix the issue by using `gitweb_option.allowed()` instead, which returns
true in case it is either explicitly enabled or set to "auto". This also
works for the case where the feature becomes auto-disabled due to Perl
not being present because we use `disable_auto_if(not perl.found())`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:17:19 -08:00
cbcc2f7911 GIT-BUILD-OPTIONS: wire up NO_GITWEB option
Building our "gitweb" interface is optional in our Makefile and in Meson
and not wired up at all with CMake, but disabling it causes a couple of
tests in the t950* range that pull in "t/lib-gitweb.sh". This is because
the test library knows to execute gitweb-tests based on whether or not
Perl is available, but we may have Perl available and still end up not
building gitweb e.g. with `make test NO_GITWEB=YesPlease`.

Fix this issue by wiring up a new "NO_GITWEB" build option so that we
can skip these tests in case gitweb is not built.

Note that this new build option requires us to move the configuration of
GIT-BUILD-OPTIONS to a later point in our Meson build instructions. But
as that file is only consumed by our tests at runtime this change does
not cause any issues.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:17:19 -08:00
cfa1f2ae96 GIT-BUILD-OPTIONS: sort variables alphabetically
The variables declared and substituted in GIT-BUILD-OPTIONS are not
ordered in any obvious way. Sort them alphabetically so that it becomes
obvious where new variables should go.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:17:19 -08:00
cef3d4a89f t7611: replace test -f with test_path_is* helpers
Replace `test -f` and `test ! -f` with `test_path_is_file` and
`test_path_is_missing` for better debuggability.

While `test -f` ensures that the file exists and is a regular file,
`test_path_is_file` provides clearer error messages on failure.

On the other hand, `test ! -f` checks either the absence of a regular
file or the presence of any other filesystem object, but looking at
them in the test individually, all of them should've said `test ! -e`,
i.e. "there shouldn't be anything at given path on filesystem."

Replace these cases with `test_path_is_missing` for better
debuggability.

Helped-by: karthik nayak <karthik.188@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:13:59 -08:00
5e7fe8a7b8 commit-reach: use size_t to track indices when computing merge bases
The functions `repo_get_merge_bases_many()` and friends accepts an array
of commits as well as a parameter that indicates how large that array
is. This parameter is using a signed integer, which leads to a couple of
warnings with -Wsign-compare.

Refactor the code to use `size_t` to track indices instead and adapt
callers accordingly. While most callers are trivial, there are two
callers that require a bit more scrutiny:

  - builtin/merge-base.c:show_merge_base() subtracts `1` from the
    `rev_nr` before calling `repo_get_merge_bases_many_dirty()`, so if
    the variable was `0` it would wrap. This code is fine though because
    its only caller will execute that code only when `argc >= 2`, and it
    follows that `rev_nr >= 2`, as well.

  - bisect.ccheck_merge_bases() similarly subtracts `1` from `rev_nr`.
    Again, there is only a single caller that populates `rev_nr` with
    `good_revs.nr`. And because a bisection always requires at least one
    good revision it follws that `rev_nr >= 1`.

Mark the file as -Wsign-compare-clean.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:12:40 -08:00
455ac07021 shallow: fix -Wsign-compare warnings
Fix a couple of -Wsign-compare issues in "shallow.c" and mark the file
as -Wsign-compare-clean. This change prepares the code for a refactoring
of `repo_in_merge_bases_many()`, which will be adapted to accept the
number of commits as `size_t` instead of `int`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:12:40 -08:00
1ab5948141 builtin/log: fix remaining -Wsign-compare warnings
Fix remaining -Wsign-compare warnings in "builtin/log.c" and mark the
file as -Wsign-compare-clean. While most of the fixes are obvious, one
fix requires us to use `cast_size_t_to_int()`, which will cause us to
die in case the `size_t` cannot be represented as `int`. This should be
fine though, as the data would typically be set either via a config key
or via the command line, neither of which should ever exceed a couple of
kilobytes of data.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:11:46 -08:00
0905ed201a builtin/log: use size_t to track indices
Similar as with the preceding commit, adapt "builtin/log.c" so that it
tracks array indices via `size_t` instead of using signed integers. This
fixes a couple of -Wsign-compare warnings and prepares the code for
a similar refactoring of `repo_get_merge_bases_many()` in a subsequent
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:11:46 -08:00
85ee0680e2 commit-reach: use size_t to track indices in get_reachable_subset()
Similar as with the preceding commit, adapt `get_reachable_subset()` so
that it tracks array indices via `size_t` instead of using signed
integers to fix a couple of -Wsign-compare warnings. Adapt callers
accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:11:45 -08:00
45843d8f4e commit-reach: use size_t to track indices in remove_redundant()
The function `remove_redundant()` gets as input an array of commits as
well as the size of that array and then drops redundant commits from
that array. It then returns either `-1` in case an error occurred, or
the new number of items in the array.

The function receives and returns these sizes with a signed integer,
which causes several warnings with -Wsign-compare. Fix this issue by
consistently using `size_t` to track array indices and splitting up
the returned value into a returned error code and a separate out pointer
for the new computed size.

Note that `get_merge_bases_many()` and related functions still track
array sizes as a signed integer. This will be fixed in a subsequent
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:11:45 -08:00
04aeeeaab1 commit-reach: fix type of min_commit_date
The `can_all_from_reach_with_flag()` function accepts a parameter that
allows callers to cut off traversal at a specified commit date. This
parameter is of type `time_t`, which is a signed type, while we end up
comparing it to a commit's `date` field, which is of the unsigned type
`timestamp_t`.

Fix the parameter to be of type `timestamp_t`. There is only a single
caller in "upload-pack.c" that sets this parameter, and that caller
knows to pass in a `timestamp_t` already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:11:45 -08:00
95c09e4d07 commit-reach: fix index used to loop through unsigned integer
In 62e745ced2 (prio-queue: use size_t rather than int for size,
2024-12-20), we refactored `struct prio_queue` to track the number of
contained entries via a `size_t`. While the refactoring adapted one of
the users of that variable, it forgot to also adapt "commit-reach.c"
accordingly. This was missed because that file has -Wsign-conversion
disabled.

Fix the issue by using a `size_t` to iterate through entries.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:11:15 -08:00
44945dfe86 prio-queue: fix type of insertion_ctr
In 62e745ced2 (prio-queue: use size_t rather than int for size,
2024-12-20), we have converted `struct prio_queue` to use `size_t` to
track the number of entries in the queue as well as the allocated size
of the underlying array. There is one more counter though, namely the
insertion counter, that is still using an `unsigned` instead of a
`size_t`. This is unlikely to ever be a problem, but it makes one wonder
why some indices use `size_t` while others use `unsigned`. Furthermore,
the mentioned commit stated the intent to also adapt these variables,
but seemingly forgot to do so.

Fix the issue by converting those counters to use `size_t`, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:10:41 -08:00
d11d003ba5 date.c: Fix type missmatch warings from msvc
Fix compiler warings from msvc in date.c for value truncation from 64
bit to 32 bit integers.

Also switch from int to size_t for all variables with result of strlen()
which cannot become negative.

Signed-off-by: Sören Krecker <soekkle@freenet.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-26 13:34:28 -08:00
b59358100c Update the official repo of gitk
Point out:
- current maintaner
- contribution flow is via the mailing list

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-26 08:06:42 -08:00
76cf4f61c8 Merge https://github.com/j6t/git-gui
* 'master' of https://github.com/j6t/git-gui:
  git-gui: use system encoding to show console output
  git-gui: Remove forced rescan of stat-dirty files.
2024-12-26 08:02:23 -08:00
e76b53ef23 gitk: Update Bulgarian translation (327t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-24 11:58:09 +01:00
996f0c583b Hopefully the final batch before 2.48-rc1
Let's wait for git-gui, gitk, and possibly po/ and delay the tagging
of the -rc1.  Many people are already offline for the end-of-year
holidays and it is a slow week, and 'master' front has too many new
things graduated from 'next' a bit too early for me to feel
comfortable.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23 10:13:58 -08:00
6f8ae955bd Merge branch 'kn/reflog-migration'
"git refs migrate" learned to also migrate the reflog data across
backends.

* kn/reflog-migration:
  refs: mark invalid refname message for translation
  refs: add support for migrating reflogs
  refs: allow multiple reflog entries for the same refname
  refs: introduce the `ref_transaction_update_reflog` function
  refs: add `committer_info` to `ref_transaction_add_update()`
  refs: extract out refname verification in transactions
  refs/files: add count field to ref_lock
  refs: add `index` field to `struct ref_udpate`
  refs: include committer info in `ref_update` struct
2024-12-23 09:32:29 -08:00
f74eae3e47 Merge branch 'ma/asciidoctor-build-fixes'
A topic to optionally build with meson, which has graduated to
'master' recently, broke Documentation pipeline with asciidoctor
for the normal Makefile build as well as meson-based one, which
have been corrected.

* ma/asciidoctor-build-fixes:
  asciidoctor-extensions.rb.in: inject GIT_DATE
  asciidoctor-extensions.rb.in: add missing word
  asciidoctor-extensions.rb.in: delete existing <refmiscinfo/>
2024-12-23 09:32:27 -08:00
f074cdea46 Merge branch 'ps/build-hotfix'
A topic to optionally build with meson, which has graduated to
'master' recently, has regressed the normal Makefile build, which
is being corrected.

* ps/build-hotfix:
  meson: add options to override build information
  GIT-VERSION-GEN: fix overriding GIT_BUILT_FROM_COMMIT and GIT_DATE
  GIT-VERSION-GEN: fix overriding GIT_VERSION
  Makefile: introduce template for GIT-VERSION-GEN
  Makefile: drop unneeded indirection for GIT-VERSION-GEN outputs
  Makefile: stop including "GIT-VERSION-FILE" in docs
2024-12-23 09:32:26 -08:00
83c8f76235 Merge branch 'ps/ci-meson'
The meson-build procedure is integrated into CI to catch and
prevent bitrotting.

* ps/ci-meson:
  ci: wire up Meson builds
  t: introduce compatibility options to clar-based tests
  t: fix out-of-tree tests for some git-p4 tests
  Makefile: detect missing Meson tests
  meson: detect missing tests at configure time
  t/unit-tests: rename clar-based unit tests to have a common prefix
  Makefile: drop -DSUPPRESS_ANNOTATED_LEAKS
  ci/lib: support custom output directories when creating test artifacts
2024-12-23 09:32:25 -08:00
e9a4054320 Merge branch 'kl/doc-build-fix'
Build fix.

* kl/doc-build-fix:
  doc: remove extra quotes in generated docs
2024-12-23 09:32:23 -08:00
a08ebf8b3e Merge branch 'tb/bitmap-fix-pack-reuse'
Code to reuse objects based on bitmap contents have been tightened
to avoid race condition even when multiple packs are involved.

* tb/bitmap-fix-pack-reuse:
  pack-bitmap.c: ensure pack validity for all reuse packs
2024-12-23 09:32:22 -08:00
8650022fab Merge branch 'jk/prio-queue-sign-compare-fix'
Type clean-up.

* jk/prio-queue-sign-compare-fix:
  prio-queue: use size_t rather than int for size
2024-12-23 09:32:21 -08:00
77825f7553 Merge branch 'ps/build-meson-gitweb'
meson-based build still tried to build and install gitweb even when
Perl is disabled, which has been corrected.

* ps/build-meson-gitweb:
  meson: skip gitweb build when Perl is disabled
2024-12-23 09:32:19 -08:00
77edd59394 Merge branch 'sk/calloc-not-malloc-plus-memset'
Code clean-up.

* sk/calloc-not-malloc-plus-memset:
  git: use calloc instead of malloc + memset where possible
2024-12-23 09:32:18 -08:00
88e59f8027 Merge branch 'js/range-diff-diff-merges'
"git range-diff" learned to optionally show and compare merge
commits in the ranges being compared, with the --diff-merges
option.

* js/range-diff-diff-merges:
  range-diff: introduce the convenience option `--remerge-diff`
  range-diff: optionally include merge commits' diffs in the analysis
2024-12-23 09:32:17 -08:00
c4cc685a62 Merge branch 'js/mingw-rename-fix'
Update the way rename() emulation on Windows handle directories to
correct an earlier attempt to do the same.

* js/mingw-rename-fix:
  mingw_rename: do support directory renames
2024-12-23 09:32:16 -08:00
bad5d1ad25 Merge branch 'js/github-windows-setup-fix'
Revert recent changes to the way windows environment is set up for
GitHub CI.

* js/github-windows-setup-fix:
  GitHub ci(windows): speed up initializing Git for Windows' minimal SDK again
2024-12-23 09:32:15 -08:00
8cad35f353 Merge branch 'js/ps-build-cmake-fixup'
Build fixes for Windows.

* js/ps-build-cmake-fixup:
  cmake/vcxproj: stop special-casing `remote-ext`
  cmake: put the Perl modules into the correct location again
  cmake: use the correct file name for the Perl header
  cmake(mergetools): better support for out-of-tree builds
  cmake: better support for out-of-tree builds follow-up
2024-12-23 09:32:13 -08:00
002a8a9d36 Merge branch 'as/show-index-uninitialized-hash'
Regression fix for 'show-index' when run outside of a repository.

* as/show-index-uninitialized-hash:
  t5300: add test for 'show-index --object-format'
  show-index: fix uninitialized hash function
2024-12-23 09:32:12 -08:00
4156b6a741 Merge branch 'ps/build-sign-compare'
Start working to make the codebase buildable with -Wsign-compare.

* ps/build-sign-compare:
  t/helper: don't depend on implicit wraparound
  scalar: address -Wsign-compare warnings
  builtin/patch-id: fix type of `get_one_patchid()`
  builtin/blame: fix type of `length` variable when emitting object ID
  gpg-interface: address -Wsign-comparison warnings
  daemon: fix type of `max_connections`
  daemon: fix loops that have mismatching integer types
  global: trivial conversions to fix `-Wsign-compare` warnings
  pkt-line: fix -Wsign-compare warning on 32 bit platform
  csum-file: fix -Wsign-compare warning on 32-bit platform
  diff.h: fix index used to loop through unsigned integer
  config.mak.dev: drop `-Wno-sign-compare`
  global: mark code units that generate warnings with `-Wsign-compare`
  compat/win32: fix -Wsign-compare warning in "wWinMain()"
  compat/regex: explicitly ignore "-Wsign-compare" warnings
  git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-23 09:32:11 -08:00
f7c607fac3 Merge branch 'kn/reftable-writer-log-write-verify'
Reftable backend adds check for upper limit of log's update_index.

* kn/reftable-writer-log-write-verify:
  reftable/writer: ensure valid range for log's update_index
2024-12-23 09:32:08 -08:00
19fbad7918 Merge branch 'ps/ci-gitlab-update'
GitLab CI updates.

* ps/ci-gitlab-update:
  ci/lib: fix "CI setup" sections with GitLab CI
  ci/lib: do not interpret escape sequences in `group ()` arguments
  ci/lib: remove duplicate trap to end "CI setup" group
  gitlab-ci: update macOS images to Sonoma
2024-12-23 09:32:07 -08:00
3151e6a121 Merge branch 'ps/reftable-alloc-failures-zalloc-fix'
Recent reftable updates mistook a NULL return from a request for
0-byte allocation as OOM and died unnecessarily, which has been
corrected.

* ps/reftable-alloc-failures-zalloc-fix:
  reftable/basics: return NULL on zero-sized allocations
  reftable/stack: fix zero-sized allocation when there are no readers
  reftable/merged: fix zero-sized allocation when there are no readers
  reftable/stack: don't perform auto-compaction with less than two tables
2024-12-23 09:32:06 -08:00
f37c6dd44e git-gui i18n: Updated Bulgarian translation (579t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-23 18:05:14 +01:00
d7282891f5 reftable/basics: return NULL on zero-sized allocations
In the preceding commits we have fixed a couple of issues when
allocating zero-sized objects. These issues were masked by
implementation-defined behaviour. Quoting malloc(3p):

  If size is 0, either:

    * A null pointer shall be returned and errno may be set to an
      implementation-defined value, or

    * A pointer to the allocated space shall be returned. The
      application shall ensure that the pointer is not used to access an
      object.

So it is perfectly valid that implementations of this function may or
may not return a NULL pointer in such a case.

Adapt both `reftable_malloc()` and `reftable_realloc()` so that they
return NULL pointers on zero-sized allocations. This should remove any
implementation-defined behaviour in our allocators and thus allows us to
detect such platform-specific issues more easily going forward.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-22 00:58:23 -08:00
2d3cb4b4b5 reftable/stack: fix zero-sized allocation when there are no readers
Similar as the preceding commit, we may try to do a zero-sized
allocation when reloading a reftable stack that ain't got any tables.
It is implementation-defined whether malloc(3p) returns a NULL pointer
in that case or a zero-sized object. In case it does return a NULL
pointer though it causes us to think we have run into an out-of-memory
situation, and thus we return an error.

Fix this by only allocating arrays when they have at least one entry.

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-22 00:58:23 -08:00
5ab83521cf reftable/merged: fix zero-sized allocation when there are no readers
It was reported [1] that Git started to fail with an out-of-memory error
when initializing repositories with the reftable backend on NonStop
platforms. A bisect led to 802c0646ac (reftable/merged: handle
allocation failures in `merged_table_init_iter()`, 2024-10-02), which
changed how we allocate memory when initializing a merged table.

The root cause of this seems to be that NonStop returns a `NULL` pointer
when doing a zero-sized allocation. This would've already happened
before the above change, but we never noticed because we did not check
the result. Now we do notice and thus return an out-of-memory error to
the caller.

Fix the issue by skipping the allocation altogether in case there are no
readers.

[1]: <00ad01db5017$aa9ce340$ffd6a9c0$@nexbridge.com>

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-22 00:58:23 -08:00
8e27ee9220 reftable/stack: don't perform auto-compaction with less than two tables
In order to compact tables we need at least two tables. Bail out early
from `reftable_stack_auto_compact()` in case we have less than two
tables.

In the original, `stack_table_sizes_for_compaction()` yields an array
that has the same length as the number of tables. This array is then
passed on to `suggest_compaction_segment()`, which returns an empty
segment in case we have less than two tables. The segment is then passed
to `segment_size()`, which will return `0` because both start and end of
the segment are `0`. And because we only call `stack_compact_range()` in
case we have a positive segment size we don't perform auto-compaction at
all. Consequently, this change does not result in a user-visible change
in behaviour when called with a single table.

But when called with no tables this protects us against a potential
out-of-memory error: `stack_table_sizes_for_compaction()` would try to
allocate a zero-byte object when there aren't any tables, and that may
lead to a `NULL` pointer on some platforms like NonStop which causes us
to bail out with an out-of-memory error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-22 00:58:23 -08:00
5c95773eac Merge branch 'js/no-rescan-on-empty-diff'
* js/no-rescan-on-empty-diff:
  git-gui: Remove forced rescan of stat-dirty files.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-21 14:06:33 +01:00
beb8081f31 asciidoctor-extensions.rb.in: inject GIT_DATE
After a38edab7c8 (Makefile: generate doc versions via GIT-VERSION-GEN,
2024-12-06), we no longer inject GIT_DATE when building with
Asciidoctor.

Replace the <date/> tag in the XML to inject the value of GIT_DATE.
Unlike <refmiscinfo/> as handled in a recent commit, we have no reason
to expect that this tag might be missing, so there's no need for "maybe
remove, then add" and we can just outright replace the one that
Asciidoctor has generated based on the mtime of the source file.

Compared to pre-a38edab7c8, we now end up injecting this also in the
build of Git.3pm, which until now has been using the mtime of Git.pm.
That is arguably even a good change since it results in more
reproducible builds.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 17:34:35 -08:00
c683924d06 asciidoctor-extensions.rb.in: add missing word
Commit a38edab7c8 (Makefile: generate doc versions via GIT-VERSION-GEN,
2024-12-06) stopped providing an attribute value "Git $(GIT_VERSION)" to
asciidoc/Asciidoctor over the command line. Instead, we now provide the
attribute to asciidoc through a generated asciidoc.conf, where the value
is generated as "Git @GIT_VERSION@".

In the similar mechanism for Asciidoctor, we forgot the "Git" prefix.
Restore it.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 17:34:35 -08:00
298805c823 asciidoctor-extensions.rb.in: delete existing <refmiscinfo/>
After the recent a38edab7c8 (Makefile: generate doc versions via
GIT-VERSION-GEN, 2024-12-06), building with Asciidoctor results in
manpages where the headers no longer contain "Git Manual" and the
footers no longer identify the built Git version.

Before a38edab7c8, we used to just provide a few attributes to
Asciidoctor (and asciidoc). Commit 7a30134358 (asciidoctor-extensions:
provide `<refmiscinfo/>`, 2019-09-16) noted that older versions of
Asciidoctor didn't propagate those attributes into the built XML files,
so we started injecting them ourselves from this script. With newer
versions of Asciidoctor, we'd end up with some harmless duplication
among the tags in the final XML.

Post-a38edab7c8, we don't provide these attributes and Asciidoctor
inserts empty-ish values. After our additions from 7a30134358, we get

  <refmiscinfo class="source">&#160;</refmiscinfo>
  <refmiscinfo class="manual">&#160;</refmiscinfo>
  <refmiscinfo class="source">2.47.1.[...]</refmiscinfo>
  <refmiscinfo class="manual">Git Manual</refmiscinfo>

When these are handled, it appears to be first come first served,
meaning that our additions have no effect and we regress as described in
the first paragraph.

Remove existing "source" or "manual" <refmiscinfo/> tags before adding
ours. I considered removing all <refmiscinfo/> to get a nice clean
slate, instead of just those two that we want to replace to be a bit
more precise. I opted for the latter. Maybe one day, Asciidoctor learns
to insert something useful there which `xmlto` can pick up and make good
use of -- let's not interfere.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 17:34:35 -08:00
f5f82c0d5f Merge branch 'ps/build-hotfix' into ma/asciidoctor-build-fixes
* ps/build-hotfix:
  meson: add options to override build information
  GIT-VERSION-GEN: fix overriding GIT_BUILT_FROM_COMMIT and GIT_DATE
  GIT-VERSION-GEN: fix overriding GIT_VERSION
  Makefile: introduce template for GIT-VERSION-GEN
  Makefile: drop unneeded indirection for GIT-VERSION-GEN outputs
  Makefile: stop including "GIT-VERSION-FILE" in docs
2024-12-20 17:34:25 -08:00
49edce4ff9 show-index: the short help should say the command reads from its input
The short help text given by "git show-index -h" says

    $ git show-index -h
    usage: git show-index [--object-format=<hash-algorithm>]

        --[no-]object-format <hash-algorithm>
                              specify the hash algorithm to use

The command takes a pack .idx file from its standard input.  The
user has to _know_ this, as there is no indication from this output.

Give a hint that the data to work on is fed from its standard input.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 17:30:57 -08:00
1bc815c3d0 meson: add options to override build information
We inject various different kinds of build information into build
artifacts, like the version string or the commit from which Git was
built. Add options to let users explicitly override this information
with Meson.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 12:36:46 -08:00
cfa01e6da5 GIT-VERSION-GEN: fix overriding GIT_BUILT_FROM_COMMIT and GIT_DATE
Same as with the preceding commit, neither GIT_BUILT_FROM_COMMIT nor
GIT_DATE can be overridden via the environment. Especially the latter is
of importance given that we set it in our own "Documentation/doc-diff"
script.

Make the values of both variables overridable. Luckily we don't pull in
these values via any included Makefiles, so the fix is trivial compared
to the fix for GIT_VERSON.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 12:36:45 -08:00
992bc5618f GIT-VERSION-GEN: fix overriding GIT_VERSION
GIT-VERSION-GEN tries to derive the version that Git is being built from
via multiple different sources in the following order:

  1. A file called "version" in the source tree's root directory, if it
     exists.

  2. The current commit in case Git is built from a Git repository.

  3. Otherwise, we use a fallback version stored in a variable which is
     bumped whenever a new Git version is getting tagged.

It used to be possible to override the version by overriding the
`GIT_VERSION` Makefile variable (e.g. `make GIT_VERSION=foo`). This
worked somewhat by chance, only: `GIT-VERSION-GEN` would write the
actual Git version into `GIT-VERSION-FILE`, not the overridden value,
but when including the file into our Makefile we would not override the
`GIT_VERSION` variable because it has already been set by the user. And
because our Makefile used the variable to propagate the version to our
build tools instead of using `GIT-VERSION-FILE` the resulting build
artifacts used the overridden version.

But that subtle mechanism broke with 4838deab65 (Makefile: refactor
GIT-VERSION-GEN to be reusable, 2024-12-06) and subsequent commits
because the version information is not propagated via the Makefile
variable anymore, but instead via the files that `GIT-VERSION-GEN`
started to write. And as the script never knew about the `GIT_VERSION`
environment variable in the first place it uses one of the values listed
above instead of the overridden value.

Fix this issue by making `GIT-VERSION-GEN` handle the case where
`GIT_VERSION` has been set via the environment.

Note that this requires us to introduce a new GIT_VERSION_OVERRIDE
variable that stores a potential user-provided value, either via the
environment or via "config.mak". Ideally we wouldn't need it and could
just continue to use GIT_VERSION for this. But unfortunately, Makefiles
will first include all sub-Makefiles before figuring out whether it
needs to re-make any of them [1]. Consequently, if there already is a
GIT-VERSION-FILE, we would have slurped in its value of GIT_VERSION
before we call GIT-VERSION-GEN, and because GIT-VERSION-GEN now uses
that value as an override it would mean that the first generated value
for GIT_VERSION will remain unchanged.

Furthermore we have to move the include for "GIT-VERSION-FILE" after the
includes for "config.mak" and related so that GIT_VERSION_OVERRIDE can
be set to the value provided by "config.mak".

[1]: https://www.gnu.org/software/make/manual/html_node/Remaking-Makefiles.html

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 12:36:45 -08:00
114494ae2c Makefile: introduce template for GIT-VERSION-GEN
Introduce a new template to call GIT-VERSION-GEN. This will allow us to
iterate on how exactly the script is called in subsequent commits
without having to adapt all call sites every time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 12:36:45 -08:00
b329f2eb00 Makefile: drop unneeded indirection for GIT-VERSION-GEN outputs
Some of the callsites of GIT-VERSION-GEN generate the target file with a
"+" suffix first and then move the file into place when the new contents
are different compared to the old contents. This allows us to avoid a
needless rebuild by not updating timestamps of the target file when its
contents will remain unchanged anyway.

In fact though, this exact logic is already handled in GIT-VERSION-GEN,
so doing this manually is pointless. This is a leftover from an earlier
version of 4838deab65 (Makefile: refactor GIT-VERSION-GEN to be
reusable, 2024-12-06), where the script didn't handle that logic for us.

Drop the needless indirection.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 12:36:44 -08:00
1b0882cba2 Makefile: stop including "GIT-VERSION-FILE" in docs
We include "GIT-VERSION-FILE" in our docs Makefile, but don't actually
use the "GIT_VERSION" variable that it provides. This is a leftover from
the conversion to make "GIT-VERSION-GEN" generate version information
in-place by substituting placeholders in 4838deab65 (Makefile: refactor
GIT-VERSION-GEN to be reusable, 2024-12-06) and subsequent commits,
where all usages of the variable were removed.

Stop including the file.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 12:36:44 -08:00
b4d15f73e2 l10n: fr.po: Minor improvements
* Fix an occurrence of "dpuis" to "depuis".
* Add some entries in the translation index at the beginning of the
  file.
* Harmonize the spelling of various items based on how common each
  spelling or translation is throughout the file.
  * superproject -> super-projet
  * patch -> rustine
  * regex / regexp -> regex
  * regular expression -> expression régulière
  * loose object -> objet esseulé
  * directory -> répertoire
* Fix various typos (e.g.: trailing ".<" or ".", "mêm" -> "même")
* Fix minor grammatical errors (e.g: "le valeur" -> "la valeur")
* Remove old translations

Signed-off-by: Florian Sabourin <ethiraric@gmail.com>
2024-12-20 19:24:57 +01:00
7d549fe317 meson: skip gitweb build when Perl is disabled
It is possible to configure a Git build without Perl when disabling both
our test suite and all Perl-based features. In Meson, this can be
achieved with `meson setup -Dperl=disabled -Dtests=false`.

It was reported by a user that this breaks the Meson build because
gitweb gets built even if Perl was not discovered in such a build:

    $ meson setup .. -Dtests=false -Dperl=disabled
    ...
    ../gitweb/meson.build:2:43: ERROR: Unable to get the path of a not-found external program

Fix this issue by introducing a new feature-option that allows the user
to configure whether or not to build Gitweb. The feature is set to
'auto' by default and will be disabled automatically in case Perl was
not found on the system.

Reported-by: Daniel Engberg <daniel.engberg.lists@pyret.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:39:20 -08:00
71edf6c3c8 path-walk: reorder object visits
The path-walk API currently uses a stack-based approach to recursing
through the list of paths within the repository. This guarantees that
after a tree path is explored, all paths contained within that tree path
will be explored before continuing to explore siblings of that tree
path.

The initial motivation of this depth-first approach was to minimize
memory pressure while exploring the repository. A breadth-first approach
would have too many "active" paths being stored in the paths_to_lists
map.

We can take this approach one step further by making sure that blob
paths are visited before tree paths. This allows the API to free the
memory for these blob objects before continuing to perform the
depth-first search. This modifies the order in which we visit siblings,
but does not change the fact that we are performing depth-first search.

To achieve this goal, use a priority queue with a custom sorting method.
The sort needs to handle tags, blobs, and trees (commits are handled
slightly differently). When objects share a type, we can sort by path
name. This will keep children of the latest path to leave the stack be
preferred over the rest of the paths in the stack, since they agree in
prefix up to and including a directory separator. When the types are
different, we can prefer tags over other types and blobs over trees.

This causes significant adjustments to t6601-path-walk.sh to rearrange
the order of the visited paths.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:37:05 -08:00
6333e7ae0b path-walk: mark trees and blobs as UNINTERESTING
When the input rev_info has UNINTERESTING starting points, we want to be
sure that the UNINTERESTING flag is passed appropriately through the
objects. To match how this is done in places such as 'git pack-objects', we
use the mark_edges_uninteresting() method.

This method has an option for using the "sparse" walk, which is similar in
spirit to the path-walk API's walk. To be sure to keep it independent, add a
new 'prune_all_uninteresting' option to the path_walk_info struct.

To check how the UNINTERSTING flag is spread through our objects, extend the
'test-tool path-walk' command to output whether or not an object has that
flag. This changes our tests significantly, including the removal of some
objects that were previously visited due to the incomplete implementation.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:37:05 -08:00
9145660979 path-walk: visit tags and cached objects
The rev_info that is specified for a path-walk traversal may specify
visiting tag refs (both lightweight and annotated) and also may specify
indexed objects (blobs and trees). Update the path-walk API to walk
these objects as well.

When walking tags, we need to peel the annotated objects until reaching
a non-tag object. If we reach a commit, then we can add it to the
pending objects to make sure we visit in the commit walk portion. If we
reach a tree, then we will assume that it is a root tree. If we reach a
blob, then we have no good path name and so add it to a new list of
"tagged blobs".

When the rev_info includes the "--indexed-objects" flag, then the
pending set includes blobs and trees found in the cache entries and
cache-tree. The cache entries are usually blobs, though they could be
trees in the case of a sparse index. The cache-tree stores
previously-hashed tree objects but these are cleared out when staging
objects below those paths. We add tests that demonstrate this.

The indexed objects come with a non-NULL 'path' value in the pending
item. This allows us to prepopulate the 'path_to_lists' strmap with
lists for these paths.

The tricky thing about this walk is that we will want to combine the
indexed objects walk with the commit walk, especially in the future case
of walking objects during a command like 'git repack'.

Whenever possible, we want the objects from the index to be grouped with
similar objects in history. We don't want to miss any paths that appear
only in the index and not in the commit history.

Thus, we need to be careful to let the path stack be populated initially
with only the root tree path (and possibly tags and tagged blobs) and go
through the normal depth-first search. Afterwards, if there are other
paths that are remaining in the paths_to_lists strmap, we should then
iterate through the stack and visit those objects recursively.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:37:05 -08:00
c8dba310d7 path-walk: allow consumer to specify object types
We add the ability to filter the object types in the path-walk API so
the callback function is called fewer times.

This adds the ability to ask for the commits in a list, as well. We
re-use the empty string for this set of objects because these are passed
directly to the callback function instead of being part of the
'path_stack'.

Future changes will add the ability to visit annotated tags.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:37:05 -08:00
d190124f27 t6601: add helper for testing path-walk API
Add some tests based on the current behavior, doing interesting checks
for different sets of branches, ranges, and the --boundary option. This
sets a baseline for the behavior and we can extend it as new options are
introduced.

Store and output a 'batch_nr' value so we can demonstrate that the paths are
grouped together in a batch and not following some other ordering. This
allows us to test the depth-first behavior of the path-walk API. However, we
purposefully do not test the order of the objects in the batch, so the
output is compared to the expected output through a sort.

It is important to mention that the behavior of the API will change soon as
we start to handle UNINTERESTING objects differently, but these tests will
demonstrate the change in behavior.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:37:04 -08:00
cef003d453 test-lib-functions: add test_cmp_sorted
This test helper will be helpful to reduce repeated logic in
t6601-path-walk.sh, but may be helpful elsewhere, too.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:37:04 -08:00
9d46bc791b path-walk: introduce an object walk by path
In anticipation of a few planned applications, introduce the most basic form
of a path-walk API. It currently assumes that there are no UNINTERESTING
objects, and does not include any complicated filters. It calls a function
pointer on groups of tree and blob objects as grouped by path. This only
includes objects the first time they are discovered, so an object that
appears at multiple paths will not be included in two batches.

These batches are collected in 'struct type_and_oid_list' objects, which
store an object type and an oid_array of objects.

The data structures are documented in 'struct path_walk_context', but in
summary the most important are:

  * 'paths_to_lists' is a strmap that connects a path to a
    type_and_oid_list for that path. To avoid conflicts in path names,
    we make sure that tree paths end in "/" (except the root path with
    is an empty string) and blob paths do not end in "/".

  * 'path_stack' is a string list that is added to in an append-only
    way. This stores the stack of our depth-first search on the heap
    instead of using recursion.

  * 'path_stack_pushed' is a strmap that stores path names that were
    already added to 'path_stack', to avoid repeating paths in the
    stack. Mostly, this saves us from quadratic lookups from doing
    unsorted checks into the string_list.

The coupling of 'path_stack' and 'path_stack_pushed' is protected by the
push_to_stack() method. Call this instead of inserting into these
structures directly.

The walk_objects_by_path() method initializes these structures and
starts walking commits from the given rev_info struct. The commits are
used to find the list of root trees which populate the start of our
depth-first search.

The core of our depth-first search is in a while loop that continues
while we have not indicated an early exit and our 'path_stack' still has
entries in it. The loop body pops a path off of the stack and "visits"
the path via the walk_path() method.

The walk_path() method gets the list of OIDs from the 'path_to_lists'
strmap and executes the callback method on that list with the given path
and type. If the OIDs correspond to tree objects, then iterate over all
trees in the list and run add_children() to add the child objects to
their own lists, adding new entries to the stack if necessary.

In testing, this depth-first search approach was the one that used the
least memory while iterating over the object lists. There is still a
chance that repositories with too-wide path patterns could cause memory
pressure issues. Limiting the stack size could be done in the future by
limiting how many objects are being considered in-progress, or by
visiting blob paths earlier than trees.

There are many future adaptations that could be made, but they are left for
future updates when consumers are ready to take advantage of those features.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 08:37:04 -08:00
8ddcdc1bb3 refs: mark invalid refname message for translation
The error message produced by `transaction_refname_valid()` changes based
on whether the update is a ref update or a reflog update, with the use
of a ternary operator. This breaks translation since the sub-msg is not
marked for translation. Fix this by setting the entire message using a
`if {} else {}` block and marking each message for translation.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 07:52:44 -08:00
62e745ced2 prio-queue: use size_t rather than int for size
The alloc and nr fields of a prio-queue tell us how much memory is
allocated and used in the array. So the natural type for them is size_t,
which prevents overflow on 64-bit systems where "int" is still 32 bits.

This is unlikely to happen in practice, as we typically use it for
storing commits, and having 2^31 of those is rather a lot. But it's good
to keep our generic data structures as flexible as possible. And as we
start to enforce -Wsign-compare, it means that callers need to use
"int", too, and the problem proliferates. Let's fix it at the source.

The changes here can be put into a few groups:

  1. Changing the alloc/nr fields in the struct to size_t. This requires
     swapping out int for size_t in negotiator/skipping.c, as well as
     in prio_queue_get(), because those all iterate over the array.
     Building with -Wsign-compare complains about these.

  2. Other code that assigns or passes around indexes into the array
     (e.g., the swap() and compare() functions) won't trigger
     -Wsign-compare because we are simply truncating the values. These
     are caught by -Wconversion, but I've adjusted them here to
     future-proof us.

  3. In prio_queue_reverse() we compute "queue->nr - 1" without checking
     if anything is in the queue, which underflows now that nr is
     unsigned. We can fix that by returning early when the queue is
     empty (there is nothing to reverse).

  4. The insertion_ctr variable is currently unsigned, but can likewise
     grow (it is actually worse, because adding and removing an element
     many times will keep increasing the counter, even though "nr" does
     not). I've bumped that to size_t here, as well.

     But -Wconversion notes that computing the "cmp" result by
     subtracting the counters and assigning to "int" is a potential
     problem. And that's true even before this patch, since we use an
     unsigned counter (imagine comparing "2^32-1" and "0", which should
     be a high positive value, but instead is "-1" as a signed int).

     Since we only care about the sign (and not the magnitude) of the
     result, we could fix this by swapping out the subtraction for a
     ternary comparison. Probably the performance impact would be
     negligible, since we just called into a custom compare function and
     branched on its result anyway. But it's easy enough to do a
     branchless version by subtracting the comparison results.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 07:21:45 -08:00
ff795a5c5e Finishing touches before 2.48-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-19 10:58:34 -08:00
1df37ef81a Merge branch 'tc/bundle-with-tag-remove-workaround'
"git bundle create" with an annotated tag on the positive end of
the revision range had a workaround code for older limitation in
the revision walker, which has become unnecessary.

* tc/bundle-with-tag-remove-workaround:
  bundle: remove unneeded code
2024-12-19 10:58:34 -08:00
930f2b4811 Merge branch 'mh/doc-windows-home-env'
Doc update.

* mh/doc-windows-home-env:
  Document HOME environment variable
2024-12-19 10:58:32 -08:00
cb89eebf3b Merge branch 'js/log-remerge-keep-ancestry'
"git log -p --remerge-diff --reverse" was completely broken.

* js/log-remerge-keep-ancestry:
  log: --remerge-diff needs to keep around commit parents
2024-12-19 10:58:31 -08:00
a1f34d5955 Merge branch 'bf/fetch-set-head-config'
"git fetch" honors "remote.<remote>.followRemoteHEAD" settings to
tweak the remote-tracking HEAD in "refs/remotes/<remote>/HEAD".

* bf/fetch-set-head-config:
  remote set-head: set followRemoteHEAD to "warn" if "always"
  fetch set_head: add warn-if-not-$branch option
  fetch set_head: move warn advice into advise_if_enabled
  fetch: add configuration for set_head behaviour
2024-12-19 10:58:30 -08:00
ae75cefd94 Merge branch 'jc/set-head-symref-fix'
"git fetch" from a configured remote learned to update a missing
remote-tracking HEAD but it asked the remote about their HEAD even
when it did not need to, which has been corrected.  Incidentally,
this also corrects "git fetch --tags $URL" which was broken by the
new feature in an unspecified way.

* jc/set-head-symref-fix:
  fetch: do not ask for HEAD unnecessarily
2024-12-19 10:58:28 -08:00
5f212684ab Merge branch 'bf/set-head-symref'
When "git fetch $remote" notices that refs/remotes/$remote/HEAD is
missing and discovers what branch the other side points with its
HEAD, refs/remotes/$remote/HEAD is updated to point to it.

* bf/set-head-symref:
  fetch set_head: handle mirrored bare repositories
  fetch: set remote/HEAD if it does not exist
  refs: add create_only option to refs_update_symref_extended
  refs: add TRANSACTION_CREATE_EXISTS error
  remote set-head: better output for --auto
  remote set-head: refactor for readability
  refs: atomically record overwritten ref in update_symref
  refs: standardize output of refs_read_symbolic_ref
  t/t5505-remote: test failure of set-head
  t/t5505-remote: set default branch to main
2024-12-19 10:58:27 -08:00
7525cd8c35 git: use calloc instead of malloc + memset where possible
Avoid calling malloc + memset by calling calloc.

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:48:34 -08:00
d4cd757051 match-trees: stop using the_repository
Stop using `the_repository` in the "match-trees" subsystem by passing
down the already-available repository parameters to internal functions
as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
e1335a9407 graph: stop using the_repository
Stop using `the_repository` in the "graph" subsystem by reusing the
repository we already have available via `struct rev_info`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
1b374ad71f add-interactive: stop using the_repository
Stop using `the_repository` in the "add-interactive" subsystem by
reusing the repository we already have available via parameters or in
the `add_i_state` structure.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
727c71a112 tmp-objdir: stop using the_repository
Stop using `the_repository` in the "tmp-objdir" subsystem by passing
in the repostiroy when creating a new temporary object directory.

While we could trivially update the caller to pass in the hash algorithm
used by the index itself, we instead pass in `the_hash_algo`. This is
mostly done to stay consistent with the rest of the code in that file,
which isn't prepared to handle arbitrary repositories, either.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
b81093aeae resolve-undo: stop using the_repository
Stop using `the_repository` in the "resolve-undo" subsystem by passing
in the hash algorithm when reading or writing resolve-undo information.

While we could trivially update the caller to pass in the hash algorithm
used by the index itself, we instead pass in `the_hash_algo`. This is
mostly done to stay consistent with the rest of the code in that file,
which isn't prepared to handle arbitrary repositories, either.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
6c27d22276 credential: stop using the_repository
Stop using `the_repository` in the "credential" subsystem by passing in
a repository when filling, approving or rejecting credentials.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
71e5afee8b mailinfo: stop using the_repository
Stop using `the_repository` in the "mailinfo" subsystem by passing in
a repository when setting up the mailinfo structure.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
b4c476c43a diagnose: stop using the_repository
Stop using `the_repository` in the "diagnose" subsystem by passing in a
repository when generating a diagnostics archive.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:31 -08:00
c365dbb44e server-info: stop using the_repository
Stop using `the_repository` in the "server-info" subsystem by passing in
a repository when updating server info and storing the repository in the
`update_info_ctx` structure to make it accessible to other functions.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:30 -08:00
5ee907bb3f send-pack: stop using the_repository
Stop using `the_repository` in the "send-pack" subsystem by passing in a
repository when sending a packfile.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:30 -08:00
395b584b57 serve: stop using the_repository
Stop using `the_repository` in the "serve" subsystem by passing in a
repository when advertising capabilities or serving requests.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:30 -08:00
bd0c0fb790 trace: stop using the_repository
Stop using `the_repository` in the "trace" subsystem by passing in a
repository when setting up tracing.

Adjust the only caller accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:30 -08:00
59b6131a67 pager: stop using the_repository
Stop using `the_repository` in the "pager" subsystem by passing in a
repository when setting up the pager and when configuring it.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:30 -08:00
1f7e6478dc progress: stop using the_repository
Stop using `the_repository` in the "progress" subsystem by passing in a
repository when initializing `struct progress`. Furthermore, store a
pointer to the repository in that struct so that we can pass it to the
trace2 API when logging information.

Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 10:44:30 -08:00
913a1e157c Merge branch 'ps/build-sign-compare' into ps/the-repository
* ps/build-sign-compare:
  t/helper: don't depend on implicit wraparound
  scalar: address -Wsign-compare warnings
  builtin/patch-id: fix type of `get_one_patchid()`
  builtin/blame: fix type of `length` variable when emitting object ID
  gpg-interface: address -Wsign-comparison warnings
  daemon: fix type of `max_connections`
  daemon: fix loops that have mismatching integer types
  global: trivial conversions to fix `-Wsign-compare` warnings
  pkt-line: fix -Wsign-compare warning on 32 bit platform
  csum-file: fix -Wsign-compare warning on 32-bit platform
  diff.h: fix index used to loop through unsigned integer
  config.mak.dev: drop `-Wno-sign-compare`
  global: mark code units that generate warnings with `-Wsign-compare`
  compat/win32: fix -Wsign-compare warning in "wWinMain()"
  compat/regex: explicitly ignore "-Wsign-compare" warnings
  git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-18 10:43:16 -08:00
62b3ec8a3f pack-bitmap.c: ensure pack validity for all reuse packs
Commit 44f9fd6496 (pack-bitmap.c: check preferred pack validity when
opening MIDX bitmap, 2022-05-24) prevents a race condition whereby the
preferred pack disappears between opening the MIDX bitmap and attempting
verbatim reuse out of its packs.

That commit forces open_midx_bitmap_1() to ensure the validity of the
MIDX's preferred pack, meaning that we have an open file handle on the
*.pack, ensuring that we can reuse bytes out of verbatim later on in the
process[^1].

But 44f9fd6496 was not extended to cover multi-pack reuse, meaning that
this same race condition exists for non-preferred packs during verbatim
reuse. Work around that race in the same way by only marking valid packs
as reuse-able. For packs that aren't reusable, skip over them but
include the number of objects they have to ensure we allocate a large
enough 'reuse' bitmap (e.g. if a pack in the middle of the MIDX
disappeared but we still want to reuse later packs).

Since we're ensuring the validity of these packs within the verbatim
reuse code, we no longer have to special-case the preferred pack and
open it within the open_midx_bitmap_1() function.

An alternative approach to the one taken here would be to open all
MIDX'd packs from within open_midx_bitmap_1(). But that would be both
slower and make the bitmaps less useful, since we can still perform some
pack reuse among the packs that still exist when the *.bitmap is opened.

After applying this patch, we can simulate the new behavior after
instrumenting Git like so:

    diff --git a/packfile.c b/packfile.c
    index 9560f0a33c..aedce72524 100644
    --- a/packfile.c
    +++ b/packfile.c
    @@ -557,6 +557,11 @@ static int open_packed_git_1(struct packed_git *p)
     		; /* nothing */

     	p->pack_fd = git_open(p->pack_name);
    +	{
    +		const char *delete = getenv("GIT_RACILY_DELETE");
    +		if (delete && !strcmp(delete, pack_basename(p)))
    +			return -1;
    +	}
     	if (p->pack_fd < 0 || fstat(p->pack_fd, &st))
     		return -1;
     	pack_open_fds++;

and adding the following test:

    test_expect_success 'disappearing packs' '
            git init disappearing-packs &&
            (
                    cd disappearing-packs &&

                    git config pack.allowPackReuse multi &&

                    test_commit A &&
                    test_commit B &&
                    test_commit C &&

                    A="$(echo "A" | git pack-objects --revs $packdir/pack-A)" &&
                    B="$(echo "A..B" | git pack-objects --revs $packdir/pack-B)" &&
                    C="$(echo "B..C" | git pack-objects --revs $packdir/pack-C)" &&

                    git multi-pack-index write --bitmap --preferred-pack=pack-A-$A.idx &&

                    test_pack_objects_reused_all 9 3 &&

                    test_env GIT_RACILY_DELETE=pack-A-$A.pack \
                            test_pack_objects_reused_all 6 2 &&
                    test_env GIT_RACILY_DELETE=pack-B-$B.pack \
                            test_pack_objects_reused_all 6 2 &&
                    test_env GIT_RACILY_DELETE=pack-C-$C.pack \
                            test_pack_objects_reused_all 6 2

            )
    '

Note that we could relax the single-pack version of this which was most
recently addressed in dc1daacdcc (pack-bitmap: check pack validity when
opening bitmap, 2021-07-23), but only partially. Because we still need
to know the object count in the pack, we'd still have to open the pack's
*.idx, so the savings there are marginal.

Note likewise that we add a new "if (!packs_nr)" early return in the
pack reuse code to avoid a potentially expensive allocation on the
'reuse' bitmap in the case that no packs are available for reuse.

[^1]: Unless we run out of open file handles. If that happens and we are
  forced to close the only open file handle of a file that has been
  removed from underneath us, there is nothing we can do.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18 09:51:09 -08:00
a803b1e171 doc: remove extra quotes in generated docs
Commit a38edab7c8 (Makefile: generate doc versions via GIT-VERSION-GEN,
2024-12-06) moved these variables from the Makefile to asciidoc.conf.in.
When doing so, some extraneous quotes were added; these are visible in
the generated .xml files, at least, and possibly in other locations:

    --- a/tmp/orig-git-bisect.xml
    +++ b/Documentation/git-bisect.xml
    @@ -5,14 +5,14 @@
     <refentry lang="en">
     <refentryinfo>
	 <title>git-bisect(1)</title>
    -    <date>2024-12-06</date>
    -<revhistory><revision><date>2024-12-06</date></revision></revhistory>
    +    <date>'2024-12-06'</date>^M
    +<revhistory><revision><date>'2024-12-06'</date></revision></revhistory>^M
     </refentryinfo>
     <refmeta>
     <refentrytitle>git-bisect</refentrytitle>
     <manvolnum>1</manvolnum>
    -<refmiscinfo class="source">Git 2.47.1.409.g9bb10d27e7</refmiscinfo>
    -<refmiscinfo class="manual">Git Manual</refmiscinfo>
    +<refmiscinfo class="source">'Git 2.47.1.410.ga38edab7c8'</refmiscinfo>^M
    +<refmiscinfo class="manual">'Git Manual'</refmiscinfo>^M
     </refmeta>
     <refnamediv>
	 <refname>git-bisect</refname>

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 17:14:17 -08:00
d882f382b3 Merge https://github.com/j6t/gitk
* 'master' of https://github.com/j6t/gitk:
  gitk: offer "Copy commit ID to X11 selection" only on X11
  gitk: support auto-copy comit ID to primary clipboard
  gitk: prefs dialog: refine Auto-select UI
  gitk: UI text: change "SHA1 ID" to "Commit ID"
  gitk: add text wrapping preferences
  gitk: make headings of preferences bold
  gitk: check main window visibility before waiting for it to show
  gitk: sv.po: Update Swedish translation (323t)
2024-12-17 16:17:28 -08:00
661734e6c8 Merge branch 'ah/commit-id-to-clipboard'
* ah/commit-id-to-clipboard:
  gitk: offer "Copy commit ID to X11 selection" only on X11
  gitk: support auto-copy comit ID to primary clipboard
  gitk: prefs dialog: refine Auto-select UI
  gitk: UI text: change "SHA1 ID" to "Commit ID"

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-17 21:54:58 +01:00
2456374e78 cmake/vcxproj: stop special-casing remote-ext
When the `vcxproj` target was introduced in `config.mak.uname` to allow
building Git with the Visual C toolchain, the `git remote-ext` command
was always executed in its dashed form. Therefore, it was impossible to
pass the test suite unless that command existed in its dashed form, and
we had to special-case this.

Later, when the `vcxproj` target got out of fashion because Visual
Studio gained native support for CMake builds, this special-casing was
copied without questioning it.

But as of 675df192c5 (transport-helper: do not run git-remote-ext etc.
in dashed form, 2020-08-26), the reason for this special-casing no
longer exists. So let's just drop it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 12:48:54 -08:00
1c01f0fb72 cmake: put the Perl modules into the correct location again
In ccfba9e0c4 (Makefile: use "generate-perl.sh" to massage Perl
library, 2024-12-06), the previous strategy (which avoided spawning a
shell script to transform the files) was replaced by the same
`generate-perl.sh` invocation as for the Makefile-based build.

The only difference is that now the transformation tries to handle the
Perl modules in-place (which ends up in empty files because the same
file is used as input and output via stdin/stdout redirection), and the
Perl script cannot find them anymore because they are not in the
expected place.

Let's put them into the expected place again, i.e. into
`perl/build/lib/` instead of `perl/`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 12:48:54 -08:00
ca358e6bb2 cmake: use the correct file name for the Perl header
In e4b488049a (Makefile: extract script to massage Perl scripts,
2024-12-06), the code was refactored that is used to transform the Perl
scripts/modules to their final form.

Even the CMake-based build was adjusted, but the change used the file
name `PERL-HEADER` instead of the file name used by the Makefile-based
build (same name but with the `GIT-` prefix). Let's adjust the former to
the latter.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 12:48:54 -08:00
df87d53e94 cmake(mergetools): better support for out-of-tree builds
In 7e0730c8ba (t: better support for out-of-tree builds, 2024-12-06)
the strategy was changed from letting `t7609-mergetool--lib.sh`
hard-code the directory where it expects to find the merge tools to
hard-coding that value in the placeholder `@GIT_TEST_MERGE_TOOLS_DIR@`
that is replaced during the build.

However, likely due to a copy/paste mistake (and reviewers missed this,
too), the CMake-based build was adjusted incorrectly, replacing that
placeholder not with the path to the merge tools, but with a Boolean
indicating whether to use a runtime-generated path prefix or not.

Let's fix that, addressing this CMake-build's symptom:

  Initialized empty Git repository in D:/a/git/git/t/trash directory.t7609-mergetool--lib/.git/
  ++ . true/vimdiff
  ./test-lib.sh: line 1021: true/vimdiff: No such file or directory

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 12:48:53 -08:00
c1c5b03afc cmake: better support for out-of-tree builds follow-up
In 7e0730c8ba (t: better support for out-of-tree builds, 2024-12-06),
the `bin-wrappers/` strategy was changed so that it no longer hard-codes
the template directory to be `@BUILD_DIR@/templates/blt`, but instead
interpolates the `@TEMPLATE_DIR@` placeholder during the build.

However, this commit only adjusted the `Makefile`-based build.

Let's adjust the CMake-based build as well. This fixes t0000.15 which
would otherwise fail with:

  ++ echo ''\''t1234-verbose/err'\'' is not empty, it contains:'
  't1234-verbose/err' is not empty, it contains:
  ++ cat t1234-verbose/err
  warning: templates not found in @TEMPLATE_DIR@

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 12:48:53 -08:00
55d62306ee GitHub ci(windows): speed up initializing Git for Windows' minimal SDK again
It used to be the case that initializing the minimal SDK (i.e. a
radically slimmed-down subset of Git for Windows' development
environment intended to perform the CI builds and little else) took
a bit over one minute, would then be cached, and subsequent jobs would
take at most half a dozen seconds to initialize said minimal SDK.

It is important that this step is fast because we have to run the test
suite in parallel, in a set of matrix jobs, to offset the slowness of
the shell-based test suite, and each and every job has to initialize the
very same minimal SDK.

While it may sound as if parallelizing the jobs might only waste the
generously-provided build minutes but at least the _wallclock_ time
would pass quick, in reality it matters a lot: Frequently Git for
Windows' or GitGitGadget PRs get stuck waiting for quite a while before
CI builds start because other PRs' builds still spend substantial
amounts of time to run, blocking due to the concurrency limit being
reached.

Since 91839a8827 (ci: create script to set up Git for Windows SDK,
2024-10-09), the situation has worsened: every job that requires the
minimal Git for Windows SDK spends roughly two-and-a-half minutes doing
so.

With the switch away from the GitHub Action `setup-git-for-windows-sdk`,
we incurred more downsides:

- It is no longer possible for said Action to fix problems independently
  from the Git repository, e.g. when new rules about GitHub Actions
  require changes in the way the minimal SDK is initialized.

- The minimal SDK was installed specifically outside of the worktree so
  as not to clutter it nor incur an additional cost to verify that the
  worktree is clean.

Therefore, even if it would be nice to have a shared process between
GitHub and GitLab based CI builds, let's switch the GitHub-based CI back
to the tried-and-tested `setup-git-for-windows-sdk` Action.

This commit partially reverts 91839a8827 (ci: create script to set up
Git for Windows SDK, 2024-10-09).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 12:26:26 -08:00
b30404dfc0 mingw_rename: do support directory renames
In 391bceae43 (compat/mingw: support POSIX semantics for atomic
renames, 2024-10-27), we taught the `mingw_rename()` function to respect
POSIX semantics, but we did so only as a fallback after `_wrename()`
fails.

This hid a bug in the implementation that was not caught by Git's test
suite: The `CreateFileW()` function _can_ open handles to directories,
but not when asked to use the `FILE_ATTRIBUTE_NORMAL` flag, as that flag
only is allowed for files.

Let's fix this by using the common `FILE_FLAG_BACKUP_SEMANTICS` flag
that can be used for opening handles to directories, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-17 12:06:59 -08:00
246cebe320 refs: add support for migrating reflogs
The `git refs migrate` command was introduced in
25a0023f28 (builtin/refs: new command to migrate ref storage formats,
2024-06-06) to support migrating from one reference backend to another.

One limitation of the command was that it didn't support migrating
repositories which contained reflogs. A previous commit, added support
for adding reflog updates in ref transactions. Using the added
functionality bake in reflog support for `git refs migrate`.

To ensure that the order of the reflogs is maintained during the
migration, we add the index for each reflog update as we iterate over
the reflogs from the old reference backend. This is to ensure that the
order is maintained in the new backend.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:34 -08:00
297c09eabb refs: allow multiple reflog entries for the same refname
The reference transaction only allows a single update for a given
reference to avoid conflicts. This, however, isn't an issue for reflogs.
There are no conflicts to be resolved in reflogs and when migrating
reflogs between backends we'd have multiple reflog entries for the same
refname.

So allow multiple reflog updates within a single transaction. Also the
reflog creation logic isn't exposed to the end user. While this might
change in the future, currently, this reduces the scope of issues to
think about.

In the reftable backend, the writer sorts all updates based on the
update_index before writing to the block. When there are multiple
reflogs for a given refname, it is essential that the order of the
reflogs is maintained. So add the `index` value to the `update_index`.
The `index` field is only set when multiple reflog entries for a given
refname are added and as such in most scenarios the old behavior
remains.

This is required to add reflog migration support to `git refs migrate`.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:34 -08:00
84675fa271 refs: introduce the ref_transaction_update_reflog function
Introduce a new function `ref_transaction_update_reflog`, for clients to
add a reflog update to a transaction. While the existing function
`ref_transaction_update` also allows clients to add a reflog entry, this
function does a few things more, It:
  - Enforces that only a reflog entry is added and does not update the
  ref itself.
  - Allows the users to also provide the committer information. This
  means clients can add reflog entries with custom committer
  information.

The `transaction_refname_valid()` function also modifies the error
message selectively based on the type of the update. This change also
affects reflog updates which go through `ref_transaction_update()`.

A follow up commit will utilize this function to add reflog support to
`git refs migrate`.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:34 -08:00
4483be36f4 refs: add committer_info to ref_transaction_add_update()
The `ref_transaction_add_update()` creates the `ref_update` struct. To
facilitate addition of reflogs in the next commit, the function needs to
accommodate setting the `committer_info` field in the struct. So modify
the function to also take `committer_info` as an argument and set it
accordingly.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:33 -08:00
add2c4f6e2 refs: extract out refname verification in transactions
Unless the `REF_SKIP_REFNAME_VERIFICATION` flag is set for an update,
the refname of the update is verified for:

  - Ensuring it is not a pseudoref.
  - Checking the refname format.

These checks will also be needed in a following commit where the
function to add reflog updates to the transaction is introduced. Extract
the code out into a new static function.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:33 -08:00
611986f300 refs/files: add count field to ref_lock
When refs are updated in the files-backend, a lock is obtained for the
corresponding file path. This is the case even for reflogs, i.e. a lock
is obtained on the reference path instead of the reflog path. This
works, since generally, reflogs are updated alongside the ref.

The upcoming patches will add support for reflog updates in ref
transaction. This means, in a particular transaction we want to have ref
updates and reflog updates. For a given ref in a given transaction there
can be at most one update. But we can theoretically have multiple reflog
updates for a given ref in a given transaction. A great example of this
would be when migrating reflogs from one backend to another. There we
would batch all the reflog updates for a given reference in a single
transaction.

The current flow does not support this, because currently refs & reflogs
are treated as a single entity and capture the lock together. To
separate this, add a count field to ref_lock. With this, multiple
updates can hold onto a single ref_lock and the lock will only be
released when all of them release the lock.

This patch only adds the `count` field to `ref_lock` and adds the logic
to increment and decrement the lock. In a follow up commit, we'll
separate the reflog update logic from ref updates and utilize this
functionality.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:33 -08:00
a3582e2eac refs: add index field to struct ref_udpate
The reftable backend, sorts its updates by refname before applying them,
this ensures that the references are stored sorted. When migrating
reflogs from one backend to another, the order of the reflogs must be
maintained. Add a new `index` field to the `ref_update` struct to
facilitate this.

This field is used in the reftable backend's sort comparison function
`transaction_update_cmp`, to ensure that indexed fields maintain their
order.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:32 -08:00
1a83e26d72 refs: include committer info in ref_update struct
The reference backends obtain the committer information from
`git_committer_info(0)` when adding a reflog. The upcoming patches
introduce support for migrating reflogs between the reference backends.
This requires an interface to creating reflogs, including custom
committer information.

Add a new field `committer_info` to the `ref_update` struct, which is
then used by the reference backends. If there is no `committer_info`
provided, the reference backends default to using
`git_committer_info(0)`. The field itself cannot be set to
`git_committer_info(0)` since the values are dynamic and must be
obtained right when the reflog is being committed.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:32 -08:00
063bcebf0c Git 2.48-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 08:54:04 -08:00
4538338c7e range-diff: introduce the convenience option --remerge-diff
Just like `git log`, now also `git range-diff` has that option as a
shortcut for the common operation that would otherwise require the quite
unwieldy (if theoretically "more correct") `--diff-mode=remerge` option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 08:45:48 -08:00
f8043236c6 range-diff: optionally include merge commits' diffs in the analysis
The `git log` command already offers support for including diffs for
merges, via the `--diff-merges=<format>` option.

Let's add corresponding support for `git range-diff`, too. This makes it
more convenient to spot differences between commit ranges that contain
merges.

This is especially true in scenarios with non-trivial merges, i.e.
merges introducing changes other than, or in addition to, what merge ORT
would have produced. Merging a topic branch that changes a function
signature into a branch that added a caller of that function, for
example, would require the merge commit itself to adjust that caller to
the modified signature.

In my code reviews, I found the `--diff-merges=remerge` option
particularly useful.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 08:45:48 -08:00
eb8374c652 Merge branch 'js/log-remerge-keep-ancestry' into js/range-diff-diff-merges
* js/log-remerge-keep-ancestry:
  log: --remerge-diff needs to keep around commit parents
2024-12-16 08:45:14 -08:00
29e5596eb8 Merge branch 'ps/build'
Build procedure update plus introduction of Meson based builds.

* ps/build: (24 commits)
  Introduce support for the Meson build system
  Documentation: add comparison of build systems
  t: allow overriding build dir
  t: better support for out-of-tree builds
  Documentation: extract script to generate a list of mergetools
  Documentation: teach "cmd-list.perl" about out-of-tree builds
  Documentation: allow sourcing generated includes from separate dir
  Makefile: simplify building of templates
  Makefile: write absolute program path into bin-wrappers
  Makefile: allow "bin-wrappers/" directory to exist
  Makefile: refactor generators to be PWD-independent
  Makefile: extract script to generate gitweb.js
  Makefile: extract script to generate gitweb.cgi
  Makefile: extract script to massage Python scripts
  Makefile: extract script to massage Shell scripts
  Makefile: use "generate-perl.sh" to massage Perl library
  Makefile: extract script to massage Perl scripts
  Makefile: consistently use PERL_PATH
  Makefile: generate doc versions via GIT-VERSION-GEN
  Makefile: generate "git.rc" via GIT-VERSION-GEN
  ...
2024-12-15 17:54:33 -08:00
ededd0d5dc Merge branch 'jt/fix-fattening-promisor-fetch'
Fix performance regression of a recent "fatten promisor pack with
local objects" protection against an unwanted gc.

* jt/fix-fattening-promisor-fetch:
  index-pack --promisor: also check commits' trees
  index-pack --promisor: don't check blobs
  index-pack --promisor: dedup before checking links
2024-12-15 17:54:31 -08:00
4007617fda Merge branch 'ps/commit-with-message-syntax-fix'
The syntax ":/<text>" to name the latest commit with the matching
text was broken with a recent change, which has been corrected.

* ps/commit-with-message-syntax-fix:
  object-name: fix reversed ordering with ":/<text>" revisions
2024-12-15 17:54:30 -08:00
67761be927 Merge branch 'rj/strvec-splice-fix'
Correct strvec_splice() that misbehaved when the strvec is empty.

* rj/strvec-splice-fix:
  strvec: `strvec_splice()` to a statically initialized vector
2024-12-15 17:54:29 -08:00
e6663b9ac5 Merge branch 'bf/explicit-config-set-in-advice-messages'
The advice messages now tell the newer 'git config set' command to
set the advice.token configuration variable to squelch a message.

* bf/explicit-config-set-in-advice-messages:
  advice: suggest using subcommand "git config set"
2024-12-15 17:54:28 -08:00
ab738b2f1f Merge branch 'jc/forbid-head-as-tagname'
"git tag" has been taught to refuse to create refs/tags/HEAD
as such a tag will be confusing in the context of UI provided by
the Git Porcelain commands.

* jc/forbid-head-as-tagname:
  tag: "git tag" refuses to use HEAD as a tagname
  t5604: do not expect that HEAD can be a valid tagname
  refs: drop strbuf_ prefix from helpers
  refs: move ref name helpers around
2024-12-15 17:54:26 -08:00
73b7e03e9e Merge branch 'jk/describe-perf'
"git describe" optimization.

* jk/describe-perf:
  describe: split "found all tags" and max_candidates logic
  describe: stop traversing when we run out of names
  describe: stop digging for max_candidates+1
  t/perf: add tests for git-describe
  t6120: demonstrate weakness in disjoint-root handling
2024-12-15 17:54:25 -08:00
df5d7a7ba5 Merge branch 'kn/reftable-writer-log-write-verify' into kn/reflog-migration
* kn/reftable-writer-log-write-verify:
  reftable/writer: ensure valid range for log's update_index
2024-12-15 15:49:01 -08:00
36625a6974 gitk: offer "Copy commit ID to X11 selection" only on X11
This option is only useful where a selection clipboard is available, which
is only the case on X11. Do not clutter the UI in other environments.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-14 16:36:42 +01:00
2ccc89b0c1 The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 07:33:46 -08:00
d0ddf344da Merge branch 'kk/doc-ancestry-path'
The --ancestry-path option is designed to be given a commit that is
on the path, which was not documented, which has been corrected.

* kk/doc-ancestry-path:
  doc: mention rev-list --ancestry-path restrictions
2024-12-13 07:33:46 -08:00
ca43bd2562 Merge branch 'kn/midx-wo-the-repository'
Yet another "pass the repository through the callchain" topic.

* kn/midx-wo-the-repository:
  midx: inline the `MIDX_MIN_SIZE` definition
  midx: pass down `hash_algo` to functions using global variables
  midx: pass `repository` to `load_multi_pack_index`
  midx: cleanup internal usage of `the_repository` and `the_hash_algo`
  midx-write: pass down repository to `write_midx_file[_only]`
  write-midx: add repository field to `write_midx_context`
  midx-write: use `revs->repo` inside `read_refs_snapshot`
  midx-write: pass down repository to static functions
  packfile.c: remove unnecessary prepare_packed_git() call
  midx: add repository to `multi_pack_index` struct
  config: make `packed_git_(limit|window_size)` non-global variables
  config: make `delta_base_cache_limit` a non-global variable
  packfile: pass down repository to `for_each_packed_object`
  packfile: pass down repository to `has_object[_kept]_pack`
  packfile: pass down repository to `odb_pack_name`
  packfile: pass `repository` to static function in the file
  packfile: use `repository` from `packed_git` directly
  packfile: add repository to struct `packed_git`
2024-12-13 07:33:44 -08:00
3b11c9139d Merge branch 'cw/worktree-extension'
Introduce a new repository extension to prevent older Git versions
from mis-interpreting worktrees created with relative paths.

* cw/worktree-extension:
  worktree: refactor `repair_worktree_after_gitdir_move()`
  worktree: add relative cli/config options to `repair` command
  worktree: add relative cli/config options to `move` command
  worktree: add relative cli/config options to `add` command
  worktree: add `write_worktree_linking_files()` function
  worktree: refactor infer_backlink return
  worktree: add `relativeWorktrees` extension
  setup: correctly reinitialize repository version
2024-12-13 07:33:43 -08:00
cd0a222f08 Merge branch 'es/oss-fuzz'
Backport oss-fuzz tests for us to our codebase.

* es/oss-fuzz:
  fuzz: port fuzz-url-decode-mem from OSS-Fuzz
  fuzz: port fuzz-parse-attr-line from OSS-Fuzz
  fuzz: port fuzz-credential-from-url-gently from OSS-Fuzz
2024-12-13 07:33:42 -08:00
e56c283c15 Merge branch 'en/fast-import-verify-path'
"git fast-import" learned to reject paths with ".."  and "." as
their components to avoid creating invalid tree objects.

* en/fast-import-verify-path:
  t9300: test verification of renamed paths
  fast-import: disallow more path components
  fast-import: disallow "." and ".." path components
2024-12-13 07:33:41 -08:00
90bf05e45a Merge branch 'kh/doc-update-ref-grammofix'
Grammofix.

* kh/doc-update-ref-grammofix:
  Documentation/git-update-ref.txt: add missing word
2024-12-13 07:33:39 -08:00
1ddfe5acde Merge branch 'kh/doc-bundle-typofix'
Typofix.

* kh/doc-bundle-typofix:
  Documentation/git-bundle.txt: fix word join typo
2024-12-13 07:33:38 -08:00
5cbe030c86 Merge branch 'jc/doc-error-message-guidelines'
Developer documentation update.

* jc/doc-error-message-guidelines:
  CodingGuidelines: a handful of error message guidelines
2024-12-13 07:33:37 -08:00
a32668829d Merge branch 'jt/bundle-fsck'
"git bundle --unbundle" and "git clone" running on a bundle file
both learned to trigger fsck over the new objects with configurable
fck check levels.

* jt/bundle-fsck:
  transport: propagate fsck configuration during bundle fetch
  fetch-pack: split out fsck config parsing
  bundle: support fsck message configuration
  bundle: add bundle verification options type
2024-12-13 07:33:36 -08:00
f94bfa1516 log: --remerge-diff needs to keep around commit parents
To show a remerge diff, the merge needs to be recreated. For that to
work, the merge base(s) need to be found, which means that the commits'
parents have to be traversed until common ancestors are found (if any).

However, one optimization that hails all the way back to cb115748ec
(Some more memory leak avoidance, 2006-06-17) is to release the commit's
list of parents immediately after showing it _and to set that parent
list to `NULL`_. This can break the merge base computation.

This problem is most obvious when traversing the commits in reverse: In
that instance, if a parent of a merge commit has been shown as part of
the `git log` command, by the time the merge commit's diff needs to be
computed, that parent commit's list of parent commits will have been set
to `NULL` and as a result no merge base will be found (even if one
should be found).

Traversing commits in reverse is far from the only circumstance in which
this problem occurs, though. There are many avenues to traversing at
least one commit in the revision walk that will later be part of a merge
base computation, for example when not even walking any revisions in
`git show <merge1> <merge2>` where `<merge1>` is part of the commit
graph between the parents of `<merge2>`.

Another way to force a scenario where a commit is traversed before it
has to be traversed again as part of a merge base computation is to
start with two revisions (where the first one is reachable from the
second but not in a first-parent ancestry) and show the commit log with
`--topo-order` and `--first-parent`.

Let's fix this by special-casing the `remerge_diff` mode, similar to
what we did with reflogs in f35650dff6 (log: do not free parents when
walking reflog, 2017-07-07).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:56:10 -08:00
eab5dbab92 ci: wire up Meson builds
Wire up CI builds for both GitLab and GitHub that use the Meson build
system.

While the setup is mostly trivial, one gotcha is the test output
directory used to be in "t/", but now it is contained in the build
directory. To unify the logic across Makefile- and Meson-based builds we
explicitly set up the `TEST_OUTPUT_DIRECTORY` variable so that it is the
same for both build systems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:47 -08:00
9faf3963b6 t: introduce compatibility options to clar-based tests
Our unit tests that don't yet use the clar unit testing framework ignore
any option that they do not understand. It is thus fine to just pass
test options we set up globally to those unit tests as they are simply
ignored. This makes our life easier because we don't have to special
case those options with Meson, where test options are set up globally
via `meson test --test-args=`.

But our clar-based unit testing framework is way stricter here and will
fail in case it is passed an unknown option. Stub out these options with
no-ops to make our life a bit easier.

Note that this also requires us to remove the `-x` short option for
`--exclude`. This is because `-x` has another meaning in our integration
tests, as it enables shell tracing. I doubt there are a lot of people
out there using it as we only got a small hand full of clar tests in the
first place. So better change it now so that we can in the long run
improve compatibility between the two different test drivers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:47 -08:00
78ad7291df t: fix out-of-tree tests for some git-p4 tests
Both t9835 and t9836 exercise git-p4, but one exercises Python 2 whereas
the other one uses Python 3. These tests do not exercise "git p4", but
instead they use "git p4.py". This calls the unbuilt version of
"git-p4.py" that still has the "#!/usr/bin/env python" shebang, which
allows the test to modify which Python version comes first in $PATH,
making it possible to force a Python version.

But "git-p4.py" is not in our PATH during out-of-tree builds, and thus
we cannot locate "git-p4.py". The tests thus break with CMake and Meson.

Fix this by instead manually setting up script wrappers that invoke the
respective Python interpreter directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:47 -08:00
154ce05cce Makefile: detect missing Meson tests
In the preceding commit, we have introduced consistency checks to Meson
to detect any discrepancies with missing or extraneous tests in its
build instructions. These checks only get executed in Meson though, so
any users of our Makefiles wouldn't be alerted of the fact that they
have to modify the Meson build instructions in case they add or remove
any tests.

Add a comparable test target to our Makefile to plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:46 -08:00
0ed1512141 meson: detect missing tests at configure time
It is quite easy for the list of integration tests to go out-of-sync
without anybody noticing. Introduce a new configure-time check that
verifies that all tests are wired up properly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:46 -08:00
c081e7340f t/unit-tests: rename clar-based unit tests to have a common prefix
All of the code files for unit tests using the self-grown unit testing
framework have a "t-" prefix to their name. This makes it easy to
identify them and use globbing in our Makefile and in other places. On
the other hand though, our clar-based unit tests have no prefix at all
and thus cannot easily be discerned from other files in the unit test
directory.

Introduce a new "u-" prefix for clar-based unit tests. This prefix will
be used in a subsequent commit to easily identify such tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:46 -08:00
23eeee08d6 Makefile: drop -DSUPPRESS_ANNOTATED_LEAKS
The -DSUPPRESS_ANNOTATED_LEAKS preprocessor directive was used to enable
our `UNLEAK()` macro in the past, which marks memory as still-reachable
so that the leak sanitizer does not complain. Starting with 52c7dbd036
(git-compat-util: drop now-unused `UNLEAK()` macro, 2024-11-20) this
macro has been removed, and thus the preprocessor directive is not
required anymore, either.

Drop it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:45 -08:00
714c134dd6 ci/lib: support custom output directories when creating test artifacts
Update `create_failed_test_artifacts ()` so that it can handle arbitrary
test output directories. This fixes creation of these artifacts for
macOS on GitLab CI, which uses a separate output directory already. This
will also be used by our out-of-tree builds with Meson.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13 06:48:45 -08:00
d77c3e35bb gitk: support auto-copy comit ID to primary clipboard
Auto-select ("Copy commit ID to X11 selection") is useful when a
selection cliboard exists, but otherwise generally meaningless, for
instance on Windows.

Add a similar pref and behavior which copies the commit ID to the
primary clipboard - for platforms without a selection clipboard, but
which can also be useful additionally on platforms with selection.

Note that while autoselect is enabled by default, autocopy isn't.

That's because the selection clipboard is typically dispensable, while
the primary clipboard can be considered a more precious resource,
which we don't want to (clear and) overwrite by default.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
2024-12-13 01:37:08 +02:00
92d911a531 gitk: prefs dialog: refine Auto-select UI
Tl;DR: change Auto-select text, move the length input to a new line.

The Auto-select preference auto-selects [part of] the commit ID text
at the respective widget on startup, and when the current commit at
the graph changes.

Its real premise, however, is to populate the selection clipboard
with the commit ID. Consider, for instance, how meaningless it is on
platforms without a selection clipboard - like Windows or macOS (on
Windows the selection is not even visible with the default Tk theme,
because it's only visible in focused widgets - which the commit ID
widget is not during normal application of this selection).

So rename the Auto-select label to "Copy commit ID to X11 selection",
to reflect better the ultimate outcome of its application

Note that there exists other, non-X11 platforms with a selection
clipboard, like Wayland, and if a native Tk client exists on such
platforms, then the description will not be accurate, but hopefully
it's not too misleading either.

Additionally, move the length input widget to a new line, because:
- This length applies to both Auto-select and "Copy commit reference"
  context menu item, so it's not exclusive to the selection length.
- The next commit will add support for primary clipboard as well,
  where this length will also be used.

Also, move the "Hide remotes" item above these selection prefs, to
keep the selection prefs semi-grouped before the spacing of the
following title "Diff display options".

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
2024-12-13 01:27:11 +02:00
66496dabd4 gitk: UI text: change "SHA1 ID" to "Commit ID"
SHA1 might not stay forever, and plans to use SHA256 already exist,
so use the official name for it - "Commit ID".

Only visible UI texts are modified to reduce the noise when using
git-blame, while comments and variable names still contain SHA1/sha1.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
2024-12-13 01:17:05 +02:00
dd1072dfa8 bundle: remove unneeded code
The changes in commit c06793a4ed (allow git-bundle to create bottomless
bundle, 2007-08-08) ensure annotated tags are properly preserved when
creating a bundle using a revision range operation.

At the time the range notation would peel the ends to their
corresponding commit, meaning ref v2.0 would point to the v2.0^0 commit.
So the above workaround was introduced. This code looks up the ref
before it's written to the bundle, and if the ref doesn't point to the
object we expect (for tags this would be a tag object), we skip the ref
from the bundle. Instead, when the ref is a tag that's the positive end
of the range (e.g. v2.0 from the range "v1.0..v2.0"), then that ref is
written to the bundle instead.

Later, in 895c5ba3c1 (revision: do not peel tags used in range notation,
2013-09-19), the behavior of parsing ranges was changed and the problem
was fixed at the cause. But the workaround in bundle.c was not reverted.

Now it seems this workaround can cause a race condition. git-bundle(1)
uses setup_revisions() to parse the input into `struct rev_info`. Later,
in write_bundle_refs(), it uses this info to write refs to the bundle.
As mentioned at this point each ref is looked up again and checked
whether it points to the object we expect. If not, the ref is not
written to the bundle. But, when creating a bundle in a heavy traffic
repository (a repo with many references, and frequent ref updates) it's
possible a branch ref was updated between setup_revisions() and
write_bundle_refs() and thus the extra check causes the ref to be
skipped.

The workaround was originally added to deal with tags, but the code path
also gets hit by non-tag refs, causing this race condition. Because it's
no longer needed, remove it and fix the possible race condition.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-12 17:08:35 +09:00
c6b43f663e ci/lib: fix "CI setup" sections with GitLab CI
Whenever we source "ci/lib.sh" we wrap the directives in a separate
group so that they can easily be collapsed in the web UI. And as we
source the script multiple times during a single CI run we thus end up
with the same section name reused multiple times, as well.

This is broken on GitLab CI though, where reusing the same group name is
not supported. The consequence is that only the last of these sections
can be collapsed.

Fix this issue by including the name of the sourcing script in the
group's name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-12 16:57:21 +09:00
d2ca12020f ci/lib: do not interpret escape sequences in group () arguments
We use printf to set up sections with GitLab CI, which requires us to
print a bunch of escape sequences via printf. The group name is
controlled by the user and is expanded directly into the formatting
string, which may cause problems in case the argument contains escape
sequences or formatting directives.

Fix this potential issue by using formatting directives to pass variable
data.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-12 16:57:21 +09:00
33b06fa603 ci/lib: remove duplicate trap to end "CI setup" group
We exlicitly trap on EXIT in order to end the "CI setup" group. This
isn't necessary though given that `begin_group ()` already sets up the
trap for us.

Remove the duplicate trap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-12 16:57:21 +09:00
e1b52cf71e gitlab-ci: update macOS images to Sonoma
The macOS Ventura images we use for GitLab CI runners have been
deprecated. Update them to macOS 14, aka Sonoma.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-12 16:57:20 +09:00
2187ce76c5 Merge branch 'ps/build' into ps/3.0-remote-deprecation
* ps/build: (24 commits)
  Introduce support for the Meson build system
  Documentation: add comparison of build systems
  t: allow overriding build dir
  t: better support for out-of-tree builds
  Documentation: extract script to generate a list of mergetools
  Documentation: teach "cmd-list.perl" about out-of-tree builds
  Documentation: allow sourcing generated includes from separate dir
  Makefile: simplify building of templates
  Makefile: write absolute program path into bin-wrappers
  Makefile: allow "bin-wrappers/" directory to exist
  Makefile: refactor generators to be PWD-independent
  Makefile: extract script to generate gitweb.js
  Makefile: extract script to generate gitweb.cgi
  Makefile: extract script to massage Python scripts
  Makefile: extract script to massage Shell scripts
  Makefile: use "generate-perl.sh" to massage Perl library
  Makefile: extract script to massage Perl scripts
  Makefile: consistently use PERL_PATH
  Makefile: generate doc versions via GIT-VERSION-GEN
  Makefile: generate "git.rc" via GIT-VERSION-GEN
  ...
2024-12-12 16:55:41 +09:00
5c46677067 Merge branch 'ps/build' into ps/ci-meson
* ps/build: (24 commits)
  Introduce support for the Meson build system
  Documentation: add comparison of build systems
  t: allow overriding build dir
  t: better support for out-of-tree builds
  Documentation: extract script to generate a list of mergetools
  Documentation: teach "cmd-list.perl" about out-of-tree builds
  Documentation: allow sourcing generated includes from separate dir
  Makefile: simplify building of templates
  Makefile: write absolute program path into bin-wrappers
  Makefile: allow "bin-wrappers/" directory to exist
  Makefile: refactor generators to be PWD-independent
  Makefile: extract script to generate gitweb.js
  Makefile: extract script to generate gitweb.cgi
  Makefile: extract script to massage Python scripts
  Makefile: extract script to massage Shell scripts
  Makefile: use "generate-perl.sh" to massage Perl library
  Makefile: extract script to massage Perl scripts
  Makefile: consistently use PERL_PATH
  Makefile: generate doc versions via GIT-VERSION-GEN
  Makefile: generate "git.rc" via GIT-VERSION-GEN
  ...
2024-12-12 16:30:28 +09:00
cb656b4222 Merge branch 'cw/worktree-extension' into ps/ci-meson
* cw/worktree-extension:
  worktree: refactor `repair_worktree_after_gitdir_move()`
  worktree: add relative cli/config options to `repair` command
  worktree: add relative cli/config options to `move` command
  worktree: add relative cli/config options to `add` command
  worktree: add `write_worktree_linking_files()` function
  worktree: refactor infer_backlink return
  worktree: add `relativeWorktrees` extension
  setup: correctly reinitialize repository version
2024-12-12 16:30:12 +09:00
b86f0f9071 git-submodule.sh: rename some variables
Every switch and option which is passed to git-submodule.sh has a
corresponding variable which is set accordingly; by convention, the name
of the variable is the option name (for example, "--jobs" and "$jobs").

Rename "$custom_name", "$deinit_all" and "$nofetch", for consistency.

Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-11 20:46:48 +09:00
3ad0ba7227 git-submodule.sh: improve variables readability
When git-submodule.sh parses various options and switches, it sets some
variables to values; the variables in turn affect the options given to
git-submodule--helper.

Currently, variables which correspond to switches have boolean values
(for example, whenever "--force" is passed, force=1), while variables
which correspond to options which take arguments have string values that
sometimes contain the option name and sometimes only the option value.

Set all of the variables to strings which contain the option name (e.g.
force="--force" rather than force=1); this has a couple of advantages:
it improves consistency, readability and debuggability.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-11 20:46:48 +09:00
57f9b30fcd git-submodule.sh: add some comments
Add a couple of comments in a few functions where they were missing.

Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-11 20:46:48 +09:00
402e46daf5 git-submodule.sh: get rid of unused variable
Remove the variable "$diff_cmd" which is no longer used.

Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-11 20:46:48 +09:00
006f546bc3 git-submodule.sh: get rid of isnumber
It's entirely unnecessary to check whether the argument given to an
option (i.e. --summary-limit) is valid in the shell wrapper, since it's
already done when parsing the various options in git-submodule--helper.

Remove this check from the script; this both improves consistency
throughout the script, and the error message shown to the user in case
some invalid non-numeric argument was passed to "--summary-limit" is
more informative as well.

Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-11 20:46:47 +09:00
e6c3e34945 git-submodule.sh: improve parsing of short options
Some command-line options have a short form which takes an argument; for
example, "--jobs" has the form "-j", and it takes a numerical argument.

When parsing short options, support the case where there is no space
between the flag and the option argument, in order to improve
consistency with the rest of the builtin git commands.

Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-11 20:46:47 +09:00
b71687ca03 git-submodule.sh: improve parsing of some long options
Some command-line options have a long form which takes an argument. In
this case, the argument can be given right after `='; for example,
"--depth" takes a numerical argument, which can be given as "--depth=X".

Support the case where the argument is given right after `=' for all
long options, in order to improve consistency throughout the script.

Signed-off-by: Roy Eldar <royeldar0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-11 20:46:47 +09:00
caacdb5dfd The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-10 10:04:58 +09:00
7041902dfa Merge branch 'ps/reftable-iterator-reuse'
Optimize reading random references out of the reftable backend by
allowing reuse of iterator objects.

* ps/reftable-iterator-reuse:
  refs/reftable: reuse iterators when reading refs
  reftable/merged: drain priority queue on reseek
  reftable/stack: add mechanism to notify callers on reload
  refs/reftable: refactor reflog expiry to use reftable backend
  refs/reftable: refactor reading symbolic refs to use reftable backend
  refs/reftable: read references via `struct reftable_backend`
  refs/reftable: figure out hash via `reftable_stack`
  reftable/stack: add accessor for the hash ID
  refs/reftable: handle reloading stacks in the reftable backend
  refs/reftable: encapsulate reftable stack
2024-12-10 10:04:58 +09:00
de9278127e Merge branch 'ps/reftable-detach'
Isolates the reftable subsystem from the rest of Git's codebase by
using fewer pieces of Git's infrastructure.

* ps/reftable-detach:
  reftable/system: provide thin wrapper for lockfile subsystem
  reftable/stack: drop only use of `get_locked_file_path()`
  reftable/system: provide thin wrapper for tempfile subsystem
  reftable/stack: stop using `fsync_component()` directly
  reftable/system: stop depending on "hash.h"
  reftable: explicitly handle hash format IDs
  reftable/system: move "dir.h" to its only user
2024-12-10 10:04:56 +09:00
35f40385e4 Merge branch 'bc/allow-upload-pack-from-other-people'
Loosen overly strict ownership check introduced in the recent past,
to keep the promise "cloning a suspicious repository is a safe
first step to inspect it".

* bc/allow-upload-pack-from-other-people:
  Allow cloning from repositories owned by another user
2024-12-10 10:04:55 +09:00
9cd1e2e1a0 Merge branch 'pb/mergetool-errors'
End-user experience of "git mergetool" when the command errors out
has been improved.

* pb/mergetool-errors:
  git-difftool--helper.sh: exit upon initialize_merge_tool errors
  git-mergetool--lib.sh: add error message for unknown tool variant
  git-mergetool--lib.sh: add error message if 'setup_user_tool' fails
  git-mergetool--lib.sh: use TOOL_MODE when erroring about unknown tool
  completion: complete '--tool-help' in 'git mergetool'
2024-12-10 10:04:53 +09:00
bd31944dda Merge branch 'jc/doc-opt-tilde-expand'
Describe a case where an option value needs to be spelled as a
separate argument, i.e. "--opt val", not "--opt=val".

* jc/doc-opt-tilde-expand:
  doc: option value may be separate for valid reasons
2024-12-10 10:04:52 +09:00
8afff26aa0 Merge branch 'bc/ancient-ci'
Drop support for ancient environments in various CI jobs.

* bc/ancient-ci:
  Add additional CI jobs to avoid accidental breakage
  ci: remove clause for Ubuntu 16.04
  gitlab-ci: switch from Ubuntu 16.04 to 20.04
2024-12-10 10:04:51 +09:00
14ef8c04c5 strvec: strvec_splice() to a statically initialized vector
We use a singleton empty array to initialize a `struct strvec`;
similar to the empty string singleton we use to initialize a `struct
strbuf`.

Note that an empty strvec instance (with zero elements) does not
necessarily need to be an instance initialized with the singleton.
Let's refer to strvec instances initialized with the singleton as
"empty-singleton" instances.

    As a side note, this is the current `strvec_pop()`:

    void strvec_pop(struct strvec *array)
    {
    	if (!array->nr)
    		return;
    	free((char *)array->v[array->nr - 1]);
    	array->v[array->nr - 1] = NULL;
    	array->nr--;
    }

    So, with `strvec_pop()` an instance can become empty but it does
    not going to be the an "empty-singleton".

This "empty-singleton" circumstance requires us to be careful when
adding elements to instances.  Specifically, when adding the first
element:  when we detach the strvec instance from the singleton and
set the internal pointer in the instance to NULL.  After this point we
apply `realloc()` on the pointer.  We do this in
`strvec_push_nodup()`, for example.

The recently introduced `strvec_splice()` API is expected to be
normally used with non-empty strvec's.  However, it can also end up
being used with "empty-singleton" strvec's:

       struct strvec arr = STRVEC_INIT;
       int a = 0, b = 0;

       ... no modification to arr, a or b ...

       const char *rep[] = { "foo" };
       strvec_splice(&arr, a, b, rep, ARRAY_SIZE(rep));

So, we'll try to add elements to an "empty-singleton" strvec instance.

Avoid misapplying `realloc()` to the singleton in `strvec_splice()` by
adding a special case for strvec's initialized with the singleton.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-10 09:07:47 +09:00
1a14c857db index-pack --promisor: also check commits' trees
Commit c08589efdc (index-pack: repack local links into promisor packs,
2024-11-01) seems to contain an oversight in that the tree of a commit
is not checked. Teach git to check these trees.

The fix slows down a fetch from a certain repo at $DAYJOB from 2m2.127s
to 2m45.052s, but in order to make the fetch correct, it seems worth it.

In order to test this, we could create server and client repos as
follows...

 C   S
  \ /
   O

(O and C are commits both on the client and server. S is a commit
only on the server. C and S have the same tree but different commit
messages. The diff between O and C is non-zero.)

...and then, from the client, fetch S from the server.

In theory, the client declares "have C" and the server can use this
information to exclude S's tree (since it knows that the client has C's
tree, which is the same as S's tree). However, it is also possible for
the server to compute that it needs to send S and not O, and proceed
from there; therefore the objects of C are not considered at all when
determining what to send in the packfile. In order to prevent a test of
client functionality from having such a dependence on server behavior, I
have not included such a test.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-10 08:53:59 +09:00
36198026d8 index-pack --promisor: don't check blobs
As a follow-up to the parent of this commit, it was found that not
checking for the existence of blobs linked from trees sped up the fetch
from 24m47.815s to 2m2.127s. Teach Git to do that.

The tradeoff of not checking blobs is documented in a code comment.

(Blobs may also be linked from tag objects, but it is impossible to know
the type of an object linked from a tag object without looking it up in
the object database, so the code for that is untouched.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-10 08:53:59 +09:00
911d14203c index-pack --promisor: dedup before checking links
Commit c08589efdc (index-pack: repack local links into promisor packs,
2024-11-01) fixed a bug with what was believed to be a negligible
decrease in performance [1] [2]. But at $DAYJOB, with at least one repo,
it was found that the decrease in performance was very significant.

Looking at the patch, whenever we parse an object in the packfile to
be indexed, we check the targets of all its outgoing links for its
existence. However, this could be optimized by first collecting all such
targets into an oidset (thus deduplicating them) before checking. Teach
Git to do that.

On a certain fetch from the aforementioned repo, this improved
performance from approximately 7 hours to 24m47.815s. This number will
be further reduced in a subsequent patch.

[1] https://lore.kernel.org/git/CAG1j3zGiNMbri8rZNaF0w+yP+6OdMz0T8+8_Wgd1R_p1HzVasg@mail.gmail.com/
[2] https://lore.kernel.org/git/20241105212849.3759572-1-jonathantanmy@google.com/

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-10 08:53:59 +09:00
8525e92886 Document HOME environment variable
Git documentation refers to $HOME and $XDG_CONFIG_HOME often, but does
not specify how or where these values come from on Windows where neither
is set by default. The new documentation reflects the behavior of
setup_windows_environment() in compat/mingw.c.

Signed-off-by: Alejandro Barreto <alejandro.barreto@ni.com>
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-10 08:47:55 +09:00
0668f0470d Merge branch 'yk/console-encoding'
* yk/console-encoding:
  git-gui: use system encoding to show console output

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-09 21:19:33 +01:00
904b36b815 gitk: add text wrapping preferences
Add a new preference "wrapdefault" which allows enabling char/word wrap.
Impacts all text in the ctext widget for which no other preference exists.

Also make the (existing) preference "wrapcomment" configurable graphically.
Its setting impacts only the "comment" part of the ctext widget.

Signed-off-by: Christoph Sommer <sommer@cms-labs.org>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-09 20:58:02 +01:00
b2490ae42f gitk: make headings of preferences bold
Make preference groups like "Diff display options" stand out more.

Signed-off-by: Christoph Sommer <sommer@cms-labs.org>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-12-09 20:58:02 +01:00
e5b5eca3f2 git-gui: use system encoding to show console output
This change makes non-ascii console output (eg server messages in the
`git push` command output) properly render in the git gui windows.

Fixes: https://github.com/prati0100/git-gui/issues/68

Signed-off-by: Yuri Konotopov <ykonotopov@gnome.org>
2024-12-08 22:14:45 +04:00
0ff919e87a object-name: fix reversed ordering with ":/<text>" revisions
Recently it was reported [1] that "look for the youngest commit
reachable from any ref with log message that match the given
pattern" syntax (i.e. ':/<text>') started to return results in
reverse recency order. This regression was introduced in Git v2.47.0
and is caused by a memory leak fix done in 57fb139b5e (object-name:
fix leaking commit list items, 2024-08-01).

The intent of the identified commit is to stop modifying the commit list
provided by the caller such that the caller can properly free all commit
list items, including those that the called function might potentially
remove from the list. This was done by creating a copy of the passed-in
commit list and modifying this copy instead of the caller-provided list.

We already knew to create such a copy beforehand with the `backup` list,
which was used to clear the `ONELINE_SEEN` commit mark after we were
done. So the refactoring simply renamed that list to `copy` and started
to operate on that list instead. There is a gotcha though: the backup
list, and thus now also the copied list, is always being prepended to,
so the resulting list is in reverse order! The end result is that we
pop commits from the wrong end of the commit list, returning commits in
reverse recency order.

Fix the bug by appending to the list instead.

[1]: <CAKOEJdcPYn3O01p29rVa+xv=Qr504FQyKJeSB-Moze04ViCGGg@mail.gmail.com>

Reported-by: Aarni Koskela <aarni@valohai.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-08 08:23:14 +09:00
6c915c3f85 fetch: do not ask for HEAD unnecessarily
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist,
2024-11-22), git-fetch learned to opportunistically set $REMOTE/HEAD
when fetching by always asking for remote HEAD, in the hope that it
will help setting refs/remotes/<name>/HEAD if missing.

But it is not needed to always ask for remote HEAD.  When we are
fetching from a remote, for which we have remote-tracking branches,
we do need to know about HEAD.  But if we are doing one-shot fetch,
e.g.,

  $ git fetch --tags https://github.com/git/git

we do not even know what sub-hierarchy of refs/remotes/<remote>/
we need to adjust the remote HEAD for.  There is no need to ask for
HEAD in such a case.

Incidentally, because the unconditional request to list "HEAD"
affected the number of ref-prefixes requested in the ls-remote
request, this affected how the requests for tags are added to the
same ls-remote request, breaking "git fetch --tags $URL" performed
against a URL that is not configured as a remote.

Reported-by: Josh Steadmon <steadmon@google.com>
[jc: tests are also borrowed from Josh's patch]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 21:58:59 +09:00
49c6b912e2 reftable/writer: ensure valid range for log's update_index
Each reftable addition has an associated update_index. While writing
refs, the update_index is verified to be within the range of the
reftable writer, i.e. `writer.min_update_index <= ref.update_index` and
`writer.max_update_index => ref.update_index`.

The corresponding check for reflogs in `reftable_writer_add_log` is
however missing. Add a similar check, but only check for the upper
limit. This is because reflogs are treated a bit differently than refs.
Each reflog entry in reftable has an associated update_index and we also
allow expiring entries in the middle, which is done by simply writing a
new reflog entry with the same update_index. This means, writing reflog
entries with update_index lesser than the writer's update_index is an
expected scenario.

Add a new unit test to check for the limits and fix some of the existing
tests, which were setting arbitrary values for the update_index by
ensuring they stay within the now checked limits.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 08:04:46 +09:00
904339edbd Introduce support for the Meson build system
Introduce support for the Meson build system, a "modern" meta build
system that supports many different platforms, including Linux, macOS,
Windows and BSDs. Meson supports different backends, including Ninja,
Xcode and Microsoft Visual Studio. Several common IDEs provide an
integration with it.

The biggest contender compared to Meson is probably CMake as outlined in
our "Documentation/technical/build-systems.txt" file. Based on my own
personal experience from working with both build systems extensively I
strongly favor Meson over CMake. In my opinion, it feels significantly
easier to use with a syntax that feels more like a "real" programming
language. The second big reason is that Meson supports Rust natively,
which may prove to be important given that the project may pick up Rust
as another language eventually.

Using Meson is rather straight-forward. An example:

    ```
    # Meson uses out-of-tree builds. You can set up multiple build
    # directories, how you name them is completely up to you.
    $ mkdir build
    $ cd build
    $ meson setup .. -Dprefix=/tmp/git-installation

    # Build the project. This also provides several other targets like
    e.g. `install` or `test`.
    $ ninja

    # Meson has been wired up to support execution of our test suites.
    # Both our unit tests and our integration tests are supported.
    # Running `meson test` without any arguments will execute all tests,
    # but the syntax supports globbing to select only some tests.
    $ meson test 't-*'
    # Execute single test interactively to allow for debugging.
    $ meson test 't0000-*' --interactive --test-args=-ix
    ```

The build instructions have been successfully tested on the following
systems, tests are passing:

  - Apple macOS 10.15.

  - FreeBSD 14.1.

  - NixOS 24.11.

  - OpenBSD 7.6.

  - Ubuntu 24.04.

  - Windows 10 with Cygwin.

  - Windows 10 with MinGW64, except for t9700, which is also broken with
    our Makefile.

  - Windows 10 with Visual Studio 2022 toolchain, using the Native Tools
    Command Prompt with `meson setup --vsenv`. Tests pass, except for
    t9700.

  - Windows 10 with Visual Studio 2022 solution, using the Native Tools
    Command Prompt with `meson setup --backend vs2022`. Tests pass,
    except for t9700.

  - Windows 10 with VS Code, using the Meson plug-in.

It is expected that there will still be rough edges in the current
version. If this patch lands the expectation is that it will coexist
with our other build systems for a while. Like this, distributions can
slowly migrate over to Meson and report any findings they have to us
such that we can continue to iterate. A potential cutoff date for other
build systems may be Git 3.0.

Some notes:

  - The installed distribution is structured somewhat differently than
    how it used to be the case. All of our binaries are installed into
    `$libexec/git-core`, while all binaries part of `$bindir` are now
    symbolic links pointing to the former. This rule is consistent in
    itself and thus easier to reason about.

  - We do not install dashed binaries into `$libexec/git-core` anymore,
    so there won't e.g. be a symlink for git-add(1). These are not
    required by modern Git and there isn't really much of a use case for
    those anymore. By not installing those symlinks we thus start the
    deprecation of this layout.

  - We're targeting Meson 1.3.0, which has been released relatively
    recently November 2023. The only feature we use from that version is
    `fs.relative_to()`, which we could replace if necessary. If so, we
    could start to target Meson 1.0.0 and newer, released in December
    2022.

  - The whole build instructions count around 3300 lines, half of which
    is listing all of our code and test files. Our Makefiles are around
    5000 lines, autoconf adds another 1300 lines. CMake in comparison
    has only 1200 linescode, but it avoids listing individual files and
    does not wire up auto-configuration as extensively as the Meson
    instructions do.

  - We bundle a set of subproject wrappers for curl, expat, openssl,
    pcre2 and zlib. This allows developers to build Git without these
    dependencies preinstalled, and Meson will fetch and build them
    automatically. This is especially helpful on Windows.

Helped-by: Eli Schwartz <eschwartz@gentoo.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:14 +09:00
00ab97b1bc Documentation: add comparison of build systems
We're contemplating whether to eventually replace our build systems with
a build system that is easier to use. Add a comparison of build systems
to our technical documentation as a baseline for discussion.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:13 +09:00
5ee8927824 t: allow overriding build dir
Our "test-lib.sh" assumes that our build directory is the parent
directory of "t/". While true when using our Makefile, it's not when
using build systems that support out-of-tree builds.

In commit ee9e66e4e7 (cmake: avoid editing t/test-lib.sh, 2022-10-18),
we have introduce support for overriding the GIT_BUILD_DIR by creating
the file "$GIT_BUILD_DIR/GIT-BUILD-DIR" with its contents pointing to
the location of the build directory. The intent was to stop modifying
"t/test-lib.sh" with the CMake build systems while allowing out-of-tree
builds. But "$GIT_BUILD_DIR" is somewhat misleadingly named, as it in
fact points to the _source_ directory. So while that commit solved part
of the problem for out-of-tree builds, CMake still has to write files
into the source tree.

Solve the second part of the problem, namely not having to write any
data into the source directory at all, by also supporting an environment
variable that allows us to point to a different build directory. This
allows us to perform properly self-contained out-of-tree builds.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:13 +09:00
7e0730c8ba t: better support for out-of-tree builds
Our in-tree builds used by the Makefile use various different build
directories scattered around different locations. The paths to those
build directories have to be propagated to our tests such that they can
find the contained files. This is done via a mixture of hardcoded paths
in our test library and injected variables in our bin-wrappers or
"GIT-BUILD-OPTIONS".

The latter two mechanisms are preferable over using hardcoded paths. For
one, we have all paths which are subject to change stored in a small set
of central files instead of having the knowledge of build paths in many
files. And second, it allows build systems which build files elsewhere
to adapt those paths based on their own needs. This is especially nice
in the context of build systems that use out-of-tree builds like CMake
or Meson.

Remove hardcoded knowledge of build paths from our test library and move
it into our bin-wrappers and "GIT-BUILD-OPTIONS".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:13 +09:00
023c3370ac Documentation: extract script to generate a list of mergetools
We include the list of available mergetools into our manpages. Extract
the script that performs this logic such that we can reuse it in other
build systems.

While at it, refactor the Makefile targets such that we don't create
"mergetools-list.made" anymore. It shouldn't be necessary, as we can
instead have other targets depend on "mergetools-{diff,merge}.txt"
directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:13 +09:00
628d49f6e5 Documentation: teach "cmd-list.perl" about out-of-tree builds
The "cmd-list.perl" script generates a list of commands that can be
included into our manpages. The script doesn't know about out-of-tree
builds and instead writes resulting files into the source directory.

Adapt it such that we can read data from the source directory and write
data into the build directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:12 +09:00
9219325be7 Documentation: allow sourcing generated includes from separate dir
Our documentation uses "include::" directives to include parts that are
either reused across multiple documents or parts that we generate at
build time. Unfortunately, top-level includes are only ever resolved
relative to the base directory, which is typically the directory of the
including document. Most importantly, it is not possible to have either
asciidoc or asciidoctor search multiple directories.

It follows that both kinds of includes must live in the same directory.
This is of course a bummer for out-of-tree builds, because here the
dynamically-built includes live in the build directory whereas the
static includes live in the source directory.

Introduce a `build_dir` attribute and prepend it to all of our includes
for dynamically-built files. This attribute gets set to the build
directory and thus converts the include path to an absolute path, which
asciidoc and asciidoctor know how to resolve.

Note that this change also requires us to update "build-docdep.perl",
which tries to figure out included files such our Makefile can set up
proper build-time dependencies. This script simply scans through the
source files for any lines that match "^include::" and treats the
remainder of the line as included file path. But given that those may
now contain the "{build_dir}" variable we have to teach the script to
replace that attribute with the actual build directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:12 +09:00
ed060aa0a3 Makefile: simplify building of templates
When we install Git we also install a set of default templates that both
git-init(1) and git-clone(1) populate into our build directories. The
way the pristine templates are laid out in our source directory is
somewhat weird though: instead of reconstructing the actual directory
hierarchy in "templates/", we represent directory separators with "--".

The only reason I could come up with for why we have this is the
"branches/" directory, which is supposed to be empty when installing it.
And as Git famously doesn't store empty directories at all we have to
work around this limitation.

Now the thing is that the "branches/" directory is a leftover to how
branches used to be stored in the dark ages. gitrepository-layout(5)
lists this directory as "slightly deprecated", which I would claim is a
strong understatement. I have never encountered anybody using it today
and would be surprised if it even works as expected. So having the "--"
hack in place for an item that is basically unused, unmaintained and
deprecated doesn't only feel unreasonable, but installing that entry by
default may also cause confusion for users that do not know what this is
supposed to be in the first place.

Remove this directory from our templates and, now that we do not require
the workaround anymore, restructure the templates to form a proper
hierarchy. This makes it way easier for build systems to install these
templates into place.

We should likely think about removing support for "branch/" altogether,
but that is outside of the scope of this patch series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:12 +09:00
d2407bb8dc Makefile: write absolute program path into bin-wrappers
Write the absolute program path into our bin-wrappers. This allows us to
simplify the Meson build instructions we are about to introduce a bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:12 +09:00
95bcd6f0b7 Makefile: allow "bin-wrappers/" directory to exist
The "bin-wrappers/" directory gets created by our build system and is
populated with one script for each of our binaries. There isn't anything
inherently wrong with the current layout, but it is somewhat hard to
adapt for out-of-tree build systems.

Adapt the layout such that our "bin-wrappers/" directory always exists
and contains our "wrap-for-bin.sh" script to make things a little bit
easier for subsequent steps.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:11 +09:00
3f145a4fe3 Makefile: refactor generators to be PWD-independent
We have multiple scripts that generate headers from other data. All of
these scripts have the assumption built-in that they are executed in the
current source directory, which makes them a bit unwieldy to use during
out-of-tree builds.

Refactor them to instead take the source directory as well as the output
file as arguments.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:11 +09:00
19d8fe7da6 Makefile: extract script to generate gitweb.js
Similar to the preceding commit, also extract the script to generate the
"gitweb.js" file. While the logic itself is trivial, it helps us avoid
duplication of logic across build systems and ensures that the build
systems will remain in sync with each other in case the logic ever needs
to change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:11 +09:00
d2507bbbf4 Makefile: extract script to generate gitweb.cgi
In order to generate "gitweb.cgi" we have to replace various different
placeholders. This is done ad-hoc and is thus not easily reusable across
different build systems.

Introduce a new GITWEB-BUILD-OPTIONS.in template that we populate at
configuration time with the expected options. This script is then used
as input for a new "generate-gitweb.sh" script that generates the final
"gitweb.cgi" file. While this requires us to repeat the options multiple
times, it is in line to how we generate other build options like our
GIT-BUILD-OPTIONS file.

While at it, refactor how we replace the GITWEB_PROJECT_MAXDEPTH. Even
though this variable is supposed to be an integer, the source file has
the value quoted. The quotes are eventually stripped via sed(1), which
replaces `"@GITWEB_PROJECT_MAXDEPTH@"` with the actual value, which is
rather nonsensical. This is made clearer by just dropping the quotes in
the source file.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:11 +09:00
b7835b941b Makefile: extract script to massage Python scripts
Extract a script that massages Python scripts. This provides a couple of
benefits:

  - The build logic is deduplicated across Make, CMake and Meson.

  - CMake learns to rewrite scripts as-needed at build time instead of
    only writing them at configure time.

Furthermore, we will use this script when introducing Meson to
deduplicate the logic across build systems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:10 +09:00
eb98cb835c Makefile: extract script to massage Shell scripts
Same as in the preceding commits, extract a script that allows us to
unify how we massage shell scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:10 +09:00
ccfba9e0c4 Makefile: use "generate-perl.sh" to massage Perl library
Extend "generate-perl.sh" such that it knows to also massage the Perl
library files. There are two major differences:

  - We do not read in the Perl header. This is handled by matching on
    whether or not we have a Perl shebang.

  - We substitute some more variables, which we read in via our
    GIT-BUILD-OPTIONS.

Adapt both our Makefile and the CMake build instructions to use this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:10 +09:00
a38edab7c8 Makefile: generate doc versions via GIT-VERSION-GEN
The documentation we generate embeds information for the exact Git
version used as well as the date of the commit. This information is
injected by injecting attributes into the build process via command line
argument.

Refactor the logic so that we write the information into "asciidoc.conf"
and "asciidoctor-extensions.rb" via `GIT-VERSION-GEN` for AsciiDoc and
AsciiDoctor, respectively.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:09 +09:00
e4b488049a Makefile: extract script to massage Perl scripts
Extract the script to inject various build-time parameters into our Perl
scripts into a standalone script. This is done such that we can reuse it
in other build systems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:09 +09:00
9bb10d27e7 Makefile: generate "git.rc" via GIT-VERSION-GEN
The "git.rc" is used on Windows to embed information like the project
name and version into the resulting executables. As such we need to
inject the version information, which we do by using preprocessor
defines. The logic to do so is non-trivial and needs to be kept in sync
with the different build systems.

Refactor the logic so that we generate "git.rc" via `GIT-VERSION-GEN`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:09 +09:00
c2a3b847ed Makefile: consistently use PERL_PATH
When injecting the Perl path into our scripts we sometimes use '@PERL@'
while we othertimes use '@PERL_PATH@'. Refactor the code use the latter
consistently, which makes it easier to reuse the same logic for multiple
scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:09 +09:00
0c8d339514 Makefile: propagate Git version via generated header
We set up a couple of preprocessor macros when compiling Git that
propagate the version that Git was built from to `git version` et al.
The way this is set up makes it harder than necessary to reuse the
infrastructure across the different build systems.

Refactor this such that we generate a "version-def.h" header via
`GIT-VERSION-GEN` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:08 +09:00
4838deab65 Makefile: refactor GIT-VERSION-GEN to be reusable
Our "GIT-VERSION-GEN" script always writes the "GIT-VERSION-FILE" into
the current directory, where the expectation is that it should exist in
the source directory. But other build systems that support out-of-tree
builds may not want to do that to keep the source directory pristine,
even though CMake currently doesn't care.

Refactor the script such that it won't write the "GIT-VERSION-FILE"
directly anymore, but instead knows to replace @PLACEHOLDERS@ in an
arbitrary input file. This allows us to simplify the logic in CMake to
determine the project version, but can also be reused later on in order
to generate other files that need to contain version information like
our "git.rc" file.

While at it, change the format of the version file by removing the
spaces around the equals sign. Like this we can continue to include the
file in our Makefiles, but can also start to source it in shell scripts
in subsequent steps.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:08 +09:00
dbe46c0feb Makefile: consistently use @PLACEHOLDER@ to substitute
We have a bunch of placeholders in our scripts that we replace at build
time, for example by using sed(1). These placeholders come in three
different formats: @PLACEHOLDER@, @@PLACEHOLDER@@ and ++PLACEHOLDER++.

Next to being inconsistent it also creates a bit of a problem with
CMake, which only supports the first syntax in its `configure_file()`
function. To work around that we instead manually replace placeholders
via string operations, which is a hassle and removes safeguards that
CMake has to verify that we didn't forget to replace any placeholders.
Besides that, other build systems like Meson also support the CMake
syntax.

Unify our codebase to consistently use the syntax supported by such
build systems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:08 +09:00
4638e8806e Makefile: use common template for GIT-BUILD-OPTIONS
The "GIT-BUILD-OPTIONS" file is generated by our build systems to
propagate built-in features and paths to our tests. The generation is
done ad-hoc, where both our Makefile and the CMake build instructions
simply echo a bunch of strings into the file. This makes it very hard to
figure out what variables are expected to exist and what format they
have, and the written variables can easily get out of sync between build
systems.

Introduce a new "GIT-BUILD-OPTIONS.in" template to address this issue.
This has multiple advantages:

  - It demonstrates which built options exist in the first place.

  - It can serve as a spot to document the build options.

  - Some build systems complain when not all variables could be
    substituted, alerting us of mismatches. Others don't, but if we
    forgot to substitute such variables we now have a bogus string that
    will likely cause our tests to fail, if they have any meaning in the
    first place.

Backfill values that we didn't yet set in our CMake build instructions.
While at it, remove the `SUPPORTS_SIMPLE_IPC` variable that we only set
up in CMake as it isn't used anywhere.

This change requires us to adapt the setup of TEST_OUTPUT_DIRECTORY in
"test-lib.sh" such that it does not get overwritten after sourcing when
it has been set up via the environment. This is the only instance I
could find where we rely on ordering on variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07 07:52:08 +09:00
e03d2a9ccb t/helper: don't depend on implicit wraparound
In our test helpers we have two cases where we assign -1 to an `unsigned
long`. The intent is to essentially mean "unbounded output", which is
achieved via implicit wraparound of the value.

This pattern causes warnings with -Wsign-compare though. Adapt it and
instead use `ULONG_MAX` explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:05 +09:00
89a0c5c024 scalar: address -Wsign-compare warnings
There are two -Wsign-compare warnings in "scalar.c", both of which are
trivial:

  - We mistakenly use a signed integer to loop towards an upper unsigned
    bound in `cmd_reconfigure()`.

  - We subtract `path_sep - enlistment->buf`, which results in a signed
    integer, and use the value in a ternary expression where second
    value is unsigned. But as `path_sep` is being assigned the result of
    `find_last_dir_sep(enlistment->buf + offset)` we know that it must
    always be bigger than or equal to `enlistment->buf`, and thus the
    result will be positive.

Address both of these warnings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:05 +09:00
efb38ad49f builtin/patch-id: fix type of get_one_patchid()
In `get_one_patchid()` we assign either the result of `strlen()` or
`remove_space()` to `len`. But while the former correctly returns a
`size_t`, the latter returns an `int` to indicate the length of the
stripped string even though it cannot ever return a negative value. This
causes a warning with "-Wsign-conversion".

In fact, even `get_one_patchid()` itself is also using an integer as
return value even though it always returns the length of the patch, and
this bubbles up to other callers.

Adapt the function and its helpers to use `size_t` for string lengths
consistently.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:05 +09:00
6411a0a896 builtin/blame: fix type of length variable when emitting object ID
The `length` variable is used to store how many bytes we wish to emit
from an object ID. This value will either be the full hash algorithm's
length, or the abbreviated hash that can be set via `--abbrev` or the
"core.abbrev" option. The former is of type `size_t`, whereas the latter
is of type `int`, which causes a warning with "-Wsign-compare".

The reason why `abbrev` is using a signed type is mostly that it is
initialized with `-1` to indicate that we have to compute the minimum
abbreviation length. This length is computed via `find_alignment()`,
which always gets called before `emit_other()`, and thus we can assume
that the value would never be negative in `emit_other()`.

In fact, we can even assume that the value will always be at least
`MINIMUM_ABBREV`, which is enforced by both `git_default_core_config()`
and `parse_opt_abbrev_cb()`. We implicitly rely on this by subtracting
up to 3 without checking for whether the value becomes negative. We then
pass the value to printf(3p) to print the prefix of our object's ID, so
if that assumption was violated we may end up with undefined behaviour.

Squelch the warning by asserting this invariant and casting the value of
`abbrev` to `size_t`. This allows us to store the whole length as an
unsigned integer, which we can then pass to `fwrite()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:05 +09:00
87318f2b6e gpg-interface: address -Wsign-comparison warnings
There are a couple of -Wsign-comparison warnings in "gpg-interface.c".
Most of them are trivial and simply using signed integers to loop
towards an upper unsigned bound.

But in `parse_signed_buffer()` we have one case where the different
signedness of the two values of a ternary expression results in a
warning. Given that:

  - `size` will always be bigger than `len` due to the loop condition.

  - `eol` will always be after `buf + len` because it is found via
    memchr(3p) starting from `buf + len`.

We know that both values will always be natural integers.

Squelch the warning by casting the left-hand side to `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:04 +09:00
7d200af27f daemon: fix type of max_connections
The `max_connections` type tracks how many children git-daemon(1) would
spawn at the same time. This value can be controlled via a command line
switch: if given a positive value we'll set that up as the limit. But
when given either zero or a negative value we don't enforce any limit at
all.

But even when being passed a negative value we won't actually store it,
but normalize it to 0. Still, the variable used to store the config is
using a signed integer, which causes warnings when comparing the number
of accepted connections (`max_connections`) with the number of current
connections being handled (`live_children`).

Adapt the type of `max_connections` such that the types of both
variables match.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:04 +09:00
8108d1ac94 daemon: fix loops that have mismatching integer types
We have several loops in "daemon.c" that use a signed integer to loop
through a `size_t`. Adapt them to instead use a `size_t` as counter
value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:04 +09:00
80c9e70ebe global: trivial conversions to fix -Wsign-compare warnings
We have a bunch of loops which iterate up to an unsigned boundary using
a signed index, which generates warnigs because we compare a signed and
unsigned value in the loop condition. Address these sites for trivial
cases and enable `-Wsign-compare` warnings for these code units.

This patch only adapts those code units where we can drop the
`DISABLE_SIGN_COMPARE_WARNINGS` macro in the same step.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:04 +09:00
25435e4ad8 pkt-line: fix -Wsign-compare warning on 32 bit platform
Similar to the preceding commit, we get a warning in `get_packet_data()`
on 32 bit platforms due to our lenient use of `ssize_t`. This function
is kind of curious though: we accept an `unsigned size` of bytes to
read, then store the actual number of bytes read in an `ssize_t` and
return it as an `int`. This is a whole lot of integer conversions, and
in theory these can cause us to overflow when the passed-in size is
larger than `ssize_t`, which on 32 bit platforms is implemented as an
`int`.

None of the callers of that function even care about the number of bytes
we have read, so returning that number is moot anyway. Refactor the
function such that it only returns an error code, which plugs the
potential overflow. While at it, convert the passed-in size parameter to
be of type `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:03 +09:00
ba8f6018b5 csum-file: fix -Wsign-compare warning on 32-bit platform
On 32-bit platforms, ssize_t may be "int" while size_t may be
"unsigned int".  At times we compare the number of bytes we read
stored in a ssize_t variable with "unsigned int", but that is done
after we check that we did not get an error return (which is
negative---and that is the whole reason why we used ssize_t and not
size_t), so these comparisons are safe.

But compilers may not realize that.  Cast these to size_t to work
around the false positives.  On platforms with size_t/ssize_t wider
than a normal int, this won't be an issue.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:03 +09:00
47d72a74a7 diff.h: fix index used to loop through unsigned integer
The `struct diff_flags` structure is essentially an array of flags, all
of which have the same type. We can thus use `sizeof()` to iterate
through all of the flags, which we do in `diff_flags_or()`. But while
the statement returns an unsigned integer, we used a signed integer to
iterate through the flags, which generates a warning.

Fix this by using `size_t` for the index instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:03 +09:00
4f9264b0cd config.mak.dev: drop -Wno-sign-compare
There is no need anymore to disable `-Wsign-compare` now that all files
that cause warnings have been marked accordingly. Drop the option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:02 +09:00
41f43b8243 global: mark code units that generate warnings with -Wsign-compare
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:02 +09:00
709fdce089 compat/win32: fix -Wsign-compare warning in "wWinMain()"
GCC generates a warning in "headless.c" because we compare `slash` with
`size`, where the former is an `int` and the latter is a `size_t`. Fix
the warning by storing `slash` as a `size_t`, as well.

This commit is being singled out because the file does not include the
"git-compat-util.h" header, and consequently, we cannot easily mark it
with the `DISABLE_SIGN_COMPARE_WARNING` macro.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:01 +09:00
6e1d0ce470 compat/regex: explicitly ignore "-Wsign-compare" warnings
Explicitly ignore "-Wsign-compare" warnings in our bundled copy of the
regcomp implementation. We don't use the macro introduced in the
preceding commit because this code does not include "git-compat-util.h"
in the first place.

Note that we already directly use "#pragma GCC diagnostic ignored" in
"regcomp.c", so it shouldn't be an issue to use it directly in the new
spot, either.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:01 +09:00
2121a76d71 git-compat-util: introduce macros to disable "-Wsign-compare" warnings
When compiling with DEVELOPER=YesPlease, we explicitly disable the
"-Wsign-compare" warning. This is mostly because our code base is full
of cases where we don't bother at all whether something should be signed
or unsigned, and enabling the warning would thus cause tons of warnings
to pop up.

Unfortunately, disabling this warning also masks real issues. There have
been multiple CVEs in the Git project that would have been flagged by
this warning (e.g. CVE-2022-39260, CVE-2022-41903 and several fixes in
the vicinity of these CVEs). Furthermore, the final audit report by
X41 D-Sec, who are the ones who have discovered some of the CVEs, hinted
that it might be a good idea to become more strict in this context.

Now simply enabling the warning globally does not fly due to the stated
reason above that we simply have too many sites where we use the wrong
integer types. Instead, introduce a new set of macros that allow us to
mark a file as being free of warnings with "-Wsign-compare". The
mechanism is similar to what we do with `USE_THE_REPOSITORY_VARIABLE`:
every file that is not marked with `DISABLE_SIGN_COMPARE_WARNINGS` will
be compiled with those warnings enabled.

These new markings will be wired up in the subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 20:20:01 +09:00
db162862b3 describe: split "found all tags" and max_candidates logic
Commit a30154187a (describe: stop traversing when we run out of names,
2024-10-31) taught git-describe to automatically reduce the
max_candidates setting to match the total number of possible names. This
lets us break out of the traversal rather than fruitlessly searching for
more candidates when there are no more to be found.

However, setting max_candidates to 0 (e.g., if the repo has no tags)
overlaps with the --exact-match option, which explicitly uses the same
value. And this causes a regression with --always, which is ignored in
exact-match mode. We used to get this in a repo with no tags:

  $ git describe --always HEAD
  b2f0a7f

and now we get:

  $ git describe --always HEAD
  fatal: no tag exactly matches 'b2f0a7f47f5f2aebe1e7fceff19a57de20a78c06'

The reason is that we bail early in describe_commit() when
max_candidates is set to 0. This logic goes all the way back to
2c33f75754 (Teach git-describe --exact-match to avoid expensive tag
searches, 2008-02-24).

We should obviously fix this regression, but there are two paths,
depending on what you think:

  $ git describe --always --exact-match

and

  $ git describe --always --candidates=0

should do. Since the "--always" option was added, it has always been
ignored in --exact-match (or --candidates=0) mode. I.e., we treat
--exact-match as a true exact match of a tag, and never fall back to
using --always, even if it was requested.

If we think that's a bug (or at least a misfeature), then the right
solution is to fix it by removing the early bail-out from 2c33f75754,
letting the noop algorithm run and then hitting the --always fallback
output. And then our regression naturally goes away, because it follows
the same path.

If we think that the current "--exact-match --always" behavior is the
right thing, then we have to differentiate the case where we
automatically reduced max_candidates to 0 from the case where the user
asked for it specifically. That's possible to do with a flag, but we can
also just reimplement the logic from a30154187a to explicitly break out
of the traversal when we run out of candidates (rather than relying on
the existing max_candidates check).

My gut feeling is along the lines of option 1 (it's a bug, and people
would be happy for "--exact-match --always" to give the fallback rather
than ignoring "--always"). But the documentation can be interpreted in
the other direction, and we've certainly lived with the existing
behavior for many years. So it's possible that changing it now is the
wrong thing.

So this patch fixes the regression by taking the second option,
retaining the "--exact-match" behavior as-is. There are two new tests.
The first shows that the regression is fixed (we don't even need a new
repo without tags; a restrictive --match is enough to create the
situation that there are no candidate names).

The second test confirms that the "--exact-match --always" behavior
remains unchanged and continues to die when there is no tag pointing at
the specified commit. It's possible we may reconsider this in the
future, but this shows that the approach described above is implemented
faithfully.

We can also run the perf tests in p6100 to see that we've retained the
speedup that a30154187a was going for:

  Test                                           HEAD^             HEAD
  --------------------------------------------------------------------------------------
  6100.2: describe HEAD                          0.72(0.64+0.07)   0.72(0.66+0.06) +0.0%
  6100.3: describe HEAD with one max candidate   0.01(0.00+0.00)   0.01(0.00+0.00) +0.0%
  6100.4: describe HEAD with one tag             0.01(0.01+0.00)   0.01(0.01+0.00) +0.0%

Reported-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 15:18:21 +09:00
e66fd72e97 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 13:23:18 +09:00
0f588c4661 Merge branch 'kh/sequencer-comment-char'
The sequencer failed to honor core.commentString in some places.

* kh/sequencer-comment-char:
  sequencer: comment commit messages properly
  sequencer: comment `--reference` subject line properly
  sequencer: comment checked-out branch properly
2024-12-06 13:23:18 +09:00
b4269ebf35 Merge branch 'sj/refs-symref-referent-fix'
A double-free that may not trigger in practice by luck has been
corrected in the reference resolution code.

* sj/refs-symref-referent-fix:
  ref-cache: fix invalid free operation in `free_ref_entry`
2024-12-06 13:23:16 +09:00
e02082c7f8 Merge branch 'bf/set-head-symref' into js/set-head-symref-fix
* bf/set-head-symref:
  fetch set_head: handle mirrored bare repositories
  fetch: set remote/HEAD if it does not exist
  refs: add create_only option to refs_update_symref_extended
  refs: add TRANSACTION_CREATE_EXISTS error
  remote set-head: better output for --auto
  remote set-head: refactor for readability
  refs: atomically record overwritten ref in update_symref
  refs: standardize output of refs_read_symbolic_ref
  t/t5505-remote: test failure of set-head
  t/t5505-remote: set default branch to main
2024-12-06 12:09:43 +09:00
6c397d0104 advice: suggest using subcommand "git config set"
The advice message currently suggests using "git config advice..." to
disable advice messages, but since

00bbdde141 (builtin/config: introduce "set" subcommand, 2024-05-06)

we have the "set" subcommand for config. Since using the subcommand is
more in-line with the modern interface, any advice should be promoting
its usage. Change the disable advice message to use the subcommand
instead. Change all uses of "git config advice" in the tests to use the
subcommand.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 11:24:21 +09:00
012bc566ba remote set-head: set followRemoteHEAD to "warn" if "always"
When running "remote set-head" manually it is unlikely, that the user
would actually like to have "fetch" always update the remote/HEAD. On
the contrary, it is more likely, that the user would expect remote/HEAD
to stay the way they manually set it, and just forgot about having
"followRemoteHEAD" set to "always".

When "followRemoteHEAD" is set to "always" make running "remote
set-head" change the config to "warn".

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 02:59:40 +09:00
9e2b7005be fetch set_head: add warn-if-not-$branch option
Currently if we want to have a remote/HEAD locally that is different
from the one on the remote, but we still want to get a warning if remote
changes HEAD, our only option is to have an indiscriminate warning with
"follow_remote_head" set to "warn". Add a new option
"warn-if-not-$branch", where $branch is a branch name we do not wish to
get a warning about. If the remote HEAD is $branch do not warn,
otherwise, behave as "warn".

E.g. let's assume, that our remote origin has HEAD
set to "master", but locally we have "git remote set-head origin seen".
Setting 'remote.origin.followRemoteHEAD = "warn"' will always print
a warning, even though the remote has not changed HEAD from "master".
Setting 'remote.origin.followRemoteHEAD = "warn-if-not-master" will
squelch the warning message, unless the remote changes HEAD from
"master". Note, that should the remote change HEAD to "seen" (which we
have locally), there will still be no warning.

Improve the advice message in report_set_head to also include silencing
the warning message with "warn-if-not-$branch".

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 02:59:40 +09:00
ad739f525e fetch set_head: move warn advice into advise_if_enabled
Advice about what to do when getting a warning is typed out explicitly
twice and is printed as regular output. The output is also tested for.
Extract the advice message into a single place and use a wrapper
function, so if later the advice is made more chatty the signature only
needs to be changed in once place. Remove the testing for the advice
output in the tests.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 02:59:16 +09:00
24d3dd79e4 midx: inline the MIDX_MIN_SIZE definition
The `MIDX_MIN_SIZE` definition is used to check the midx_size in
`local_multi_pack_index_one`. This definition relies on the
`the_hash_algo` global variable. Inline this and remove the global
variable usage.

With this, remove `USE_THE_REPOSITORY_VARIABLE` usage from `midx.c`.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:21 +09:00
f59de71cf7 midx: pass down hash_algo to functions using global variables
The functions `get_split_midx_filename_ext()`, `get_midx_filename()` and
`get_midx_filename_ext()` use `hash_to_hex()` which internally uses the
`the_hash_algo` global variable.

Remove this dependency on global variables by passing down the
`hash_algo` through to the functions mentioned and instead calling
`hash_to_hex_algop()` along with the obtained `hash_algo`.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:21 +09:00
d5c2ca576a midx: pass repository to load_multi_pack_index
The `load_multi_pack_index` function in midx uses `the_repository`
variable to access the `repository` struct. Modify the function and its
callee's to send the `repository` field.

This moves usage of `the_repository` to the `test-read-midx.c` file.
While that is not optimal, it is okay, since the upcoming commits will
slowly move the usage of `the_repository` up the layers and remove it
eventually.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:20 +09:00
fae9bae709 midx: cleanup internal usage of the_repository and the_hash_algo
In the `midx.c` file, there are multiple usages of `the_repository` and
`the_hash_algo` within static functions of the file. Some of the usages
can be simply swapped out with the available `repository` struct. While
some of them can be swapped out by passing the repository to the
required functions.

This leaves out only some other usages of `the_repository` and
`the_hash_algo` in the file in non-static functions, which we'll tackle
in upcoming commits.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:20 +09:00
2fed09aa9b midx-write: pass down repository to write_midx_file[_only]
In a previous commit, we passed the repository field to all
subcommands in the `builtin/` directory. Utilize this to pass the
repository field down to the `write_midx_file[_only]` functions to
remove the usage of `the_repository` global variables.

With this, all usage of global variables in `midx-write.c` is removed,
hence, remove the `USE_THE_REPOSITORY_VARIABLE` guard from the file.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:20 +09:00
dfa7c68245 write-midx: add repository field to write_midx_context
The struct `write_midx_context` is used to pass context for creating
MIDX files. Add the repository field here to ensure that most functions
within `midx-write.c` have access to the field and can use that instead
of the global `the_repository` variable.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:20 +09:00
20df8141f5 midx-write: use revs->repo inside read_refs_snapshot
The function `read_refs_snapshot()` uses `parse_oid_hex()`, which relies
on the global `the_hash_algo` variable. Let's instead use
`parse_oid_hex_algop()` and provide the hash algo via `revs->repo`.

Also, while here, fix a missing newline after the function's definition.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:20 +09:00
3687a4b3e1 midx-write: pass down repository to static functions
In 'midx-write.c' there are a lot of static functions which use global
variables `the_repository` or `the_hash_algo`. In a follow up commit,
the repository variable will be added to `write_midx_context`, which
some of the functions can use. But for functions which do not have
access to this struct, pass down the required information from
non-static functions `write_midx_file` and `write_midx_file_only`.

This requires that the function `hash_to_hex` is also replaced with
`hash_to_hex_algop` since the former internally accesses the
`the_hash_algo` global variable.

This ensures that the usage of global variables is limited to these
non-static functions, which will be cleaned up in a follow up commit.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:32:20 +09:00
aaafb67ba9 Merge branch 'kn/pass-repo-to-builtin-sub-sub-commands' into kn/midx-wo-the-repository
* kn/pass-repo-to-builtin-sub-sub-commands:
  builtin: pass repository to sub commands
  Git 2.47.1
  Makefile(s): avoid recipe prefix in conditional statements
  doc: switch links to https
  doc: update links to current pages
  The eleventh batch
  pack-objects: only perform verbatim reuse on the preferred pack
  t5332-multi-pack-reuse.sh: demonstrate duplicate packing failure
  test-lib: move malloc-debug setup after $PATH setup
  builtin/difftool: intialize some hashmap variables
  refspec: store raw refspecs inside refspec_item
  refspec: drop separate raw_nr count
  fetch: adjust refspec->raw_nr when filtering prefetch refspecs
  test-lib: check malloc debug LD_PRELOAD before using
2024-12-04 10:32:02 +09:00
33833ed08b Merge branch 'kn/the-repository' into kn/midx-wo-the-repository
* kn/the-repository:
  packfile.c: remove unnecessary prepare_packed_git() call
  midx: add repository to `multi_pack_index` struct
  config: make `packed_git_(limit|window_size)` non-global variables
  config: make `delta_base_cache_limit` a non-global variable
  packfile: pass down repository to `for_each_packed_object`
  packfile: pass down repository to `has_object[_kept]_pack`
  packfile: pass down repository to `odb_pack_name`
  packfile: pass `repository` to static function in the file
  packfile: use `repository` from `packed_git` directly
  packfile: add repository to struct `packed_git`
2024-12-04 10:31:46 +09:00
23692e08c6 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 10:14:50 +09:00
f334c387f4 Merge branch 'ja/git-diff-doc-markup'
Documentation mark-up updates.

* ja/git-diff-doc-markup:
  doc: git-diff: apply format changes to config part
  doc: git-diff: apply format changes to diff-generate-patch
  doc: git-diff: apply format changes to diff-format
  doc: git-diff: apply format changes to diff-options
  doc: git-diff: apply new documentation guidelines
2024-12-04 10:14:50 +09:00
4c1b7e364e Merge branch 'bc/drop-ancient-libcurl-and-perl'
Drop support for older libcURL and Perl.

* bc/drop-ancient-libcurl-and-perl:
  gitweb: make use of s///r
  Require Perl 5.26.0
  INSTALL: document requirement for libcurl 7.61.0
  git-curl-compat: remove check for curl 7.56.0
  git-curl-compat: remove check for curl 7.53.0
  git-curl-compat: remove check for curl 7.52.0
  git-curl-compat: remove check for curl 7.44.0
  git-curl-compat: remove check for curl 7.43.0
  git-curl-compat: remove check for curl 7.39.0
  git-curl-compat: remove check for curl 7.34.0
  git-curl-compat: remove check for curl 7.25.0
  git-curl-compat: remove check for curl 7.21.5
2024-12-04 10:14:48 +09:00
1e18cf4310 Merge branch 'kn/pass-repo-to-builtin-sub-sub-commands'
Built-in Git subcommands are supplied the repository object to work
with; they learned to do the same when they invoke sub-subcommands.

* kn/pass-repo-to-builtin-sub-sub-commands:
  builtin: pass repository to sub commands
2024-12-04 10:14:47 +09:00
8c917be5d2 Merge branch 'ps/bisect-double-free-fix'
Work around Coverity warning that would not trigger in practice.

* ps/bisect-double-free-fix:
  bisect: address Coverity warning about potential double free
2024-12-04 10:14:46 +09:00
e5b71577a6 Merge branch 'tb/use-test-file-size-more'
Use the right helper program to measure file size in performance tests.

* tb/use-test-file-size-more:
  t/perf: use 'test_file_size' in more places
2024-12-04 10:14:45 +09:00
0a0712e05f Merge branch 'tb/boundary-traversal-fix'
A trivial "correctness" fix that does not yet matter in practice.

* tb/boundary-traversal-fix:
  pack-bitmap.c: typofix in `find_boundary_objects()`
2024-12-04 10:14:44 +09:00
57e81b59f3 Merge branch 'sj/ref-contents-check'
"git fsck" learned to issue warnings on "curiously formatted" ref
contents that have always been taken valid but something Git
wouldn't have written itself (e.g., missing terminating end-of-line
after the full object name).

* sj/ref-contents-check:
  ref: add symlink ref content check for files backend
  ref: check whether the target of the symref is a ref
  ref: add basic symref content check for files backend
  ref: add more strict checks for regular refs
  ref: port git-fsck(1) regular refs check for files backend
  ref: support multiple worktrees check for refs
  ref: initialize ref name outside of check functions
  ref: check the full refname instead of basename
  ref: initialize "fsck_ref_report" with zero
2024-12-04 10:14:42 +09:00
7ee055b237 Merge branch 'ps/ref-backend-migration-optim'
The migration procedure between two ref backends has been optimized.

* ps/ref-backend-migration-optim:
  reftable: rename scratch buffer
  refs: adapt `initial_transaction` flag to be unsigned
  reftable/block: optimize allocations by using scratch buffer
  reftable/block: rename `block_writer::buf` variable
  reftable/writer: optimize allocations by using a scratch buffer
  refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG`
  refs: skip collision checks in initial transactions
  refs: use "initial" transaction semantics to migrate refs
  refs/files: support symbolic and root refs in initial transaction
  refs: introduce "initial" transaction flag
  refs/files: move logic to commit initial transaction
  refs: allow passing flags when setting up a transaction
2024-12-04 10:14:41 +09:00
a5dd262a75 Merge branch 'ps/leakfixes-part-10'
Leakfixes.

* ps/leakfixes-part-10: (27 commits)
  t: remove TEST_PASSES_SANITIZE_LEAK annotations
  test-lib: unconditionally enable leak checking
  t: remove unneeded !SANITIZE_LEAK prerequisites
  t: mark some tests as leak free
  t5601: work around leak sanitizer issue
  git-compat-util: drop now-unused `UNLEAK()` macro
  global: drop `UNLEAK()` annotation
  t/helper: fix leaking commit graph in "read-graph" subcommand
  builtin/branch: fix leaking sorting options
  builtin/init-db: fix leaking directory paths
  builtin/help: fix leaks in `check_git_cmd()`
  help: fix leaking return value from `help_unknown_cmd()`
  help: fix leaking `struct cmdnames`
  help: refactor to not use globals for reading config
  builtin/sparse-checkout: fix leaking sanitized patterns
  split-index: fix memory leak in `move_cache_to_base_index()`
  git: refactor builtin handling to use a `struct strvec`
  git: refactor alias handling to use a `struct strvec`
  strvec: introduce new `strvec_splice()` function
  line-log: fix leak when rewriting commit parents
  ...
2024-12-04 10:14:40 +09:00
2f605347da Merge branch 'ps/gc-stale-lock-warning'
Give a bit of advice/hint message when "git maintenance" stops finding a
lock file left by another instance that still is potentially running.

* ps/gc-stale-lock-warning:
  t7900: fix host-dependent behaviour when testing git-maintenance(1)
  builtin/gc: provide hint when maintenance hits a stale schedule lock
2024-12-04 10:14:37 +09:00
8cb4c6e62f t9300: test verification of renamed paths
Commit da91a90c2f (fast-import: disallow more path components,
2024-11-30) added two separate verify_path() calls (one for
added/modified files, and one for renames/copies). But our tests only
exercise the first one. Let's protect ourselves against regressions by
tweaking one of the tests to rename into the bad path. There are
adjacent tests that will stay as additions, so now both calls are
covered.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 09:12:07 +09:00
bc1a980759 doc: mention rev-list --ancestry-path restrictions
The rev-list documentation doesn't mention that the given
commit must be in the specified commit range, leading
to unexpected results.

Signed-off-by: Kai Koponen <kaikopone@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:27:58 +09:00
2811649951 packfile.c: remove unnecessary prepare_packed_git() call
In 454ea2e4d7 (treewide: use get_all_packs, 2018-08-20) we converted
existing calls to both:

  - get_packed_git(), as well as
  - the_repository->objects->packed_git

, to instead use the new get_all_packs() function.

In the instance that this commit addresses, there was a preceding call
to prepare_packed_git(), which dates all the way back to 660c889e46
(sha1_file: add for_each iterators for loose and packed objects,
2014-10-15) when its caller (for_each_packed_object()) was first
introduced.

This call could have been removed in 454ea2e4d7, since get_all_packs()
itself calls prepare_packed_git(). But the translation in 454ea2e4d7 was
(to the best of my knowledge) a find-and-replace rather than inspecting
each individual caller.

Having an extra prepare_packed_git() call here is harmless, since it
will notice that we have already set the 'packed_git_initialized' field
and the call will be a noop. So we're only talking about a few dozen CPU
cycles to set up and tear down the stack frame.

But having a lone prepare_packed_git() call immediately before a call to
get_all_packs() confused me, so let's remove it as redundant to avoid
more confusion in the future.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:56 +09:00
e106040722 midx: add repository to multi_pack_index struct
The `multi_pack_index` struct represents the MIDX for a repository.
Here, we add a pointer to the repository in this struct, allowing direct
use of the repository variable without relying on the global
`the_repository` struct.

With this addition, we can determine the repository associated with a
`bitmap_index` struct. A `bitmap_index` points to either a `packed_git`
or a `multi_pack_index`, both of which have direct repository
references. To support this, we introduce a static helper function,
`bitmap_repo`, in `pack-bitmap.c`, which retrieves a repository given a
`bitmap_index`.

With this, we clear up all usages of `the_repository` within
`pack-bitmap.c` and also remove the `USE_THE_REPOSITORY_VARIABLE`
definition. Bringing us another step closer to remove all global
variable usage.

Although this change also opens up the potential to clean up `midx.c`,
doing so would require additional refactoring to pass the repository
struct to functions where the MIDX struct is created: a task better
suited for future patches.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:55 +09:00
d284713bae config: make packed_git_(limit|window_size) non-global variables
The variables `packed_git_window_size` and `packed_git_limit` are global
config variables used in the `packfile.c` file. Since it is only used in
this file, let's change it from being a global config variable to a
local variable for the subsystem.

With this, we rid `packfile.c` from all global variable usage and this
means we can also remove the `USE_THE_REPOSITORY_VARIABLE` guard from
the file.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:55 +09:00
d6b2d21fbf config: make delta_base_cache_limit a non-global variable
The `delta_base_cache_limit` variable is a global config variable used
by multiple subsystems. Let's make this non-global, by adding this
variable independently to the subsystems where it is used.

First, add the setting to the `repo_settings` struct, this provides
access to the config in places where the repository is available. Use
this in `packfile.c`.

In `index-pack.c` we add it to the `pack_idx_option` struct and its
constructor. While the repository struct is available here, it may not
be set  because `git index-pack` can be used without a repository.

In `gc.c` add it to the `gc_config` struct and also the constructor
function. The gc functions currently do not have direct access to a
repository struct.

These changes are made to remove the usage of `delta_base_cache_limit`
as a global variable in `packfile.c`. This brings us one step closer to
removing the `USE_THE_REPOSITORY_VARIABLE` definition in `packfile.c`
which we complete in the next patch.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:55 +09:00
c87910b96b packfile: pass down repository to for_each_packed_object
The function `for_each_packed_object` currently relies on the global
variable `the_repository`. To eliminate global variable usage in
`packfile.c`, we should progressively shift the dependency on
the_repository to higher layers. Let's remove its usage from this
function and closely related function `is_promisor_object`.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:54 +09:00
cc656f4eb2 packfile: pass down repository to has_object[_kept]_pack
The functions `has_object[_kept]_pack` currently rely on the global
variable `the_repository`. To eliminate global variable usage in
`packfile.c`, we should progressively shift the dependency on
the_repository to higher layers. Let's remove its usage from these
functions and any related ones.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:54 +09:00
873b00597b packfile: pass down repository to odb_pack_name
The function `odb_pack_name` currently relies on the global variable
`the_repository`. To eliminate global variable usage in `packfile.c`, we
should progressively shift the dependency on the_repository to higher
layers.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:54 +09:00
4f9e6bd492 packfile: pass repository to static function in the file
Some of the static functions in the `packfile.c` access global
variables, which can simply be avoided by passing the `repository`
struct down to them. Let's do that.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:54 +09:00
9c5ce06d74 packfile: use repository from packed_git directly
In the previous commit, we introduced the `repository` structure inside
`packed_git`. This provides an alternative route instead of using the
global `the_repository` variable. Let's modify `packfile.c` now to use
this field wherever possible instead of relying on the global state.
There are still a few instances of `the_repository` usage in the file,
where there is no struct `packed_git` locally available, which will be
fixed in the following commits.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:53 +09:00
2cf3fe63f6 packfile: add repository to struct packed_git
The struct `packed_git` holds information regarding a packed object
file. Let's add the repository variable to this object, to represent the
repository that this packfile belongs to. This helps remove dependency
on the global `the_repository` object in `packfile.c` by simply using
repository information now readily available in the struct.

We do need to consider that a packfile could be part of the alternates
of a repository, but considering that we only have one repository struct
and also that we currently anyways use 'the_repository', we should be
OK with this change.

We also modify `alloc_packed_git` to ensure that the repository is added
to newly created `packed_git` structs. This requires modifying the
function and all its callee to pass the repository object down the
levels.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:53 +09:00
bbd445d5ef tag: "git tag" refuses to use HEAD as a tagname
Even though the plumbing level allows you to create refs/tags/HEAD
and refs/heads/HEAD, doing so makes it confusing within the context
of the UI Git Porcelain commands provides.  Just like we prevent a
branch from getting called "HEAD" at the Porcelain layer (i.e. "git
branch" command), teach "git tag" to refuse to create a tag "HEAD".

With a few new tests, we make sure that

 - "git tag HEAD" and "git tag -a HEAD" are rejected

 - "git update-ref refs/tags/HEAD" is still allowed (this is a
   deliberate design decision to allow others to create their own UI
   on top of Git infrastructure that may be different from our UI).

 - "git tag -d HEAD" can remove refs/tags/HEAD to recover from an
   mistake.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03 12:38:50 +09:00
e5ce5b05d0 t5604: do not expect that HEAD can be a valid tagname
09116a1c (refs: loosen over-strict "format" check, 2011-11-16)
introduced a test piece (originally in t5700) that expects to be
able to create a tag named "HEAD" and then a local clone using the
repository as its own reference works correctly.  Later, another
test piece started using this tag starting at acede2eb (t5700:
document a failure of alternates to affect fetch, 2012-02-11).

But the breakage 09116a1c fixed was not specific to the tagname
HEAD.  It would have failed exactly the same way if the tag used
were foo instead of HEAD.

Before forbidding "git tag" from creating "refs/tags/HEAD", update
these tests to use 'foo', not 'HEAD', as the name of the test tag.

Note that the test piece that uses the tag learned the value of the
tag in unnecessarily inefficient and convoluted way with for-each-ref.
Just use "rev-parse" instead.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03 12:38:50 +09:00
93e5e048f8 refs: drop strbuf_ prefix from helpers
The helper functions (strbuf_branchname, strbuf_check_branch_ref,
and strbuf_check_tag_ref) are about handling branch and tag names,
and it is a non-essential fact that these functions use strbuf to
hold these names.  Rename them to make it clarify that these are
more about "ref".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03 12:38:49 +09:00
5bcbde9e49 refs: move ref name helpers around
strbuf_branchname(), strbuf_check_{branch,tag}_ref() are helper
functions to deal with branch and tag names, and the fact that they
happen to use strbuf to hold the name of a branch or a tag is not
essential.  These functions fit better in the refs API than strbuf
API, the latter of which is about string manipulations.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03 12:38:49 +09:00
60c778d172 Merge branch 'ps/leakfixes-part-10' into rj/strvec-splice-fix
* ps/leakfixes-part-10: (49 commits)
  t: remove TEST_PASSES_SANITIZE_LEAK annotations
  test-lib: unconditionally enable leak checking
  t: remove unneeded !SANITIZE_LEAK prerequisites
  t: mark some tests as leak free
  t5601: work around leak sanitizer issue
  git-compat-util: drop now-unused `UNLEAK()` macro
  global: drop `UNLEAK()` annotation
  t/helper: fix leaking commit graph in "read-graph" subcommand
  builtin/branch: fix leaking sorting options
  builtin/init-db: fix leaking directory paths
  builtin/help: fix leaks in `check_git_cmd()`
  help: fix leaking return value from `help_unknown_cmd()`
  help: fix leaking `struct cmdnames`
  help: refactor to not use globals for reading config
  builtin/sparse-checkout: fix leaking sanitized patterns
  split-index: fix memory leak in `move_cache_to_base_index()`
  git: refactor builtin handling to use a `struct strvec`
  git: refactor alias handling to use a `struct strvec`
  strvec: introduce new `strvec_splice()` function
  line-log: fix leak when rewriting commit parents
  ...
2024-12-02 16:27:17 +09:00
e2f5d3b491 Documentation/git-update-ref.txt: add missing word
Add missing word “that” in the phrase “after verifying that”, like
what was done in 1b2dfb7050 (Documentation/git-update-ref.txt: drop
“flag”, 2024-10-21)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 10:54:30 +09:00
18693d7d65 Documentation/git-bundle.txt: fix word join typo
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 10:29:59 +09:00
da91a90c2f fast-import: disallow more path components
Instead of just disallowing '.' and '..', make use of verify_path() to
ensure that fast-import will disallow anything we wouldn't allow into
the index, such as anything under .git/, .gitmodules as a symlink, or
a dos drive prefix on Windows.

Since a few fast-export and fast-import tests that tried to stress-test
the correct handling of quoting relied on filenames that fail
is_valid_win32_path(), such as spaces or periods at the end of filenames
or backslashes within the filename, turn off core.protectNTFS for those
tests to ensure they keep passing.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 10:09:48 +09:00
b7f7d16562 fetch: add configuration for set_head behaviour
In the current implementation, if refs/remotes/$remote/HEAD does not
exist, running fetch will create it, but if it does exist it will not do
anything, which is a somewhat safe and minimal approach. Unfortunately,
for users who wish to NOT have refs/remotes/$remote/HEAD set for any
reason (e.g. so that `git rev-parse origin` doesn't accidentally point
them somewhere they do not want to), there is no way to remove this
behaviour. On the other side of the spectrum, users may want fetch to
automatically update HEAD or at least give them a warning if something
changed on the remote.

Introduce a new setting, remote.$remote.followRemoteHEAD with four
options:

    - "never": do not ever do anything, not even create
    - "create": the current behaviour, now the default behaviour
    - "warn": print a message if remote and local HEAD is different
    - "always": silently update HEAD on every change

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:55:17 +09:00
2037ca85ad worktree: refactor repair_worktree_after_gitdir_move()
This refactors `repair_worktree_after_gitdir_move()` to use the new
`write_worktree_linking_files` function. It also preserves the
relativity of the linking files; e.g., if an existing worktree used
absolute paths then the repaired paths will be absolute (and visa-versa).
`repair_worktree_after_gitdir_move()` is used to repair both sets of
worktree linking files if the `.git` directory is moved during a
re-initialization using `git init`.

This also adds a test case for reinitializing a repository that has
relative worktrees.

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:18 +09:00
e6df1ee2c1 worktree: add relative cli/config options to repair command
This teaches the `worktree repair` command to respect the
`--[no-]relative-paths` CLI option and `worktree.useRelativePaths`
config setting. If an existing worktree with an absolute path is repaired
with `--relative-paths`, the links will be replaced with relative paths,
even if the original path was correct. This allows a user to covert
existing worktrees between absolute/relative as desired.

To simplify things, both linking files are written when one of the files
needs to be repaired. In some cases, this fixes the other file before it
is checked, in other cases this results in a correct file being written
with the same contents.

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:17 +09:00
298d2917e2 worktree: add relative cli/config options to move command
This teaches the `worktree move` command to respect the
`--[no-]relative-paths` CLI option and `worktree.useRelativePaths`
config setting. If an existing worktree is moved with `--relative-paths`
the new path will be relative (and visa-versa).

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:17 +09:00
b7016344f1 worktree: add relative cli/config options to add command
This introduces the `--[no-]relative-paths` CLI option and
`worktree.useRelativePaths` configuration setting to the `worktree add`
command. When enabled these options allow worktrees to be linked using
relative paths, enhancing portability across environments where absolute
paths may differ (e.g., containerized setups, shared network drives).
Git still creates absolute paths by default, but these options allow
users to opt-in to relative paths if desired.

The t2408 test file is removed and more comprehensive tests are
written for the various worktree operations in their own files.

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:17 +09:00
4dac9e3c01 worktree: add write_worktree_linking_files() function
A new helper function, `write_worktree_linking_files()`, centralizes
the logic for computing and writing either relative or absolute
paths, based on the provided configuration. This function accepts
`strbuf` pointers to both the worktree’s `.git` link and the
repository’s `gitdir`, and then writes the appropriate path to each.
The `relativeWorktrees` extension is automatically set when a worktree
is linked with relative paths.

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:17 +09:00
5976310916 worktree: refactor infer_backlink return
The previous round[1] was merged a bit early before reviewer feedback
could be applied. This correctly indents a code block and updates the
`infer_backlink` function to return `-1` on failure and strbuf.len on
success.

[1]: https://lore.kernel.org/git/20241007-wt_relative_paths-v3-0-622cf18c45eb@pm.me

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:16 +09:00
1860ba1a2a worktree: add relativeWorktrees extension
A new extension, `relativeWorktrees`, is added to indicate that at least
one worktree in the repository has been linked with relative paths.
This ensures older Git versions do not attempt to automatically prune
worktrees with relative paths, as they would not not recognize the
paths as being valid.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:16 +09:00
d897f2c16d setup: correctly reinitialize repository version
When reinitializing a repository, Git does not account for extensions
other than `objectformat` and `refstorage` when determining the
repository version. This can lead to a repository being downgraded to
version 0 if extensions are set, causing Git future operations to fail.

This patch teaches Git to check if other extensions are defined in the
config to ensure that the repository version is set correctly.

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02 09:36:16 +09:00
168ebb7159 CodingGuidelines: a handful of error message guidelines
It is more efficient to have something in the coding guidelines
document to point at, when we want to review and comment on a new
message in the codebase to make sure it "fits" in the set of
existing messages.

Let's write down established best practice we are aware of.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-29 10:36:06 +09:00
b952bd0c2e Merge branch 'sv-20231026' of https://github.com/nafmo/gitk-l10n-sv
* 'sv-20231026' of https://github.com/nafmo/gitk-l10n-sv:
  gitk: sv.po: Update Swedish translation (323t)
2024-11-28 21:36:58 +01:00
baa159137b transport: propagate fsck configuration during bundle fetch
When fetching directly from a bundle, fsck message severity
configuration is not propagated to the underlying git-index-pack(1). It
is only capable of enabling or disabling fsck checks entirely. This does
not align with the fsck behavior for fetches through git-fetch-pack(1).

Use the fsck config parsing from fetch-pack to populate fsck message
severity configuration and wire it through to `unbundle()` to enable the
same fsck verification as done through fetch-pack.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-28 12:07:58 +09:00
05596e93c5 fetch-pack: split out fsck config parsing
When `fetch_pack_config()` is invoked, fetch-pack configuration is
parsed from the config. As part of this operation, fsck message severity
configuration is assigned to the `fsck_msg_types` global variable. This
is optionally used to configure the downstream git-index-pack(1) when
the `--strict` option is specified.

The same parsed fsck message severity configuration is also needed
outside of fetch-pack. Instead of exposing/relying on the existing
global state, split out the fsck config parsing logic into
`fetch_pack_fsck_config()` and expose it. In a subsequent commit, this
is used to provide fsck configuration when invoking `unbundle()`.

For `fetch_pack_fsck_config()` to discern between errors and unhandled
config variables, the return code when `git_config_path()` errors is
changed to a different value also indicating success. This frees up the
previous return code to now indicate the provided config variable
was unhandled. The behavior remains functionally the same.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-28 12:07:58 +09:00
187574ce86 bundle: support fsck message configuration
If the `VERIFY_BUNDLE_FLAG` is set during `unbundle()`, the
git-index-pack(1) spawned is configured with the `--fsck-options` flag
to perform fsck verification. With this flag enabled, there is not a way
to configure fsck message severity though.

Extend the `unbundle_opts` type to store fsck message severity
configuration and update `unbundle()` to conditionally append it to the
`--fsck-objects` flag if provided. This enables `unbundle()` call sites
to support optionally setting the severity for specific fsck messages.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-28 12:07:58 +09:00
87c01003cd bundle: add bundle verification options type
When `unbundle()` is invoked, fsck verification may be configured by
passing the `VERIFY_BUNDLE_FSCK` flag. This mechanism allows fsck checks
on the bundle to be enabled or disabled entirely. To facilitate more
fine-grained fsck configuration, additional context must be provided to
`unbundle()`.

Introduce the `unbundle_opts` type, which wraps the existing
`verify_bundle_flags`, to facilitate future extension of `unbundle()`
configuration. Also update `unbundle()` and its call sites to accept
this new options type instead of the flags directly. The end behavior is
functionally the same, but allows for the set of configurable options to
be extended. This is leveraged in a subsequent commit to enable fsck
message severity configuration.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-28 12:07:57 +09:00
761e62a09a Merge branch 'bf/set-head-symref' into bf/fetch-set-head-config
* bf/set-head-symref:
  fetch set_head: handle mirrored bare repositories
  fetch: set remote/HEAD if it does not exist
  refs: add create_only option to refs_update_symref_extended
  refs: add TRANSACTION_CREATE_EXISTS error
  remote set-head: better output for --auto
  remote set-head: refactor for readability
  refs: atomically record overwritten ref in update_symref
  refs: standardize output of refs_read_symbolic_ref
  t/t5505-remote: test failure of set-head
  t/t5505-remote: set default branch to main
2024-11-27 22:49:05 +09:00
cc01bad4a9 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-27 07:57:10 +09:00
4a611ee7eb Merge branch 'kn/ref-transaction-hook-with-reflog'
The ref-transaction hook triggered for reflog updates, which has
been corrected.

* kn/ref-transaction-hook-with-reflog:
  refs: don't invoke reference-transaction hook for reflogs
2024-11-27 07:57:10 +09:00
1f3d9b9814 Merge branch 'jt/index-pack-allow-promisor-only-while-fetching'
We now ensure "index-pack" is used with the "--promisor" option
only during a "git fetch".

* jt/index-pack-allow-promisor-only-while-fetching:
  index-pack: teach --promisor to forbid pack name
2024-11-27 07:57:09 +09:00
8eaa06590f Merge branch 'en/fast-import-avoid-self-replace'
"git fast-import" can be tricked into a replace ref that maps an
object to itself, which is a useless thing to do.

* en/fast-import-avoid-self-replace:
  fast-import: avoid making replace refs point to themselves
2024-11-27 07:57:08 +09:00
89ceab7b4c Merge branch 'kh/trailer-in-glossary'
Doc updates.

* kh/trailer-in-glossary:
  Documentation/glossary: describe "trailer"
2024-11-27 07:57:07 +09:00
f670d811e2 Merge branch 'jk/gcc15'
GCC 15 compatibility updates.

* jk/gcc15:
  object-file: inline empty tree and blob literals
  object-file: treat cached_object values as const
  object-file: drop oid field from find_cached_object() return value
  object-file: move empty_tree struct into find_cached_object()
  object-file: drop confusing oid initializer of empty_tree struct
  object-file: prefer array-of-bytes initializer for hash literals
2024-11-27 07:57:06 +09:00
93905d3b70 Merge branch 'bc/c23'
C23 compatibility updates.

* bc/c23:
  reflog: rename unreachable
  index-pack: rename struct thread_local
2024-11-27 07:57:05 +09:00
87fc668ce5 Merge branch 'ps/clar-build-improvement'
Fix for clar unit tests to support CMake build.

* ps/clar-build-improvement:
  Makefile: let clar header targets depend on their scripts
  cmake: use verbatim arguments when invoking clar commands
  cmake: use SH_EXE to execute clar scripts
  t/unit-tests: convert "clar-generate.awk" into a shell script
2024-11-27 07:57:04 +09:00
c515230dcf Merge branch 'kh/bundle-docs'
Documentation for "git bundle" saw improvements to more prominently
call out the use of '--all' when creating bundles.

* kh/bundle-docs:
  Documentation/git-bundle.txt: discuss naïve backups
  Documentation/git-bundle.txt: mention --all in spec. refs
  Documentation/git-bundle.txt: remove old `--all` example
  Documentation/git-bundle.txt: mention full backup example
2024-11-27 07:57:03 +09:00
e1fbebe347 Git 2.47.2
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:02 +01:00
3fad508c3f Sync with 2.46.3
* maint-2.46:
  Git 2.46.3
  Git 2.45.3
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:15:02 +01:00
5c21db3a0d Git 2.46.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:01 +01:00
67809f7c4c Sync with 2.45.3
* maint-2.45:
  Git 2.45.3
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:15:01 +01:00
2f323bb162 Git 2.44.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:00 +01:00
fc16eb306c Git 2.45.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:00 +01:00
99cb64c31a Sync with 2.44.3
* maint-2.44:
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:15:00 +01:00
14799610a8 Sync with 2.43.6
* maint-2.43:
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:59 +01:00
664d4fa692 Git 2.43.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:59 +01:00
c39c2d29e6 Sync with 2.42.4
* maint-2.42:
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:59 +01:00
54ddf17f82 Git 2.42.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:58 +01:00
102e0e6daa Sync with 2.41.3
* maint-2.41:
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:58 +01:00
6fd641a521 Git 2.41.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:58 +01:00
676cddebf9 Sync with 2.40.4
* maint-2.40:
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:57 +01:00
54a3711a9d Git 2.40.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:57 +01:00
08756131a3 Merge branch 'disallow-control-characters-in-credential-urls-by-default'
This addresses two vulnerabilities:

- CVE-2024-50349:

	Printing unsanitized URLs when asking for credentials made the
	user susceptible to crafted URLs (e.g. in recursive clones) that
	mislead the user into typing in passwords for trusted sites that
	would then be sent to untrusted sites instead.

- CVE-2024-52006

	Git may pass on Carriage Returns via the credential protocol to
	credential helpers which use line-reading functions that
	interpret said Carriage Returns as line endings, even though Git
	did not intend that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:45 +01:00
b6318cf23a ref-cache: fix invalid free operation in free_ref_entry
In cfd971520e (refs: keep track of unresolved reference value in
iterators, 2024-08-09), we added a new field "referent" into the "struct
ref" structure. In order to free the "referent", we unconditionally
freed the "referent" by simply adding a "free" statement.

However, this is a bad usage. Because when ref entry is either directory
or loose ref, we will always execute the following statement:

  free(entry->u.value.referent);

This does not make sense. We should never access the "entry->u.value"
field when "entry" is a directory. However, the change obviously doesn't
break the tests. Let's analysis why.

The anonymous union in the "ref_entry" has two members: one is "struct
ref_value", another is "struct ref_dir". On a 64-bit machine, the size
of "struct ref_dir" is 32 bytes, which is smaller than the 48-byte size
of "struct ref_value". And the offset of "referent" field in "struct
ref_value" is 40 bytes. So, whenever we create a new "ref_entry" for a
directory, we will leave the offset from 40 bytes to 48 bytes untouched,
which means the value for this memory is zero (NULL). It's OK to free a
NULL pointer, but this is merely a coincidence of memory layout.

To fix this issue, we now ensure that "free(entry->u.value.referent)" is
only called when "entry->flag" indicates that it represents a loose
reference and not a directory to avoid the invalid memory operation.

Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-27 04:34:37 +09:00
b01b9b81d3 credential: disallow Carriage Returns in the protocol by default
While Git has documented that the credential protocol is line-based,
with newlines as terminators, the exact shape of a newline has not been
documented.

From Git's perspective, which is firmly rooted in the Linux ecosystem,
it is clear that "a newline" means a Line Feed character.

However, even Git's credential protocol respects Windows line endings
(a Carriage Return character followed by a Line Feed character, "CR/LF")
by virtue of using `strbuf_getline()`.

There is a third category of line endings that has been used originally
by MacOS, and that is respected by the default line readers of .NET and
node.js: bare Carriage Returns.

Git cannot handle those, and what is worse: Git's remedy against
CVE-2020-5260 does not catch when credential helpers are used that
interpret bare Carriage Returns as newlines.

Git Credential Manager addressed this as CVE-2024-50338, but other
credential helpers may still be vulnerable. So let's not only disallow
Line Feed characters as part of the values in the credential protocol,
but also disallow Carriage Return characters.

In the unlikely event that a credential helper relies on Carriage
Returns in the protocol, introduce an escape hatch via the
`credential.protectProtocol` config setting.

This addresses CVE-2024-52006.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 20:24:04 +01:00
7725b8100f credential: sanitize the user prompt
When asking the user interactively for credentials, we want to avoid
misleading them e.g. via control sequences that pretend that the URL
targets a trusted host when it does not.

While Git learned, over the course of the preceding commits, to disallow
URLs containing URL-encoded control characters by default, credential
helpers are still allowed to specify values very freely (apart from Line
Feed and NUL characters, anything is allowed), and this would allow,
say, a username containing control characters to be specified that would
then be displayed in the interactive terminal prompt asking the user for
the password, potentially sending those control characters directly to
the terminal. This is undesirable because control characters can be used
to mislead users to divulge secret information to untrusted sites.

To prevent such an attack vector, let's add a `git_prompt()` that forces
the displayed text to be sanitized, i.e. displaying question marks
instead of control characters.

Note: While this commit's diff changes a lot of `user@host` strings to
`user%40host`, which may look suspicious on the surface, there is a good
reason for that: this string specifies a user name, not a
<username>@<hostname> combination! In the context of t5541, the actual
combination looks like this: `user%40@127.0.0.1:5541`. Therefore, these
string replacements document a net improvement introduced by this
commit, as `user@host@127.0.0.1` could have left readers wondering where
the user name ends and where the host name begins.

Hinted-at-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 20:24:04 +01:00
c903985bf7 credential_format(): also encode <host>[:<port>]
An upcoming change wants to sanitize the credential password prompt
where a URL is displayed that may potentially come from a `.gitmodules`
file. To this end, the `credential_format()` function is employed.

To sanitize the host name (and optional port) part of the URL, we need a
new mode of the `strbuf_add_percentencode()` function because the
current mode is both too strict and too lenient: too strict because it
encodes `:`, `[` and `]` (which should be left unencoded in
`<host>:<port>` and in IPv6 addresses), and too lenient because it does
not encode invalid host name characters `/`, `_` and `~`.

So let's introduce and use a new mode specifically to encode the host
name and optional port part of a URI, leaving alpha-numerical
characters, periods, colons and brackets alone and encoding all others.

This only leads to a change of behavior for URLs that contain invalid
host names.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 20:24:00 +01:00
7cf65e2660 refs/reftable: reuse iterators when reading refs
When reading references the reftable backend has to:

  1. Create a new ref iterator.

  2. Seek the iterator to the record we're searching for.

  3. Read the record.

We cannot really avoid the last two steps, but re-creating the iterator
every single time we want to read a reference is kind of expensive and a
waste of resources. We couldn't help it in the past though because it
was not possible to reuse iterators. But starting with 5bf96e0c39
(reftable/generic: move seeking of records into the iterator,
2024-05-13) we have split up the iterator lifecycle such that creating
the iterator and seeking are two different concerns.

Refactor the code such that we cache iterators in the reftable backend.
This cache is invalidated whenever the respective stack is reloaded such
that we know to recreate the iterator in that case. This leads to a
sizeable speedup when creating many refs, which requires a lot of random
reference reads:

    Benchmark 1: update-ref: create many refs (refcount = 100000, revision = master)
      Time (mean ± σ):      1.793 s ±  0.010 s    [User: 0.954 s, System: 0.835 s]
      Range (min … max):    1.781 s …  1.811 s    10 runs

    Benchmark 2: update-ref: create many refs (refcount = 100000, revision = HEAD)
      Time (mean ± σ):      1.680 s ±  0.013 s    [User: 0.846 s, System: 0.831 s]
      Range (min … max):    1.664 s …  1.702 s    10 runs

    Summary
      update-ref: create many refs (refcount = 100000, revision = HEAD) ran
        1.07 ± 0.01 times faster than update-ref: create many refs (refcount = 100000, revision = master)

While 7% is not a huge win, you have to consider that the benchmark is
_writing_ data, so _reading_ references is only one part of what we do.
Flame graphs show that we spend around 40% of our time reading refs, so
the speedup when reading refs is approximately ~2.5x that. I could not
find better benchmarks where we perform a lot of random ref reads.

You can also see a sizeable impact on memory usage when creating 100k
references. Before this change:

    HEAP SUMMARY:
        in use at exit: 19,112,538 bytes in 200,170 blocks
      total heap usage: 8,400,426 allocs, 8,200,256 frees, 454,367,048 bytes allocated

After this change:

    HEAP SUMMARY:
        in use at exit: 674,416 bytes in 169 blocks
      total heap usage: 7,929,872 allocs, 7,929,703 frees, 281,509,985 bytes allocated

As an additional factor, this refactoring opens up the possibility for
more performance optimizations in how we re-seek iterators. Any change
that allows us to optimize re-seeking by e.g. reusing data structures
would thus also directly speed up random reads.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:38 +09:00
9d471b9dfe reftable/merged: drain priority queue on reseek
In 5bf96e0c39 (reftable/generic: move seeking of records into the
iterator, 2024-05-13) we have refactored the reftable codebase such that
iterators can be initialized once and then re-seeked multiple times.
This feature is used by 1869525066 (refs/reftable: wire up support for
exclude patterns, 2024-09-16) in order to skip records based on exclude
patterns provided by the caller.

The logic to re-seek the merged iterator is insufficient though because
we don't drain the priority queue on a re-seek. This means that the
queue may contain stale entries and thus reading the next record in the
queue will return the wrong entry. While this is an obvious bug, it is
harmless in the context of above exclude patterns:

  - If the queue contained stale entries that match the pattern then the
    caller would already know to filter out such refs. This is because
    our codebase is prepared to handle backends that don't have a way to
    efficiently implement exclude patterns.

  - If the queue contained stale entries that don't match the pattern
    we'd eventually filter out any duplicates. This is because the
    reftable code discards items with the same ref name and sorts any
    remaining entries properly.

So things happen to work in this context regardless of the bug, and
there is no other use case yet where we re-seek iterators. We're about
to introduce a caching mechanism though where iterators are reused by
the reftable backend, and that will expose the bug.

Fix the issue by draining the priority queue when seeking and add a
testcase that surfaces the issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:38 +09:00
eb22c1b46b reftable/stack: add mechanism to notify callers on reload
Reftable stacks are reloaded in two cases:

  - When calling `reftable_stack_reload()`, if the stat-cache tells us
    that the stack has been modified.

  - When committing a reftable addition.

While callers can figure out the second case, they do not have a
mechanism to figure out whether `reftable_stack_reload()` led to an
actual reload of the on-disk data. All they can do is thus to assume
that data is always being reloaded in that case.

Improve the situation by introducing a new `on_reload()` callback to the
reftable options. If provided, the function will be invoked every time
the stack has indeed been reloaded. This allows callers to invalidate
data that depends on the current stack data.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:38 +09:00
96e7cb83b6 refs/reftable: refactor reflog expiry to use reftable backend
Refactor the callback function that expires reflog entries in the
reftable backend to use `reftable_backend_read_ref()` instead of
accessing the reftable stack directly. This ensures that the function
will benefit from the new caching layer that we're about to introduce.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
ad6c41f4b7 refs/reftable: refactor reading symbolic refs to use reftable backend
Refactor the callback function that reads symbolic references in the
reftable backend to use `reftable_backend_read_ref()` instead of
accessing the reftable stack directly. This ensures that the function
will benefit from the new caching layer that we're about to introduce.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
27fdf8f4ed refs/reftable: read references via struct reftable_backend
Refactor `read_ref_without_reload()` to accept `struct reftable_backend`
as parameter instead of `struct reftable_stack`. Rename the function to
`reftable_backend_read_ref()` to clarify its scope and move it close to
other functions operating on `struct reftable_backend`.

This change allows us to implement an additional caching layer when
reading refs where we can reuse reftable iterators.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
3ec8022bb0 refs/reftable: figure out hash via reftable_stack
The function `read_ref_without_reload()` accepts a ref store as input
only so that we can figure out the hash function used by it. This is
duplicate information though because the reftable stack knows about its
hash function, too.

Drop the superfluous parameter to simplify the calling convention a bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
c9f76fc7d1 reftable/stack: add accessor for the hash ID
Add an accessor function that allows callers to access the hash ID of a
reftable stack. This function will be used in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:36 +09:00
46b5f67019 refs/reftable: handle reloading stacks in the reftable backend
When accessing a stack we almost always have to reload the stack before
reading data from it. This is mostly because Git does not have a
notification mechanism for when underlying data has been changed, and
thus we are forced to opportunistically reload the stack every single
time to account for any changes that may have happened concurrently.

Handle the reload internally in `backend_for()`. For one this forces
callsites to think about whether or not they need to reload the stack.
But second this makes the logic to access stacks more self-contained by
letting the `struct reftable_backend` manage themselves.

Update callsites where we don't reload the stack to document why we
don't. In some cases it's unclear whether it is the right thing to do in
the first place, but fixing that is outside of the scope of this patch
series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:36 +09:00
ad0986c676 refs/reftable: encapsulate reftable stack
The reftable ref store needs to keep track of multiple stacks, one for
the main worktree and an arbitrary number of stacks for worktrees. This
is done by storing pointers to `struct reftable_stack`, which we then
access directly.

Wrap the stack in a new `struct reftable_backend`. This will allow us to
attach more data to each respective stack in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:36 +09:00
6f33d8e255 builtin: pass repository to sub commands
In 9b1cb5070f (builtin: add a repository parameter for builtin
functions, 2024-09-13) the repository was passed down to all builtin
commands. This allowed the repository to be passed down to lower layers
without depending on the global `the_repository` variable.

Continue this work by also passing down the repository parameter from
the command to sub-commands. This will help pass down the repository to
other subsystems and cleanup usage of global variables like
'the_repository' and 'the_hash_algo'.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:36:08 +09:00
4a2790a257 fast-import: disallow "." and ".." path components
If a user specified e.g.
   M 100644 :1 ../some-file
then fast-import previously would happily create a git history where
there is a tree in the top-level directory named "..", and with a file
inside that directory named "some-file".  The top-level ".." directory
causes problems.  While git checkout will die with errors and fsck will
report hasDotdot problems, the user is going to have problems trying to
remove the problematic file.  Simply avoid creating this bad history in
the first place.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:30:04 +09:00
5f9f7fafb7 bisect: address Coverity warning about potential double free
Coverity has started to warn about a potential double-free in
`find_bisection()`. This warning is triggered because we may modify the
list head of the passed-in `commit_list` in case it is an UNINTERESTING
commit, but still call `free_commit_list()` on the original variable
that points to the now-freed head in case where `do_find_bisection()`
returns a `NULL` pointer.

As far as I can see, this double free cannot happen in practice, as
`do_find_bisection()` only returns a `NULL` pointer when it was passed a
`NULL` input. So in order to trigger the double free we would have to
call `find_bisection()` with a commit list that only consists of
UNINTERESTING commits, but I have not been able to construct a case
where that happens.

Drop the `else` branch entirely as it seems to be a no-op anyway.
Another option might be to instead call `free_commit_list()` on `list`,
which is the modified version of `commit_list` and thus wouldn't cause a
double free. But as mentioned, I couldn't come up with any case where a
passed-in non-NULL list becomes empty, so this shouldn't be necessary.
And if it ever does become necessary we'd notice anyway via the leak
sanitizer.

Interestingly enough we did not have a single test exercising this
branch: all tests pass just fine even when replacing it with a call to
`BUG()`. Add a test that exercises it.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:22:24 +09:00
c6c977e82b Merge branch 'ps/leakfixes-part-10' into ps/bisect-double-free-fix
* ps/leakfixes-part-10: (27 commits)
  t: remove TEST_PASSES_SANITIZE_LEAK annotations
  test-lib: unconditionally enable leak checking
  t: remove unneeded !SANITIZE_LEAK prerequisites
  t: mark some tests as leak free
  t5601: work around leak sanitizer issue
  git-compat-util: drop now-unused `UNLEAK()` macro
  global: drop `UNLEAK()` annotation
  t/helper: fix leaking commit graph in "read-graph" subcommand
  builtin/branch: fix leaking sorting options
  builtin/init-db: fix leaking directory paths
  builtin/help: fix leaks in `check_git_cmd()`
  help: fix leaking return value from `help_unknown_cmd()`
  help: fix leaking `struct cmdnames`
  help: refactor to not use globals for reading config
  builtin/sparse-checkout: fix leaking sanitized patterns
  split-index: fix memory leak in `move_cache_to_base_index()`
  git: refactor builtin handling to use a `struct strvec`
  git: refactor alias handling to use a `struct strvec`
  strvec: introduce new `strvec_splice()` function
  line-log: fix leak when rewriting commit parents
  ...
2024-11-26 10:21:58 +09:00
7e2f377b03 sequencer: comment commit messages properly
The rebase todo editor has commands like `fixup -c` which affects
the commit messages of the rebased commits.[1]  For example:

    pick hash1 <msg>
    fixup hash2 <msg>
    fixup -c hash3 <msg>

This says that hash2 and hash3 should be squashed into hash1 and
that hash3’s commit message should be used for the resulting commit.
So the user is presented with an editor where the two first commit
messages are commented out and the third is not.  However this does
not work if `core.commentChar`/`core.commentString` is in use since
the comment char is hardcoded (#) in this `sequencer.c` function.
As a result the first commit message will not be commented out.

† 1: See 9e3cebd97c (rebase -i: add fixup [-C | -c] command,
    2021-01-29)

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Co-authored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:05:08 +09:00
515d034f8d sequencer: comment --reference subject line properly
`git revert --reference <commit>` leaves behind a comment in the
first line:[1]

    # *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***

Meaning that the commit will just consist of the next line if the user
exits the editor directly:

    This reverts commit <--format=reference commit>

But the comment char here is hardcoded (#).  Which means that the
comment line will inadvertently be included in the commit message if
`core.commentChar`/`core.commentString` is in use.

† 1: See 43966ab315 (revert: optionally refer to commit in the
    "reference" format, 2022-05-26)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:05:08 +09:00
94304b9f48 sequencer: comment checked-out branch properly
`git rebase --update-ref` does not insert commands for dependent/sub-
branches which are checked out.[1]  Instead it leaves a comment about
that fact.  The comment char is hardcoded (#).  In turn the comment
line gets interpreted as an invalid command when `core.commentChar`/
`core.commentString` is in use.

† 1: See 900b50c242 (rebase: add --update-refs option, 2022-07-19)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:05:08 +09:00
ef46ad0815 reftable: rename scratch buffer
Both `struct block_writer` and `struct reftable_writer` have a `buf`
member that is being reused to optimize the number of allocations.
Rename the variable to `scratch` to clarify its intend and provide a
comment explaining why it exists.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 08:39:38 +09:00
0f5762b043 refs: adapt initial_transaction flag to be unsigned
The `initial_transaction` flag is tracked as a signed integer, but we
typically pass around flags via unsigned integers. Adapt the type
accordingly.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 08:39:38 +09:00
ba874d1dac t7900: fix host-dependent behaviour when testing git-maintenance(1)
We have recently added a new test to t7900 that exercises whether
git-maintenance(1) fails as expected when the "schedule.lock" file
exists. The test depends on whether or not the host has the required
executables present to schedule maintenance tasks in the first place,
like systemd or launchctl -- if not, the test fails with an unrelated
error before even checking for the lock file. This fails for example in
our CI systems, where macOS images do not have launchctl available.

Fix this issue by creating a stub systemctl(1) binary and using the
systemd scheduler.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 15:22:04 +09:00
1bc1e94091 doc: option value may be separate for valid reasons
Even though `git help cli` recommends users to prefer using
"--option=value" over "--option value", there can be reasons why
giving them separately is a good idea.  One reason is that shells do
not perform tilde expansion for `--option=~/path/name` but they
expand `--options ~/path/name` just fine.

This is not a problem for many options whose option parsing is
properly written using OPT_FILENAME(), because the value given to
OPT_FILENAME() is tilde-expanded internally by us, but some commands
take a pathname as a mere string, which needs this trick to have the
shell help us.

I think the reason we originally decided to recommend the stuck form
was because an option that takes an optional value requires you to
use it in the stuck form, and it is one less thing for users to
worry about if they get into the habit to always use the stuck form.
But we should be discouraging ourselves from adding an option with
an optional value in the first place, and we might want to weaken
the current recommendation.

In any case, let's describe this one case where it is necessary to
use the separate form, with an example.

Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 14:20:15 +09:00
6ea2d9d271 Sync with Git 2.47.1
* maint:
  Git 2.47.1
  Makefile(s): avoid recipe prefix in conditional statements
  doc: switch links to https
  doc: update links to current pages
2024-11-25 12:33:36 +09:00
92999a42db Git 2.47.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 12:32:21 +09:00
b3ba1efa50 Merge branch 'ak/typofixes' into maint-2.47
Typofixes.

* ak/typofixes:
  t: fix typos
  t/helper: fix a typo
  t/perf: fix typos
  t/unit-tests: fix typos
  contrib: fix typos
  compat: fix typos
2024-11-25 12:29:48 +09:00
00c388f487 Merge branch 'xx/protocol-v2-doc-markup-fix' into maint-2.47
Docfix.

* xx/protocol-v2-doc-markup-fix:
  Documentation/gitprotocol-v2.txt: fix a slight inconsistency in format
2024-11-25 12:29:47 +09:00
3357b3d88d Merge branch 'tc/bundle-uri-leakfix' into maint-2.47
Leakfix.

* tc/bundle-uri-leakfix:
  bundle-uri: plug leak in unbundle_from_file()
2024-11-25 12:29:46 +09:00
058c36aa26 Merge branch 'kh/checkout-ignore-other-docfix' into maint-2.47
Doc updates.

* kh/checkout-ignore-other-docfix:
  checkout: refer to other-worktree branch, not ref
2024-11-25 12:29:45 +09:00
fd78021b91 Merge branch 'kh/merge-tree-doc' into maint-2.47
Docfix.
cf. <CABPp-BE=JfoZp19Va-1oF60ADBUibGDwDkFX-Zytx7A3uJ__gg@mail.gmail.com>

* kh/merge-tree-doc:
  doc: merge-tree: improve example script
2024-11-25 12:29:44 +09:00
bd8a8a71dc Merge branch 'kn/loose-object-layer-wo-global-hash' into maint-2.47
Code clean-up.

* kn/loose-object-layer-wo-global-hash:
  loose: don't rely on repository global state
2024-11-25 12:29:43 +09:00
5f380e4017 Merge branch 'jc/doc-refspec-syntax' into maint-2.47
Doc updates.

* jc/doc-refspec-syntax:
  doc: clarify <src> in refspec syntax
2024-11-25 12:29:42 +09:00
f675674ced Merge branch 'js/doc-platform-support-link-fix' into maint-2.47
Docfix.

* js/doc-platform-support-link-fix:
  docs: fix the `maintain-git` links in `technical/platform-support`
2024-11-25 12:29:41 +09:00
e52276d340 Merge branch 'jh/config-unset-doc-fix' into maint-2.47
Docfix.

* jh/config-unset-doc-fix:
  git-config.1: remove value from positional args in unset usage
2024-11-25 12:29:40 +09:00
6b03fd8dcd Merge branch 'jk/output-prefix-cleanup' into maint-2.47
Code clean-up.

* jk/output-prefix-cleanup:
  diff: store graph prefix buf in git_graph struct
  diff: return line_prefix directly when possible
  diff: return const char from output_prefix callback
  diff: drop line_prefix_length field
  line-log: use diff_line_prefix() instead of custom helper
2024-11-25 12:29:39 +09:00
304e77d2f8 Merge branch 'sk/doc-maintenance-schedule' into maint-2.47
Doc update to clarify how periodical maintenance are scheduled,
spread across time to avoid thundering hurds.

* sk/doc-maintenance-schedule:
  doc: add a note about staggering of maintenance
2024-11-25 12:29:38 +09:00
2a18f26d77 Merge branch 'tb/notes-amlog-doc' into maint-2.47
Document "amlog" notes.

* tb/notes-amlog-doc:
  Documentation: mention the amlog in howto/maintain-git.txt
2024-11-25 12:29:37 +09:00
98c839d58f Merge branch 'master' of https://github.com/j6t/gitk into maint-2.47
* 'master' of https://github.com/j6t/gitk:
  Makefile(s): avoid recipe prefix in conditional statements
  doc: switch links to https
  doc: update links to current pages
2024-11-25 12:20:42 +09:00
dbaece3526 git-difftool--helper.sh: exit upon initialize_merge_tool errors
Since the introduction of 'initialize_merge_tool' in de8dafbada
(mergetool: break setup_tool out into separate initialization function,
2021-02-09), any errors from this function are ignored in
git-difftool--helper.sh::launch_merge_tool, which is not the case for
its call in git-mergetool.sh::merge_file.

Despite the in-code comment, initialize_merge_tool (via its call to
setup_tool) does different checks than run_merge_tool, so it makes sense
to abort early if it encounters errors. Add exit calls if
initialize_merge_tool fails.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:59:19 +09:00
acca46d124 git-mergetool--lib.sh: add error message for unknown tool variant
In setup_tool, we check if the given tool is a known variant of a tool,
and quietly return with an error if not. This leads to the following
invocation quietly failing:

	git mergetool --tool=vimdiff4

Add an error message before returning in this case.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:59:19 +09:00
bba503d43e git-mergetool--lib.sh: add error message if 'setup_user_tool' fails
In git-mergetool--lib.sh::setup_tool, we check if the given tool is a
known builtin tool, a known variant, or a user-defined tool by calling
setup_user_tool, and we return with the exit code from setup_user_tool
if it was called. setup_user_tool checks if {diff,merge}tool.$tool.cmd
is set and quietly returns with an error if not.

This leads to the following invocation quietly failing:

	git mergetool --tool=unknown

which is not very user-friendly. Adjust setup_tool to output an error
message before returning if setup_user_tool returned with an error.

Note that we do not check the result of the second call to
setup_user_tool in setup_tool, as this call is only meant to allow users
to redefine 'cmd' for a builtin tool; it is not an error if they have
not done so.

Note that this behaviour of quietly failing is a regression dating back
to de8dafbada (mergetool: break setup_tool out into separate
initialization function, 2021-02-09), as before this commit an unknown
mergetool would be diagnosed in get_merge_tool_path when called from
run_merge_tool.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:59:19 +09:00
00536761df git-mergetool--lib.sh: use TOOL_MODE when erroring about unknown tool
In git-mergetool--lib.sh::get_merge_tool_path, we check if the chosen
tool is valid via valid_tool and exit with an error message if not. This
error message mentions "Unknown merge tool", even if the command the
user tried was 'git difftool --tool=unknown'. Use the global 'TOOL_MODE'
variable for a more correct error message.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:59:19 +09:00
fe99a52225 completion: complete '--tool-help' in 'git mergetool'
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:59:18 +09:00
b1b713f722 fetch set_head: handle mirrored bare repositories
When adding a remote to bare repository with "git remote add --mirror",
running fetch will fail to update HEAD to the remote's HEAD, since it
does not know how to handle bare repositories. On the other hand HEAD
already has content, since "git init --bare" has already set HEAD to
whatever is the default branch set for the user. Unless this - by chance
- is the same as the remote's HEAD, HEAD will be pointing to a bad
symref. Teach set_head to handle bare repositories, by overwriting HEAD
so it mirrors the remote's HEAD.

Note, that in this case overriding the local HEAD reference is
necessary, since HEAD will exist before fetch can be run, but this
should not be an issue, since the whole purpose of --mirror is to be an
exact mirror of the remote, so following any changes to HEAD makes
sense.

Also note, that although "git remote set-head" also fails when trying to
update the remote's locally tracked HEAD in a mirrored bare repository,
the usage of the command does not make much sense after this patch:
fetch will update the remote HEAD correctly, and setting it manually to
something else is antithetical to the concept of mirroring.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:37 +09:00
3f763ddf28 fetch: set remote/HEAD if it does not exist
When cloning a repository remote/HEAD is created, but when the user
creates a repository with git init, and later adds a remote, remote/HEAD
is only created if the user explicitly runs a variant of "remote
set-head". Attempt to set remote/HEAD during fetch, if the user does not
have it already set. Silently ignore any errors.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:37 +09:00
9963746c84 refs: add create_only option to refs_update_symref_extended
Allow the caller to specify that it only wants to update the symref if
it does not already exist. Silently ignore the error from the
transaction API if the symref already exists.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:36 +09:00
ed2f6f8804 refs: add TRANSACTION_CREATE_EXISTS error
Currently there is only one special error for transaction, for when
there is a naming conflict, all other errors are dumped under a generic
error. Add a new special error case for when the caller requests the
reference to be updated only when it does not yet exist and the
reference actually does exist.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:36 +09:00
dfe86fa06b remote set-head: better output for --auto
Currently, set-head --auto will print a message saying "remote/HEAD set
to branch", which implies something was changed.

Change the output of --auto, so the output actually reflects what was
done: a) set a previously unset HEAD, b) change HEAD because remote
changed or c) no updates. As edge cases, if HEAD is changed from
a previous symbolic reference that was not a remote branch, explicitly
call attention to this fact, and also notify the user if the previous
reference was not a symbolic reference.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:36 +09:00
4f07c45e25 remote set-head: refactor for readability
Make two different readability refactors:

Rename strbufs "buf" and "buf2" to something more explanatory.

Instead of calling get_main_ref_store(the_repository) multiple times,
call it once and store the result in a new refs variable. Although this
change probably offers some performance benefits, the main purpose is to
shorten the line lengths of function calls using this variable.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:35 +09:00
d842cd1301 refs: atomically record overwritten ref in update_symref
When updating a symref with update_symref it's currently not possible to
know for sure what was the previous value that was overwritten. Extend
refs_update_symref under a new function name, to record the value after
the ref has been locked if the caller of refs_update_symref_extended
requests it via a new variable in the function call. Make the return
value of the function notify the caller, if the previous value was
actually not a symbolic reference. Keep the original refs_update_symref
function with the same signature, but now as a wrapper around
refs_update_symref_extended.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:35 +09:00
8102d10ff8 refs: standardize output of refs_read_symbolic_ref
When the symbolic reference we want to read with refs_read_symbolic_ref
is actually not a symbolic reference, the files and the reftable
backends return different values (1 and -1 respectively). Standardize
the returned values so that 0 is success, -1 is a generic error and -2
is that the reference was actually non-symbolic.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:35 +09:00
2fd5555895 t/t5505-remote: test failure of set-head
The test coverage was missing a test for the failure branch of remote
set-head auto's output. Add the missing text and while we are at it,
correct a small grammatical mistake in the error's output ("setup" is
the noun, "set up" is the verb).

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:34 +09:00
54d820d7d4 t/t5505-remote: set default branch to main
Consider the bare repository called "mirror" in the test. Running `git
remote add --mirror -f origin ../one` will not change HEAD, consequently
if init.defaultBranch is not the same as what HEAD in the remote
("one"), HEAD in "mirror" will be pointing to a non-existent reference.
Hence if "mirror" is used as a remote by yet another repository,
ls-remote will not show HEAD. On the other hand, if init.defaultBranch
happens to match HEAD in "one", then ls-remote will show HEAD.

Since the "ci/run-build-and-tests.sh" script globally exports
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main for some (but not all) jobs,
there may be a drift in some tests between how the test repositories are
set up in the CI and during local testing, if the test itself uses
"master" as default instead of "main". In particular, this happens in
t5505-remote.sh. This issue does not manifest currently, as the test
does not do any remote HEAD manipulation where this would come up, but
should such things be added, a locally passing test would break the CI
and vice-versa.

Set GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main in t5505-remote to be
consistent with the CI.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:34 +09:00
2afd8996ae gitk: check main window visibility before waiting for it to show
If the main window is already visible when gitk waits for it to
become visible, gitk hangs forever.
This commit adds a check whether the window is already visible.
See https://wiki.tcl-lang.org/page/tkwait+visibility

Signed-off-by: Tobias Pietzsch <pietzsch@mycroft.speedport.ip>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-11-24 15:31:45 +01:00
c18400c6bb Makefile(s): avoid recipe prefix in conditional statements
In GNU Make commit 07fcee35 ([SV 64815] Recipe lines cannot contain
conditional statements, 2023-05-22) and following, conditional
statements may no longer be preceded by a tab character (which Make
refers to as the recipe prefix).

There are a handful of spots in our various Makefile(s) which will break
in a future release of Make containing 07fcee35. For instance, trying to
compile the pre-image of this patch with the tip of make.git results in
the following:

    $ make -v | head -1 && make
    GNU Make 4.4.90
    config.mak.uname:842: *** missing 'endif'.  Stop.

The kernel addressed this issue in 82175d1f9430 (kbuild: Replace tabs
with spaces when followed by conditionals, 2024-01-28). Address the
issues in Git's tree by applying the same strategy.

When a conditional word (ifeq, ifneq, ifdef, etc.) is preceded by one or
more tab characters, replace each tab character with 8 space characters
with the following:

    find . -type f -not -path './.git/*' -name Makefile -or -name '*.mak' |
      xargs perl -i -pe '
        s/(\t+)(ifn?eq|ifn?def|else|endif)/" " x (length($1) * 8) . $2/ge unless /\\$/
      '

The "unless /\\$/" removes any false-positives (like "\telse \"
appearing within a shell script as part of a recipe).

After doing so, Git compiles on newer versions of Make:

    $ make -v | head -1 && make
    GNU Make 4.4.90
    GIT_VERSION = 2.44.0.414.gfac1dc44ca9
    [...]

    $ echo $?
    0

Reported-by: Dario Gjorgjevski <dario.gjorgjevski@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: 728b9ac0c3
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-11-24 13:45:49 +01:00
ed87b13a50 doc: switch links to https
These sites offer https versions of their content.
Using the https versions provides some protection for users.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: d05b08cd52
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-11-24 13:44:39 +01:00
7539e569ef doc: update links to current pages
It's somewhat traditional to respect sites' self-identification.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picked-from: 65175d9ea2
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-11-24 13:43:45 +01:00
04eaff62f2 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-22 14:34:19 +09:00
0a83b39594 Merge branch 'tb/multi-pack-reuse-dupfix'
Object reuse code based on multi-pack-index sent an unwanted copy
of object.

* tb/multi-pack-reuse-dupfix:
  pack-objects: only perform verbatim reuse on the preferred pack
  t5332-multi-pack-reuse.sh: demonstrate duplicate packing failure
2024-11-22 14:34:19 +09:00
76bb16db5c Merge branch 'sm/difftool'
Use of some uninitialized variables in "git difftool" has been
corrected.

* sm/difftool:
  builtin/difftool: intialize some hashmap variables
2024-11-22 14:34:18 +09:00
aa1d4b42e5 Merge branch 'jk/fetch-prefetch-double-free-fix'
Double-free fix.

* jk/fetch-prefetch-double-free-fix:
  refspec: store raw refspecs inside refspec_item
  refspec: drop separate raw_nr count
  fetch: adjust refspec->raw_nr when filtering prefetch refspecs
2024-11-22 14:34:17 +09:00
0b9b6cda6e Merge branch 'jk/test-malloc-debug-check'
Avoid build/test breakage on a system without working malloc debug
support dynamic library.

* jk/test-malloc-debug-check:
  test-lib: move malloc-debug setup after $PATH setup
  test-lib: check malloc debug LD_PRELOAD before using
2024-11-22 14:34:16 +09:00
3f97f1bce6 t/perf: use 'test_file_size' in more places
The perf test suite prefers to use test_file_size over 'wc -c' when
inside of a test_size block. One advantage is that accidentally writign
"wc -c file" (instead of "wc -c <file") does not inadvertently break the
tests (since the former will include the filename in the output of wc).

Both of the two uses of test_size use "wc -c", but let's convert those
to the more conventional test_file_size helper instead.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-22 09:44:34 +09:00
91f88f76e6 pack-bitmap.c: typofix in find_boundary_objects()
In the boundary-based bitmap traversal, we use the given 'rev_info'
structure to first do a commit-only walk in order to determine the
boundary between interesting and uninteresting objects.

That walk only looks at commit objects, regardless of the state of
revs->blob_objects, revs->tree_objects, and so on. In order to do this,
we store the state of these variables in temporary fields before
setting them back to zero, performing the traversal, and then setting
them back.

But there is a typo here that dates back to b0afdce5da (pack-bitmap.c:
use commit boundary during bitmap traversal, 2023-05-08), where we
incorrectly store the value of the "tags" field as "revs->blob_objects".

This could lead to problems later on if, say, the caller wants tag
objects but *not* blob objects. In the pre-image behavior, we'd set
revs->tag_objects back to the old value of revs->blob_objects, thus
emitting fewer objects than expected back to the caller.

Fix that by correctly assigning the value of 'revs->tag_objects' to the
'tmp_tags' field.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-22 08:57:18 +09:00
fc1ddf42af t: remove TEST_PASSES_SANITIZE_LEAK annotations
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there
is no longer a need to have that variable declared in all of our tests.
Drop it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:48 +09:00
1fc7ddf35b test-lib: unconditionally enable leak checking
Over the last two releases we have plugged a couple hundred of memory
leaks exposed by the Git test suite. With the preceding commits we have
finally fixed the last leak exposed by our test suite, which means that
we are now basically leak free wherever we have branch coverage.

From hereon, the Git test suite should ideally stay free of memory
leaks. Most importantly, any test suite that is being added should
automatically be subject to the leak checker, and if that test does not
pass it is a strong signal that the added code introduced new memory
leaks and should not be accepted without further changes.

Drop the infrastructure around TEST_PASSES_SANITIZE_LEAK to reflect this
new requirement. Like this, all test suites will be subject to the leak
checker by default.

This is being intentionally strict, but we still have an escape hatch:
the SANITIZE_LEAK prerequisite. There is one known case in t5601 where
the leak sanitizer itself is buggy, so adding this prereq in such cases
is acceptable. Another acceptable situation is when a newly added test
uncovers preexisting memory leaks: when fixing that memory leak would be
sufficiently complicated it is fine to annotate and document the leak
accordingly. But in any case, the burden is now on the patch author to
explain why exactly they have to add the SANITIZE_LEAK prerequisite.

The TEST_PASSES_SANITIZE_LEAK annotations will be dropped in the next
patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:47 +09:00
0b7f0ce751 t: remove unneeded !SANITIZE_LEAK prerequisites
We have a couple of !SANITIZE_LEAK prerequisites for tests that used to
fail due to memory leaks. These have all been fixed by now, so let's
drop the prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:47 +09:00
33e782e959 t: mark some tests as leak free
Both t5558 and t5601 are leak-free starting with 6dab49b9fb (bundle-uri:
plug leak in unbundle_from_file(), 2024-10-10). Mark them accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:46 +09:00
8415595203 t5601: work around leak sanitizer issue
When running t5601 with the leak checker enabled we can see a hang in
our CI systems. This hang seems to be system-specific, as I cannot
reproduce it on my own machine.

As it turns out, the issue is in those testcases that exercise cloning
of `~repo`-style paths. All of the testcases that hang eventually end up
interpreting "repo" as the username and will call getpwnam(3p) with that
username. That should of course be fine, and getpwnam(3p) should just
return an error. But instead, the leak sanitizer seems to be recursing
while handling a call to `free()` in the NSS modules:

    #0  0x00007ffff7fd98d5 in _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:720
    #1  0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916
    #2  0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
    #3  0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27
    #4  0x00007ffff7a2b33a in __lsan::Deallocate (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127
    #5  __lsan::lsan_free (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220
    ...
    #261505 0x00007ffff7fd99f2 in free (ptr=<optimized out>) at ../include/rtld-malloc.h:50
    #261506 _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:822
    #261507 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916
    #261508 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
    #261509 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27
    #261510 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127
    #261511 __lsan::lsan_free (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220
    #261512 0x00007ffff793da25 in module_load (module=0x515000000280) at ./nss/nss_module.c:188
    #261513 0x00007ffff793dee5 in __nss_module_load (module=0x515000000280) at ./nss/nss_module.c:302
    #261514 __nss_module_get_function (module=0x515000000280, name=name@entry=0x7ffff79b9128 "getpwnam_r") at ./nss/nss_module.c:328
    #261515 0x00007ffff793e741 in __GI___nss_lookup_function (fct_name=<optimized out>, ni=<optimized out>) at ./nss/nsswitch.c:137
    #261516 __GI___nss_next2 (ni=ni@entry=0x7fffffffa458, fct_name=fct_name@entry=0x7ffff79b9128 "getpwnam_r", fct2_name=fct2_name@entry=0x0, fctp=fctp@entry=0x7fffffffa460,
        status=status@entry=0, all_values=all_values@entry=0) at ./nss/nsswitch.c:120
    #261517 0x00007ffff794c6a7 in __getpwnam_r (name=name@entry=0x501000000060 "repo", resbuf=resbuf@entry=0x7ffff79fb320 <resbuf>, buffer=<optimized out>,
        buflen=buflen@entry=1024, result=result@entry=0x7fffffffa4b0) at ../nss/getXXbyYY_r.c:343
    #261518 0x00007ffff794c4d8 in getpwnam (name=0x501000000060 "repo") at ../nss/getXXbyYY.c:140
    #261519 0x00005555557e37ff in getpw_str (username=0x5020000001a1 "repo", len=4) at path.c:613
    #261520 0x00005555557e3937 in interpolate_path (path=0x5020000001a0 "~repo", real_home=0) at path.c:654
    #261521 0x00005555557e3aea in enter_repo (path=0x501000000040 "~repo", strict=0) at path.c:718
    #261522 0x000055555568f0ba in cmd_upload_pack (argc=1, argv=0x502000000100, prefix=0x0, repo=0x0) at builtin/upload-pack.c:57
    #261523 0x0000555555575ba8 in run_builtin (p=0x555555a20c98 <commands+3192>, argc=2, argv=0x502000000100, repo=0x555555a53b20 <the_repo>) at git.c:481
    #261524 0x0000555555576067 in handle_builtin (args=0x7fffffffaab0) at git.c:742
    #261525 0x000055555557678d in cmd_main (argc=2, argv=0x7fffffffac58) at git.c:912
    #261526 0x00005555556963cd in main (argc=2, argv=0x7fffffffac58) at common-main.c:64

Note that this stack is more than 260000 function calls deep. Run under
the debugger this will eventually segfault, but in our CI systems it
seems like this just hangs forever.

I assume that this is a bug either in the leak sanitizer or in glibc, as
I cannot reproduce it on my machine. In any case, let's work around the
bug for now by marking those tests with the "!SANITIZE_LEAK" prereq.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:46 +09:00
52c7dbd036 git-compat-util: drop now-unused UNLEAK() macro
The `UNLEAK()` macro has been introduced with 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) to help us
reduce the amount of reported memory leaks in cases we don't care about,
e.g. when exiting immediately afterwards. We have since removed all of
its users in favor of freeing the memory and thus don't need the macro
anymore.

Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:46 +09:00
d91a9db33c global: drop UNLEAK() annotation
There are two users of `UNLEAK()` left in our codebase:

  - In "builtin/clone.c", annotating the `repo` variable. That leak has
    already been fixed though as you can see in the context, where we do
    know to free `repo_to_free`.

  - In "builtin/diff.c", to unleak entries of the `blob[]` array. That
    leak has also been fixed, because the entries we assign to that
    array come from `rev.pending.objects`, and we do eventually release
    `rev`.

This neatly demonstrates one of the issues with `UNLEAK()`: it is quite
easy for the annotation to become stale. A second issue is that its
whole intent is to paper over leaks. And while that has been a necessary
evil in the past, because Git was leaking left and right, it isn't
really much of an issue nowadays where our test suite has no known leaks
anymore.

Remove the last two users of this macro.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:46 +09:00
818e165898 t/helper: fix leaking commit graph in "read-graph" subcommand
We're leaking the commit-graph in the "test-helper read-graph"
subcommand, but as the leak is annotated with `UNLEAK()` the leak
sanitizer doesn't complain.

Fix the leak by calling `free_commit_graph()`. Besides getting rid of
the `UNLEAK()` annotation, it also increases code coverage because we
properly release resources as Git would do it, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:45 +09:00
b97301c13c builtin/branch: fix leaking sorting options
The sorting options are leaking, but given that they are marked with
`UNLEAK()` the leak sanitizer doesn't complain.

Fix the leak by creating a common exit path and clearing the vector such
that we can get rid of the `UNLEAK()` annotation entirely.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:45 +09:00
8ef15c205b builtin/init-db: fix leaking directory paths
We've got a couple of leaking directory paths in git-init(1), all of
which are marked with `UNLEAK()`. Fixing them is trivial, so let's do
that instead so that we can get rid of `UNLEAK()` entirely.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:45 +09:00
2379b5c900 builtin/help: fix leaks in check_git_cmd()
The `check_git_cmd()` function is declared to return a string constant.
And while it sometimes does return a constant, it may also return an
allocated string in two cases:

  - When handling aliases. This case is already marked with `UNLEAK()`
    to work around the leak.

  - When handling unknown commands in case "help.autocorrect" is
    enabled. This one is not marked with `UNLEAK()`.

The function only has a single caller, so let's fix its return type to
be non-constant, consistently return an allocated string and free it at
its callsite to plug the leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:44 +09:00
7720dbe99b help: fix leaking return value from help_unknown_cmd()
While `help_unknown_cmd()` would usually die on an unknown command, it
instead returns an autocorrected command when "help.autocorrect" is set.
But while the function is declared to return a string constant, it
actually returns an allocated string in that case. Callers thus aren't
aware that they have to free the string, leading to a memory leak.

Fix the function return type to be non-constant and free the returned
value at its only callsite.

Note that we cannot simply take ownership of `main_cmds.names[0]->name`
and then eventually free it. This is because the `struct cmdname` is
using a flex array to allocate the name, so the name pointer points into
the middle of the structure and thus cannot be freed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:44 +09:00
889c597961 help: fix leaking struct cmdnames
We're populating multiple `struct cmdnames`, but don't ever free them.
Plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:44 +09:00
94aa96cd59 help: refactor to not use globals for reading config
We're reading the "help.autocorrect" and "alias.*" configuration into
global variables, which makes it hard to manage their lifetime
correctly. Refactor the code to use a struct instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:44 +09:00
58e7568c61 builtin/sparse-checkout: fix leaking sanitized patterns
Both `git sparse-checkout add` and `git sparse-checkout set` accept a
list of additional directories or patterns. These get massaged via calls
to `sanitize_paths()`, which may end up modifying the passed-in array by
updating its pointers to be prefixed paths. This allocates memory that
we never free.

Refactor the code to instead use a `struct strvec`, which makes it way
easier for us to track the lifetime correctly. The couple of extra
memory allocations likely do not matter as we only ever populate it with
command line arguments.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:43 +09:00
a5408d1820 split-index: fix memory leak in move_cache_to_base_index()
In `move_cache_to_base_index()` we move the index cache of the main
index into the split index, which is used when writing a shared index.
But we don't release the old split index base in case we already had a
split index before this operation, which can thus leak memory.

Plug the leak by releasing the previous base.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:43 +09:00
1dd7c32daa git: refactor builtin handling to use a struct strvec
Similar as with the preceding commit, `handle_builtin()` does not
properly track lifetimes of the `argv` array and its strings. As it may
end up modifying the array this can lead to memory leaks in case it
contains allocated strings.

Refactor the function to use a `struct strvec` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:43 +09:00
ffc5c046fb git: refactor alias handling to use a struct strvec
In `handle_alias()` we use both `argcp` and `argv` as in-out parameters.
Callers mostly pass through the static array from `main()`, but once we
handle an alias we replace it with an allocated array that may contain
some allocated strings. Callers do not handle this scenario at all and
thus leak memory.

We could in theory handle the lifetime of `argv` in a hacky fashion by
letting callers free it in case they see that an alias was handled. But
while that would likely work, we still wouldn't be able to easily handle
the lifetime of strings referenced by `argv`.

Refactor the code to instead use a `struct strvec`, which effectively
removes the need for us to manually track lifetimes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:42 +09:00
3f5fadef37 strvec: introduce new strvec_splice() function
Introduce a new `strvec_splice()` function that can replace a range of
strings in the vector with another array of strings. This function will
be used in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:42 +09:00
141766d1bb line-log: fix leak when rewriting commit parents
In `process_ranges_merge_commit()` we try to figure out which of the
parents can be blamed for the given line changes. When we figure out
that none of the files in the line-log have changed we assign the
complete blame to that commit and rewrite the parents of the current
commit to only use that single parent.

This is done via `commit_list_append()`, which is misleadingly _not_
appending to the list of parents. Instead, we overwrite the parents with
the blamed parent. This makes us lose track of the old pointers,
creating a memory leak.

Fix this issue by freeing the parents before we overwrite them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:42 +09:00
c1e98f9010 bisect: fix various cases where we leak commit list items
There are various cases where we leak commit list items because we
evict items from the list, but don't free them. Plug those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:42 +09:00
2b7706aae5 bisect: fix leaking commit list items in check_merge_base()
While we free the result commit list at the end of `check_merge_base()`,
we forget to free any items that we have already iterated over. Fix this
by using a separate variable to iterate through them.

This leak is exposed by t6030, but plugging it does not make the whole
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:41 +09:00
cfb8a0da55 bisect: fix multiple leaks in bisect_next_all()
There are multiple leaks in `bisect_next_all()`. For one we don't free
the `tried` commit list. Second, one of the branches uses a direct
return instead of jumping to the cleanup code.

Fix these by freeing the commit list and converting the return to a
goto.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:41 +09:00
a13d4a19d2 bisect: fix leaking current_bad_oid
When reading bisect refs we read the reference mapping to the "bad" term
into the global `current_bad_oid` variable. This is an allocated string,
but because it is global we never have to free it. This changes though
when `register_ref()` is being called multiple times, at which point
we'll overwrite the previous pointer and thus make it unreachable.

Fix this issue by freeing the previous value. This leak is exposed by
t6030, but plugging it does not make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:41 +09:00
96ab0e7b8b bisect: fix leaking string in handle_bad_merge_base()
When handling a bad merge base we print an error, which includes the set
of good revisions joined by spaces. This string is allocated, but never
freed.

Fix this memory leak. Note that the local `bad_hex` varible also looks
like a string that we should free. But in fact, `oid_to_hex()` returns
an address to a static variable even though it is declared to return a
non-constant string. The function signature is thus quite misleading and
really should be fixed, but doing so is outside of the scope of this
patch series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:40 +09:00
79366add74 bisect: fix leaking good/bad terms when reading multipe times
Even though `read_bisect_terms()` is declared as assigning string
constants, it in fact assigns allocated strings to the `read_bad` and
`read_good` out parameters. The only callers of this function assign the
result to global variables and thus don't have to free them in order to
be leak-free. But that changes when executing the function multiple
times because we'd then overwrite the previous value and thus make it
unreachable.

Fix the function signature and free the previous values. This leak is
exposed by t0630, but plugging it does not make the whole test suite
pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:40 +09:00
65a1b7e2bd builtin/blame: fix leaking blame entries with --incremental
When passing `--incremental` to git-blame(1) we exit early by jumping to
the `cleanup` label. But some of the cleanups we perform are handled
between the `goto` and its label, and thus we leak the data.

Move the cleanups after the `cleanup` label. While at it, move the logic
to free the scoreboard's `final_buf` into `cleanup_scoreboard()` and
drop its `const` declaration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:23:40 +09:00
c9f03f3882 ref: add symlink ref content check for files backend
Besides the textual symref, we also allow symbolic links as the symref.
So, we should also provide the consistency check as what we have done
for textual symref. And also we consider deprecating writing the
symbolic links. We first need to access whether symbolic links still
be used. So, add a new fsck message "symlinkRef(INFO)" to tell the
user be aware of this information.

We have already introduced "files_fsck_symref_target". We should reuse
this function to handle the symrefs which use legacy symbolic links. We
should not check the trailing garbage for symbolic refs. Add a new
parameter "symbolic_link" to disable some checks which should only be
executed for textual symrefs.

And we need to also generate the "referent" parameter for reusing
"files_fsck_symref_target" by the following steps:

1. Use "strbuf_add_real_path" to resolve the symlink and get the
   absolute path "ref_content" which the symlink ref points to.
2. Generate the absolute path "abs_gitdir" of "gitdir" and combine
   "ref_content" and "abs_gitdir" to extract the relative path
   "relative_referent_path".
3. If "ref_content" is outside of "gitdir", we just set "referent" with
   "ref_content". Instead, we set "referent" with
   "relative_referent_path".

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:34 +09:00
d996b4475c ref: check whether the target of the symref is a ref
Ideally, we want to the users use "git symbolic-ref" to create symrefs
instead of writing raw contents into the filesystem. However, "git
symbolic-ref" is strict with the refname but not strict with the
referent. For example, we can make the "referent" located at the
"$(gitdir)/logs/aaa" and manually write the content into this where we
can still successfully parse this symref by using "git rev-parse".

  $ git init repo && cd repo && git commit --allow-empty -mx
  $ git symbolic-ref refs/heads/test logs/aaa
  $ echo $(git rev-parse HEAD) > .git/logs/aaa
  $ git rev-parse test

We may need to add some restrictions for "referent" parameter when using
"git symbolic-ref" to create symrefs because ideally all the
nonpseudo-refs should be located under the "refs" directory and we may
tighten this in the future.

In order to tell the user we may tighten the above situation, create
a new fsck message "symrefTargetIsNotARef" to notify the user that this
may become an error in the future.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:33 +09:00
a6354e6048 ref: add basic symref content check for files backend
We have code that checks regular ref contents, but we do not yet check
the contents of symbolic refs. By using "parse_loose_ref_content" for
symbolic refs, we will get the information of the "referent".

We do not need to check the "referent" by opening the file. This is
because if "referent" exists in the file system, we will eventually
check its correctness by inspecting every file in the "refs" directory.
If the "referent" does not exist in the filesystem, this is OK as it is
seen as the dangling symref.

So we just need to check the "referent" string content. A regular ref
could be accepted as a textual symref if it begins with "ref:", followed
by zero or more whitespaces, followed by the full refname, followed only
by whitespace characters. However, we always write a single SP after
"ref:" and a single LF after the refname. It may seem that we should
report a fsck error message when the "referent" does not apply above
rules and we should not be so aggressive because third-party
reimplementations of Git may have taken advantage of the looser syntax.
Put it more specific, we accept the following contents:

1. "ref: refs/heads/master   "
2. "ref: refs/heads/master   \n  \n"
3. "ref: refs/heads/master\n\n"

When introducing the regular ref content checks, we created two fsck
infos "refMissingNewline" and "trailingRefContent" which exactly
represents above situations. So we will reuse these two fsck messages to
write checks to info the user about these situations.

But we do not allow any other trailing garbage. The followings are bad
symref contents which will be reported as fsck error by "git-fsck(1)".

1. "ref: refs/heads/master garbage\n"
2. "ref: refs/heads/master \n\n\n garbage  "

And we introduce a new "badReferentName(ERROR)" fsck message to report
above errors by using "is_root_ref" and "check_refname_format" to check
the "referent". Since both "is_root_ref" and "check_refname_format"
don't work with whitespaces, we use the trimmed version of "referent"
with these functions.

In order to add checks, we will do the following things:

1. Record the untrimmed length "orig_len" and untrimmed last byte
   "orig_last_byte".
2. Use "strbuf_rtrim" to trim the whitespaces or newlines to make sure
   "is_root_ref" and "check_refname_format" won't be failed by them.
3. Use "orig_len" and "orig_last_byte" to check whether the "referent"
   misses '\n' at the end or it has trailing whitespaces or newlines.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:33 +09:00
1c0e2a0019 ref: add more strict checks for regular refs
We have already used "parse_loose_ref_contents" function to check
whether the ref content is valid in files backend. However, by
using "parse_loose_ref_contents", we allow the ref's content to end with
garbage or without a newline.

Even though we never create such loose refs ourselves, we have accepted
such loose refs. So, it is entirely possible that some third-party tools
may rely on such loose refs being valid. We should not report an error
fsck message at current. We should notify the users about such
"curiously formatted" loose refs so that adequate care is taken before
we decide to tighten the rules in the future.

And it's not suitable either to report a warn fsck message to the user.
We don't yet want the "--strict" flag that controls this bit to end up
generating errors for such weirdly-formatted reference contents, as we
first want to assess whether this retroactive tightening will cause
issues for any tools out there. It may cause compatibility issues which
may break the repository. So, we add the following two fsck infos to
represent the situation where the ref content ends without newline or
has trailing garbages:

1. refMissingNewline(INFO): A loose ref that does not end with
   newline(LF).
2. trailingRefContent(INFO): A loose ref has trailing content.

It might appear that we can't provide the user with any warnings by
using FSCK_INFO. However, in "fsck.c::fsck_vreport", we will convert
FSCK_INFO to FSCK_WARN and we can still warn the user about these
situations when using "git refs verify" without introducing
compatibility issues.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:33 +09:00
824aa541aa ref: port git-fsck(1) regular refs check for files backend
"git-fsck(1)" implicitly checks the ref content by passing the
callback "fsck_handle_ref" to the "refs.c::refs_for_each_rawref".
Then, it will check whether the ref content (eventually "oid")
is valid. If not, it will report the following error to the user.

  error: refs/heads/main: invalid sha1 pointer 0000...

And it will also report above errors when there are dangling symrefs
in the repository wrongly. This does not align with the behavior of
the "git symbolic-ref" command which allows users to create dangling
symrefs.

As we have already introduced the "git refs verify" command, we'd better
check the ref content explicitly in the "git refs verify" command thus
later we could remove these checks in "git-fsck(1)" and launch a
subprocess to call "git refs verify" in "git-fsck(1)" to make the
"git-fsck(1)" more clean.

Following what "git-fsck(1)" does, add a similar check to "git refs
verify". Then add a new fsck error message "badRefContent(ERROR)" to
represent that a ref has an invalid content.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:32 +09:00
7c78d819e6 ref: support multiple worktrees check for refs
We have already set up the infrastructure to check the consistency for
refs, but we do not support multiple worktrees. However, "git-fsck(1)"
will check the refs of worktrees. As we decide to get feature parity
with "git-fsck(1)", we need to set up support for multiple worktrees.

Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
"worktrees/<id>/refs/worktree/foo". So we should know the id of the
worktree to get the full name. Add a new parameter "struct worktree *"
for "refs-internal.h::fsck_fn". Then change the related functions to
follow this new interface.

The "packed-refs" only exists in the main worktree, so we should only
check "packed-refs" in the main worktree. Use "is_main_worktree" method
to skip checking "packed-refs" in "packed_fsck" function.

Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add
"worktree/<id>/" prefix when we are not in the main worktree.

Last, add a new test to check the refname when there are multiple
worktrees to exercise the code.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:32 +09:00
56ca603957 ref: initialize ref name outside of check functions
We passes "refs_check_dir" to the "files_fsck_refs_name" function which
allows it to create the checked ref name later. However, when we
introduce a new check function, we have to allocate redundant memory and
re-calculate the ref name. It's bad for us to allocate redundant memory
and duplicate logic. Instead, we should allocate and calculate it only
once and pass the ref name to the check functions.

In order not to do repeat calculation, rename "refs_check_dir" to
"refname". And in "files_fsck_refs_dir", create a new strbuf "refname",
thus whenever we handle a new ref, calculate the name and call the check
functions one by one.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:32 +09:00
32dc1c7ec3 ref: check the full refname instead of basename
In "files-backend.c::files_fsck_refs_name", we validate the refname
format by using "check_refname_format" to check the basename of the
iterator with "REFNAME_ALLOW_ONELEVEL" flag.

However, this is a bad implementation. Although we doesn't allow a
single "@" in ".git" directory, we do allow "refs/heads/@". So, we will
report an error wrongly when there is a "refs/heads/@" ref by using one
level refname "@".

Because we just check one level refname, we either cannot check the
other parts of the full refname. And we will ignore the following
errors:

  "refs/heads/ new-feature/test"
  "refs/heads/~new-feature/test"

In order to fix the above problem, enhance "files_fsck_refs_name" to use
the full name for "check_refname_format". Then, replace the tests which
are related to "@" and add tests to exercise the above situations using
for loop to avoid repetition.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:31 +09:00
38cd6eead1 ref: initialize "fsck_ref_report" with zero
In "fsck.c::fsck_refs_error_function", we need to tell whether "oid" and
"referent" is NULL. So, we need to always initialize these parameters to
NULL instead of letting them point to anywhere when creating a new
"fsck_ref_report" structure.

The original code explicitly initializes the "path" member in the
"struct fsck_ref_report" to NULL (which implicitly 0-initializes other
members in the struct). It is more customary to use "{ 0 }" to express
that we are 0-initializing everything. In order to align with the
codebase, initialize "fsck_ref_report" with zero.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:31 +09:00
d94ac23d3b reftable/block: optimize allocations by using scratch buffer
The block writer needs to compute the key for every record that one adds
to the writer. The buffer for this key is stored on the stack and thus
reallocated on every call to `block_writer_add()`, which is inefficient.

Refactor the code so that we store the buffer in the `block_writer`
struct itself so that we can reuse it. This reduces the number of
allocations when writing many refs, e.g. when migrating one million refs
from the "files" backend to the "reftable backend. Before this change:

    HEAP SUMMARY:
        in use at exit: 80,048 bytes in 49 blocks
      total heap usage: 3,025,864 allocs, 3,025,815 frees, 372,746,291 bytes allocated

After this change:

    HEAP SUMMARY:
        in use at exit: 80,048 bytes in 49 blocks
      total heap usage: 2,013,250 allocs, 2,013,201 frees, 347,543,583 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:17 +09:00
aa248b8ab2 reftable/block: rename block_writer::buf variable
Adapt the name of the `block_writer::buf` variable to instead be called
`block`. This aligns it with the existing `block_len` variable, which
tracks the length of this buffer, and is generally a bit more tied to
the actual context where this variable gets used.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:17 +09:00
66ed011bf7 reftable/writer: optimize allocations by using a scratch buffer
Both `writer_add_record()` and `reftable_writer_add_ref()` get executed
for every single ref record we're adding to the reftable writer. And as
both functions use a local buffer to write data, the allocations we have
to do here add up during larger transactions.

Refactor the code to use a scratch buffer part of the `reftable_writer`
itself such that we can reuse it. This signifcantly reduces the number
of allocations during large transactions, e.g. when migrating refs from
the "files" backend to the "reftable" backend. Before this change:

    HEAP SUMMARY:
        in use at exit: 80,048 bytes in 49 blocks
      total heap usage: 5,032,171 allocs, 5,032,122 frees, 418,792,092 bytes allocated

After this change:

    HEAP SUMMARY:
        in use at exit: 80,048 bytes in 49 blocks
      total heap usage: 3,025,864 allocs, 3,025,815 frees, 372,746,291 bytes allocated

This also translate into a small speedup:

    Benchmark 1: migrate files:reftable (refcount = 1000000, revision = HEAD~)
      Time (mean ± σ):     827.2 ms ±  16.5 ms    [User: 689.4 ms, System: 124.9 ms]
      Range (min … max):   809.0 ms … 924.7 ms    50 runs

    Benchmark 2: migrate files:reftable (refcount = 1000000, revision = HEAD)
      Time (mean ± σ):     813.6 ms ±  11.6 ms    [User: 679.0 ms, System: 123.4 ms]
      Range (min … max):   786.7 ms … 833.5 ms    50 runs

    Summary
      migrate files:reftable (refcount = 1000000, revision = HEAD) ran
        1.02 ± 0.02 times faster than migrate files:reftable (refcount = 1000000, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:16 +09:00
a7004abd0b refs: don't normalize log messages with REF_SKIP_CREATE_REFLOG
When the `REF_SKIP_CREATE_REFLOG` flag is set we skip the creation of
the reflog entry, but we still normalize the reflog message when we
queue the update. This is a waste of resources as the normalized message
will never get used in the first place.

Fix this issue by skipping the normalization in case the flag is set.
This leads to a surprisingly large speedup when migrating from the
"files" to the "reftable" backend:

    Benchmark 1: migrate files:reftable (refcount = 1000000, revision = HEAD~)
      Time (mean ± σ):     878.5 ms ±  14.9 ms    [User: 726.5 ms, System: 139.2 ms]
      Range (min … max):   858.4 ms … 941.3 ms    50 runs

    Benchmark 2: migrate files:reftable (refcount = 1000000, revision = HEAD)
      Time (mean ± σ):     831.1 ms ±  10.5 ms    [User: 694.1 ms, System: 126.3 ms]
      Range (min … max):   812.4 ms … 851.4 ms    50 runs

    Summary
      migrate files:reftable (refcount = 1000000, revision = HEAD) ran
        1.06 ± 0.02 times faster than migrate files:reftable (refcount = 1000000, revision = HEAD~)

And an ever larger speedup when migrating the other way round:

    Benchmark 1: migrate reftable:files (refcount = 1000000, revision = HEAD~)
      Time (mean ± σ):     923.6 ms ±  11.6 ms    [User: 705.5 ms, System: 208.1 ms]
      Range (min … max):   905.3 ms … 946.5 ms    50 runs

    Benchmark 2: migrate reftable:files (refcount = 1000000, revision = HEAD)
      Time (mean ± σ):     818.5 ms ±   9.0 ms    [User: 627.6 ms, System: 180.6 ms]
      Range (min … max):   802.2 ms … 842.9 ms    50 runs

    Summary
      migrate reftable:files (refcount = 1000000, revision = HEAD) ran
        1.13 ± 0.02 times faster than migrate reftable:files (refcount = 1000000, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:16 +09:00
e4929cdf79 refs: skip collision checks in initial transactions
Reference transactions use `refs_verify_refname_available()` to check
for colliding references. This check consists of two parts:

  - Checks for whether multiple ref updates in the same transaction
    conflict with each other.

  - Checks for whether existing refs conflict with any refs part of the
    transaction.

While we generally cannot avoid the first check, the second check is
superfluous in cases where the transaction is an initial one in an
otherwise empty ref store. The check results in multiple ref reads as
well as the creation of a ref iterator for every ref we're checking,
which adds up quite fast when performing the check for many refs.

Introduce a new flag that allows us to skip this check and wire it up in
such that the backends pass it when running an initial transaction. This
leads to significant speedups when migrating ref storage backends. From
"files" to "reftable":

    Benchmark 1: migrate files:reftable (refcount = 100000, revision = HEAD~)
      Time (mean ± σ):     472.4 ms ±   6.7 ms    [User: 175.9 ms, System: 285.2 ms]
      Range (min … max):   463.5 ms … 483.2 ms    10 runs

    Benchmark 2: migrate files:reftable (refcount = 100000, revision = HEAD)
      Time (mean ± σ):      86.1 ms ±   1.9 ms    [User: 67.9 ms, System: 16.0 ms]
      Range (min … max):    82.9 ms …  90.9 ms    29 runs

    Summary
      migrate files:reftable (refcount = 100000, revision = HEAD) ran
        5.48 ± 0.15 times faster than migrate files:reftable (refcount = 100000, revision = HEAD~)

And from "reftable" to "files":

    Benchmark 1: migrate reftable:files (refcount = 100000, revision = HEAD~)
      Time (mean ± σ):     452.7 ms ±   3.4 ms    [User: 209.9 ms, System: 235.4 ms]
      Range (min … max):   445.9 ms … 457.5 ms    10 runs

    Benchmark 2: migrate reftable:files (refcount = 100000, revision = HEAD)
      Time (mean ± σ):      95.2 ms ±   2.2 ms    [User: 73.6 ms, System: 20.6 ms]
      Range (min … max):    91.7 ms … 100.8 ms    28 runs

    Summary
      migrate reftable:files (refcount = 100000, revision = HEAD) ran
        4.76 ± 0.11 times faster than migrate reftable:files (refcount = 100000, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:16 +09:00
00bd6c3e46 refs: use "initial" transaction semantics to migrate refs
Until now, we couldn't use "initial" transaction semantics to migrate
refs because the "files" backend only supported writing regular refs via
the initial transaction because it simply mapped the transaction to a
"packed-refs" transaction. But with the preceding commit, the "files"
backend has learned to also write symbolic and root refs in the initial
transaction by creating a second transaction for all refs that need to
be written as loose refs.

Adapt the code to migrate refs to commit the transaction as an initial
transaction. This results in a signiticant speedup when migrating many
refs:

    Benchmark 1: migrate reftable:files (refcount = 100000, revision = HEAD~)
      Time (mean ± σ):      3.247 s ±  0.034 s    [User: 0.485 s, System: 2.722 s]
      Range (min … max):    3.216 s …  3.309 s    10 runs

    Benchmark 2: migrate reftable:files (refcount = 100000, revision = HEAD)
      Time (mean ± σ):     453.6 ms ±   1.9 ms    [User: 214.6 ms, System: 230.5 ms]
      Range (min … max):   451.5 ms … 456.4 ms    10 runs

    Summary
      migrate reftable:files (refcount = 100000, revision = HEAD) ran
        7.16 ± 0.08 times faster than migrate reftable:files (refcount = 100000, revision = HEAD~)

As the reftable backend doesn't (yet) special-case initial transactions
there is no comparable speedup for that backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:16 +09:00
c0b9cf3b55 refs/files: support symbolic and root refs in initial transaction
The "files" backend has implemented special logic when committing
the first transactions in an otherwise empty ref store: instead of
writing all refs as separate loose files, it instead knows to write them
all into a "packed-refs" file directly. This is significantly more
efficient than having to write each of the refs as separate "loose" ref.

The only user of this optimization is git-clone(1), which only uses this
mechanism to write regular refs. Consequently, the implementation does
not know how to handle both symbolic and root refs. While fine in the
context of git-clone(1), this keeps us from using the mechanism in more
cases.

Adapt the logic to also support symbolic and root refs by using a second
transaction that we use for all of the refs that need to be written as
loose refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:15 +09:00
1c299d03e5 refs: introduce "initial" transaction flag
There are two different ways to commit a transaction:

  - `ref_transaction_commit()` can be used to commit a regular
    transaction and is what almost every caller wants.

  - `initial_ref_transaction_commit()` can be used when it is known that
    the ref store that the transaction is committed for is empty and
    when there are no concurrent processes. This is used when cloning a
    new repository.

Implementing this via two separate functions has a couple of downsides.
First, every reference backend needs to implement a separate callback
even in the case where they don't special-case the initial transaction.
Second, backends are basically forced to reimplement the whole logic for
how to commit the transaction like the "files" backend does, even though
backends may wish to only tweak certain behaviour of a "normal" commit.
Third, it is awkward that callers must never prepare the transaction as
this is somewhat different than how a transaction typically works.

Refactor the code such that we instead mark initial transactions via a
separate flag when starting the transaction. This addresses all of the
mentioned painpoints, where the most important part is that it will
allow backends to have way more leeway in how exactly they want to
handle the initial transaction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:15 +09:00
83b8ed8bba refs/files: move logic to commit initial transaction
Move the logic to commit initial transactions such that we can start to
call it in `files_transaction_finish()` in a subsequent commit without
requiring a separate function declaration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:15 +09:00
a0efef1446 refs: allow passing flags when setting up a transaction
Allow passing flags when setting up a transaction such that the
behaviour of the transaction itself can be altered. This functionality
will be used in a subsequent patch.

Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:14 +09:00
4083a6f052 Sync with 'maint' 2024-11-20 14:47:56 +09:00
44ac252971 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-20 14:47:17 +09:00
38e4df6615 Merge branch 'la/trailer-info'
Renaming a handful of variables and structure fields.

* la/trailer-info:
  trailer: spread usage of "trailer_block" language
2024-11-20 14:47:17 +09:00
ff44124044 Merge branch 'ja/git-add-doc-markup'
Documentation mark-up updates.

* ja/git-add-doc-markup:
  doc: git-add.txt: convert to new style convention
2024-11-20 14:47:17 +09:00
0c11ef1356 Merge branch 'jt/repack-local-promisor'
"git gc" discards any objects that are outside promisor packs that
are referred to by an object in a promisor pack, and we do not
refetch them from the promisor at runtime, resulting an unusable
repository.  Work it around by including these objects in the
referring promisor pack at the receiving end of the fetch.

* jt/repack-local-promisor:
  index-pack: repack local links into promisor packs
  t5300: move --window clamp test next to unclamped
  t0410: use from-scratch server
  t0410: make test description clearer
2024-11-20 14:47:16 +09:00
f1a384425d Prepare for 2.47.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-20 14:43:30 +09:00
cc53ddf7f0 Merge branch 'db/submodule-fetch-with-remote-name-fix' into maint-2.47
A "git fetch" from the superproject going down to a submodule used
a wrong remote when the default remote names are set differently
between them.

* db/submodule-fetch-with-remote-name-fix:
  submodule: correct remote name with fetch
2024-11-20 14:43:00 +09:00
257f2de964 Merge branch 'ps/cache-tree-w-broken-index-entry' into maint-2.47
Fail gracefully instead of crashing when attempting to write the
contents of a corrupt in-core index as a tree object.

* ps/cache-tree-w-broken-index-entry:
  unpack-trees: detect mismatching number of cache-tree/index entries
  cache-tree: detect mismatching number of index entries
  cache-tree: refactor verification to return error codes
2024-11-20 14:42:59 +09:00
76c1953395 Merge branch 'ps/maintenance-start-crash-fix' into maint-2.47
"git maintenance start" crashed due to an uninitialized variable
reference, which has been corrected.

* ps/maintenance-start-crash-fix:
  builtin/gc: fix crash when running `git maintenance start`
2024-11-20 14:42:58 +09:00
f1a50f12b9 Merge branch 'jk/fsmonitor-event-listener-race-fix' into maint-2.47
On macOS, fsmonitor can fall into a race condition that results in
a client waiting forever to be notified for an event that have
already happened.  This problem has been corrected.

* jk/fsmonitor-event-listener-race-fix:
  fsmonitor: initialize fs event listener before accepting clients
  simple-ipc: split async server initialization and running
2024-11-20 14:42:57 +09:00
3117dd359a Merge branch 'ds/line-log-asan-fix' into maint-2.47
Use after free and double freeing at the end in "git log -L... -p"
had been identified and fixed.

* ds/line-log-asan-fix:
  line-log: protect inner strbuf from free
2024-11-20 14:42:56 +09:00
1f2be8bed6 index-pack: teach --promisor to forbid pack name
Currently,

 - Running "index-pack --promisor" outside a repo segfaults.
 - It may be confusing to a user that running "index-pack --promisor"
   within a repo may make changes to the repo's object DB, especially
   since the packs indexed by the index-pack invocation may not even be
   related to the repo.

As discussed in [1] and [2], teaching --promisor to forbid a packfile
name solves both these problems. This combination of arguments requires
a repo (since we are writing the resulting .pack and .idx to it) and it
is clear that the files are related to the repo.

Currently, Git uses "index-pack --promisor" only when fetching into
a repo, so it could be argued that we should teach "index-pack" a
new argument (say, "--fetching-mode") instead of tying --promisor to
a generic argument like the packfile name. However, this --promisor
feature could conceivably be used whenever we have a packfile that is
known to come from the promisor remote (whether obtained through Git's
fetch protocol or through other means) so not using a new argument seems
reasonable - one could envision a user-made script obtaining a packfile
and then running "index-pack --promisor --stdin", for example. In fact,
it might be possible to relax the restriction further (say, by also
allowing --promisor when indexing a packfile that is in the object DB),
but relaxing the restriction is backwards-compatible so we can revisit
that later.

One thing to watch out for is the possibility of a future Git feature
that indexes a pack in the context of a repo, but does not necessarily
write the resulting pack to it (and does not necessarily desire to
make any changes to the object DB). One such feature would be fetch
quarantine, which might need the repo context in order to detect
hash collisions, but would also need to ensure that the object DB
is undisturbed in case the fetch fails for whatever reason, even if
the reason occurs only after the indexing is complete. It may not be
obvious to the implementer of such a feature that "index-pack" could
sometimes write packs other than the indexed pack to the object DB,
but there are already other ways that "fetch" could write to the object
DB (in particular, packfile URIs and bundle URIs), so hopefully the
implementation of this future feature would already include a test that
the object DB be undisturbed.

This change requires the change to t5300 by 1f52cdfacb (index-pack:
document and test the --promisor option, 2022-03-09) to be undone.
(--promisor is already tested indirectly, so we don't need the explicit
test here any more.)

[1] https://lore.kernel.org/git/20241114005652.GC1140565@coredump.intra.peff.net/
[2] https://lore.kernel.org/git/20241119185345.GB15723@coredump.intra.peff.net/

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-20 10:37:56 +09:00
656ca9204a builtin/gc: provide hint when maintenance hits a stale schedule lock
When running scheduled maintenance via `git maintenance start`, we
acquire a lockfile to ensure that no other scheduled maintenance task is
running in the repository concurrently. If so, we do provide an error to
the user hinting that another process seems to be running in this repo.

There are two important cases why such a lockfile may exist:

  - An actual git-maintenance(1) process is still running in this
    repository.

  - An earlier process may have crashed or was interrupted part way
    through and has left a stale lockfile behind.

In c95547a394 (builtin/gc: fix crash when running `git maintenance
start`, 2024-10-10), we have fixed an issue where git-maintenance(1)
would crash with the "start" subcommand, and the underlying bug causes
the second scenario to trigger quite often now.

Most users don't know how to get out of that situation again though.
Ideally, we'd be removing the stale lock for our users automatically.
But in the context of repository maintenance this is rather risky, as it
can easily run for hours or even days. So finding a clear point where we
know that the old process has exited is basically impossible.

We have the same issue in other subsystems, e.g. when locking refs. Our
lockfile interfaces thus provide the `unable_to_lock_message()` function
for exactly this purpose: it provides a nice hint to the user that
explains what is going on and how to get out of that situation again by
manually removing the file.

Adapt git-maintenance(1) to print a similar hint. While we could use the
above function, we can provide a bit more context as we know exactly
what kind of process would create the lockfile.

Reported-by: Miguel Rincon Barahona <mrincon@gitlab.com>
Reported-by: Kev Kloss <kkloss@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-20 10:26:12 +09:00
f3b2ceea39 doc: git-diff: apply format changes to config part
By the way, we also change the sentences where git-diff would refer to
itself, so that no link is created in the HTML output.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:31:05 +09:00
0b080a70ab doc: git-diff: apply format changes to diff-generate-patch
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:31:05 +09:00
6ace09b2f9 doc: git-diff: apply format changes to diff-format
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:31:04 +09:00
6b552e39c0 doc: git-diff: apply format changes to diff-options
The format change is only applied to the sections of the file that are
filtered in git-diff.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:31:04 +09:00
e72c2d2e91 doc: git-diff: apply new documentation guidelines
The documentation for git-diff has been updated to follow the new
documentation guidelines. The following changes have been applied to
the series of patches:

- switching the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- use _<placeholder>_ instead of <placeholder> in the description
- use `backticks for keywords and more complex option
descriptions`. The new rendering engine will apply synopsis rules to
these spans.
- prevent git-diff from self-referencing itself via gitlink macro when
the generated link would point to the same page.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:31:04 +09:00
b8558e6abd Merge branch 'ps/reftable-detach' into ps/reftable-iterator-reuse
* ps/reftable-detach:
  reftable/system: provide thin wrapper for lockfile subsystem
  reftable/stack: drop only use of `get_locked_file_path()`
  reftable/system: provide thin wrapper for tempfile subsystem
  reftable/stack: stop using `fsync_component()` directly
  reftable/system: stop depending on "hash.h"
  reftable: explicitly handle hash format IDs
  reftable/system: move "dir.h" to its only user
2024-11-19 12:24:33 +09:00
988e7f5e95 reftable/system: provide thin wrapper for lockfile subsystem
We use the lockfile subsystem to write lockfiles for "tables.list". As
with the tempfile subsystem, the lockfile subsystem also hooks into our
infrastructure to prune stale locks via atexit(3p) or signal handlers.

Furthermore, the lockfile subsystem also handles locking timeouts, which
do add quite a bit of logic. Having to reimplement that in the context
of Git wouldn't make a whole lot of sense, and it is quite likely that
downstream users of the reftable library may have a better idea for how
exactly to implement timeouts.

So again, provide a thin wrapper for the lockfile subsystem instead such
that the compatibility shim is fully self-contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:11 +09:00
6361226b79 reftable/stack: drop only use of get_locked_file_path()
We've got a single callsite where we call `get_locked_file_path()`. As
we're about to convert our usage of the lockfile subsystem to instead be
used via a compatibility shim we'd have to implement more logic for this
single callsite. While that would be okay if Git was the only supposed
user of the reftable library, it's a bit more awkward when considering
that we have to reimplement this functionality for every user of the
library eventually.

Refactor the code such that we don't call `get_locked_file_path()`
anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:10 +09:00
01e49941d6 reftable/system: provide thin wrapper for tempfile subsystem
We use the tempfile subsystem to write temporary tables, but given that
we're in the process of converting the reftable library to become
standalone we cannot use this subsystem directly anymore. While we could
in theory convert the code to use mkstemp(3p) instead, we'd lose access
to our infrastructure that automatically prunes tempfiles via atexit(3p)
or signal handlers.

Provide a thin wrapper for the tempfile subsystem instead. Like this,
the compatibility shim is fully self-contained in "reftable/system.c".
Downstream users of the reftable library would have to implement their
own tempfile shims by replacing "system.c" with a custom version.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:10 +09:00
86b770b0bb reftable/stack: stop using fsync_component() directly
We're executing `fsync_component()` directly in the reftable library so
that we can fsync data to disk depending on "core.fsync". But as we're
in the process of converting the reftable library to become standalone
we cannot use that function in the library anymore.

Refactor the code such that users of the library can inject a custom
fsync function via the write options. This allows us to get rid of the
dependency on "write-or-die.h".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:10 +09:00
c2f08236ed reftable/system: stop depending on "hash.h"
We include "hash.h" in "reftable/system.h" such that we can use hash
format IDs as well as the raw size of SHA1 and SHA256. As we are in the
process of converting the reftable library to become standalone we of
course cannot rely on those constants anymore.

Introduce a new `enum reftable_hash` to replace internal uses of the
hash format IDs and new constants that replace internal uses of the hash
size. Adapt the reftable backend to set up the correct hash function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:10 +09:00
88e297275b reftable: explicitly handle hash format IDs
The hash format IDs are used for two different things across the
reftable codebase:

  - They are used as a 32 bit unsigned integer when reading and writing
    the header in order to identify the hash function.

  - They are used internally to identify which hash function is in use.

When one only considers the second usecase one might think that one can
easily change the representation of those hash IDs. But because those
IDs end up in the reftable header and footer on disk it is important
that those never change.

Create separate constants `REFTABLE_FORMAT_ID_*` and use them in
contexts where we read or write reftable headers. This serves multiple
purposes:

  - It allows us to more easily discern cases where we actually use
    those constants for the on-disk format.

  - It detangles us from the same constants that are defined in
    libgit.a, which is another required step to convert the reftable
    library to become standalone.

  - It makes the next step easier where we stop using `GIT_*_FORMAT_ID`
    constants in favor of a custom enum.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:09 +09:00
17e8039878 reftable/system: move "dir.h" to its only user
We still include "dir.h" in "reftable/system.h" even though it is not
used by anything but by a single unit test. Move it over into that unit
test so that we don't accidentally use any functionality provided by it
in the reftable codebase.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:08 +09:00
5e904f1a4a fast-import: avoid making replace refs point to themselves
If someone replaces a commit with a modified version, then builds on
that commit, and then later decides to rewrite history in a format like

    git fast-export --all | CMD_TO_TWEAK_THE_STREAM | git fast-import

and CMD_TO_TWEAK_THE_STREAM undoes the modifications that the
replacement did, then at the end you'd get a replace ref that points to
itself.  For example:

    $ git show-ref | grep replace
    fb92ebc654641b310e7d0360d0a5a49316fd7264 refs/replace/fb92ebc654641b310e7d0360d0a5a49316fd7264

Git commands which pay attention to replace refs will die with an error
when a self-referencing replace ref is present:

    $ git log
    fatal: replace depth too high for object fb92ebc654641b310e7d0360d0a5a49316fd7264

Avoid such problems by deleting replace refs that will simply end up
pointing to themselves at the end of our writing.  Unless users specify
--quiet, warn them when we delete such a replace ref.

Two notes about this patch:
  * We are not ignoring the problematic update of the replace ref
    (turning it into a no-op), we are replacing the update with a delete.
    The logic here is that if the repository had a value for the replace
    ref before fast-import was run, and the replace ref was explicitly
    named in the fast-import stream, we don't want the replace ref to be
    left with a pre-fast-import value.
  * While loops with more than one element (e.g. refs/replace/A points
    to B, and refs/replace/B points to A) are possible, they seem much
    less plausible.  It is pretty easy to create a sequence of
    git-filter-repo commands that will trigger a self-referencing replace
    ref, but I do not know how to trigger a scenario with a cycle length
    greater than 1.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 09:39:33 +09:00
2af8ead52b object-file: inline empty tree and blob literals
We define macros with the bytes of the empty trees and blobs for sha1
and sha256. But since e1ccd7e2b1 (sha1_file: only expose empty object
constants through git_hash_algo, 2018-05-02), those are used only for
initializing the git_hash_algo entries. Any other code using the macros
directly would be suspicious, since a hash_algo pointer is the level of
indirection we use to make everything work with both sha1 and sha256.

So let's future proof against code doing the wrong thing by dropping the
macros entirely and just initializing the structs directly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 21:48:48 +09:00
e37feea00b object-file: treat cached_object values as const
The cached-object API maps oids to in-memory entries. Once inserted,
these entries should be immutable. Let's return them from the
find_cached_object() call with a const tag to make this clear.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 21:48:48 +09:00
9202ffcf10 object-file: drop oid field from find_cached_object() return value
The pretend_object_file() function adds to an array mapping oids to
object contents, which are later retrieved with find_cached_object().
We naturally need to store the oid for each entry, since it's the lookup
key.

But find_cached_object() also returns a hard-coded empty_tree object.
There we don't care about its oid field and instead compare against
the_hash_algo->empty_tree. The oid field is left as all-zeroes.

This all works, but it means that the cached_object struct we return
from find_cached_object() may or may not have a valid oid field, depend
whether it is the hard-coded tree or came from pretend_object_file().

Nobody looks at the field, so there's no bug. But let's future-proof it
by returning only the object contents themselves, not the oid. We'll
continue to call this "struct cached_object", and the array entry
mapping the key to those contents will be a "cached_object_entry".

This would also let us swap out the array for a better data structure
(like a hashmap) if we chose, but there's not much point. The only code
that adds an entry is git-blame, which adds at most a single entry per
process.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 21:48:48 +09:00
b2a95dfd63 object-file: move empty_tree struct into find_cached_object()
The fake empty_tree struct is a static global, but the only code that
looks at it is find_cached_object(). The struct itself is a little odd,
with an invalid "oid" field that is handled specially by that function.

Since it's really just an implementation detail, let's move it to a
static within the function. That future-proofs against other code trying
to use it and seeing the weird oid value.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 21:48:47 +09:00
2911f9ed1e object-file: drop confusing oid initializer of empty_tree struct
We treat the empty tree specially, providing an in-memory "cached" copy,
which allows you to diff against it even if the object doesn't exist in
the repository. This is implemented as part of the larger cached_object
subsystem, but we use a stand-alone empty_tree struct.

We initialize the oid of that struct using EMPTY_TREE_SHA1_BIN_LITERAL.
At first glance, that seems like a bug; how could this ever work for
sha256 repositories?

The answer is that we never look at the oid field! The oid field is used
to look up entries added by pretend_object_file() to the cached_objects
array. But for our stand-alone entry, we look for it independently using
the_hash_algo->empty_tree, which will point to the correct algo struct
for the repository.

This happened in 62ba93eaa9 (sha1_file: convert cached object code to
struct object_id, 2018-05-02), which even mentions that this field is
never used. Let's reduce confusion for anybody reading this code by
replacing the sha1 initializer with a comment. The resulting field will
be all-zeroes, so any violation of our assumption that the oid field is
not used will break equally for sha1 and sha256.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 21:48:47 +09:00
e770f36307 object-file: prefer array-of-bytes initializer for hash literals
We hard-code a few well-known hash values for empty trees and blobs in
both sha1 and sha256 formats. We do so with string literals like this:

  #define EMPTY_TREE_SHA256_BIN_LITERAL \
         "\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \
         "\x04\xd4\x5d\x8d\x85\xef\xa9\xb0\x57\xb5" \
         "\x3b\x14\xb4\xb9\xb9\x39\xdd\x74\xde\xcc" \
         "\x53\x21"

and then use it to initialize the hash field of an object_id struct.
That hash field is exactly 32 bytes long (the size we need for sha256).
But the string literal above is actually 33 bytes long due to the NUL
terminator. This is legal in C, and the NUL is ignored.

  Side note on legality: in general excess initializer elements are
  forbidden, and gcc will warn on both of these:

    char foo[3] = { 'h', 'u', 'g', 'e' };
    char bar[3] = "VeryLongString";

  I couldn't find specific language in the standard allowing
  initialization from a string literal where _just_ the NUL is ignored,
  but C99 section 6.7.8 (Initialization), paragraph 32 shows this exact
  case as "example 8".

However, the upcoming gcc 15 will start warning for this case (when
compiled with -Wextra via DEVELOPER=1):

      CC object-file.o
  object-file.c:52:9: warning: initializer-string for array of ‘unsigned char’ is too long [-Wunterminated-string-initialization]
     52 |         "\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  object-file.c:79:17: note: in expansion of macro ‘EMPTY_TREE_SHA256_BIN_LITERAL’

which is understandable. Even though this is not a bug for us, since we
do not care about the NUL terminator (and are just using the literal as
a convenient format), it would be easy to accidentally create an array
that was mistakenly unterminated.

We can avoid this warning by switching the initializer to an actual
array of unsigned values. That arguably demonstrates our intent more
clearly anyway.

Reported-by: Sam James <sam@gentoo.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 21:48:47 +09:00
5dac35bbde Makefile: let clar header targets depend on their scripts
The targets that generate clar headers depend on their source files, but
not on the script that is actually generating the output. Fix the issue
by adding the missing dependencies.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:59:26 +09:00
8caa7b9b05 cmake: use verbatim arguments when invoking clar commands
Pass the VERBATIM option to `add_custom_command()`. Like this, all
arguments to the commands will be escaped properly for the build tool so
that the invoked command receives each argument unchanged.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:59:26 +09:00
8839dccc8d cmake: use SH_EXE to execute clar scripts
In 30bf9f0aaa (cmake: set up proper dependencies for generated clar
headers, 2024-10-21), we have deduplicated the logic to generate our
clar headers by reusing the same scripts that our Makefile does. Despite
the deduplication, this refactoring also made us rebuild the headers in
case the source files change, which didn't happen previously.

The commit also introduced an issue though: we execute the scripts
directly, so when the host does not have "/bin/sh" available they will
fail. This is for example the case on Windows when importing the CMake
project into Microsoft Visual Studio.

Address the issue by invoking the scripts with `SH_EXE`, which contains
the discovered path of the shell interpreter.

While at it, wrap the overly long lines in the CMake build instructions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:59:25 +09:00
9a91ab9400 t/unit-tests: convert "clar-generate.awk" into a shell script
Convert "clar-generate.awk" into a shell script that invokes awk(1).
This allows us to avoid the shell redirect in the build system, which
may otherwise be a problem with build systems on platforms that use a
different shell.

While at it, wrap the overly long lines in the CMake build instructions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:59:25 +09:00
820fd1a569 Documentation/git-bundle.txt: discuss naïve backups
It might be naïve to think that those who need this education would end
up here in the first place.  But I think it’s good to mention this
high-level concept here on a command which provides a backup strategy.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:56:26 +09:00
c43a67f83d Documentation/git-bundle.txt: mention --all in spec. refs
Mention `--all` as an alternative in “Specifying References”.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:56:25 +09:00
f27b48d904 Documentation/git-bundle.txt: remove old --all example
We don’t need this part now that we have a fleshed-out `--all` example.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:56:25 +09:00
df0cf6faad Documentation/git-bundle.txt: mention full backup example
Provide an example about how to make a “full backup” with caveats about
what that means in this case.

This is a requested use-case.[1]  But the doc is a bit unassuming
about it:

    If you want to match `git clone --mirror`, which would include your
    refs such as `refs/remotes/*`, use `--all`.

The user cannot be expected to formulate “I want a full backup” as “I
want to match `git clone --mirror`” for a bundle file or something.
Let’s drop this mention of `--all` later in the doc and frontload it.

† 1: E.g.:

    • https://stackoverflow.com/questions/5578270/fully-backup-a-git-repohttps://stackoverflow.com/questions/11792671/how-to-git-bundle-a-complete-repo

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:56:25 +09:00
639cd8db63 reflog: rename unreachable
In C23, "unreachable" is a macro that invokes undefined behavior if it
is invoked.  To make sure that our code compiles on a variety of C
versions, rename unreachable to "is_unreachable".

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:42:08 +09:00
e8b3bcf491 index-pack: rename struct thread_local
"thread_local" is a keyword in C23.  To make sure that our code compiles
on a wide variety of C versions, rename struct thread_local to "struct
thread_local_data" to avoid a conflict.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:42:08 +09:00
68e3c69efa Documentation/glossary: describe "trailer"
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18 09:41:24 +09:00
090d24e9af Clean up RelNotes for 2.48
There somehow ended up too many bogus "merge X later to maint"
comments for topics that cannot be merged ever down to 'maint'
because they were forked from more recent integration branches
in the draft release notes.  Remove them, as they are inviting
for mistakes later.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-16 02:27:40 +09:00
0ffb5a6bf1 Allow cloning from repositories owned by another user
Historically, Git has allowed users to clone from an untrusted
repository, and we have documented that this is safe to do so:

    `upload-pack` tries to avoid any dangerous configuration options or
    hooks from the repository it's serving, making it safe to clone an
    untrusted directory and run commands on the resulting clone.

However, this was broken by f4aa8c8bb1 ("fetch/clone: detect dubious
ownership of local repositories", 2024-04-10) in an attempt to make
things more secure.  That change resulted in a variety of problems when
cloning locally and over SSH, but it did not change the stated security
boundary.  Because the security boundary has not changed, it is safe to
adjust part of the code that patch introduced.

To do that and restore the previous functionality, adjust enter_repo to
take two flags instead of one.

The two bits are

 - ENTER_REPO_STRICT: callers that require exact paths (as opposed
   to allowing known suffixes like ".git", ".git/.git" to be
   omitted) can set this bit.  Corresponds to the "strict" parameter
   that the flags word replaces.

 - ENTER_REPO_ANY_OWNER_OK: callers that are willing to run without
   ownership check can set this bit.

The former is --strict-paths option of "git daemon".  The latter is
set only by upload-pack, which honors the claimed security boundary.

Note that local clones across ownership boundaries require --no-local so
that upload-pack is used.  Document this fact in the manual page and
provide an example.

This patch was based on one written by Junio C Hamano.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15 11:05:06 +09:00
e199290592 pack-objects: only perform verbatim reuse on the preferred pack
When reusing objects from source pack(s), write_reused_pack_verbatim()
is responsible for reusing objects whole eword_t's at a time. It works
by taking the longest continuous run of objects from the beginning of
each source pack that the caller wants, and reuses the entirety of that
section from each pack.

This is based on the assumption that we don't have any gaps within the
region. This assumption relieves us from having to patch any
OFS_DELTAs, since we know that there aren't any gaps between any delta
and its base in that region.

To illustrate why this assumption is necessary, suppose we have some
pack P, which has objects X, Y, and Z. If the MIDX's copy of Y was
selected from a pack other than P, then the bit corresponding to object
Y will appear earlier in the bitmap than the bits corresponding to X and
Z.

If pack-objects already has or will use the copy of Y from the pack it
was selected from in the MIDX, then it is an error to reuse all objects
between X and Z in the source pack. Doing so will cause us to reuse Y
from a different pack than the one which represents Y in the MIDX,
causing us to either:

 - include the object twice, assuming that the caller wants Y in the
   pack, or

 - include the object once, resulting in us packing more objects than
   necessary.

This regression comes from ca0fd69e37 (pack-objects: prepare
`write_reused_pack_verbatim()` for multi-pack reuse, 2023-12-14), which
incorrectly assumed that there would be no gaps in reusable regions of
non-preferred packs.

Instead, we can only safely perform the whole-word reuse optimization on
the preferred pack, where we know with certainty that no gaps exist in
that region of the bitmap. We can still reuse objects from non-preferred
packs, but we have to inspect them individually in write_reused_pack()
to ensure that any gaps that may exist are accounted for.

This allows us to simplify the implementation of
write_reused_pack_verbatim() back to almost its pre-multi-pack reuse
form, since we can now assume that the beginning of the pack appears at
the beginning of the bitmap, meaning that we don't have to account for
any bits up to the first word boundary (like we had to special case in
ca0fd69e37).

The only significant changes from the pre-ca0fd69e37 implementation are:

 - that we can no longer inspect words up to the end of
   reuse_packfile_bitmap->word_alloc, since we only want to look at
   words whose bits all correspond to objects in the given packfile, and

 - that we return early when given a reuse_packfile which is not
   preferred, making the call a noop.

In the future, it might be possible to restore this optimization if we
could guarantee that some reuse packs don't contain any gaps by
construction (similar to the "disjoint packs" idea in very early
versions of multi-pack reuse).

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15 09:13:31 +09:00
57f35cfd7c t5332-multi-pack-reuse.sh: demonstrate duplicate packing failure
In the multi-pack reuse code, there are two paths for reusing the
on-disk representation of an object, handled by:

  - builtin/pack-objects.c::write_reused_pack_one()
  - builtin/pack-objects.c::write_reused_pack_verbatim()

The former is responsible for copying the bytes for a single object out
of an existing source pack. The latter does the same but for a region of
objects aligned at eword_t boundaries.

Demonstrate a bug whereby write_reused_pack_verbatim() can be tricked
into writing out objects from some source pack, even when those objects
were selected from a different source pack in the MIDX bitmap.

When the caller wants at least one of the objects in that region,
pack-objects will write the same object twice as a result of this bug.
In the other case where the caller doesn't want any of the objects in
the region of interest, we will write out objects that weren't
requested.

Demonstrate this bug by creating two packs, where the preferred one of
those packs contains a single object which also appears in the main
(non-preferred) pack. A separate bug[^1] prevents us from triggering the
main bug when the duplicated object is the last one in the main pack,
but any earlier object will suffice.

We could fix that separate bug, but the following commit will simplify
write_reused_pack_verbatim() and only call it on the preferred pack, so
doing so would have little point.

[^1]: Because write_reused_pack_verbatim() only reuses bits in the range

    off_t pack_start_off = pack_pos_to_offset(reuse_packfile->p, 0);
    off_t pack_end_off = pack_pos_to_offset(reuse_packfile->p,
                                            pos - reuse_packfile->bitmap_pos);

    written += pos - reuse_packfile->bitmap_pos;

    /* We're recording one chunk, not one object. */
    record_reused_object(pack_start_off,
                         pack_start_off - (hashfile_total(out) - pack_start));

  , or in other words excluding the object beginning at position 'pos -
  reuse_packfile->bitmap_pos' in the source pack. But since
  reuse_packfile->bitmap_pos is '1' in the non-preferred pack
  (accounting for the single-object pack which is preferred), we don't
  actually copy the bytes from the last object.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15 09:13:31 +09:00
b886db48c6 refs: don't invoke reference-transaction hook for reflogs
The reference-transaction hook is invoked whenever there is a reference
update being performed. For each state of the transaction, we iterate
over the updates present and pass this information to the hook.

The `ref_update` structure is used to hold these updates within a
`transaction`. We use the same structure for holding reflog updates too.
Which means that the reference transaction hook is also obtaining
information about a reflog update. This is a bug, since:

  - The hook is designed to work with reference updates and reflogs
  updates are different.
  - The hook doesn't have the required information to distinguish
  reference updates from reflog updates.

This is particularly evident when the default branch (pointed by HEAD)
is updated, we see that the hook also receives information about HEAD
being changed. In reality, we only add a reflog update for HEAD, while
HEAD's values remains the same.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15 08:48:23 +09:00
72ad6dc368 test-lib: move malloc-debug setup after $PATH setup
Originally, the conditional definition of the setup/teardown functions
for malloc checking could be run at any time, because they depended only
on command-line options and the system getconf function.

But since 02d900361c (test-lib: check malloc debug LD_PRELOAD before
using, 2024-11-11), we probe the system by running "git version". Since
this code runs before we've set $PATH to point to the version of Git we
intend to test, we actually run the system version of git.

This mostly works, since what we really care about is whether the
LD_PRELOAD works, and it should work the same with any program. But
there are some corner cases:

  1. You might not have a system git at all, in which case the preload
     will appear to fail, even though it could work with the actual
     built version of git.

  2. Your system git could be linked in a different way. For example, if
     it was built statically, then it will ignore LD_PRELOAD entirely,
     and we might assume that the preload works, even though it might
     not when used with a dynamic build.

We could give a more complete path to the version of Git we intend to
test, but features like GIT_TEST_INSTALLED make that not entirely
trivial. So instead, let's just bump the setup until after we've set up
the $PATH. There's no need for us to do it early, as long as it is done
before the first test runs.

Reported-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-14 12:19:26 +09:00
25b0f41288 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-13 08:35:34 +09:00
183ea3eabf Merge branch 'ps/mingw-rename'
The MinGW compatibility layer has been taught to support POSIX
semantics for atomic renames when other process(es) have a file
opened at the destination path.

* ps/mingw-rename:
  compat/mingw: support POSIX semantics for atomic renames
  compat/mingw: allow deletion of most opened files
  compat/mingw: share file handles created via `CreateFileW()`
2024-11-13 08:35:34 +09:00
486c9d3995 Merge branch 'jt/commit-graph-missing'
A regression where commit objects missing from a commit-graph can
cause an infinite loop when doing a fetch in a partial clone has
been fixed.

* jt/commit-graph-missing:
  fetch-pack: die if in commit graph but not obj db
  Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
2024-11-13 08:35:33 +09:00
51ba601160 Merge branch 'en/shallow-exclude-takes-a-ref-fix'
The "--shallow-exclude=<ref>" option to various history transfer
commands takes a ref, not an arbitrary revision.

* en/shallow-exclude-takes-a-ref-fix:
  doc: correct misleading descriptions for --shallow-exclude
  upload-pack: fix ambiguous error message
2024-11-13 08:35:32 +09:00
110c8fe8f5 Merge branch 'ak/t1016-style'
Test modernization.

* ak/t1016-style:
  t1016: clean up style
2024-11-13 08:35:32 +09:00
6890c99e38 Merge branch 'ps/leakfixes-part-9'
More leakfixes.

* ps/leakfixes-part-9: (22 commits)
  list-objects-filter-options: work around reported leak on error
  builtin/merge: release output buffer after performing merge
  dir: fix leak when parsing "status.showUntrackedFiles"
  t/helper: fix leaking buffer in "dump-untracked-cache"
  t/helper: stop re-initialization of `the_repository`
  sparse-index: correctly free EWAH contents
  dir: release untracked cache data
  combine-diff: fix leaking lost lines
  builtin/tag: fix leaking key ID on failure to sign
  transport-helper: fix leaking import/export marks
  builtin/commit: fix leaking cleanup config
  trailer: fix leaking strbufs when formatting trailers
  trailer: fix leaking trailer values
  builtin/commit: fix leaking change data contents
  upload-pack: fix leaking URI protocols
  pretty: clear signature check
  diff-lib: fix leaking diffopts in `do_diff_cache()`
  revision: fix leaking bloom filters
  builtin/grep: fix leak with `--max-count=0`
  grep: fix leak in `grep_splice_or()`
  ...
2024-11-13 08:35:31 +09:00
98e4015593 builtin/difftool: intialize some hashmap variables
When running a dir-diff command that produces no diff, variables
`wt_modified` and `tmp_modified` are used while uninitialized, causing:

    $ /home/smarchi/src/git/git-difftool --dir-diff master
    free(): invalid pointer
    [1]    334004 IOT instruction (core dumped)  /home/smarchi/src/git/git-difftool --dir-diff master
    $ valgrind --track-origins=yes /home/smarchi/src/git/git-difftool --dir-diff master
    ...
    Invalid free() / delete / delete[] / realloc()
       at 0x48478EF: free (vg_replace_malloc.c:989)
       by 0x422CAC: hashmap_clear_ (hashmap.c:208)
       by 0x283830: run_dir_diff (difftool.c:667)
       by 0x284103: cmd_difftool (difftool.c:801)
       by 0x238E0F: run_builtin (git.c:484)
       by 0x2392B9: handle_builtin (git.c:750)
       by 0x2399BC: cmd_main (git.c:921)
       by 0x356FEF: main (common-main.c:64)
     Address 0x1ffefff180 is on thread 1's stack
     in frame #2, created by run_dir_diff (difftool.c:358)
    ...

If taking any `goto finish` path before these variables are initialized,
`hashmap_clear_and_free()` operates on uninitialized data, sometimes
causing a crash.

This regression was introduced in 7f795a1715 (builtin/difftool: plug
several trivial memory leaks, 2024-09-26).

Fix it by initializing those variables with the `HASHMAP_INIT` macro.

Add a test comparing the main branch to itself, resulting in no diff.

Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-13 08:11:19 +09:00
fe17a25905 refspec: store raw refspecs inside refspec_item
The refspec struct keeps two matched arrays: one for the refspec_item
structs and one for the original raw refspec strings. The main reason
for this is that there are other users of refspec_item that do not care
about the raw strings. But it does make managing the refspec struct
awkward, as we must keep the two arrays in sync. This has led to bugs in
the past (both leaks and double-frees).

Let's just store a copy of the raw refspec string directly in each
refspec_item struct. This simplifies the handling at a small cost:

  1. Direct callers of refspec_item_init() will now get an extra copy of
     the refspec string, even if they don't need it. This should be
     negligible, as the struct is already allocating two strings for the
     parsed src/dst values (and we tend to only do it sparingly anyway
     for things like the TAG_REFSPEC literal).

  2. Users of refspec_appendf() will now generate a temporary string,
     copy it, and then free the result (versus handing off ownership of
     the temporary string). We could get around this by having a "nodup"
     variant of refspec_item_init(), but it doesn't seem worth the extra
     complexity for something that is not remotely a hot code path.

Code which accesses refspec->raw now needs to look at refspec->item.raw.
Other callers which just use refspec_item directly can remain the same.
We'll free the allocated string in refspec_item_clear(), which they
should be calling anyway to free src/dst.

One subtle note: refspec_item_init() can return an error, in which case
we'll still have set its "raw" field. But that is also true of the "src"
and "dst" fields, so any caller which does not _clear() the failed item
is already potentially leaking. In practice most code just calls die()
on an error anyway, but you can see the exception in valid_fetch_refspec(),
which does correctly call _clear() even on error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12 18:16:48 +09:00
d36af33081 refspec: drop separate raw_nr count
A refspec struct contains zero or more refspec_item structs, along with
matching "raw" strings. The items and raw strings are kept in separate
arrays, but those arrays will always have the same length (because we
write them only via refspec_append_nodup(), which grows both). This can
lead to bugs when manipulating the array, since the arrays and lengths
must be modified in lockstep. For example, the bug fixed in the previous
commit, which forgot to decrement raw_nr.

So let's get rid of "raw_nr" and have only "nr", making this kind of bug
impossible (and also making it clear that the two are always matched,
something that existing code already assumed but was not guaranteed by
the interface).

Even though we'd expect "alloc" and "raw_alloc" to likewise move in
lockstep, we still need to keep separate counts there if we want to
continue to use ALLOC_GROW() for both.

Conceptually this would all be simpler if refspec_item just held onto
its own raw string, and we had a single array. But there are callers
which use refspec_item outside of "struct refspec" (and so don't hold on
to a matching "raw" string at all), which we'd possibly need to adjust.
So let's not worry about refactoring that for now, and just get rid of
the redundant count variable. That is the first step on the road to
combining them anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12 18:16:48 +09:00
b970509c59 fetch: adjust refspec->raw_nr when filtering prefetch refspecs
In filter_prefetch_refspecs(), we may remove one or more refspecs if
they point into refs/tags/. When we do, we remove the item from the
refspec->items array, shifting subsequent items down, and then decrement
the refspec->nr count.

We also remove the item from the refspec->raw array, but fail to
decrement refspec->raw_nr. This leaves us with a count that is too high,
and anybody looking at the "raw" array will erroneously see either:

  1. The removed entry, if there were no subsequent items to shift down.

  2. A duplicate of the final entry, as everything is shifted down but
     there was nothing to overwrite the final item.

The obvious culprit to run into this is calling refspec_clear(), which
will try to free the removed entry (case 1) or double-free the final
entry (case 2). But even though the bug has existed since the function
was added in 2e03115d0c (fetch: add --prefetch option, 2021-04-16), we
did not trigger it in the test suite. The --prefetch option is normally
only used with configured refspecs, and we never bother to call
refspec_clear() on those (they are stored as part of a struct remote,
which is held in a global variable).

But you could trigger case 2 manually like:

  git fetch --prefetch . refs/tags/foo refs/tags/bar

Ironically you couldn't trigger case 1, because the code accidentally
leaked the string in the raw array, and the two bugs (the leak and the
double-free) cancelled out. But when we fixed the leak in ea4780307c
(fetch: free "raw" string when shrinking refspec, 2024-09-24), it became
possible to trigger that, too, with a single item:

  git fetch --prefetch . refs/tags/foo

We can fix both cases by just correctly decrementing "raw_nr" when we
shrink the array. Even though we don't expect people to use --prefetch
with command-line refspecs, we'll add a test to make sure it behaves
well (like the test just before it, we're just confirming that the
filtered prefetch succeeds at all).

Reported-by: Eric Mills <ermills@epic.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12 18:16:47 +09:00
c08589efdc index-pack: repack local links into promisor packs
Teach index-pack to, when processing the objects in a pack with
--promisor specified on the CLI, repack local objects (and the local
objects that they refer to, recursively) referenced by these objects
into promisor packs.

This prevents the situation in which, when fetching from a promisor
remote, we end up with promisor objects (newly fetched) referring
to non-promisor objects (locally created prior to the fetch). This
situation may arise if the client had previously pushed objects to the
remote, for example. One issue that arises in this situation is that,
if the non-promisor objects become inaccessible except through promisor
objects (for example, if the branch pointing to them has moved to
point to the promisor object that refers to them), then GC will garbage
collect them. There are other ways to solve this, but the simplest
seems to be to enforce the invariant that we don't have promisor objects
referring to non-promisor objects.

This repacking is done from index-pack to minimize the performance
impact. During a fetch, the only time most objects are fully inflated
in memory is when their object ID is computed, so we also scan the
objects (to see which objects they refer to) during this time.

Also to minimize the performance impact, an object is calculated to be
local if it's a loose object or present in a non-promisor pack. (If it's
also in a promisor pack or referred to by an object in a promisor pack,
it is technically already a promisor object. But a misidentification
of a promisor object as a non-promisor object is relatively benign
here - we will thus repack that promisor object into a promisor pack,
duplicating it in the object store, but there is no correctness issue,
just an issue of inefficiency.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12 10:18:16 +09:00
0c2c5e5f2e doc: git-add.txt: convert to new style convention
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12 09:32:18 +09:00
02d900361c test-lib: check malloc debug LD_PRELOAD before using
This fixes test failures across the suite on glibc platforms that don't
have libc_malloc_debug.so.0.

We added support for glibc's malloc checking routines long ago in
a731fa916e (Add MALLOC_CHECK_ and MALLOC_PERTURB_ libc env to the test
suite for detecting heap corruption, 2012-09-14). Back then we didn't
need to do any checks to see if the platform supported it. We were just
setting some environment variables which would either enable it or not.

That changed in 131b94a10a (test-lib.sh: Use GLIBC_TUNABLES instead of
MALLOC_CHECK_ on glibc >= 2.34, 2022-03-04). Now that glibc split this
out into libc_malloc_debug.so, we have to add it to LD_PRELOAD. We only
do that when we detect glibc, but it's possible to have glibc but not
the malloc debug library. In that case LD_PRELOAD will complain to
stderr, and tests which check for an empty stderr will fail.

You can work around this by setting TEST_NO_MALLOC_CHECK, which disables
the feature entirely. But it's not obvious to know you need to do that.
Instead, since this malloc checking is best-effort anyway, let's just
automatically disable it when the LD_PRELOAD appears not to work. We can
check it by running something simple that should work (and produce
nothing on stderr) like "git version".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12 07:44:28 +09:00
b31fb630c0 Merge https://github.com/j6t/git-gui
* https://github.com/j6t/git-gui:
  git gui: add directly calling merge tool from configuration
  git-gui: strip commit messages less aggressively
  git-gui: strip comments and consecutive empty lines from commit messages
2024-11-11 12:47:44 +09:00
34d3f2a984 t5300: add test for 'show-index --object-format'
In 88a09a557c (builtin/show-index: provide options to determine hash
algo), the flag --object-format was added to show-index builtin as a way
to provide a hash algorithm explicitly. However, we do not have tests in
place for that functionality. Add them.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-11 12:17:00 +09:00
4da8d90fdd show-index: fix uninitialized hash function
In c8aed5e8da (repository: stop setting SHA1 as the default object
hash), we got rid of the default hash algorithm for the_repository.
Due to this change, it is now the responsibility of the callers to set
their own default when this is not present.

As stated in the docs, show-index should use SHA1 as the default hash
algorithm when run outside a repository. Make sure this promise is
met by falling back to SHA1 when the_hash_algo is not present (i.e.
when the command is run outside a repository). Also add a test that
verifies this behavior.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-11 12:16:59 +09:00
e5033898da Merge branch 'ob/strip-comments-on-commit'
* ob/strip-comments-on-commit:
  git-gui: strip commit messages less aggressively
  git-gui: strip comments and consecutive empty lines from commit messages
2024-11-09 14:37:45 +01:00
492550155a Merge branch 'tb/mergetool-from-config'
* tb/mergetool-from-config:
  git gui: add directly calling merge tool from configuration
2024-11-09 14:34:50 +01:00
facbe4f633 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-08 12:56:28 +09:00
02a2d5706d Merge branch 'jk/left-right-bitmap'
When called with '--left-right' and '--use-bitmap-index', 'rev-list'
will produce output without any left/right markers, which has been
corrected.

* jk/left-right-bitmap:
  rev-list: skip bitmap traversal for --left-right
2024-11-08 12:56:28 +09:00
1ee7dbde67 Merge branch 'ps/upgrade-clar'
Buildfix and upgrade of Clar to a newer version.

* ps/upgrade-clar:
  cmake: set up proper dependencies for generated clar headers
  cmake: fix compilation of clar-based unit tests
  Makefile: extract script to generate clar declarations
  Makefile: adjust sed command for generating "clar-decls.h"
  t/unit-tests: update clar to 206accb
2024-11-08 12:56:28 +09:00
31fe1390cd Merge branch 'cw/config-extensions'
Centralize documentation for repository extensions into a single place.

* cw/config-extensions:
  doc: consolidate extensions in git-config documentation
2024-11-08 12:56:27 +09:00
a2ac8b0707 Merge branch 'kn/ci-clang-format-tidy'
Updates the '.clang-format' to match project conventions.

* kn/ci-clang-format-tidy:
  clang-format: align consecutive macro definitions
  clang-format: re-adjust line break penalties
2024-11-08 12:56:26 +09:00
c14fa9a511 Merge branch 'kn/arbitrary-suffixes'
Update the project's CodingGuidelines to discourage naming functions
with a "_1()" suffix.

* kn/arbitrary-suffixes:
  CodingGuidelines: discourage arbitrary suffixes in function names
2024-11-08 12:56:26 +09:00
b8150bfee1 describe: stop traversing when we run out of names
When trying to describe a commit, we'll traverse from the commit,
collecting candidate tags that point to its ancestors. But once we've
seen all of the tags in the repo, there's no point in traversing
further. There's nothing left to find!

For a default "git describe", this isn't usually a big problem. In a
large repo you'll probably have multiple tags, so we'll eventually find
10 candidates (the default for max_candidates) and stop there. And in a
small repo, it's quick to traverse to the root.

But you can imagine a large repo with few tags. Or, as we saw in a real
world case, explicitly limiting the set of matches like this (on
linux.git):

  git describe --match=v6.12-rc4 HEAD

which goes all the way to the root before realizing that no, there are
no other tags under consideration besides the one we fed via --match.
If we add in "--candidates=1" there, it's much faster (at least as of
the previous commit).

But we should be able to speed this up without the user asking for it.
After expanding all matching tags, we know the total number of names. We
could just stop the traversal there, but as hinted at above we already
have a mechanism for doing that: the max_candidate limit. So we can just
reduce that limit to match the number of possible candidates.

Our p6100 test shows this off:

  Test                                           HEAD^             HEAD
  ---------------------------------------------------------------------------------------
  6100.2: describe HEAD                          0.71(0.65+0.06)   0.72(0.68+0.04) +1.4%
  6100.3: describe HEAD with one max candidate   0.01(0.00+0.00)   0.01(0.00+0.00) +0.0%
  6100.4: describe HEAD with one tag             0.72(0.66+0.05)   0.01(0.00+0.00) -98.6%

Now we are fast automatically, just as if --candidates=1 were supplied
by the user.

Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Helped-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07 13:28:22 +09:00
7379046221 describe: stop digging for max_candidates+1
By default, describe considers only 10 candidate matches, and stops
traversing when we have enough. This makes things much faster in a large
repository, where collecting all candidates requires walking all the way
down to the root (or at least to the oldest tag). This goes all the way
back to 8713ab3079 (Improve git-describe performance by reducing
revision listing., 2007-01-13).

However, we don't stop immediately when we have enough candidates. We
keep traversing and only bail when we find one more candidate that we're
ignoring. Usually this is not too expensive, if the tags are sprinkled
evenly throughout history. But if you are unlucky, you might hit the max
candidate quickly, and then have a huge swath of history before finding
the next one.

Our p6100 test has exactly this unlucky case: with a max of "1", we find
a recent tag quickly and then have to go all the way to the root to find
the old tag that will be discarded.

A more interesting real-world case is:

  git describe --candidates=1 --match=v6.12-rc4 HEAD

in the linux.git repo. There we restrict the set of tags to a single
one, so there is no older candidate to find at all! But despite
--candidates=1, we keep traversing to the root only to find nothing.

So why do we keep traversing after hitting thet max? There are two
reasons I can see:

  1. In theory the extra information that there was another candidate
     could be useful, and we record it in the gave_up_on variable. But
     we only show this information with --debug.

  2. After finding the candidate, there's more processing we do in our
     loop. The most important of this is propagating the "within" flags
     to our parent commits, and putting them in the commit_list we'll
     use for finish_depth_computation().

     That function continues the traversal until we've counted all
     commits reachable from the starting point but not reachable from
     our best candidate tag (so essentially counting "$tag..$start", but
     avoiding re-walking over the bits we've seen).  If we break
     immediately without putting those commits into the list, our depth
     computation will be wrong (in the worst case we'll count all the
     way down to the root, not realizing those commits are included in
     our tag).

But we don't need to find a new candidate for (2). As soon as we finish
the loop iteration where we hit max_candidates, we can then quit on the
next iteration. This should produce the same output as the original code
(which could, after all, find a candidate on the very next commit
anyway) but ends the traversal with less pointless digging.

We still have to set "gave_up_on"; we've popped it off the list and it
has to go back. An alternative would be to re-order the loop so that it
never gets popped, but it's perhaps still useful to show in the --debug
output, so we need to know it anyway. We do have to adjust the --debug
output since it's now just a commit where we stopped traversing, and not
the max+1th candidate.

p6100 shows the speedup using linux.git:

  Test                                           HEAD^             HEAD
  ---------------------------------------------------------------------------------------
  6100.2: describe HEAD                          0.70(0.63+0.06)   0.71(0.66+0.04) +1.4%
  6100.3: describe HEAD with one max candidate   0.70(0.64+0.05)   0.01(0.00+0.00) -98.6%
  6100.4: describe HEAD with one tag             0.70(0.67+0.03)   0.70(0.63+0.06) +0.0%

Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Helped-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07 13:28:22 +09:00
bb0830c682 t/perf: add tests for git-describe
We don't have a perf script for git-describe, despite it often being
accused of slowness. Let's add a few simple tests to start with.

Rather than use the existing tags from our test repo, we'll make our own
so that we have a known quantity and position. We'll add a "new" tag
near the tip of HEAD, and an "old" one that is at the very bottom. And
then our tests are:

  1. Describing HEAD naively requires walking all the way down to the
     old tag as we collect candidates. This gives us a baseline for what
     "slow" looks like.

  2. Doing the same with --candidates=1 can potentially be fast, because
     we can quie after finding "new". But we don't, and it's also slow.

  3. Likewise we should be able to quit when there are no more tags to
     find. This can happen naturally if a repo has few tags, but also if
     you restrict the set of tags with --match.

Here are the results running against linux.git. Note that I have a
commit-graph built for the repo, so "slow" here is ~700ms. Without a
commit graph it's more like 9s!

  Test                                           HEAD
  --------------------------------------------------------------
  6100.2: describe HEAD                          0.70(0.66+0.04)
  6100.3: describe HEAD with one max candidate   0.70(0.66+0.04)
  6100.4: describe HEAD with one tag             0.70(0.64+0.06)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07 13:28:22 +09:00
d0e52c1728 t6120: demonstrate weakness in disjoint-root handling
Commit 30b1c7ad9d (describe: don't abort too early when searching tags,
2020-02-26) tried to fix a problem that happens when there are disjoint
histories: to accurately compare the counts for different tags, we need
to keep walking the history longer in order to find a common base.

But its fix misses a case: we may still bail early if we hit the
max_candidates limit, producing suboptimal output. You can see this in
action by adding "--candidates=2" to the tests; we'll stop traversing as
soon as we see the second tag and will produce the wrong answer. I hit
this in practice while trying to teach git-describe not to keep looking
for candidates after we've seen all tags in the repo (effectively adding
--candidates=2, since these toy repos have only two tags each).

This is probably fixable by continuing to walk after hitting the
max-candidates limit, all the way down to a common ancestor of all
candidates. But it's not clear in practice what the preformance
implications would be (it would depend on how long the branches that
hold the candidates are).

So I'm punting on that for now, but I'd like to adjust the tests to be
more resilient, and to document the findings. So this patch:

  1. Adds an extra tag at the bottom of history. This shouldn't change
     the output, but does mean we are more resilient to low values of
     --candidates (e.g., if we start reducing it to the total number of
     tags). This is arguably closer to the real world anyway, where
     you're not going to have just 2 tags, but an arbitrarily long
     history going back in time, possibly with multiple irrelevant tags
     in it (I called the new tag "H" here for "history").

  2. Run the same tests with --candidates=2, which shows that even with
     the current code they can fail if we end the traversal early. That
     leaves a trail for anybody interested in trying to improve the
     behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07 13:28:21 +09:00
2664f2a0cb Merge branch 'ps/leakfixes-part-9' into ps/leakfixes-part-10
* ps/leakfixes-part-9: (22 commits)
  list-objects-filter-options: work around reported leak on error
  builtin/merge: release output buffer after performing merge
  dir: fix leak when parsing "status.showUntrackedFiles"
  t/helper: fix leaking buffer in "dump-untracked-cache"
  t/helper: stop re-initialization of `the_repository`
  sparse-index: correctly free EWAH contents
  dir: release untracked cache data
  combine-diff: fix leaking lost lines
  builtin/tag: fix leaking key ID on failure to sign
  transport-helper: fix leaking import/export marks
  builtin/commit: fix leaking cleanup config
  trailer: fix leaking strbufs when formatting trailers
  trailer: fix leaking trailer values
  builtin/commit: fix leaking change data contents
  upload-pack: fix leaking URI protocols
  pretty: clear signature check
  diff-lib: fix leaking diffopts in `do_diff_cache()`
  revision: fix leaking bloom filters
  builtin/grep: fix leak with `--max-count=0`
  grep: fix leak in `grep_splice_or()`
  ...
2024-11-07 13:25:01 +09:00
391bceae43 compat/mingw: support POSIX semantics for atomic renames
By default, Windows restricts access to files when those files have been
opened by another process. As explained in the preceding commits, these
restrictions can be loosened such that reads, writes and/or deletes of
files with open handles _are_ allowed.

While we set up those sharing flags in most relevant code paths now, we
still don't properly handle POSIX-style atomic renames in case the
target path is open. This is failure demonstrated by t0610, where one of
our tests spawns concurrent writes in a reftable-enabled repository and
expects all of them to succeed. This test fails most of the time because
the process that has acquired the "tables.list" lock is unable to rename
it into place while other processes are busy reading that file.

Windows 10 has introduced the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag
that allows us to fix this usecase [1]. When set, it is possible to
rename a file over a preexisting file even when the target file still
has handles open. Those handles must have been opened with the
`FILE_SHARE_DELETE` flag, which we have ensured in the preceding
commits.

Careful readers might have noticed that [1] does not mention the above
flag, but instead mentions `FILE_RENAME_POSIX_SEMANTICS`. This flag is
not for use with `SetFileInformationByHandle()` though, which is what we
use. And while the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag exists, it is
not documented on [2] or anywhere else as far as I can tell.

Unfortunately, we still support Windows systems older than Windows 10
that do not yet have this new flag. Our `_WIN32_WINNT` SDK version still
targets 0x0600, which is Windows Vista and later. And even though that
Windows version is out-of-support, bumping the SDK version all the way
to 0x0A00, which is Windows 10 and later, is not an option as it would
make it impossible to compile on Windows 8.1, which is still supported.
Instead, we have to manually declare the relevant infrastructure to make
this feature available and have fallback logic in place in case we run
on a Windows version that does not yet have this flag.

On another note: `mingw_rename()` has a retry loop that is used in case
deleting a file failed because it's still open in another process. One
might be pressed to not use this loop anymore when we can use POSIX
semantics. But unfortunately, we have to keep it around due to our
dependence on the `FILE_SHARE_DELETE` flag. While we know to set that
sharing flag now, other applications may not do so and may thus still
cause sharing violations when we try to rename a file.

This fixes concurrent writes in the reftable backend as demonstrated in
t0610, but may also end up fixing other usecases where Git wants to
perform renames.

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_rename_information
[2]: https://learn.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_rename_info

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-06 00:15:25 -08:00
5d4cc78f72 fetch-pack: die if in commit graph but not obj db
When fetching, there is a step in which sought objects are first checked
against the local repository; only objects that are not in the local
repository are then fetched. This check first looks up the commit graph
file, and returns "present" if the object is in there.

However, the action of first looking up the commit graph file is not
done everywhere in Git, especially if the type of the object at the time
of lookup is not known. This means that in a repo corruption situation,
a user may encounter an "object missing" error, attempt to fetch it, and
still encounter the same error later when they reattempt their original
action, because the object is present in the commit graph file but not in
the object DB.

Therefore, make it a fatal error when this occurs. (Note that we cannot
proceed to include this object in the list of objects to be fetched
without changing at least the fetch negotiation code: what would happen
is that the client will send "want X" and "have X" and when I tested
at $DAYJOB with a work server that uses JGit, the server reasonably
returned an empty packfile. And changing the fetch negotiation code to
only use the object DB when deciding what to report as "have" would be
an unnecessary slowdown, I think.)

This was discovered when a lazy fetch of a missing commit completed with
nothing actually fetched, and the writing of the commit graph file after
every fetch then attempted to read said missing commit, triggering a
lazy fetch of said missing commit, resulting in an infinite loop with no
user-visible indication (until they check the list of processes running
on their computer). With this fix, there is no infinite loop. Note that
although the repo corruption we discovered was caused by a bug in GC in
a partial clone, the behavior that this patch teaches Git to warn about
applies to any repo with commit graph enabled and with a missing commit,
whether it is a partial clone or not.

t5330, introduced in 3a1ea94a49 (commit-graph.c: no lazy fetch in
lookup_commit_in_graph(), 2022-07-01), tests that an interaction between
fetch and the commit graph does not cause an infinite loop. This patch
changes the exit code in that situation, so that test had to be changed.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05 18:57:22 -08:00
bf1feb9e53 Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
This reverts commit a6e65fb39c.

This revert simplifies the next patch in this patch set.

The commit message of that commit mentions that the new function "will
be used for the bundle-uri client in a subsequent commit", but it seems
that eventually it wasn't used.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05 18:57:22 -08:00
00e10e0751 doc: correct misleading descriptions for --shallow-exclude
The documentation for the --shallow-exclude option to clone/fetch/etc.
claims that the option takes a revision, but it does not.  As per
upload-pack.c's process_deepen_not(), it passes the option to
expand_ref() and dies if it does not find exactly one ref matching the
name passed.  Further, this has always been the case ever since these
options were introduced by the commits merged in a460ea4a3c (Merge
branch 'nd/shallow-deepen', 2016-10-10).  Fix the documentation to
match the implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:53:23 -08:00
c810549be1 list-objects-filter-options: work around reported leak on error
This one is a little bit more curious. In t6112, we have a test that
exercises the `git rev-list --filter` option with invalid filters. We
execute git-rev-list(1) via `test_must_fail`, which means that we check
for leaks even though Git exits with an error code. This causes the
following leak:

    Direct leak of 27 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8
        #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2
        #3 0x5555558b7550 in strbuf_add strbuf.c:311:2
        #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2
        #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3
        #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3
        #7 0x555555884e20 in setup_revisions revision.c:3014:11
        #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9
        #9 0x5555555ec5e3 in run_builtin git.c:483:11
        #10 0x5555555eb1e4 in handle_builtin git.c:749:13
        #11 0x5555555ec001 in run_argv git.c:819:4
        #12 0x5555555eaf94 in cmd_main git.c:954:19
        #13 0x5555556fd569 in main common-main.c:64:11
        #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d)
        #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208)
        #16 0x5555555ad064 in _start (git+0x59064)

This leak is valid, as we call `die()` and do not clean up the memory at
all. But what's curious is that this is the only leak reported, because
we don't clean up any other allocated memory, either, and I have no idea
why the leak sanitizer treats this buffer specially.

In any case, we can work around the leak by shuffling things around a
bit. Instead of calling `gently_parse_list_objects_filter()` and dying
after we have modified the filter spec, we simply do so beforehand. Like
this we don't allocate the buffer in the error case, which makes the
reported leak go away.

It's not pretty, but it manages to make t6112 leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:57 -08:00
ff67083ccd builtin/merge: release output buffer after performing merge
The `obuf` member of `struct merge_options` is used to buffer output in
some cases. In order to not discard its allocated memory we only release
its contents in `merge_finalize()` when we're not currently recursing
into a subtree.

This results in some situations where we seemingly do not release the
buffer reliably. We thus have calls to `strbuf_release()` for this
buffer scattered across the codebase. But we're missing one callsite in
git-merge(1), which causes a memory leak.

We should ideally refactor this interface so that callers don't have to
know about any such internals. But for now, paper over the issue by
adding one more `strbuf_release()` call.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:57 -08:00
813b12b6f7 dir: fix leak when parsing "status.showUntrackedFiles"
We use `repo_config_get_string()` to read "status.showUntrackedFiles"
from the config subsystem. This function allocates the result, but we
never free the result after parsing it.

The value never leaves the scope of the calling function, so refactor it
to instead use `repo_config_get_string_tmp()`, which does not hand over
ownership to the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:56 -08:00
0bc0fcf0b2 t/helper: fix leaking buffer in "dump-untracked-cache"
We never release the local `struct strbuf base` buffer, thus leaking
memory. Fix this leak.

This leak is exposed by t7063, but plugging it alone does not make the
whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:56 -08:00
a53144cf1b t/helper: stop re-initialization of the_repository
While "common-main.c" already initializes `the_repository` for us, we do
so a second time in the "read-cache" test helper. This causes a memory
leak because the old repository's contents isn't released.

Stop calling `initialize_repository()` to plug this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:56 -08:00
1f5ff83eab sparse-index: correctly free EWAH contents
While we free the `fsmonitor_dirty` member of `struct index_state`, we
do not free the contents of that EWAH. Do so by using `ewah_free()`
instead of `FREE_AND_NULL()`.

This leak is exposed by t7519, but plugging it alone does not make the
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:56 -08:00
e4ba54d47b dir: release untracked cache data
There are several cases where we invalidate untracked cache directory
entries where we do not free the underlying data, but reset the number
of entries. This causes us to leak memory because `free_untracked()`
will not iterate over any potential entries which we still had in the
array.

Fix this issue by freeing old entries. The leak is exposed by t7519, but
plugging it alone does not make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:55 -08:00
1981d1eb3e combine-diff: fix leaking lost lines
The `cnt` variable tracks the number of lines in a patch diff. It can
happen though that there are no newlines, in which case we'd still end
up allocating our array of `sline`s. In fact, we always allocate it with
`cnt + 2` entries: one extra entry for the deletion hunk at the end, and
another entry that we don't seem to ever populate at all but acts as a
kind of sentinel value.

When we loop through the array to clear it at the end of this function
we only loop until `lno < cnt`, and thus we may not end up releasing
whatever the two extra `sline`s contain. While that shouldn't matter for
the sentinel value, it does matter for the extra deletion hunk sline.
Regardless of that, plug this memory leak by releasing both extra
entries, which makes the logic a bit easier to reason about.

While at it, fix the formatting of a local comment, which incidentally
also provides the necessary context for why we overallocate the `sline`
array.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:55 -08:00
d06e3ec858 builtin/tag: fix leaking key ID on failure to sign
We do not free the key ID when signing a tag fails. Do so by using
the common exit path.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:55 -08:00
1a99173de0 transport-helper: fix leaking import/export marks
Fix leaking import and export marks for transport helpers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:54 -08:00
6ef9f77a15 builtin/commit: fix leaking cleanup config
The cleanup string set by the config is leaking when it is being
overridden by an option. Fix this by tracking these via two separate
variables such that we can free the old value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:54 -08:00
ff31b7b941 trailer: fix leaking strbufs when formatting trailers
When formatting trailer lines we iterate through each of the trailers
and munge their respective token/value pairs according to the trailer
options. When formatting a trailer that has its `item->token` pointer
set we perform the munging in two local buffers. In the case where we
figure out that the value is empty and `trim_empty` is set we just skip
over the trailer item. But the buffers are local to the loop and we
don't release their contents, leading to a memory leak.

Plug this leak by lifting the buffers outside of the loop and releasing
them on function return. This fixes the memory leaks, but also optimizes
the loop as we don't have to reallocate the buffers on every single
iteration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:54 -08:00
3f692fe5be trailer: fix leaking trailer values
Fix leaking trailer values when replacing the value with a command or
when the token value is empty.

This leak is exposed by t7513, but plugging it does not make the whole
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:54 -08:00
d34b5cbf02 builtin/commit: fix leaking change data contents
While we free the worktree change data, we never free its contents. Fix
this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:53 -08:00
3b373150c8 upload-pack: fix leaking URI protocols
We don't clear `struct upload_pack::uri_protocols`, which causes a
memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:53 -08:00
0b20a28811 pretty: clear signature check
The signature check in the formatting context is never getting released.
Fix this to plug the resulting memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:53 -08:00
8dd3cb4b45 diff-lib: fix leaking diffopts in do_diff_cache()
In `do_diff_cache()` we initialize a new `rev_info` and then overwrite
its `diffopt` with a user-provided set of options. This can leak memory
because `repo_init_revisions()` may end up allocating memory for the
`diffopt` itself depending on the configuration. And since that field is
overwritten we won't ever free it.

Plug the memory leak by releasing the diffopts before we overwrite them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:52 -08:00
e29ff075e0 revision: fix leaking bloom filters
The memory allocated by `prepare_to_use_bloom_filter()` is not released
by `release_revisions()`, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:52 -08:00
43fedde3df builtin/grep: fix leak with --max-count=0
When executing with `--max-count=0` we'll return early from git-grep(1)
without performing any cleanup, which causes memory leaks. Plug these.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:52 -08:00
a6590ccdd4 grep: fix leak in grep_splice_or()
In `grep_splice_or()` we search for the next `TRUE` node in our tree of
grep expressions and replace it with the given new expression. But we
don't free the old node, which causes a memory leak. Plug it.

This leak is exposed by t7810, but plugging it alone isn't sufficient to
make the test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:52 -08:00
ee3e8c3afa t/helper: fix leaks in "reach" test tool
The "reach" test tool doesn't bother to clean up any of its allocated
resources, causing various leaks. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:51 -08:00
5f5dd8e297 builtin/ls-remote: plug leaking server options
The list of server options populated via `OPT_STRING_LIST()` is never
cleared, causing a memory leak. Plug it.

This leak is exposed by t5702, but plugging it alone does not make the
whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 22:37:51 -08:00
5a875ff7fb upload-pack: fix ambiguous error message
upload-pack.c takes any --shallow-exclude argument(s) from
clone/fetch/etc. and passes them through expand_ref().  If it does not
get back exactly one ref from the call to expand_ref(), it will die with
the following error:

    fatal: git upload-pack: ambiguous deepen-not: %s

Given that the documentation suggests to users that --shallow-exclude
accepts a revision rather than a ref (which will be corrected in a
subsequent commit), users may try to pass a revision.  In such a case,
expand_ref() will return 0 matches, but the error message we print will
be misleading since "ambiguous" suggests there are multiple matches.
Provide a clearer error message for such a case.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04 17:20:21 -08:00
a1fb77fcb8 t1016: clean up style
Adhere to Documentation/CodingGuidelines:
  - Whitespace and redirect operator.
  - Case arms indentation.
  - Tabs for indentation.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-03 15:30:04 -08:00
d9e24ce2ca t5300: move --window clamp test next to unclamped
A subsequent commit will change the behavior of "git index-pack
--promisor", which is exercised in "build pack index for an existing
pack", causing the unclamped and clamped versions of the --window
test to exhibit different behavior. Move the clamp test closer to the
unclamped test that it references.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02 04:08:21 -07:00
78995ff57c t0410: use from-scratch server
A subsequent commit will add functionality: when fetching from a
promisor remote, existing non-promisor objects that are ancestors of any
fetched object will be repacked into promisor packs (since if a promisor
remote has an object, it also has all its ancestors).

This means that sometimes, a fetch from a promisor remote results in 2
new promisor packs (instead of the 1 that you would expect). There is a
test that fetches a descendant of a local object from a promisor remote,
but also specifically tests that there is exactly 1 promisor pack as
a result of the fetch. This means that this test will fail when the
subsequent commit is added.

Since the ancestry of the fetched object is not the concern of this
test, make the fetched objects have no ancestry in common with the
objets in the client repo. This is done by making the server from
scratch, instead of using an existing repo that has objects in common
with the client.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02 04:08:20 -07:00
da80429cef t0410: make test description clearer
Commit 9a4c507886 (t0410: test fetching from many promisor remotes,
2019-06-25) adds some tests that demonstrate not the automatic fetching
of missing objects, but the direct fetching from another promisor remote
(configured explicitly in one test and implicitly via --filter on the
"git fetch" CLI invocation in the other test) - thus demonstrating
support for multiple promisor remotes, as described in the commit
message.

Change the test descriptions accordingly to make this clearer.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02 04:08:20 -07:00
8f8d6eee53 The seventh batch 2024-11-01 12:59:31 -04:00
1c5a712f26 Merge branch 'jk/dumb-http-finalize'
The dumb-http code regressed when the result of re-indexing a pack
yielded an *.idx file that differs in content from the *.idx file it
downloaded from the remote. This has been corrected by no longer
relying on the *.idx file we got from the remote.

* jk/dumb-http-finalize:
  packfile: use oidread() instead of hashcpy() to fill object_id
  packfile: use object_id in find_pack_entry_one()
  packfile: convert find_sha1_pack() to use object_id
  http-walker: use object_id instead of bare hash
  packfile: warn people away from parse_packed_git()
  packfile: drop sha1_pack_index_name()
  packfile: drop sha1_pack_name()
  packfile: drop has_pack_index()
  dumb-http: store downloaded pack idx as tempfile
  t5550: count fetches in "previously-fetched .idx" test
  midx: avoid duplicate packed_git entries
2024-11-01 12:53:32 -04:00
6d81fe64dd Merge branch 'kh/update-ref'
Documentation updates to 'git-update-ref(1)'.

* kh/update-ref:
  Documentation: mutually link update-ref and symbolic-ref
  Documentation/git-update-ref.txt: discuss symbolic refs
  Documentation/git-update-ref.txt: remove confusing paragraph
  Documentation/git-update-ref.txt: demote symlink to last section
  Documentation/git-update-ref.txt: remove safety paragraphs
  Documentation/git-update-ref.txt: drop “flag”
2024-11-01 12:53:30 -04:00
aebc4bd8ce Merge branch 'ak/more-typofixes'
More typofixes.

* ak/more-typofixes:
  t: fix typos
2024-11-01 12:53:29 -04:00
43ac23945c Merge branch 'rs/grep-lookahead'
Fix 'git grep' regression on macOS by disabling lookahead when
encountering invalid UTF-8 byte sequences.

* rs/grep-lookahead:
  grep: disable lookahead on error
2024-11-01 12:53:28 -04:00
81a5461518 Merge branch 'ak/t1016-cleanup'
Test cleanup.

* ak/t1016-cleanup:
  t1016: clean up style
2024-11-01 12:53:27 -04:00
59dc0ab83c Merge branch 'ua/atoi'
Replace various calls to atoi() with strtol_i() and strtoul_ui(), and
add improved error handling.

* ua/atoi:
  imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing
  merge: replace atoi() with strtol_i() for marker size validation
  daemon: replace atoi() with strtoul_ui() and strtol_i()
2024-11-01 12:53:26 -04:00
20ab7fa3b6 Merge branch 'sa/notes-edit'
Teach 'git notes add' and 'git notes append' a new '-e' flag,
instructing them to open the note in $GIT_EDITOR before saving.

* sa/notes-edit:
  notes: teach the -e option to edit messages in editor
2024-11-01 12:53:25 -04:00
8237b49ade Merge branch 'sk/t9101-cleanup'
Test cleanup.

* sk/t9101-cleanup:
  t9101: ensure no whitespace after redirect
2024-11-01 12:53:24 -04:00
7b1f01f02e Merge branch 'ss/duplicate-typos'
Typofixes.

* ss/duplicate-typos:
  global: Fix duplicate word typos
2024-11-01 12:53:23 -04:00
aabbcf2783 Merge branch 'ps/upload-pack-doc'
Documentation update to clarify that 'uploadpack.allowAnySHA1InWant'
implies both 'allowTipSHA1InWant' and 'allowReachableSHA1InWant'.

* ps/upload-pack-doc:
  doc: document how uploadpack.allowAnySHA1InWant impact other allow options
2024-11-01 12:53:22 -04:00
07c6066f82 Merge branch 'kh/mv-breakage'
Demonstrate an assertion failure in 'git mv'.

* kh/mv-breakage:
  t7001: add failure test which triggers assertion
2024-11-01 12:53:21 -04:00
787297b396 Merge branch 'rj/cygwin-exit'
Treat ECONNABORTED the same as ECONNRESET in 'git credential-cache' to
work around a possible Cygwin regression. This resolves a race condition
caused by changes in Cygwin's handling of socket closures, allowing the
client to exit cleanly when encountering ECONNABORTED.

* rj/cygwin-exit:
  credential-cache: treat ECONNABORTED like ECONNRESET
2024-11-01 12:53:19 -04:00
a524cc77ad Merge branch 'ua/t3404-cleanup'
Test update.

* ua/t3404-cleanup:
  t3404: replace test with test_line_count()
  t3404: avoid losing exit status with focus on `git show` and `git cat-file`
2024-11-01 12:53:18 -04:00
268fd2fe58 Merge branch 'ps/platform-compat-fixes'
Various platform compatibility fixes split out of the larger effort
to use Meson as the primary build tool.

* ps/platform-compat-fixes:
  t6006: fix prereq handling with `test_format ()`
  http: fix build error on FreeBSD
  builtin/credential-cache: fix missing parameter for stub function
  t7300: work around platform-specific behaviour with long paths on MinGW
  t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin
  t3404: work around platform-specific behaviour on macOS 10.15
  t1401: make invocation of tar(1) work with Win32-provided one
  t/lib-gpg: fix setup of GNUPGHOME in MinGW
  t/lib-gitweb: test against the build version of gitweb
  t/test-lib: wire up NO_ICONV prerequisite
  t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE
2024-11-01 12:53:17 -04:00
47c3170a3e Merge branch 'jc/breaking-changes-early-adopter-option'
Describe the policy to introduce breaking changes.

* jc/breaking-changes-early-adopter-option:
  BreakingChanges: early adopter option
2024-11-01 12:53:14 -04:00
16a186fede rev-list: skip bitmap traversal for --left-right
Running:

  git rev-list --left-right --use-bitmap-index one...two

will produce output without any left-right markers, since the bitmap
traversal returns only a single set of reachable commits. Instead we
should refuse to use bitmaps here and produce the correct output using a
traditional traversal.

This is probably not the only remaining option that misbehaves with
bitmaps, but it's particularly egregious in that it feels like it
_could_ work. Doing two separate traversals for the left/right sides and
then taking the symmetric set differences should yield the correct
answer, but our traversal code doesn't know how to do that.

It's not clear if naively doing two separate traversals would always be
a performance win. A traditional traversal only needs to walk down to
the merge base, but bitmaps always fill out the full reachability set.
So depending on your bitmap coverage, we could end up walking old bits
of history twice to fill out the same uninteresting bits on both sides.
We'd also of course end up with a very large --boundary set, if the user
asked for that.

So this might or might not be something worth implementing later. But
for now, let's make sure we don't produce the wrong answer if somebody
tries it.

The test covers this, but also the same thing with "--count" (which is
what I originally tried in a real-world case). Ironically the
try_bitmap_count() code already realizes that "--left-right" won't work
there. But that just causes us to fall back to the regular bitmap
traversal code, which itself doesn't handle counting (we produce a list
of objects rather than a count).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-11-01 11:02:27 -04:00
ac112fd4f0 Add additional CI jobs to avoid accidental breakage
In general, we'd like to make sure Git works on the LTS versions of
major Linux distributions.  To do that, let's add CI jobs for the oldest
regular (non-extended) LTS versions of the major distributions: Ubuntu
20.04, Debian 11, and RHEL 8.  Because RHEL isn't available to the
public at no charge, use AlmaLinux, which is binary compatible with it.

Note that Debian does not offer the language-pack packages, but suitable
locale support can be installed with the locales-all package.
Otherwise, use the set of installation instructions which exist and are
most similar to the existing supported distros.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-11-01 10:54:18 -04:00
ad797eace4 ci: remove clause for Ubuntu 16.04
We're no longer testing this version and it's well beyond regular LTS
support now, so remove the stanza for it from the case statement in our
CI code.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-11-01 10:54:18 -04:00
c85bcb5de1 gitlab-ci: switch from Ubuntu 16.04 to 20.04
Ubuntu 16.04 is past its normal LTS lifespan, so let's switch to Ubuntu
20.04 instead, which is the latest regular LTS version.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-11-01 10:54:18 -04:00
23d289d273 The sixth batch 2024-10-30 13:36:44 -04:00
6aa984af3b Merge branch 'sk/t7011-cleanup'
Test cleanup.

* sk/t7011-cleanup:
  t7011: ensure no whitespace after redirect
2024-10-30 13:08:07 -04:00
8d305fdaac Merge branch 'co/t6050-pipefix'
Avoid losing exit status by having Git command being tested on the
upstream side of a pipe.

* co/t6050-pipefix:
  t6050: avoid pipes with upstream Git commands
2024-10-30 13:08:06 -04:00
cec4461e3f Merge branch 'ks/t4205-fixup'
Testfix.

* ks/t4205-fixup:
  t4205: fix typo in 'NUL termination with --stat'
2024-10-30 13:08:05 -04:00
9947803926 Merge branch 'kh/submitting-patches'
Docfix.

* kh/submitting-patches:
  SubmittingPatches: tags -> trailers
2024-10-30 13:08:04 -04:00
6f763d798b Merge branch 'ps/ref-filter-sort'
Teaches the ref-filter machinery to recognize and avoid cases where
sorting would be redundant.

* ps/ref-filter-sort:
  ref-filter: format iteratively with lexicographic refname sorting
2024-10-30 13:08:02 -04:00
bc627658b0 Merge branch 'ps/reftable-strbuf'
Implements a new reftable-specific strbuf replacement to reduce
reftable's dependency on Git-specific data structures.

* ps/reftable-strbuf:
  reftable: handle trivial `reftable_buf` errors
  reftable/stack: adapt `stack_filename()` to handle allocation failures
  reftable/record: adapt `reftable_record_key()` to handle allocation failures
  reftable/stack: adapt `format_name()` to handle allocation failures
  t/unit-tests: check for `reftable_buf` allocation errors
  reftable/blocksource: adapt interface name
  reftable: convert from `strbuf` to `reftable_buf`
  reftable/basics: provide new `reftable_buf` interface
  reftable: stop using `strbuf_addf()`
  reftable: stop using `strbuf_addbuf()`
2024-10-30 13:08:01 -04:00
062d9fb033 Merge branch 'backport-github-actions-fixes'
The planet keeps revolving, and CI definitions (even old ones) need to
be kept up to date, even if they worked unchanged before (because now
they don't).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-10-30 17:38:38 +01:00
83b08eb19f t7300: work around platform-specific behaviour with long paths on MinGW
Windows by default has a restriction in place to only allow paths up to
260 characters. This restriction can nowadays be lifted by setting a
registry key, but is still active by default.

In t7300 we have one test that exercises the behaviour of git-clean(1)
with such long paths. Interestingly enough, this test fails on my system
that uses Windows 10 with mingw-w64 installed via MSYS2: instead of
observing ENAMETOOLONG, we observe ENOENT. This behaviour is consistent
across multiple different environments I have tried.

I cannot say why exactly we observe a different error here, but I would
not be surprised if this was either dependent on the Windows version,
the version of MinGW, the current working directory of Git or any kind
of combination of these.

Work around the issue by handling both errors.

[Backported from 106834e34a (t7300: work around platform-specific
behaviour with long paths on MinGW, 2024-10-09).]

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-10-30 17:38:35 +01:00
7e6073d270 compat/regex: fix argument order to calloc(3)
Windows compiler suddenly started complaining that calloc(3) takes
its arguments in <nmemb, size> order.  Indeed, there are many calls
that has their arguments in a _wrong_ order.

Fix them all.

A sample breakage can be seen at

  https://github.com/git/git/actions/runs/9046793153/job/24857988702#step:4:272

[Backported from f01301aabe (compat/regex: fix argument order to
calloc(3), 2024-05-11).]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-10-30 15:27:18 +01:00
5d828879f3 mingw: drop bogus (and unneeded) declaration of _pgmptr
In 08809c09aa (mingw: add a helper function to attach GDB to the
current process, 2020-02-13), I added a declaration that was not needed.
Back then, that did not matter, but now that the declaration of that
symbol was changed in mingw-w64's headers, it causes the following
compile error:

      CC compat/mingw.o
  compat/mingw.c: In function 'open_in_gdb':
  compat/mingw.c:35:9: error: function declaration isn't a prototype [-Werror=strict-prototypes]
     35 |         extern char *_pgmptr;
        |         ^~~~~~
  In file included from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/mm_malloc.h:27,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/xmmintrin.h:34,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/immintrin.h:31,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/x86intrin.h:32,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/winnt.h:1658,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/minwindef.h:163,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/windef.h:9,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/windows.h:69,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/winsock2.h:23,
                   from compat/../git-compat-util.h:215,
                   from compat/mingw.c:1:
  compat/mingw.c:35:22: error: '__p__pgmptr' redeclared without dllimport attribute: previous dllimport ignored [-Werror=attributes]
     35 |         extern char *_pgmptr;
        |                      ^~~~~~~

Let's just drop the declaration and get rid of this compile error.

[Backported from 3c295c87c2 (mingw: drop bogus (and unneeded)
declaration of `_pgmptr`, 2024-06-19).]

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-10-30 15:27:18 +01:00
0d606d8c2a ci: remove 'Upload failed tests' directories' step from linux32 jobs
Linux32 jobs seem to be getting:

    Error: This request has been automatically failed because it uses a
    deprecated version of `actions/upload-artifact: v1`. Learn more:
    https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/

before doing anything useful.  For now, disable the step.

Ever since actions/upload-artifact@v1 got disabled, mentioning the
offending version of it seems to stop anything from happening.  At
least this should run the same build and test.

See

    https://github.com/git/git/actions/runs/10780030750/job/29894867249

for example.

[Backported from 90f2c7240c (ci: remove 'Upload failed tests'
directories' step from linux32 jobs, 2024-09-09).]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-10-30 15:02:35 +01:00
dd6003f200 t6006: fix prereq handling with test_format ()
In df383b5842 (t/test-lib: wire up NO_ICONV prerequisite, 2024-10-16) we
have introduced a new NO_ICONV prerequisite that makes us skip tests in
case Git is not compiled with support for iconv. This change subtly
broke t6006: while the test suite still passes, some of its tests won't
execute because they run into an error.

    ./t6006-rev-list-format.sh: line 92: test_expect_%e: command not found

The broken tests use `test_format ()`, and the mentioned commit simply
prepended the new prerequisite to its arguments. But that does not work,
as the function is not aware of prereqs at all and will now treat all of
its arguments incorrectly.

Fix this by making the function aware of prereqs by accepting an
optional fourth argument. Adapt the callsites accordingly.

Reported-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-28 13:44:38 -04:00
a270cb1540 compat/mingw: allow deletion of most opened files
On Windows, we emulate open(3p) via `mingw_open()`. This function
implements handling of some platform-specific quirks that are required
to make it behave as closely as possible like open(3p) would, but for
most cases we just call the Windows-specific `_wopen()` function.

This function has a major downside though: it does not allow us to
specify the sharing mode. While there is `_wsopen()` that allows us to
pass sharing flags, those sharing flags are not the same `FILE_SHARE_*`
flags as `CreateFileW()` accepts. Instead, `_wsopen()` only allows
concurrent read- and write-access, but does not allow for concurrent
deletions. Unfortunately though, we have to allow concurrent deletions
if we want to have POSIX-style atomic renames on top of an existing file
that has open file handles.

Implement a new function that emulates open(3p) for existing files via
`CreateFileW()` such that we can set the required sharing flags.

While we have the same issue when calling open(3p) with `O_CREAT`,
implementing that mode would be more complex due to the required
permission handling. Furthermore, atomic updates via renames typically
write to exclusive lockfile and then perform the rename, and thus we
don't have to handle the case where the locked path has been created
with `O_CREATE`. So while it would be nice to have proper POSIX
semantics in all paths, we instead aim for a minimum viable fix here.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-27 19:51:35 -04:00
b0b65ec593 compat/mingw: share file handles created via CreateFileW()
Unless told otherwise, Windows will keep other processes from reading,
writing and deleting files when one has an open handle that was created
via `CreateFileW()`. This behaviour can be altered via `FILE_SHARE_*`
flags:

  - `FILE_SHARE_READ` allows a concurrent process to open the file for
    reading.

  - `FILE_SHARE_WRITE` allows a concurrent process to open the file for
    writing.

  - `FILE_SHARE_DELETE` allows a concurrent process to delete the file
    or to replace it via an atomic rename.

This sharing mechanism is quite important in the context of Git, as we
assume POSIX semantics all over the place. But there are two callsites
where we don't pass all three of these flags:

  - We don't set `FILE_SHARE_DELETE` when creating a file for appending
    via `mingw_open_append()`. This makes it impossible to delete the
    file from another process or to replace it via an atomic rename. The
    function was introduced via d641097589 (mingw: enable atomic
    O_APPEND, 2018-08-13) and has been using `FILE_SHARE_READ |
    FILE_SHARE_WRITE` since the inception. There aren't any indicators
    that the omission of `FILE_SHARE_DELETE` was intentional.

  - We don't set any sharing flags in `mingw_utime()`, which changes the
    access and modification of a file. This makes it impossible to
    perform any kind of operation on this file at all from another
    process. While we only open the file for a short amount of time to
    update its timestamps, this still opens us up for a race condition
    with another process.

    `mingw_utime()` was originally implemented via `_wopen()`, which
    doesn't give you full control over the sharing mode. Instead, it
    calls `_wsopen()` with `_SH_DENYNO`, which ultimately translates to
    `FILE_SHARE_READ | FILE_SHARE_WRITE`. It was then refactored via
    090a3085bc (t/helper/test-chmtime: update mingw to support chmtime
    on directories, 2022-03-02) to use `CreateFileW()`, but we stopped
    setting any sharing flags at all, which seems like an unintentional
    side effect. By restoring `FILE_SHARE_READ | FILE_SHARE_WRITE` we
    thus fix this and get back the old behaviour of `_wopen()`.

    The fact that we didn't set the equivalent of `FILE_SHARE_DELETE`
    can be explained, as well: neither `_wopen()` nor `_wsopen()` allow
    you to do so. So overall, it doesn't seem intentional that we didn't
    allow deletions here, either.

Adapt both of these callsites to pass all three sharing flags.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-27 19:51:35 -04:00
863f2459a2 packfile: use oidread() instead of hashcpy() to fill object_id
When chasing a REF_DELTA, we need to pull the raw hash bytes out of the
mmap'd packfile into an object_id struct. We do that with a raw
hashcpy() of the appropriate length (that happens directly now, though
before the previous commit it happened inside find_pack_entry_one(),
also using a hashcpy).

But I think this creates a potentially dangerous situation due to
d4d364b2c7 (hash: convert `oidcmp()` and `oideq()` to compare whole
hash, 2024-06-14). When using sha1, we'll have uninitialized bytes in
the latter part of the object_id.hash buffer, which could fool oideq(),
etc.

We should use oidread() instead, which correctly zero-pads the extra
bytes, as of c98d762ed9 (global: ensure that object IDs are always
padded, 2024-06-14).

As far as I can see, this has not been a problem in practice because the
object_id we feed to find_pack_entry_one() is never used with oideq(),
etc. It is being compared to the bytes mmap'd from a pack idx file,
which of course do not have the extra padding bytes themselves. So
there's no bug here, but this just puzzled me while looking at the code.
We should do the more obviously safe thing, both for future-proofing and
to avoid confusing readers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
479ab76c9f packfile: use object_id in find_pack_entry_one()
The main function we use to search a pack index for an object is
find_pack_entry_one(). That function still takes a bare pointer to the
hash, despite the fact that its underlying bsearch_pack() function needs
an object_id struct. And so we end up making an extra copy of the hash
into the struct just to do a lookup.

As it turns out, all callers but one already have such an object_id. So
we can just take a pointer to that struct and use it directly. This
avoids the extra copy and provides a more type-safe interface.

The one exception is get_delta_base() in packfile.c, when we are chasing
a REF_DELTA from inside the pack (and thus we have a pointer directly to
the mmap'd pack memory, not a struct). We can just bump the hashcpy()
from inside find_pack_entry_one() to this one caller that needs it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
4d99559147 packfile: convert find_sha1_pack() to use object_id
The find_sha1_pack() function has a few problems:

  - it's badly named, since it works with any object hash

  - it takes the hash as a bare pointer rather than an object_id struct

We can fix both of these easily, as all callers actually have a real
object_id anyway.

I also found the existence of this function somewhat confusing, as it is
about looking in an arbitrary set of linked packed_git structs. It's
good for things like dumb-http which are looking in downloaded remote
packs, and not our local packs. But despite the name, it is not a good
way to find the pack which contains a local object (it skips the use of
the midx, the pack mru list, and so on).

So let's also add an explanatory comment above the declaration that may
point people in the right direction.

I suspect the calls in fast-import.c, which use the packed_git list from
the repository struct, could actually just be using find_pack_entry().
But since we'd need to keep it anyway for dumb-http, I didn't dig
further there. If we eventually drop dumb-http support, then it might be
worth examining them to see if we can get rid of the function entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
0af861e0c8 http-walker: use object_id instead of bare hash
We long ago switched most code to using object_id structs instead of
bare "unsigned char *" hashes. This gives us more type safety from the
compiler, and generally makes it easier to understand what we expect in
each parameter.

But the dumb-http code has lagged behind. And indeed, the whole "walker"
subsystem interface has the same problem, though http-walker is the only
user left.

So let's update the walker interface to pass object_id structs (which we
already have anyway at all call sites!), and likewise use those within
the http-walker methods that it calls.

This cleans up the dumb-http code a bit, but will also let us fix a few
more commonly used helper functions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
6b2fc22050 packfile: warn people away from parse_packed_git()
With a name like parse_packed_git(), you might think it's the right way
to access a local pack index and its associated objects. But not so!
It's a one-off used by the dumb-http code to access pack idx files we've
downloaded from the remote, but whose packs we might not have.

There's only one caller left for this function, and ideally we'd drop it
completely and just inline it there. But that would require exposing
other internals from packfile.[ch], like alloc_packed_git() and
check_packed_git_idx().

So let's leave it be for now, and just warn people that it's probably
not what they're looking for. Perhaps in the long run if we eventually
drop dumb-http support, we can remove the function entirely then.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
4390fea963 packfile: drop sha1_pack_index_name()
Like sha1_pack_name() that we dropped in the previous commit, this
function uses an error-prone static strbuf and the somewhat misleading
name "sha1".

The only caller left is in pack-redundant.c. While this command is
marked for potential removal in our BreakingChanges document, we still
have it for now. But it's simple enough to convert it to use its own
strbuf with the underlying odb_pack_name() function, letting us drop the
otherwise obsolete function.

Note that odb_pack_name() does its own strbuf_reset(), so it's safe to
use directly within a loop like this.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
c2dc4c9fbb packfile: drop sha1_pack_name()
The sha1_pack_name() function has a few ugly bits:

  - it writes into a static strbuf (and not even a ring buffer of them),
    which can lead to subtle invalidation problems

  - it uses the term "sha1", but it's really using the_hash_algo, which
    could be sha256

There's only one caller of it left. And in fact that caller is better
off using the underlying odb_pack_name() function itself, since it's
just copying the result into its own strbuf anyway.

Converting that caller lets us get rid of this now-obselete function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
03b8eed7f5 packfile: drop has_pack_index()
The has_pack_index() function has several oddities that may make it
surprising if you are trying to find out if we have a pack with some
$hash:

  - it is not looking for a valid pack that we found while searching
    object directories. It just looks for any pack-$hash.idx file in the
    pack directory.

  - it only looks in the local directory, not any alternates

  - it takes a bare "unsigned char" hash, which we try to avoid these
    days

The only caller it has is in the dumb http code; it wants to know if we
already have the pack idx in question. This can happen if we downloaded
the pack (and generated its index) during a previous fetch.

Before the previous patch ("dumb-http: store downloaded pack idx as
tempfile"), it could also happen if we downloaded the .idx from the
remote but didn't get the matching .pack. But since that patch, we don't
hold on to those .idx files. So there's no need to look for the .idx
file in the filesystem; we can just scan through the packed_git list to
see if we have it.

That lets us simplify the dumb http code a bit, as we know that if we
have the .idx we have the matching .pack already. And it lets us get rid
of this odd function that is unlikely to be needed again.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
63aca3f7f1 dumb-http: store downloaded pack idx as tempfile
This patch fixes a regression in b1b8dfde69 (finalize_object_file():
implement collision check, 2024-09-26) where fetching a v1 pack idx file
over the dumb-http protocol would cause the fetch to fail.

The core of the issue is that dumb-http stores the idx we fetch from the
remote at the same path that will eventually hold the idx we generate
from "index-pack --stdin". The sequence is something like this:

  0. We realize we need some object X, which we don't have locally, and
     nor does the other side have it as a loose object.

  1. We download the list of remote packs from objects/info/packs.

  2. For each entry in that file, we download each pack index and store
     it locally in .git/objects/pack/pack-$hash.idx (the $hash is not
     something we can verify yet and is given to us by the remote).

  3. We check each pack index we got to see if it has object X. When we
     find a match, we download the matching .pack file from the remote
     to a tempfile. We feed that to "index-pack --stdin", which
     reindexes the pack, rather than trusting that it has what the other
     side claims it does. In most cases, this will end up generating the
     exact same (byte-for-byte) pack index which we'll store at the same
     pack-$hash.idx path, because the index generation and $hash id are
     computed based on what's in the packfile. But:

       a. The other side might have used other options to generate the
          index. For instance we use index v2 by default, but long ago
          it was v1 (and you can still ask for v1 explicitly).

       b. The other side might even use a different mechanism to
          determine $hash. E.g., long ago it was based on the sorted
          list of objects in the packfile, but we switched to using the
          pack checksum in 1190a1acf8 (pack-objects: name pack files
          after trailer hash, 2013-12-05).

The regression we saw in the real world was (3a). A recent client
fetching from a server with a v1 index downloaded that index, then
complained about trying to overwrite it with its own v2 index. This
collision is otherwise harmless; we know we want to replace the remote
version with our local one, but the collision check doesn't realize
that.

There are a few options to fix it:

  - we could teach index-pack a command-line option to ignore only pack
    idx collisions, and use it when the dumb-http code invokes
    index-pack. This would be an awkward thing to expose users to and
    would involve a lot of boilerplate to get the option down to the
    collision code.

  - we could delete the remote .idx file right before running
    index-pack. It should be redundant at that point (since we've just
    downloaded the matching pack). But it feels risky to delete
    something from our own .git/objects based on what the other side has
    said. I'm not entirely positive that a malicious server couldn't lie
    about which pack-$hash.idx it has and get us to delete something
    precious.

  - we can stop co-mingling the downloaded idx files in our local
    objects directory. This is a slightly bigger change but I think
    fixes the root of the problem more directly.

This patch implements the third option. The big design questions are:
where do we store the downloaded files, and how do we manage their
lifetimes?

There are some additional quirks to the dumb-http system we should
consider. Remember that in step 2 we downloaded every pack index, but in
step 3 we may only download some of the matching packs. What happens to
those other idx files now? They sit in the .git/objects/pack directory,
possibly waiting to be used at a later date. That may save bandwidth for
a subsequent fetch, but it also creates a lot of weird corner cases:

  - our local object directory now has semi-untrusted .idx files sitting
    around, without their matching .pack

  - in case 3b, we noted that we might not generate the same hash as the
    other side. In that case even if we download the matching pack,
    our index-pack invocation will store it in a different
    pack-$hash.idx file. And the unmatched .idx will sit there forever.

  - if the server repacks, it may delete the old packs. Now we have
    these orphaned .idx files sitting around locally that will never be
    used (nor deleted).

  - if we repack locally we may delete our local version of the server's
    pack index and not realize we have it. So we'll download it again,
    even though we have all of the objects it mentions.

I think the right solution here is probably some more complex cache
management system: download the remote .idx files to their own storage
directory, mark them as "seen" when we get their matching pack (to avoid
re-downloading even if we repack), and then delete them when the
server's objects/info/refs no longer mentions them.

But since the dumb http protocol is so ancient and so inferior to the
smart http protocol, I don't think it's worth spending a lot of time
creating such a system. For this patch I'm just downloading the idx
files to .git/objects/tmp_pack_*, and marking them as tempfiles to be
deleted when we exit (and due to the name, any we miss due to a crash,
etc, should eventually be removed by "git gc" runs based on timestamps).

That is slightly worse for one case: if we download an idx but not the
matching pack, we won't retain that idx for subsequent runs. But the
flip side is that we're making other cases better (we never hold on to
useless idx files forever). I suspect that worse case does not even come
up often, since it implies that the packs are generated to match
distinct parts of history (i.e., in practice even in a repo with many
packs you're going to end up grabbing all of those packs to do a clone).
If somebody really cares about that, I think the right path forward is a
managed cache directory as above, and this patch is providing the first
step in that direction anyway (by moving things out of the objects/pack/
directory).

There are two test changes. One demonstrates the broken v1 index case
(it double-checks the resulting clone with fsck to be careful, but prior
to this patch it actually fails at the clone step). The other tweaks the
expectation for a test that covers the "slightly worse" case to
accommodate the extra index download.

The code changes are fairly simple. We stop using finalize_object_file()
to copy the remote's index file into place, and leave it as a tempfile.
We give the tempfile a real ".idx" name, since the packfile code expects
that, and thus we make sure it is out of the usual packs/ directory (so
we'd never mistake it for a real local .idx).

We also have to change parse_pack_index(), which creates a temporary
packed_git to access our index (we need this because all of the pack idx
code assumes we have that struct). It reads the index data from the
tempfile, but prior to this patch would speculatively write the
finalized name into the packed_git struct using the pack-$hash we expect
to use.

I was mildly surprised that this worked at all, since we call
verify_pack_index() on the packed_git which mentions the final name
before moving the file into place! But it works because
parse_pack_index() leaves the mmap-ed data in the struct, so the
lazy-open in verify_pack_index() never triggers, and we read from the
tempfile, ignoring the filename in the struct completely. Hacky, but it
works.

After this patch, parse_pack_index() now uses the index filename we pass
in to derive a matching .pack name. This is OK to change because there
are only two callers, both in the dumb http code (and the other passes
in an existing pack-$hash.idx name, so the derived name is going to be
pack-$hash.pack, which is what we were using anyway).

I'll follow up with some more cleanups in that area, but this patch is
sufficient to fix the regression.

Reported-by: fox <fox.gbr@townlong-yak.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
019b21d402 t5550: count fetches in "previously-fetched .idx" test
We have a test in t5550 that looks at index fetching over dumb http. It
creates two branches, each of which is completely stored in its own
pack, then fetches the branches independently. What should (and does)
happen is that the first fetch grabs both .idx files and one .pack file,
and then the fetch of the second branch re-uses the previously
downloaded .idx files (fetching none) and grabs the now-required .pack
file.

Since the next few patches will be touching this area of the code, let's
beef up the test a little by checking that we're downloading the
expected items at each step.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
8b5763e8fa midx: avoid duplicate packed_git entries
When we scan a pack directory to load the idx entries we find into the
packed_git list, we skip any of them that are contained in a midx. We
then load them later lazily if we actually need to access the
corresponding pack, referencing them both from the midx struct and the
packed_git list.

The lazy-load in the midx code checks to see if the midx already
mentions the pack, but doesn't otherwise check the packed_git list. This
makes sense, since we should have added any pack to both lists.

But there's a loophole! If we call close_object_store(), that frees the
midx entirely, but _not_ the packed_git structs, which we must keep
around for Reasons[1]. If we then try to look up more objects, we'll
auto-load the midx again, which won't realize that we've already loaded
those packs, and will create duplicate entries in the packed_git list.

This is possibly inefficient, because it means we may open and map the
pack redundantly. But it can also lead to weird user-visible behavior.
The case I found is in "git repack", which closes and reopens the midx
after repacking and then calls update_server_info(). We end up writing
the duplicate entries into objects/info/packs.

We could obviously de-dup them while writing that file, but it seems
like a violation of more core assumptions that we end up with these
duplicate structs at all.

We can avoid the duplicates reasonably efficiently by checking their
names in the pack_map hash. This annoyingly does require a little more
than a straight hash lookup due to the naming conventions, but it should
only happen when we are about to actually open a pack. I don't think one
extra malloc will be noticeable there.

[1] I'm not entirely sure of all the details, except that we generally
    assume the packed_git structs never go away. We noted this
    restriction in the comment added by 6f1e9394e2 (object: fix leaking
    packfiles when closing object store, 2024-08-08), but it's somewhat
    vague. At any rate, if you try freeing the structs in
    close_object_store(), you can observe segfaults all over the test
    suite. So it might be fixable, but it's not trivial.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
6a11438f43 The fifth batch 2024-10-25 14:11:13 -04:00
55d12c24d7 Merge branch 'wm/shortlog-hash'
Teaches 'shortlog' to explicitly use SHA-1 when operating outside of
a repository.

* wm/shortlog-hash:
  builtin/shortlog: explicitly set hash algo when there is no repo
2024-10-25 14:02:49 -04:00
fcaac14abf Merge branch 'sk/msvc-warnings'
Fixes compile time warnings with 64-bit MSVC.

* sk/msvc-warnings:
  mingw.c: Fix complier warnings for a 64 bit msvc
2024-10-25 14:02:44 -04:00
0ab43ed95c Merge branch 'jc/a-commands-without-the-repo'
Commands that can also work outside Git have learned to take the
repository instance "repo" when we know we are in a repository, and
NULL when we are not, in a parameter.  The uses of the_repository
variable in a few of them have been removed using the new calling
convention.

* jc/a-commands-without-the-repo:
  archive: remove the_repository global variable
  annotate: remove usage of the_repository global
  git: pass in repo to builtin based on setup_git_directory_gently
2024-10-25 14:02:36 -04:00
dca32b8288 Merge branch 'pb/clar-build-fix'
Build fix.

* pb/clar-build-fix:
  Makefile: fix dependency for $(UNIT_TEST_DIR)/clar/clar.o
2024-10-25 14:02:25 -04:00
448022a7fb Merge branch 'bf/t-readme-mention-reftable'
Doc update.

* bf/t-readme-mention-reftable:
  t/README: add missing value for GIT_TEST_DEFAULT_REF_FORMAT
2024-10-25 14:02:21 -04:00
f25bb60393 Merge branch 'ak/typofix'
More typofixes.

* ak/typofix:
  t: fix typos
2024-10-25 14:02:08 -04:00
4d334e5205 Merge branch 'ak/typofixes'
Typofixes.

* ak/typofixes:
  t: fix typos
  t/helper: fix a typo
  t/perf: fix typos
  t/unit-tests: fix typos
  contrib: fix typos
  compat: fix typos
2024-10-25 14:02:04 -04:00
55bc7d54ab Merge branch 'ps/ci-gitlab-windows'
Enable Windows-based CI in GitLab.

* ps/ci-gitlab-windows:
  gitlab-ci: exercise Git on Windows
  gitlab-ci: introduce stages and dependencies
  ci: handle Windows-based CI jobs in GitLab CI
  ci: create script to set up Git for Windows SDK
  t7300: work around platform-specific behaviour with long paths on MinGW
2024-10-25 14:01:21 -04:00
6cbcc68ea7 Merge branch 'db/submodule-fetch-with-remote-name-fix'
A "git fetch" from the superproject going down to a submodule used
a wrong remote when the default remote names are set differently
between them.

* db/submodule-fetch-with-remote-name-fix:
  submodule: correct remote name with fetch
2024-10-25 14:01:09 -04:00
e226ba81a2 imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing
Replace unsafe uses of atoi() with strtol_i() to improve error handling
when parsing UIDVALIDITY, UIDNEXT, and APPENDUID in IMAP commands.
Invalid values, such as those with letters, now trigger error messages and
prevent malformed status responses.
I did not add any test for this commit as we do not have any test
for git-imap-send(1) at this point.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 14:03:44 -04:00
e36f009e69 merge: replace atoi() with strtol_i() for marker size validation
Replace atoi() with strtol_i() for parsing conflict-marker-size to
improve error handling. Invalid values, such as those containing letters
now trigger a clear error message.
Update the test to verify invalid input handling.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 14:03:44 -04:00
cc4023477f daemon: replace atoi() with strtoul_ui() and strtol_i()
Replace atoi() with strtoul_ui() for --timeout and --init-timeout
(non-negative integers) and with strtol_i() for --max-connections
(signed integers). This improves error handling and input validation
by detecting invalid values and providing clear error messages.
Update tests to ensure these arguments are properly validated.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 14:03:43 -04:00
be75cec1b6 CodingGuidelines: discourage arbitrary suffixes in function names
We often name functions with arbitrary suffixes like `_1` as an
extension of another existing function. This creates confusion and
doesn't provide good clarity into the functions purpose. Let's document
good function naming etiquette in our CodingGuidelines.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 12:51:30 -04:00
f56f9d6c0b t: fix typos
Fix typos and grammar in documentation, comments, etc.

Via codespell.

Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 12:45:53 -04:00
0fcd473fdd t7001: add failure test which triggers assertion
`git mv a/a.txt a b/` is a nonsense instruction.  Instead of failing
gracefully the command trips over itself,[1] leaving behind unfinished
work:

1. first it moves `a/a.txt` to `b/a.txt`; then
2. tries to move `a/`, including `a/a.txt`; then
3. figures out that it’s in a bad state (assertion); and finally
4. aborts.

Now you’re left with a partially-updated index.

The command should instead fail gracefully and make no changes to the
index until it knows that it can complete a sensible action.

For now just add a failing test since this has been known about for
a while.[2]

† 1: Caused by a `pos >= 0` assertion
[2]: https://lore.kernel.org/git/d1f739fe-b28e-451f-9e01-3d2e24a0fe0d@app.fastmail.com/

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:36:15 -04:00
88d21e3176 Merge branch 'ps/reftable-strbuf' into ps/reftable-detach
* ps/reftable-strbuf:
  reftable: handle trivial `reftable_buf` errors
  reftable/stack: adapt `stack_filename()` to handle allocation failures
  reftable/record: adapt `reftable_record_key()` to handle allocation failures
  reftable/stack: adapt `format_name()` to handle allocation failures
  t/unit-tests: check for `reftable_buf` allocation errors
  reftable/blocksource: adapt interface name
  reftable: convert from `strbuf` to `reftable_buf`
  reftable/basics: provide new `reftable_buf` interface
  reftable: stop using `strbuf_addf()`
  reftable: stop using `strbuf_addbuf()`
2024-10-23 16:21:11 -04:00
5f139a194f gitweb: make use of s///r
In Perl 5.14, released in May 2011, the r modifier was added to the s///
operator to allow it to return the modified string instead of modifying
the string in place. This allows to write nicer, more succinct code in
several cases, so let's do that here.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:36 -04:00
702d8c1f3b Require Perl 5.26.0
Our platform support policy states that we require "versions of
dependencies which are generally accepted as stable and supportable,
e.g., in line with the version used by other long-term-support
distributions".  Of Debian, Ubuntu, RHEL, and SLES, the four most common
distributions that provide LTS versions, the version with mainstream
long-term security support with the oldest Perl is 5.26.0 in SLES 15.6.

This is a major upgrade, since Perl 5.8.1, according to the Perl
documentation, was released in September of 2003.  It brings a lot of
new features that we can choose to use, such as s///r to return the
modified string, the postderef functionality, and subroutine signatures,
although the latter was still considered experimental until 5.36.

This change was made with the following one-liner, which intentionally
excludes modifying the vendored modules we include to avoid conflicts:

    git grep -l 'use 5.008001' | grep -v 'LoadCPAN/' | xargs perl -pi -e 's/use 5.008001/require v5.26/'

Use require instead of use to avoid changing the behavior as the latter
enables features and the former does not.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:36 -04:00
7bae4e7f58 INSTALL: document requirement for libcurl 7.61.0
Our platform support policy states that we require "versions of
dependencies which are generally accepted as stable and supportable,
e.g., in line with the version used by other long-term-support
distributions".  Of Debian, Ubuntu, and RHEL, the three most common
distributions that provide LTS versions, the version with mainstream
long-term security support with the oldest libcurl is 7.61.0 in RHEL 8.

Update the documentation to state that this is the new base version for
libcurl.  Remove text that is no longer applicable to older versions.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
603cf3e942 git-curl-compat: remove check for curl 7.56.0
libcurl 7.56.0 was released in September 2017, which is over seven years
ago, and no major operating system vendor is still providing security
support for it.  Debian 10, which is out of mainstream security support,
has supported a newer version, and Ubuntu 20.04 and RHEL 8, which are
still in support, also have a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
d2f078c341 git-curl-compat: remove check for curl 7.53.0
libcurl 7.53.0 was released in February 2017, which is over seven years
ago, and no major operating system vendor is still providing security
support for it.  Debian 10 and Ubuntu 18.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
17de6fd83b git-curl-compat: remove check for curl 7.52.0
libcurl 7.52.0 was released in August 2017, which is over seven years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 18.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
5c91da6d5b git-curl-compat: remove check for curl 7.44.0
libcurl 7.44.0 was released in August 2015, which is over nine years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 16.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
f47a1faa9b git-curl-compat: remove check for curl 7.43.0
libcurl 7.43.0 was released in June 2015, which is over nine years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 16.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
05dd4ec507 git-curl-compat: remove check for curl 7.39.0
libcurl 7.39.0 was released in November 2014, which is almost ten years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 16.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
6545b26eeb git-curl-compat: remove check for curl 7.34.0
libcurl 7.34.0 was released in December 2013, which is well over ten
years ago, and no major operating system vendor is still providing
security support for it.  Debian 8 and Ubuntu 14.04, both of which are
out of mainstream security support, have supported a newer version, and
RHEL 8, which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
f7c094060c git-curl-compat: remove check for curl 7.25.0
libcurl 7.25.0 was released in March 2012, which is well over ten years
ago, and no major operating system vendor is still providing security
support for it.  Debian 8, RHEL 7, and Ubuntu 12.10, all of which are
out of mainstream security support, have all supported a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
8bf7f9e1ff git-curl-compat: remove check for curl 7.21.5
libcurl 7.21.5 was released in April 2011, which is well over ten years
ago, and no major operating system vendor is still providing security
support for it.  Debian 7, RHEL 7, and Ubuntu 12.04, all of which are
out of mainstream security support, have all supported a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
09bf122507 t9101: ensure no whitespace after redirect
This change updates the script to conform to the coding
standards outlined in the Git project's documentation. According to the
guidelines in Documentation/CodingGuidelines under "Redirection
operators", there should be no whitespace after redirection operators.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 14:57:32 -04:00
91687cd13f t7011: ensure no whitespace after redirect
This change updates the script to conform to the coding standards
outlined in the Git project's documentation. According to the guidelines
in Documentation/CodingGuidelines under "Redirection operators", there
should be no whitespace after redirection operators.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 15:37:59 -04:00
fd3785337b The third batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 14:43:46 -04:00
8e08668322 Merge branch 'cw/worktree-relative'
An extra worktree attached to a repository points at each other to
allow finding the repository from the worktree and vice versa
possible.  Turn this linkage to relative paths.

* cw/worktree-relative:
  worktree: add test for path handling in linked worktrees
  worktree: link worktrees with relative paths
  worktree: refactor infer_backlink() to use *strbuf
  worktree: repair copied repository and linked worktrees
2024-10-22 14:40:39 -04:00
6ca9a05e63 Merge branch 'ps/cache-tree-w-broken-index-entry'
Fail gracefully instead of crashing when attempting to write the
contents of a corrupt in-core index as a tree object.

* ps/cache-tree-w-broken-index-entry:
  unpack-trees: detect mismatching number of cache-tree/index entries
  cache-tree: detect mismatching number of index entries
  cache-tree: refactor verification to return error codes
2024-10-22 14:40:38 -04:00
19f5ce0bc2 doc: consolidate extensions in git-config documentation
The `technical/repository-version.txt` document originally served as the
master list for extensions, requiring that any new extensions be defined
there. However, the `config/extensions.txt` file was introduced later
and has since become the de facto location for describing extensions,
with several extensions listed there but missing from
`repository-version.txt`.

This consolidates all extension definitions into `config/extensions.txt`,
making it the authoritative source for extensions. The references in
`repository-version.txt` are updated to point to `config/extensions.txt`,
and cross-references to related documentation such as
`gitrepository-layout[5]` and `git-config[1]` are added.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:49:32 -04:00
9e362dd060 t6050: avoid pipes with upstream Git commands
In pipes, the exit code of a chain of commands is determined by
the final command. In order not to miss the exit code of a failed
Git command, avoid pipes instead write output of Git commands
into a file.
For better debugging experience, instances of "grep" were changed
to "test_grep". "test_grep" provides more context in case of a
failed "grep".

Signed-off-by: Chizoba ODINAKA <chizobajames21@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:47:27 -04:00
ce025ae4f6 grep: disable lookahead on error
regexec(3) can fail.  E.g. on macOS it fails if it is used with an UTF-8
locale to match a valid regex against a buffer containing invalid UTF-8
characters.

git grep has two ways to search for matches in a file: Either it splits
its contents into lines and matches them separately, or it matches the
whole content and figures out line boundaries later.  The latter is done
by look_ahead() and it's quicker in the common case where most files
don't contain a match.

Fall back to line-by-line matching if look_ahead() encounters an
regexec(3) error by propagating errors out of patmatch() and bailing out
of look_ahead() if there is one.  This way we at least can find matches
in lines that contain only valid characters.  That matches the behavior
of grep(1) on macOS.

pcre2match() dies if pcre2_jit_match() or pcre2_match() fail, but since
we use the flag PCRE2_MATCH_INVALID_UTF it handles invalid UTF-8
characters gracefully.  So implement the fall-back only for regexec(3)
and leave the PCRE2 matching unchanged.

Reported-by: David Gstir <david@sigma-star.at>
Signed-off-by: René Scharfe <l.s.r@web.de>
Tested-by: David Gstir <david@sigma-star.at>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:45:49 -04:00
c348192afe t1016: clean up style
Use `test_config`.

Remove whitespace after redirect operator.

Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:35:05 -04:00
a73070fbd4 t4205: fix typo in 'NUL termination with --stat'
Correct "expected" to rightly terminate with NUL ie '\0' instead of '0'
which may have been typoed.

We didn't notice this before because the test is run with
"test_expect_failure", meaning the test would have been marked broken
anyways.

Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 17:37:11 -04:00
52acf6771b SubmittingPatches: tags -> trailers
“Trailer” is the preferred nomenclature in this project.  Also add a
definite article where I think it makes sense.

As we can see the rest of the document already prefers this term.  This
just gets rid of the last stragglers.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 17:23:00 -04:00
30bf9f0aaa cmake: set up proper dependencies for generated clar headers
The auto-generated headers used by clar are written at configure time
and thus do not get regenerated automatically. Refactor the build
recipes such that we use custom commands instead, which also has the
benefit that we can reuse the same infrastructure as our Makefile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
a4f8a59ddc cmake: fix compilation of clar-based unit tests
The compilation of clar-based unit tests is broken because we do not
add the binary directory into which we generate the "clar-decls.h" and
"clar.suite" files as include directories. Instead, we accidentally set
up the source directory as include directory.

Fix this by including the binary directory instead of the source
directory. Furthermore, set up the include directories as PUBLIC instead
of PRIVATE such that they propagate from "unit-tests.lib" to the
"unit-tests" executable, which needs to include the same directory.

Reported-by: Ed Reel <edreel@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
67f75dfe1b Makefile: extract script to generate clar declarations
Extract the script to generate function declarations for the clar unit
testing framework into a standalone script. This is done such that we
can reuse it in other build systems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
a779c8e8d5 Makefile: adjust sed command for generating "clar-decls.h"
This moves the end-of-line marker out of the captured group, matching
the start-of-line marker and for some reason fixing generation of
"clar-decls.h" on some older, more esoteric platforms.

Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
7d5f18a901 t/unit-tests: update clar to 206accb
Update clar from:

    - 1516124 (Merge pull request #97 from pks-t/pks-whitespace-fixes, 2024-08-15).

To:

    - 206accb (Merge pull request #108 from pks-t/pks-uclibc-without-wchar, 2024-10-21)

This update includes a bunch of fixes and improvements that we have
discussed in Git when initial support for clar was merged:

  - There is a ".editorconfig" file now.

  - Compatibility with Windows has been improved so that the clar
    compiles on this platform without an issue. This has been tested
    with Cygwin, MinGW and Microsoft Visual Studio.

  - clar now uses CMake. This does not impact us at all as we wire up
    the clar into our own build infrastructure anyway. This conversion
    was done such that we can easily run CI jobs against Windows.

  - Allocation failures are now checked for consistently.

  - We now define feature test macros in "clar.c", which fixes
    compilation on some platforms that didn't previously pull in
    non-standard functions like lstat(3p) or strdup(3p). This was
    reported by a user of OpenSUSE Leap.

  - We stop using `struct timezone`, which is undefined behaviour
    nowadays and results in a compilation error on some platforms.

  - We now use the combination of mktemp(3) and mkdir(3) on SunOS, same
    as we do on NonStop.

  - We now support uClibc without support for <wchar.h>.

The most important bits here are the improved platform compatibility
with Windows, OpenSUSE, SunOS and uClibc.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
18b0b6c690 Documentation: mutually link update-ref and symbolic-ref
These two commands are similar enough to acknowledge each other on their
documentation pages.

See the previous commit where we discussed that option-less update-ref
does not support updating symbolic refs but symbolic-ref does.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
74522b6b12 Documentation/git-update-ref.txt: discuss symbolic refs
Add a paragraph which just emphasizes that the command without any
options does not support refs in the final arguments.  This is clear
already from the names `<new-oid>` and `<old-oid>` but the right balance
of redundancy makes documentation robust against stray interpretation.

This is also a good place to mention why `--stdin` has those `symref-*`
commands.

Suggested-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
793e308f1e Documentation/git-update-ref.txt: remove confusing paragraph
This paragraph interrupts the flow of the section by going into detail
about what a symbolic ref file is and how it is implemented.  It is not
clear what the purpose is since symbolic refs were already mentioned
prior (“possibly dereferencing the symbolic refs”).  Worse, it can
confuse the reader about what argument can be a symbolic ref since it
just says “it” and not which of the parameters; in turn the reader can
be lead to try `<new-oid>` and then get a confusing error since
update-ref will just say that it is not a valid SHA1.

gitglossary(7) already documents what a symref is, concretely, and quite
well at that.

Reported-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
dc6050f67e Documentation/git-update-ref.txt: demote symlink to last section
Move the discussion of file system symbolic links to a new “Notes”
section (inspired by the one in git-symbolic-ref(1)) since this is
mostly of historical note at this point, not something that is needed in
the main section of the documentation.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
744c282cd4 Documentation/git-update-ref.txt: remove safety paragraphs
Remove paragraphs which explain that using this command is safer than
echoing the branch name into `HEAD`.

Evoking the echo strategy is wrong now under the reftable backend since
this file does not exist.  And the ref file backend majority user base
use porcelain commands to manage `HEAD` unless they are intentionally
poking at the implementation.

Maybe this warning was relevant for the usage patterns when it was
added[1] but now it just takes up space.

† 1: 129056370a (Add missing documentation., 2005-10-04)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
1b2dfb7050 Documentation/git-update-ref.txt: drop “flag”
The other paragraphs on options say “With <option>,”.  Let’s be uniform.

Also add missing word “that”.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
2e7c6d2f41 ref-filter: format iteratively with lexicographic refname sorting
In bd98f9774e (ref-filter.c: filter & format refs in the same callback,
2023-11-14), we have introduced logic into the ref-filter subsystem that
determines whether or not we can output references iteratively instead
of first collecting all references, post-processing them and printing
them once done. This has the advantage that we don't have to store all
refs in memory and, when used with e.g. `--count=1`, that we don't have
to read all refs in the first place.

One restriction we have in place for that is that caller must not ask
for sorted refs, because there is no way to sort the refs without first
reading them all into an array. So the benefits can only be reaped when
explicitly asking for output not to be sorted.

But there is one exception here where we _can_ get away with sorting
refs while streaming: ref backends sort references returned by their
iterators in lexicographic order. So if the following conditions are all
true we can do iterative streaming:

  - There must be at most a single sorting specification, as otherwise
    we're not using plain lexicographic ordering.

  - The sorting specification must use the "refname".

  - The sorting specification must not be using any flags, like
    case-insensitive sorting.

Now the resulting logic does feel quite fragile overall, which makes me
a bit uneasy. But after thinking about this for a while I couldn't find
any obvious gaps in my reasoning. Furthermore, given that lexicographic
sorting order is the default in git-for-each-ref(1), this is likely to
benefit a whole lot of usecases out there.

The following benchmark executes git-for-each-ref(1) in a crafted repo
with 1 million references:

  Benchmark 1: git for-each-ref (revision = HEAD~)
    Time (mean ± σ):      6.756 s ±  0.014 s    [User: 3.004 s, System: 3.541 s]
    Range (min … max):    6.738 s …  6.784 s    10 runs

  Benchmark 2: git for-each-ref (revision = HEAD)
    Time (mean ± σ):      6.479 s ±  0.017 s    [User: 2.858 s, System: 3.422 s]
    Range (min … max):    6.450 s …  6.519 s    10 runs

  Summary
    git for-each-ref (revision = HEAD)
      1.04 ± 0.00 times faster than git for-each-ref (revision = HEAD~)

The change results in a slight performance improvement, but nothing that
would really stand out. Something that cannot be seen in the benchmark
though is peak memory usage, which went from 404.5MB to 68.96kB.

A more interesting benchmark is printing a single referenence with
`--count=1`:

  Benchmark 1: git for-each-ref --count=1 (revision = HEAD~)
    Time (mean ± σ):      6.655 s ±  0.018 s    [User: 2.865 s, System: 3.576 s]
    Range (min … max):    6.630 s …  6.680 s    10 runs

  Benchmark 2: git for-each-ref --count=1 (revision = HEAD)
    Time (mean ± σ):       8.6 ms ±   1.3 ms    [User: 2.3 ms, System: 6.1 ms]
    Range (min … max):     6.7 ms …  14.4 ms    266 runs

  Summary
    git git for-each-ref --count=1 (revision = HEAD)
    770.58 ± 116.19 times faster than git for-each-ref --count=1 (revision = HEAD~)

Whereas we scaled with the number of references before, we now print the
first reference and exit immediately, which provides a massive win.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:46:03 -04:00
c32d4a8cfe global: Fix duplicate word typos
Used regex to find these typos:

    (?<!struct )(?<=\s)([a-z]{1,}) \1(?=\s)

Signed-off-by: Sven Strickroth <email@cs-ware.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:05:04 -04:00
dab0b9e176 notes: teach the -e option to edit messages in editor
Notes can be added to a commit using:
	- "-m" to provide a message on the command line.
	- -C to copy a note from a blob object.
	- -F to read the note from a file.
When these options are used, Git does not open an editor,
it simply takes the content provided via these options and
attaches it to the commit as a note.

Improve flexibility to fine-tune the note before finalizing it
by allowing the messages to be prefilled in the editor and edited
after the messages have been provided through -[mF].

Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 15:52:48 -04:00
bddfccead1 doc: document how uploadpack.allowAnySHA1InWant impact other allow options
Document how setting of `uploadpack.allowAnySHA1InWant`
influences other `uploadpack` options - `allowTipSHA1InWant`
and `allowReachableSHA1InWant`.

Signed-off-by: Piotr Szlazak <piotr.szlazak@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 15:49:35 -04:00
7e785b87db clang-format: align consecutive macro definitions
We generally align consecutive macro definitions for better readability:

  #define OUTPUT_ANNOTATE_COMPAT      (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME     (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP        (1U<<2)
  #define OUTPUT_PORCELAIN            (1U<<3)

over

  #define OUTPUT_ANNOTATE_COMPAT (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP (1U<<2)
  #define OUTPUT_PORCELAIN (1U<<3)

So let's add the rule in clang-format to follow this.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-18 17:37:16 -04:00
5e9fa0f9fa clang-format: re-adjust line break penalties
In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
adjusted the line break penalties to really fine tune what we care about
while doing line breaks. Modify some of those to be more inline with
what we care about in the Git project now.

We need to understand that the values set to penalties in
'.clang-format' are relative to each other and do not hold any absolute
value. The penalty arguments take an 'Unsigned' value, so we have some
liberty over the values we can set.

First, in that commit, we decided, that under no circumstances do we
want to exceed 80 characters. This seems a bit too strict. We do
overshoot this limit from time to time to prioritize readability. So
let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
that we add a penalty of 10 for each character that exceeds the column
limit. By itself this is enough to restrict to column limit. Tuning
other penalties in relation to this is what is important.

The penalty `PenaltyBreakAssignment` talks about the penalty for
breaking an assignment operator on to the next line. In our project, we
are okay with this, so giving a value of 5, which is below the value for
'PenaltyExcessCharacter' ensures that in the end, even 1 character over
the column limit is not worth keeping an assignment on the same line.

Similarly set the penalty for breaking before the first call parameter
'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
comments 'PenaltyBreakComment' and the penalty for breaking string
literals 'PenaltyBreakString' also to 5.

Finally, we really care about not breaking the return type into its own
line and we really care about not breaking before an open parenthesis.
This avoids weird formatting like:

   static const struct strbuf *
          a_really_really_large_function_name(struct strbuf resolved,
          const char *path, int flags)

or

   static const struct strbuf *a_really_really_large_function_name(
   	    struct strbuf resolved, const char *path, int flags)

to instead have something more readable like:

   static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
                                                                   const char *path,
                                                                   int flags)

(note: the tabs here have been replaced by spaces for easier reading)

This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
few characters above the 80 column limit to make code more readable.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-18 17:37:16 -04:00
468a7e41e8 credential-cache: treat ECONNABORTED like ECONNRESET
On Cygwin, t0301 fails because "git credential-cache exit" returns a
non-zero exit code. What's supposed to happen here is:

  1. The client (the "credential-cache" invocation above) connects to a
     previously-spawned credential-cache--daemon.

  2. The client sends an "exit" command to the daemon.

  3. The daemon unlinks the socket and then exits, closing the
     descriptor back to the client.

  4. The client sees EOF on the descriptor and exits successfully.

That works on most platforms, and even _used_ to work on Cygwin. But
that changed in Cygwin's ef95c03522 (Cygwin: select: Fix FD_CLOSE
handling, 2021-04-06). After that commit, the client sees a read error
with errno set to ECONNABORTED, and it reports the error and dies.

It's not entirely clear if this is a Cygwin bug. It seems that calling
fclose() on the filehandles pointing to the sockets is sufficient to
avoid this error return, even though exiting should in general look the
same from the client's perspective.

However, we can't just call fclose() here. It's important in step 3
above to unlink the socket before closing the descriptor to avoid the
race mentioned by 7d5e9c9849 (credential-cache--daemon: clarify "exit"
action semantics, 2016-03-18). The client will exit as soon as it sees
the descriptor close, and the daemon may or may not have actually
unlinked the socket by then. That makes test code like this:

  git credential exit &&
  test_path_is_missing .git-credential-cache

racy.

So we probably _could_ fix this by calling:

  delete_tempfile(&socket_file);
  fclose(in);
  fclose(out);

before we exit(). Or by replacing the exit() with a return up the stack,
in which case the fclose() happens as we unwind. But in that case we'd
still need to call delete_tempfile() here to avoid the race.

But simpler still is that we can notice that we already special-case
ECONNRESET on the client side, courtesy of 1f180e5eb9 (credential-cache:
interpret an ECONNRESET as an EOF, 2017-07-27). We can just do the same
thing here (I suspect that prior to the Cygwin commit that introduced
this problem, we were really just seeing ECONNRESET instead of
ECONNABORTED, so the "new" problem is just the switch of the errno
values).

There's loads more debugging in this thread:

  https://lore.kernel.org/git/9dc3e85f-a532-6cff-de11-1dfb2e4bc6b6@ramsayjones.plus.com/

but I've tried to summarize the useful bits in this commit message.

[jk: commit message]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-18 17:18:05 -04:00
34b6ce9b30 The third batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-18 14:01:50 -04:00
c1662a00b6 Merge branch 'ps/maintenance-start-crash-fix'
"git maintenance start" crashed due to an uninitialized variable
reference, which has been corrected.

* ps/maintenance-start-crash-fix:
  builtin/gc: fix crash when running `git maintenance start`
2024-10-18 13:56:26 -04:00
2849552beb Merge branch 'xx/protocol-v2-doc-markup-fix'
Docfix.

* xx/protocol-v2-doc-markup-fix:
  Documentation/gitprotocol-v2.txt: fix a slight inconsistency in format
2024-10-18 13:56:25 -04:00
728ae63c05 Merge branch 'tc/bundle-uri-leakfix'
Leakfix.

* tc/bundle-uri-leakfix:
  bundle-uri: plug leak in unbundle_from_file()
2024-10-18 13:56:24 -04:00
645cc7a2a7 Merge branch 'kh/checkout-ignore-other-docfix'
Doc updates.

* kh/checkout-ignore-other-docfix:
  checkout: refer to other-worktree branch, not ref
2024-10-18 13:56:24 -04:00
4491734107 Merge branch 'kh/merge-tree-doc'
Docfix.

* kh/merge-tree-doc:
  doc: merge-tree: improve example script
2024-10-18 13:56:23 -04:00
6fe1b8cee0 Merge branch 'ng/rebase-merges-branch-name-as-label'
"git rebase --rebase-merges" now uses branch names as labels when
able.

* ng/rebase-merges-branch-name-as-label:
  rebase-merges: try and use branch names as labels
  rebase-update-refs: extract load_branch_decorations
  load_branch_decorations: fix memory leak with non-static filters
2024-10-18 13:56:22 -04:00
b967851417 Merge branch 'kn/loose-object-layer-wo-global-hash'
Code clean-up.

* kn/loose-object-layer-wo-global-hash:
  loose: don't rely on repository global state
2024-10-18 13:56:22 -04:00
ee064ba65a Merge branch 'jc/doc-refspec-syntax'
Doc updates.

* jc/doc-refspec-syntax:
  doc: clarify <src> in refspec syntax
2024-10-18 13:56:20 -04:00
020c16bdb9 Merge branch 'aa/t7300-modernize'
Test modernization.

* aa/t7300-modernize:
  t7300-clean.sh: use test_path_* helper functions for error logging
2024-10-18 13:54:43 -04:00
20590cd287 reftable: handle trivial reftable_buf errors
Convert the reftable library such that we handle failures with the
new `reftable_buf` interfaces.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:56 -04:00
591c6a600e reftable/stack: adapt stack_filename() to handle allocation failures
The `stack_filename()` function cannot pass any errors to the caller as it
has a `void` return type. Adapt it and its callers such that we can
handle errors and start handling allocation failures.

There are two interesting edge cases in `reftable_stack_destroy()` and
`reftable_addition_close()`. Both of these are trying to tear down their
respective structures, and while doing so they try to unlink some of the
tables they have been keeping alive. Any earlier attempts to do that may
fail on Windows because it keeps us from deleting such tables while they
are still open, and thus we re-try on close. It's okay and even expected
that this can fail when the tables are still open by another process, so
we handle the allocation failures gracefully and just skip over any file
whose name we couldn't figure out.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:56 -04:00
4abc8022ff reftable/record: adapt reftable_record_key() to handle allocation failures
The `reftable_record_key()` function cannot pass any errors to the
caller as it has a `void` return type. Adapt it and its callers such
that we can handle errors and start handling allocation failures.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:56 -04:00
e693ccf2c9 reftable/stack: adapt format_name() to handle allocation failures
The `format_name()` function cannot pass any errors to the caller as it
has a `void` return type. Adapt it and its callers such that we can
handle errors and start handling allocation failures.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:56 -04:00
31eedd1d11 t/unit-tests: check for reftable_buf allocation errors
Adapt our unit tests to check for allocations errors.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:56 -04:00
f177d49163 reftable/blocksource: adapt interface name
Adapt the name of the `strbuf` block source to no longer relate to this
interface, but instead to the `reftable_buf` interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:56 -04:00
be4c070a3c reftable: convert from strbuf to reftable_buf
Convert the reftable library to use the `reftable_buf` interface instead
of the `strbuf` interface. This is mostly a mechanical change via sed(1)
with some manual fixes where functions for `strbuf` and `reftable_buf`
differ. The converted code does not yet handle allocation failures. This
will be handled in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:56 -04:00
81eddda540 reftable/basics: provide new reftable_buf interface
Implement a new `reftable_buf` interface that will replace Git's own
`strbuf` interface. This is done due to three reasons:

  - The `strbuf` interfaces do not handle memory allocation failures and
    instead causes us to die. This is okay in the context of Git, but is
    not in the context of the reftable library, which is supposed to be
    usable by third-party applications.

  - The `strbuf` interface is quite deeply tied into Git, which makes it
    hard to use the reftable library as a standalone library. Any
    dependent would have to carefully extract the relevant parts of it
    to make things work, which is not all that sensible.

  - The `strbuf` interface does not use the pluggable allocators that
    can be set up via `reftable_set_alloc()`.

So we have good reasons to use our own type, and the implementation is
rather trivial. Implement our own type. Conversion of the reftable
library will be handled in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:55 -04:00
7fa7e14ebe reftable: stop using strbuf_addf()
We're about to introduce our own `reftable_buf` type to replace
`strbuf`. One function we'll have to convert is `strbuf_addf()`, which
is used in a handful of places. This function uses `snprintf()`
internally, which makes porting it a bit more involved:

  - It is not available on all platforms.

  - Some platforms like Windows have broken implementations.

So by using `snprintf()` we'd also push the burden on downstream users
of the reftable library to make available a properly working version of
it.

Most callsites of `strbuf_addf()` are trivial to convert to not using
it. We do end up using `snprintf()` in our unit tests, but that isn't
much of a problem for downstream users of the reftable library.

While at it, remove a useless call to `strbuf_reset()` in
`t_reftable_stack_auto_compaction_with_locked_tables()`. We don't write
to the buffer before this and initialize it with `STRBUF_INIT`, so there
is no need to reset anything.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:55 -04:00
409f04995e reftable: stop using strbuf_addbuf()
We're about to introduce our own `reftable_buf` type to replace
`strbuf`. Get rid of the seldomly-used `strbuf_addbuf()` function such
that we have to reimplement one less function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:59:55 -04:00
f1eea0b620 t: fix typos
Fix typos in documentation, comments, etc.

Via codespell.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:14:56 -04:00
b33001645e builtin/shortlog: explicitly set hash algo when there is no repo
Whilst git-shortlog(1) does not explicitly need any repository
information when run without reference to one, it still parses some of
its arguments with parse_revision_opt() which assumes that the hash
algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) we stopped setting up a default
hash algorithm and instead require commands to set it up explicitly.

This was done for most other commands like in ab274909d4 (builtin/diff:
explicitly set hash algo when there is no repo, 2024-05-07) but was
missed for builtin/shortlog, making git-shortlog(1) segfault outside of
a repository when given arguments like --author that trigger a call to
parse_revision_opt().

Fix this for now by explicitly setting the hash algorithm to SHA1. Also
add a regression test for the segfault.

Thanks-to: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 16:10:54 -04:00
386d372031 mingw.c: Fix complier warnings for a 64 bit msvc
Remove some complier warnings from msvc in compat/mingw.c for value
truncation from 64 bit to 32 bit integers.

Compiling compat/mingw.c under a 64 bit version of msvc produces
warnings. An "int" is 32 bit, and ssize_t or size_t should be 64 bit
long. Prepare compat/vcbuild/include/unistd.h to have a 64 bit type
_ssize_t, when _WIN64 is defined and 32 bit otherwise.

Further down in this include file, as before, ssize_t is defined as
_ssize_t, if needed.

Use size_t instead of int for all variables that hold the result of
strlen() or wcslen() (which cannot be negative).

Use ssize_t to hold the return value of read().

Signed-off-by: Sören Krecker <soekkle@freenet.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17 14:42:27 -04:00
751d063f27 fuzz: port fuzz-url-decode-mem from OSS-Fuzz
Git's fuzz tests are run continuously as part of OSS-Fuzz [1]. Several
additional fuzz tests have been contributed directly to OSS-Fuzz;
however, these tests are vulnerable to bitrot because they are not built
during Git's CI runs, and thus breaking changes are much less likely to
be noticed by Git contributors.

Port one of these tests back to the Git project:
fuzz-url-decode-mem

This test was originally written by Eric Sesterhenn as part of a
security audit of Git [2]. It was then contributed to the OSS-Fuzz repo
in commit c58ac4492 (Git fuzzing: uncomment the existing and add new
targets. (#11486), 2024-02-21) by Jaroslav Lobačevski. I (Josh Steadmon)
have verified with both Eric and Jaroslav that they're OK with moving
this test to the Git project.

[1] https://github.com/google/oss-fuzz
[2] https://ostif.org/wp-content/uploads/2023/01/X41-OSTIF-Gitlab-Git-Security-Audit-20230117-public.pdf

Co-authored-by: Jaroslav Lobačevski <jarlob@gmail.com>
Co-authored-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 18:14:11 -04:00
72686d4e5e fuzz: port fuzz-parse-attr-line from OSS-Fuzz
Git's fuzz tests are run continuously as part of OSS-Fuzz [1]. Several
additional fuzz tests have been contributed directly to OSS-Fuzz;
however, these tests are vulnerable to bitrot because they are not built
during Git's CI runs, and thus breaking changes are much less likely to
be noticed by Git contributors.

Port one of these tests back to the Git project:
fuzz-parse-attr-line

This test was originally written by Eric Sesterhenn as part of a
security audit of Git [2]. It was then contributed to the OSS-Fuzz repo
in commit c58ac4492 (Git fuzzing: uncomment the existing and add new
targets. (#11486), 2024-02-21) by Jaroslav Lobačevski. I (Josh Steadmon)
have verified with both Eric and Jaroslav that they're OK with moving
this test to the Git project.

[1] https://github.com/google/oss-fuzz
[2] https://ostif.org/wp-content/uploads/2023/01/X41-OSTIF-Gitlab-Git-Security-Audit-20230117-public.pdf

Co-authored-by: Jaroslav Lobačevski <jarlob@gmail.com>
Co-authored-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 18:14:11 -04:00
966253db75 fuzz: port fuzz-credential-from-url-gently from OSS-Fuzz
Git's fuzz tests are run continuously as part of OSS-Fuzz [1]. Several
additional fuzz tests have been contributed directly to OSS-Fuzz;
however, these tests are vulnerable to bitrot because they are not built
during Git's CI runs, and thus breaking changes are much less likely to
be noticed by Git contributors.

Port one of these tests back to the Git project:
fuzz-credential-from-url-gently

This test was originally written by Eric Sesterhenn as part of a
security audit of Git [2]. It was then contributed to the OSS-Fuzz repo
in commit c58ac4492 (Git fuzzing: uncomment the existing and add new
targets. (#11486), 2024-02-21) by Jaroslav Lobačevski. I (Josh Steadmon)
have verified with both Eric and Jaroslav that they're OK with moving
this test to the Git project.

[1] https://github.com/google/oss-fuzz
[2] https://ostif.org/wp-content/uploads/2023/01/X41-OSTIF-Gitlab-Git-Security-Audit-20230117-public.pdf

Co-authored-by: Jaroslav Lobačevski <jarlob@gmail.com>
Co-authored-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 18:14:11 -04:00
80ebd91b83 http: fix build error on FreeBSD
The `result` parameter passed to `http_request_reauth()` may either
point to a `struct strbuf` or a `FILE *`, where the `target` parameter
tells us which of either it actually is. To accommodate for both types
the pointer is a `void *`, which we then pass directly to functions
without doing a cast.

This is fine on most platforms, but it breaks on FreeBSD because
`fileno()` is implemented as a macro that tries to directly access the
`FILE *` structure.

Fix this issue by storing the `FILE *` in a local variable before we
pass it on to other functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
87ad2a9d56 builtin/credential-cache: fix missing parameter for stub function
When not compiling the credential cache we may use a stub function for
`cmd_credential_cache()`. With commit 9b1cb5070f (builtin: add a
repository parameter for builtin functions, 2024-09-13), we have added a
new parameter to all of those top-level `cmd_*()` functions, and did
indeed adapt the non-stubbed-out `cmd_credential_cache()`. But we didn't
adapt the stubbed-out variant, so the code does not compile.

Fix this by adding the missing parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
bb0d76dbf7 t7300: work around platform-specific behaviour with long paths on MinGW
Windows by default has a restriction in place to only allow paths up to
260 characters. This restriction can nowadays be lifted by setting a
registry key, but is still active by default.

In t7300 we have one test that exercises the behaviour of git-clean(1)
with such long paths. Interestingly enough, this test fails on my system
that uses Windows 10 with mingw-w64 installed via MSYS2: instead of
observing ENAMETOOLONG, we observe ENOENT. This behaviour is consistent
across multiple different environments I have tried.

I cannot say why exactly we observe a different error here, but I would
not be surprised if this was either dependent on the Windows version,
the version of MinGW, the current working directory of Git or any kind
of combination of these.

Work around the issue by handling both errors.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
5f8af25ff9 t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin
Parsing repositories which contain '[::1]' is broken on Cygwin. It seems
as if Cygwin is confusing those as drive letter prefixes or something
like this, but I couldn't deduce the actual root cause.

Mark those tests as broken for now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
f74949fa3d t3404: work around platform-specific behaviour on macOS 10.15
Two of our tests in t3404 use indented HERE docs where leading tabs on
some of the lines are actually relevant. The tabs do get removed though,
and we try to fix this up by using sed(1) to replace leading tabs in the
actual output, as well. But macOS 10.15 uses an oldish version of sed(1)
that has BSD lineage, which does not understand "\t", and thus we fail
to strip those leading tabs and fail the test.

Address this issue by using `q_to_tab` such that we do not have to strip
leading tabs from the actual output.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
3acb1f7199 t1401: make invocation of tar(1) work with Win32-provided one
Windows nowadays provides a tar(1) binary in "C:\Windows\system32". This
version of tar(1) doesn't seem to handle the case where directory paths
end with a trailing forward slash. And as we do that in t1401 the result
is that the test fails.

Drop the trailing slash. Other tests that use tar(1) work alright, this
is the only instance where it has been failing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
b4b77ea280 t/lib-gpg: fix setup of GNUPGHOME in MinGW
In "t/lib-gpg.sh" we set up the "GNUPGHOME" environment variable to
point to a test-specific directory. This is done by using "$PWD/gpghome"
as value, where "$PWD" is the current test's trash directory.

This is broken for MinGW though because "$PWD" will use Windows-style
paths that contain drive letters. What we really want in this context is
a Unix-style path, which we can get by using `$(pwd)` instead. It is
somewhat puzzling that nobody ever hit this issue, but it may easily be
that nobody ever tests on Windows with GnuPG installed, which would make
us skip those tests.

Adapt the code accordingly to fix tests using this library.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
6b1f9e9c8c t/lib-gitweb: test against the build version of gitweb
When testing gitweb we set up the CGI script as "gitweb.perl", which is
the source file of the build target "gitweb.cgi". This file doesn't have
a patched shebang and still contains `++REPLACEMENT++` markers, but
things generally work because we replace the configuration with our own
test configuration.

But this only works as long as "$GIT_BUILD_DIR" actually points to the
source tree, because "gitweb.cgi" and "gitweb.perl" happen to sit next
to each other. This is not the case though once you have out-of-tree
builds like with CMake, where the source and built versions live in
different directories. Consequently, "$GIT_BUILD_DIR/gitweb/gitweb.perl"
won't exist there.

While we could ask build systems with out-of-tree builds to instead set
up GITWEB_TEST_INSTALLED, which allows us to override the location of
the script, it goes against the spirit of this environment variable. We
_don't_ want to test against an installed version, we want to use the
version we have just built.

Fix this by using "gitweb.cgi" instead. This means that you cannot run
test scripts without building that file, but in general we do expect
developers to build stuff before they test it anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
df383b5842 t/test-lib: wire up NO_ICONV prerequisite
The iconv library is used by Git to reencode files, commit messages and
other things. As such it is a rather integral part, but given that many
platforms nowadays use UTF-8 everywhere you can live without support for
reencoding in many situations. It is thus optional to build Git with
iconv, and some of our platforms wired up in "config.mak.uname" disable
it. But while we support building without it, running our test suite
with "NO_ICONV=Yes" causes many test failures.

Wire up a new test prerequisite ICONV that gets populated via our
GIT-BUILD-OPTIONS. Annotate failing tests accordingly.

Note that this commit does not do a deep dive into every single test to
assess whether the failure is expected or not. Most of the tests do
smell like the expected kind of failure though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
ed7634ebcc t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE
When assembling our LSAN_OPTIONS that configure the leak sanitizer we
end up prepending the string with various different colon-separated
options via calls to `prepend_var`. One of the settings we add is the
path where the sanitizer should store logs, which can be an arbitrary
filesystem path.

Naturally, filesystem paths may contain whitespace characters. And while
it does seem as if we were quoting the value, we use escaped quotes and
consequently split up the value if it does contain spaces. This leads to
the following error in t0000 when having a value with whitespaces:

    .../t/test-lib.sh: eval: line 64: unexpected EOF while looking for matching `"'
    ++ return 1
    error: last command exited with $?=1
    not ok 5 - subtest: 3 passing tests

The error itself is a bit puzzling at first. The basic problem is that
the code sees the leading escaped quote during eval, but because we
truncate everything after the space character it doesn't see the
trailing escaped quote and thus fails to parse the string.

Properly quote the value to fix the issue while using single-quotes to
quote the inner value passed to eval. The issue can be reproduced by
t0000 with such a path that contains spaces.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16 17:00:49 -04:00
15030f9556 The second batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-15 17:12:40 -04:00
b43e23fa02 Merge branch 'jk/fsmonitor-event-listener-race-fix'
On macOS, fsmonitor can fall into a race condition that results in
a client waiting forever to be notified for an event that have
already happened.  This problem has been corrected.

* jk/fsmonitor-event-listener-race-fix:
  fsmonitor: initialize fs event listener before accepting clients
  simple-ipc: split async server initialization and running
2024-10-15 16:56:43 -04:00
fd98f659fd Merge branch 'xx/remote-server-option-config'
A new configuration variable remote.<name>.serverOption makes the
transport layer act as if the --serverOption=<value> option is
given from the command line.

* xx/remote-server-option-config:
  ls-remote: leakfix for not clearing server_options
  fetch: respect --server-option when fetching multiple remotes
  transport.c:🤝 make use of server options from remote
  remote: introduce remote.<name>.serverOption configuration
  transport: introduce parse_transport_option() method
2024-10-15 16:56:43 -04:00
8a5545b949 Merge branch 'js/doc-platform-support-link-fix'
Docfix.

* js/doc-platform-support-link-fix:
  docs: fix the `maintain-git` links in `technical/platform-support`
2024-10-15 16:56:43 -04:00
f004467b04 Merge branch 'jh/config-unset-doc-fix'
Docfix.

* jh/config-unset-doc-fix:
  git-config.1: remove value from positional args in unset usage
2024-10-15 16:56:43 -04:00
3f0346d4dc trailer: spread usage of "trailer_block" language
Deprecate the "trailer_info" struct name and replace it with
"trailer_block". This is more readable, for two reasons:

  1. "trailer_info" on the surface sounds like it's about a single
     trailer when in reality it is a collection of one or more trailers,
     and

  2. the "*_block" suffix is more informative than "*_info", because it
     describes a block (or region) of contiguous text which has trailers
     in it, which has been parsed into the trailer_block structure.

Rename the

    size_t trailer_block_start, trailer_block_end;

members of trailer_info to just "start" and "end". Rename the "info"
pointer to "trailer_block" because it is more descriptive. Update
comments accordingly.

Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-14 12:33:02 -04:00
19c291e5b2 t3404: replace test with test_line_count()
Refactor t3404 to replace instances of `test` with `test_line_count()`
for checking line counts. This improves readability and aligns with Git's
current test practices.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-14 12:03:35 -04:00
c8fbae25c3 t3404: avoid losing exit status with focus on git show and git cat-file
The exit code of the preceding command in a pipe is disregarded. So
if that preceding command is a Git command that fails, the test would
not fail. Instead, by saving the output of that Git command to a file,
and removing the pipe, we make sure the test will fail if that Git
command fails. This particular patch focuses on all `git show` and
some instances of `git cat-file`.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-14 12:03:35 -04:00
2454970930 BreakingChanges: early adopter option
Discussing the desire to make breaking changes, declaring that
breaking changes are made at a certain version boundary, and
recording these decisions in this document, are necessary but not
sufficient.  We need to make sure that we can implement, test, and
deploy such impactful changes.

Earlier we considered to guard the breaking changes with a run-time
check of the `feature.git<version>` configuration to allow brave
users and developers to opt into them as early adoptors.  But the
engineering cost to support such a run-time switch, covering new and
disappearing git subcommands and how "git help" would adjust the
documentation to the run-time switch, would be unrealistically high
to be worth it.

Formalize the mechanism based on a compile-time switch to allow
early adopters to opt into the breaking change in a version of Git
before the planned version for the breaking change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11 14:50:21 -07:00
dcd590a39d t/README: add missing value for GIT_TEST_DEFAULT_REF_FORMAT
The documentation only lists "files" as a possible value, but
"reftable" is also valid.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11 14:18:39 -07:00
ea3422662d Makefile: fix dependency for $(UNIT_TEST_DIR)/clar/clar.o
The clar source file '$(UNIT_TEST_DIR)/clar/clar.c' includes the
generated 'clar.suite', but this dependency is not taken into account by
our Makefile, so that it is possible for a parallel build to fail if
Make tries to build 'clar.o' before 'clar.suite' is generated.

Correctly specify the dependency.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11 11:08:08 -07:00
528d3e4d53 archive: remove the_repository global variable
As part of the effort to get rid of global state due to the global
the_repository variable, replace the_repository with the repository
argument that gets passed down through the builtin function.

The repo might be NULL, but we should be safe in write_archive() because
it detects if we are outside of a repository and calls
setup_git_directory() which will error.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11 09:37:18 -07:00
ebe8f4b6ec annotate: remove usage of the_repository global
As part of the effort to get rid of global state due to the_repository
variable, remove the the_repository with the repository argument that
gets passed down through the builtin function.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11 09:37:18 -07:00
5db948d413 git: pass in repo to builtin based on setup_git_directory_gently
The current code in run_builtin() passes in a repository to the builtin
based on whether cmd_struct's option flag has RUN_SETUP.

This is incorrect, however, since some builtins that only have
RUN_SETUP_GENTLY can potentially take a repository.
setup_git_directory_gently() tells us whether or not a command is being
run inside of a repository.

Use the output of setup_git_directory_gently() to help determine whether
or not there is a repository to pass to the builtin. If not, then we
just pass NULL.

As part of this patch, we need to modify add to check for a NULL repo
before calling repo_git_config(), since add -h can be run outside of a
repository.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11 09:37:17 -07:00
ef8ce8f3d4 Start the 2.48 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 14:22:30 -07:00
3eb4cc451e Merge branch 'jk/output-prefix-cleanup'
Code clean-up.

* jk/output-prefix-cleanup:
  diff: store graph prefix buf in git_graph struct
  diff: return line_prefix directly when possible
  diff: return const char from output_prefix callback
  diff: drop line_prefix_length field
  line-log: use diff_line_prefix() instead of custom helper
2024-10-10 14:22:30 -07:00
31bc4454de Merge branch 'ps/leakfixes-part-8'
More leakfixes.

* ps/leakfixes-part-8: (23 commits)
  builtin/send-pack: fix leaking list of push options
  remote: fix leaking push reports
  t/helper: fix leaks in proc-receive helper
  pack-write: fix return parameter of `write_rev_file_order()`
  revision: fix leaking saved parents
  revision: fix memory leaks when rewriting parents
  midx-write: fix leaking buffer
  pack-bitmap-write: fix leaking OID array
  pseudo-merge: fix leaking strmap keys
  pseudo-merge: fix various memory leaks
  line-log: fix several memory leaks
  diff: improve lifecycle management of diff queues
  builtin/revert: fix leaking `gpg_sign` and `strategy` config
  t/helper: fix leaking repository in partial-clone helper
  builtin/clone: fix leaking repo state when cloning with bundle URIs
  builtin/pack-redundant: fix various memory leaks
  builtin/stash: fix leaking `pathspec_from_file`
  submodule: fix leaking submodule entry list
  wt-status: fix leaking buffer with sparse directories
  shell: fix leaking strings
  ...
2024-10-10 14:22:29 -07:00
d29d644d18 Merge branch 'ds/line-log-asan-fix'
Use after free and double freeing at the end in "git log -L... -p"
had been identified and fixed.

* ds/line-log-asan-fix:
  line-log: protect inner strbuf from free
2024-10-10 14:22:27 -07:00
e29296745d Merge branch 'sk/doc-maintenance-schedule'
Doc update to clarify how periodical maintenance are scheduled,
spread across time to avoid thundering hurds.

* sk/doc-maintenance-schedule:
  doc: add a note about staggering of maintenance
2024-10-10 14:22:26 -07:00
325772f0d5 Merge branch 'tb/notes-amlog-doc'
Document "amlog" notes.

* tb/notes-amlog-doc:
  Documentation: mention the amlog in howto/maintain-git.txt
2024-10-10 14:22:25 -07:00
5575c713c2 Merge branch 'ps/reftable-alloc-failures'
The reftable library is now prepared to expect that the memory
allocation function given to it may fail to allocate and to deal
with such an error.

* ps/reftable-alloc-failures: (26 commits)
  reftable/basics: fix segfault when growing `names` array fails
  reftable/basics: ban standard allocator functions
  reftable: introduce `REFTABLE_FREE_AND_NULL()`
  reftable: fix calls to free(3P)
  reftable: handle trivial allocation failures
  reftable/tree: handle allocation failures
  reftable/pq: handle allocation failures when adding entries
  reftable/block: handle allocation failures
  reftable/blocksource: handle allocation failures
  reftable/iter: handle allocation failures when creating indexed table iter
  reftable/stack: handle allocation failures in auto compaction
  reftable/stack: handle allocation failures in `stack_compact_range()`
  reftable/stack: handle allocation failures in `reftable_new_stack()`
  reftable/stack: handle allocation failures on reload
  reftable/reader: handle allocation failures in `reader_init_iter()`
  reftable/reader: handle allocation failures for unindexed reader
  reftable/merged: handle allocation failures in `merged_table_init_iter()`
  reftable/writer: handle allocation failures in `reftable_new_writer()`
  reftable/writer: handle allocation failures in `writer_index_hash()`
  reftable/record: handle allocation failures when decoding records
  ...
2024-10-10 14:22:25 -07:00
799450316b Merge branch 'ja/doc-synopsis-markup'
The way AsciiDoc is used for SYNOPSIS part of the manual pages has
been revamped.  The sources, at least for the simple cases, got
vastly pleasant to work with.

* ja/doc-synopsis-markup:
  doc: apply synopsis simplification on git-clone and git-init
  doc: update the guidelines to reflect the current formatting rules
  doc: introduce a synopsis typesetting
2024-10-10 14:22:24 -07:00
41869f7447 t: fix typos
Fix typos via codespell.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 13:31:14 -07:00
897124aa1b t/helper: fix a typo
Fix a typo in comments: bellow -> below.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 13:31:13 -07:00
050e0ef6ea t/perf: fix typos
Fix typos via codespell.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 13:31:13 -07:00
ca2746b791 t/unit-tests: fix typos
Fix typos via codespell.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 13:31:13 -07:00
f5dedddb75 contrib: fix typos
Fix typos via codespell.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 13:31:12 -07:00
54ee29cfd5 compat: fix typos
Fix typos and grammar.

Reported-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 13:31:12 -07:00
b8139c8f4e checkout: refer to other-worktree branch, not ref
We can only check out commits or branches, not refs in general.  And the
problem here is if another worktree is using the branch that we want to
check out.

Let’s be more direct and just talk about branches instead of refs.

Also replace “be held” with “in use”.  Further, “in use” is not
restricted to a branch being checked out (e.g. the branch could be busy
on a rebase), hence generalize to “or otherwise in use” in the option
description.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 13:09:13 -07:00
f1ed39987b Documentation/gitprotocol-v2.txt: fix a slight inconsistency in format
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Acked-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 11:54:07 -07:00
6dab49b9fb bundle-uri: plug leak in unbundle_from_file()
The function `unbundle_from_file()` has two memory leaks:

  - We do not release the `struct bundle_header header` when hitting
    errors because we return early without any cleanup.

  - We do not release the `struct strbuf bundle_ref` at all.

Plug these leaks by creating a common exit path where both of these
variables are released.

While at it, refactor the code such that the variable assignments do not
happen inside the conditional statement itself according to our coding
style.

Signed-off-by: Toon Claes <toon@iotcl.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 11:47:24 -07:00
c95547a394 builtin/gc: fix crash when running git maintenance start
It was reported on the mailing list that running `git maintenance start`
immediately segfaults starting with b6c3f8e12c (builtin/maintenance: fix
leak in `get_schedule_cmd()`, 2024-09-26). And indeed, this segfault is
trivial to reproduce up to a point where one is scratching their head
why we didn't catch this regression in our test suite.

The root cause of this error is `get_schedule_cmd()`, which does not
populate the `out` parameter in all cases anymore starting with the
mentioned commit. Callers do assume it to always be populated though and
will e.g. call `strvec_split()` on the returned value, which will of
course segfault when the variable is uninitialized.

So why didn't we catch this trivial regression? The reason is that our
tests always set up the "GIT_TEST_MAINT_SCHEDULER" environment variable
via "t/test-lib.sh", which allows us to override the scheduler command
with a custom one so that we don't accidentally modify the developer's
system. But the faulty code where we don't set the `out` parameter will
only get hit in case that environment variable is _not_ set, which is
never the case when executing our tests.

Fix the regression by again unconditionally allocating the value in the
`out` parameter, if provided. Add a test that unsets the environment
variable to catch future regressions in this area.

Reported-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10 10:04:43 -07:00
8ead1bba3e doc: clarify <src> in refspec syntax
We explicitly avoid saying "ref <src>" when introducing the source
side of a refspec, because it can be a fully-spelled hexadecimal
object name, and it also can be a pattern that is not quite a "ref".

But we are loose when we introduce <dst> and say "ref <dst>", even
though it can also be a pattern.  Let's omit "ref" also from the
destination side.

Clarify that <src> can be a ref, a (limited glob) pattern, or an
object name.

Even though the very original design of refspec expected that '*'
was used only at the end (e.g., "refs/heads/*" was expected, but not
"refs/heads/*-wip"), the code and its use evolved to handle a single
'*' anywhere in the pattern.  Update the text to remove the mention
of "the same prefix".  Anything that matches the pattern are named
by such a (limited glob) pattern in <src>.

Also put a bit more stress on the fact that we accept only one '*'
in the pattern by saying "one and only one `*`".

Helped-by: Monika Kairaitytė <monika@kibit.lt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 16:59:01 -07:00
77af53f56f t7300-clean.sh: use test_path_* helper functions for error logging
This test script uses "test - [def]", but when a test fails because
the file passed to it does not exist,
it fails silently without an error message.
Use test_path_* helper functions, which are designed to give better
error messages when their expectations are not met.

I have added a mechanical validation that applies the same transformation
done in this patch, when the test script is passed to a sed script as shown
below.

sed -e 's/^\(	*\)test -f /\1test_path_is_file /' \
    -e 's/^\(	*\)test -d /\1test_path_is_dir /' \
    -e 's/^\(	*\)test -e /\1test_path_exists /' \
    -e 's/^\(	*\)! test -[edf] /\1test_path_is_missing /' \
    -e 's/^\(	*\)test ! -[edf] /\1test_path_is_missing /' \
       "$1" >foo.sh

Reviewers can use the sed script to tranform the original test script and
compare the result in foo.sh with the results of applying the patch.
You will see an instance of "!(test -e 3)" which was manually replaced with
""test_path_is_missing 3", and everything else should match.

Careful and deliberate observation was done to check instances where
"test ! - [df] foo" was used in the test script to make sure that the test
instances were expecting foo to EITHER be a file or a directory, and NOT a
possibility of being both as this would make replacing "test ! -f foo" with
"test_path_is_missing foo" unreasonable.

In the tests control flow, foo has been created as EITHER a
reguar file OR a directory and should NOT exist
after "git clean" or "git clean -d", as the case maybe, has been called.
This made it reasonable to replace
"test ! -[df] foo" with "test_path_is_missing foo".

Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 15:04:39 -07:00
432f666aa6 loose: don't rely on repository global state
In `loose.c`, we rely on the global variable `the_hash_algo`. As such we
have guarded the file with the 'USE_THE_REPOSITORY_VARIABLE' definition.
Let's derive the hash algorithm from the available repository variable
and remove this guard. This brings us one step closer to removing the
global 'the_repository' variable.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 11:51:31 -07:00
631ddbbcbd gitlab-ci: exercise Git on Windows
Add jobs that exercise Git on Windows. Unfortunately, building and
especially testing Git on Windows is inherently slower compared to other
Unix-like systems, mostly because spawning processes is way slower. We
thus use the same layout as we use in GitHub Actions, where we have one
build job, and then pass on the resulting build artifacts to ten test
jobs that split up the work across each other.

Unfortunately, the GitLab runners for Windows machines are embarassingly
slow by themselves. So while this strategy leads to around 20 minutes of
build time in GitHub Actions, the same pipeline takes around an hour in
GitLab CI. Still, having late coverage is certainly better than having
none at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 11:33:05 -07:00
05a928a93e gitlab-ci: introduce stages and dependencies
We're about to add a couple of jobs for Windows. As the Windows runners
are quite slow, we will split those up across two stages: one stage to
build the artifacts, and one stage that runs test slices in parallel.

Introduce stages and "needs" dependencies for the preexisting jobs as a
preparatory step. The stages will lead to a more natural representation
of jobs in the UI, whereas the "needs" dependency ensures that jobs do
not have to wait for all jobs in the preceding stage to finish.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 11:33:05 -07:00
b7a08e947e ci: handle Windows-based CI jobs in GitLab CI
We try to abstract away any differences between different CI platforms
in "ci/lib.sh", such that knowledge specific to e.g. GitHub Actions or
GitLab CI is neatly encapsulated in a single place. Next to some generic
variables, we also set up some variables that are specific to the actual
platform that the CI operates on, e.g. Linux or macOS.

We do not yet support Windows runners on GitLab CI. Unfortunately, those
systems do not use the same "CI_JOB_IMAGE" environment variable as both
Linux and macOS do. Instead, we can use the "OS" variable, which should
have a value of "Windows_NT" on Windows platforms.

Handle the combination of "$OS,$CI_JOB_IMAGE" and introduce support for
Windows.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 11:33:04 -07:00
91839a8827 ci: create script to set up Git for Windows SDK
In order to build and test Git, we have to first set up the Git for
Windows SDK, which contains various required tools and libraries. The
SDK is basically a clone of [1], but that repository is quite large due
to all the binaries it contains. We thus use both shallow clones and
sparse checkouts to speed up the setup. To handle this complexity we use
a GitHub action that is hosted externally at [2].

Unfortunately, this makes it rather hard to reuse the logic for CI
platforms other than GitHub Actions. After chatting with Johannes
Schindelin we came to the conclusion that it would be nice if the Git
for Windows SDK would regularly publish releases that one can easily
download and extract, thus moving all of the complexity into that single
step. Like this, all that a CI job needs to do is to fetch and extract
the resulting archive. This published release comes in the form of a new
"ci-artifacts" tag that gets updated regularly [3].

Implement a new script that knows how to fetch and extract that script
and convert GitHub Actions to use it.

[1]: https://github.com/git-for-windows/git-sdk-64/
[2]: https://github.com/git-for-windows/setup-git-for-windows-sdk/
[3]: https://github.com/git-for-windows/git-sdk-64/releases/tag/ci-artifacts/

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 11:33:04 -07:00
106834e34a t7300: work around platform-specific behaviour with long paths on MinGW
Windows by default has a restriction in place to only allow paths up to
260 characters. This restriction can nowadays be lifted by setting a
registry key, but is still active by default.

In t7300 we have one test that exercises the behaviour of git-clean(1)
with such long paths. Interestingly enough, this test fails on my system
that uses Windows 10 with mingw-w64 installed via MSYS2: instead of
observing ENAMETOOLONG, we observe ENOENT. This behaviour is consistent
across multiple different environments I have tried.

I cannot say why exactly we observe a different error here, but I would
not be surprised if this was either dependent on the Windows version,
the version of MinGW, the current working directory of Git or any kind
of combination of these.

Work around the issue by handling both errors.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 11:33:04 -07:00
436892123d rebase-merges: try and use branch names as labels
When interactively rebasing merge commits, the commit message is parsed to
extract a probably meaningful label name. For instance if the merge commit
is “Merge branch 'feature0'”, then the rebase script will have thes lines:
```
label feature0

merge -C $sha feature0 # “Merge branch 'feature0'
```

This heuristic fails in the case of octopus merges or when the merge commit
message is actually unrelated to the parent commits.

An example that combines both is:
```
*---.   967bfa4 (HEAD -> integration) Integration
|\ \ \
| | | * 2135be1 (feature2, feat2) Feature 2
| |_|/
|/| |
| | * c88b01a Feature 1
| |/
|/|
| * 75f3139 (feat0) Feature 0
|/
* 25c86d0 (main) Initial commit
```
yields the labels Integration, Integration-2 and Integration-3.

Fix this by using a branch name for each merge commit's parent that is the
tip of at least one branch, and falling back to a label derived from the
merge commit message otherwise.
In the example above, the labels become feat0, Integration and feature2.

Signed-off-by: Nicolas Guichard <nicolas@guichard.eu>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 10:52:46 -07:00
68c9fcb027 rebase-update-refs: extract load_branch_decorations
Extract load_branch_decorations from todo_list_add_update_ref_commands so
it can be re-used in make_script_with_merges.

Since it can now be called multiple times, use non-static lists and place
it next to load_ref_decorations to re-use the decoration_loaded guard.

Signed-off-by: Nicolas Guichard <nicolas@guichard.eu>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 10:52:45 -07:00
e4d03b7938 load_branch_decorations: fix memory leak with non-static filters
load_branch_decorations calls normalize_glob_ref on each string of filter's
string_lists. This effectively replaces the potentially non-owning char* of
those items with an owning char*.

Set the strdup_string flag on those string_lists.

This was not caught until now because:
- when passing string_lists already with the strdup_string already set, the
  behaviour was correct
- when passing static string_lists, the new char* remain reachable until
  program exit

Signed-off-by: Nicolas Guichard <nicolas@guichard.eu>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 10:52:44 -07:00
0c1a9987da submodule: correct remote name with fetch
The code fetches the submodules remote based on the superproject remote name
instead of the submodule remote name[1].

Instead of grabbing the default remote of the superproject repository, ask
the default remote of the submodule we are going to run 'git fetch' in.

1. https://lore.kernel.org/git/ZJR5SPDj4Wt_gmRO@pweza/

Signed-off-by: Daniel Black <daniel@mariadb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 10:48:08 -07:00
c4b8fb6ef2 doc: merge-tree: improve example script
• Provide a commit message in the example command.

  The command will hang since it is waiting for a commit message on
  stdin.  Which is usable but not straightforward enough since this is
  example code.
• Use `||` directly since that is more straightforward than checking the
  last exit status.

  Also use `echo` and `exit` since `die` is not defined.
• Expose variable declarations.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09 10:40:42 -07:00
f36b8cbaef git-config.1: remove value from positional args in unset usage
The synopsis for `git config unset` mentions two positional arguments:
`<name>` and `<value>`. While the first argument is correct, the second
is not. Users are expected to provide the value via `--value=<value>`.

Remove the positional argument. The `--value=<value>` option is already
documented correctly, so this is all we need to do to fix the
documentation.

Signed-off-by: Josh Heinrichs <joshiheinrichs@gmail.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 23:35:45 -07:00
51907f8fee fsmonitor: initialize fs event listener before accepting clients
There's a racy hang in fsmonitor on macOS that we sometimes see in CI.
When we serve a client, what's supposed to happen is:

  1. The client thread calls with_lock__wait_for_cookie() in which we
     create a cookie file and then wait for a pthread_cond event

  2. The filesystem event listener sees the cookie file creation, does
     some internal book-keeping, and then triggers the pthread_cond.

But there's a problem: we start the listener that accepts client threads
before we start the fs event thread. So it's possible for us to accept a
client which creates the cookie file and starts waiting before the fs
event thread is initialized, and we miss those filesystem events
entirely. That leaves the client thread hanging forever.

In CI, the symptom is that t9210 (which is testing scalar, which always
enables fsmonitor under the hood) may hang forever in "scalar clone". It
is waiting on "git fetch" which is waiting on the fsmonitor daemon.

The race happens more frequently under load, but you can trigger it
predictably with a sleep like this, which delays the start of the fs
event thread:

  --- a/compat/fsmonitor/fsm-listen-darwin.c
  +++ b/compat/fsmonitor/fsm-listen-darwin.c
  @@ -510,6 +510,7 @@ void fsm_listen__loop(struct fsmonitor_daemon_state *state)
          FSEventStreamSetDispatchQueue(data->stream, data->dq);
          data->stream_scheduled = 1;

  +       sleep(1);
          if (!FSEventStreamStart(data->stream)) {
                  error(_("Failed to start the FSEventStream"));
                  goto force_error_stop_without_loop;

One solution might be to reverse the order of initialization: start the
fs event thread before we start the thread listening for clients. But
the fsmonitor code explicitly does it in the opposite direction. The fs
event thread wants to refer to the ipc_server_data struct, so we need it
to be initialized first.

A further complication is that we need a signal from the fs event thread
that it is actually ready and listening. And those details happen within
backend-specific fsmonitor code, whereas the initialization is in the
shared code.

So instead, let's use the ipc_server init/start split added in the
previous commit. The generic fsmonitor code will init the ipc_server but
_not_ start it, leaving that to the backend specific code, which now
needs to call ipc_server_start_async() at the right time.

For macOS, that is right after we start the FSEventStream that you can
see in the diff above.

It's not clear to me if Windows suffers from the same problem (and we
simply don't trigger it in CI), or if it is immune. Regardless, the
obvious place to start accepting clients there is right after we've
established the ReadDirectoryChanges watch.

This makes the hangs go away in our macOS CI environment, even when
compiled with the sleep() above.

Helped-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 12:03:56 -07:00
766fce69e9 simple-ipc: split async server initialization and running
To start an async ipc server, you call ipc_server_run_async(). That
initializes the ipc_server_data object, and starts all of the threads
running, which may immediately start serving clients.

This can create some awkward timing problems, though. In the fsmonitor
daemon (the sole user of the simple-ipc system), we want to create the
ipc server early in the process, which means we may start serving
clients before the rest of the daemon is fully initialized.

To solve this, let's break run_async() into two parts: an initialization
which allocates all data and spawns the threads (without letting them
run), and a start function which actually lets them begin work. Since we
have two simple-ipc implementations, we have to handle this twice:

  - in ipc-unix-socket.c, we have a central listener thread which hands
    connections off to worker threads using a work_available mutex. We
    can hold that mutex after init, and release it when we're ready to
    start.

    We do need an extra "started" flag so that we know whether the main
    thread is holding the mutex or not (e.g., if we prematurely stop the
    server, we want to make sure all of the worker threads are released
    to hear about the shutdown).

  - in ipc-win32.c, we don't have a central mutex. So we'll introduce a
    new startup_barrier mutex, which we'll similarly hold until we're
    ready to let the threads proceed.

    We again need a "started" flag here to make sure that we release the
    barrier mutex when shutting down, so that the sub-threads can
    proceed to the finish.

I've renamed the run_async() function to init_async() to make sure we
catch all callers, since they'll now need to call the matching
start_async().

We could leave run_async() as a wrapper that does both, but there's not
much point. There are only two callers, one of which is fsmonitor, which
will want to actually do work between the two calls. And the other is
just a test-tool wrapper.

For now I've added the start_async() calls in fsmonitor where they would
otherwise have happened, so there should be no behavior change with this
patch.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 12:03:56 -07:00
08830ac00f worktree: add test for path handling in linked worktrees
A failure scenario reported in an earlier patch series[1] that several
`git worktree` subcommands failed or misbehaved when invoked from within
linked worktrees that used relative paths.

This adds a test that executes a `worktree prune` command inside both an
internally and an externally linked worktree and asserts that the other
worktree was not pruned.

[1]: https://lore.kernel.org/git/CAPig+cQXFy=xPVpoSq6Wq0pxMRCjS=WbkgdO+3LySPX=q0nPCw@mail.gmail.com/

Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 11:49:22 -07:00
717af916cd worktree: link worktrees with relative paths
Git currently stores absolute paths to both the main repository and
linked worktrees. However, this causes problems when moving repositories
or working in containerized environments where absolute paths differ
between systems. The worktree links break, and users are required to
manually execute `worktree repair` to repair them, leading to workflow
disruptions. Additionally, mapping repositories inside of containerized
environments renders the repository unusable inside the containers, and
this is not repairable as repairing the worktrees inside the containers
will result in them being broken outside the containers.

To address this, this patch makes Git always write relative paths when
linking worktrees. Relative paths increase the resilience of the
worktree links across various systems and environments, particularly
when the worktrees are self-contained inside the main repository (such
as when using a bare repository with worktrees). This improves
portability, workflow efficiency, and reduces overall breakages.

Although Git now writes relative paths, existing repositories with
absolute paths are still supported. There are no breaking changes
to workflows based on absolute paths, ensuring backward compatibility.

At a low level, the changes involve modifying functions in `worktree.c`
and `builtin/worktree.c` to use `relative_path()` when writing the
worktree’s `.git` file and the main repository’s `gitdir` reference.
Instead of hardcoding absolute paths, Git now computes the relative path
between the worktree and the repository, ensuring that these links are
portable. Locations where these respective file are read have also been
updated to properly handle both absolute and relative paths. Generally,
relative paths are always resolved into absolute paths before any
operations or comparisons are performed.

Additionally, `repair_worktrees_after_gitdir_move()` has been introduced
to address the case where both the `<worktree>/.git` and
`<repo>/worktrees/<id>/gitdir` links are broken after the gitdir is
moved (such as during a re-initialization). This function repairs both
sides of the worktree link using the old gitdir path to reestablish the
correct paths after a move.

The `worktree.path` struct member has also been updated to always store
the absolute path of a worktree. This ensures that worktree consumers
never have to worry about trying to resolve the absolute path themselves.

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 11:49:22 -07:00
bb4a883584 worktree: refactor infer_backlink() to use *strbuf
This lays the groundwork for the next patch, which needs the backlink
returned from infer_backlink() as a `strbuf`. It seemed inefficient to
convert from `strbuf` to `char*` and back to `strbuf` again.

This refactors infer_backlink() to return an integer result and use a
pre-allocated `strbuf` for the inferred backlink path, replacing the
previous `char*` return type and improving efficiency.

Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 11:49:21 -07:00
58d8805de2 Merge branch 'es/worktree-repair-copied' into cw/worktrees-relative
* es/worktree-repair-copied:
  worktree: repair copied repository and linked worktrees
2024-10-08 11:49:13 -07:00
0f490d270a ls-remote: leakfix for not clearing server_options
Ensure `server_options` is properly cleared using `string_list_clear()`
in `builtin/ls-remote.c:cmd_ls_remote`.

Although we cannot yet enable `TEST_PASSES_SANITIZE_LEAK=true` for
`t/t5702-protocol-v2.sh` due to other existing leaks, this fix ensures
that "git-ls-remote" related server options tests pass the sanitize leak
check:

  ...
  ok 12 - server-options are sent when using ls-remote
  ok 13 - server-options from configuration are used by ls-remote
  ...

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 10:22:10 -07:00
148bc7bf4b fetch: respect --server-option when fetching multiple remotes
Fix an issue where server options specified via the command line
(`--server-option` or `-o`) were not sent when fetching from multiple
remotes using Git protocol v2.

To reproduce the issue with a repository containing multiple remotes:

  GIT_TRACE_PACKET=1 git -c protocol.version=2 fetch --server-option=demo --all

Observe that no server options are sent to any remote.

The root cause was identified in `builtin/fetch.c:fetch_multiple`, which
is invoked when fetching from more than one remote. This function forks
a `git-fetch` subprocess for each remote but did not include the
specified server options in the subprocess arguments.

This commit ensures that command-line specified server options are
properly passed to each subprocess. Relevant tests have been added.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 10:22:09 -07:00
094f78a16a transport.c:🤝 make use of server options from remote
Utilize the `server_options` from the corresponding remote during the
handshake in `transport.c` when Git protocol v2 is detected. This helps
initialize the `server_options` in `transport.h:transport` if no server
options are set for the transport (typically via `--server-option` or
`-o`).

While another potential place to incorporate server options from the
remote is in `transport.c:transport_get`, setting server options for a
transport using a protocol other than v2 could lead to unexpected errors
(see `transport.c:die_if_server_options`).

Relevant tests and documentation have been updated accordingly.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 10:22:08 -07:00
72da5cfb1c remote: introduce remote.<name>.serverOption configuration
Currently, server options for Git protocol v2 can only be specified via
the command line option "--server-option" or "-o", which is inconvenient
when users want to specify a list of default options to send. Therefore,
we are introducing a new configuration to hold a list of default server
options, akin to the `push.pushOption` configuration for push options.

Initially, I named the new configuration `fetch.serverOption` to align
with `push.pushOption`. However, after discussing with Patrick, it was
renamed to `remote.<name>.serverOption` as suggested, because:

1. Server options are designed to be server-specific, making it more
   logical to use a per-remote configuration.
2. Using "fetch." prefixed configurations in git-clone or git-ls-remote
   seems out of place and inconsistent in design.

The parsing logic for `remote.<name>.serverOption` also relies on
`transport.c:parse_transport_option`, similar to `push.pushOption`, and
they follow the same priority design:

1. Server options set in lower-priority configuration files (e.g.,
   /etc/gitconfig or $HOME/.gitconfig) can be overridden or unset in
   more specific repository configurations using an empty string.
2. Command-line specified server options take precedence over those from
   the configuration.

Server options from configuration are stored to the corresponding
`remote.h:remote` as a new field `server_options`.  The field will be
utilized in the subsequent commit to help initialize the
`server_options` of `transport.h:transport`.

And documentation have been updated accordingly.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Reported-by: Liu Zhongbo <liuzhongbo.6666@bytedance.com>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 10:22:07 -07:00
06708ce180 transport: introduce parse_transport_option() method
Add the `parse_transport_option()` method to parse the `push.pushOption`
configuration. This method will also be used in the next commit to
handle the new `remote.<name>.serverOption` configuration for setting
server options in Git protocol v2.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 10:22:06 -07:00
4154ed4108 docs: fix the maintain-git links in technical/platform-support
These links should point to `.html` files, not to `.txt` ones.

Compare also to 4945f046c7 (api docs: link to html version of
api-trace2, 2022-09-16).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07 15:34:16 -07:00
ecb5c4318c unpack-trees: detect mismatching number of cache-tree/index entries
Same as the preceding commit, we unconditionally dereference the index's
cache entries depending on the number of cache-tree entries, which can
lead to a segfault when the cache-tree is corrupted. Fix this bug.

This also makes t4058 pass with the leak sanitizer enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07 15:08:11 -07:00
2be7fc012e cache-tree: detect mismatching number of index entries
In t4058 we have some tests that exercise git-read-tree(1) when used
with a tree that contains duplicate entries. While the expectation is
that we fail, we ideally should fail gracefully without a segfault.

But that is not the case: we never check that the number of entries in
the cache-tree is less than or equal to the number of entries in the
index. This can lead to an out-of-bounds read as we unconditionally
access `istate->cache[idx]`, where `idx` is controlled by the number of
cache-tree entries and the current position therein. The result is a
segfault.

Fix this segfault by adding a sanity check for the number of index
entries before dereferencing them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07 15:08:11 -07:00
9f119599a6 cache-tree: refactor verification to return error codes
The function `cache_tree_verify()` will `BUG()` whenever it finds that
the cache-tree extension of the index is corrupt. The function is thus
inherently untestable because the resulting call to `abort()` will be
detected by our testing framework and labelled an error. And rightfully
so: it shouldn't ever be possible to hit bugs, as they should indicate a
programming error rather than corruption of on-disk state.

Refactor the function to instead return error codes. This also ensures
that the function can be used e.g. by git-fsck(1) without the whole
process dying. Furthermore, this refactoring plugs some memory leaks
when returning early by creating a common exit path.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07 15:08:11 -07:00
777489f9e0 Git 2.47
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-06 15:56:06 -07:00
5c97f7ba5c Merge tag 'l10n-2.47.0-rnd2' of https://github.com/git-l10n/git-po
l10n-2.47.0-rnd2

* tag 'l10n-2.47.0-rnd2' of https://github.com/git-l10n/git-po:
  l10n: Update German translation
  l10n: bg.po: Updated Bulgarian translation (5772t)
  l10n: vi: Updated translation for 2.47
  l10n: zh_TW: Git 2.47
  l10n: new lead for Catalan translation
  l10n: Update Catalan translation
  l10n: fr.po: 2.47.0
  l10n: zh_CN: updated translation for 2.47
  l10n: po-id for 2.47
  l10n: tr: Update Turkish translations for 2.47.0
  l10n: sv.po: Update Swedish translation
2024-10-06 11:14:12 -07:00
81e7bd6151 Merge branch 'l10n-de-2.47' of github.com:ralfth/git
* 'l10n-de-2.47' of github.com:ralfth/git:
  l10n: Update German translation
2024-10-06 12:06:21 +08:00
dde6096b16 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5772t)
2024-10-06 12:04:11 +08:00
93d2fa651f Merge branch 'catalan-247' of github.com:Softcatala/git-po
* 'catalan-247' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2024-10-06 12:03:46 +08:00
be0bd9669d Merge branch 'new-catalan-maintainer' of github.com:Softcatala/git-po
* 'new-catalan-maintainer' of github.com:Softcatala/git-po:
  l10n: new lead for Catalan translation
2024-10-06 12:03:08 +08:00
498f8cb54c Merge branch 'l10n/zh-TW/2024-10-05' of github.com:l10n-tw/git-po
* 'l10n/zh-TW/2024-10-05' of github.com:l10n-tw/git-po:
  l10n: zh_TW: Git 2.47
2024-10-06 11:39:29 +08:00
c1b5fb0f01 Merge branch 'tl/zh_CN_2.47.0_rnd' of github.com:dyrone/git
* 'tl/zh_CN_2.47.0_rnd' of github.com:dyrone/git:
  l10n: zh_CN: updated translation for 2.47
2024-10-06 11:39:03 +08:00
1ff21bff12 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation
2024-10-06 11:38:15 +08:00
fc49119c03 Merge branch 'fr_2.47.0_rnd1' of github.com:jnavila/git
* 'fr_2.47.0_rnd1' of github.com:jnavila/git:
  l10n: fr.po: 2.47.0
2024-10-06 11:37:56 +08:00
3a19f2d4fc Merge branch 'vi-2.47' of github.com:Nekosha/git-po
* 'vi-2.47' of github.com:Nekosha/git-po:
  l10n: vi: Updated translation for 2.47
2024-10-06 11:35:59 +08:00
770ea7bee7 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.47
2024-10-06 11:35:06 +08:00
f4110efbc3 l10n: Update German translation
Reviewed-by: Matthias Rüster <matthias.ruester@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2024-10-05 19:28:19 +02:00
d6aa1da141 l10n: bg.po: Updated Bulgarian translation (5772t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2024-10-05 13:21:30 +02:00
365a7ed9bd l10n: vi: Updated translation for 2.47
Signed-off-by: Vũ Tiến Hưng <newcomerminecraft@gmail.com>
2024-10-05 17:23:48 +07:00
507b364f44 l10n: zh_TW: Git 2.47
Co-authored-by: Lumynous <lumynou5.tw@gmail.com>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2024-10-05 15:47:12 +08:00
52d4a65070 l10n: new lead for Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2024-10-05 09:26:43 +02:00
cd0ef8b6e3 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2024-10-05 09:19:18 +02:00
90fe3800b9 Mostly there for 2.47 final
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-04 14:21:44 -07:00
2ab53b59ef Merge branch 'kn/osx-fsmonitor-with-submodules-fix'
macOS with fsmonitor daemon can hang forever when a submodule is
involved, which has been corrected.

* kn/osx-fsmonitor-with-submodules-fix:
  fsmonitor OSX: fix hangs for submodules
2024-10-04 14:21:43 -07:00
bffc417e7c Merge branch 'ak/doc-typofix'
Typofixes.

* ak/doc-typofix:
  Documentation: fix typos
  Documentation/config: fix typos
2024-10-04 14:21:43 -07:00
68ac04ad85 Merge branch 'tb/weak-sha1-for-tail-sum'
Build fix.

* tb/weak-sha1-for-tail-sum:
  hash.h: set NEEDS_CLONE_HELPER_UNSAFE in fallback mode
2024-10-04 14:21:42 -07:00
b1c6ed40cd Merge branch 'ps/reftable-concurrent-writes'
Test fix.

* ps/reftable-concurrent-writes:
  t0610: work around flaky test with concurrent writers
2024-10-04 14:21:42 -07:00
d30c2c4c53 Merge branch 'mh/w-unused-fix'
Buildfix.

* mh/w-unused-fix:
  utf8.h: squelch unused-parameter warnings with NO_ICONV
2024-10-04 14:21:41 -07:00
12841c449c Merge branch 'rs/archive-with-attr-pathspec-fix'
Message update.

* rs/archive-with-attr-pathspec-fix:
  archive: fix misleading error message
2024-10-04 14:21:40 -07:00
4861bbf85a Merge branch 'ak/typofix-2.46-maint'
Typofixes.

* ak/typofix-2.46-maint:
  perl: fix a typo
  mergetool: fix a typo
  reftable: fix a typo
  trace2: fix typos
2024-10-04 14:21:40 -07:00
5187f2b738 l10n: fr.po: 2.47.0
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2024-10-04 23:04:55 +02:00
5d5cd454e9 l10n: zh_CN: updated translation for 2.47
Signed-off-by: Teng Long <dyroneteng@gmail.com>
2024-10-05 03:32:47 +08:00
8895aca996 A bit more after 2.47-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-04 10:14:07 -07:00
b4efdfe165 Merge branch 'ds/read-cache-mempool-leakfix'
Leakfix.

* ds/read-cache-mempool-leakfix:
  read-cache: free threaded memory pool
2024-10-04 10:14:07 -07:00
b9b995e371 Merge branch 'jc/doc-discarding-stalled-topics'
Document that inactive topics are subject to be discarded.

* jc/doc-discarding-stalled-topics:
  howto-maintain-git: discarding inactive topics
2024-10-04 10:14:07 -07:00
441e0df980 Merge branch 'jk/test-lsan-improvements'
Usability improvements for running tests in leak-checking mode.

* jk/test-lsan-improvements:
  test-lib: check for leak logs after every test
  test-lib: show leak-sanitizer logs on --immediate failure
  test-lib: stop showing old leak logs
2024-10-04 10:14:06 -07:00
7355574a22 t0610: work around flaky test with concurrent writers
In 6241ce2170 (refs/reftable: reload locked stack when preparing
transaction, 2024-09-24) we have introduced a new test that exercises
how the reftable backend behaves with many concurrent writers all racing
with each other. This test was introduced after a couple of fixes in
this context that should make concurrent writes behave gracefully. As it
turns out though, Windows systems do not yet handle concurrent writes
properly, as we've got two reports for Cygwin and MinGW failing in this
newly added test.

The root cause of this is how we update the "tables.list" file: when
writing a new stack of tables we first write the data into a lockfile
and then rename that file into place. But Windows forbids us from doing
that rename when the target path is open for reading by another process.
And as the test races both readers and writers with each other we are
quite likely to hit this edge case.

This is not a regression: the logic didn't work before the mentioned
commit, and after the commit it performs well on Linux and macOS, and
the situation on Windows should have at least improved a bit. But the
test shows that we need to put more thought into how to make this work
properly there.

Work around the issue by disabling the test on Windows for now. While at
it, increase the locking timeout to address reported timeouts when using
either the address or memory sanitizer, which also tend to significantly
extend the runtime of this test.

This should be revisited after Git v2.47 is out.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-04 09:34:47 -07:00
435a6900d2 fsmonitor OSX: fix hangs for submodules
fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
the value with trailing "/." (as repo_get_git_dir(the_repository) on
Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
IS_OUTSIDE_CONE.

In this case, fsevent_callback() doesn't update cookie_list so that
fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
not invoked.

As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
that with_lock__mark_cookies_seen() should unlock, the whole daemon
hangs.

Remove trailing "/." from state->path_gitdir_watch.buf for submodules
and add a corresponding test in t7527-builtin-fsmonitor.sh. The test is
disabled for MINGW because hangs treated with this patch occur only for
Darwin and there is no simple way to terminate the win32 fsmonitor
daemon that hangs.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-04 08:01:27 -07:00
2179b5c831 reftable/basics: fix segfault when growing names array fails
When growing the `names` array fails we would end up with a `NULL`
pointer. This causes two problems:

  - We would run into a segfault because we try to free names that we
    have assigned to the array already.

  - We lose track of the old array and cannot free its contents.

Fix this issue by using a temporary variable. Like this we do not
clobber the old array that we tried to reallocate, which will remain
valid when a call to realloc(3P) fails.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-04 07:59:31 -07:00
cc64e172c4 l10n: po-id for 2.47
Update following components:

  * add-patch.c
  * apply.c
  * builtin/check-mailmap.c
  * builtin/checkout.c
  * builtin/commit.c
  * builtin/config.c
  * builtin/fetch.c
  * builtin/gc.c
  * builtin/multi-pack-index.c
  * builtin/refs.c
  * builtin/show-refs.c
  * builtin/sparse-checkout.c
  * builtin/submodule--helper.c
  * loose.c
  * midx-write.c
  * midx.c
  * object-file.c
  * ref-filter.c
  * refs/file-backend.c
  * scalar.c
  * setup.c
  * git-send-email.perl

Translate following new components:

  * t/unit-tests/unit-tests.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2024-10-04 08:55:32 +07:00
1164e270b5 diff: store graph prefix buf in git_graph struct
The diffopt output_prefix interface makes it the callback's job to
handle ownership of the memory it returns, keeping it valid while
callers are using it and then eventually freeing it when we are done
diffing.

In diff_output_prefix_callback() we handle this with a static strbuf,
effectively "leaking" it when the diff is done (but not triggering any
leak detectors because it's technically still reachable). This has not
been a big problem in practice, but it is a problem for libification:
two diffs running in the same process could stomp on each other's prefix
buffers.

Since we only need the strbuf when we are formatting graph padding, we
can give ownership of the strbuf to the git_graph struct, letting us
free it when that struct is no longer in use.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 14:22:22 -07:00
19752d9c91 diff: return line_prefix directly when possible
We may point our output_prefix callback to diff_output_prefix_callback()
in any of these cases:

  1. we have a user-provided line_prefix

  2. we have a graph prefix to show

  3. both (1) and (2)

The function combines the available elements into a strbuf and returns
its pointer.

In the case that we just have the line_prefix, though, there is no need
for the strbuf. We can return the string directly.

This is a minor optimization by itself, but also will allow us to clean
up some memory ownership issues on top.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 14:22:22 -07:00
436728fe9d diff: return const char from output_prefix callback
The diff_options structure has an output_prefix callback for returning a
prefix string, but it does so by returning a pointer to a strbuf.

This makes the interface awkward. There's no reason the callback should
need to use a strbuf, and it creates questions about whether the
ownership of the resulting buffer should be transferred to the caller
(it should not be, but a recent attempt to clean up this code led to a
double-free in some cases).

The one advantage we get is that the strbuf contains a ptr/len pair, so
we could in theory have a prefix with embedded NULs. But we can observe
that none of the existing callbacks would ever produce such a NUL (they
are usually just indentation or graph symbols, and even the
"--line-prefix" option takes a NUL-terminated string).

And anyway, only one caller (the one in log_tree_diff_flush) actually
looks at the strbuf length. In every other case we use a helper function
which discards the length and just returns the NUL-terminated string.

So let's just have the callback return a "const char *" pointer. It's up
to the callbacks themselves if they want to use a strbuf under the hood.
And now the caller in log_tree_diff_flush() can just use the helper
function along with everybody else. That lets us even simplify out the
function pointer check, since the helper returns an empty string
(technically this does mean we'll sometimes issue an empty fputs() call,
but I don't think this code path is hot enough to care about that).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 14:22:22 -07:00
2011bb4f34 diff: drop line_prefix_length field
The diff_options structure holds a line_prefix string and an associated
length. But the length is always just the strlen() of the NUL-terminated
string. Let's simplify the code by just storing the string pointer and
assuming it is NUL-terminated when we use it.

This will cause us to compute the string length in a few extra spots,
but I don't think any of these are particularly hot code paths.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 14:22:21 -07:00
8aeff2c287 line-log: use diff_line_prefix() instead of custom helper
Our local output_prefix() is exactly the same as the public
diff_line_prefix() function. Let's just use that one, saving us a little
bit of code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 14:22:21 -07:00
686f3337a6 perl: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 12:06:51 -07:00
2c1070c758 mergetool: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 12:06:51 -07:00
a54601c38b reftable: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 12:06:51 -07:00
23925a153d trace2: fix typos
Fix typos in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 12:06:50 -07:00
ddfb5bcfc6 Documentation: mention the amlog in howto/maintain-git.txt
Part of the maintainer's job is to keep up-to-date and publish the
'amlog' which stores a mapping between a patch's 'Message-Id' e-mail
header and the commit generated by applying said patch.

But our Documentation/howto/maintain-git.txt does not mention the amlog,
or the scripts which exist to help the maintainer keep the amlog
up-to-date.

(This bit me during the first integration round I did as interim
maintainer[1] involved a lot of manual clean-up. More recently it has
come up as part of a research effort to better understand a patch's
lifecycle on the list[2].)

Address this gap by briefly documenting the existence and purpose of the
'post-applypatch' hook in maintaining the amlog entries.

[1]: https://lore.kernel.org/git/Y19dnb2M+yObnftj@nand.local/
[2]: https://lore.kernel.org/git/CAJoAoZ=4ARuH3aHGe5yC_Xcnou_c396q_ZienYPY7YnEzZcyEg@mail.gmail.com/

Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 12:00:21 -07:00
3d6ab4177d doc: add a note about staggering of maintenance
Git maintenance tasks are staggered to a random minute of the hour per
client to avoid thundering herd issues. Updates the doc to add a note
about the same.

Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 11:23:09 -07:00
4638250b7b hash.h: set NEEDS_CLONE_HELPER_UNSAFE in fallback mode
Commit 253ed9ecff (hash.h: scaffolding for _unsafe hashing variants,
2024-09-26) introduced the concept of having two hash algorithms: a safe
and an unsafe one. When the Makefile knobs do not explicitly request an
unsafe one, we fall back to using the safe algorithm.

However, the fallback to do so forgot one case: we should inherit the
NEEDS_CLONE_HELPER flag from the safe variant. Failing to do so means
that we'll end up defining two clone functions (the algorithm specific
one, and the generic one that just calls memcpy). You'll see an error
like this:

  $ make OPENSSL_SHA1=1
  [...]
  sha1/openssl.h:46:29: error: redefinition of ‘openssl_SHA1_Clone’
     46 | #define platform_SHA1_Clone openssl_SHA1_Clone
        |                             ^~~~~~~~~~~~~~~~~~
  hash.h:83:40: note: in expansion of macro ‘platform_SHA1_Clone’
     83 | #    define platform_SHA1_Clone_unsafe platform_SHA1_Clone
        |                                        ^~~~~~~~~~~~~~~~~~~
  hash.h:101:33: note: in expansion of macro ‘platform_SHA1_Clone_unsafe’
    101 | #  define git_SHA1_Clone_unsafe platform_SHA1_Clone_unsafe
        |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~
  hash.h:133:20: note: in expansion of macro ‘git_SHA1_Clone_unsafe’
    133 | static inline void git_SHA1_Clone_unsafe(git_SHA_CTX_unsafe *dst,
        |                    ^~~~~~~~~~~~~~~~~~~~~
  sha1/openssl.h:37:20: note: previous definition of ‘openssl_SHA1_Clone’ with type ‘void(struct openssl_SHA1_CTX *, const struct openssl_SHA1_CTX *)’
     37 | static inline void openssl_SHA1_Clone(struct openssl_SHA1_CTX *dst,
        |                    ^~~~~~~~~~~~~~~~~~

This only matters when compiling with openssl as the "safe" variant,
since it's the only algorithm that requires a clone helper (and even
then, only if you are using openssl 3.0+). And you should never do that,
because it's not safe. But still, the invocation above used to work and
should continue to do so until we decide to require a
collision-detecting variant for the safe algorithm entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 11:18:36 -07:00
bebf0e2487 archive: fix misleading error message
The error message added by 296743a7ca (archive: load index before
pathspec checks, 2024-09-21) is misleading: unpack_trees() is not
touching the working tree at all here, but just loading a tree into
the index.  Correct it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 09:53:04 -07:00
fc5589d6c1 line-log: protect inner strbuf from free
The output_prefix() method in line-log.c may call a function pointer via
the diff_options struct. This function pointer returns a strbuf struct
and then its buffer is passed back. However, that implies that the
consumer is responsible to free the string. This is especially true
because the default behavior is to duplicate the empty string.

The existing functions used in the output_prefix pointer include:

 1. idiff_prefix_cb() in diff-lib.c. This returns the data pointer, so
    the value exists across multiple calls.

 2. diff_output_prefix_callback() in graph.c. This uses a static strbuf
    struct, so it reuses buffers across calls. These should not be
    freed.

 3. output_prefix_cb() in range-diff.c. This is similar to the
    diff-lib.c case.

In each case, we should not be freeing this buffer. We can convert the
output_prefix() function to return a const char pointer and stop freeing
the result.

This choice is essentially the opposite of what was done in 394affd46d
(line-log: always allocate the output prefix, 2024-06-07).

This was discovered via 'valgrind' while investigating a public report
of a bug in 'git log --graph -L' [1].

[1] https://github.com/git-for-windows/git/issues/5185

This issue would have been caught by the new test, when Git is compiled
with ASan to catch these double frees.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 09:07:16 -07:00
0d44bdb505 l10n: tr: Update Turkish translations for 2.47.0
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2024-10-03 06:55:07 +03:00
e03b2a2105 utf8.h: squelch unused-parameter warnings with NO_ICONV
Since DEVELOPER=YesPlease build enables -Wunused-parameter warnings
these days, the fallback definition for reencode_string_len() that
did not touch any of its parameters but one needs to be annotated
properly.

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 15:52:48 -07:00
35730302e9 reftable/basics: ban standard allocator functions
The reftable library uses pluggable allocators, which means that we
shouldn't ever use the standard allocator functions. But it is an easy
mistake to make to accidentally use e.g. free(3P) instead of the
reftable-specific `reftable_free()` function, and we do not have any
mechanism to detect this misuse right now.

Introduce a couple of macros that ban the standard allocators, similar
to how we do it in "banned.h".

Note that we do not ban the following two classes of functions:

  - Macros like `FREE_AND_NULL()` or `REALLOC_ARRAY()`. As those expand
    to code that contains already-banned functions we'd get a compiler
    error even without banning those macros explicitly.

  - Git-specific allocators like `xmalloc()` and friends. The primary
    reason is that there are simply too many of them, so we're rather
    aiming for best effort here. Furthermore, the eventual goal is to
    make them unavailable in the reftable library place by not pulling
    them in via "git-compat-utils.h" anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:56 -07:00
24e0ade65b reftable: introduce REFTABLE_FREE_AND_NULL()
We have several calls to `FREE_AND_NULL()` in the reftable library,
which of course uses free(3P). As the reftable allocators are pluggable
we should rather call the reftable specific function, which is
`reftable_free()`.

Introduce a new macro `REFTABLE_FREE_AND_NULL()` and adapt the callsites
accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:56 -07:00
daa59e9c43 reftable: fix calls to free(3P)
There are a small set of calls to free(3P) in the reftable library. As
the reftable allocators are pluggable we should rather call the reftable
specific function, which is `reftable_free()`.

Convert the code accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:56 -07:00
12b9078066 reftable: handle trivial allocation failures
Handle trivial allocation failures in the reftable library and its unit
tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
51afc709dc reftable/tree: handle allocation failures
The tree interfaces of the reftable library handle both insertion and
searching of tree nodes with a single function, where the behaviour is
altered between the two via an `insert` bit. This makes it quit awkward
to handle allocation failures because on inserting we'd have to check
for `NULL` pointers and return an error, whereas on searching entries we
don't have to handle it as an allocation error.

Split up concerns of this function into two separate functions, one for
inserting entries and one for searching entries. This makes it easy for
us to check for allocation errors as `tree_insert()` should never return
a `NULL` pointer now. Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
d0501c8c9d reftable/pq: handle allocation failures when adding entries
Handle allocation failures when adding entries to the pqueue. Adapt its
only caller accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
2d5dbb37b2 reftable/block: handle allocation failures
Handle allocation failures in `block_writer_init()` and
`block_reader_init()`. This requires us to bubble up error codes into
`writer_reinit_block_writer()`. Adapt call sites accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
cd6a47167e reftable/blocksource: handle allocation failures
Handle allocation failures in the blocksource code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
cc6a9af5d7 reftable/iter: handle allocation failures when creating indexed table iter
Handle allocation failures in `new_indexed_table_ref_iter()`. While at
it, rename the function to match our coding style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
5b67cc6477 reftable/stack: handle allocation failures in auto compaction
Handle allocation failures in `reftable_stack_auto_compact()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
694af039f5 reftable/stack: handle allocation failures in stack_compact_range()
Handle allocation failures in `stack_compact_range()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
5dbe266212 reftable/stack: handle allocation failures in reftable_new_stack()
Handle allocation failures in `reftable_new_stack()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
dce75e15ff reftable/stack: handle allocation failures on reload
Handle allocation failures in `reftable_stack_reload_once()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
0a8372f509 reftable/reader: handle allocation failures in reader_init_iter()
Handle allocation failures in `reader_init_iter()`. This requires us to
also adapt `reftable_reader_init_*_iterator()` to bubble up the new
error codes. Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
18da600293 reftable/reader: handle allocation failures for unindexed reader
Handle allocation failures when creating unindexed readers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
802c0646ac reftable/merged: handle allocation failures in merged_table_init_iter()
Handle allocation failures in `merged_table_init_iter()`. While at it,
merge `merged_iter_init()` into the function. It only has a single
caller and merging them makes it easier to handle allocation failures
consistently.

This change also requires us to adapt `reftable_stack_init_*_iterator()`
to bubble up the new error codes of `merged_table_iter_init()`. Adapt
callsites accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
74d1c18757 reftable/writer: handle allocation failures in reftable_new_writer()
Handle allocation failures in `reftable_new_writer()`. Adapt the
function to return an error code to return such failures. While at it,
rename it to match our code style as we have to touch up every callsite
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
b680af2dba reftable/writer: handle allocation failures in writer_index_hash()
Handle allocation errors in `writer_index_hash()`. Adjust its only
caller in `reftable_writer_add_ref()` accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
31f5b972e0 reftable/record: handle allocation failures when decoding records
Handle allocation failures when decoding records. While at it, fix some
error codes to be `REFTABLE_FORMAT_ERROR`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
ea194f9c46 reftable/record: handle allocation failures on copy
Handle allocation failures when copying records. While at it, convert
from `xstrdup()` to `reftable_strdup()`. Adapt callsites to check for
error codes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
eef7bcdafe reftable/basics: handle allocation failures in parse_names()
Handle allocation failures in `parse_names()` by returning `NULL` in
case any allocation fails. While at it, refactor the function to return
the array directly instead of assigning it to an out-pointer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
6593e147d3 reftable/basics: handle allocation failures in reftable_calloc()
Handle allocation failures in `reftable_calloc()`.

While at it, remove our use of `st_mult()` that would cause us to die on
an overflow. From the caller's point of view there is not much of a
difference between arguments that are too large to be multiplied and a
request that is too big to handle by the allocator: in both cases the
allocation cannot be fulfilled. And in neither of these cases do we want
the reftable library to die.

While we could use `unsigned_mult_overflows()` to handle the overflow
gracefully, we instead open-code it to further our goal of converting
the reftable codebase to become a standalone library that can be reused
by external projects.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
7f0969febf reftable: introduce reftable_strdup()
The reftable library provides the ability to swap out allocators. There
is a gap here though, because we continue to use `xstrdup()` even in the
case where all the other allocators have been swapped out.

Introduce `reftable_strdup()` that uses `reftable_malloc()` to do the
allocation.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
a5a15a4514 reftable/basics: merge "publicbasics" into "basics"
The split between "basics" and "publicbasics" is somewhat arbitrary and
not in line with how we typically structure code in the reftable
library. While we do indeed split up headers into a public and internal
part, we don't do that for the compilation unit itself. Furthermore, the
declarations for "publicbasics.c" are in "reftable-malloc.h", which
isn't in line with our naming schema, either.

Fix these inconsistencies by:

  - Merging "publicbasics.c" into "basics.c".

  - Renaming "reftable-malloc.h" to "reftable-basics.h" as the public
    header.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
bcd5a4059a reftable/error: introduce out-of-memory error code
The reftable library does not use the same memory allocation functions
as the rest of the Git codebase. Instead, as the reftable library is
supposed to be usable as a standalone library without Git, it provides a
set of pluggable memory allocators.

Compared to `xmalloc()` and friends these allocators are _not_ expected
to die when an allocation fails. This design choice is concious, as a
library should leave it to its caller to handle any kind of error. While
it is very likely that the caller cannot really do much in the case of
an out-of-memory situation anyway, we are not the ones to make that
decision.

Curiously though, we never handle allocation errors even though memory
allocation functions are allowed to fail. And as we do not plug in Git's
memory allocator via `reftable_set_alloc()` either the consequence is
that we'd instead segfault as soon as we run out of memory.

While the easy fix would be to wire up `xmalloc()` and friends, it
would only fix the usage of the reftable library in Git itself. Other
users like libgit2, which is about to revive its efforts to land a
backend for reftables, wouldn't be able to benefit from this solution.

Instead, we are about to do it the hard way: adapt all allocation sites
to perform error checking. Introduce a new error code for out-of-memory
errors that we will wire up in subsequent steps.

This commit also serves as the motivator for all the remaining steps in
this series such that we do not have to repeat the same arguments in
every single subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:50 -07:00
111e864d69 Git 2.47-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:46:27 -07:00
ead0a050e2 Merge branch 'tb/weak-sha1-for-tail-sum'
The checksum at the tail of files are now computed without
collision detection protection.  This is safe as the consumer of
the information to protect itself from replay attacks checks for
hash collisions independently.

* tb/weak-sha1-for-tail-sum:
  csum-file.c: use unsafe SHA-1 implementation when available
  Makefile: allow specifying a SHA-1 for non-cryptographic uses
  hash.h: scaffolding for _unsafe hashing variants
  sha1: do not redefine `platform_SHA_CTX` and friends
  pack-objects: use finalize_object_file() to rename pack/idx/etc
  finalize_object_file(): implement collision check
  finalize_object_file(): refactor unlink_or_warn() placement
  finalize_object_file(): check for name collision before renaming
2024-10-02 07:46:27 -07:00
59ee4f7013 Merge branch 'jk/http-leakfixes'
Leakfixes.

* jk/http-leakfixes: (28 commits)
  http-push: clean up local_refs at exit
  http-push: clean up loose request when falling back to packed
  http-push: clean up objects list
  http-push: free xml_ctx.cdata after use
  http-push: free remote_ls_ctx.dentry_name
  http-push: free transfer_request strbuf
  http-push: free transfer_request dest field
  http-push: free curl header lists
  http-push: free repo->url string
  http-push: clear refspecs before exiting
  http-walker: free fake packed_git list
  remote-curl: free HEAD ref with free_one_ref()
  http: stop leaking buffer in http_get_info_packs()
  http: call git_inflate_end() when releasing http_object_request
  http: fix leak of http_object_request struct
  http: fix leak when redacting cookies from curl trace
  transport-helper: fix leak of dummy refs_list
  fetch-pack: clear pack lockfiles list
  fetch: free "raw" string when shrinking refspec
  transport-helper: fix strbuf leak in push_refs_with_push()
  ...
2024-10-02 07:46:26 -07:00
365529e1ea Merge branch 'ps/leakfixes-part-7'
More leak-fixes.

* ps/leakfixes-part-7: (23 commits)
  diffcore-break: fix leaking filespecs when merging broken pairs
  revision: fix leaking parents when simplifying commits
  builtin/maintenance: fix leak in `get_schedule_cmd()`
  builtin/maintenance: fix leaking config string
  promisor-remote: fix leaking partial clone filter
  grep: fix leaking grep pattern
  submodule: fix leaking submodule ODB paths
  trace2: destroy context stored in thread-local storage
  builtin/difftool: plug several trivial memory leaks
  builtin/repack: fix leaking configuration
  diffcore-order: fix leaking buffer when parsing orderfiles
  parse-options: free previous value of `OPTION_FILENAME`
  diff: fix leaking orderfile option
  builtin/pull: fix leaking "ff" option
  dir: fix off by one errors for ignored and untracked entries
  builtin/submodule--helper: fix leaking remote ref on errors
  t/helper: fix leaking subrepo in nested submodule config helper
  builtin/submodule--helper: fix leaking error buffer
  builtin/submodule--helper: clear child process when not running it
  submodule: fix leaking update strategy
  ...
2024-10-02 07:46:26 -07:00
9293a93186 Merge branch 'ds/sparse-checkout-expansion-advice'
When "git sparse-checkout disable" turns a sparse checkout into a
regular checkout, the index is fully expanded.  This totally
expected behaviour however had an "oops, we are expanding the
index" advice message, which has been corrected.

* ds/sparse-checkout-expansion-advice:
  sparse-checkout: disable advice in 'disable'
2024-10-02 07:46:25 -07:00
5e6f359f6b read-cache: free threaded memory pool
In load_cache_entries_threaded(), each thread allocates its own memory
pool. This pool needs to be cleaned up while closing the threads down,
or it will be leaked.

This ce_mem_pool pointer could theoretically be converted to an inline
copy of the struct, but the use of a pointer helps with existing lazy-
initialization logic. Adjusting that behavior only to avoid this pointer
would be a much bigger change.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-01 11:51:15 -07:00
e9356ba3ea another batch after 2.47-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 16:16:17 -07:00
92198dd335 Merge branch 'ps/includeif-onbranch-cornercase-fix'
"git --git-dir=nowhere cmd" failed to properly notice that it
wasn't in any repository while processing includeIf.onbranch
configuration and instead crashed.

* ps/includeif-onbranch-cornercase-fix:
  config: fix evaluating "onbranch" with nonexistent git dir
  t1305: exercise edge cases of "onbranch" includes
2024-09-30 16:16:17 -07:00
4251403327 Merge branch 'ds/background-maintenance-with-credential'
Background tasks "git maintenance" runs may need to use credential
information when going over the network, but a credential helper
may work only in an interactive environment, and end up blocking a
scheduled task waiting for UI.  Credential helpers can now behave
differently when they are not running interactively.

* ds/background-maintenance-with-credential:
  scalar: configure maintenance during 'reconfigure'
  maintenance: add custom config to background jobs
  credential: add new interactive config option
2024-09-30 16:16:16 -07:00
c58eee0928 Merge branch 'rs/archive-with-attr-pathspec-fix'
"git archive" with pathspec magic that uses the attribute
information did not work well, which has been corrected.

* rs/archive-with-attr-pathspec-fix:
  archive: load index before pathspec checks
2024-09-30 16:16:16 -07:00
1a898cee01 Merge branch 'rs/commit-graph-ununleak'
Code clean-up.

* rs/commit-graph-ununleak:
  commit-graph: remove unnecessary UNLEAK
2024-09-30 16:16:15 -07:00
22baac8892 Merge branch 'pw/submodule-process-sigpipe'
When a subprocess to work in a submodule spawned by "git submodule"
fails with SIGPIPE, the parent Git process caught the death of it,
but gave a generic "failed to work in that submodule", which was
misleading.  We now behave as if the parent got SIGPIPE and die.

* pw/submodule-process-sigpipe:
  submodule status: propagate SIGPIPE
2024-09-30 16:16:15 -07:00
ab68c70a8b Merge branch 'ps/reftable-concurrent-writes'
Give timeout to the locking code to write to reftable.

* ps/reftable-concurrent-writes:
  refs/reftable: reload locked stack when preparing transaction
  reftable/stack: allow locking of outdated stacks
  refs/reftable: introduce "reftable.lockTimeout"
2024-09-30 16:16:14 -07:00
66893a14d0 builtin/send-pack: fix leaking list of push options
The list of push options is leaking. Plug the leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:09 -07:00
a6c30623d7 remote: fix leaking push reports
The push reports that report failures to the user when pushing a
reference leak in several places. Plug these leaks by introducing a new
function `ref_push_report_free()` that frees the list of reports and
call it as required. While at it, fix a trivially leaking error string
in the vicinity.

These leaks get hit in t5411, but plugging them does not make the whole
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:08 -07:00
12f0fb9538 t/helper: fix leaks in proc-receive helper
Fix trivial leaks in the proc-receive helpe.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:08 -07:00
2f0ee051dd pack-write: fix return parameter of write_rev_file_order()
While the return parameter of `write_rev_file_order()` is a string
constant, the function may indeed return an allocated string when its
first parameter is a `NULL` pointer. This makes for a confusing calling
convention, where callers need to be aware of these intricate ownership
rules and cast away the constness to free the string in some cases.

Adapt the function and its caller `write_rev_file()` to always return an
allocated string and adapt callers to always free the return value.

Note that this requires us to also adapt `rename_tmp_packfile()`, which
compares the pointers to packfile data with each other. Now that the
path of the reverse index file gets allocated unconditionally the check
will always fail. This is fixed by using strcmp(3P) instead, which also
feels way less fragile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:08 -07:00
6512d6e473 revision: fix leaking saved parents
The `saved_parents` slab is used by `--full-diff` to save parents of a
commit which we are about to rewrite. We do not release its contents
once it's not used anymore, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:07 -07:00
4cc2cee5ac revision: fix memory leaks when rewriting parents
Both `rewrite_parents()` and `remove_duplicate_parents()` may end up
dropping some parents from a commit without freeing the respective
`struct commit_list` items. This causes a bunch of memory leaks. Plug
these.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:07 -07:00
9d4855eef3 midx-write: fix leaking buffer
The buffer used to compute the final MIDX name is never released. Plug
this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:07 -07:00
7f97266ee1 pack-bitmap-write: fix leaking OID array
Fix a leaking OID array in `write_pseudo_merges()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:07 -07:00
d0ab6630a7 pseudo-merge: fix leaking strmap keys
When creating a new pseudo-merge group we collect a set of matchnig
commits and put them into a string map. This strmap is initialized such
that it does not allocate its keys, and instead we try to pass ownership
of the keys to it via `strmap_put()`. This isn't how it works though:
the strmap will never try to release these keys, and consequently they
end up leaking.

Fix this leak by initializing the strmap as duplicating its keys and not
trying to hand over ownership.

The leak is exposed by t5333, but plugging it does not yet make the full
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:06 -07:00
55e563a90c pseudo-merge: fix various memory leaks
Fix various memory leaks hit by the pseudo-merge machinery. These leaks
are exposed by t5333, but plugging them does not yet make the whole test
suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:06 -07:00
5ce08ed4fb line-log: fix several memory leaks
As described in "line-log.c" itself, the code is "leaking like a sieve".
These leaks are all of rather trivial nature, so this commit plugs them
without going too much into details for each of those leaks.

The leaks are hit by t4211, but plugging them alone does not make the
full test suite pass. The remaining leaks are unrelated to the line-log
subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:06 -07:00
a5aecb2cdc diff: improve lifecycle management of diff queues
The lifecycle management of diff queues is somewhat confusing:

  - For most of the part this can be attributed to `DIFF_QUEUE_CLEAR()`,
    which does not release any memory but rather initializes the queue,
    only. This is in contrast to our common naming schema, where
    "clearing" means that we release underlying memory and then
    re-initialize the data structure such that it is ready to use.

  - A second offender is `diff_free_queue()`, which does not free the
    queue structure itself. It is rather a release-style function.

Refactor the code to make things less confusing. `DIFF_QUEUE_CLEAR()` is
replaced by `DIFF_QUEUE_INIT` and `diff_queue_init()`, while
`diff_free_queue()` is replaced by `diff_queue_release()`. While on it,
adapt callsites where we call `DIFF_QUEUE_CLEAR()` with the intent to
release underlying memory to instead call `diff_queue_clear()` to fix
memory leaks.

This memory leak is exposed by t4211, but plugging it alone does not
make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:05 -07:00
fdf972a9df builtin/revert: fix leaking gpg_sign and strategy config
We leak the config values when `gpg_sign` or `strategy` options are
being overridden via the command line. To fix this we need to free the
old value, which requires us to figure out whether the value was changed
via an option in the first place. The easy way to do this, which is to
initialize local variables with `NULL`, doesn't work because we cannot
tell the case where the user has passed e.g. `--no-gpg-sign`. Instead,
we use a sentinel value for both values that we can compare against to
check whether the user has passed the option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:05 -07:00
58888c0401 t/helper: fix leaking repository in partial-clone helper
We initialize but never clear a repository in the partial-clone test
helper. Plug this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:05 -07:00
6361dea6e8 builtin/clone: fix leaking repo state when cloning with bundle URIs
When cloning with bundle URIs we re-initialize `the_repository` after
having fetched the bundle. This causes a bunch of memory leaks though
because we do not release its previous state.

These leaks can be plugged by calling `repo_clear()` before we call
`repo_init()`. But this causes another issue because the remote that we
used is tied to the lifetime of the repository's remote state, which
would also get released. We thus have to make sure that it does not get
free'd under our feet.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:04 -07:00
a0f2a2f581 builtin/pack-redundant: fix various memory leaks
There are various different memory leaks in git-pack-redundant(1),
mostly caused by not even trying to free allocated memory. Fix them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:04 -07:00
64fe1e4a8c builtin/stash: fix leaking pathspec_from_file
The `OPT_PATHSPEC_FROM_FILE()` option maps to `OPT_FILENAME()`, which we
know will always allocate memory when passed. We never free the memory
though, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:04 -07:00
5cca114973 submodule: fix leaking submodule entry list
The submodule entry list returned by `submodules_of_tree()` is never
completely free'd by its only caller. Introduce a new function that
free's the list for us and call it.

While at it, also fix the leaking `branch_point` string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:03 -07:00
666643fa89 wt-status: fix leaking buffer with sparse directories
When hitting a sparse directory in `wt_status_collect_changes_initial()`
we use a `struct strbuf` to assemble the directory's name. We never free
that buffer though, causing a memory leak.

Fix the leak by releasing the buffer. While at it, move the buffer
outside of the loop and reset it to save on some wasteful allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:03 -07:00
c75841687b shell: fix leaking strings
There are two memory leaks in "shell.c". The first one in `run_shell()`
is trivial and fixed without further explanation. The second one in
`cmd_main()` happens because we overwrite the `prog` variable, which
contains an allocated string. In fact though, the memory pointed to by
that variable is still in use because we use `split_cmdline()`, which
may create pointers into the middle of that string. But as we do not
have a direct pointer to the head of the allocated string anymore, we
get a complaint by the leak checker.

Address this by not overwriting the `prog` pointer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:03 -07:00
d607bd8816 scalar: fix leaking repositories
In the scalar code we iterate through multiple repositories,
initializing each of them. We never clear them though, causing memory
leaks. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:02 -07:00
a69d120c07 read-cache: fix leaking hash context in do_write_index()
When writing an index with the EOIE extension we allocate a separate
hash context. We never free that context though, causing a memory leak.
Plug it.

This leak is exposed by t9210, but plugging it alone does not make the
whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:02 -07:00
9a48fc1da2 builtin/annotate: fix leaking args vector
We're leaking the args vector in git-annotate(1) because we never clear
it. Fixing it isn't as easy as calling `strvec_clear()` though because
calling `cmd_blame()` will cause the underlying array to be modified.
Instead, we also need to pass a shallow copy of the argv array to the
function.

Do so to plug the memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30 11:23:02 -07:00
a5031223cd Merge branch 'jk/http-leakfixes' into ps/leakfixes-part-8
* jk/http-leakfixes: (28 commits)
  http-push: clean up local_refs at exit
  http-push: clean up loose request when falling back to packed
  http-push: clean up objects list
  http-push: free xml_ctx.cdata after use
  http-push: free remote_ls_ctx.dentry_name
  http-push: free transfer_request strbuf
  http-push: free transfer_request dest field
  http-push: free curl header lists
  http-push: free repo->url string
  http-push: clear refspecs before exiting
  http-walker: free fake packed_git list
  remote-curl: free HEAD ref with free_one_ref()
  http: stop leaking buffer in http_get_info_packs()
  http: call git_inflate_end() when releasing http_object_request
  http: fix leak of http_object_request struct
  http: fix leak when redacting cookies from curl trace
  transport-helper: fix leak of dummy refs_list
  fetch-pack: clear pack lockfiles list
  fetch: free "raw" string when shrinking refspec
  transport-helper: fix strbuf leak in push_refs_with_push()
  ...
2024-09-30 11:22:21 -07:00
674e46fdd5 Merge branch 'ps/leakfixes-part-7' into ps/leakfixes-part-8
* ps/leakfixes-part-7: (23 commits)
  diffcore-break: fix leaking filespecs when merging broken pairs
  revision: fix leaking parents when simplifying commits
  builtin/maintenance: fix leak in `get_schedule_cmd()`
  builtin/maintenance: fix leaking config string
  promisor-remote: fix leaking partial clone filter
  grep: fix leaking grep pattern
  submodule: fix leaking submodule ODB paths
  trace2: destroy context stored in thread-local storage
  builtin/difftool: plug several trivial memory leaks
  builtin/repack: fix leaking configuration
  diffcore-order: fix leaking buffer when parsing orderfiles
  parse-options: free previous value of `OPTION_FILENAME`
  diff: fix leaking orderfile option
  builtin/pull: fix leaking "ff" option
  dir: fix off by one errors for ignored and untracked entries
  builtin/submodule--helper: fix leaking remote ref on errors
  t/helper: fix leaking subrepo in nested submodule config helper
  builtin/submodule--helper: fix leaking error buffer
  builtin/submodule--helper: clear child process when not running it
  submodule: fix leaking update strategy
  ...
2024-09-30 11:22:10 -07:00
4de34a4233 l10n: sv.po: Update Swedish translation
Also fix issue reported by Anders Jonsson
<anders.jonsson@norsjovallen.se>.

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2024-09-28 15:45:19 +01:00
1b9e9be8b4 csum-file.c: use unsafe SHA-1 implementation when available
Update hashwrite() and friends to use the unsafe_-variants of hashing
functions, calling for e.g., "the_hash_algo->unsafe_update_fn()" instead
of "the_hash_algo->update_fn()".

These callers only use the_hash_algo to produce a checksum, which we
depend on for data integrity, but not for cryptographic purposes, so
these callers are safe to use the unsafe (non-collision detecting) SHA-1
implementation.

To time this, I took a freshly packed copy of linux.git, and ran the
following with and without the OPENSSL_SHA1_UNSAFE=1 build-knob. Both
versions were compiled with -O3:

    $ git for-each-ref --format='%(objectname)' refs/heads refs/tags >in
    $ valgrind --tool=callgrind ~/src/git/git-pack-objects \
        --revs --stdout --all-progress --use-bitmap-index <in >/dev/null

Without OPENSSL_SHA1_UNSAFE=1 (that is, using the collision-detecting
SHA-1 implementation for both cryptographic and non-cryptographic
purposes), we spend a significant amount of our instruction count in
hashwrite():

    $ callgrind_annotate --inclusive=yes | grep hashwrite | head -n1
    159,998,868,413 (79.42%)  /home/ttaylorr/src/git/csum-file.c:hashwrite [/home/ttaylorr/src/git/git-pack-objects]

, and the resulting "clone" takes 19.219 seconds of wall clock time,
18.94 seconds of user time and 0.28 seconds of system time.

Compiling with OPENSSL_SHA1_UNSAFE=1, we spend ~60% fewer instructions
in hashwrite():

    $ callgrind_annotate --inclusive=yes | grep hashwrite | head -n1
     59,164,001,176 (58.79%)  /home/ttaylorr/src/git/csum-file.c:hashwrite [/home/ttaylorr/src/git/git-pack-objects]

, and generate the resulting "clone" much faster, in only 11.597 seconds
of wall time, 11.37 seconds of user time, and 0.23 seconds of system
time, for a ~40% speed-up.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:47 -07:00
06c92dafb8 Makefile: allow specifying a SHA-1 for non-cryptographic uses
Introduce _UNSAFE variants of the OPENSSL_SHA1, BLK_SHA1, and
APPLE_COMMON_CRYPTO_SHA1 compile-time knobs which indicate which SHA-1
implementation is to be used for non-cryptographic uses.

There are a couple of small implementation notes worth mentioning:

  - There is no way to select the collision detecting SHA-1 as the
    "fast" fallback, since the fast fallback is only for
    non-cryptographic uses, and is meant to be faster than our
    collision-detecting implementation.

  - There are no similar knobs for SHA-256, since no collision attacks
    are presently known and thus no collision-detecting implementations
    actually exist.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:47 -07:00
253ed9ecff hash.h: scaffolding for _unsafe hashing variants
Git's default SHA-1 implementation is collision-detecting, which hardens
us against known SHA-1 attacks against Git objects. This makes Git
object writes safer at the expense of some speed when hashing through
the collision-detecting implementation, which is slower than
non-collision detecting alternatives.

Prepare for loading a separate "unsafe" SHA-1 implementation that can be
used for non-cryptographic purposes, like computing the checksum of
files that use the hashwrite() API.

This commit does not actually introduce any new compile-time knobs to
control which implementation is used as the unsafe SHA-1 variant, but
does add scaffolding so that the "git_hash_algo" structure has five new
function pointers which are "unsafe" variants of the five existing
hashing-related function pointers:

  - git_hash_init_fn unsafe_init_fn
  - git_hash_clone_fn unsafe_clone_fn
  - git_hash_update_fn unsafe_update_fn
  - git_hash_final_fn unsafe_final_fn
  - git_hash_final_oid_fn unsafe_final_oid_fn

The following commit will introduce compile-time knobs to specify which
SHA-1 implementation is used for non-cryptographic uses.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:47 -07:00
4c61a1d040 sha1: do not redefine platform_SHA_CTX and friends
Our in-tree SHA-1 wrappers all define platform_SHA_CTX and related
macros to point at the opaque "context" type, init, update, and similar
functions for each specific implementation.

In hash.h, we use these platform_ variables to set up the function
pointers for, e.g., the_hash_algo->init_fn(), etc.

But while these header files have a header-specific macro that prevents
them declaring their structs / functions multiple times, they
unconditionally define the platform variables, making it impossible to
load multiple SHA-1 implementations at once.

As a prerequisite for loading a separate SHA-1 implementation for
non-cryptographic uses, only define the platform_ variables if they have
not already been defined.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:47 -07:00
c177d3dc50 pack-objects: use finalize_object_file() to rename pack/idx/etc
In most places that write files to the object database (even packfiles
via index-pack or fast-import), we use finalize_object_file(). This
prefers link()/unlink() over rename(), because it means we will prefer
data that is already in the repository to data that we are newly
writing.

We should do the same thing in pack-objects. Even though we don't think
of it as accepting outside data (and thus not being susceptible to
collision attacks), in theory a determined attacker could present just
the right set of objects to cause an incremental repack to generate
a pack with their desired hash.

This has some test and real-world fallout, as seen in the adjustment to
t5303 below. That test script assumes that we can "fix" corruption by
repacking into a good state, including when the pack generated by that
repack operation collides with a (corrupted) pack with the same hash.
This violates our assumption from the previous adjustments to
finalize_object_file() that if we're moving a new file over an existing
one, that since their checksums match, so too must their contents.

This makes "fixing" corruption like this a more explicit operation,
since the test (and users, who may fix real-life corruption using a
similar technique) must first move the broken contents out of the way.

Note also that we now call adjust_shared_perm() twice. We already call
adjust_shared_perm() in stage_tmp_packfiles(), and now call it again in
finalize_object_file(). This is somewhat wasteful, but cleaning up the
existing calls to adjust_shared_perm() is tricky (because sometimes
we're writing to a tmpfile, and sometimes we're writing directly into
the final destination), so let's tolerate some minor waste until we can
more carefully clean up the now-redundant calls.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:47 -07:00
b1b8dfde69 finalize_object_file(): implement collision check
We've had "FIXME!!! Collision check here ?" in finalize_object_file()
since aac1794132 (Improve sha1 object file writing., 2005-05-03). That
is, when we try to write a file with the same name, we assume the
on-disk contents are the same and blindly throw away the new copy.

One of the reasons we never implemented this is because the files it
moves are all named after the cryptographic hash of their contents
(either loose objects, or packs which have their hash in the name these
days). So we are unlikely to see such a collision by accident. And even
though there are weaknesses in sha1, we assume they are mitigated by our
use of sha1dc.

So while it's a theoretical concern now, it hasn't been a priority.
However, if we start using weaker hashes for pack checksums and names,
this will become a practical concern. So in preparation, let's actually
implement a byte-for-byte collision check.

The new check will cause the write of new differing content to be a
failure, rather than a silent noop, and we'll retain the temporary file
on disk. If there's no collision present, we'll clean up the temporary
file as usual after either rename()-ing or link()-ing it into place.

Note that this may cause some extra computation when the files are in
fact identical, but this should happen rarely.

Loose objects are exempt from this check, and the collision check may be
skipped by calling the _flags variant of this function with the
FOF_SKIP_COLLISION_CHECK bit set. This is done for a couple of reasons:

  - We don't treat the hash of the loose object file's contents as a
    checksum, since the same loose object can be stored using different
    bytes on disk (e.g., when adjusting core.compression, using a
    different version of zlib, etc.).

    This is fundamentally different from cases where
    finalize_object_file() is operating over a file which uses the hash
    value as a checksum of the contents. In other words, a pair of
    identical loose objects can be stored using different bytes on disk,
    and that should not be treated as a collision.

  - We already use the path of the loose object as its hash value /
    object name, so checking for collisions at the content level doesn't
    add anything.

    Adding a content-level collision check would have to happen at a
    higher level than in finalize_object_file(), since (avoiding race
    conditions) writing an object loose which already exists in the
    repository will prevent us from even reaching finalize_object_file()
    via the object freshening code.

    There is a collision check in index-pack via its `check_collision()`
    function, but there isn't an analogous function in unpack-objects,
    which just feeds the result to write_object_file().

    So skipping the collision check here does not change for better or
    worse the hardness of loose object writes.

As a small note related to the latter bullet point above, we must teach
the tmp-objdir routines to similarly skip the content-level collision
checks when calling migrate_one() on a loose object file, which we do by
setting the FOF_SKIP_COLLISION_CHECK bit when we are inside of a loose
object shard.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:47 -07:00
9ca7c2c13b finalize_object_file(): refactor unlink_or_warn() placement
As soon as we've tried to link() a temporary object into place, we then
unlink() the tempfile immediately, whether we were successful or not.

For the success case, this is because we no longer need the old file
(it's now linked into place).

For the error case, there are two outcomes. Either we got EEXIST, in
which case we consider the collision to be a noop. Or we got a system
error, in which we case we are just cleaning up after ourselves.

Using a single line for all of these cases has some problems:

  - in the error case, our unlink() may clobber errno, which we use in
    the error message

  - for the collision case, there's a FIXME that indicates we should do
    a collision check. In preparation for implementing that, we'll need
    to actually hold on to the file.

Split these three cases into their own calls to unlink_or_warn(). This
is more verbose, but lets us do the right thing in each case.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:46 -07:00
d1b44bb764 finalize_object_file(): check for name collision before renaming
We prefer link()/unlink() to rename() for object files, with the idea
that we should prefer the data that is already on disk to what is
incoming. But we may fall back to rename() if the user has configured us
to do so, or if the filesystem seems not to support cross-directory
links. This loses the "prefer what is on disk" property.

We can mitigate this somewhat by trying to stat() the destination
filename before doing the rename. This is racy, since the object could
be created between the stat() and rename() calls. But in practice it is
expanding the definition of "what is already on disk" to be the point
that the function is called. That is enough to deal with any potential
attacks where an attacker is trying to collide hashes with what's
already in the repository.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 11:27:46 -07:00
12dfc2475c diffcore-break: fix leaking filespecs when merging broken pairs
When merging file pairs after they have been broken up we queue a new
file pair and discard the broken-up ones. The newly-queued file pair
reuses one filespec of the broken up pairs each, where the respective
other filespec gets discarded. But we only end up freeing the filespec's
data, not the filespec itself, and thus leak memory.

Fix these leaks by using `free_filespec()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:37 -07:00
fa016423c7 revision: fix leaking parents when simplifying commits
When simplifying commits, e.g. because they are treesame with their
parents, we unset the commit's parent pointers but never free them. Plug
the resulting memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:37 -07:00
b6c3f8e12c builtin/maintenance: fix leak in get_schedule_cmd()
The `get_schedule_cmd()` function allows us to override the schedule
command with a specific test command such that we can verify the
underlying logic in a platform-independent way. Its memory management is
somewhat wild though, because it basically gives up and assigns an
allocated string to the string constant output pointer. While this part
is marked with `UNLEAK()` to mask this, we also leak the local string
lists.

Rework the function such that it has a separate out parameter. If set,
we will assign it the final allocated command. Plug the other memory
leaks and create a common exit path.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:37 -07:00
84e9fc361d builtin/maintenance: fix leaking config string
When parsing the maintenance strategy from config we allocate a config
string, but do not free it after parsing it. Plug this leak by instead
using `git_config_get_string_tmp()`, which does not allocate any memory.

This leak is exposed by t7900, but plugging it alone does not make the
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:37 -07:00
355b3190ee promisor-remote: fix leaking partial clone filter
The partial clone filter of a promisor remote is never free'd, causing
memory leaks. Furthermore, in case multiple partial clone filters are
defined for the same remote, we'd overwrite previous values without
freeing them.

Fix these leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:36 -07:00
6d82437a47 grep: fix leaking grep pattern
When creating a pattern via `create_grep_pat()` we allocate the pattern
member of the structure regardless of the token type. But later, when we
try to free the structure, we free the pattern member conditionally on
the token type and thus leak memory.

Plug this leak. The leak is exposed by t7814, but plugging it alone does
not make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:36 -07:00
f8d2ca7246 submodule: fix leaking submodule ODB paths
In `add_submodule_odb_by_path()` we add a path into a global string
list. The list is initialized with `NODUP`, which means that we do not
pass ownership of strings to the list. But we use `xstrdup()` when we
insert a path, with the consequence that the string will never get
free'd.

Plug the leak by marking the list as `DUP`. There is only a single
callsite where we insert paths anyway, and as explained above that
callsite was mishandling the allocation.

This leak is exposed by t7814, but plugging it does not make the whole
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:36 -07:00
64d9adafba trace2: destroy context stored in thread-local storage
Each thread may have a specific context in the trace2 subsystem that we
set up via thread-local storage. We do not set up a destructor for this
data though, which means that the context data will leak.

Plug this leak by installing a destructor. This leak is exposed by
t7814, but plugging it alone does not make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:36 -07:00
7f795a1715 builtin/difftool: plug several trivial memory leaks
There are several leaking data structures in git-difftool(1). Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:36 -07:00
dea4a9521e builtin/repack: fix leaking configuration
When repacking, we assemble git-pack-objects(1) arguments both for the
"normal" pack and for the cruft pack. This configuration gets populated
with a bunch of `OPT_PASSTHRU` options that we end up passing to the
child process. These options are allocated, but never free'd.

Create a new `pack_objects_args_release()` function that releases the
memory for us and call it for both sets of options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:36 -07:00
6932ec8183 diffcore-order: fix leaking buffer when parsing orderfiles
In `prepare_order()` we parse an orderfile and assign it to a global
array. In order to save on some allocations, we replace newlines with
NUL characters and then assign pointers into the allocated buffer to
that array. This can cause the buffer to be completely unreferenced
though in some cases, e.g. because the order file is empty or because we
had to use `xmemdupz()` to copy the lines instead of NUL-terminating
them.

Refactor the code to always `xmemdupz()` the strings. This is a bit
simpler, and it is rather unlikely that saving a handful of allocations
really matters. This allows us to release the string buffer and thus
plug the memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:35 -07:00
cf8c4237eb parse-options: free previous value of OPTION_FILENAME
The `OPTION_FILENAME` option always assigns either an allocated string
or `NULL` to the value. In case it is passed multiple times it does not
know to free the previous value though, which causes a memory leak.

Refactor the function to always free the previous value. None of the
sites where this option is used pass a string constant, so this change
is safe.

While at it, fix the argument of `fix_filename()` to be a string
constant. The only reason why it's not is because we use it as an
in-out-parameter, where the input is a constant and the output is not.
This is weird and unnecessary, as we can just return the result instead
of using the parameter for this.

This leak is being hit in t7621, but plugging it alone does not make the
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:35 -07:00
76c7e708bb diff: fix leaking orderfile option
The `orderfile` diff option is being assigned via `OPT_FILENAME()`,
which assigns an allocated string to the variable. We never free it
though, causing a memory leak.

Change the type of the string to `char *` and free it to plug the leak.
This also requires us to use `xstrdup()` to assign the global config to
it in case it is set.

This leak is being hit in t7621, but plugging it alone does not make the
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:35 -07:00
49af1b7722 builtin/pull: fix leaking "ff" option
The `opt_ff` field gets populated either via `OPT_PASSTHRU` via
`config_get_ff()` or when `--rebase` is passed. So we sometimes end up
overriding the value in `opt_ff` with another value, but we do not free
the old value, causing a memory leak.

Adapt the type of the variable to be `char *` and consistently assign
allocated strings to it such that we can easily free it when it is being
overridden.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:35 -07:00
04ff8008f3 dir: fix off by one errors for ignored and untracked entries
In `treat_directory()` we perform some logic to handle ignored and
untracked entries. When populating a directory with entries we first
save the current number of ignored/untracked entries and then populate
new entries at the end of our arrays that keep track of those entries.
When we figure out that all entries have been ignored/are untracked we
then remove this tail of entries from those vectors again. But there is
an off by one error in both paths that causes us to not free the first
ignored and untracked entries, respectively.

Fix these off-by-one errors to plug the resulting leak. While at it,
massage the code a bit to match our modern code style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:35 -07:00
5bf922a4e9 builtin/submodule--helper: fix leaking remote ref on errors
When `update_submodule()` fails we return with `die_message()`, which
only causes us to print the same message as `die()` would without
actually causing the process to die. We don't free memory in that case
and thus leak memory.

Fix the leak by freeing the remote ref.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:35 -07:00
f1652c04b5 t/helper: fix leaking subrepo in nested submodule config helper
In the "submodule-nested-repo-config" helper we create a submodule
repository and print its configuration. We do not clear the repo,
causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:35 -07:00
2266bb4f6a builtin/submodule--helper: fix leaking error buffer
Fix leaking error buffer when `compute_alternate_path()` fails.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:34 -07:00
8f786a8e9f builtin/submodule--helper: clear child process when not running it
In `runcommand_in_submodule_cb()` we may end up not executing the child
command when `argv` is empty. But we still populate the command with
environment variables and other things, which needs cleanup. This leads
to a memory leak because we do not call `finish_command()`.

Fix this by clearing the child process when we don't execute it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:34 -07:00
2e492f2047 submodule: fix leaking update strategy
We're not freeing the submodule update strategy command. Provide a
helper function that does this for us and call it in
`update_data_release()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:34 -07:00
3aef7a05ad git: fix leaking argv when handling builtins
In `handle_builtin()` we may end up creating an ad-hoc argv array in
case we see that the command line contains the "--help" parameter. In
this case we observe two memory leaks though:

  - We leak the `struct strvec` itself because we directly exit after
    calling `run_builtin()`, without bothering about any cleanups.

  - Even if we free'd that vector we'd end up leaking some of its
    strings because `run_builtin()` will modify the array.

Plug both of these leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:34 -07:00
0f26223b6d builtin/help: fix leaking html_path when reading config multiple times
The `html_path` variable gets populated via `git_help_config()`, which
puts an allocated string into it if its value has been configured. We do
not clear the old value though, which causes a memory leak in case the
config exists multiple times.

Plug this leak. The leak is exposed by t0012, but plugging it alone is
not sufficient to make the test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:34 -07:00
02e36f9ffa builtin/help: fix dangling reference to html_path
In `get_html_page_path()` we may end up assigning the return value of
`system_path()` to the global `html_path` variable. But as we also
assign the returned value to `to_free`, we will deallocate its memory
upon returning from the function. Consequently, `html_path` will now
point to deallocated memory.

Fix this issue by instead assigning the value to a separate local
variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27 08:25:34 -07:00
9c4c840901 howto-maintain-git: discarding inactive topics
When a patch series happened to look interesting to the maintainer
but is not ready for 'next', it is applied on a topic branch and
merged to the 'seen' branch to keep an eye on it.  In an ideal
world, the participants give reviews and the original author
responds to the reviews, and such iterations may produce newer
versions of the patch series, and at some point, a concensus is
formed that the latest round is good enough for 'next'.  Then the
topic is merged to 'next' for inclusion in a future release.

In a much less ideal world we live in, however, a topic sometimes
get stalled.  The original author may not respond to hanging review
comments, may promise an update will be sent but does not manage to
do so, nobody talks about the topic on the list and nobody builds
upon it, etc.

Following the recent trend to document and give more transparency to
the decision making process, let's set a deadline to keep a topic
still alive, and actively discard those that are inactive for a long
period of time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-26 12:13:34 -07:00
3857aae53f Git 2.47-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 18:24:52 -07:00
1522467d13 Merge branch 'jk/sendemail-mailmap-doc'
Docfix.

* jk/sendemail-mailmap-doc:
  send-email: document --mailmap and associated configuration
2024-09-25 18:24:52 -07:00
f92c61aef0 Merge branch 'rs/diff-exit-code-binary'
"git diff --exit-code" ignored modified binary files, which has
been corrected.

* rs/diff-exit-code-binary:
  diff: report modified binary files as changes in builtin_diff()
2024-09-25 18:24:52 -07:00
cd845c0422 Merge branch 'cb/ci-freebsd-13-4'
CI updates.

* cb/ci-freebsd-13-4:
  ci: update FreeBSD image to 13.4
2024-09-25 18:24:51 -07:00
4f454e14b5 Merge branch 'ak/doc-sparse-co-typofix'
Docfix.

* ak/doc-sparse-co-typofix:
  Documentation/technical: fix a typo
2024-09-25 18:24:51 -07:00
a344b47165 Merge branch 'ak/typofix-builtins'
Typofix.

* ak/typofix-builtins:
  builtin: fix typos
2024-09-25 18:24:50 -07:00
a116aba5d5 The 21st batch
This pretty much should match what we would have in the upcoming
preview of 2.47.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:37:13 -07:00
cbb5b53a9c Merge branch 'jc/cmake-unit-test-updates'
CMake adjustments for recent changes around unit tests.

* jc/cmake-unit-test-updates:
  cmake: generalize the handling of the `UNIT_TEST_OBJS` list
  cmake: stop looking for `REFTABLE_TEST_OBJS` in the Makefile
  cmake: rename clar-related variables to avoid confusion
2024-09-25 10:37:13 -07:00
7644bb0aaa Merge branch 'ps/ci-gitlab-upgrade'
CI updates.

* ps/ci-gitlab-upgrade:
  gitlab-ci: upgrade machine type of Linux runners
2024-09-25 10:37:13 -07:00
7834cc3212 Merge branch 'ak/refs-symref-referent-typofix'
Typofix.

* ak/refs-symref-referent-typofix:
  ref-filter: fix a typo
2024-09-25 10:37:12 -07:00
78ce6660bb Merge branch 'ak/typofix-2.46-maint'
Typofix.

* ak/typofix-2.46-maint:
  upload-pack: fix a typo
  sideband: fix a typo
  setup: fix a typo
  run-command: fix a typo
  revision: fix a typo
  refs: fix typos
  rebase: fix a typo
  read-cache-ll: fix a typo
  pretty: fix a typo
  object-file: fix a typo
  merge-ort: fix typos
  merge-ll: fix a typo
  http: fix a typo
  gpg-interface: fix a typo
  git-p4: fix typos
  git-instaweb: fix a typo
  fsmonitor-settings: fix a typo
  diffcore-rename: fix typos
  config.mak.dev: fix a typo
2024-09-25 10:37:12 -07:00
52f57e94bd Merge branch 'ps/reftable-exclude'
The reftable backend learned to more efficiently handle exclude
patterns while enumerating the refs.

* ps/reftable-exclude:
  refs/reftable: wire up support for exclude patterns
  reftable/reader: make table iterator reseekable
  t/unit-tests: introduce reftable library
  Makefile: stop listing test library objects twice
  builtin/receive-pack: fix exclude patterns when announcing refs
  refs: properly apply exclude patterns to namespaced refs
2024-09-25 10:37:11 -07:00
c639478d79 Merge branch 'ps/apply-leakfix'
"git apply" had custom buffer management code that predated before
use of strbuf got widespread, which has been updated to use strbuf,
which also plugged some memory leaks.

* ps/apply-leakfix:
  apply: refactor `struct image` to use a `struct strbuf`
  apply: rename members that track line count and allocation length
  apply: refactor code to drop `line_allocated`
  apply: introduce macro and function to init images
  apply: rename functions operating on `struct image`
  apply: reorder functions to move image-related things together
2024-09-25 10:37:10 -07:00
f4c768c639 http-push: clean up local_refs at exit
We allocate a list of ref structs from get_local_heads() but never clean
it up. We should do so before exiting to avoid complaints from the
leak-checker. Note that we have to initialize it to NULL, because
there's one code path that can jump to the cleanup label before we
assign to it.

Fixing this lets us mark t5540 as leak-free.

Curiously building with SANITIZE=leak and gcc does not seem to find this
problem, but switching to clang does. It seems like a fairly obvious
leak, though.

I was curious that the matching remote_refs did not have the same leak.
But that is because we store the list in a global variable, so it's
still reachable after we exit. Arguably we could treat it the same as
future-proofing, but I didn't bother (now that the script is marked
leak-free, anybody moving it to a stack variable will notice).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:58 -07:00
9699327945 http-push: clean up loose request when falling back to packed
In http-push's finish_request(), if we fail a loose object request we
may fall back to trying a packed request. But if we do so, we leave the
http_loose_object_request in place, leaking it.

We can fix this by always cleaning it up. Note that the obj_req pointer
here (which we'll set to NULL) is a copy of the request->userData
pointer, which will now point to freed memory. But that's OK. We'll
either release the parent request struct entirely, or we'll convert it
into a packed request, which will overwrite userData itself.

This leak is found by t5540, but it's not quite leak-free yet.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:58 -07:00
92e1eb491a http-push: clean up objects list
In http-push's get_delta(), we generate a list of pending objects by
recursively processing trees and blobs, adding them to a linked list.
And then we iterate over the list, adding a new request for each
element.

But since we iterate using the list head pointer, at the end it is NULL
and all of the actual list structs have been leaked.

We can fix this either by using a separate iterator and then calling
object_list_free(), or by just freeing as we go. I picked the latter,
just because it means we continue to shrink the list as we go, though
I'm not sure it matters in practice (we call add_send_request() in the
loop, but I don't think it ever looks at the global objects list
itself).

This fixes several leaks noticed in t5540.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:57 -07:00
3245a2ade5 http-push: free xml_ctx.cdata after use
When we ask libexpat to parse XML data, we sometimes set xml_cdata as a
CharacterDataHandler callback. This fills in an allocated string in the
xml_ctx struct which we never free, causing a leak.

I won't pretend to understand the purpose of the field, but it looks
like it is used by other callbacks during the parse. At any rate, we
never look at it again after XML_Parse() returns, so we should be OK to
free() it then.

This fixes several leaks triggered by t5540.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:57 -07:00
a1528093ba http-push: free remote_ls_ctx.dentry_name
The remote_ls_ctx struct has dentry_name string, which is filled in with
a heap allocation in the handle_remote_ls_ctx() XML callback. After the
XML parse is done in remote_ls(), we should free the string to avoid a
leak.

This fixes several leaks found by running t5540.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:57 -07:00
94c6285780 http-push: free transfer_request strbuf
When we issue a PUT, we initialize and fill a strbuf embedded in the
transfer_request struct. But we never release this buffer, causing a
leak.

We can fix this by adding a strbuf_release() call to release_request().
If we stopped there, then non-PUT requests would try to release a
zero-initialized strbuf. This works OK in practice, but we should try to
follow the strbuf API more closely. So instead, we'll always initialize
the strbuf when we create the transfer_request struct.

That in turn means switching the strbuf_init() call in start_put() to a
simple strbuf_grow().

This leak is triggered in t5540.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:57 -07:00
7d3c71ddbf http-push: free transfer_request dest field
When we issue a PUT request, we store the destination in the "dest"
field by detaching from a strbuf. But we never free the result, causing
a leak.

We can address this in the release_request() function. But note that we
also need to initialize it to NULL, as most other request types do not
set it at all.

Curiously there are _two_ functions to initialize a transfer_request
struct. Adding the initialization only to add_fetch_request() seems to
be enough for t5540, but I won't pretend to understand why. Rather than
just adding "request->dest = NULL" in both spots, let's zero the whole
struct. That addresses this problem, as well as any future ones (and it
can't possibly hurt, as by definition we'd be hitting uninitialized
memory previously).

This fixes several leaks noticed by t5540.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:57 -07:00
747a71019c http-push: free curl header lists
To pass headers to curl, we have to allocate a curl_slist linked list
and then feed it to curl_easy_setopt(). But the header list is not
copied by curl, and must remain valid until we are finished with the
request.

A few spots in http-push get this right, freeing the list after
finishing the request, but many do not. In most cases the fix is simple:
we set up the curl slot, start it, and then use run_active_slot() to
take it to completion. After that, we don't need the headers anymore and
can call curl_slist_free_all().

But one case is trickier: when we do a MOVE request, we start the
request but don't immediately finish it. It's possible we could change
this to be more like the other requests, but I didn't want to get into
risky refactoring of this code. So we need to stick the header list into
the request struct and remember to free it later.

Curiously, the struct already has a headers field for this purpose! It
goes all the way back to 58e60dd203 (Add support for pushing to a remote
repository using HTTP/DAV, 2005-11-02), but it doesn't look like it was
ever used. We can make use of it just by assigning our headers to it,
and there is already code in finish_request() to clean it up.

This fixes several leaks triggered by t5540.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:56 -07:00
4324c6c0d9 http-push: free repo->url string
Our repo->url string comes from str_end_url_with_slash(), which always
allocates its output buffer. We should free it before exiting to avoid
triggering the leak-checker.

This can be seen by leak-checking t5540.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:56 -07:00
85430af347 http-push: clear refspecs before exiting
We parse the command-line arguments into a refspec struct, but we never
free them. We should do so before exiting to avoid triggering the
leak-checker.

This triggers in t5540 many times (basically every invocation of
http-push).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:56 -07:00
134bfedf6d http-walker: free fake packed_git list
The dumb-http walker code creates a "fake" packed_git list representing
packs we've downloaded from the remote (I call it "fake" because
generally that struct is only used and managed by the local repository
struct). But during our cleanup phase we don't touch those at all,
causing a leak.

There's no support here from the rest of the object-database API, as
these structs are not meant to be freed, except when closing the object
store completely. But we can see that raw_object_store_clear() just
calls free() on them, and that's enough here to fix the leak.

I also added a call to close_pack() before each. In the regular code
this happens via close_object_store(), which we do as part of
raw_object_store_clear(). This is necessary to prevent leaking mmap'd
data (like the pack idx) or descriptors. The leak-checker won't catch
either of these itself, but I did confirm with some hacky warning()
calls and running t5550 that it's easy to leak at least index data.

This is all much more intimate with the packed_git struct than I'd like,
but I think fixing it would be a pretty big refactor. And it's just not
worth it for dumb-http code which is rarely used these days. If we can
silence the leak-checker without creating too much hassle, we should
just do that.

This lets us mark t5550 as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:56 -07:00
cf8072ed7a remote-curl: free HEAD ref with free_one_ref()
After dumb-http downloads the remote info/refs file, it adds an extra
HEAD ref struct to our list by downloading the remote symref and finding
the matching ref within our list. If either of those fails, we throw
away the ref struct. But we do so with free(), when we should use
free_one_ref() to catch any embedded allocations (in particular, if
fetching the remote HEAD succeeded but the branch is unborn, its
ref->symref field will be populated but we'll still throw it all away).

This leak is triggered by t5550 (but we still have a little more work to
mark it leak-free).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:56 -07:00
75f4acc981 http: stop leaking buffer in http_get_info_packs()
We use http_get_strbuf() to fetch the remote info/packs content into a
strbuf, but never free it, causing a leak. There's no need to hold onto
it, as we've already parsed it completely.

This lets us mark t5619 as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:55 -07:00
8bdb84ebbb http: call git_inflate_end() when releasing http_object_request
In new_http_object_request(), we initialize the zlib stream with
git_inflate_init(). We must have a matching git_inflate_end() to avoid
leaking any memory allocated by zlib.

In most cases this happens in finish_http_object_request(), but we don't
always get there. If we abort a request mid-stream, then we may clean it
up without hitting that function.

We can't just add a git_inflate_end() call to the release function,
though. That would double-free the cases that did actually finish.
Instead, we'll move the call from the finish function to the release
function. This does delay it for the cases that do finish, but I don't
think it matters. We should have already reached Z_STREAM_END (and
complain if we didn't), and we do not record any status code from
git_inflate_end().

This leak is triggered by t5550 at least (and probably other dumb-http
tests).

I did find one other related spot of interest. If we try to read a
previously downloaded file and fail, we reset the stream by calling
memset() followed by a fresh git_inflate_init(). I don't think this case
is triggered in the test suite, but it seemed like an obvious leak, so I
added the appropriate git_inflate_end() before the memset() there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:55 -07:00
a1bc3c88de http: fix leak of http_object_request struct
The new_http_object_request() function allocates a struct on the heap,
along with some fields inside the struct. But the matching function to
clean it up, release_http_object_request(), only frees the interior
fields without freeing the struct itself, causing a leak.

The related http_pack_request new/release pair gets this right, and at
first glance we should be able to do the same thing and just add a
single free() call. But there's a catch.

These http_object_request structs are typically embedded in the
object_request struct of http-walker.c. And when we clean up that parent
struct, it sanity-checks the embedded struct to make sure we are not
leaking descriptors. Which means a use-after-free if we simply free()
the embedded struct.

I have no idea how valuable that sanity-check is, or whether it can
simply be deleted. This all goes back to 5424bc557f (http*: add helper
methods for fetching objects (loose), 2009-06-06). But the obvious way
to make it all work is to be sure we set the pointer to NULL after
freeing it (and our freeing process closes the descriptor, so we know
there is no leak).

To make sure we do that consistently, we'll switch the pointer we take
in release_http_object_request() to a pointer-to-pointer, and we'll set
it to NULL ourselves. And then the compiler can help us find each caller
which needs to be updated.

Most cases will just pass "&obj_req->req", which will obviously do the
right thing. In a few cases, like http-push's finish_request(), we are
working with a copy of the pointer, so we don't NULL the original. But
it's OK because the next step is to free the struct containing the
original pointer anyway.

This lets us mark t5551 as leak-free. Ironically this is the "smart"
http test, and the leak here only affects dumb http. But there's a
single dumb-http invocation in there. The full dumb tests are in t5550,
which still has some more leaks.

This also makes t5559 leak-free, as it's just an HTTP/2 variant of
t5551. But we don't need to mark it as such, since it inherits the flag
from t5551.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:55 -07:00
3d33e96653 http: fix leak when redacting cookies from curl trace
When redacting headers for GIT_TRACE_CURL, we build up a redacted cookie
header in a local strbuf, and then copy it into the output. But we
forget to release the temporary strbuf, leaking it for every cookie
header we show.

The other redacted headers don't run into this problem, since they're
able to work in-place in the output buffer. But the cookie parsing is
too complicated for that, since we redact the cookies individually.

This leak is triggered by the cookie tests in t5551.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:55 -07:00
cb2732f0ca transport-helper: fix leak of dummy refs_list
When using a remote-helper, the fetch_refs() function will issue a
"list" command if we haven't already done so. We don't care about the
result, but this is just to maintain compatibility as explained in
ac3fda82bf (transport-helper: skip ls-refs if unnecessary, 2019-08-21).

But get_refs_list_using_list(), the function we call to issue the
command, does parse and return the resulting ref list, which we simply
leak. We should record the return value and free it immediately (another
approach would be to teach it to avoid allocating at all, but it does
not seem worth the trouble to micro-optimize this mostly historical
case).

Triggering this requires the v0 protocol (since in v2 we use stateless
connect to take over the connection). You can see it in t5551.37, "fetch
by SHA-1 without tag following", as it explicitly enables v0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:54 -07:00
d121a7dd21 fetch-pack: clear pack lockfiles list
If the --lock-pack option is passed (which it typically is when
fetch-pack is used under the hood by smart-http), then we may end up
with entries in our pack_lockfiles string_list. We need to clear them
before returning to avoid a leak.

In git-fetch this isn't a problem, since the same cleanup happens via
transport_unlock_pack(). But the leak is detectable in t5551, which does
http fetches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:54 -07:00
ea4780307c fetch: free "raw" string when shrinking refspec
The "--prefetch" option to git-fetch modifies the default refspec,
including eliminating some entries entirely. When we drop an entry we
free the strings in the refspec_item, but we forgot to free the matching
string in the "raw" array of the refspec struct. There's no behavioral
bug here (since we correctly shrink the raw array, too), but we're
leaking the allocated string.

Let's add in the leak-fix, and while we're at it drop "const" from
the type of the raw string array. These strings are always allocated by
refspec_append(), etc, and this makes the memory ownership more clear.

This is all a bit more intimate with the refspec code than I'd like, and
I suspect it would be better if each refspec_item held on to its own raw
string, we had a single array, and we could use refspec_item_clear() to
clean up everything. But that's a non-trivial refactoring, since
refspec_item structs can be held outside of a "struct refspec", without
having a matching raw string at all. So let's leave that for now and
just fix the leak in the most immediate way.

This lets us mark t5582 as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:54 -07:00
e00e1cff0d transport-helper: fix strbuf leak in push_refs_with_push()
We loop over the refs to push, building up a strbuf with the set of
"push" directives to send to the remote helper. But if the atomic-push
flag is set and we hit a rejected ref, we'll bail from the function
early. We clean up most things, but forgot to release the strbuf.

Fixing this lets us mark t5541 as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:54 -07:00
05372c28be send-pack: free cas options before exit
The send-pack --force-with-lease option populates a push_cas_option
struct with allocated strings. Exiting without cleaning this up will
cause leak-checkers to complain.

We can fix this by calling clear_cas_option(), after making it publicly
available. Previously it was used only for resetting the list when we
saw --no-force-with-lease.

The git-push command has the same "leak", though in this case it won't
trigger a leak-checker since it stores the push_cas_option struct as a
global rather than on the stack (and is thus reachable even after main()
exits). I've added cleanup for it here anyway, though, as
future-proofing.

The leak is triggered by t5541 (it tests --force-with-lease over http,
which requires a separate send-pack process under the hood), but we
can't mark it as leak-free yet.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:53 -07:00
753f6708d0 commit: avoid leaking already-saved buffer
When we parse a commit via repo_parse_commit_internal(), if
save_commit_buffer is set we'll stuff the buffer of the object contents
into a cache, overwriting any previous value.

This can result in a leak of that previously cached value, though it's
rare in practice. If we have a value in the cache it would have come
from a previous parse, and during that parse we'd set the object.parsed
flag, causing any subsequent parse attempts to exit without doing any
work.

But it's possible to "unparse" a commit, which we do when registering a
commit graft. And since shallow fetches are implemented using grafts,
the leak is triggered in practice by t5539.

There are a number of possible ways to address this:

  1. the unparsing function could clear the cached commit buffer, too. I
     think this would work for the case I found, but I'm not sure if
     there are other ways to end up in the same state (an unparsed
     commit with an entry in the commit buffer cache).

  2. when we parse, we could check the buffer cache and prefer it to
     reading the contents from the object database. In theory the
     contents of a particular sha1 are immutable, but the code in
     question is violating the immutability with grafts. So this
     approach makes me a bit nervous, although I think it would work in
     practice (the grafts are applied to what we parse, but we still
     retain the original contents).

  3. We could realize the cache is already populated and discard its
     contents before overwriting. It's possible some other code could be
     holding on to a pointer to the old cache entry (and we'd introduce
     a use-after-free), but I think the risk of that is relatively low.

  4. The reverse of (3): when the cache is populated, don't bother
     saving our new copy. This is perhaps a little weird, since we'll
     have just populated the commit struct based on a different buffer.
     But the two buffers should be the same, even in the presence of
     grafts (as in (2) above).

I went with option 4. It addresses the leak directly and doesn't carry
any risk of breaking other assumptions. And it's the same technique used
by parse_object_buffer() for this situation, though I'm not sure when it
would even come up there. The extra safety has been there since
bd1e17e245 (Make "parse_object()" also fill in commit message buffer
data., 2005-05-25).

This lets us mark t5539 as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:53 -07:00
c800963578 fetch-pack, send-pack: clean up shallow oid array
When we call get_remote_heads() for protocol v0, that may populate the
"shallow" oid_array, which must be cleaned up to avoid a leak at the
program exit. The same problem exists for both fetch-pack and send-pack,
but not for the usual transport.c code paths, since we already do this
cleanup in disconnect_git().

Fixing this lets us mark t5542 as leak-free for the send-pack side, but
fetch-pack will need some more fixes before we can do the same for
t5539.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:53 -07:00
0c23f1a9e4 fetch-pack: free object filter before exiting
Our fetch_pack_args holds a filter_options struct that may be populated
with allocated strings by the by the "--filter" command-line option. We
must free it before exiting to avoid a leak when the program exits.

The usual fetch code paths that use transport.c don't have the same
leak, because we do the cleanup in disconnect_git().

Fixing this leak lets us mark t5500 as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:53 -07:00
91aa673539 connect: clear child process before freeing in diagnostic mode
The git_connect() function has a special CONNECT_DIAG_URL mode, where we
stop short of actually connecting to the other side and just print some
parsing details. For URLs that require a child process (like ssh), we
free() the child_process struct but forget to clear it, leaking the
strings we stuffed into its "env" list.

This leak is triggered many times in t5500, which uses "fetch-pack
--diag-url", but we're not yet ready to mark it as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:53 -07:00
6f54d00439 fetch-pack: fix leaking sought refs
When calling `fetch_pack()` the caller is expected to pass in a set of
sought-after refs that they want to fetch. This array gets massaged to
not contain duplicate entries, which is done by replacing duplicate refs
with `NULL` pointers. This modifies the caller-provided array, and in
case we do unset any pointers the caller now loses track of that ref and
cannot free it anymore.

Now the obvious fix would be to not only unset these pointers, but to
also free their contents. But this doesn't work because callers continue
to use those refs. Another potential solution would be to copy the array
in `fetch_pack()` so that we dont modify the caller-provided one. But
that doesn't work either because the NULL-ness of those entries is used
by callers to skip over ref entries that we didn't even try to fetch in
`report_unmatched_refs()`.

Instead, we make it the responsibility of our callers to duplicate these
arrays as needed. It ain't pretty, but it works to plug the memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:52 -07:00
61133e6ebb shallow: fix leak when unregistering last shallow root
When unregistering a shallow root we shrink the array of grafts by one
and move remaining grafts one to the left. This can of course only
happen when there are any grafts left, because otherwise there is
nothing to move. As such, this code is guarded by a condition that only
performs the move in case there are grafts after the position of the
graft to be unregistered.

By mistake we also put the call to free the unregistered graft into that
condition. But that doesn't make any sense, as we want to always free
the graft when it exists. Fix the resulting memory leak by doing so.

This leak is exposed by t5500, but plugging it does not make the whole
test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:52 -07:00
2ccf570efe http-fetch: clear leaking git-index-pack(1) arguments
We never clear the arguments that we pass to git-index-pack(1). Create a
common exit path and release them there to plug this leak.

This is leak is exposed by t5702, but plugging the leak does not make
the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:52 -07:00
cf1464331b test-lib: check for leak logs after every test
If you are trying to find and fix leaks in a large test script, it can
be overwhelming to see the leak logs for every test at once. The
previous commit let you use "--immediate" to see the logs after the
first failing test, but this isn't always the first leak. As discussed
there, we may see leaks from previous tests that didn't happen to fail.

To catch those, let's check for any logs that appeared after each test
snippet is run, meaning that in a SANITIZE=leak build, any leak is an
immediate failure of the test snippet.

This check is mostly free in non-leak builds (just a "test -z"), and
only a few extra processes in a leak build, so I don't think the
overhead should matter (if it does, we could probably optimize for the
common "no logs" case without even spending a process).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:23:01 -07:00
5fabf6e5ad test-lib: show leak-sanitizer logs on --immediate failure
When we've compiled with SANITIZE=leak, at the end of the test script
we'll dump any collected logs to stdout. These logs have two uses:

  1. Leaks don't always cause a test snippet to fail (e.g., if they
     happen in a sub-process that we expect to return non-zero).
     Checking the logs catches these cases that we'd otherwise miss
     entirely.

  2. LSan will dump the leak info to stderr, but that is sometimes
     hidden (e.g., because it's redirected by the test, or because it's
     in a sub-process whose stderr goes elsewhere). Dumping the logs is
     the easiest way for the developer to see them.

One downside is that the set of logs for an entire script may be very
long, especially when you're trying to fix existing test scripts. You
can run with --immediate to stop at the first failing test, which means
we'll have accrued fewer logs. But we don't show the logs in that case!

Let's start doing so. This can only help case (2), of course (since it
depends on test failure). And it's somewhat weakened by the fact that
any cases of (1) will pollute the logs. But we can improve things
further in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:23:01 -07:00
95c679ad86 test-lib: stop showing old leak logs
We ask LSan to record the logs of all leaks in test-results/, which is
useful for finding leaks that didn't trigger a test failure.

We don't clean out the leak/ directory for each test before running it,
though. Instead, we count the number of files it has, and complain only
if we ended up with more when the script finishes. So we shouldn't
trigger any output if you've made a script leak free. But if you simply
_reduced_ the number of leaks, then there is an annoying outcome: we do
not record which logs were from this run and which were from previous
ones. So when we dump them to stdout, you get a mess of
possibly-outdated leaks. This is very confusing when you are in an
edit-compile-test cycle trying to fix leaks.

The instructions do note that you should "rm -rf test-results/" if you
want to avoid this. But I'm having trouble seeing how this cumulative
count could ever be useful. It is not even counting the number of leaks,
but rather the number of processes with at least one leak!

So let's just blow away the per-test leak/ directory before running. We
already overwrite the ".out" file in test-results/ in the same way, so
this is following that pattern.

Running "make test" isn't affected by this, since it blows away all of
test-results/ already. This only comes up when you are iterating on a
single script that you're running manually.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:23:01 -07:00
7ffcbafbf3 send-email: document --mailmap and associated configuration
241499aba0 ("send-email: add mailmap support via sendemail.mailmap and
--mailmap", 2024-08-27) added support for --mailmap, and the associated
sendemail.mailmap.* configuration variables. Add documentation to
reflect this feature.

Fixes: 241499aba0 ("send-email: add mailmap support via sendemail.mailmap and --mailmap")
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 08:58:38 -07:00
ed4d4f3837 builtin: fix typos
Fix typos in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 10:54:39 -07:00
22293895c0 doc: apply synopsis simplification on git-clone and git-init
With the new synopsis formatting backend, no special asciidoc markup
is needed.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 10:20:26 -07:00
029eff9e34 doc: update the guidelines to reflect the current formatting rules
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 10:20:25 -07:00
974cdca345 doc: introduce a synopsis typesetting
In order to follow the common manpage usage, the synopsis of the
commands needs to be heavily typeset. A first try was performed with
using native markup, but it turned out to make the document source
almost unreadable, difficult to write and prone to mistakes with
unwanted Asciidoc's role attributes.

In order to both simplify the writer's task and obtain a consistant
typesetting in the synopsis, a custom 'synopsis' paragraph type is
created and the processor for backticked text are modified. The
backends of asciidoc and asciidoctor take in charge to correctly add
the required typesetting.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 10:20:25 -07:00
6241ce2170 refs/reftable: reload locked stack when preparing transaction
When starting a reftable transaction we lock all stacks we are about to
modify. While it may happen that the stack is out-of-date at this point
in time we don't really care: transactional updates encode the expected
state of a certain reference, so all that we really want to verify is
that the _current_ value matches that expected state.

Pass `REFTABLE_STACK_NEW_ADDITION_RELOAD` when locking the stack such
that an out-of-date stack will be reloaded after having been locked.
This change is safe because all verifications of the expected state
happen after this step anyway.

Add a testcase that verifies that many writers are now able to write to
the stack concurrently without failures and with a deterministic end
result.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:45:26 -07:00
80e7342ea8 reftable/stack: allow locking of outdated stacks
In `reftable_stack_new_addition()` we first lock the stack and then
check whether it is still up-to-date. If it is not we return an error to
the caller indicating that the stack is outdated.

This is overly restrictive in our ref transaction interface though: we
lock the stack right before we start to verify the transaction, so we do
not really care whether it is outdated or not. What we really want is
that the stack is up-to-date after it has been locked so that we can
verify queued updates against its current state while we know that it is
locked for concurrent modification.

Introduce a new flag `REFTABLE_STACK_NEW_ADDITION_RELOAD` that alters
the behaviour of `reftable_stack_init_addition()` in this case: when we
notice that it is out-of-date we reload it instead of returning an error
to the caller.

This logic will be wired up in the reftable backend in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:45:25 -07:00
bc39b6a796 refs/reftable: introduce "reftable.lockTimeout"
When multiple concurrent processes try to update references in a
repository they may try to lock the same lockfiles. This can happen even
when the updates are non-conflicting and can both be applied, so it
doesn't always make sense to abort the transaction immediately. Both the
"loose" and "packed" backends thus have a grace period that they wait
for the lock to be released that can be controlled via the config values
"core.filesRefLockTimeout" and "core.packedRefsTimeout", respectively.

The reftable backend doesn't have such a setting yet and instead fails
immediately when it sees such a lock. But the exact same concepts apply
here as they do apply to the other backends.

Introduce a new "reftable.lockTimeout" config that controls how long we
may wait for a "tables.list" lock to be released. The default value of
this config is 100ms, which is the same default as we have it for the
"loose" backend.

Note that even though we also lock individual tables, this config really
only applies to the "tables.list" file. This is because individual
tables are only ever locked when we already hold the "tables.list" lock
during compaction. When we observe such a lock we in fact do not want to
compact the table at all because it is already in the process of being
compacted by a concurrent process. So applying the same timeout here
would not make any sense and only delay progress.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:45:25 -07:00
320c96b0cb config: fix evaluating "onbranch" with nonexistent git dir
The `include_by_branch()` function is responsible for evaluating whether
or not a specific include should be pulled in based on the currently
checked out branch. Naturally, his condition can only be evaluated when
we have a properly initialized repository with a ref store in the first
place. This is why the function guards against the case when either
`data->repo` or `data->repo->gitdir` are `NULL` pointers.

But the second check is insufficient: the `gitdir` may be set even
though the repository has not been initialized. Quoting "setup.c":

  NEEDSWORK: currently we allow bogus GIT_DIR values to be set in some
  code paths so we also need to explicitly setup the environment if the
  user has set GIT_DIR.  It may be beneficial to disallow bogus GIT_DIR
  values at some point in the future.

So when either the GIT_DIR environment variable or the `--git-dir`
global option are set by the user then `the_repository` may end up with
an initialized `gitdir` variable. And this happens even when the dir is
invalid, like for example when it doesn't exist. It follows that only
checking for whether or not `gitdir` is `NULL` is not sufficient for us
to determine whether the repository has been properly initialized.

This issue can lead to us triggering a BUG: when using a config with an
"includeIf.onbranch:" condition outside of a repository while using the
`--git-dir` option pointing to an invalid Git directory we may end up
trying to evaluate the condition even though the ref storage format has
not been set up.

This bisects to 173761e21b (setup: start tracking ref storage format,
2023-12-29), but that commit really only starts to surface the issue
that has already existed beforehand. The code to check for `gitdir` was
introduced via 85fe0e800c (config: work around bug with
includeif:onbranch and early config, 2019-07-31), which tried to fix
similar issues when we didn't yet have a repository set up. But the fix
was incomplete as it missed the described scenario.

As the quoted comment mentions, we'd ideally refactor the code to not
set up `gitdir` with an invalid value in the first place, but that may
be a bigger undertaking. Instead, refactor the code to use the ref
storage format as an indicator of whether or not the ref store has been
set up to fix the bug.

Reported-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:18:17 -07:00
9cc2590ab9 t1305: exercise edge cases of "onbranch" includes
Add a couple more tests for "onbranch" includes for several edge cases.
All tests except for the last one pass, so for the most part this change
really only aims to nail down behaviour of include conditionals further.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:18:16 -07:00
537e516a39 sparse-checkout: disable advice in 'disable'
When running 'git sparse-checkout disable' with the sparse index
enabled, Git is expected to expand the index into a full index. However,
it currently outputs the advice message saying that that is unexpected
and likely due to an issue with the working directory.

Disable this advice message when in this code path. Establish a pattern
for doing a similar removal in the future.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 13:19:01 -07:00
9310f10e2b Documentation: fix typos
Fix typos in documentation.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 12:47:36 -07:00
90e82eb01e Documentation/config: fix typos
Fix typos in documentation.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 12:46:59 -07:00
98398f3b6b Documentation/technical: fix a typo
Fix a typo in documentation.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 12:40:52 -07:00
6258f68c3c The 20th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 10:35:09 -07:00
b8e318ea58 Merge branch 'jc/pass-repo-to-builtins'
The convention to calling into built-in command implementation has
been updated to pass the repository, if known, together with the
prefix value.

* jc/pass-repo-to-builtins:
  add: pass in repo variable instead of global the_repository
  builtin: remove USE_THE_REPOSITORY for those without the_repository
  builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h
  builtin: add a repository parameter for builtin functions
2024-09-23 10:35:09 -07:00
0f41fd28f9 Merge branch 'jk/t9001-deflake'
Test fix.

* jk/t9001-deflake:
  t9001: use a more distinct fake BugID
2024-09-23 10:35:08 -07:00
621ac241be Merge branch 'jk/jump-quickfix-fixes'
A few usability fixes to "git jump" (in contrib/).

* jk/jump-quickfix-fixes:
  git-jump: ignore deleted files in diff mode
  git-jump: always specify column 1 for diff entries
2024-09-23 10:35:08 -07:00
fed9298d6d Merge branch 'ak/typofixes'
Trivial typofixes.

* ak/typofixes:
  cbtree: fix a typo
  bloom: fix a typo
  attr: fix a typo
2024-09-23 10:35:07 -07:00
a4f062bdcf Merge branch 'jk/diag-unexpected-remote-helper-death'
When a remote-helper dies before Git writes to it, SIGPIPE killed
Git silently.  We now explain the situation a bit better to the end
user in our error message.

* jk/diag-unexpected-remote-helper-death:
  print an error when remote helpers die during capabilities
2024-09-23 10:35:06 -07:00
31a17429c0 Merge branch 'jc/t5512-sigpipe-fix'
Test fix.

* jc/t5512-sigpipe-fix:
  t5512.40 sometimes dies by SIGPIPE
2024-09-23 10:35:05 -07:00
3eb6679959 Merge branch 'ps/environ-wo-the-repository'
Code clean-up.

* ps/environ-wo-the-repository: (21 commits)
  environment: stop storing "core.notesRef" globally
  environment: stop storing "core.warnAmbiguousRefs" globally
  environment: stop storing "core.preferSymlinkRefs" globally
  environment: stop storing "core.logAllRefUpdates" globally
  refs: stop modifying global `log_all_ref_updates` variable
  branch: stop modifying `log_all_ref_updates` variable
  repo-settings: track defaults close to `struct repo_settings`
  repo-settings: split out declarations into a standalone header
  environment: guard state depending on a repository
  environment: reorder header to split out `the_repository`-free section
  environment: move `set_git_dir()` and related into setup layer
  environment: make `get_git_namespace()` self-contained
  environment: move object database functions into object layer
  config: make dependency on repo in `read_early_config()` explicit
  config: document `read_early_config()` and `read_very_early_config()`
  environment: make `get_git_work_tree()` accept a repository
  environment: make `get_graft_file()` accept a repository
  environment: make `get_index_file()` accept a repository
  environment: make `get_object_directory()` accept a repository
  environment: make `get_git_common_dir()` accept a repository
  ...
2024-09-23 10:35:05 -07:00
57155e7b4a Sync with Git 2.46.2 2024-09-23 10:34:39 -07:00
4f71522dfb Git 2.46.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 10:33:01 -07:00
d497bd9d59 Merge branch 'ma/test-libcurl-prereq' into maint-2.46
Test portability fix.

* ma/test-libcurl-prereq:
  t0211: add missing LIBCURL prereq
  t1517: add missing LIBCURL prereq
2024-09-23 10:33:00 -07:00
52c1a7322f Merge branch 'jc/doc-skip-fetch-all-and-prefetch' into maint-2.46
Doc updates.

* jc/doc-skip-fetch-all-and-prefetch:
  doc: remote.*.skip{DefaultUpdate,FetchAll} stops prefetch
2024-09-23 10:33:00 -07:00
1c8d664dfd Merge branch 'bl/trailers-and-incomplete-last-line-fix' into maint-2.46
The interpret-trailers command failed to recognise the end of the
message when the commit log ends in an incomplete line.

* bl/trailers-and-incomplete-last-line-fix:
  interpret-trailers: handle message without trailing newline
2024-09-23 10:33:00 -07:00
c7577aedf5 Merge branch 'rj/cygwin-has-dev-tty' into maint-2.46
Cygwin does have /dev/tty support that is needed by things like
single-key input mode.

* rj/cygwin-has-dev-tty:
  config.mak.uname: add HAVE_DEV_TTY to cygwin config section
2024-09-23 10:32:59 -07:00
7794e09034 Merge branch 'rs/diff-exit-code-fix' into maint-2.46
In a few corner cases "git diff --exit-code" failed to report
"changes" (e.g., renamed without any content change), which has
been corrected.

* rs/diff-exit-code-fix:
  diff: report dirty submodules as changes in builtin_diff()
  diff: report copies and renames as changes in run_diff_cmd()
2024-09-23 10:32:58 -07:00
992f7a4fdb worktree: repair copied repository and linked worktrees
For each linked worktree, Git maintains two pointers: (1)
<repo>/worktrees/<id>/gitdir which points at the linked worktree, and
(2) <worktree>/.git which points back at <repo>/worktrees/<id>. Both
pointers are absolute pathnames.

Aside from manually manipulating those raw files, it is possible to
easily "break" one or both pointers by ignoring the "git worktree move"
command and instead manually moving a linked worktree, moving the
repository, or moving both. The "git worktree repair" command was
invented to handle this case by restoring these pointers to sane values.

For the "repair" command, the "git worktree" manual page states:

  Repair worktree administrative files, if possible, if they have
  become corrupted or outdated due to external factors.

The "if possible" clause was chosen deliberately to convey that the
existing implementation may not be able to fix every possible breakage,
and to imply that improvements may be made to handle other types of
breakage.

A recent problem report[*] illustrates a case in which "git worktree
repair" not only fails to fix breakage, but actually causes breakage.
Specifically, if a repository / main-worktree and linked worktrees are
*copied* as a unit (rather than *moved*), then "git worktree repair" run
in the copy leaves the copy untouched but botches the pointers in the
original repository and the original worktrees.

For instance, given this directory structure:

  orig/
    main/ (main-worktree)
    linked/ (linked worktree)

if "orig" is copied (not moved) to "dup", then immediately after the
manual copy operation:

  * orig/main/.git/worktrees/linked/gitdir points at orig/linked/.git
  * orig/linked/.git points at orig/main/.git/worktrees/linked
  * dup/main/.git/worktrees/linked/gitdir points at orig/linked/.git
  * dup/linked/.git points at orig/main/.git/worktrees/linked

So, dup/main thinks its linked worktree is orig/linked, and worktree
dup/linked thinks its repository / main-worktree is orig/main.

"git worktree repair" is reasonably simple-minded; it wants to trust
valid-looking pointers, hence doesn't try to second-guess them. In this
case, when validating dup/linked/.git, it finds a legitimate repository
pointer, orig/main/.git/worktrees/linked, thus trusts that is correct,
but does notice that gitdir in that directory doesn't point at
dup/linked/.git, so it (incorrectly) _fixes_
orig/main/.git/worktrees/linked/gitdir to point at dup/linked/.git.
Similarly, when validating dup/main/.git/worktrees/linked/gitdir, it
finds a legitimate worktree pointer, orig/linked/.git, but notices that
its .git file doesn't point back at dup/main, thus (incorrectly) _fixes_
orig/linked/.git to point at dup/main/.git/worktrees/linked. Hence, it
has modified and broken the linkage between orig/main and orig/linked
rather than fixing dup/main and dup/linked as expected.

Fix this problem by also checking if a plausible .git/worktrees/<id>
exists in the *current* repository -- not just in the repository pointed
at by the worktree's .git file -- and comparing whether they are the
same. If not, then it is likely because the repository / main-worktree
and linked worktrees were copied, so prefer the discovered plausible
pointer rather than the one from the existing .git file.

[*]: https://lore.kernel.org/git/E1sr5iF-0007zV-2k@binarylane-bailey.stuart.id.au/

Reported-by: Russell Stuart <russell+git.vger.kernel.org@stuart.id.au>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 10:08:32 -07:00
ff0eb72fb6 commit-graph: remove unnecessary UNLEAK
When f4dbdfc4d5 (commit-graph: clean up leaked memory during write,
2018-10-03) added the UNLEAK, it was right before a call to die_errno().
e103f7276f (commit-graph: return with errors during write, 2019-06-12)
made it unnecessary, as it was then followed by a free() call for the
allocated string.

The code moved to write_commit_graph_file() in the meantime and the
string pointer is now part of a struct, but the function's only caller
still cleans up the allocation.  Drop the superfluous UNLEAK.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 10:03:59 -07:00
296743a7ca archive: load index before pathspec checks
git archive checks whether pathspec arguments match anything to avoid
surprises due to typos and later loads the index to get attributes.

This order was OK when these features were introduced by ba053ea96c
(archive: do not read .gitattributes in working directory, 2009-04-18)
and d5f53d6d6f (archive: complain about path specs that don't match
anything, 2009-12-12).

But when attribute matching was added to pathspec in b0db704652
(pathspec: allow querying for attributes, 2017-03-13), the pathspec
checker in git archive did not support it fully, because it lacks the
attributes from the index.

Load the index earlier, before the pathspec check, to support attr
pathspecs.

Reported-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 09:47:20 -07:00
9a41735af6 diff: report modified binary files as changes in builtin_diff()
The diff machinery has two ways to detect changes to set the exit code:
Just comparing hashes and comparing blob contents.  The latter is needed
if certain changes have to be ignored, e.g. with --ignore-space-change
or --ignore-matching-lines.  It's enabled by the diff_options flag
diff_from_contents.

The code for handling binary files added by 1aaf69e669 (diff: shortcut
for diff'ing two binary SHA-1 objects, 2014-08-16) always uses a quick
hash-only comparison, even if the slow way is taken.  We need it to
report a hash difference as a change for the purpose of setting the
exit code, though, but it never did.  Fix that.

d7b97b7185 (diff: let external diffs report that changes are
uninteresting, 2024-06-09) set diff_from_contents if external diff
programs are allowed.  This is the default e.g. for git diff, and so
that change exposed the inconsistency much more widely.

Reported-by: Kohei Shibata <shiba200712@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23 09:41:07 -07:00
b9183b0a02 scalar: configure maintenance during 'reconfigure'
The 'scalar reconfigure' command is intended to update registered repos
with the latest settings available. However, up to now we were not
reregistering the repos with background maintenance.

In particular, this meant that the background maintenance schedule would
not be updated if there are improvements between versions.

Be sure to register repos for maintenance during the reconfigure step.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-20 14:44:32 -07:00
4f5551957d maintenance: add custom config to background jobs
At the moment, some background jobs are getting blocked on credentials
during the 'prefetch' task. This leads to other tasks, such as
incremental repacks, getting blocked. Further, if a user manages to fix
their credentials, then they still need to cancel the background process
before their background maintenance can continue working.

Update the background schedules for our four scheduler integrations to
include these config options via '-c' options:

 * 'credential.interactive=false' will stop Git and some credential
   helpers from prompting in the UI (assuming the '-c' parameters are
   carried through and respected by GCM).

 * 'core.askPass=true' will replace the text fallback for a username
   and password into the 'true' command, which will return a success in
   its exit code, but Git will treat the empty string returned as an
   invalid password and move on.

We can do some testing that the credentials are passed, at least in the
systemd case due to writing the service files.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-20 14:44:31 -07:00
719399b57b credential: add new interactive config option
When scripts or background maintenance wish to perform HTTP(S) requests,
there is a risk that our stored credentials might be invalid. At the
moment, this causes the credential helper to ping the user and block the
process. Even if the credential helper does not ping the user, Git falls
back to the 'askpass' method, which includes a direct ping to the user
via the terminal.

Even setting the 'core.askPass' config as something like 'echo' will
causes Git to fallback to a terminal prompt. It uses
git_terminal_prompt(), which finds the terminal from the environment and
ignores whether stdin has been redirected. This can also block the
process awaiting input.

Create a new config option to prevent user interaction, favoring a
failure to a blocked process.

The chosen name, 'credential.interactive', is taken from the config
option used by Git Credential Manager to already avoid user
interactivity, so there is already one credential helper that integrates
with this option. However, older versions of Git Credential Manager also
accepted other string values, including 'auto', 'never', and 'always'.
The modern use is to use a boolean value, but we should still be
careful that some users could have these non-booleans. Further, we
should respect 'never' the same as 'false'. This is respected by the
implementation and test, but not mentioned in the documentation.

The implementation for the Git interactions takes place within
credential_getpass(). The method prototype is modified to return an
'int' instead of 'void'. This allows us to detect that no attempt was
made to fill the given credential, changing the single caller slightly.

Also, a new trace2 region is added around the interactive portion of the
credential request. This provides a way to measure the amount of time
spent in that region for commands that _are_ interactive. It also makes
a conventient way to test that the config option works with
'test_region'.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-20 14:44:31 -07:00
2eeb29702e ci: update FreeBSD image to 13.4
FreeBSD 13.4 was recently released, and that means the version
of the image used by this job (13.2) will be out of support soon.

Update it before the job starts failing because packages are no
longer compatible or the image gets retired by the provider since
it is now EOL.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-20 14:40:41 -07:00
082caf527e submodule status: propagate SIGPIPE
It has been reported than running

     git submodule status --recurse | grep -q ^+

results in an unexpected error message

    fatal: failed to recurse into submodule $submodule

When "git submodule--helper" recurses into a submodule it creates a
child process. If that process fails then the error message above is
displayed by the parent. In the case above the child is killed by
SIGPIPE as "grep -q" exits as soon as it sees the first match. Fix this
by propagating SIGPIPE so that it is visible to the process running
git. We could propagate other signals but I'm not sure there is much
value in doing that. In the common case of the user pressing Ctrl-C or
Ctrl-\ then SIGINT or SIGQUIT will be sent to the foreground process
group and so the parent process will receive the same signal as the
child.

Reported-by: Matt Liberty <mliberty@precisioninno.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-20 13:07:03 -07:00
94b60adee3 The 19th batch
Merge the topics that have been cooking since 2024-09-13 or so in
'next'.

Let's try a new workflow to update the maintenance track by removing
the "merge ... later to maint" comments from the draft release notes
on the 'master' track.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-20 11:16:33 -07:00
83c1cc99a8 Merge branch 'jk/git-pm-bare-repo-fix'
In Git 2.39, Git.pm stopped working in a bare repository, which has
been corrected.

* jk/git-pm-bare-repo-fix:
  Git.pm: use "rev-parse --absolute-git-dir" rather than perl code
  Git.pm: fix bare repository search with Directory option
2024-09-20 11:16:33 -07:00
5d77008437 Merge branch 'bb/unicode-width-table-16'
Update the character width table for Unicode 16.

* bb/unicode-width-table-16:
  unicode: update the width tables to Unicode 16
2024-09-20 11:16:32 -07:00
e12df759e6 Merge branch 'ma/test-libcurl-prereq'
Test portability fix.

* ma/test-libcurl-prereq:
  t0211: add missing LIBCURL prereq
  t1517: add missing LIBCURL prereq
2024-09-20 11:16:31 -07:00
53c7a9643f Merge branch 'jk/interop-test-build-options'
The support to customize build options to adjust for older versions
and/or older systems for the interop tests has been improved.

* jk/interop-test-build-options:
  t/interop: allow per-version make options
2024-09-20 11:16:31 -07:00
4c22e57bab Merge branch 'jk/no-openssl-with-openssl-sha1'
The "imap-send" now allows to be compiled with NO_OPENSSL and
OPENSSL_SHA1 defined together.

* jk/no-openssl-with-openssl-sha1:
  imap-send: handle NO_OPENSSL even when openssl exists
2024-09-20 11:16:31 -07:00
16c0906e8c Merge branch 'ps/leakfixes-part-6'
More leakfixes.

* ps/leakfixes-part-6: (22 commits)
  builtin/repack: fix leaking keep-pack list
  merge-ort: fix two leaks when handling directory rename modifications
  match-trees: fix leaking prefixes in `shift_tree()`
  builtin/fmt-merge-msg: fix leaking buffers
  builtin/grep: fix leaking object context
  builtin/pack-objects: plug leaking list of keep-packs
  builtin/repack: fix leaking line buffer when packing promisors
  negotiator/skipping: fix leaking commit entries
  shallow: fix leaking members of `struct shallow_info`
  shallow: free grafts when unregistering them
  object: clear grafts when clearing parsed object pool
  gpg-interface: fix misdesigned signing key interfaces
  send-pack: fix leaking push cert nonce
  remote: fix leak in reachability check of a remote-tracking ref
  remote: fix leaking tracking refs
  builtin/submodule--helper: fix leaking refs on push-check
  submodule: fix leaking fetch task data
  upload-pack: fix leaking child process data on reachability checks
  builtin/push: fix leaking refspec query result
  send-pack: fix leaking common object IDs
  ...
2024-09-20 11:16:30 -07:00
2b800ec45e Merge branch 'pw/rebase-autostash-fix'
"git rebase --autostash" failed to resurrect the autostashed
changes when the command gets aborted after giving back control
asking for hlep in conflict resolution.

* pw/rebase-autostash-fix:
  rebase: apply and cleanup autostash when rebase fails to start
2024-09-20 11:16:30 -07:00
5c5d29e1c4 gitlab-ci: upgrade machine type of Linux runners
With the recent effort to make the test suite free of memory leaks we
now run a lot more of test suites with the leak-sanitizer enabled. While
we were originally only executing around 23000 tests, we're now at 30000
tests. Naturally, this has a significant impact on the runtime of such a
test run.

Naturally, this impact can also be felt for our leak-checking CI jobs.
While macOS used to be the slowest-executing job on GitLab CI with ~15
minutes of runtime, nowadays it is our leak checks which take around 45
to 55 minutes.

Our Linux runners for GitLab CI are untagged, which means that they
default to the "small" machine type with two CPU cores [1]. Upgrade
these to the "medium" runner, which provide four CPU cores and which
should thus provide a noticeable speedup.

In theory, we could upgrade to an ever larger machine than that. The
official mirror [2] has an Ultimate license, so we could get up to 128
cores. But anybody running a fork of the Git project without such a
license wouldn't be able to use those beefier machines and thus their
pipelines would fail.

[1]: https://docs.gitlab.com/ee/ci/runners/hosted_runners/linux.html
[2]: https://gitlab.com/git-scm/git/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 14:39:53 -07:00
2065295642 ref-filter: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:50:36 -07:00
e02cc08a88 upload-pack: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:12 -07:00
e61651b1a8 sideband: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:12 -07:00
d1d93ae8b1 setup: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:12 -07:00
b71d52cef5 run-command: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:12 -07:00
a0ef3816c1 revision: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:12 -07:00
619cbc01a3 refs: fix typos
Fix typos in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:12 -07:00
ce42f57af4 rebase: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:01 -07:00
d9369f78e7 read-cache-ll: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
e13c49a4c5 pretty: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
28012b915c object-file: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
a966ad1e1b merge-ort: fix typos
Fix typos in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
a3621abaf9 merge-ll: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
7a6d5a4641 http: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
c055a29109 gpg-interface: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
086ba2eb3f git-p4: fix typos
Fix typos in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
bbe92166d4 git-instaweb: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
be645cd268 fsmonitor-settings: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
4b8c76638f diffcore-rename: fix typos
Fix typos in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:00 -07:00
26eab80642 config.mak.dev: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:45:59 -07:00
8afda42fce cmake: generalize the handling of the UNIT_TEST_OBJS list
In a15d4465a9 (cmake: also build unit tests, 2023-09-25), I
accommodated the CMake definition. Seeing that a `UNIT_TEST_OBJS` list
was introduced that was built by transforming the `UNIT_TEST_PROGRAMS`
list and then adding a single, hard-coded file
("t/unit-tests/test-lib.c"), I decided to hard-code that in the CMake
definition, too.

The reason why I hard-coded it instead of imitating the
`parse_makefile_for_sources()` paradigm that was used elsewhere when
using the `Makefile` as source of truth for given lists of files: This
function expects _only_ hard-coded values, and that transformed
`UNIT_TEST_PROGRAMS` list complicated everything.

In 872721538c (cmake: fix build of `t-oidtree`, 2024-07-12), I
accommodated the CMake definition again, after seeing that the
`UNIT_TEST_OBJS` was still defined via that transformed list but now
appending _two_ hard-coded files ("t/unit-tests/lib-oid.c" joined the
fray).

In 428672a3b1 (Makefile: stop listing test library objects twice,
2024-09-16), the `Makefile` was changed so that `UNIT_TEST_OBJS` is
finally only constructed using hard-coded file names just like the other
`*_OBJS` variables. I missed that and therefore did not adjust the CMake
definition. Besides, the code was working, so there was no real need to
adjust it.

With a4f50bb1e9 (t/unit-tests: introduce reftable library, 2024-09-16),
however, the `UNIT_TEST_OBJS` list became a trio, and the CMake
definition has to be adjusted again. Now that we can use the
`parse_makefile_for_sources()` function without many complications,
let's do that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-18 18:06:05 -07:00
75c4d8f044 cmake: stop looking for REFTABLE_TEST_OBJS in the Makefile
As of 15e29ea1c6 (t: move reftable/stack_test.c to the unit testing
framework, 2024-09-08), the reftable tests are no longer part of
`test-tool.exe`, so let's stop looking for those lines that are no
longer in the `Makefile`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-18 18:06:05 -07:00
77c6bd9f38 cmake: rename clar-related variables to avoid confusion
In c3de556a84 (Makefile: rename clar-related variables to avoid
confusion, 2024-09-10) some `Makefile` variables were renamed that were
partially used by the CMake definition. Adapt the latter to the new lay
of the land.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-18 18:06:05 -07:00
cbc46c0583 Merge branch 'ps/reftable-exclude' into jc/cmake-unit-test-updates
* ps/reftable-exclude:
  refs/reftable: wire up support for exclude patterns
  reftable/reader: make table iterator reseekable
  t/unit-tests: introduce reftable library
  Makefile: stop listing test library objects twice
  builtin/receive-pack: fix exclude patterns when announcing refs
  refs: properly apply exclude patterns to namespaced refs
2024-09-18 18:05:44 -07:00
6531f31ef3 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-18 18:02:06 -07:00
e6cc6939e0 Merge branch 'es/chainlint-message-updates'
The error messages from the test script checker have been improved.

* es/chainlint-message-updates:
  chainlint: reduce annotation noise-factor
  chainlint: make error messages self-explanatory
  chainlint: don't be fooled by "?!...?!" in test body
2024-09-18 18:02:05 -07:00
5d55832f5c Merge branch 'ps/clar-unit-test'
Import clar unit tests framework libgit2 folks invented for our
use.

* ps/clar-unit-test:
  Makefile: rename clar-related variables to avoid confusion
  clar: add CMake support
  t/unit-tests: convert ctype tests to use clar
  t/unit-tests: convert strvec tests to use clar
  t/unit-tests: implement test driver
  Makefile: wire up the clar unit testing framework
  Makefile: do not use sparse on third-party sources
  Makefile: make hdr-check depend on generated headers
  Makefile: fix sparse dependency on GENERATED_H
  clar: stop including `shellapi.h` unnecessarily
  clar(win32): avoid compile error due to unused `fs_copy()`
  clar: avoid compile error with mingw-w64
  t/clar: fix compatibility with NonStop
  t: import the clar unit testing framework
  t: do not pass GIT_TEST_OPTS to unit tests with prove
2024-09-18 18:02:05 -07:00
3fc4eab466 apply: refactor struct image to use a struct strbuf
The `struct image` uses a character array to track the pre- or postimage
of a patch operation. This has multiple downsides:

  - It is somewhat hard to track memory ownership. In fact, we have
    several memory leaks in git-apply(1) because we do not (and cannot
    easily) free the buffer in all situations.

  - We have to reinvent the wheel and manually implement a lot of
    functionality that would already be provided by `struct strbuf`.

  - We have to carefully track whether `update_pre_post_images()` can do
    an in-place update of the postimage or whether it has to allocate a
    new buffer for it.

This is all rather cumbersome, and especially `update_pre_post_images()`
is really hard to understand as a consequence even though what it is
doing is rather trivial.

Refactor the code to use a `struct strbuf` instead, addressing all of
the above. Like this we can easily perform in-place updates in all
situations, the logic to perform those updates becomes way simpler and
the lifetime of the buffer becomes a ton easier to track.

This refactoring also plugs some leaking buffers as a side effect.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17 13:53:30 -07:00
e73686f6e4 apply: rename members that track line count and allocation length
The `struct image` has two members `nr` and `alloc` that track the
number of lines as well as how large its array is. It is somewhat easy
to confuse these members with `len` though, which tracks the length of
the `buf` member.

Rename these members to `line_nr` and `line_alloc` respectively to avoid
confusion. This is in line with how we typically name variables that
track an array in this way.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17 13:53:30 -07:00
6eff8b8f40 apply: refactor code to drop line_allocated
The `struct image` has two members `line` and `line_allocated`. The
former member is the one that should be used throughout the code,
whereas the latter one is used to track whether the lines have been
allocated or not.

In practice, the array of lines is always allocated. The reason why we
have `line_allocated` is that `remove_first_line()` will advance the
array pointer to drop the first entry, and thus it points into the array
instead of to the array header.

Refactor the function to use memmove(3P) instead, which allows us to get
rid of this double bookkeeping. This is less efficient, but I doubt that
this matters much in practice. If this judgement call is found to be
wrong at a later point in time we can likely refactor the surrounding
loop such that we first calculate the number of leading context lines to
remove and then remove them in a single call to memmove(3P).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17 13:53:30 -07:00
7db28d0133 apply: introduce macro and function to init images
We're about to convert the `struct image` to gain a `struct strbuf`
member, which requires more careful initialization than just memsetting
it to zeros. Introduce the `IMAGE_INIT` macro and `image_init()`
function to prepare for this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17 13:53:29 -07:00
2231903778 apply: rename functions operating on struct image
Rename functions operating on `struct image` to have a `image_` prefix
to match our modern code style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17 13:53:29 -07:00
1f2df6f9a5 apply: reorder functions to move image-related things together
While most of the functions relating to `struct image` are relatively
close to one another, `fuzzy_matchlines()` sits in between those even
though it is rather unrelated.

Reorder functions such that `struct image`-related functions are next to
each other. While at it, move `clear_image()` to the top such that it is
close to the struct definition itself. This makes this lifecycle-related
thing easy to discover.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17 13:53:29 -07:00
3fb745257b ci updates
This batch is solely to unbreak the 32-bit CI jobs that can no
longer work with Ubuntu xenial image that is too ancient.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 15:31:39 -07:00
60a3dbb452 Sync with 'maint' 2024-09-16 15:27:46 -07:00
aeda40b96e Merge branch 'jk/ci-linux32-update'
CI updates

* jk/ci-linux32-update:
  ci: add Ubuntu 16.04 job to GitLab CI
  ci: use regular action versions for linux32 job
  ci: use more recent linux32 image
  ci: unify ubuntu and ubuntu32 dependencies
  ci: drop run-docker scripts
2024-09-16 15:27:08 -07:00
f9fff154d3 Merge branch 'jc/ci-upload-artifact-and-linux32'
CI started failing completely for linux32 jobs, as the step to
upload failed test directory uses GitHub actions that is deprecated
and is now disabled.  Remove the step so at least we will know if
the tests are passing.

* jc/ci-upload-artifact-and-linux32:
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-09-16 15:27:08 -07:00
e29e5cf288 Start preparing for Git 2.46.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 15:19:05 -07:00
dbf38e9a43 Merge branch 'jk/ci-linux32-update' into maint-2.46
CI updates

* jk/ci-linux32-update:
  ci: add Ubuntu 16.04 job to GitLab CI
  ci: use regular action versions for linux32 job
  ci: use more recent linux32 image
  ci: unify ubuntu and ubuntu32 dependencies
  ci: drop run-docker scripts
2024-09-16 15:13:24 -07:00
af51e464bf Merge branch 'jc/ci-upload-artifact-and-linux32' into maint-2.46
CI started failing completely for linux32 jobs, as the step to
upload failed test directory uses GitHub actions that is deprecated
and is now disabled.  Remove the step so at least we will know if
the tests are passing.

* jc/ci-upload-artifact-and-linux32:
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-09-16 15:13:24 -07:00
d6bf6527eb Revert "Merge branch 'jc/patch-id' into maint-2.46"
This reverts commit 41c952ebac,
reversing changes made to 712d970c01.
Keeping a known breakage for now is better than introducing new
regression(s).
2024-09-16 15:12:06 -07:00
3969d78396 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 14:22:55 -07:00
b708e8b8c1 Merge branch 'jk/ref-filter-trailer-fixes'
Bugfixes and leak plugging in "git for-each-ref --format=..." code
paths.

* jk/ref-filter-trailer-fixes:
  ref-filter: fix leak with unterminated %(if) atoms
  ref-filter: add ref_format_clear() function
  ref-filter: fix leak when formatting %(push:remoteref)
  ref-filter: fix leak with %(describe) arguments
  ref-filter: fix leak of %(trailers) "argbuf"
  ref-filter: store ref_trailer_buf data per-atom
  ref-filter: drop useless cast in trailers_atom_parser()
  ref-filter: strip signature when parsing tag trailers
  ref-filter: avoid extra copies of payload/signature
  t6300: drop newline from wrapped test title
2024-09-16 14:22:55 -07:00
be8ca2848a Merge branch 'jc/range-diff-lazy-setup'
Code clean-up.

* jc/range-diff-lazy-setup:
  remerge-diff: clean up temporary objdir at a central place
  remerge-diff: lazily prepare temporary objdir on demand
2024-09-16 14:22:55 -07:00
6e2a18cb04 Merge branch 'ah/apply-3way-ours'
"git apply --3way" learned to take "--ours" and other options.

* ah/apply-3way-ours:
  apply: support --ours, --theirs, and --union for three-way merges
2024-09-16 14:22:54 -07:00
c1f41bbe1a Merge branch 'cp/unit-test-reftable-stack'
Another reftable test migrated to the unit-test framework.

* cp/unit-test-reftable-stack:
  t-reftable-stack: add test for stack iterators
  t-reftable-stack: add test for non-default compaction factor
  t-reftable-stack: use reftable_ref_record_equal() to compare ref records
  t-reftable-stack: use Git's tempfile API instead of mkstemp()
  t: harmonize t-reftable-stack.c with coding guidelines
  t: move reftable/stack_test.c to the unit testing framework
2024-09-16 14:22:53 -07:00
e8a0c243f9 Merge branch 'ps/reftable-exclude' into ps/reftable-alloc-failures
* ps/reftable-exclude:
  refs/reftable: wire up support for exclude patterns
  reftable/reader: make table iterator reseekable
  t/unit-tests: introduce reftable library
  Makefile: stop listing test library objects twice
  builtin/receive-pack: fix exclude patterns when announcing refs
  refs: properly apply exclude patterns to namespaced refs
2024-09-16 14:06:31 -07:00
d29fc595c8 Merge branch 'cp/unit-test-reftable-stack' into ps/reftable-alloc-failures
* cp/unit-test-reftable-stack:
  t-reftable-stack: add test for stack iterators
  t-reftable-stack: add test for non-default compaction factor
  t-reftable-stack: use reftable_ref_record_equal() to compare ref records
  t-reftable-stack: use Git's tempfile API instead of mkstemp()
  t: harmonize t-reftable-stack.c with coding guidelines
  t: move reftable/stack_test.c to the unit testing framework
2024-09-16 14:06:06 -07:00
a2b7f03e65 Merge branch 'ps/leakfixes-part-6' into ps/leakfixes-part-7
* ps/leakfixes-part-6: (22 commits)
  builtin/repack: fix leaking keep-pack list
  merge-ort: fix two leaks when handling directory rename modifications
  match-trees: fix leaking prefixes in `shift_tree()`
  builtin/fmt-merge-msg: fix leaking buffers
  builtin/grep: fix leaking object context
  builtin/pack-objects: plug leaking list of keep-packs
  builtin/repack: fix leaking line buffer when packing promisors
  negotiator/skipping: fix leaking commit entries
  shallow: fix leaking members of `struct shallow_info`
  shallow: free grafts when unregistering them
  object: clear grafts when clearing parsed object pool
  gpg-interface: fix misdesigned signing key interfaces
  send-pack: fix leaking push cert nonce
  remote: fix leak in reachability check of a remote-tracking ref
  remote: fix leaking tracking refs
  builtin/submodule--helper: fix leaking refs on push-check
  submodule: fix leaking fetch task data
  upload-pack: fix leaking child process data on reachability checks
  builtin/push: fix leaking refspec query result
  send-pack: fix leaking common object IDs
  ...
2024-09-16 14:03:30 -07:00
1869525066 refs/reftable: wire up support for exclude patterns
Exclude patterns can be used by reference backends to skip over blocks
of references that are uninteresting to the caller. Reference backends
do not have to wire up support for them, and all callers are expected to
behave as if the backend didn't support them. In fact, the only backend
that supports exclude patterns right now is the "packed" backend.

Exclude patterns can be quite an important performance optimization in
repositories that have loads of references. The patterns are set up in
case "transfer.hideRefs" and friends are configured during a fetch, so
handling these patterns becomes important once there are lots of hidden
refs in a served repository.

Now that we have properly re-seekable reftable iterators we can also
wire up support for these patterns in the "reftable" backend. Doing so
is conceptually simple: once we hit a reference whose prefix matches the
current exclude pattern we re-seek the iterator to the first reference
that doesn't match the pattern anymore. This schema only works for
trivial patterns that do not have any globbing characters in them, but
this restriction also applies do the "packed" backend.

This makes t1419 work with the "reftable" backend with some slight
modifications. Of course it also speeds up listing of references with
hidden refs. The following benchmark prints one reference with 1 million
hidden references:

    Benchmark 1: HEAD~
      Time (mean ± σ):      93.3 ms ±   2.1 ms    [User: 90.3 ms, System: 2.5 ms]
      Range (min … max):    89.8 ms …  97.2 ms    33 runs

    Benchmark 2: HEAD
      Time (mean ± σ):       4.2 ms ±   0.6 ms    [User: 2.2 ms, System: 1.8 ms]
      Range (min … max):     3.1 ms …   8.1 ms    765 runs

    Summary
      HEAD ran
       22.15 ± 3.19 times faster than HEAD~

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:19 -07:00
0a148a8eda reftable/reader: make table iterator reseekable
In 67ce50ba26 (Merge branch 'ps/reftable-reusable-iterator', 2024-05-30)
we have refactored the interface of reftable iterators such that they
can be reused in theory. This patch series only landed the required
changes on the interface level, but didn't yet implement the actual
logic to make iterators reusable.

As it turns out almost all of the infrastructure already does support
re-seeking. The only exception is the table iterator, which does not
reset its `is_finished` bit. Do so and add a couple of tests that verify
that we can re-seek iterators.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:19 -07:00
a4f50bb1e9 t/unit-tests: introduce reftable library
We have recently migrated all of the reftable unit tests that were part
of the reftable library into our own unit testing framework. As part of
that migration we have duplicated some of the functionality that was
part of the reftable test framework into each of the migrated test
suites. This was a sensible decision to not have all of the migrations
dependent on each other, but now that the migration is done it makes
sense to deduplicate the functionality again.

Introduce a new reftable test library that hosts some shared code and
adapt tests to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:18 -07:00
428672a3b1 Makefile: stop listing test library objects twice
Whenever one adds another test library compilation unit one has to wire
it up twice in the Makefile: once to append it to `UNIT_TEST_OBJS`, and
once to append it to the `UNIT_TEST_PROGS` target. Ideally, we'd just
reuse the `UNIT_TEST_OBJS` variable in the target so that we can avoid
the duplication. But it also contains all the objects for our test
programs, each of which contains a `cmd_main()`, and thus we cannot link
them all into the target executable.

Refactor the code such that `UNIT_TEST_OBJS` does not contain the unit
test program objects anymore, which we can instead manually append to
the `OBJECTS` variable. Like this, the former variable now only contains
objects for test libraries and can thus be reused.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:18 -07:00
d8faf50c36 builtin/receive-pack: fix exclude patterns when announcing refs
In `write_head_info()` we announce references to the remote client. We
need to honor "transfer.hideRefs" here so that we do not announce any
references that the client shouldn't be able to learn about. This is
done via two separate mechanisms:

  - We hand over exclude patterns to the reference backend. We can only
    honor "plain" exclude patterns here that do not have prefixes with
    special meaning such as "^" or "!". Filtering down the references is
    handled by `hidden_refs_to_excludes()`.

  - In `show_ref_cb()` we perform a second check against hidden refs.
    For one this is done such that we can handle those special prefixes.
    And second, handling exclude patterns in ref backends is optional,
    so we also have to handle "normal" patterns.

The special-meaning "^" prefix alters whether a hidden ref applies to
the namespace-stripped reference name or the full name. So while we
would usually call `refs_for_each_namespaced_ref()` to only get those
references in the current namespace, we can't because we'd get the
already-rewritten reference names. Instead, we are forced to use
`refs_for_each_fullref_in()` and then manually strip away the namespace
prefix such that we have access to both names.

But this also means that we do not get namespace handling for exclude
patterns, which `refs_for_each_namespaced_ref()` brings for free. This
results in a bug because we potentially end up hiding away references
based on their namespaced name and not on the stripped name as we really
should be doing.

Fix this by manually rewriting the exclude patterns to their namespaced
variants.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:18 -07:00
155dc8447d refs: properly apply exclude patterns to namespaced refs
Reference namespaces allow commands like git-upload-pack(1) to serve
different sets of references to the client depending on which namespace
is enabled, which is for example useful in fork networks. Namespaced
refs are stored with a `refs/namespaces/$namespace` prefix, but all the
user will ultimately see is a stripped version where that prefix is
removed.

The way that this interacts with "transfer.hideRefs" is not immediately
obvious: the hidden refs can either apply to the stripped references, or
to the non-stripped ones that still have the namespace prefix. In fact,
the "transfer.hideRefs" machinery does the former and applies to the
stripped reference by default, but rules can have "^" prefixed to switch
this behaviour to instead match against the full reference name.

Namespaces are exclusively handled at the generic "refs" layer, the
respective backends have no clue that such a thing even exists. This
also has the consequence that they cannot handle hiding references as
soon as reference namespaces come into play because they neither know
whether a namespace is active, nor do they know how to strip references
if they are active.

Handling such exclude patterns in `refs_for_each_namespaced_ref()` and
`refs_for_each_fullref_in_prefixes()` is broken though, as both support
that the user passes both namespaces and exclude patterns. In the case
where both are set we will exclude references with unstripped names,
even though we really wanted to exclude references based on their
stripped names.

This only surfaces when:

  - A repository uses reference namespaces.

  - "transfer.hideRefs" is active.

  - The namespaced references are packed into the "packed-refs" file.

None of our tests exercise this scenario, and thus we haven't ever hit
it. While t5509 exercises both (1) and (2), it does not happen to hit
(3). It is trivial to demonstrate the bug though by explicitly packing
refs in the tests, and then we indeed surface the breakage.

Fix this bug by prefixing exclude patterns with the namespace in the
generic layer. The newly introduced function will be used outside of
"refs.c" in the next patch, so we add a declaration to "refs.h".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:18 -07:00
0627c58e7a cbtree: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 10:46:00 -07:00
a3711f9faf bloom: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 10:46:00 -07:00
7a216cd16b attr: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 10:46:00 -07:00
83799f1500 t9001: use a more distinct fake BugID
In the test "cc list is sanitized", we feed a commit with a variety of
trailers to send-email, and then check its output to see how it handled
them. For most of them, we are grepping for a specific mention of the
header, but there's a "BugID" header which we expect to be ignored. We
confirm this by grepping for "12345", the fake BugID, and making sure it
is not present.

But we can be fooled by false positives! I just tracked down a flaky
test failure here that was caused by matching this unrelated line in the
output:

  <20240914090449.612345-1-author@example.com>

which will change from run to run based on the time, pid, etc.

Ideally we'd tighten the regex to make this more specifically, but since
the point is that it _shouldn't_ be mentioned, it's hard to say what the
right match would be (e.g., would there be a leading space?).

Instead, let's just choose a match that is much less likely to appear.
The actual content of the header isn't important, since it's supposed to
be ignored.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 09:27:52 -07:00
083b82544d git-jump: ignore deleted files in diff mode
If you do something like this:

  rm file_a
  echo change >file_b
  git jump diff

then we'll generate two quickfix entries for the diff, one for each
file. But the one for the deleted file is rather pointless. There's no
content to show since the file is gone, and in fact we open the editor
with the path /dev/null!

In vim, at least, the result is a confusing annoyance: the editor opens
with an empty buffer, and you have to skip past it to the useful
quickfix entry (after scratching your head and figuring out that no,
nothing is broken).

Let's skip such entries entirely. There's nothing useful to show, since
the point is that the file has been deleted.

It is possible that you could be doing a diff whose post-image is not
the working tree, and then you'd perhaps be jumping to the deleted
content (or at least something that was in the same spot). But I don't
think it's worth worrying about that case. For one thing, using git-jump
for such diffs is a bad idea in general, as it's going to sometimes move
you to the wrong spot. And two, a deletion is always going to have one
hunk starting at line 1, which is not that interesting to jump to in the
first place.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 09:20:46 -07:00
9f5978e777 git-jump: always specify column 1 for diff entries
When we generate a quickfix entry for a diff hunk, we provide just the
filename and line number along with the content, like:

  file:1: contents of the line

This can be a problem if the line itself looks like a quickfix header.
For example (and this is adapted from a real-world case that bit me):

  echo 'static_lease 10:11:12:13:14:15:16 10.0.0.1' >file
  git add file
  echo change >file

produces:

  file:1: static_lease 10:11:12:13:14:15:16 10.0.0.1

which is ambiguous. It could be line 1 of "file", or line 11 of the file
"file:1: static_lease 10", and so on. In the case of vim's default
config, it seems to prefer the latter (you can configure "errorformat"
with a variety of patterns, but out of the box it matches some common
ones).

One easy way to fix this is to provide a column number, like:

  file:1:1: static_lease 10:11:12:13:14:15:16 10.0.0.1

which causes vim to prefer line 1 of "file" again (due to the preference
order of the various patterns in the default errorformat).

There are other options. For example, at least in my version of vim,
wrapping the file in quotation marks like:

  "file":1: static_lease 10:11:12:13:14:15:16 10.0.0.1

also works. That perhaps would the right thing even if you had the silly
file name "file:1:1: foo 10". But it's not clear what would happen if
you had a filename with quotes in it.

This feature is inherently scraping text, and there's bound to be some
ambiguities. I don't think it's worth worrying too much about unlikely
filenames, as its the file content that is more likely to introduce
unexpected characters.

So let's just go with the extra ":1" column specifier. We know this is
supported everywhere, as git-jump's "grep" mode already uses it (and
thus doesn't exhibit the same problem).

The "merge" mode is mostly immune to this, as it only matches "<<<<<<<"
conflict marker lines. It's possible of course to have a marker that
says "foo 10:11" later in the line, but in practice these will only have
branches and perhaps file names, so it's probably not worth worrying
about (and fixing it would involve passing --column to the system grep,
which may not be portable).

I also gave some thought as to whether we could put something more
useful than "1" in the column field for diffs. In theory we could find
the first changed character of the line, but this is tricky in practice.
You'd have to correlate before/after lines of the hunk to decide what
changed. So:

  -this is a foo line
  +this is a bar line

is easy (column 11). But:

  -this is a foo line
  +another line
  +this is a bar line

is harder.

This commit certainly doesn't preclude trying to do something more
clever later, but it's a much deeper rabbit hole than just fixing the
syntactic ambiguity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 09:20:43 -07:00
6e7fac9bca print an error when remote helpers die during capabilities
The transport-helper code generally relies on the
remote-helper to provide an informative message to the user
when it encounters an error. In the rare cases where the
helper does not do so, the output can be quite confusing.
E.g.:

  $ git clone https://example.com/foo.git
  Cloning into 'foo'...
  $ echo $?
  128
  $ ls foo
  /bin/ls: cannot access foo: No such file or directory

We tried to address this with 81d340d (transport-helper:
report errors properly, 2013-04-10).

But that makes the common case much more confusing. The
remote helper protocol's method for signaling normal errors
is to simply hang up. So when the helper does encounter a
routine error and prints something to stderr, the extra
error message is redundant and misleading. So we dropped it
again in 266f1fd (transport-helper: be quiet on read errors
from helpers, 2013-06-21).

This puts the uncommon case right back where it started. We
may be able to do a little better, though. It is common for
the helper to die during a "real" command, like fetching the
list of remote refs. It is not common for it to die during
the initial "capabilities" negotiation, right after we
start. Reporting failure here is likely to catch fundamental
problems that prevent the helper from running (and reporting
errors) at all. Anything after that is the responsibility of
the helper itself to report.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-14 09:35:53 -07:00
8ff65c7a53 git gui: add directly calling merge tool from configuration
git gui can open a merge tool when conflicts are detected (Right click
in the diff of the file with conflicts).
The merge tools that are allowed to use are hard coded into git gui.

If one wants to add a new merge tool it has to be added to git gui
through a source code change.
This is not convenient in comparison to how it works in git (without gui).

git itself has configuration options for a merge tools path and command
in the git configuration.
New merge tools can be set up there without a source code change.

Those options are used only by pure git in contrast to git gui. git calls
the configured merge tools directly from the configuration while git Gui
doesn't.

With this change git gui can call merge tools configured in the
configuration directly without a change in git gui source code.
It needs a configured "merge.tool" and a configured
"mergetool.<mergetool name>.cmd" configuration entry as shown in the
git-config manual page.

Configuration example:
[merge]
	tool = vscode
[mergetool "vscode"]
	cmd = \"the/path/to/Code.exe\" --wait --merge \"$LOCAL\" \"$REMOTE\" \"$BASE\" \"$MERGED\"

Without the "mergetool.<mergetool name>.cmd" entry and an unsupported
"merge.tool" entry, git gui behaves mainly as before this change and
informs the user about an unsupported merge tool. In addtition, it also
shows a hint to add a configuration entry to use the tool as an
unsupported tool with degraded support.

If a wrong "mergetool.<mergetool name>.cmd" is configured by accident,
it gets handled by git gui already. In this case git gui informs the
user that the merge tool couldn't be opened. This behavior is preserved
by this change and should not change.

"Beyond Compare 3" and "Visual Studio Code" were tested as manually
configured merge tools.

Signed-off-by: Tobias Boesch <tobias.boesch@miele.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-09-14 15:20:16 +02:00
ed155187b4 Sync with Git 2.46.1 2024-09-13 15:31:57 -07:00
9cf95c0ca0 The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 15:27:45 -07:00
77cf81e988 Merge branch 'bl/trailers-and-incomplete-last-line-fix'
The interpret-trailers command failed to recognise the end of the
message when the commit log ends in an incomplete line.

* bl/trailers-and-incomplete-last-line-fix:
  interpret-trailers: handle message without trailing newline
2024-09-13 15:27:45 -07:00
bf42b23901 Merge branch 'rj/cygwin-has-dev-tty'
Cygwin does have /dev/tty support that is needed by things like
single-key input mode.

* rj/cygwin-has-dev-tty:
  config.mak.uname: add HAVE_DEV_TTY to cygwin config section
2024-09-13 15:27:44 -07:00
41390eb3e6 Merge branch 'rs/diff-exit-code-fix'
In a few corner cases "git diff --exit-code" failed to report
"changes" (e.g., renamed without any content change), which has
been corrected.

* rs/diff-exit-code-fix:
  diff: report dirty submodules as changes in builtin_diff()
  diff: report copies and renames as changes in run_diff_cmd()
2024-09-13 15:27:43 -07:00
da1c402a47 Merge branch 'jc/doc-skip-fetch-all-and-prefetch'
Doc updates.

* jc/doc-skip-fetch-all-and-prefetch:
  doc: remote.*.skip{DefaultUpdate,FetchAll} stops prefetch
2024-09-13 15:27:43 -07:00
19de221f36 Merge branch 'ds/doc-wholesale-disabling-advice-messages'
The environment GIT_ADVICE has been intentionally kept undocumented
to discourage its use by interactive users.  Add documentation to
help tool writers.

* ds/doc-wholesale-disabling-advice-messages:
  advice: recommend GIT_ADVICE=0 for tools
2024-09-13 15:27:43 -07:00
17ae0b8249 Merge branch 'jk/sparse-fdleak-fix'
A file descriptor left open is now properly closed when "git
sparse-checkout" updates the sparse patterns.

* jk/sparse-fdleak-fix:
  sparse-checkout: use fdopen_lock_file() instead of xfdopen()
  sparse-checkout: check commit_lock_file when writing patterns
  sparse-checkout: consolidate cleanup when writing patterns
2024-09-13 15:27:43 -07:00
0299251319 Merge branch 'ds/scalar-no-tags'
The "scalar clone" command learned the "--no-tags" option.

* ds/scalar-no-tags:
  scalar: add --no-tags option to 'scalar clone'
2024-09-13 15:27:42 -07:00
a731929aa8 Git 2.46.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 15:26:52 -07:00
8ef5549b06 Merge branch 'rj/compat-terminal-unused-fix' into maint-2.46
Build fix.

* rj/compat-terminal-unused-fix:
  compat/terminal: mark parameter of git_terminal_prompt() UNUSED
2024-09-13 15:26:52 -07:00
8b4bb65a8f Merge branch 'jc/config-doc-update' into maint-2.46
Docfix.

* jc/config-doc-update:
  git-config.1: fix description of --regexp in synopsis
  git-config.1: --get-all description update
2024-09-13 15:26:52 -07:00
d3d7c8dfb8 Merge branch 'aa/cat-file-batch-output-doc' into maint-2.46
Docfix.

* aa/cat-file-batch-output-doc:
  docs: explain the order of output in the batched mode of git-cat-file(1)
2024-09-13 15:26:52 -07:00
118c74d143 Merge branch 'cl/config-regexp-docfix' into maint-2.46
Docfix.

* cl/config-regexp-docfix:
  doc: replace 3 dash with correct 2 dash in git-config(1)
2024-09-13 15:26:51 -07:00
bb57f055ae Merge branch 'jc/coding-style-c-operator-with-spaces' into maint-2.46
Write down whitespacing rules around C opeators.

* jc/coding-style-c-operator-with-spaces:
  CodingGuidelines: spaces around C operators
2024-09-13 15:26:51 -07:00
480124470c Merge branch 'ps/stash-keep-untrack-empty-fix' into maint-2.46
A corner case bug in "git stash" was fixed.

* ps/stash-keep-untrack-empty-fix:
  builtin/stash: fix `--keep-index --include-untracked` with empty HEAD
2024-09-13 15:26:51 -07:00
be344f3631 Merge branch 'ps/index-pack-outside-repo-fix' into maint-2.46
"git verify-pack" and "git index-pack" started dying outside a
repository, which has been corrected.

* ps/index-pack-outside-repo-fix:
  builtin/index-pack: fix segfaults when running outside of a repo
2024-09-13 15:26:50 -07:00
bc79932048 Merge branch 'jk/free-commit-buffer-of-skipped-commits' into maint-2.46
The code forgot to discard unnecessary in-core commit buffer data
for commits that "git log --skip=<number>" traversed but omitted
from the output, which has been corrected.

* jk/free-commit-buffer-of-skipped-commits:
  revision: free commit buffers for skipped commits
2024-09-13 15:26:49 -07:00
836474560b add: pass in repo variable instead of global the_repository
With the repository variable available in the builtin function as an
argument, pass this down into helper functions instead of using the
global the_repository.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 14:33:30 -07:00
49d2434664 builtin: remove USE_THE_REPOSITORY for those without the_repository
For builtins that do not operate on a repository, remove
the #define USE_THE_REPOSITORY.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 14:33:30 -07:00
03eae9afb4 builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every
builtin, remove it from builtin.h and add it to all the builtins that
include builtin.h (by definition, that means all builtins/*.c).

Also, remove the include statement for repository.h since it gets
brought in through builtin.h.

The next step will be to migrate each builtin
from having to use the_repository.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 14:32:24 -07:00
9b1cb5070f builtin: add a repository parameter for builtin functions
In order to reduce the usage of the global the_repository, add a
parameter to builtin functions that will get passed a repository
variable.

This commit uses UNUSED on most of the builtin functions, as subsequent
commits will modify the actual builtins to pass the repository parameter
down.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 14:27:08 -07:00
e1e0d305c4 t5512.40 sometimes dies by SIGPIPE
The last test in t5512 we recently added seems to be flaky.
Running

    $ make && cd t && sh ./t5512-ls-remote.sh --stress

shows that "git ls-remote foo::bar" exited with status 141, which
means we got a SIGPIPE.  This test piece was introduced by 9e89dcb6
(builtin/ls-remote: fall back to SHA1 outside of a repo, 2024-08-02)
and is pretty much independent from all other tests in the script
(it can even run standalone with everything before it removed).

The transport-helper.c:get_helper() function tries to write to the
helper.  As we can see the helper script is very short and can exit
even before it reads anything, when get_helper() tries to give the
first command, "capabilities", the helper may already be gone.

A trivial fix, presented here, is to make sure that the helper reads
the first command it is given, as what it writes later is a response
to that command.

I however would wonder if the interactions with the helper initiated
by get_helper() should be done on a non-blocking I/O (we do check
the return value from our write(2) system calls, do we?).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 12:45:55 -07:00
d3edb0bdde Git.pm: use "rev-parse --absolute-git-dir" rather than perl code
When we open a repository with the "Directory" option, we use "rev-parse
--git-dir" to get the path relative to that directory, and then use
Cwd::abs_path() to make it absolute (since our process working directory
may not be the same).

These days we can just ask for "--absolute-git-dir" instead, which saves
us a little code. That option was added in Git v2.13.0 via a2f5a87626
(rev-parse: add '--absolute-git-dir' option, 2017-02-03). I don't think
we make any promises about running mismatched versions of git and
Git.pm, but even if somebody tries it, that's sufficiently old that it
should be OK.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 10:42:22 -07:00
e4b353d0a1 Git.pm: fix bare repository search with Directory option
When opening a bare repository like:

  Git->repository(Directory => '/path/to/bare.git');

we will incorrectly point the repository object at the _current_
directory, not the one specified by the option.

The bug was introduced by 20da61f25f (Git.pm: trust rev-parse to find
bare repositories, 2022-10-22). Before then, we'd ask "rev-parse
--git-dir" if it was a Git repo, and if it returned anything, we'd
correctly convert that result to an absolute path using File::Spec and
Cwd::abs_path(). If it didn't, we'd guess it might be a bare repository
and find it ourselves, which was wrong (rev-parse should find even a
bare repo, and our search circumvented some of its rules).

That commit dropped most of the custom bare-repo search code in favor of
using "rev-parse --is-bare-repository" and trusting the "--git-dir" it
returned. But it mistakenly left some of the bare-repo code path in
place, which was now broken. That code calls Cwd::abs_path($dir); prior
to 20da61f25f $dir contained the "Directory" option the user passed in.
But afterwards, it contains the output of "rev-parse --git-dir". And
since our tentative rev-parse command is invoked after changing
directory, it will always be the relative path "."! So we'll end up with
the absolute path of the process's current directory, not the Directory
option the caller asked for.

So the non-bare case is correct, but the bare one is broken. Our tests
only check the non-bare one, so we didn't notice. We can fix this by
running the same absolute-path fixup code for both sides.

Helped-by: Rodrigo <rodrigolive@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 10:42:19 -07:00
7cd8f1cc6e ci: add Ubuntu 16.04 job to GitLab CI
In the preceding commits we had to convert the linux32 job to be based
on Ubuntu 20.04 instead of Ubuntu 16.04 due to a limitation in GitHub
Workflows. This was the only job left that still tested against this old
but supported Ubuntu version, and we have no other jobs that test with a
comparatively old Linux distribution.

Add a new job to GitLab CI that tests with Ubuntu 16.04 to cover the
resulting test gap. GitLab doesn't modify Docker images in the same way
GitHub does and thus doesn't fall prey to the same issue. There are two
compatibility issues uncovered by this:

  - Ubuntu 16.04 does not support HTTP/2 in Apache. We thus cannot set
    `GIT_TEST_HTTPD=true`, which would otherwise cause us to fail when
    Apache fails to start.

  - Ubuntu 16.04 cannot use recent JGit versions as they depend on a
    more recent Java runtime than we have available. We thus disable
    installing any kind of optional dependencies that do not come from
    the package manager.

These two restrictions are fine though, as we only really care about
whether Git compiles and runs on such old distributions in the first
place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 09:02:30 -07:00
44dc651132 unicode: update the width tables to Unicode 16
Unicode 16 has been announced on 2024-09-10 [0], so update the character
width tables to the new version.

[0] https://blog.unicode.org/2024/09/announcing-unicode-standard-version-160.html

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 14:20:44 -07:00
22ef5f02a8 t/interop: allow per-version make options
Building older versions of Git may require tweaking some build knobs. In
particular, very old versions of Git will fail to build with recent
OpenSSL, because the bignum type switched from a struct to a pointer.

The i5500 interop test uses Git v1.0.0 by default, which triggers this
problem. You can work around it by setting NO_OPENSSL in your
GIT_TEST_MAKE_OPTS variable. But there are two downsides:

  1. You have to know to do this, and it's not at all obvious.

  2. That sets the options for _all_ versions of Git that we build. And
     it's possible for two versions to require conflicting knobs. E.g.,
     building with "make NO_OPENSSL=Nope OPENSSL_SHA1=Yes" causes
     imap-send.c to barf, because it declares a fallback typedef for SSL.
     This is something we may want to fix, but of course many historical
     versions are affected, and the interop scripts should be flexible
     enough to build everything.

So let's introduce per-version make options, along with the ability for
scripts to specify knobs that match their default versions. That should
make everything build out of the box, but also allow testers flexibility
if they are testing interoperability between non-default versions.

We'll set NO_OPENSSL by default for v1.0.0 in i5500. It doesn't have to
worry about the conflict with OPENSSL_SHA1 because imap-send did not
exist back then (but if it did, it could also just explicitly use a
different hash implementation).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 13:27:36 -07:00
57974d46a4 Sync with 'maint' 2024-09-12 11:48:46 -07:00
f8ca6d0064 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 11:47:24 -07:00
f286e0a01c Merge branch 'kl/cat-file-on-sparse-index'
"git cat-file" works well with the sparse-index, and gets marked as
such.

* kl/cat-file-on-sparse-index:
  builtin/cat-file: mark 'git cat-file' sparse-index compatible
  t1092: allow run_on_* functions to use standard input
2024-09-12 11:47:24 -07:00
b64f249726 Merge branch 'jk/messages-with-excess-lf-fix'
One-line messages to "die" and other helper functions will get LF
added by these helper functions, but many existing messages had an
unnecessary LF at the end, which have been corrected.

* jk/messages-with-excess-lf-fix:
  drop trailing newline from warning/error/die messages
2024-09-12 11:47:23 -07:00
143682ec43 Merge branch 'ps/pack-refs-auto-heuristics'
"git pack-refs --auto" for the files backend was too aggressive,
which has been a bit tamed.

* ps/pack-refs-auto-heuristics:
  refs/files: use heuristic to decide whether to repack with `--auto`
  t0601: merge tests for auto-packing of refs
  wrapper: introduce `log2u()`
2024-09-12 11:47:23 -07:00
3bf057a0cd Merge branch 'tb/multi-pack-reuse-fix'
A data corruption bug when multi-pack-index is used and the same
objects are stored in multiple packfiles has been corrected.

* tb/multi-pack-reuse-fix:
  builtin/pack-objects.c: do not open-code `MAX_PACK_OBJECT_HEADER`
  pack-bitmap.c: avoid repeated `pack_pos_to_offset()` during reuse
  builtin/pack-objects.c: translate bit positions during pack-reuse
  pack-bitmap: tag bitmapped packs with their corresponding MIDX
  t/t5332-multi-pack-reuse.sh: verify pack generation with --strict
2024-09-12 11:47:23 -07:00
04595eb407 Merge branch 'gt/unit-test-oid-array'
Another unit-test.

* gt/unit-test-oid-array:
  t: port helper/test-oid-array.c to unit-tests/t-oid-array.c
2024-09-12 11:47:23 -07:00
63b5fcdde9 Merge branch 'ps/index-pack-outside-repo-fix'
"git verify-pack" and "git index-pack" started dying outside a
repository, which has been corrected.

* ps/index-pack-outside-repo-fix:
  builtin/index-pack: fix segfaults when running outside of a repo
2024-09-12 11:47:22 -07:00
3265304f94 Merge branch 'jc/mailinfo-header-cleanup'
Code clean-up.

* jc/mailinfo-header-cleanup:
  mailinfo: we parse fixed headers
2024-09-12 11:47:22 -07:00
6074a7d4ae Another batch of topics for 2.46.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 11:09:46 -07:00
d2b936f1dc Merge branch 'jc/grammo-fixes' into maint-2.46
Doc updates.

* jc/grammo-fixes:
  doc: grammofix in git-diff-tree
  tutorial: grammofix
2024-09-12 11:02:19 -07:00
1b9a1246ef Merge branch 'jc/tests-no-useless-tee' into maint-2.46
Test fixes.

* jc/tests-no-useless-tee:
  tests: drop use of 'tee' that hides exit status
2024-09-12 11:02:18 -07:00
9e2cb073ec Merge branch 'jc/how-to-maintain-updates' into maint-2.46
Doc updates.

* jc/how-to-maintain-updates:
  howto-maintain: mention preformatted docs
2024-09-12 11:02:17 -07:00
b4e826a720 Merge branch 'ps/bundle-outside-repo-fix' into maint-2.46
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.

* ps/bundle-outside-repo-fix:
  bundle: default to SHA1 when reading bundle headers
  builtin/bundle: have unbundle check for repo before opening its bundle
2024-09-12 11:02:16 -07:00
41c952ebac Merge branch 'jc/patch-id' into maint-2.46
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.
cf. <Zqh2T_2RLt0SeKF7@tanuki>

* jc/patch-id:
  patch-id: tighten code to detect the patch header
  patch-id: rewrite code that detects the beginning of a patch
  patch-id: make get_one_patchid() more extensible
  patch-id: call flush_current_id() only when needed
  t4204: patch-id supports various input format
2024-09-12 11:02:16 -07:00
712d970c01 Merge branch 'jk/apply-patch-mode-check-fix' into maint-2.46
Test fix.

* jk/apply-patch-mode-check-fix:
  t4129: fix racy index when calling chmod after git-add
  apply: canonicalize modes read from patches
2024-09-12 11:02:15 -07:00
997950a750 imap-send: handle NO_OPENSSL even when openssl exists
If NO_OPENSSL is defined, then imap-send.c defines a fallback "SSL"
type, which is just a void pointer that remains NULL. This works, but it
has one problem: it is using the type name "SSL", which conflicts with
the upstream name, if some other part of the system happens to include
openssl. For example:

  $ make NO_OPENSSL=Nope OPENSSL_SHA1=Yes imap-send.o
      CC imap-send.o
  imap-send.c:35:15: error: conflicting types for ‘SSL’; have ‘void *’
     35 | typedef void *SSL;
        |               ^~~
  In file included from /usr/include/openssl/evp.h:26,
                   from sha1/openssl.h:4,
                   from hash.h:10,
                   from object.h:4,
                   from commit.h:4,
                   from refs.h:4,
                   from setup.h:4,
                   from imap-send.c:32:
  /usr/include/openssl/types.h:187:23: note: previous declaration of ‘SSL’ with type ‘SSL’ {aka ‘struct ssl_st’}
    187 | typedef struct ssl_st SSL;
        |                       ^~~
  make: *** [Makefile:2761: imap-send.o] Error 1

This is not a terribly common combination in practice:

  1. Why are we disabling openssl support but still using its sha1? The
     answer is that you may use the same build options across many
     versions, and some older versions of Git no longer build with
     modern versions of openssl.

  2. Why are we using a totally unsafe sha1 that does not detect
     collisions? You're right, we shouldn't. But in preparation for
     using unsafe sha1 for non-cryptographic checksums, it would be nice
     to be able to turn it on without hassle.

We can make this work by adjusting the way imap-send handles its
fallback. One solution is something like this:

  #ifdef NO_OPENSSL
  #define git_SSL void *
  #else
  #define git_SSL SSL
  #endif

But we can observe that we only need this definition in one spot: the
struct which holds the variable. So rather than play around with macros
that may cause unexpected effects, we can just directly use the correct
type in that struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:24:51 -07:00
9c261856c9 ci: use regular action versions for linux32 job
The linux32 job runs inside a docker container with a 32-bit libc, etc.
This breaks any GitHub Actions scripts that are implemented in
javascript, because they ship with their own 64-bit version of Node.js
that's dynamically linked. They'll fail with a message like:

    exec /__e/node20/bin/node: no such file or directory

because they can't find the runtime linker.

This hasn't been a problem until recently because we special-case older,
non-javascript versions of these actions for the linux32 job. But it
recently became an issue when our old version of actions/upload-artifact
was deprecated, causing the job to fail. We worked around that in
90f2c7240c (ci: remove 'Upload failed tests' directories' step from
linux32 jobs, 2024-09-09), but it meant a loss of functionality for that
job. And we may eventually run into the same deprecation problem with
actions/checkout, which can't just be removed.

We can solve the linking issue by installing the 64-bit libc and stdc++
packages before doing anything else. Coupled with the switch to a more
recent image in the previous patch, that lets us remove the
special-casing of the action scripts entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:21:10 -07:00
9ce2e99c7d ci: use more recent linux32 image
The Xenial image we're using was released more than 8 years ago. This is
a problem for using some recent GitHub Actions scripts, as they require
Node.js 20, and all of the binaries they ship need glibc 2.28 or later.
We're not using them yet, but moving forward prepares us for a future
patch which will.

Xenial was actually the last official 32-bit Ubuntu release, but you can
still find i386 images for more recent releases. This patch uses Focal,
which was released in 2020 (and is the oldest one with glibc 2.28).

There are two small downsides here:

  - while Xenial is pretty old, it is still in LTS support until April
    2026. So there's probably some value in testing with such an old
    system, and we're losing that.

  - there are no i386 subversion packages in the Focal repository. So we
    won't be able to test that (OTOH, we had never tested it until the
    previous patch which unified the 32/64-bit dependency code).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:21:10 -07:00
e24a7bc7f0 ci: unify ubuntu and ubuntu32 dependencies
The script to install dependencies has two separate entries for 32-bit
and 64-bit Ubuntu systems. This increases the maintenance burden since
both should need roughly the same packages.

That hasn't been too bad so far because we've stayed on the same 32-bit
image since 2017. Trying to move to a newer image revealed several
problems with the linux32 job:

  - newer images complain about using "linux32 --32bit i386", due to
    seccomp restrictions. We can loosen these with a docker option, but
    I don't think running it is even doing anything. We use it only for
    pretending to "apt" that we're on a 32-bit machine, but inside the
    container image apt is already configured as a 32-bit system (even
    though the kernel outside the container is obviously 64-bit).  Using
    the same apt invocation for both architectures just gets rid of this
    call entirely.

  - we set DEBIAN_FRONTEND to avoid hanging on packages that ask the
    user questions. This wasn't a problem on the old image, but it is on
    newer ones. The 64-bit stanza handles this already.

    As a bonus, the 64-bit stanza uses "apt -q" instead of redirecting
    output to /dev/null. This would have saved me a lot of debugging
    time trying to figure out why it was hanging. :)

  - the old image seems to have zlib-dev installed by default, but newer
    ones do not.

In addition, there were probably many tests being skipped on the 32-bit
build because we didn't have support packages installed (e.g., gpg). Now
we'll run them.

We do need to keep some parts split off just for 64-bit systems: our p4
and lfs installs reference x86_64/amd64 binaries. The downloaded jgit
should work in theory, since it's just a jar file embedded in a shell
script that relies on the system java. But the system java in our image
is too old, so I've left it as 64-bit only for now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:21:10 -07:00
48c55943c5 ci: drop run-docker scripts
We haven't used these scripts since 4a6e4b9602 (CI: remove Travis CI
support, 2021-11-23), as the GitHub Actions config has support for
directly running jobs within docker containers.

It's possible we might want to resurrect something like this in order to
be more agnostic to the CI platform. But it's not clear exactly what it
would look like. And in the meantime, it's just a maintenance burden as
we make changes to CI config, and is subject to bitrot. In fact it's
already broken; it references ci/install-docker-dependencies.sh, which
went away in 9cdeb34b96 (ci: merge scripts which install dependencies,
2024-04-12).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:21:10 -07:00
1e7e4a111f environment: stop storing "core.notesRef" globally
Stop storing the "core.notesRef" config value globally. Instead,
retrieve the value in `default_notes_ref()`. The code is never called in
a hot loop anyway, so doing this on every invocation should be perfectly
fine.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:44 -07:00
11dbb4ace3 environment: stop storing "core.warnAmbiguousRefs" globally
Same as the preceding commits, storing the "core.warnAmbiguousRefs"
value globally is misdesigned as this setting may be set per repository.

Move the logic into the repo-settings subsystem. The usual pattern here
is that users are expected to call `prepare_repo_settings()` before they
access the settings themselves. This seems somewhat fragile though, as
it is easy to miss and leads to somewhat ugly code patterns at the call
sites.

Instead, introduce a new function that encapsulates this logic for us.
This also allows us to change how exactly the lazy initialization works
in the future, e.g. by only partially initializing values as requested
by the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:44 -07:00
8e2e8a33f3 environment: stop storing "core.preferSymlinkRefs" globally
Same as the preceding commit, storing the "core.preferSymlinkRefs" value
globally is misdesigned as this setting may be set per repository.

There is only a single user of this value anyway, namely the "files"
backend. So let's just remove the global variable and read the value of
this setting when initializing the backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:43 -07:00
eafb126456 environment: stop storing "core.logAllRefUpdates" globally
The value of "core.logAllRefUpdates" is being stored in the global
variable `log_all_ref_updates`. This design is somewhat aged nowadays,
where it is entirely possible to access multiple repositories in the
same process which all have different values for this setting. So using
a single global variable to track it is plain wrong.

Remove the global variable. Instead, we now provide a new function part
of the repo-settings subsystem that parses the value for a specific
repository. While that may require us to read the value multiple times,
we work around this by reading it once when the ref backends are set up
and caching the value there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:43 -07:00
9a20b889e8 refs: stop modifying global log_all_ref_updates variable
In refs-related code we modify the global `log_all_ref_updates`
variable, which is done because `should_autocreate_reflog()` does not
accept passing an `enum log_refs_config` but instead accesses the global
variable. Adapt its interface such that the value is provided by the
caller, which allows us to compute the proper value locally without
having to modify global state.

This change requires us to move the enum to "repo-settings.h", or
otherwise we get compilation errors due to include cycles. We're about
to fully move this setting into the repo-settings subsystem anyway, so
this is fine.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:43 -07:00
118fd1a26d branch: stop modifying log_all_ref_updates variable
In "branch.c" we modify the global `log_all_ref_updates` variable to
force creation of a reflog entry. Modifying global state like this is
discouraged, as it may have all kinds of consequences in other places of
our codebase.

Stop modifying the variable and pass the `REF_FORCE_CREATE_REFLOG` flag
instead. Setting this flag has a stronger meaning than setting the
config to `LOG_REFS_NORMAL`:

  - `LOG_REFS_NORMAL` will ask us to only create reflog entries for
    preexisting reflogs or branches, remote refs, note refs and HEAD.

  - `REF_FORCE_CREATE_REFLOG` will unconditionally create a reflog and
    is thus equivalent to `LOG_REFS_ALWAYS`.

But as we are in `create_branch()` and thus do not have to worry about
arbitrary references, but only about branches, `LOG_REFS_NORMAL` and
`LOG_REFS_ALWAYS` are indeed equivalent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:43 -07:00
f1d3d07900 repo-settings: track defaults close to struct repo_settings
The default values for `struct repo_settings` are set up in
`prepare_repo_settings()`. This is somewhat different from how we
typically do this, namely by providing an `INIT` macro that sets up the
default values for us.

Refactor the code to do the same.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:42 -07:00
a0d09c56ba repo-settings: split out declarations into a standalone header
While we have "repo-settings.c", we do not have a corresponding
"repo-settings.h" file. Instead, this functionality is part of the
"repository.h" header, making it hard to discover.

Split the declarations out of "repository.h" and create a standalone
header file with them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:42 -07:00
673af418d0 environment: guard state depending on a repository
In "environment.h" we have quite a lot of functions and variables that
either explicitly or implicitly depend on `the_repository`.

The implicit set of stateful declarations includes for example variables
which get populated when parsing a repository's Git configuration. This
set of variables is broken by design, as their state often depends on
the last repository config that has been parsed. So they may or may not
represent the state of `the_repository`.

Fixing that is quite a big undertaking, and later patches in this series
will demonstrate a solution for a first small set of those variables. So
for now, let's guard these with `USE_THE_REPOSITORY_VARIABLE` so that
callers are aware of the implicit dependency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:42 -07:00
f2d70847bd environment: reorder header to split out the_repository-free section
Reorder the "environment.h" header such that declarations which are free
from `the_repository` come before those which aren't. The new structure
is now:

    - Defines for environment variable names.

    - Things which do not rely on a repository.

    - Things which do, including those that implicitly rely on a parsed
      repository. This includes for example variables which get
      populated when reading repository config.

This will allow us to guard the last category of declarations with
`USE_THE_REPOSITORY_VARIABLE`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:42 -07:00
a52beae3a3 environment: move set_git_dir() and related into setup layer
The functions `set_git_dir()` and friends are used to set up
repositories. As such, they are quite clearly part of the setup
subsystem, but still live in "environment.c". Move them over, which also
helps to get rid of dependencies on `the_repository` in the environment
subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:41 -07:00
c22d183b01 environment: make get_git_namespace() self-contained
The logic to set up and retrieve `git_namespace` is distributed across
different functions which communicate with each other via a global
environment variable. This is rather pointless though, as the value is
always derived from an environment variable, and this environment
variable does not change after we have parsed global options.

Convert the function to be fully self-contained such that it lazily
populates once called.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:41 -07:00
26b4df907b environment: move object database functions into object layer
The `odb_mkstemp()` and `odb_pack_keep()` functions are quite clearly
tied to the object store, but regardless of that they are located in
"environment.c". Move them over, which also helps to get rid of
dependencies on `the_repository` in the environment subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
b92266b79c config: make dependency on repo in read_early_config() explicit
The `read_early_config()` function can be used to read configuration
where a repository has not yet been set up. As such, it is optional
whether or not `the_repository` has already been initialized. If it was
initialized we use its commondir and gitdir. If not, the function will
try to detect the Git directories by itself and, if found, also parse
their config files.

This means that we implicitly rely on `the_repository`. Make this
dependency explicit by passing a `struct repository`. This allows us to
again drop the `USE_THE_REPOSITORY_VARIABLE` define in "config.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
c0b03e8b6d config: document read_early_config() and read_very_early_config()
It's not clear what `read_early_config()` and `read_very_early_config()`
do differently compared to `repo_read_config()` from just looking at
their names. Document both of these in the header file to clarify their
intent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
edc2c92624 environment: make get_git_work_tree() accept a repository
The `get_git_work_tree()` function retrieves the path of the work tree
of `the_repository`. Make it accept a `struct repository` such that it
can work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
14c90ac088 environment: make get_graft_file() accept a repository
The `get_graft_file()` function retrieves the path to the graft file of
`the_repository`. Make it accept a `struct repository` such that it can
work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
1dc4ec2102 environment: make get_index_file() accept a repository
The `get_index_file()` function retrieves the path to the index file
of `the_repository`. Make it accept a `struct repository` such that it
can work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
a3673f4898 environment: make get_object_directory() accept a repository
The `get_object_directory()` function retrieves the path to the object
directory for `the_repository`. Make it accept a `struct repository`
such that it can work on arbitrary repositories and make it part of the
repository subsystem. This reduces our reliance on `the_repository` and
clarifies scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
661624a4f6 environment: make get_git_common_dir() accept a repository
The `get_git_common_dir()` function retrieves the path to the common
directory for `the_repository`. Make it accept a `struct repository`
such that it can work on arbitrary repositories and make it part of the
repository subsystem. This reduces our reliance on `the_repository` and
clarifies scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
246deeac95 environment: make get_git_dir() accept a repository
The `get_git_dir()` function retrieves the path to the Git directory for
`the_repository`. Make it accept a `struct repository` such that it can
work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
86b93bddeb t0211: add missing LIBCURL prereq
After building Git with NO_LIBCURL, we're lacking `git remote-http` and
`git http-fetch`, so when we test that they trace as they should, we're
bound to fail. Add the LIBCURL prereq to those tests.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-11 08:38:08 -07:00
dc542fcd6b t1517: add missing LIBCURL prereq
After building Git with NO_LIBCURL, there is no `git remote-http`, so
it's not meaningful to test that it can run outside of a repository.
Indeed, that test will fail. Add the LIBCURL prereq to it.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-11 08:38:07 -07:00
c5ee8f2d1c The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-10 13:16:43 -07:00
2e0808ca0e Merge branch 'sp/mailmap'
Update to a mailmap entry.

* sp/mailmap:
  .mailmap document current address.
2024-09-10 13:16:43 -07:00
48642ec7ab Merge branch 'ps/declare-pack-redundamt-dead'
"git pack-redundant" has been marked for removal in Git 3.0.

* ps/declare-pack-redundamt-dead:
  Documentation/BreakingChanges: announce removal of git-pack-redundant(1)
2024-09-10 13:16:43 -07:00
d1ea0f70cb Merge branch 'ah/mergetols-vscode'
"git mergetool" learned to use VSCode as a merge backend.

* ah/mergetols-vscode:
  mergetools: vscode: new tool
2024-09-10 13:16:42 -07:00
f4806a9a3e Merge branch 'rj/compat-terminal-unused-fix'
Build fix.

* rj/compat-terminal-unused-fix:
  compat/terminal: mark parameter of git_terminal_prompt() UNUSED
2024-09-10 13:16:42 -07:00
a6dce0afc3 Merge branch 'jk/free-commit-buffer-of-skipped-commits'
The code forgot to discard unnecessary in-core commit buffer data
for commits that "git log --skip=<number>" traversed but omitted
from the output, which has been corrected.

* jk/free-commit-buffer-of-skipped-commits:
  revision: free commit buffers for skipped commits
2024-09-10 13:16:41 -07:00
c3de556a84 Makefile: rename clar-related variables to avoid confusion
The Makefile variables related to the recently-introduced clar testing
framework have a `UNIT_TESTS_` prefix. This prefix is extremely similar
to the prefix used by our other unit tests that use our homegrown unit
testing framework, which is `UNIT_TEST_`. The consequence is that it is
easy to misread the names and confuse them with each other.

Rename the clar-related variables to instead have a `CLAR_TEST_` prefix
to address this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-10 10:27:27 -07:00
a13ff41963 chainlint: reduce annotation noise-factor
When chainlint detects a problem in a test definition, it highlights the
offending code with a "?!...?!" annotation. The rather curious "?!"
decoration was chosen to draw the reader's attention to the problem area
and to act as a good "needle" when using the terminal's search feature
to "jump" to the next problem.

Later, chainlint learned to color its output when sent to a terminal.
Problem annotations are colored with a red background which stands out
well from surrounding text, thus easily draws the reader's attention.
Together with the preceding change which gave all problem annotations a
uniform "LINT:" prefix, the noisy "?!" decoration has become superfluous
as a search "needle" so omit it when output is colored.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-10 10:01:40 -07:00
e44f15ba3e chainlint: make error messages self-explanatory
The annotations emitted by chainlint to indicate detected problems are
overly terse, so much so that developers new to the project -- those who
should most benefit from the linting -- may find them baffling. For
instance, although the author of chainlint and seasoned Git developers
may understand that "?!AMP?!" is an abbreviation of "ampersand" and
indicates a break in the &&-chain, this may not be obvious to newcomers.

The "?!LOOP?!" case is particularly serious because that terse single
word does nothing to convey that the loop body should end with
"|| return 1" (or "|| exit 1" in a subshell) to ensure that a failing
command in the body aborts the loop immediately. Moreover, unlike
&&-chaining which is ubiquitous in Git tests, the "|| return 1" idiom is
relatively infrequent, thus may be harder for a newcomer to discover by
consulting nearby code.

Address these shortcomings by emitting human-readable messages which
both explain the problem and give a strong hint about how to correct it.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-10 10:01:40 -07:00
588ef84ece chainlint: don't be fooled by "?!...?!" in test body
As originally implemented, chainlint did not collect structured
information about detected problems. Instead, it merely emitted raw
parse tokens (not the original test text), along with a "?!...?!"
annotation directly into the output stream each time a problem was
discovered. In order to report statistics (in --stats mode) and to
adjust its exit code to indicate success or failure, it merely counts
the number of times "?!...?!" appears in the output stream. An obvious
shortcoming of this approach is that it can be fooled by a legitimate
"?!...?!" sequence in the body of a test (though, only if an actual
problem is detected in the test).

The situation did not improve when 7c04aa7390 (chainlint: colorize
problem annotations and test delimiters, 2022-09-13) colored the
annotations after-the-fact by searching for "?!...?!" in the output
stream and inserting color codes. As above, a shortcoming is that this
approach can incorrectly color a legitimate "?!...?!" sequence in a test
body as if it is an error.

However, when 73c768dae9 (chainlint: annotate original test definition
rather than token stream, 2022-11-08) taught chainlint to output the
original test text verbatim, it started collecting structured
information about detected problems.

Now that it is available, take advantage of the structured problem
information to deterministically count the number of problems detected
and to color the annotations directly, rather than scanning the output
stream for "?!...?!" and performing these operations after-the-fact.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-10 10:01:40 -07:00
04d9744f83 ref-filter: fix leak with unterminated %(if) atoms
When parsing `%(if)` atoms we expect a few other atoms to exist to
complete it, like `%(then)` and `%(end)`. Whether or not we have seen
these other atoms is tracked in an allocated `if_then_else` structure,
which gets free'd by the `if_then_else_handler()` once we have parsed
the complete conditional expression.

This results in a memory leak when the `%(if)` atom is not terminated
correctly and thus incomplete. We never end up executing its handler and
thus don't end up freeing the structure.

Plug this memory leak by introducing a new `at_end_data_free` callback
function. If set, we'll execute it in `pop_stack_element()` and pass it
the `at_end_data` variable with the intent to free its state. Wire it up
for the `%(if)` atom accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-10 09:26:13 -07:00
db629c61f0 ref-filter: add ref_format_clear() function
After using the ref-filter API, callers should use ref_filter_clear() to
free any used memory. However, there's not a matching function to clear
the ref_format struct.

Traditionally this did not need to be cleaned up, as it was just a way
for the caller to store and pass format options as a single unit. Even
though the parsing step of some placeholders may allocate data, that's
usually inside their "used_atom" structs, which are part of the
ref_filter itself.

But a few placeholders keep data outside of there. The %(ahead-behind)
and %(is-base) parsers both keep a master list of bases, because they
perform a single filtering pass outside of the use of any particular
atom. And since the format parser does not have access to the ref_filter
struct, they store their cross-atom data in the ref_format struct
itself.

And thus when they are finished, the ref_format also needs to be cleaned
up. So let's add a function to do so, and call it from all of the users
of the ref-filter API.

The %(is-base) case is found by running LSan on t6300. After this patch,
the script can now be marked leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:11 -07:00
f046127b66 ref-filter: fix leak when formatting %(push:remoteref)
When we expand the %(upstream) or %(push) placeholders, we rely on
remote.c's remote_ref_for_branch() to fill in the ":refname" argument.
But that function has confusing memory ownership semantics: it may or
may not return an allocated string, depending on whether we are in
"upstream" mode or "push" mode. The caller in ref-filter.c always
duplicates the result, meaning that we leak the original in the case of
%(push:refname).

To solve this, let's make the return value from remote_ref_for_branch()
consistent, by always returning an allocated pointer. Note that the
switch to returning a non-const pointer has a ripple effect inside the
function, too. We were storing the "dst" result as a const pointer, too,
even though it is always allocated! It is the return value from
apply_refspecs(), which is always a non-const allocated string.

And then on the caller side in ref-filter.c (and this is the only caller
at all), we just need to avoid the extra duplication when the return
value is non-NULL.

This clears up one case that LSan finds in t6300, but there are more.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:10 -07:00
ec007cde94 ref-filter: fix leak with %(describe) arguments
When we parse a %(describe) placeholder, we stuff its arguments into a
strvec, which is then detached into the used_atom struct. But later,
when ref_array_clear() frees the atom, we never free the memory.

To solve this, we just need to add the appropriate free() calls. But
it's a little awkward, since we have to free each element of the array,
in addition to the array itself. Instead, let's store the actual strvec,
which lets us do a simple strvec_clear().

This clears up one case that LSan finds in t6300, but there are more.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:10 -07:00
f6ba781903 ref-filter: fix leak of %(trailers) "argbuf"
When we parse a placeholder like "%(trailers:key=foo)", our atom parsing
function is passed just the argument string "key=foo". We duplicate this
into its own string, but never free it, causing a leak.

We do the duplication for two reasons:

  1. There's a mismatch with the pretty.c trailer-formatting code that
     we rely on. It expects to see a closing paren, like "key=foo)". So
     we duplicate the argument string with that extra character to pass
     along.

     This is probably something we could fix in the long run, but it's
     somewhat non-trivial if we want to avoid regressing error cases for
     things like "git log --format='%(trailer:oops'". So let's accept
     it as a necessity for now.

  2. The argument parser expects to store the list of "key" entries
     ("foo" in this case) in a string-list. It also stores the length of
     the string in the string-list "util" field. The original caller in
     pretty.c uses this with a "nodup" string list to avoid making extra
     copies, which creates a subtle dependency on the lifetime of the
     original format string.

     We do the same here, which creates that same dependency. So we
     can't simply free it as soon as the parsing is done.

There are two possible solutions here. The first is to hold on to the
duplicated "argbuf" string in the used_atom struct, so that it lives as
long as the string_list which references it.

But I think a less-subtle solution, and what this patch does, is to
switch to a duplicating string_list. That makes it self-contained, and
lets us free argbuf immediately. It may involve a few extra allocations,
but this parsing is something that happens once per program, not once
per output ref.

This clears up one case that LSan finds in t6300, but there are more.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:10 -07:00
e595b016fc ref-filter: store ref_trailer_buf data per-atom
The trailer API takes options via a trailer_opts struct. Some of those
options point to data structures which require extra storage. Those
structures aren't actually embedded in the options struct, but rather we
pass pointers, and the caller is responsible for managing them. This is
a little convoluted, but makes sense since some of them are not even
concrete (e.g., you can pass a filter function and a void data pointer,
but the trailer code doesn't even know what's in the pointer).

When for-each-ref, etc, parse the %(trailers) placeholder, they stuff
the extra data into a ref_trailer_buf struct. But we only hold a single
static global instance of this struct. So if a format string has
multiple %(trailer) placeholders, they'll stomp on each other: the "key"
list will end up with entries for all of them, and the separator buffers
will use the values from whichever was parsed last.

Instead, we should have a ref_trailer_buf for each instance of the
placeholder, and store it alongside the trailer_opts in the used_atom
structure.

And that's what this patch does. Note that we also have to add code to
clean them up in ref_array_clear(). The original code did not bother
cleaning them up, but it wasn't technically a "leak" since they were
still reachable from the static global instance.

Reported-by: Brooke Kuhlmann <brooke@alchemists.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:10 -07:00
a2417a03c9 ref-filter: drop useless cast in trailers_atom_parser()
There's no need to cast invalid_arg before freeing it. It is already a
non-const pointer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:09 -07:00
99448c3d78 ref-filter: strip signature when parsing tag trailers
To expand the "%(trailers)" placeholder, we have to feed the commit or
tag body to the trailer API. But that API doesn't know anything about
signatures, and will be confused by a signed tag like this:

  the subject

  the body

  Some-trailer: foo
  -----BEGIN PGP SIGNATURE-----
  ...etc...

because it will start looking for trailers after the signature, and get
stopped walking backwards by the very non-trailer signature lines. So it
thinks there are no trailers.

This problem has existed since %(trailers) was added to the ref-filter
code, but back then trailers on tags weren't something we really
considered (commits don't have the same problem because their signatures
are embedded in the header). But since 066cef7707 (builtin/tag: add
--trailer option, 2024-05-05), we'd generate an object like the above
for "git tag -s --trailer 'Some-trailer: foo' my-tag".

The implementation here is pretty simple: we just make a NUL-terminated
copy of the non-signature part of the tag (which we've already parsed)
and pass it to the trailer API. There are some alternatives I rejected,
at least for now:

  - the trailer code already understands skipping past some cruft at the
    end of a commit, such as patch dividers. see find_end_of_log_message().
    We could teach it to do the same for signatures. But since this is
    the only context where we'd want that feature, and since we've already
    parsed the object into subject/body/signature here, it seemed easier
    to just pass in the truncated message.

  - it would be nice if we could just pass in a pointer/len pair to the
    trailer API (rather than a NUL-terminated string) to avoid the extra
    copy. I think this is possible, since as noted above, the trailer
    code already has to deal with ignoring some cruft at the end of the
    input. But after an initial attempt at this, it got pretty messy, as
    we have to touch a lot of intermediate functions that are also
    called in other contexts.

    So I went for the simple and stupid thing, at least for now. I don't
    think the extra copy overhead will be all that bad. The previous
    patch noted that an extra copy seemed to cause about 1-2% slowdown
    for something simple like "%(subject)". But here we are only
    triggering it for "%(trailers)" (and only when there is a
    signature), and the trailer code is a bit allocation-heavy already.
    I couldn't measure any difference formatting "%(trailers)" on
    linux.git before and after (even though there are not even any
    trailers to find).

Reported-by: Brooke Kuhlmann <brooke@alchemists.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:09 -07:00
7291699928 ref-filter: avoid extra copies of payload/signature
When we know we're going to show the subject or body of a tag or commit,
we call find_subpos(), which returns pointers and lengths for the three
parts: subject, body, signature.

Oddly, the function finds the signature twice: once by calling
parse_signature() at the start, which copies the signature into a
separate strbuf, and then again by calling parse_signed_buffer() after
we've parsed past the subject.

This is due to 482c119186 (gpg-interface: improve interface for parsing
tags, 2021-02-11) and 88bce0e24c (ref-filter: hoist signature parsing,
2021-02-11). The idea is that in a multi-hash world, tag signatures may
appear in the header, rather than at the end of the body, in which case
we need to extract them into a separate buffer.

But parse_signature() would never find such a buffer! It only looks for
signature lines (like "-----BEGIN PGP") at the start of each line,
without any header keyword. So this code will never find anything except
the usual in-body signature.

And the extra code has two downsides:

  1. We spend time copying the payload and signature into strbufs. That
     might even be useful if we ended up with a NUL-terminated copy of
     the payload data, but we throw it away immediately. And the
     signature, since it comes at the end of the message, is already its
     own NUL-terminated buffer.

     The overhead isn't huge, but I measured a pretty consistent 1-2%
     speedup running "git for-each-ref --format='%(subject)'" with this
     patch on a clone of linux.git.

  2. The output of find_subpos() is a set of three ptr/len combinations,
     but only two of them point into the original buffer. This makes the
     interface confusing: you can't do pointer comparisons between them,
     and you have to remember to free the signature buffer. Since
     there's only one caller, it's not too bad in practice, but it did
     bite me while working on the next patch (and simplifying it will
     pave the way for that).

In the long run we might have to go back to something like this
approach, if we do have multi-hash header signatures. But I would argue
that the extra buffer should kick in only for a header signature, and be
passed out of find_subpos() separately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:09 -07:00
87fbddd57e t6300: drop newline from wrapped test title
We don't usually include newlines in test titles, because you get funny
TAP output like:

  ok 417 - show good signature with custom format
  ok 418 - show good signature with custom format
  			    with ssh
  ok 419 - signature atom with grade option and bad signature

where a TAP parser would ignore the extra line anyway, giving the wrong
title. This comes from 26c9c03f0a (ref-filter: add new "signature" atom,
2023-06-04), and I think it was probably just editor line wrapping.

I checked for other cases with:

  git grep "test_expect_success [A-Z_,]* '[^']*$"
  git grep 'test_expect_success [A-Z_,]* "[^"]*$'

but this was the only hit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:26:09 -07:00
90f2c7240c ci: remove 'Upload failed tests' directories' step from linux32 jobs
Linux32 jobs seem to be getting:

    Error: This request has been automatically failed because it uses a
    deprecated version of `actions/upload-artifact: v1`. Learn more:
    https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/

before doing anything useful.  For now, disable the step.

Ever since actions/upload-artifact@v1 got disabled, mentioning the
offending version of it seems to stop anything from happening.  At
least this should run the same build and test.

See

    https://github.com/git/git/actions/runs/10780030750/job/29894867249

for example.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 16:00:53 -07:00
d70600526e Merge branch 'cp/unit-test-reftable-stack' into ps/reftable-exclude
* cp/unit-test-reftable-stack:
  t-reftable-stack: add test for stack iterators
  t-reftable-stack: add test for non-default compaction factor
  t-reftable-stack: use reftable_ref_record_equal() to compare ref records
  t-reftable-stack: use Git's tempfile API instead of mkstemp()
  t: harmonize t-reftable-stack.c with coding guidelines
  t: move reftable/stack_test.c to the unit testing framework
2024-09-09 10:13:44 -07:00
2b14ced370 t-reftable-stack: add test for stack iterators
reftable_stack_init_ref_iterator and reftable_stack_init_log_iterator
as defined by reftable/stack.{c,h} initialize a stack iterator to
iterate over the ref and log records in a reftable stack respectively.
Since these functions are not exercised by any of the existing tests,
add a test for them.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 10:12:56 -07:00
e87952443a t-reftable-stack: add test for non-default compaction factor
In a recent codebase update (commit ae8e378430, merge branch
'ps/reftable-write-options', 2024/05/13) the geometric factor used
in auto-compaction of reftable tables was made configurable. Add
a test to verify the functionality introduced by this update.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 10:12:56 -07:00
1052280136 t-reftable-stack: use reftable_ref_record_equal() to compare ref records
In the current stack tests, ref records are compared for equality
by sometimes using the dedicated function for ref-record comparison,
reftable_ref_record_equal(), and sometimes by explicity comparing
contents of the ref records.

The latter method is undesired because there can exist unequal ref
records with some of the contents being equal. Replace the latter
instances of ref-record comparison with the former. This has the
added benefit of preserving uniformity throughout the test file.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 10:12:15 -07:00
57f583c748 apply: support --ours, --theirs, and --union for three-way merges
--ours, --theirs, and --union are already supported in `git merge-file`
for automatically resolving conflicts in favor of one version or the
other, instead of leaving conflict markers in the file. Support them in
`git apply -3` as well because the two commands do the same kind of
file-level merges.

In case in the future --ours, --theirs, and --union gain a meaning
outside of three-way-merges, they do not imply --3way but rather must be
specified alongside it.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 10:07:24 -07:00
9a36ea37ae doc: remote.*.skip{DefaultUpdate,FetchAll} stops prefetch
Back when 7cc91a2f (Add the configuration option skipFetchAll,
2009-11-09) added for the sole purpose of adding skipFetchAll as a
synonym to skipDefaultUpdate, there was no explanation about the
reason why it was needed., but these two configuration variables
mean exactly the same thing.

Also, when we taught the "prefetch" task to "git maintenance" later,
we did make it pay attention to the setting, but we forgot to
document it.

Document these variables as synonyms that collectively implements
the last-one-wins semantics, and also clarify that the prefetch task
is also controlled by this variable.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09 10:06:13 -07:00
39ba986b0e config.mak.uname: add HAVE_DEV_TTY to cygwin config section
If neither HAVE_DEV_TTY nor GIT_WINDOWS_NATIVE is set, while compiling
the 'compat/terminal.c' code, then the fallback code calls the system
getpass() function. Unfortunately, this ignores the 'echo' parameter of
the git_terminal_prompt() function, since it has no way to implement that
functionality. This results in a less than optimal user experience on
cygwin, which does not define either of those build flags.

However, cygwin does have a functional '/dev/tty', so that it can build
with HAVE_DEV_TTY and benefit from the improved user experience.

The improved git_terminal_prompt() function that comes with HAVE_DEV_TTY
is used in the git_prompt() function, which in turn is used by the
'git credential', 'git bisect' and 'git help' commands. In addition to
git_terminal_prompt(), read_key_without_echo() is likewise improved and
used by the 'git add -p' command.

While using the 'git credential fill' command, for example:

  $ printf "%s\n" protocol=https host=example.com path=git | ./git credential fill
  Username for 'https://example.com': user
  Password for 'https://user@example.com':
  protocol=https
  host=example.com
  username=user
  password=pass
  $

The 'user' name is now echoed while typing (the password isn't), where this
wasn't the case before.

When using the auto-correct feature:

  $ ./git -c help.autocorrect=prompt fred
  WARNING: You called a Git command named 'fred', which does not exist.
  Run 'grep' instead [y/N]? n
  $ ./git -c help.autocorrect=prompt fred
  WARNING: You called a Git command named 'fred', which does not exist.
  Run 'grep' instead [y/N]? y
  fatal: no pattern given
  $

The user can actually see what they are typing at the prompt. Similar
comments apply to 'git bisect':

  $ ./git bisect bad master~1
  You need to start by "git bisect start"

  Do you want me to do it for you [Y/n]? y
  status: waiting for both good and bad commits
  status: waiting for good commit(s), bad commit known
  $ ./git bisect reset
  Already on 'master-tmp'
  $

  $ ./git bisect start
  status: waiting for both good and bad commits
  $ ./git bisect bad master~1
  status: waiting for good commit(s), bad commit known
  $ ./git bisect next
  warning: bisecting only with a bad commit
  Are you sure [Y/n]? n
  $ ./git bisect reset
  Already on 'master-tmp'
  $

The read_key_without_echo() function leads to a much improved 'git add -p'
command, when the 'interactive.singleKey' configuration is set:

  $ cd ..
  $ mkdir test-git
  $ cd test-git
  $ git init -q
  $ echo foo >file
  $ git add file
  $ echo bar >file
  $ ../git/git -c interactive.singleKey=true add -p
  diff --git a/file b/file
  index 257cc56..5716ca5 100644
  --- a/file
  +++ b/file
  @@ -1 +1 @@
  -foo
  +bar
  (1/1) Stage this hunk [y,n,q,a,d,e,p,?]? y

  $

Note that, not only is the user input echoed, but that it is immediately
accepted (without having to type <return>) and the program exits with the
hunk staged (in this case) or not.

In order to reap these benefits, set the HAVE_DEV_TTY build flag in the
cygwin configuration section of config.mak.uname.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08 21:57:59 -07:00
476abc39ba t-reftable-stack: use Git's tempfile API instead of mkstemp()
Git's tempfile API defined by $GIT_DIR/tempfile.{c,h} provides
a unified interface for tempfile operations. Since reftable/stack.c
uses this API for all its tempfile needs instead of raw functions
like mkstemp(), make the ported stack test strictly use Git's
tempfile API as well.

A bigger benefit is the fact that we know to clean up the tempfile
in case the test fails because it gets registered and pruned via a
signal handler.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08 13:24:03 -07:00
e4e384f68d t: harmonize t-reftable-stack.c with coding guidelines
Harmonize the newly ported test unit-tests/t-reftable-stack.c
with the following guidelines:
- Single line 'for' statements must omit curly braces.
- Structs must be 0-initialized with '= { 0 }' instead of '= { NULL }'.
- Array sizes and indices should preferably be of type 'size_t' and
  not 'int'.
- Function pointers should be passed as 'func' and not '&func'.

While at it, remove initialization for those variables that are
re-used multiple times, like loop variables.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08 13:24:03 -07:00
15e29ea1c6 t: move reftable/stack_test.c to the unit testing framework
reftable/stack_test.c exercises the functions defined in
reftable/stack.{c, h}. Migrate reftable/stack_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests to be in-line with unit-tests'
standards.

Since some of the tests use set_test_hash() defined by
reftable/test_framework.{c, h} but these files are not
'#included' in the test file, copy this function in the
ported test file.

With the migration of stack test to the unit-tests framework,
"test-tool reftable" becomes a no-op. Hence, get rid of everything
that uses "test-tool reftable" alongside everything that is used
to implement it.

While at it, alphabetically sort the cmds[] list in
helper/test-tool.c by moving the entry for "dump-reftable".

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08 13:24:03 -07:00
11591850dd diff: report dirty submodules as changes in builtin_diff()
The diff machinery has two ways to detect changes to set the exit code:
Just comparing hashes and comparing blob contents.  The latter is needed
if certain changes have to be ignored, e.g. with --ignore-space-change
or --ignore-matching-lines.  It's enabled by the diff_options flag
diff_from_contents.

The slower mode as never considered submodules (and subrepos) as changes
with --submodule=diff or --submodule=log, which is inconsistent with
--submodule=short (the default).  Fix it.

d7b97b7185 (diff: let external diffs report that changes are
uninteresting, 2024-06-09) set diff_from_contents if external diff
programs are allowed.  This is the default e.g. for git diff, and so
that change exposed the inconsistency much more widely.

Reported-by: David Hull <david.hull@friendbuy.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08 13:21:24 -07:00
87cf96094a diff: report copies and renames as changes in run_diff_cmd()
The diff machinery has two ways to detect changes to set the exit code:
Just comparing hashes and comparing blob contents.  The latter is needed
if certain changes have to be ignored, e.g. with --ignore-space-change
or --ignore-matching-lines.  It's enabled by the diff_options flag
diff_from_contents.

The slower mode has never considered copies and renames to be changes,
which is inconsistent with the quicker one.  Fix it.  Even if we ignore
the file contents (because it's empty or contains only ignored lines),
there's still the meta data change of adding or changing a filename, so
we need to report it in the exit code.

d7b97b7185 (diff: let external diffs report that changes are
uninteresting, 2024-06-09) set diff_from_contents if external diff
programs are allowed.  This is the default e.g. for git diff, and so
that change exposed the inconsistency much more widely.

Reported-by: Jorge Luis Martinez Gomez <jol@jol.dev>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08 13:21:23 -07:00
fb2b9815a4 advice: recommend GIT_ADVICE=0 for tools
The GIT_ADVICE environment variable was added implicitly in b79deeb554
(advice: add --no-advice global option, 2024-05-03) but was not
documented. Add documentation to show that it is an option for tools
that want to disable these messages. Make note that while the
--no-advice option exists, older Git versions will fail to parse that
option. The environment variable presents a way to change the behavior
of Git versions that understand it without disrupting older versions.

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 14:15:16 -07:00
ce31b82ca9 scalar: add --no-tags option to 'scalar clone'
Some large repositories use tags to track a huge list of release
versions. While this choice is costly on the ref advertisement, it is
further wasteful for clients who do not need those tags. Allow clients
to optionally skip the tag advertisement.

This behavior is similar to that of 'git clone --no-tags' implemented in
0dab2468ee (clone: add a --no-tags option to clone without tags,
2017-04-26), including the modification of the remote.origin.tagOpt
config value to include "--no-tags".

One thing that is opposite of the 'git clone' implementation is that
this allows '--tags' as an assumed option, which can be naturally negated
with '--no-tags'. The clone command does not accept '--tags' but allows
"--no-no-tags" as the negation of its '--no-tags' option.

While testing this option, combine the test with the previously untested
'--no-src' option introduced in 4527db8ff8 (scalar: add --[no-]src
option, 2023-08-28).

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 14:13:48 -07:00
4c42d5ff28 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 10:38:52 -07:00
f1160b2700 Merge branch 'jk/maybe-unused-cleanup'
Code clean-up.

* jk/maybe-unused-cleanup:
  grep: prefer UNUSED to MAYBE_UNUSED for pcre allocators
  gc: drop MAYBE_UNUSED annotation from used parameter
2024-09-06 10:38:52 -07:00
21c66081ca Merge branch 'jc/unused-on-windows'
Fix more fallouts from -Werror=unused-parameter.

* jc/unused-on-windows:
  refs/files-backend: work around -Wunused-parameter
2024-09-06 10:38:51 -07:00
5ecd5fa58b Merge branch 'jk/unused-parameters'
Make our codebase compilable with the -Werror=unused-parameter
option.

* jk/unused-parameters:
  CodingGuidelines: mention -Wunused-parameter and UNUSED
  config.mak.dev: enable -Wunused-parameter by default
  compat: mark unused parameters in win32/mingw functions
  compat: disable -Wunused-parameter in win32/headless.c
  compat: disable -Wunused-parameter in 3rd-party code
  t-reftable-readwrite: mark unused parameter in callback function
  gc: mark unused config parameter in virtual functions
2024-09-06 10:38:50 -07:00
4476304a06 Merge branch 'jc/maybe-unused'
Developer doc updates.

* jc/maybe-unused:
  CodingGuidelines: also mention MAYBE_UNUSED
2024-09-06 10:38:50 -07:00
6dcb2db0fa Merge branch 'jk/send-email-mailmap'
"git send-email" learned "--mailmap" option to allow rewriting the
recipient addresses.

* jk/send-email-mailmap:
  send-email: add mailmap support via sendemail.mailmap and --mailmap
  check-mailmap: add options for additional mailmap sources
  check-mailmap: accept "user@host" contacts
2024-09-06 10:38:49 -07:00
66710f91ff .mailmap document current address.
Cox Communications no longer supports email and transfered accounts to
yahoo. I closed the account at yahoo since I use gmail.com.

Signed-off-by: Stephen P. Smith <ishchis2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 09:31:15 -07:00
c02414a997 interpret-trailers: handle message without trailing newline
When git-interpret-trailers is used to add a trailer to a message that
does not end in a trailing newline, the new trailer is added on the line
immediately following the message instead of as a trailer block
separated from the message by a blank line.

For example, if a message's text was exactly "The subject" with no
trailing newline present, `git interpret-trailers --trailer
my-trailer=true` will result in the following malformed commit message:

    The subject
    my-trailer: true

While it is generally expected that a commit message should end with a
newline character, git-interpret-trailers should not be returning an
invalid message in this case.

Use `strbuf_complete_line` to ensure that the message ends with a
newline character when reading the input.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 09:21:44 -07:00
a71c47825d sparse-checkout: use fdopen_lock_file() instead of xfdopen()
When updating sparse patterns, we open a lock_file to write out the new
data. The lock_file struct holds the file descriptor, but we call
fdopen() to get a stdio handle to do the actual write.

After we finish writing, we fflush() so that all of the data is on disk,
and then call commit_lock_file() which closes the descriptor. But we
never fclose() the stdio handle, leaking it.

The obvious solution seems like it would be to just call fclose(). But
when? If we do it before commit_lock_file(), then the lock_file code is
left thinking it owns the now-closed file descriptor, and will do an
extra close() on the descriptor. But if we do it before, we have the
opposite problem: the lock_file code will close the descriptor, and
fclose() will do the extra close().

We can handle this correctly by using fdopen_lock_file(). That leaves
ownership of the stdio handle with the lock_file, which knows not to
double-close it.

We do have to adjust the code a bit:

  - we have to handle errors ourselves; we can just die(), since that's
    what xfdopen() would have done (and we can even provide a more
    specific error message).

  - we no longer need to call fflush(); committing the lock-file
    auto-closes it, which will now do the flush for us. As a bonus, this
    will actually check that the flush was successful before renaming
    the file into place.

  - we can get rid of the local "fd" variable, since we never look at it
    ourselves now

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 08:02:26 -07:00
19ace71de0 sparse-checkout: check commit_lock_file when writing patterns
When writing a new "sparse-checkout" file, we do the usual strategy of
writing to a lockfile and committing it into place. But we don't check
the outcome of commit_lock_file(). Failing there would prevent us from
writing a bogus file (good), but we would ignore the error and return a
successful exit code (bad).

Fix this by calling die(). Note that we need to keep the sparse_filename
variable valid for longer, since the filename stored in the lock_file
struct will be dropped when we run commit_lock_file().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 08:02:26 -07:00
d39cc7185e sparse-checkout: consolidate cleanup when writing patterns
In write_patterns_and_update(), we always need to free the pattern list
before exiting the function.  Rather than handling it manually when we
return early, we can jump to an "out" label where cleanup happens. This
let us drop one line, but also establishes a pattern we can use for
other cleanup.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-06 08:02:26 -07:00
1a60f2066a drop trailing newline from warning/error/die messages
Our error reporting routines append a trailing newline, and the strings
we pass to them should not include them (otherwise we get an extra blank
line after the message).

These cases were all found by looking at the results of:

  git grep -P '[^_](error|error_errno|warning|die|die_errno)\(.*\\n"[,)]' '*.c'

Note that we _do_ sometimes include a newline in the middle of such
messages, to create multiline output (hence our grep matching "," or ")"
after we see the newline, so we know we're at the end of the string).

It's possible that one or more of these cases could intentionally be
including a blank line at the end, but having looked at them all
manually, I think these are all just mistakes.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 09:07:12 -07:00
46f6ca2a68 builtin/repack: fix leaking keep-pack list
The list of packs to keep is populated via a command line option but
never free'd. Plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:13 -07:00
ed78f048ae merge-ort: fix two leaks when handling directory rename modifications
There are two leaks in `apply_directory_rename_modifications()`:

  - We do not release the `dirs_to_insert` string list.

  - We do not release some `conflict_info` we put into the
    `opt->priv->paths` string map.

The former is trivial to fix. The latter is a bit less straight forward:
the `util` pointer of the string map may sometimes point to data that
has been allocated via `CALLOC()`, while at other times it may point to
data that has been allocated via a `mem_pool`.

It very much seems like an oversight that we didn't also allocate the
conflict info in this code path via the memory pool, though. So let's
fix that, which will also plug the memory leak for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:13 -07:00
2a01470891 match-trees: fix leaking prefixes in shift_tree()
In `shift_tree()` we allocate two empty strings that we end up
passing to `match_trees()`. If that function finds a better match it
will update these pointers to point to a newly allocated strings,
freeing the old strings. We never free the final results though, neither
the ones we have allocated ourselves, nor the one that `match_trees()`
might've returned to us.

Fix the resulting memory leaks by creating a common exit path where we
free them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
68bd0a94be builtin/fmt-merge-msg: fix leaking buffers
Fix leaking input and output buffers in git-fmt-merge-msg(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
ee087c29c8 builtin/grep: fix leaking object context
Even when `get_oid_with_context()` fails it may have allocated some data
in the object context. But we do not release it in git-grep(1) when the
call fails, leading to a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
149c83e0aa builtin/pack-objects: plug leaking list of keep-packs
The `--keep-pack` option of git-pack-objects(1) populates the arguments
into a string list. And while the list is marked as `NODUP` and thus
won't duplicate the strings, the list entries themselves still need to
be free'd. We don't though, causing a leak.

Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
860b678016 builtin/repack: fix leaking line buffer when packing promisors
In `repack_promisor_objects()` we read output from git-pack-objects(1)
line by line, using `strbuf_getline_lf()`. We never free the line
buffer, causing a memory leak. Plug it.

This leak is being hit in t5616, but plugging it alone is not
sufficient to make the whole test suite leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
a46f231975 negotiator/skipping: fix leaking commit entries
When releasing the skipping negotiator we free its priority queue, but
not the contained entries. Fix this to plug a memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
16c6fb5a94 shallow: fix leaking members of struct shallow_info
We do not free several struct members in `clear_shallow_info()`. Fix
this to plug the resulting leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
14c0ea0f6f shallow: free grafts when unregistering them
When removing a graft via `unregister_shallow()` we remove it from the
grafts array, but do not free the structure. Fix this to plug the leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
0d1d22f5a3 object: clear grafts when clearing parsed object pool
We do not clear grafts part of the parsed object pool when clearing the
pool itself, which can lead to memory leaks when a repository is being
cleared.

Fix this by moving `reset_commit_grafts()` into "object.c" and making it
part of the `struct parsed_object_pool` interface such that we can call
it from `parsed_object_pool_clear()`. Adapt `parsed_object_pool_new()`
to take and store a reference to its owning repository, which is needed
by `unparse_commit()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
b8849e236f gpg-interface: fix misdesigned signing key interfaces
The interfaces to retrieve signing keys and their IDs are misdesigned as
they return string constants even though they indeed allocate memory,
which leads to memory leaks. Refactor the code to instead always return
allocated strings and let the callers free them accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
49d47eb541 send-pack: fix leaking push cert nonce
When retrieving the push cert nonce from the server, we first store the
constant returned by `server_feature_value()` and then, if the nonce is
valid, we duplicate the nonce memory to a NUL-terminated string, so that
we can pass it to `generate_push_cert()`. We never free the latter and
thus cause a memory leak.

Fix this by storing the limited-lifetime nonce into a scope-local
variable such that the long-lived, allocated nonce can be easily freed
without having to cast away its constness.

This leak was exposed by t5534, but fixing it is not sufficient to make
the whole test suite leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
42c153e1c0 remote: fix leak in reachability check of a remote-tracking ref
In `check_if_includes_upstream()` we retrieve the local ref
corresponding to a remote-tracking ref we want to check reachability
for. We never free that local ref and thus cause a memory leak. Fix
this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
cdbb7208c8 remote: fix leaking tracking refs
When computing the remote tracking ref we cause two memory leaks:

  - We leak when `remote_tracking()` fails.

  - We leak when the call to `remote_tracking()` succeeds and sets
    `ref->tracking_ref()`.

Fix both of these leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
1e8cb17ac5 builtin/submodule--helper: fix leaking refs on push-check
In the push-check subcommand of the submodule helper we acquire a list
of local refs, but never free that list. Fix this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
3eefd348e5 submodule: fix leaking fetch task data
The `submodule_parallel_fetch` structure contains various data
structures that we use to set up parallel fetches of submodules. We do
not free some of its data though, causing memory leaks. Plug those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
ac2e7d545e upload-pack: fix leaking child process data on reachability checks
We spawn a git-rev-list(1) command to perform reachability checks in
"upload-pack.c". We do not release memory associated with the process
in error cases though, thus leaking memory.

Fix these by calling `child_process_clear()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
7eb6f02c55 builtin/push: fix leaking refspec query result
When appending a refspec via `refspec_append_mapped()` we leak the
result of `query_refspecs()`. The overall logic around refspec queries
is quite weird, as callers are expected to either set the `src` or `dst`
pointers, and then the (allocated) result will be in the respective
other struct member.

As we have the `src` member set, plugging the memory leak is thus as
easy as just freeing the `dst` member. While at it, use designated
initializers to initialize the structure.

This leak was exposed by t5516, but fixing it is not sufficient to make
the whole test suite leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:10 -07:00
e03004f7f8 send-pack: fix leaking common object IDs
We're leaking the array of common object IDs in `send_pack()`. Fix this
by creating a common exit path where we free the leaking data. While at
it, unify some other cleanups now that we have a central place to put
them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:10 -07:00
63494913ec fetch-pack: fix memory leaks on fetch negotiation
We leak both the `nt_object_array` and `negotiator` structures in
`negotiate_using_fetch()`. Plug both of these leaks.

These leaks were exposed by t5516, but fixing them is not sufficient to
make the whole test suite leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:10 -07:00
a9539a993a t/test-lib: allow skipping leak checks for passing tests
With `GIT_TEST_PASSING_SANITIZE_LEAK=check`, one can double check
whether a memory leak fix caused some test suites to become leak free.
This is done by running all tests with the leak checker enabled. If a
test suite does not declare `TEST_PASSES_SANITIZE_LEAK=true` but still
finishes successfully with the leak checker enabled, then this indicates
that the test is leak free and thus missing the annotation.

It is somewhat slow to execute though because it runs all of our test
suites with the leak sanitizer enabled. It is also pointless in most
cases, because the only test suites that need to be checked are those
which _aren't_ yet marked with `TEST_PASSES_SANITIZE_LEAK=true`.

Introduce a new value "check-failing". When set, we behave the same as
if "check" was passed, except that we only check those tests which do
not have `TEST_PASSES_SANITIZE_LEAK=true` set. This is significantly
faster than running all test suites but still fulfills the usecase of
finding newly-leak-free test suites.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:10 -07:00
e65b0c7c36 builtin/cat-file: mark 'git cat-file' sparse-index compatible
This change affects how 'git cat-file' works with the index when
specifying an object with the ":<path>" syntax (which will give file
contents from the index).

'git cat-file' expands a sparse index to a full index any time contents
are requested from the index by specifying an object with the ":<path>"
syntax. This is true even when the requested file is part of the sparse
index, and results in much slower 'git cat-file' operations when working
within the sparse index.

Mark 'git cat-file' as not needing a full index, so that you only pay
the cost of expanding the sparse index to a full index when you request
a file outside of the sparse index.

Add tests to ensure both that:
- 'git cat-file' returns the correct file contents whether or not the
  file is in the sparse index
- 'git cat-file' expands to the full index any time you request
  something outside of the sparse index

Signed-off-by: Kevin Lyles <klyles+github@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 09:19:04 -07:00
68c57590d3 t1092: allow run_on_* functions to use standard input
The 'run_on_sparse' and 'run_on_all' functions do not work correctly for
commands accepting standard input, because they run the same command
multiple times and the first instance consumes it. This also indirectly
affects 'test_all_match' and 'test_sparse_match'.

To allow these functions to work with commands accepting standard input,
first slurp standard input to a temporary file, and then run the command
with its standard input redirected from the temporary file. This ensures
that each command sees the same contents from its standard input.

Note that this does not impact commands that do not read from standard
input; they continue to ignore it. Additionally, existing uses of the
run_on_* functions do not need to do anything differently, as the
standard input of the test environment is already connected to
/dev/null.

We do not explicitly clean up the input files because they are cleaned
up with the rest of the test repositories and their contents may be
useful for figuring out which command failed when a test case fails.

Signed-off-by: Kevin Lyles <klyles@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 09:19:04 -07:00
894deb76a0 clar: add CMake support
Now that we're using `clar` as powerful test framework, we have to
adjust the Visual C build (read: the CMake definition) to be able to
handle that, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:38 -07:00
c9763684ea t/unit-tests: convert ctype tests to use clar
Convert the ctype tests to use the new clar unit testing framework.
Introduce a new function `cl_failf()` that allows us to print a
formatted error message, which we can use to point out which of the
characters was classified incorrectly. This results in output like this
on failure:

    # start of suite 1: ctype
    not ok 1 - ctype::isspace
        ---
        reason: |
          Test failed.
          0x0d is classified incorrectly: expected 0, got 1
        at:
          file: 't/unit-tests/ctype.c'
          line: 36
          function: 'test_ctype__isspace'
        ---
    ok 2 - ctype::isdigit
    ok 3 - ctype::isalpha
    ok 4 - ctype::isalnum
    ok 5 - ctype::is_glob_special
    ok 6 - ctype::is_regex_special
    ok 7 - ctype::is_pathspec_magic
    ok 8 - ctype::isascii
    ok 9 - ctype::islower
    ok 10 - ctype::isupper
    ok 11 - ctype::iscntrl
    ok 12 - ctype::ispunct
    ok 13 - ctype::isxdigit
    ok 14 - ctype::isprint

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
d7f0c47964 t/unit-tests: convert strvec tests to use clar
Convert the strvec tests to use the new clar unit testing framework.
This is a first test balloon that demonstrates how the testing infra for
clar-based tests looks like.

The tests are part of the "t/unit-tests/bin/unit-tests" binary. When
running that binary with an injected error, it generates TAP output:

    # ./t/unit-tests/bin/unit-tests
    TAP version 13
    # start of suite 1: strvec
    ok 1 - strvec::init
    ok 2 - strvec::dynamic_init
    ok 3 - strvec::clear
    not ok 4 - strvec::push
        ---
        reason: |
          String mismatch: (&vec)->v[i] != expect[i]
          'foo' != 'fo' (at byte 2)
        at:
          file: 't/unit-tests/strvec.c'
          line: 48
          function: 'test_strvec__push'
        ---
    ok 5 - strvec::pushf
    ok 6 - strvec::pushl
    ok 7 - strvec::pushv
    ok 8 - strvec::replace_at_head
    ok 9 - strvec::replace_at_tail
    ok 10 - strvec::replace_in_between
    ok 11 - strvec::replace_with_substring
    ok 12 - strvec::remove_at_head
    ok 13 - strvec::remove_at_tail
    ok 14 - strvec::remove_in_between
    ok 15 - strvec::pop_empty_array
    ok 16 - strvec::pop_non_empty_array
    ok 17 - strvec::split_empty_string
    ok 18 - strvec::split_single_item
    ok 19 - strvec::split_multiple_items
    ok 20 - strvec::split_whitespace_only
    ok 21 - strvec::split_multiple_consecutive_whitespaces
    ok 22 - strvec::detach
    1..22

The binary also supports some parameters that allow us to run only a
subset of unit tests or alter the output:

    $ ./t/unit-tests/bin/unit-tests -h
    Usage: ./t/unit-tests/bin/unit-tests [options]

    Options:
      -sname        Run only the suite with `name` (can go to individual test name)
      -iname        Include the suite with `name`
      -xname        Exclude the suite with `name`
      -v            Increase verbosity (show suite names)
      -q            Only report tests that had an error
      -Q            Quit as soon as a test fails
      -t            Display results in tap format
      -l            Print suite names
      -r[filename]  Write summary file (to the optional filename)

Furthermore, running `make unit-tests` runs the binary along with all
the other unit tests we have.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
3d5d4c876a t/unit-tests: implement test driver
The test driver in "unit-test.c" is responsible for setting up our unit
tests and eventually running them. As such, it is also responsible for
parsing the command line arguments.

The clar unit testing framework provides function `clar_test()` that
parses command line arguments and then executes the tests for us. In
theory that would already be sufficient. We have the special requirement
to always generate TAP-formatted output though, so we'd have to always
pass the "-t" argument to clar. Furthermore, some of the options exposed
by clar are ineffective when "-t" is used, but they would still be shown
when the user passes the "-h" parameter to have the clar show its usage.

Implement our own option handling instead of using the one provided by
clar, which gives us greater flexibility in how exactly we set things
up.

We would ideally not use any "normal" code of ours for this such that
the unit testing framework doesn't depend on it working correctly. But
it is somewhat dubious whether we really want to reimplement all of the
option parsing. So for now, let's be pragmatic and reuse it until we
find a good reason in the future why we'd really want to avoid it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
8bc5d33bd8 Makefile: wire up the clar unit testing framework
Wire up the clar unit testing framework by introducing a new
"unit-tests" executable. In contrast to the existing framework, this
will result in a single executable for all test suites. The ability to
pick specific tests to execute is retained via functionality built into
the clar itself.

Note that we need to be a bit careful about how we need to invalidate
our Makefile rules. While we obviously have to regenerate the clar suite
when our test suites change, we also have to invalidate it in case any
of the test suites gets removed. We do so by using our typical pattern
of creating a `GIT-TEST-SUITES` file that gets updated whenever the set
of test suites changes, so that we can easily depend on that file.

Another specialty is that we generate a "clar-decls.h" file. The test
functions are neither static, nor do they have external declarations.
This is because they are getting parsed via "generate.py", which then
creates the external generations that get populated into an array. These
declarations are only seen by the main function though.

The consequence is that we will get a bunch of "missing prototypes"
errors from our compiler for each of these test functions. To fix those
errors, we extract the `extern` declarations from "clar.suite" and put
them into a standalone header that then gets included by each of our
unit tests. This gets rid of compiler warnings for every function which
has been extracted by "generate.py". More importantly though, it does
_not_ get rid of warnings in case a function really isn't being used by
anything. Thus, it would cause a compiler error if a function name was
mistyped and thus not picked up by "generate.py".

The test driver "unit-test.c" is an empty stub for now. It will get
implemented in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
416f4585d6 Makefile: do not use sparse on third-party sources
We have several third-party sources in our codebase that we have
imported from upstream projects. These sources are mostly excluded from
our static analysis, for example when running Coccinelle.

Do the same for our "sparse" target by filtering them out.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
9ec76ad9ed Makefile: make hdr-check depend on generated headers
The "hdr-check" Makefile target compiles each of our headers as a
standalone code unit to ensure that they are not missing any type
declarations and can be included standalone.

With the next commit we will wire up the clar unit testing framework,
which will have the effect that some headers start depending on
generated ones. While we could declare that dependency explicitly, it
does not really feel very maintainable in the future.

Instead, we do the same as in the preceding commit and have the objects
depend on all of our generated headers. While again overly broad, it is
easy to maintain and generating headers is not an expensive thing to do
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
11c1b5ca59 Makefile: fix sparse dependency on GENERATED_H
The "check" Makefile target is essentially an alias around the "sparse"
target. The one difference though is that it will tell users to instead
run the "test" target in case they do not have sparse(1) installed, as
chances are high that they wanted to execute the test suite rather than
doing semantic checks.

But even though the "check" target ultimately just ends up executing
`make sparse`, it still depends on our generated headers. This does not
make any sense though: they are irrelevant for the "test" target advice,
and if these headers are required for the "sparse" target they must be
declared as a dependency on the aliased target, not the alias.

But even moving the dependency to the "sparse" target is wrong, as
concurrent builds may then end up generating the headers and running
sparse concurrently. Instead, we make them a dependency of the specific
objects. While that is overly broad, it does ensure correct ordering.
The alternative, specifying which file depends on what generated header
explicitly, feels rather unmaintainable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
fe7066a9d9 clar: stop including shellapi.h unnecessarily
The `shellapi.h` header was included as of
https://github.com/clar-test/clar/commit/136e763211aa, to have
`SHFileOperation()` declared so that it could be called.

However, https://github.com/clar-test/clar/commit/5ce31b69b525 removed
that call, and therefore that `#include <shellapi.h>` is unnecessary.

It is also unwanted in Git because this project uses a subset of Git for
Windows' SDK in its CI builds that (for bandwidth reasons) excludes tons
of header files, including `shellapi.h`.

So let's remove it.

Note: Since the `windows.h` header would include `shellapi.h` anyway, we
also define `WIN32_LEAN_AND_MEAN` to avoid this and similar other
unnecessary includes before including `windows.h`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:37 -07:00
7d83563713 clar(win32): avoid compile error due to unused fs_copy()
When CLAR_FIXTURE_PATH is unset, the `fs_copy()` function seems not to
be used. But it is declared as `static`, and GCC does not like that,
complaining that it should not be declared/defined to begin with.

We could mark this function as (potentially) unused by following the
`MAYBE_UNUSED` pattern from Git's `git-compat-util.h`. However, this is
a GCC-only construct that is not understood by Visual C. Besides, `clar`
does not use that pattern at all.

Instead, let's use the `((void)SYMBOL);` pattern that `clar` already
uses elsewhere; This avoids the compile error by sorta kinda make the
function used after a fashion.

Note: GCC 14.x (which Git for Windows' SDK already uses) is able to
figure out that this function is unused even though there are recursive
calls between `fs_copy()` and `fs_copydir_helper()`; Earlier GCC
versions do not detect that, and therefore the issue has been hidden
from the regular Linux CI builds (where GCC 14.x is not yet used). That
is the reason why this change is only made in the Windows-specific
portion of `t/unit-tests/clar/clar/fs.h`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:36 -07:00
42020d2dc0 clar: avoid compile error with mingw-w64
When using mingw-w64 to compile the code, and using `_stat()`, it is
necessary to use `struct _stat`, too, and not `struct stat` (as the
latter is incompatible with the "dashed" version because it is limited
to 32-bit time types for backwards compatibility).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:36 -07:00
aa57db2822 t/clar: fix compatibility with NonStop
The NonStop platform does not have `mkdtemp()` available, which we rely
on in `build_sandbox_path()`. Fix this issue by using `mktemp()` and
`mkdir()` instead on this platform.

This has been cherry-picked from the upstream pull request at [1].

[1]: https://github.com/clar-test/clar/pull/96

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:36 -07:00
9b7caa2809 t: import the clar unit testing framework
Our unit testing framework is a homegrown solution. While it supports
most of our needs, it is likely that the volume of unit tests will grow
quite a bit in the future such that we can exercise low-level subsystems
directly. This surfaces several shortcomings that the current solution
has:

  - There is no way to run only one specific tests. While some of our
    unit tests wire this up manually, others don't. In general, it
    requires quite a bit of boilerplate to get this set up correctly.

  - Failures do not cause a test to stop execution directly. Instead,
    the test author needs to return manually whenever an assertion
    fails. This is rather verbose and is not done correctly in most of
    our unit tests.

  - Wiring up a new testcase requires both implementing the test
    function and calling it in the respective test suite's main
    function, which is creating code duplication.

We can of course fix all of these issues ourselves, but that feels
rather pointless when there are already so many unit testing frameworks
out there that have those features.

We line out some requirements for any unit testing framework in
"Documentation/technical/unit-tests.txt". The "clar" unit testing
framework, which isn't listed in that table yet, ticks many of the
boxes:

  - It is licensed under ISC, which is compatible.

  - It is easily vendorable because it is rather tiny at around 1200
    lines of code.

  - It is easily hackable due to the same reason.

  - It has TAP support.

  - It has skippable tests.

  - It preprocesses test files in order to extract test functions, which
    then get wired up automatically.

While it's not perfect, the fact that clar originates from the libgit2
project means that it should be rather easy for us to collaborate with
upstream to plug any gaps.

Import the clar unit testing framework at commit 1516124 (Merge pull
request #97 from pks-t/pks-whitespace-fixes, 2024-08-15). The framework
will be wired up in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:36 -07:00
71360809ec t: do not pass GIT_TEST_OPTS to unit tests with prove
When using the prove target, we append GIT_TEST_OPTS to the arguments
that we execute each of the tests with. This doesn't only include the
intended test scripts, but also ends up passing the arguments to our
unit tests. This is unintentional though as they do not even know to
interpret those arguments, and is inconsistent with how we execute unit
tests without prove.

This isn't much of an issue because our current set of unit tests mostly
ignore their arguments anyway. With the introduction of clar-based unit
tests this is about to become an issue though, as these do parse their
command line argument to alter behaviour.

Prepare for this by passing GIT_TEST_OPTS to "run-test.sh" via an
environment variable. Like this, we can conditionally forward it to our
test scripts, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:36 -07:00
c3459ae9ef refs/files: use heuristic to decide whether to repack with --auto
The `--auto` flag for git-pack-refs(1) allows the ref backend to decide
whether or not a repack is in order. This switch has been introduced
mostly with the "reftable" backend in mind, which already knows to
auto-compact its tables during normal operations. When the flag is set,
then it will use the same auto-compaction mechanism and thus end up
doing nothing in most cases.

The "files" backend does not have any such heuristic yet and instead
packs any loose references unconditionally. So we rewrite the complete
"packed-refs" file even if there's only a single loose reference to be
packed.

Even worse, starting with 9f6714ab3e (builtin/gc: pack refs when using
`git maintenance run --auto`, 2024-03-25), `git pack-refs --auto` is
unconditionally executed via our auto maintenance, so we end up repacking
references every single time auto maintenance kicks in. And while that
commit already mentioned that the "files" backend unconditionally packs
refs now, the author obviously didn't quite think about the consequences
thereof. So while the idea was sound, we really should have added a
heuristic to the "files" backend before implementing it.

Introduce a heuristic that decides whether or not it is worth to pack
loose references. The important factors to decide here are the number of
loose references in comparison to the overall size of the "packed-refs"
file. The bigger the "packed-refs" file, the longer it takes to rewrite
it and thus we scale up the limit of allowed loose references before we
repack.

As is the nature of heuristics, this mechansim isn't obviously
"correct", but should rather be seen as a tradeoff between how much
resources we spend packing refs and how inefficient the ref store
becomes. For all I can say, we have successfully been using the exact
same heuristic in Gitaly for several years by now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:03:24 -07:00
bd51dca36e t0601: merge tests for auto-packing of refs
We have two tests in t0601 which exercise the same underlying logic,
once via `git pack-refs --auto` and once via `git maintenance run
--auto`. Merge these two tests into one such that it becomes easier to
extend test coverage for both commands at the same time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:03:24 -07:00
d343068e4a wrapper: introduce log2u()
We have an implementation of a function that computes the log2 for an
integer. While we could instead use log2(3P), that involves floating
point numbers and is thus needlessly complex and inefficient.

We're about to add a second caller that wants to compute log2 for a
`size_t`. Let's thus move the function into "wrapper.h" such that it
becomes generally available.

While at it, tweak the implementation a bit:

  - The parameter is converted from `int` to `uintmax_t`. This
    conversion is safe to do in "bisect.c" because we already check that
    the argument is positive.

  - The return value is an `unsigned`. It cannot ever be negative, so it
    is pointless for it to be a signed integer.

  - Loop until `!n` instead of requiring that `n > 1` and then subtract
    1 from the result and add a special case for `!sz`. This helps
    compilers to generate more efficient code.

Compilers recognize the pattern of this function and optimize
accordingly. On GCC 14.2 x86_64:

    log2u(unsigned long):
            test    rdi, rdi
            je      .L3
            bsr     rax, rdi
            ret
    .L3:
            mov     eax, -1
            ret

Clang 18.1 does not yet recognize the pattern, but starts to do so on
Clang trunk x86_64. The code isn't quite as efficient as the one
generated by GCC, but still manages to optimize away the loop:

    log2u(unsigned long):
            test    rdi, rdi
            je      .LBB0_1
            shr     rdi
            bsr     rcx, rdi
            mov     eax, 127
            cmovne  rax, rcx
            xor     eax, -64
            add     eax, 65
            ret
    .LBB0_1:
            mov     eax, -1
            ret

The pattern is also recognized on other platforms like ARM64 GCC 14.2.0,
where we end up using `clz`:

    log2u(unsigned long):
            clz     x2, x0
            cmp     x0, 0
            mov     w1, 63
            sub     w0, w1, w2
            csinv   w0, w0, wzr, ne
            ret

Note that we have a similar function `fastlog2()` in the reftable code.
As that codebase is separate from the Git codebase we do not adapt it to
use the new function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:03:24 -07:00
b2dbf97f47 builtin/index-pack: fix segfaults when running outside of a repo
It was reported that git-verify-pack(1) has started to crash with Git
v2.46.0 when run outside of a repository. This is another fallout from
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), where we have stopped setting the default hash algorithm
for `the_repository`. Consequently, code that relies on `the_hash_algo`
will now crash when it hasn't explicitly been initialized, which may be
the case when running outside of a Git repository.

The crash is not in git-verify-pack(1) but instead in git-index-pack(1),
which gets called by the former. Ideally, both of these programs should
be able to identify the hash algorithm used by the packfile and index
without having to rely on external information. But unfortunately, the
format for neither of them is completely self-describing, so it is not
possible to derive that information. This is a design issue that we
should address by introducing a new packfile version that encodes its
object hash.

For now though the more important fix is to not make either of these
programs crash anymore, which we do by falling back to SHA1 when the
object hash is unconfigured. This pessimizes reading packfiles which
use a different hash than SHA1, but restores previous behaviour.

Reported-by: Ilya K <me@0upti.me>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 07:40:00 -07:00
bf6ab087d1 rebase: apply and cleanup autostash when rebase fails to start
If "git rebase" fails to start after stashing the user's uncommitted
changes then it forgets to restore the stashed changes and remove the
state directory. To make matters worse, running "git rebase --abort" to
apply the stashed changes and cleanup the state directory fails because
the state directory only contains the "autostash" file and is missing
the "head-name" and "onto" files required by read_basic_state().

Fix this by applying the autostash and removing the state directory if
the pre-rebase hook or initial checkout fail. This matches what
finish_rebase() does at the end of a successful rebase. If the user
modifies any files after the autostash is created it is possible there
will be conflicts when the autostash is applied. In that case
apply_autostash() saves the stash in a new entry under refs/stash and so
it is safe to remove the state directory containing the autostash file.

New tests are added to check the autostash is applied and the state
directory is removed if the rebase fails to start. Checks are also added
to some existing tests in order to ensure there is no state directory
left behind when a rebase fails to start and no autostash has been
created.

Reported-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03 11:24:43 -07:00
53a92c9552 Documentation/BreakingChanges: announce removal of git-pack-redundant(1)
The git-pack-redundant(1) command is already in the process of being
phased out and dies unless the user passes the `--i-still-use-this` flag
since 4406522b76 (pack-redundant: escalate deprecation warning to an
error, 2023-03-23). We haven't heard any complaints, so let's announce
the removal of this command in Git 3.0 in our breaking changes document.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03 11:05:22 -07:00
2e7b89e038 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03 09:15:04 -07:00
bd3abe0417 Merge branch 'jc/config-doc-update'
Docfix.

* jc/config-doc-update:
  git-config.1: fix description of --regexp in synopsis
  git-config.1: --get-all description update
2024-09-03 09:15:04 -07:00
bb4248452e Merge branch 'rs/remote-leakfix'
Leakfix.

* rs/remote-leakfix:
  remote: plug memory leaks at early returns
2024-09-03 09:15:03 -07:00
17636cdf3b Merge branch 'ps/reftable-concurrent-compaction'
The code path for compacting reftable files saw some bugfixes
against concurrent operation.

* ps/reftable-concurrent-compaction:
  reftable/stack: fix segfault when reload with reused readers fails
  reftable/stack: reorder swapping in the reloaded stack contents
  reftable/reader: keep readers alive during iteration
  reftable/reader: introduce refcounting
  reftable/stack: fix broken refnames in `write_n_ref_tables()`
  reftable/reader: inline `reader_close()`
  reftable/reader: inline `init_reader()`
  reftable/reader: rename `reftable_new_reader()`
  reftable/stack: inline `stack_compact_range_stats()`
  reftable/blocksource: drop malloc block source
2024-09-03 09:15:03 -07:00
dd903659cd Merge branch 'js/fetch-push-trace2-annotation'
More trace2 events at key points on push and fetch code paths have
been added.

* js/fetch-push-trace2-annotation:
  send-pack: add new tracing regions for push
  fetch: add top-level trace2 regions
  trace2: implement trace2_printf() for event target
2024-09-03 09:15:02 -07:00
533e30819a Merge branch 'aa/cat-file-batch-output-doc'
Docfix.

* aa/cat-file-batch-output-doc:
  docs: explain the order of output in the batched mode of git-cat-file(1)
2024-09-03 09:15:01 -07:00
739c509b6d Merge branch 'dh/runtime-prefix-on-zos'
Support for the RUNTIME_PREFIX feature has been added to z/OS port.

* dh/runtime-prefix-on-zos:
  exec_cmd: RUNTIME_PREFIX on z/OS systems
2024-09-03 09:15:00 -07:00
8c1c63d525 Merge branch 'ps/leakfixes-part-5'
Even more leak fixes.

* ps/leakfixes-part-5:
  transport: fix leaking negotiation tips
  transport: fix leaking arguments when fetching from bundle
  builtin/fetch: fix leaking transaction with `--atomic`
  remote: fix leaking peer ref when expanding refmap
  remote: fix leaks when matching refspecs
  remote: fix leaking config strings
  builtin/fetch-pack: fix leaking refs
  sideband: fix leaks when configuring sideband colors
  builtin/send-pack: fix leaking refspecs
  transport: fix leaking OID arrays in git:// transport data
  t/helper: fix leaking multi-pack-indices in "read-midx"
  builtin/repack: fix leaks when computing packs to repack
  midx-write: fix leaking hashfile on error cases
  builtin/archive: fix leaking `OPT_FILENAME()` value
  builtin/upload-archive: fix leaking args passed to `write_archive()`
  builtin/merge-tree: fix leaking `-X` strategy options
  pretty: fix leaking key/value separator buffer
  pretty: fix memory leaks when parsing pretty formats
  convert: fix leaks when resetting attributes
  mailinfo: fix leaking header data
2024-09-03 09:15:00 -07:00
f123c19e72 Merge branch 'cl/config-regexp-docfix'
Docfix.

* cl/config-regexp-docfix:
  doc: replace 3 dash with correct 2 dash in git-config(1)
2024-09-03 09:14:59 -07:00
6b77283f5e mergetools: vscode: new tool
VSCode has supported three-way merges since 2022, see
<https://github.com/microsoft/vscode/issues/5770#issuecomment-1188658476>.

Although the program binary is located at /usr/bin/code, name the
mergetool "vscode" because the word "code" is too generic and would lead
to confusion. The name "vscode" also matches Git's existing
contrib/vscode directory.

On Windows, VSCode adds the directory that contains code.cmd to %PATH%,
so there is no need to invoke mergetool_find_win32_cmd to search for the
program.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-01 20:47:58 -07:00
a680635e05 t: port helper/test-oid-array.c to unit-tests/t-oid-array.c
helper/test-oid-array.c along with t0064-oid-array.sh test the
oid-array.h API, which provides storage and processing
efficiency over large lists of object identifiers.

Migrate them to the unit testing framework for better runtime
performance and efficiency. As we don't initialize a repository
in these tests, the hash algo that functions like oid_array_lookup()
use is not initialized, therefore call repo_set_hash_algo() to
initialize it. And init_hash_algo():lib-oid.c can aid in this
process, so make it public.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-01 20:43:38 -07:00
d4dc0efd7d compat/terminal: mark parameter of git_terminal_prompt() UNUSED
If neither HAVE_DEV_TTY nor GIT_WINDOWS_NATIVE is set, the fallback
code calls the system getpass(). This unfortunately ignores the "echo"
boolean parameter, as we have no way to implement that functionality.
But we still have to keep the unused parameter, since our interface
has to match the other implementations.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-01 08:26:51 -07:00
6bd2ae67a5 revision: free commit buffers for skipped commits
In git-log we leave the save_commit_buffer flag set to "1", which tells
the commit parsing code to store the object content after it has parsed
it to find parents, tree, etc. That lets us reuse the contents for
pretty-printing the commit in the output. And then after printing each
commit, we call free_commit_buffer(), since we don't need it anymore.

But some options may cause us to traverse commits which are not part of
the output. And so git-log does not see them at all, and doesn't free
them. One such case is something like:

  git log -n 1000 --skip=1000000

which will churn through a million commits, before showing only a
thousand. We loop through these inside get_revision(), without freeing
the contents. As a result, we end up storing the object data for those
million commits simultaneously.

We should free the stored buffers (if any) for those commits as we skip
over them, which is what this patch does. Running the above command in
linux.git drops the peak heap usage from ~1.1GB to ~200MB, according to
valgrind/massif. (I thought we might get an even bigger improvement, but
the remaining memory is going to commit/tree structs, which we do hold
on to forever).

Note that this problem doesn't occur if:

  - you're running a git-rev-list without a --format parameter; it turns
    off save_commit_buffer by default, since it only output the object
    id

  - you've built a commit-graph file, since in that case we'd use the
    optimized graph data instead of the initial parse, and then do a
    lazy parse for commits we're actually going to output

There are probably some other option combinations that can likewise
end up with useless stored commit buffers. For example, if you ask for
"foo..bar", then we'll have to walk down to the merge base, and
everything on the "foo" side won't be shown. Tuning the "save" behavior
to handle that might be tricky (I guess maybe drop buffers for anything
we mark as UNINTERESTING?). And in the long run, the right solution here
is probably to make sure the commit-graph is built (since it fixes the
memory problem _and_ drastically reduces CPU usage).

But since this "--skip" case is an easy one-liner, it's worth fixing in
the meantime. It should be OK to make this call even if there is no
saved buffer (e.g., because save_commit_buffer=0, or because a
commit-graph was used), since it's O(1) to look up the buffer and is a
noop if it isn't present. I verified by running the above command after
"git commit-graph write --reachable", and it takes the same time with
and without this patch.

Reported-by: Yuri Karnilaev <karnilaev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-30 14:03:00 -07:00
ab8bcd2dbd refs/files-backend: work around -Wunused-parameter
This is needed to build things with -Werror=unused-parameter on a
platform without symbolic link support.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-30 12:34:04 -07:00
516a9ec3d5 grep: prefer UNUSED to MAYBE_UNUSED for pcre allocators
We provide custom malloc/free callbacks for the pcre library to use.
Those take an extra "data" parameter, but we don't use it. Back when
these were added in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16), we only had MAYBE_UNUSED.

But these days we have UNUSED, which we should prefer, as it will
let the compiler inform us if the code changes to actually use the
parameters.

I also moved the annotations to come after the variable name, which is
how we typically spell it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29 13:59:46 -07:00
3cdddcf6b2 gc: drop MAYBE_UNUSED annotation from used parameter
The "opts" parameter is always used, so marking it with MAYBE_UNUSED is
just confusing.

This annotation goes back to 41abfe15d9 (maintenance: add pack-refs
task, 2021-02-09), when it really was unused. Back then we did not have
the UNUSED macro that would complain if the code changed to use the
parameter. So when we started using it in bfc2f9eb8e (builtin/gc:
forward git-gc(1)'s `--auto` flag when packing refs, 2024-03-25), nobody
noticed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29 13:56:46 -07:00
a051ca5e65 CodingGuidelines: also mention MAYBE_UNUSED
A function that uses a parameter in one build may lose all uses of
the parameter in another build, depending on the configuration.  A
workaround for such a case, MAYBE_UNUSED, should also be mentioned
when we recommend the use of UNUSED to our developers.

Keep the addition to the guideline short and document the criteria
to choose between UNUSED and MAYBE_UNUSED near their definition.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29 11:28:07 -07:00
c3b92d4037 Merge branch 'jk/unused-parameters' into jc/maybe-unused
* jk/unused-parameters:
  CodingGuidelines: mention -Wunused-parameter and UNUSED
  config.mak.dev: enable -Wunused-parameter by default
  compat: mark unused parameters in win32/mingw functions
  compat: disable -Wunused-parameter in win32/headless.c
  compat: disable -Wunused-parameter in 3rd-party code
  t-reftable-readwrite: mark unused parameter in callback function
  gc: mark unused config parameter in virtual functions
2024-08-29 11:09:20 -07:00
4590f2e941 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29 11:08:17 -07:00
11fd53a6c2 Merge branch 'ds/sparse-diff-index'
The underlying machinery for "git diff-index" has long been made to
expand the sparse index as needed, but the command fully expanded
the sparse index upfront, which now has been taught not to do.

* ds/sparse-diff-index:
  diff-index: integrate with the sparse index
2024-08-29 11:08:17 -07:00
839b808325 Merge branch 'cp/unit-test-reftable-block'
Another test for reftable library ported to the unit test framework.

* cp/unit-test-reftable-block:
  t-reftable-block: mark unused argv/argc
  t-reftable-block: add tests for index blocks
  t-reftable-block: add tests for obj blocks
  t-reftable-block: add tests for log blocks
  t-reftable-block: remove unnecessary variable 'j'
  t-reftable-block: use xstrfmt() instead of xstrdup()
  t-reftable-block: use block_iter_reset() instead of block_iter_close()
  t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
  t-reftable-block: use reftable_record_equal() instead of check_str()
  t-reftable-block: release used block reader
  t: harmonize t-reftable-block.c with coding guidelines
  t: move reftable/block_test.c to the unit testing framework
2024-08-29 11:08:16 -07:00
d4d677704d Merge branch 'ps/reftable-drop-generic'
The code in the reftable library has been cleaned up by discarding
unused "generic" interface.

* ps/reftable-drop-generic:
  reftable: mark unused parameters in empty iterator functions
  reftable/generic: drop interface
  t/helper: refactor to not use `struct reftable_table`
  t/helper: use `hash_to_hex_algop()` to print hashes
  t/helper: inline printing of reftable records
  t/helper: inline `reftable_table_print()`
  t/helper: inline `reftable_stack_print_directory()`
  t/helper: inline `reftable_reader_print_file()`
  t/helper: inline `reftable_dump_main()`
  reftable/dump: drop unused `compact_stack()`
  reftable/generic: move generic iterator code into iterator interface
  reftable/iter: drop double-checking logic
  reftable/stack: open-code reading refs
  reftable/merged: stop using generic tables in the merged table
  reftable/merged: rename `reftable_new_merged_table()`
  reftable/merged: expose functions to initialize iterators
2024-08-29 11:08:16 -07:00
17d4b10aea The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 10:31:28 -07:00
d19863b970 Merge branch 'ah/git-prompt-portability'
The command line prompt support used to be littered with bash-isms,
which has been corrected to work with more shells.

* ah/git-prompt-portability:
  git-prompt: support custom 0-width PS1 markers
  git-prompt: ta-da! document usage in other shells
  git-prompt: don't use shell $'...'
  git-prompt: add some missing quotes
  git-prompt: replace [[...]] with standard code
  git-prompt: don't use shell arrays
  git-prompt: fix uninitialized variable
  git-prompt: use here-doc instead of here-string
2024-08-28 10:31:28 -07:00
a9bc27fb18 Merge branch 'gt/unit-test-urlmatch-normalization'
Another rewrite of test.

* gt/unit-test-urlmatch-normalization:
  t: migrate t0110-urlmatch-normalization to the new framework
2024-08-28 10:31:27 -07:00
029c870ab5 Merge branch 'mt/rebase-x-quiet'
"git rebase -x --quiet" was not quiet, which was corrected.

* mt/rebase-x-quiet:
  rebase --exec: respect --quiet
2024-08-28 10:31:26 -07:00
e49d2472d2 reftable: mark unused parameters in empty iterator functions
These unused parameters were marked in a68ec8683a (reftable: mark unused
parameters in virtual functions, 2024-08-17), but the functions were
moved to a new file in a parallel branch via f2406c81b9
(reftable/generic: move generic iterator code into iterator interface,
2024-08-22).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 10:09:56 -07:00
08e83b5ec5 t-reftable-block: mark unused argv/argc
This is conceptually the same as the cases in df9d638c24 (unit-tests:
ignore unused argc/argv, 2024-08-17), but this unit test was migrated
from the reftable tests in a parallel branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 10:09:32 -07:00
a61bc8879e CodingGuidelines: mention -Wunused-parameter and UNUSED
Now that -Wunused-parameter is on by default for DEVELOPER=1 builds,
people may trigger it, blocking their build. When it's a mistake for the
parameter to exist, the path forward is obvious: remove it. But
sometimes you need to suppress the warning, and the "UNUSED" mechanism
for that is specific to our project, so people may not know about it.

Let's put some advice in CodingGuidelines, including an example warning
message. That should help people who grep for the warning text after
seeing it from the compiler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:25 -07:00
a219a6739c config.mak.dev: enable -Wunused-parameter by default
Having now removed or annotated all of the unused function parameters in
our code base, I found that each instance falls into one of three
categories:

  1. ignoring the parameter is a bug (e.g., a function takes a ptr/len
     pair, but ignores the length). Detecting these helps us find the
     bugs.

  2. the parameter is unnecessary (and usually left over from a
     refactoring or earlier iteration of a patches series). Removing
     these cleans up the code.

  3. the function has to conform to a specific interface (because it's
     used via a function pointer, or matches something on the other side
     of an #ifdef). These ones are annoying, but annotating them with
     UNUSED is not too bad (especially if the compiler tells you about
     the problem promptly).

Certainly instances of (3) are more common than (1), but after finding
all of these, I think there were enough cases of (1) that it justifies
the work in annotating all of the (3)s.

And since the code base is now at a spot where we compile cleanly with
-Wunused-parameter, turning it on will make it the responsibility of
individual patch writers going forward.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:18 -07:00
b652382d76 compat: mark unused parameters in win32/mingw functions
The compat/ directory contains many stub functions, wrappers, and so on
that have to conform to a specific interface, but don't necessarily need
to use all of their parameters. Let's mark them to avoid complaints from
-Wunused-parameter.

This was done mostly via guess-and-check with the Windows build in
GitHub CI. I also confirmed that the win+VS build is similarly happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:18 -07:00
141491840d compat: disable -Wunused-parameter in win32/headless.c
As with the files touched in the previous commit, win32/headless.c does
not include git-compat-util.h, so it doesn't have our UNUSED macro.
Unlike those ones, this is not third-party code, so it would not be a
big deal to modify it.

However, I'm not sure if including git-compat-util.h would create other
headaches (and I don't even have a machine to test this on; I'm relying
on Windows CI to compile it at all). Given how trivial the file is, and
that the unused parameters are not interesting (they are just
boilerplate for the wWinMain() function), we can just use the same trick
as the previous commit and disable the warnings via pragma.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:18 -07:00
4550c16434 compat: disable -Wunused-parameter in 3rd-party code
We carry some vendored 3rd-party code in compat/ that does not build
cleanly with -Wunused-parameters. We could mark these with UNUSED, but
there are two reasons not to:

  1. This is code imported from elsewhere, so we'd prefer to avoid
     modifying it in an invasive way that could create conflicts if we
     tried to pull in a new version.

  2. These files don't include git-compat-util.h at all, so we'd need to
     factor out (or repeat) our UNUSED macro.

In theory we could modify the build process to invoke the compiler with
the extra warning disabled for these files, but there are tricky corner
cases there (e.g., for NO_REGEX we cannot assume that the compiler
understands -Wno-unused-parameter as an option, so we'd have to use our
detect-compiler script).

Instead, let's rely on the gcc diagnostic #pragma. This is horribly
unportable, of course, but it should do what we want.  Compilers which
don't understand this particular pragma should ignore it (per the
standard), and compilers which do care about "-Wunused-parameter" will
hopefully respect it, even if they are not gcc (e.g., clang does).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:18 -07:00
8c90b41f0a t-reftable-readwrite: mark unused parameter in callback function
This spot was originally marked in in 4695c3f3a9 (reftable: mark unused
parameters in virtual functions, 2024-08-17), but was copied in
5b539a5361 (t: move reftable/readwrite_test.c to the unit testing
framework, 2024-08-13).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:17 -07:00
551e4de8e1 gc: mark unused config parameter in virtual functions
Commit d1ae15d68b (builtin/gc: refactor to read config into structure,
2024-08-16) added a new parameter to the maintenance_task virtual
functions, but most of them don't need to look at it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:17 -07:00
241499aba0 send-email: add mailmap support via sendemail.mailmap and --mailmap
In some cases, a user may be generating a patch for an old commit which
now has an out-of-date author or other identity. For example, consider a
team member who contributes to an internal fork of an upstream project,
but leaves before this change is submitted upstream.

In this case, the team members company address may no longer be valid,
and will thus bounce when sending email.

This can be manually avoided by editing the generated patch files, or by
carefully using --suppress-<cc|to> options. This requires a lot of
manual intervention and is easy to forget.

Git has support for mapping old email addresses and names to a canonical
name and address via the .mailmap file (and its associated mailmap.file,
mailmap.blob, and log.mailmap options).

Teach git send-email to enable mailmap support for all addresses. This
ensures that addresses point to the canonical real name and email
address.

Add the sendemail.mailmap configuration option and its associated
--mailmap (and --use-mailmap for compatibility with git log) options.
For now, the default behavior is to disable the mailmap in order to
avoid any surprises or breaking any existing setups.

These options support per-identity configuration via the
sendemail.identity configuration blocks. This enables identity-specific
configuration in cases where users may not want to enable support.

In addition, support send-email specific mailmap data via
sendemail.mailmap.file, sendemail.mailmap.blob and their
identity-specific variants.

The intention of these options is to enable mapping addresses which are
no longer valid to a current project or team maintainer. Such mappings
may change the actual person being referred to, and may not make sense
in a traditional mailmap file which is intended for updating canonical
name and address for the same individual.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:51:29 -07:00
f54ca6ae72 check-mailmap: add options for additional mailmap sources
The git check-mailmap command reads the mailmap from either the default
.mailmap location and then from the mailmap.blob and mailmap.file
configurations.

A following change to git send-email will want to support new
configuration options based on the configured identity. The
identity-based configuration and options only make sense in the context
of git send-email.

Expose the read_mailmap_file and read_mailmap_blob functions from
mailmap.c.  Teach git check-mailmap the --mailmap-file and
--mailmap-blob options which load the additional mailmap sources.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:51:29 -07:00
3a27e991f2 check-mailmap: accept "user@host" contacts
git check-mailmap splits each provided contact using split_ident_line.
This function requires that the contact either be of the form "Name
<user@host>" or of the form "<user@host>". In particular, if the mail
portion of the contact is not surrounded by angle brackets,
split_ident_line will reject it.

This results in git check-mailmap rejecting attempts to translate simple
email addresses:

  $ git check-mailmap user@host
  fatal: unable to parse contact: user@host

This limits the usability of check-mailmap as it requires placing angle
brackets around plain email addresses.

In particular, attempting to use git check-mailmap to support mapping
addresses in git send-email is not straight forward. The sanitization
and validation functions in git send-email strip angle brackets from
plain email addresses. It is not trivial to add brackets prior to
invoking git check-mailmap.

Instead, modify check_mailmap() to allow such strings as contacts. In
particular, treat any line which cannot be split by split_ident_line as
a simple email address.

No attempt is made to actually parse the address line, or validate that
it is actually an email address. Implementing such validation is not
trivial. Besides, we weren't validating the address between angle
brackets before anyways.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:51:28 -07:00
d3e7db2b82 builtin/pack-objects.c: do not open-code MAX_PACK_OBJECT_HEADER
The function `write_reused_pack_one()` defines an header to store the
OFS_DELTA header, but uses the constant "10" instead of
"MAX_PACK_OBJECT_HEADER" (as is done elsewhere in the same patch, circa
bb514de356 (pack-objects: improve partial packfile reuse, 2019-12-18)).

Declare the `ofs_header` field to be sized according to
`MAX_PACK_OBJECT_HEADER` (which is 10, as defined in "pack.h") instead
of the constant 10.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:27 -07:00
db40e3c92b pack-bitmap.c: avoid repeated pack_pos_to_offset() during reuse
When calling `try_partial_reuse()`, the (sole) caller from the function
`reuse_partial_packfile_from_bitmap_1()` has to translate its bit
position to a pack position.

In the MIDX bitmap case, the caller translates from the bit position, to
a position in the MIDX's pseudo-pack order (with `pack_pos_to_midx()`),
then get a pack offset (with `nth_midxed_offset()`) before finally
working backwards to get the pack position in the source pack by calling
`offset_to_pack_pos()`.

In the non-MIDX bitmap case, we can use the bit position as the pack
position directly (see the comment at the beginning of the
`reuse_partial_packfile_from_bitmap_1()` function for why).

In either case, the first thing that `try_partial_reuse()` does after
being called is determine the offset of the object at the given pack
position by calling `pack_pos_to_offset()`. But we already have that
information in the MIDX case!

Avoid re-computing that information by instead passing it in. In the
MIDX case, we already have that information stored. In the non-MIDX
case, the call to `pack_pos_to_offset()` moves from the function
`try_partial_reuse()` to its caller. In total, we'll save one call to
`pack_pos_to_offset()` when processing MIDX bitmaps.

(On my machine, there is a slight speed-up on the order of ~2ms, but it
is within the margin of error over 10 runs, so I think you'd have to
have a truly gigantic repository to confidently measure any significant
improvement here).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:27 -07:00
125c32605a builtin/pack-objects.c: translate bit positions during pack-reuse
When reusing chunks verbatim from an existing source pack, the function
write_reused_pack() first attempts to reuse whole words (via the
function `write_reused_pack_verbatim()`), and then individual bits (via
`write_reused_pack_one()`).

In the non-MIDX case, all of this code works fine. Likewise, in the MIDX
case, processing bits individually from the first (preferred) pack works
fine. However, processing subsequent packs in the MIDX case is broken
when there are duplicate objects among the set of MIDX'd packs.

This is because we treat the individual bit positions as valid pack
positions within the source pack(s), which does not account for gaps in
the source pack, like we see when the MIDX must break ties between
duplicate objects which appear in multiple packs.

The broken code looks like:

    for (; i < reuse_packfile_bitmap->word_alloc; i++) {
            for (offset = 0; offset < BITS_IN_EWORD, offset++) {
                    /* ... */

                    write_reused_pack_one(reuse_packfile->p,
                                          pos + offset - reuse_packfile->bitmap_pos,
                                          f, pack_start, &w_curs);
            }
    }

, where the second argument is incorrect and does not account for gaps.

Instead, make sure that we translate bit positions in the MIDX's
pseudo-pack order to pack positions in the respective source packs by:

  - Translating the bit position (pseudo-pack order) to a MIDX position
    (lexical order).

  - Use the MIDX position to obtain the offset at which the given object
    occurs in the source pack.

  - Then translate that offset back into a pack relative position within
    the source pack by calling offset_to_pack_pos().

After doing this, then we can safely use the result as a pack position.
Note that when doing single-pack reuse, as well as reusing objects from
the MIDX's preferred pack, such translation is not necessary, since
either ties are broken in favor of the preferred pack, or there are no
ties to break at all (in the case of non-MIDX bitmaps).

Failing to do this can result in strange failure modes. One example that
can occur when misinterpreting bits in the above fashion is that Git
thinks it's supposed to send a delta that the caller does not want.
Under this (incorrect) assumption, we try to look up the delta's base
(so that we can patch any OFS_DELTAs if necessary). We do this using
find_reused_offset().

But if we try and call that function for an offset belonging to an
object we did not send, we'll get back garbage. This can result in us
computing a negative fixup value, which results in memory corruption
when trying to write the (patched) OFS_DELTA header.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:26 -07:00
41cd4b478f pack-bitmap: tag bitmapped packs with their corresponding MIDX
The next commit will need to use the bitmap's MIDX (if one exists) to
translate bit positions into pack-relative positions in the source pack.

Ordinarily, we'd use the "midx" field of the bitmap_index struct. But
since that struct is defined within pack-bitmap.c, and our caller is in
a separate compilation unit, we do not have access to the MIDX field.

Instead, add a "from_midx" field to the bitmapped_pack structure so that
we can use that piece of data from outside of pack-bitmap.c. The caller
that uses this new piece of information will be added in the following
commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:26 -07:00
bbc393a9f3 t/t5332-multi-pack-reuse.sh: verify pack generation with --strict
In our tests for multi-pack reuse, we have two helper functions:

  - test_pack_objects_reused_all(), and
  - test_pack_objects_reused()

which invoke pack-objects (either with `--all`, or the supplied tips via
stdin, respectively) and ensure that (a) the number of reused objects,
and (b) the number of packs which those objects were reused from both
match the expected values.

Both functions discard the output of pack-objects and assert only on the
contents of the trace2 stream.

However, if we store the pack and attempt to index it with `--strict`,
we find that a number of our tests are broken, indicating a bug within
multi-pack reuse.

That bug will be addressed in a subsequent commit. But let's first
harden these tests by trying to index the resulting pack, marking the
tests which fail appropriately.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:26 -07:00
1609470409 git-config.1: fix description of --regexp in synopsis
The synopsis says --regexp=<regexp> but the --regexp option is a
Boolean that says "the name given is not literal, but a pattern to
match the name".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26 11:49:37 -07:00
686e9f616f git-config.1: --get-all description update
"git config --get-all foo.bar" shows all values for the foo.bar
variable, but does not give the variable name in each output entry.
Hence it is equivalent to "git config get --all foo.bar", without
"--show-names", in the more modern syntax.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26 11:49:27 -07:00
159f2d50e7 Sync with 'maint' 2024-08-26 11:38:08 -07:00
b63a92d515 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26 11:32:24 -07:00
27d4f4032e Merge branch 'jc/coding-style-c-operator-with-spaces'
Write down whitespacing rules around C opeators.

* jc/coding-style-c-operator-with-spaces:
  CodingGuidelines: spaces around C operators
2024-08-26 11:32:24 -07:00
3222718ad7 Merge branch 'ds/for-each-ref-is-base'
'git for-each-ref' learned a new "--format" atom to find the branch
that the history leading to a given commit "%(is-base:<commit>)" is
likely based on.

* ds/for-each-ref-is-base:
  p1500: add is-base performance tests
  for-each-ref: add 'is-base' token
  commit: add gentle reference lookup method
  commit-reach: add get_branch_base_for_tip
2024-08-26 11:32:24 -07:00
3dd2a2feca Merge branch 'jk/send-email-translate-aliases'
"git send-email" learned "--translate-aliases" option that reads
addresses from the standard input and emits the result of applying
aliases on them to the standard output.

* jk/send-email-translate-aliases:
  send-email: teach git send-email option to translate aliases
  t9001-send-email.sh: update alias list used for pine test
  t9001-send-email.sh: fix quoting for mailrc --dump-aliases test
2024-08-26 11:32:23 -07:00
2b30d66c43 Merge branch 'jk/mark-unused-parameters'
Mark unused parameters as UNUSED to squelch -Wunused warnings.

* jk/mark-unused-parameters:
  t-hashmap: stop calling setup() for t_intern() test
  scalar: mark unused parameters in dummy function
  daemon: mark unused parameters in non-posix fallbacks
  setup: mark unused parameter in config callback
  test-mergesort: mark unused parameters in trivial callback
  t-hashmap: mark unused parameters in callback function
  reftable: mark unused parameters in virtual functions
  reftable: drop obsolete test function declarations
  reftable: ignore unused argc/argv in test functions
  unit-tests: ignore unused argc/argv
  t/helper: mark more unused argv/argc arguments
  oss-fuzz: mark unused argv/argc argument
  refs: mark unused parameters in do_for_each_reflog_helper()
  refs: mark unused parameters in ref_store fsck callbacks
  update-ref: mark more unused parameters in parser callbacks
  imap-send: mark unused parameter in ssl_socket_connect() fallback
2024-08-26 11:32:23 -07:00
2ff26d2286 Merge branch 'jk/drop-unused-parameters'
Drop unused parameters from functions.

* jk/drop-unused-parameters:
  diff-lib: drop unused index argument from get_stat_data()
  ref-filter: drop unused parameters from email_atom_option_parser()
  pack-bitmap: drop unused parameters from select_pseudo_merges()
  pack-bitmap: load writer config from repository parameter
  refs: drop some unused parameters from create_symref_lock()
2024-08-26 11:32:22 -07:00
1f4d89dfce Merge branch 'tb/pseudo-merge-bitmap-fixes'
We created a useless pseudo-merge reachability bitmap that is about
0 commits, and attempted to include commits that are not in packs,
which made no sense.  These bugs have been corrected.

* tb/pseudo-merge-bitmap-fixes:
  pseudo-merge.c: ensure pseudo-merge groups are closed
  pseudo-merge.c: do not generate empty pseudo-merge commits
  t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
  pack-bitmap-write.c: select pseudo-merges even for small bitmaps
  pack-bitmap: drop redundant args from `bitmap_writer_finish()`
  pack-bitmap: drop redundant args from `bitmap_writer_build()`
  pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()`
  pack-bitmap: initialize `bitmap_writer_init()` with packing_data
2024-08-26 11:32:21 -07:00
6e6f68b59b Merge branch 'ps/maintenance-detach-fix-more'
A tests for "git maintenance" that were broken on Windows have been
corrected.

* ps/maintenance-detach-fix-more:
  builtin/maintenance: fix loose objects task emitting pack hash
  t7900: exercise detaching via trace2 regions
  t7900: fix flaky test due to leaking background job
2024-08-26 11:32:20 -07:00
1e8962ee08 Merge branch 'ps/maintenance-detach-fix'
Maintenance tasks other than "gc" now properly go background when
"git maintenance" runs them.

* ps/maintenance-detach-fix:
  run-command: fix detaching when running auto maintenance
  builtin/maintenance: add a `--detach` flag
  builtin/gc: add a `--detach` flag
  builtin/gc: stop processing log file on signal
  builtin/gc: fix leaking config values
  builtin/gc: refactor to read config into structure
  config: fix constness of out parameter for `git_config_get_expiry()`
2024-08-26 11:32:20 -07:00
6809f8ccad A bit more topics for 2.46.x maintenance track
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26 11:13:19 -07:00
5072ad8260 Merge branch 'xx/diff-tree-remerge-diff-fix' into maint-2.46
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.

* xx/diff-tree-remerge-diff-fix:
  diff-tree: fix crash when used with --remerge-diff
2024-08-26 11:10:25 -07:00
164cffa35c Merge branch 'rs/t-example-simplify' into maint-2.46
Unit test simplification.

* rs/t-example-simplify:
  t-example-decorate: remove test messages
2024-08-26 11:10:24 -07:00
c93649f98a Merge branch 'jc/safe-directory' into maint-2.46
Follow-up on 2.45.1 regression fix.

* jc/safe-directory:
  safe.directory: setting safe.directory="." allows the "current" directory
  safe.directory: normalize the configured path
  safe.directory: normalize the checked path
  safe.directory: preliminary clean-up
2024-08-26 11:10:24 -07:00
b452be06ff Merge branch 'jc/document-use-of-local' into maint-2.46
Doc update.

* jc/document-use-of-local:
  doc: note that AT&T ksh does not work with our test suite
2024-08-26 11:10:23 -07:00
9a7bd3d0cb Merge branch 'rs/use-decimal-width' into maint-2.46
Code clean-up.

* rs/use-decimal-width:
  log-tree: use decimal_width()
2024-08-26 11:10:23 -07:00
5d0870d68c Merge branch 'ss/packed-ref-store-leakfix' into maint-2.46
Leakfix.

* ss/packed-ref-store-leakfix:
  refs/files: prevent memory leak by freeing packed_ref_store
2024-08-26 11:10:22 -07:00
24a64ea0eb Merge branch 'kl/test-fixes' into maint-2.46
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-26 11:10:21 -07:00
710ef8a945 Merge branch 'jc/reflog-expire-lookup-commit-fix' into maint-2.46
"git reflog expire" failed to honor annotated tags when computing
reachable commits.

* jc/reflog-expire-lookup-commit-fix:
  Revert "reflog expire: don't use lookup_commit_reference_gently()"
2024-08-26 11:10:21 -07:00
7bba1bd806 Merge branch 'jr/ls-files-expand-literal-doc' into maint-2.46
Docfix.

* jr/ls-files-expand-literal-doc:
  doc: fix hex code escapes in git-ls-files
2024-08-26 11:10:20 -07:00
528a762ca6 Merge branch 'jc/leakfix-mailmap' into maint-2.46
Leakfix.

* jc/leakfix-mailmap:
  mailmap: plug memory leak in read_mailmap_blob()
2024-08-26 11:10:20 -07:00
88639e5d4c Merge branch 'jc/leakfix-hashfile' into maint-2.46
Leakfix.

* jc/leakfix-hashfile:
  csum-file: introduce discard_hashfile()
2024-08-26 11:10:19 -07:00
a5e4f53baf Merge branch 'jc/jl-git-no-advice-fix' into maint-2.46
Remove leftover debugging cruft from a test script.

* jc/jl-git-no-advice-fix:
  t0018: remove leftover debugging cruft
2024-08-26 11:10:19 -07:00
5613c83f30 Merge branch 'tb/config-fixed-value-with-valueless-true' into maint-2.46
"git config --value=foo --fixed-value section.key newvalue" barfed
when the existing value in the configuration file used the
valueless true syntax, which has been corrected.

* tb/config-fixed-value-with-valueless-true:
  config.c: avoid segfault with --fixed-value and valueless config
2024-08-26 11:10:18 -07:00
a991ffff92 Merge branch 'ps/ls-remote-out-of-repo-fix' into maint-2.46
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.

* ps/ls-remote-out-of-repo-fix:
  builtin/ls-remote: fall back to SHA1 outside of a repo
2024-08-26 11:10:18 -07:00
87f8426bf7 Merge branch 'jk/osxkeychain-username-is-nul-terminated' into maint-2.46
The credential helper to talk to OSX keychain sometimes sent
garbage bytes after the username, which has been corrected.

* jk/osxkeychain-username-is-nul-terminated:
  credential/osxkeychain: respect NUL terminator in username
2024-08-26 11:10:17 -07:00
4e7aa344f2 remote: plug memory leaks at early returns
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 14:20:07 -07:00
6a09c36371 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 09:02:36 -07:00
62c5b88157 Merge branch 'ps/stash-keep-untrack-empty-fix'
A corner case bug in "git stash" was fixed.

* ps/stash-keep-untrack-empty-fix:
  builtin/stash: fix `--keep-index --include-untracked` with empty HEAD
2024-08-23 09:02:36 -07:00
2cf9c2206c Merge branch 'ps/hash-and-ref-format-from-config'
The default object hash and ref backend format used to be settable
only with explicit command line option to "git init" and
environment variables, but now they can be configured in the user's
global and system wide configuration.

* ps/hash-and-ref-format-from-config:
  setup: make ref storage format configurable via config
  setup: make object format configurable via config
  setup: merge configuration of repository formats
  t0001: delete repositories when object format tests finish
  t0001: exercise initialization with ref formats more thoroughly
2024-08-23 09:02:36 -07:00
668843e6d8 Merge branch 'cp/unit-test-reftable-readwrite'
* cp/unit-test-reftable-readwrite:
  t-reftable-readwrite: add test for known error
  t-reftable-readwrite: use 'for' in place of infinite 'while' loops
  t-reftable-readwrite: use free_names() instead of a for loop
  t: move reftable/readwrite_test.c to the unit testing framework
2024-08-23 09:02:35 -07:00
5e56a39e6a Merge branch 'ps/config-wo-the-repository'
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.

* ps/config-wo-the-repository:
  config: hide functions using `the_repository` by default
  global: prepare for hiding away repo-less config functions
  config: don't depend on `the_repository` with branch conditions
  config: don't have setters depend on `the_repository`
  config: pass repo to functions that rename or copy sections
  config: pass repo to `git_die_config()`
  config: pass repo to `git_config_get_expiry_in_days()`
  config: pass repo to `git_config_get_expiry()`
  config: pass repo to `git_config_get_max_percent_split_change()`
  config: pass repo to `git_config_get_split_index()`
  config: pass repo to `git_config_get_index_threads()`
  config: expose `repo_config_clear()`
  config: introduce missing setters that take repo as parameter
  path: hide functions using `the_repository` by default
  path: stop relying on `the_repository` in `worktree_git_path()`
  path: stop relying on `the_repository` when reporting garbage
  hooks: remove implicit dependency on `the_repository`
  editor: do not rely on `the_repository` for interactive edits
  path: expose `do_git_common_path()` as `repo_common_pathv()`
  path: expose `do_git_path()` as `repo_git_pathv()`
2024-08-23 09:02:34 -07:00
1b6b2bfae5 Merge branch 'ps/leakfixes-part-4'
More leak fixes.

* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-23 09:02:33 -07:00
85da2a2ab6 reftable/stack: fix segfault when reload with reused readers fails
It is expected that reloading the stack fails with concurrent writers,
e.g. because a table that we just wanted to read just got compacted.
In case we decided to reuse readers this will cause a segfault though
because we unconditionally release all new readers, including the reused
ones. As those are still referenced by the current stack, the result is
that we will eventually try to dereference those already-freed readers.

Fix this bug by incrementing the refcount of reused readers temporarily.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:48 -07:00
1302ed68d4 reftable/stack: reorder swapping in the reloaded stack contents
The code flow of how we swap in the reloaded stack contents is somewhat
convoluted because we switch back and forth between swapping in
different parts of the stack.

Reorder the code to simplify it. We now first close and unlink the old
tables which do not get reused before we update the stack to point to
the new stack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
89eada4ea1 reftable/reader: keep readers alive during iteration
The lifetime of a table iterator may survive the lifetime of a reader
when the stack gets reloaded. Keep the reader from being released by
increasing its refcount while the iterator is still being used.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
d857469d85 reftable/reader: introduce refcounting
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.

One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      #18 0x55f23dba7764 in run_builtin git.c:484
      #19 0x55f23dba7764 in handle_builtin git.c:741
      #20 0x55f23dbab61e in run_argv git.c:805
      #21 0x55f23dbab61e in cmd_main git.c:1000
      #22 0x55f23dba4781 in main common-main.c:64
      #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
4ac2fd9b4a reftable/stack: fix broken refnames in write_n_ref_tables()
The `write_n_ref_tables()` helper function writes N references in
separate tables. We never reset the computed name of those references
though, leading us to end up with unexpected names.

Fix this by resetting the buffer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
00e130a6bb reftable/reader: inline reader_close()
Same as with the preceding commit, we also provide a `reader_close()`
function that allows the caller to close a reader without freeing it.
This is unnecessary now that all users will have an allocated version of
the reader.

Inline it into `reftable_reader_free()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
2de3c0d345 reftable/reader: inline init_reader()
Most users use an allocated version of the `reftable_reader`, except for
some tests. We are about to convert the reader to become refcounted
though, and providing the ability to keep a reader on the stack makes
this conversion harder than necessary.

Update the tests to use `reftable_reader_new()` instead to prepare for
this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
a0218203cd reftable/reader: rename reftable_new_reader()
Rename the `reftable_new_reader()` function to `reftable_reader_new()`
to match our coding guidelines.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
a52bac9ac0 reftable/stack: inline stack_compact_range_stats()
The only difference between `stack_compact_range_stats()` and
`stack_compact_range()` is that the former updates stats on failure,
whereas the latter doesn't. There are no callers anymore that do not
want their stats updated though, making the indirection unnecessary.

Inline the stat updates into `stack_compact_range()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
afdafade1a reftable/blocksource: drop malloc block source
The reftable blocksource provides a generic interface to read blocks via
different sources, e.g. from disk or from memory. One of the block
sources is the malloc block source, which can in theory read data from
memory. We nowadays also have a strbuf block source though, which
provides essentially the same functionality with better ergonomics.

Adapt the only remaining user of the malloc block source in our tests
to use the strbuf block source, instead, and remove the now-unused
malloc block source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
596f4ff6ad doc: replace 3 dash with correct 2 dash in git-config(1)
Commit 4e51389000 (builtin/config: introduce "get" subcommand, 2024-05-06)
introduced this typo.  It uses 3 dashes for regexp argument instead of
correct 2 dashes.

Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:02:58 -07:00
db5281276e send-pack: add new tracing regions for push
At $DAYJOB we experienced some slow pushes and needed additional trace
data to diagnose them.

Add trace2 regions for various sections of send_pack().

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 15:02:32 -07:00
a45ab54987 fetch: add top-level trace2 regions
At $DAYJOB we experienced some slow fetch operations and needed some
additional data to help diagnose the issue.

Add top-level trace2 regions for the various modes of operation of
`git-fetch`. None of these regions are in recursive code, so any
enclosed trace messages should only see their nesting level increase by
one.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 15:02:31 -07:00
cbe140754b trace2: implement trace2_printf() for event target
The trace2 event target does not have an implementation for
trace2_printf(). While the event target is for structured events, and
trace2_printf() is for unstructured, human-readable messages, it may
still be useful to wrap these unstructured messages in a structured JSON
object. Among other things, it may reduce confusion when manually
debugging using event trace data.

Add a simple implementation for the event target that wraps
trace2_printf() messages in a minimal JSON object. Document this in
Documentation/technical/api-trace2.txt, and bump the event format
version since we're adding a new event type.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 15:02:31 -07:00
4881328617 docs: explain the order of output in the batched mode of git-cat-file(1)
The batched mode of git-cat-file(1) reads multiple objects from stdin
and prints their respective contents to stdout.
The order in which those objects are printed is not documented
and may not be immediately obvious to the user.
Document it.

Signed-off-by: ahmed akef <aemed.akef.1@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 14:59:22 -07:00
f975a3a38c Merge branch 'ps/reftable-drop-generic' into ps/reftable-concurrent-compaction
* ps/reftable-drop-generic: (24 commits)
  reftable/generic: drop interface
  t/helper: refactor to not use `struct reftable_table`
  t/helper: use `hash_to_hex_algop()` to print hashes
  t/helper: inline printing of reftable records
  t/helper: inline `reftable_table_print()`
  t/helper: inline `reftable_stack_print_directory()`
  t/helper: inline `reftable_reader_print_file()`
  t/helper: inline `reftable_dump_main()`
  reftable/dump: drop unused `compact_stack()`
  reftable/generic: move generic iterator code into iterator interface
  reftable/iter: drop double-checking logic
  reftable/stack: open-code reading refs
  reftable/merged: stop using generic tables in the merged table
  reftable/merged: rename `reftable_new_merged_table()`
  reftable/merged: expose functions to initialize iterators
  reftable/stack: handle locked tables during auto-compaction
  reftable/stack: fix corruption on concurrent compaction
  reftable/stack: use lock_file when adding table to "tables.list"
  reftable/stack: do not die when fsyncing lock file files
  reftable/stack: simplify tracking of table locks
  ...
2024-08-22 11:30:51 -07:00
b44c926c9f diff-index: integrate with the sparse index
The sparse index allows focusing the index data structure on the files
present in the sparse-checkout, leaving only tree entries for
directories not within the sparse-checkout. Each builtin needs a
repository setting to indicate that it has been tested with the sparse
index before Git will allow the index to be loaded into memory in its
sparse form. This is a safety precaution.

There are still some builtins that haven't been integrated due to the
complexity of the integration and the lack of significant use. However,
'git diff-index' was neglected only because of initial data showing low
usage. The diff machinery was already integrated and there is no more
work to be done there but add some tests to be sure 'git diff-index'
behaves as expected.

For this purpose, we can follow the testing pattern used in 51ba65b5c3
(diff: enable and test the sparse index, 2021-12-06). One difference
here is that we only verify that the sparse index case agrees with the
full index case, but do not generate the expected output. The 'git diff'
tests use the '--name-status' option to ease the creation of the
expected output, but that's not an option for 'diff-index'. Since the
underlying diff machinery is the same, a simple comparison is sufficient
to give some coverage.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:29:14 -07:00
13b23d2da5 transport: fix leaking negotiation tips
We do not free negotiation tips in the transport's smart options. Fix
this by freeing them on disconnect.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
7720460ccf transport: fix leaking arguments when fetching from bundle
In `fetch_refs_from_bundle()` we assemble a vector of arguments to pass
to `unbundle()`, but never free it. And in theory we wouldn't have to
because `unbundle()` already knows to free the vector for us. But it
fails to do so when it exits early due to `verify_bundle()` failing.

The calling convention that the arguments are freed by the callee and
not the caller feels somewhat weird. Refactor the code such that it is
instead the responsibility of the caller to free the vector, adapting
the only two callsites where we pass extra arguments. This also fixes
the memory leak.

This memory leak gets hit in t5510, but fixing it isn't sufficient to
make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
c92abe71df builtin/fetch: fix leaking transaction with --atomic
With the `--atomic` flag, we use a single ref transaction to commit all
ref updates in git-fetch(1). The lifetime of transactions is somewhat
weird: while `ref_transaction_abort()` will free the transaction, a call
to `ref_transaction_commit()` won't. We thus have to manually free the
transaction in the successful case.

Adapt the code to free the transaction in the exit path to plug the
resulting memory leak. As `ref_transaction_abort()` already freed the
transaction for us, we have to unset the transaction when we hit that
code path to not cause a double free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
8960819e73 remote: fix leaking peer ref when expanding refmap
When expanding remote refs via the refspec in `get_expanded_map()`, we
first copy the remote ref and then override its peer ref with the
expanded name. This may cause a memory leak though in case the peer ref
is already set, as this field is being copied by `copy_ref()`, as well.

Fix the leak by freeing the peer ref before we re-assign the field.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
5e9e04a064 remote: fix leaks when matching refspecs
In `match_explicit()`, we try to match a source ref with a destination
ref according to a refspec item. This matching sometimes requires us to
allocate a new source spec so that it looks like we expect. And while we
in some end up assigning this allocated ref as `peer_ref`, which hands
over ownership of it to the caller, in other cases we don't. We neither
free it though, causing a memory leak.

Fix the leak by creating a common exit path where we can easily free the
source ref in case it is allocated and hasn't been handed over to the
caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
f5ccb535cc remote: fix leaking config strings
We're leaking several config strings when assembling remotes, either
because we do not free preceding values in case a config was set
multiple times, or because we do not free them when releasing the remote
state. This includes config strings for "branch" sections, "insteadOf",
"pushInsteadOf", and "pushDefault".

Plug those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
46e440694f builtin/fetch-pack: fix leaking refs
We build several ref lists in git-fetch-pack(1), but never free them.
Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
2a2d5da1f2 sideband: fix leaks when configuring sideband colors
We read a bunch of configs in `use_sideband_colors()` to configure the
colors that Git should use. We never free the strings read from the
config though, causing memory leaks.

Refactor the code to use `git_config_get_string_tmp()` instead, which
does not allocate memory. As we throw the strings away after parsing
them anyway there is no need to use allocated strings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
a09efb74e3 builtin/send-pack: fix leaking refspecs
We never free data associated with the assembled refspec in
git-send-pack(1), causing a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
ca52234183 transport: fix leaking OID arrays in git:// transport data
The transport data for the "git://" protocol contains two OID arrays
that we never free, creating a memory leak. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
fb24460e1d t/helper: fix leaking multi-pack-indices in "read-midx"
Several of the subcommands of `test-helper read-midx` do not close the
MIDX that they have opened, leading to memory leaks. Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
bda97cb119 builtin/repack: fix leaks when computing packs to repack
When writing an MIDX in git-repack(1) we first collect all the pack
names that we want to add to it in a string list. This list is marked as
`NODUP`, which indicates that it will neither duplicate nor own strings
added to it. In `write_midx_included_packs()` we then `insert()` strings
via `xstrdup()` or `strbuf_detach()`, but the resulting strings will not
be owned by anything and thus leak.

Fix this issue by marking the list as `DUP` and using a local buffer to
compute the pack names.

This leak is hit in t5319, but plugging it is not sufficient to make the
whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
8a7846383e midx-write: fix leaking hashfile on error cases
When writing the MIDX file we first create the `struct hashfile` used to
write the trailer hash, and then afterwards we verify whether we can
actually write the MIDX in the first place. When we decide that we
can't, this leads to a memory leak because we never free the hash file
contents.

We could fix this by freeing the hashfile on the exit path. There is a
better option though: we can simply move the checks for the error
condition earlier. As there is no early exit between creating the
hashfile and finalizing it anymore this is sufficient to fix the memory
leak.

While at it, also move around the block checking for `ctx.entries_nr`.
This change is not required to fix the memory leak, but it feels natural
to move together all massaging of parameters before we go with them and
execute the actual logic.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
479601e9f4 builtin/archive: fix leaking OPT_FILENAME() value
The "--output" switch is an `OPT_FILENAME()` option, which allocates
memory when specified by the user. But while we free the string when
executed without the "--remote" switch, we don't otherwise because we
return via a separate exit path that doesn't know to free it.

Fix this by creating a common exit path.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
149c9e200c builtin/upload-archive: fix leaking args passed to write_archive()
In git-upload-archive(1), we pass an array of arguments to
`write_archive()` to tell it what exactly to do. We don't ever clear the
vector though, causing a memory leak. Furthermore though, the call to
`write_archive()` may cause contents of the array to be modified, which
would cause us to leak memory to allocated strings held by it.

Fix the issue by having `write_archive()` create a shallow copy of
`argv` before parsing the arguments. Like this, we won't modify the
caller's array and can easily `strvec_clear()` it to plug these memory
leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
ff0935b96e builtin/merge-tree: fix leaking -X strategy options
The `-X` switch for git-merge-tree(1) will push each option into a local
`xopts` vector that we then end up parsing. The vector never gets freed
though, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
82ea7e59b2 pretty: fix leaking key/value separator buffer
The `format_set_trailers_options()` function is responsible for parsing
a custom pretty format for trailers. It puts the parsed options into a
`struct process_trailer_options` structure, while the allocated memory
required for this will be put into separate caller-provided arguments.
It is thus the caller's responsibility to free the memory not via the
options structure, but via the other parameters.

While we do this alright for the separator and filter keys, we do not
free the memory associated with the key/value separator. Fix this to
plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
60289b50d0 pretty: fix memory leaks when parsing pretty formats
When parsing pretty formats from the config we leak the name and user
format whenever these are set multiple times. This is because we do not
free any already-set value in case there is one.

Plugging this leak for the name is trivial. For the user format we need
to be a bit more careful, because we may end up assigning a pointer into
the allocated region when the string is prefixed with either "format" or
"tformat:". In order to make it safe to unconditionally free the user
format we thus strdup the stripped string into the field instead of a
pointer into the string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:03 -07:00
643c6f576c convert: fix leaks when resetting attributes
When resetting parsed gitattributes, we free the list of convert drivers
parsed from the config. We only free some of the drivers' fields though
and thus have memory leaks.

Fix this by freeing all allocated convert driver fields to plug these
memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:03 -07:00
e5530f9c5c mailinfo: fix leaking header data
We populate the `mailinfo` arrays `p_hdr_data` and `s_hdr_data` with
data parsed from the mail headers. These arrays may end up being only
partially populated with gaps in case some of the headers do not parse
properly. This causes memory leaks because `strbuf_list_free()` will
stop iterating once it hits the first `NULL` pointer in the backing
array.

Fix this by open-coding a variant of `strbuf_list_free()` that knows to
iterate through all headers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:03 -07:00
987bbcd088 exec_cmd: RUNTIME_PREFIX on z/OS systems
Enable Git to resolve its own binary location using __getprogramdir
and getprogname.

Since /proc is not a mandatory filesystem on z/OS, we cannot rely on the
git_get_exec_path_procfs method to determine Git's executable path. To
address this, we have implemented git_get_exec_path_zos, which resolves
the executable path by extracting it from the current program's
directory and filename.

Signed-off-by: D Harithamma <harithamma.d@ibm.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 08:58:46 -07:00
6014639837 reftable/generic: drop interface
The `reftable_table` interface provides a generic infrastructure that
can abstract away whether the underlying table is a single table, or a
merged table. This abstraction can make it rather hard to reason about
the code. We didn't ever use it to implement the reftable backend, and
with the preceding patches in this patch series we in fact don't use it
at all anymore. Furthermore, it became somewhat useless with the recent
refactorings that made it possible to seek reftable iterators multiple
times, as these now provide generic access to tables for us. The
interface is thus redundant and only brings unnecessary complexity with
it.

Remove the `struct reftable_table` interface and its associated
functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
89191232b8 t/helper: refactor to not use struct reftable_table
The `struct reftable_table` interface in our "reftable" test helper gets
used such that we can easily print either a single table, or a merged
stack. This generic interface is about to go away.

Prepare the code for this change by using merged tables instead. When
printing the stack we've already got one. When using a single table, we
can create a merged table from it to adapt.

This removes the last user of the generic interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
1f39dd2ae5 t/helper: use hash_to_hex_algop() to print hashes
The "reftable" test helper uses a hand-crafted version to convert from a
raw hash to its hex variant. This was done because this code used to be
part of the reftable library, where we do not use most functions from
the Git core.

Now that the code is integrated into the "dump-reftable" helper though,
that limitation went away. Let's thus use `hash_to_hex_algop()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
42c424d69d t/helper: inline printing of reftable records
Move printing of reftable records into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
64a5b7a8ca t/helper: inline reftable_table_print()
Move `reftable_table_print()` into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
ca74ef6ffb t/helper: inline reftable_stack_print_directory()
Move `reftable_stack_print_directory()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Note that this requires us to remove the tests for this functionality in
`reftable/stack_test.c`. The test does not really add much anyway,
because all it verifies is that we do not crash or run into an error,
and it specifically doesn't check the outputted data. Also, as the code
is now part of the test helper, it doesn't make much sense to have a
unit test for it in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
22f519a9a0 t/helper: inline reftable_reader_print_file()
Move `reftable_reader_print_file()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
2b06b28fd6 t/helper: inline reftable_dump_main()
The printing functionality part of `reftable/dump.c` is really only used
by our "dump-reftable" test helper. It is certainly not generic logic
that is useful to anybody outside of Git, and the format it generates is
quite specific. Still, parts of it are used in our test suite and the
output may be useful to take a peek into reftable stacks, tables and
blocks. So while it does not make sense to expose this as part of the
reftable library, it does make sense to keep it around.

Inline the `reftable_dump_main()` function into the "dump-reftable" test
helper. This clarifies that its format is subject to change and not part
of our public interface. Furthermore, this allows us to iterate on the
implementation in subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
55c7ff42f9 reftable/dump: drop unused compact_stack()
The `compact_stack()` function is exposed via `reftable_dump_main()`,
which ultimately ends up being wired into "test-tool reftable". It is
never used by our tests though, and nowadays we have wired up support
for stack compaction into git-pack-refs(1).

Remove the code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
f2406c81b9 reftable/generic: move generic iterator code into iterator interface
Move functions relating to the reftable iterator from "generic.c" into
"iter.c". This prepares for the removal of the former subsystem.

While at it, remove some unneeded braces to conform to our coding style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
b34ce7e911 reftable/iter: drop double-checking logic
The filtering ref iterator can be used to only yield refs which are not
in a specific skip list. This iterator has an option to double-check the
results it returns, which causes us to seek the reference we are about
to yield via a separate table such that we detect whether the reference
that the first iterator has yielded actually exists.

The value of this is somewhat dubious, and I cannot think of any usecase
where this functionality should be required. Furthermore, this option is
never set in our codebase, which means that it is essentially untested.
And last but not least, the `struct reftable_table` that is used to
implement it is about to go away.

So while we could refactor the code to not use a `reftable_table`, it
very much feels like a wasted effort. Let's just drop this code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
aef8602653 reftable/stack: open-code reading refs
To read a reference for the reftable stack, we first create a generic
`reftable_table` from the merged table and then read the reference via a
convenience function. We are about to remove these generic interfaces,
so let's instead open-code the logic to prepare for this removal.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
b8ca235ca5 reftable/merged: stop using generic tables in the merged table
The merged table provides access to a reftable stack by merging the
contents of those tables into a virtual table. These subtables are being
tracked via `struct reftable_table`, which is a generic interface for
accessing either a single reftable or a merged reftable. So in theory,
it would be possible for the merged table to merge together other merged
tables.

This is somewhat nonsensical though: we only ever set up a merged table
over normal reftables, and there is no reason to do otherwise. This
generic interface thus makes the code way harder to follow and reason
about than really necessary. The abstraction layer may also have an
impact on performance, even though the extra set of vtable function
calls probably doesn't really matter.

Refactor the merged tables to use a `struct reftable_reader` for each of
the subtables instead, which gives us direct access to the underlying
tables. Adjust names accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
6631ed3ce7 reftable/merged: rename reftable_new_merged_table()
Rename `reftable_new_merged_table()` to `reftable_merged_table_new()`
such that the name matches our coding style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
987762a51a reftable/merged: expose functions to initialize iterators
We do not expose any functions via our public headers that would allow a
caller to initialize a reftable iterator from a merged table. Instead,
they are expected to go via the generic `reftable_table` interface,
which is somewhat roundabout.

Implement two new functions to initialize iterators for ref and log
records to plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:45 -07:00
3a7362eb9f The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 12:02:25 -07:00
74f94f27a9 Merge branch 'jc/how-to-maintain-updates'
Doc updates.

* jc/how-to-maintain-updates:
  howto-maintain: mention preformatted docs
2024-08-21 12:02:25 -07:00
eb630683c2 Merge branch 'jk/apply-patch-mode-check-fix'
Test fix.

* jk/apply-patch-mode-check-fix:
  t4129: fix racy index when calling chmod after git-add
2024-08-21 12:02:25 -07:00
b772c9cf2e Merge branch 'ps/bundle-outside-repo-fix'
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.

* ps/bundle-outside-repo-fix:
  bundle: default to SHA1 when reading bundle headers
  builtin/bundle: have unbundle check for repo before opening its bundle
2024-08-21 12:02:24 -07:00
fdf70da8c3 Merge branch 'jc/grammo-fixes'
Doc updates.

* jc/grammo-fixes:
  doc: grammofix in git-diff-tree
  tutorial: grammofix
2024-08-21 12:02:24 -07:00
d97956b8bd Merge branch 'ag/git-svn-global-ignores'
"git svn" has been taught about svn:global-ignores property
recent versions of Subversion has.

* ag/git-svn-global-ignores:
  git-svn: mention `svn:global-ignores` in help+docs
  git-svn: use `svn:global-ignores` to create .gitignore
  git-svn: add public property `svn:global-ignores`
2024-08-21 12:02:23 -07:00
8311e3b551 builtin/maintenance: fix loose objects task emitting pack hash
The "loose-objects" maintenance tasks executes git-pack-objects(1) to
pack all loose objects into a new packfile. This command ends up
printing the hash of the packfile to stdout though, which clutters the
output of `git maintenance run`.

Fix this issue by disabling stdout of the child process.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:22 -07:00
51a0b8a2a7 t7900: exercise detaching via trace2 regions
In t7900, we exercise the `--detach` logic by checking whether the
command ended up writing anything to its output or not. This supposedly
works because we close stdin, stdout and stderr when daemonizing. But
one, it breaks on platforms where daemonize is a no-op, like Windows.
And second, that git-maintenance(1) outputs anything at all in these
tests is a bug in the first place that we'll fix in a subsequent commit.

Introduce a new trace2 region around the detach which allows us to more
explicitly check whether the detaching logic was executed. This is a
much more direct way to exercise the logic, provides a potentially
useful signal to tracing logs and also works alright on platforms which
do not have the ability to daemonize.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: dropped a stale in-code comment from a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:02 -07:00
772408fe75 t-reftable-block: add tests for index blocks
In the current testing setup, block operations are left unexercised
for index blocks. Add a test that exercises these operations for
index blocks.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
1528c481d7 t-reftable-block: add tests for obj blocks
In the current testing setup, block operations are left unexercised
for obj blocks. Add a test that exercises these operations for obj
blocks.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
5cba56173b t-reftable-block: add tests for log blocks
In the current testing setup, block operations are only exercised
for ref blocks. Add another test that exercises these operations
for log blocks as well.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
abcddcef3d t-reftable-block: remove unnecessary variable 'j'
Currently, there are two variables for array indices, 'i' and 'j'.
The variable 'j' is used only once and can be easily replaced with
'i'. Get rid of 'j' and replace its occurence with 'i'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
29ee6d5a20 t-reftable-block: use xstrfmt() instead of xstrdup()
Use xstrfmt() to assign a formatted string to a ref record's
refname instead of xstrdup(). This helps save the overhead of
a local 'char' buffer as well as makes the test more compact.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
31216ee28a t-reftable-block: use block_iter_reset() instead of block_iter_close()
block_iter_reset() restores a block iterator to its state at the time
of initialization without freeing any memory while block_iter_close()
deallocates the memory for the iterator.

In the current testing setup, a block iterator is allocated and
deallocated for every iteration of a loop, which hurts performance.
Improve upon this by using block_iter_reset() at the start of each
iteration instead. This has the added benifit of testing
block_iter_reset(), which currently remains untested.

Similarly, remove reftable_record_release() for a reftable record
that is still in use.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
c25cbcd352 t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
In the current testing setup, the record key required for many block
iterator functions is manually stored in a strbuf struct and then
passed to these functions. This is not ideal when there exists a
dedicated function to encode a record's key into a strbuf, namely
reftable_record_key(). Use this function instead of manual encoding.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
e638e9c8f3 t-reftable-block: use reftable_record_equal() instead of check_str()
In the current testing setup, operations like read and write for
reftable blocks as defined by reftable/block.{c, h} are verified by
comparing only the keys of input and output reftable records. This is
not ideal because there can exist inequal reftable records with the
same key. Use the dedicated function for record comparison,
reftable_record_equal(), instead of key-based comparison.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
353672f9f8 t-reftable-block: release used block reader
Used block readers must be released using block_reader_release() to
prevent the occurence of a memory leak. Make test_block_read_write()
conform to this statement.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
6853b931bd t: harmonize t-reftable-block.c with coding guidelines
Harmonize the newly ported test unit-tests/t-reftable-block.c
with the following guidelines:
- Single line 'for' statements must omit curly braces.
- Structs must be 0-initialized with '= { 0 }' instead of '= { NULL }'.
- Array sizes and indices should preferably be of type 'size_t'and
  not 'int'.
- Return code variable should preferably be named 'ret', not 'n'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
546cc0d64e t: move reftable/block_test.c to the unit testing framework
reftable/block_test.c exercises the functions defined in
reftable/block.{c, h}. Migrate reftable/block_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests to follow the unit-tests'
naming conventions.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
4bdd6b7bf2 rebase --exec: respect --quiet
rebase --exec doesn't obey --quiet and ends up printing messages about
the command being executed:

  git rebase HEAD~3 --quiet --exec true
  Executing: true
  Executing: true
  Executing: true

Let's fix that by omitting the "Executing" messages when using --quiet.

Furthermore, the sequencer code includes a few calls to
term_clear_line(), which prints a special character sequence to erase
the previous line displayed on stderr (even when nothing was printed
yet). For an user running the command interactively, the net effect of
calling this function with or without --quiet is the same as the
characters are invisible in the terminal. However, when redirecting the
output to a file or piping to another command, the presence of these
invisible characters is noticeable, and it may break user expectation as
--quiet is not being respected.

We could skip the term_clear_line() calls when --quiet is used, like we
are doing with the "Executing" messages, but it makes much more sense to
condition the line cleaning upon stderr being TTY, since these
characters are really only useful for TTY outputs.

The added test checks for both these two changes.

Reported-by: Lincoln Yuji <lincolnyuji@hotmail.com>
Reported-by: Rodrigo Siqueira <siqueirajordao@riseup.net>
Signed-off-by: Matheus Tavares <matheus.tavb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 08:57:51 -07:00
80ccd8a260 Sync with 'maint' for Windows+VS build jobs used at CI 2024-08-20 14:24:57 -07:00
870e227a67 Merge branch 'jk/midx-unused-fix'
Code clean-up in the base topic.

* jk/midx-unused-fix:
  midx: drop unused parameters from add_midx_to_chain()
2024-08-20 14:23:46 -07:00
6a562e68a3 Merge branch 'js/ci-win-vs-build' into maint-2.46
Sync with Windows+VS build jobs used at CI.

* js/ci-win-vs-build:
  ci(win+VS): download the vcpkg artifacts using a dedicated GitHub Action
  ci: bump microsoft/setup-msbuild from v1 to v2
2024-08-20 14:23:12 -07:00
be10ac7037 mailinfo: we parse fixed headers
The code was written as if we have a small room to add additional
headers to be parsed to the header[] array at runtime, but that is
not our intention at all.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 14:20:58 -07:00
44db6f75cc CodingGuidelines: spaces around C operators
As we have operated with "write like how your surrounding code is
written" for too long, after a huge code drop from another project,
we'll end up being inconsistent before such an imported code is
cleaned up.  We have many uses of cast operator with a space before
its operand, mostly in the reftable code.

Spell the convention out before it spreads to other places.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 14:10:10 -07:00
2df380c280 Merge branch 'ps/leakfixes-part-4' into ps/leakfixes-part-5
* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-20 10:15:27 -07:00
05026637f3 t: migrate t0110-urlmatch-normalization to the new framework
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. And also add different test_msg()s for better debugging.

In the migration, last two of the checks from `t_url_general_escape()`
were slightly changed compared to the shell script. This involves
changing

'\'' -> '
'\!' -> !

in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
pasted verbatim from the shell script.

Another change is the removal of a MINGW prerequisite from one of the
test. It was there because[1] on Windows, the command line is a
Unicode string, it is not possible to pass arbitrary bytes to a
program. But in unit tests we don't have this limitation.

And since we can construct strings with arbitrary bytes in C, let's
also remove the test files which contain URLs with arbitrary bytes in
the 't/t0110' directory and instead embed those URLs in the unit test
code itself.

[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 10:08:28 -07:00
a6bcb3ca01 t-hashmap: stop calling setup() for t_intern() test
Commit f24a9b78a9 (t-hashmap: mark unused parameters in callback
function, 2024-08-17) noted that the t_intern() does not need its
hashmap parameter, but we have to keep it to conform to the function
pointer interface of setup().

But since the only thing setup() does is create and tear down the
hashmap, we can just skip calling setup() entirely for this case, and
drop the unused parameters. This simplifies the code a bit.

Helped-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:33:18 -07:00
fbcdfab348 git-prompt: support custom 0-width PS1 markers
When using colors, the shell needs to identify 0-width substrings
in PS1 - such as color escape sequences - when calculating the
on-screen width of the prompt.

Until now, we used the form %F{<color>} in zsh - which it knows is
0-width, or otherwise use standard SGR esc sequences wrapped between
byte values 1 and 2 (SOH, STX) as 0-width start/end markers, which
bash/readline identify as such.

But now that more shells are supported, the standard SGR sequences
typically work, but the SOH/STX markers might not be identified.

This commit adds support for vars GIT_PS1_COLOR_{PRE,POST} which
set custom 0-width markers or disable the markers.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:19 -07:00
0dbe3d3f16 git-prompt: ta-da! document usage in other shells
With one big exception, git-prompt.sh should now be both almost posix
compliant, and also compatible with most (posix-ish) shells.

That exception is the use of "local" vars in functions, which happens
extensively in the current code, and is not simple to replace with
posix compliant code (but also not impossible).

Luckily, almost all shells support "local" as used by the current
code, with the notable exception of ksh93[u+m], but also the Schily
minimal posix sh (pbosh), and yash in posix mode.

See assessment below that "local" is likely the only blocker in those.

So except mainly ksh93, git-prompt.sh now works in most shells:
- bash, zsh, dash since at least 0.5.8, free/net bsd sh, busybox-ash,
  mksh, openbsd sh, pdksh(!), Schily extended Bourne sh (bosh), yash.

which is quite nice.

As an anecdote, replacing the 1st line in __git_ps1() (local exit=$?)
with these 2 makes it work in all tested shells, even without "local":

  # handles only 0/1 args for simplicity. needs +5 LOC for any $#
  __git_e=$?; local exit="$__git_e" 2>/dev/null ||
    {(eval 'local() { export "$@"; }'; __git_ps1 "$@"); return "$__git_e"; }

Explanation:

  If the shell doesn't have the command "local", define our own
  function "local" which instead does plain (global) assignents.
  Then use __git_ps1 in a subshell to not clober the caller's vars.

  This happens to work because currently there are no name conflicts
  (shadow) at the code, initial value is not assumed (i.e. always
  doing either 'local x=...'  or 'local x;...  x=...'), and assigned
  initial values are quoted (local x="$y"), preventing word split and
  glob expansion (i.e. assignment context is not assumed).

  The last two (always init, quote values) seem to be enough to use
  "local" portably if supported, and otherwise shells indeed differ.

  Uses "eval", else shells with "local" may reject it during parsing.
  We don't need "export", but it's smaller than writing our own loop.

While cute, this approach is not really sustainable because all the
vars become global, which is hard to maintain without conflicts
(but hey, it currently has no conflicts - without even trying...).

However, regardless of being an anecdote, it provides some support to
the assessment that "local" is the only blocker in those shells.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:19 -07:00
29bcec82a6 git-prompt: don't use shell $'...'
$'...' is new in POSIX (2024), and some shells support it in recent
versions, while others have had it for decades (bash, zsh, ksh93).

However, there are still enough shells which don't support it, and
it's cheap to use an alternative form which works in all shells,
so let's do that instead of dismissing it as "it's compliant".

It was agreed to use one form rather than $'...' where supported and
fallback otherwise.

shells where $'...' works:
- bash, zsh, ksh93, mksh, busybox-ash, dash master, free/net bsd sh.

shells where it doesn't work, but the new fallback works:
- all dash releases (up to 0.5.12), older versions of free/net bsd sh,
  openbsd sh, pdksh, all Schily Bourne sh variants, yash.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
b732e08671 git-prompt: add some missing quotes
The issues which this commit fixes are unlikely to be broken
in real life, but the fixes improve correctness, and would prevent
bugs in some uncommon cases, such as weird IFS values.

Listing some portability guidelines here for future reference.

I'm leaving it to someone else to decide whether to include
it in the file itself, place it as a new file, or not.

---------

The command "local" is non standard, but is allowed in this file:
- Quote initialization if it can expand (local x="$y"). See below.
- Don't assume initial value after "local x". Either initialize it
  (local x=..), or set before first use (local x;.. x=..; <use $x>).
  (between shells, "local x" can unset x, or inherit it, or do x= )

Other non-standard features beyond "local" are to be avoided.

Use the standard "test" - [...] instead of non-standard [[...]] .

--------

Quotes (some portability things, but mainly general correctness):

Quotes prevent tilde-expansion of some unquoted literal tildes (~).
If the expansion is undesirable, quotes would ensure that.
  Tilds expanded: a=~user:~/ ;  echo ~user ~/dir
  not expanded:   t="~"; a=${t}user  b=\~foo~;  echo "~user" $t/dir

But the main reason for quoting is to prevent IFS field splitting
(which also coalesces IFS chars) and glob expansion in parts which
contain parameter/arithmetic expansion or command substitution.

"Simple command" (POSIX term) is assignment[s] and/or command [args].
Examples:
  foo=bar         # one assignment
  foo=$bar x=y    # two assignments
  foo bar         # command, no assignments
  x=123 foo bar   # one assignment and a command

The assignments part is not IFS-split or glob-expanded.

The command+args part does get IFS field split and glob expanded,
but only at unquoted expanded/substituted parts.

In the command+args part, expanded/substituted values must be quoted.
(the commands here are "[" and "local"):
  Good: [ "$mode" = yes ]; local s="*" x="$y" e="$?" z="$(cmd ...)"
  Bad:  [ $mode = yes ];   local s=*   x=$y   e=$?   z=$(cmd...)

The arguments to "local" do look like assignments, but they're not
the assignment part of a simple command; they're at the command part.

Still at the command part, no need to quote non-expandable values:
  Good:                 local x=   y=yes;   echo OK
  OK, but not required: local x="" y="yes"; echo "OK"
But completely empty (NULL) arguments must be quoted:
  foo ""   is not the same as:   foo

Assignments in simple commands - with or without an actual command,
don't need quoting becase there's no IFS split or glob expansion:
  Good:   s=* a=$b c=$(cmd...)${x# foo }${y-   } [cmd ...]
  It's also OK to use double quotes, but not required.

This behavior (no IFS/glob) is called "assignment context", and
"local" does not behave with assignment context in some shells,
hence we require quotes when using "local" - for compatibility.

The value between 'case' and 'in' doesn't IFS-split/glob-expand:
  Good:       case  * $foo $(cmd...)  in ... ; esac
  identical:  case "* $foo $(cmd...)" in ... ; esac

Nested quotes in command substitution are fine, often necessary:
  Good: echo "$(foo... "$x" "$(bar ...)")"

Nested quotes in substring ops are legal, and sometimes needed
to prevent interpretation as a pattern, but not the most readable:
  Legal:  foo "${x#*"$y" }"

Nested quotes in "maybe other value" subst are invalid, unnecessary:
  Good:  local x="${y- }";   foo "${z:+ $a }"
  Bad:   local x="${y-" "}"; foo "${z:+" $a "}"
Outer/inner quotes in "maybe other value" have different use cases:
  "${x-$y}"  always one quoted arg: "$x" if x is set, else "$y".
  ${x+"$x"}  one quoted arg "$x" if x is set, else no arg at all.
  Unquoted $x is similar to the second case, but it would get split
  into few arguments if it includes any of the IFS chars.

Assignments don't need the outer quotes, and the braces delimit the
value, so nested quotes can be avoided, for readability:
  a=$(foo "$x")  a=${x#*"$y" }  c=${y- };  bar "$a" "$b" "$c"

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
fe445a1026 git-prompt: replace [[...]] with standard code
The existing [[...]] tests were either already valid as standard [...]
tests, or only required minimal retouch:

Notes:

- [[...]] doesn't do field splitting and glob expansion, so $var
  or $(cmd...) don't need quoting, but [... does need quotes.

- [[ X == Y ]] when Y is a string is same as [ X = Y ], but if Y is
  a pattern, then we need:  case X in Y)... ; esac  .

- [[ ... && ... ]] was replaced with [ ... ] && [ ... ] .

- [[ -o <zsh-option> ]] requires [[...]], so put it in "eval" and only
  eval it in zsh, so other shells would not abort on syntax error
  (posix says [[ has unspecified results, shells allowed to reject it)

- ((x++)) was changed into x=$((x+1))  (yeah, not [[...]] ...)

Shells which accepted the previous forms:
- bash, zsh, ksh93, mksh, openbsd sh, pdksh.

Shells which didn't, and now can process it:
- dash, free/net bsd sh, busybox-ash, Schily Bourne sh, yash.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
f2e264e43f git-prompt: don't use shell arrays
Arrays only existed in the svn-upstream code, used to:
- Keep a list of svn remotes.
- Convert commit msg to array of words, extract the 2nd-to-last word.

Except bash/zsh, nearly all shells failed load on syntax errors here.

Now:
- The svn remotes are a list of newline-terminated values.
- The 2nd-to-last word is extracted using standard shell substrings.
- All shells can digest the svn-upstream code.

While using shell field splitting to extract the word is simple, and
doesn't even need non-standard code, e.g. set -- $(git log -1 ...),
it would have the same issues as the old array code: it depends on IFS
which we don't control, and it's subject to glob-expansion, e.g. if
the message happens to include * or **/* (as this commit message just
did), then the array could get huge. This was not great.

Now it uses standard shell substrings, and we know the exact delimiter
to expect, because it's the match from our grep just one line earlier.

The new word extraction code also fixes svn-upstream in zsh, because
previously it used arr[len-2], but because in zsh, unlike bash, array
subscripts are 1-based, it incorrectly extracted the 3rd-to-last word.
symptom: missing upstream status in a git-svn repo: u=, u+N-M, etc.

The breakage in zsh is surprising, because it was last touched by
  commit d0583da838 (prompt: fix show upstream with svn and zsh),
claiming to fix exactly that. However, it only mentions syntax fixes.
It's unclear if behavior was fixed too. But it was broken, now fixed.

Note LF=$'\n' and then using $LF instead of $'\n' few times.
A future commit will add fallback for shells without $'...', so this
would be the only line to touch instead of replacing every $'\n' .

Shells which could run the previous array code:
- bash

Shells which have arrays but were broken anyway:
- zsh: 1-based subscript
- ksh93: no "local" (the new code can't fix this part...)
- mksh, openbsd sh, pdksh: failed load on syntax error: "for ((...))".

More shells which Failed to load due to syntax error:
- dash, free/net bsd sh, busybox-ash, Schily Bourne shell, yash.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
6df4b09159 git-prompt: fix uninitialized variable
First use is in the form:  local var; ...; var=$var$whatever...

If the variable was unset (as bash and others do after "local x"),
then it would error if set -u is in effect.

Also, many shells inherit the existing value after "local var"
without init, but in this case it's unlikely to have a prior value.

Now we initialize it.

(local var= is enough, but local var="" is the custom in this file)

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:17 -07:00
f037e607a8 git-prompt: use here-doc instead of here-string
Here-documend is standard, and works in all shells.

Both here-string and here-doc add final newline, which is important
in this case, because $output is without final newline, but we do
want "read" to succeed on the last line as well.

Shells which support here-string:
- bash, zsh, mksh, ksh93, yash (non-posix-mode).

shells which don't, and got fixed:
- ash-derivatives (dash, free/net bsd sh, busybox-ash).
- pdksh, openbsd sh.
- All Schily Bourne shell variants.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:17 -07:00
9f39e2fa26 ci(win+VS): download the vcpkg artifacts using a dedicated GitHub Action
The Git for Windows project provides a GitHub Action to download and
cache Azure Pipelines artifacts (such as the `vcpkg` artifacts), hiding
gnarly internals, and also providing some robustness against network
glitches. Let's use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:24:28 -07:00
46cbfd3f7e ci: bump microsoft/setup-msbuild from v1 to v2
The main benefit: The new version uses a node.js version that is not yet
deprecated.

Links:
- [Release notes](https://github.com/microsoft/setup-msbuild/releases)
- [Changelog](https://github.com/microsoft/setup-msbuild/blob/main/building-release.md)
- [Commits](https://github.com/microsoft/setup-msbuild/compare/v1...v2)

This patch was originally by GitHub's Dependabot, but I cannot attribute
that bot properly because it has no dedicated email address. Probably
because it hasn't reached legal age yet, or something.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:24:27 -07:00
bb9c16bd4f The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-19 11:07:38 -07:00
ee218ee952 Merge branch 'ps/transport-leakfix-test-updates'
Test updates.

* ps/transport-leakfix-test-updates:
  transport: mark more tests leak-free
2024-08-19 11:07:38 -07:00
b9497848df Merge branch 'tb/incremental-midx-part-1'
Incremental updates of multi-pack index files.

* tb/incremental-midx-part-1:
  midx: implement support for writing incremental MIDX chains
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  midx: implement verification support for incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: add new fields for incremental MIDX chains
  Documentation: describe incremental MIDX format
2024-08-19 11:07:37 -07:00
53129a0680 Merge branch 'jc/tests-no-useless-tee'
Test fixes.

* jc/tests-no-useless-tee:
  tests: drop use of 'tee' that hides exit status
2024-08-19 11:07:37 -07:00
4dbca805e0 Merge branch 'rs/unit-tests-test-run'
Unit-test framework has learned a simple control structure to allow
embedding test statements in-line instead of having to create a new
function to contain them.

* rs/unit-tests-test-run:
  t-strvec: use if_test
  t-reftable-basics: use if_test
  t-ctype: use if_test
  unit-tests: add if_test
  unit-tests: show location of checks outside of tests
  t0080: use here-doc test body
2024-08-19 11:07:36 -07:00
759b453f9f t7900: fix flaky test due to leaking background job
One of the recently-added tests in t7900 exercises git-maintanance(1)
with the `--detach` flag, which causes it to perform maintenance in the
background. We do not wait for the backgrounded process to exit though,
which causes the process to leak outside of the test, leading to racy
behaviour.

Fix this by synchronizing with the process via a separate file
descriptor. This is the same workaround as we use in t6500, see the
function `run_and_wait_for_auto_gc ()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-19 09:36:23 -07:00
c038a6f1d7 send-email: teach git send-email option to translate aliases
git send-email has support for converting shorthand alias names to
canonical email addresses via the alias file. It supports a wide variety
of alias file formats based on popular email program file formats.

Other programs, such as b4, would like the ability to convert aliases in
the same way as git send-email without needing to re-implement the logic
for understanding the many file formats.

Teach git send-email a new option, --translate-aliases, which will
enable this functionality. Similar to --dump-aliases, this option works
like a new mode of operation for git send-email.

When run with --translate-aliases, git send-email reads from standard
input and converts any provided alias into its canonical name and email
according to the alias file. Each expanded name and address is printed
to standard output, one per line.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 10:03:06 -07:00
5e75e503c4 scalar: mark unused parameters in dummy function
We have a dummy load_builtin_commands() function to satisfy the linker,
but which we never expect to be called. Mark its parameters to avoid
complaints from -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:24 -07:00
0b1376d448 daemon: mark unused parameters in non-posix fallbacks
If NO_POSIX_GOODIES is set, we compile fallback versions of a few
functions. These don't do anything, so their parameters are unused, but
we must keep them to match the ones on the other side of the #ifdef.
Mark them to quiet -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:24 -07:00
e2ef77cf7c setup: mark unused parameter in config callback
This is logically a continuation of 783a86c142 (config: mark unused
callback parameters, 2022-08-19), but this case was introduced much
later in 4412a04fe6 (init.templateDir: consider this config setting
protected, 2024-03-29).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:23 -07:00
f288a57789 test-mergesort: mark unused parameters in trivial callback
The mode_copy() function does nothing, but since it's used as a function
pointer within "struct mode", it has to conform to the interface. Mark
it to quiet -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:23 -07:00
f24a9b78a9 t-hashmap: mark unused parameters in callback function
The t_intern() setup function doesn't operate on a hashmap, so it
ignores its parameters. But we can't drop them since it is passed as a
pointer to setup(), so we have to match the other setup functions. Mark
them to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:13 -07:00
4695c3f3a9 reftable: mark unused parameters in virtual functions
The reftable code uses a lot of virtual function pointers, but many of
the concrete implementations do not need all of the parameters.

For the most part these are obviously fine to just mark as UNUSED (e.g.,
the empty_iterator functions unsurprisingly do not do anything). Here
are a few cases where I dug a little deeper (but still ended up just
marking them UNUSED):

  - the iterator exclude_patterns is best-effort and optional (though it
    would be nice to support in the long run as an optimization)

  - ignoring the ref_store in many transaction functions is unexpected,
    but works because the ref_transaction itself carries enough
    information to do what we need.

  - ignoring "err" for in some cases (e.g., transaction abort) is OK
    because we do not return any errors. It is a little odd for
    reftable_be_create_reflog(), though, since we do return errors
    there. We should perhaps be creating string error messages at this
    layer, but I've punted on that for now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:12 -07:00
561666cc4c reftable: drop obsolete test function declarations
These functions were moved to the unit test framework in ba9661b457 (t:
move reftable/record_test.c to the unit testing framework, 2024-07-02)
and b34116a30c (t: move reftable/basics_test.c to the unit testing
framework, 2024-05-29). The declarations in reftable-tests.h are
leftover cruft.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:12 -07:00
a66fad2d28 reftable: ignore unused argc/argv in test functions
There are several reftable test "main" functions that don't look at
their argc/argv. They don't technically need to take these parameters,
as they are called individually by cmd__reftable(). But it probably
makes sense to keep them all consistent for now. In the long run these
will probably all get converted to the unit-test framework anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:12 -07:00
df9d638c24 unit-tests: ignore unused argc/argv
All of the unit test programs have their own cmd_main() function, but
none of them actually look at the argc/argv that is passed in.

In the long run we may want them to handle options for the test harness.
But we'd probably do that with a shared harness cmd_main(), dispatching
to the individual tests. In the meantime, let's annotate the unused
parameters to avoid triggering -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:12 -07:00
7046c85cce t/helper: mark more unused argv/argc arguments
This is a continuation of 126e3b3d2a (t/helper: mark unused argv/argc
arguments, 2023-03-28) to cover a few new cases:

 - test-example-tap was added since that commit

 - test-hashmap used to accept the "ignorecase" argument on the command
   line. But since most of its logic was moved to a unit-test in
   3469a23659 (t: port helper/test-hashmap.c to unit-tests/t-hashmap.c,
   2024-08-03), it now ignores its argv entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:11 -07:00
4350676cdd oss-fuzz: mark unused argv/argc argument
The dummy fuzz cmd_main() does not look at its argc/argv parameters
(since it should never even be run), but has to match the usual
cmd_main() declaration.

Mark them to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:11 -07:00
bdc71b43ee refs: mark unused parameters in do_for_each_reflog_helper()
This is an each_ref_fn callback, so it has to match that interface. We
marked most of these in 63e14ee2d6 (refs: mark unused each_ref_fn
parameters, 2022-08-19), but in this case:

  - this function was created in 31f898397b (refs: drop unused params
    from the reflog iterator callback, 2024-02-21), and most of the
    arguments were correctly mark as UNUSED, but "flags" was missed.

  - commit e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09)
    added a new argument to the each_ref_fn callback. In most callbacks
    it added an UNUSED annotation, but it missed one case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:11 -07:00
d1aa0fcd45 refs: mark unused parameters in ref_store fsck callbacks
Commit ab6f79d8df (refs: set up ref consistency check infrastructure,
2024-08-08) added virtual functions to the ref store for doing fsck
checks. But the packed and reftable backends do not yet do anything.

Let's annotate them to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:11 -07:00
9dc1e748ef update-ref: mark more unused parameters in parser callbacks
This is a continuation of 44ad082968 (update-ref: mark unused parameter
in parser callbacks, 2023-08-29), as we've grown a few more virtual
functions since then.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:10 -07:00
4647f24302 imap-send: mark unused parameter in ssl_socket_connect() fallback
Commit cea1ff7f1f (imap-send: drop global `imap_server_conf` variable,
2024-06-07) added an imap_server_conf parameter to several functions.
But when compiled with NO_OPENSSL, the ssl_socket_connect() fallback
just returns immediately, so its parameters all need to be annotated to
avoid triggering -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:10 -07:00
72c9793c15 diff-lib: drop unused index argument from get_stat_data()
The "struct index_state" parameter passed to get_stat_data() has been
unused since we stopped passing it to check_removed() in 6a044a2048
(diff-lib: fix check_removed when fsmonitor is on, 2023-09-11). We can
just drop it, which in turns lets us simplify our callers a bit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:44:41 -07:00
4d7de2cf6e ref-filter: drop unused parameters from email_atom_option_parser()
This code was extracted from person_email_atom_parser() in a3d2e83a17
(ref-filter: add mailmap support, 2023-09-25), but the part that was
extracted doesn't care about the atom struct or the error strbuf.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:44:41 -07:00
4756494504 pack-bitmap: drop unused parameters from select_pseudo_merges()
We take the array of indexed_commits (and its length), but there's no
need. The selection is based on ref reachability, not the linearized set
of commits we're packing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:44:41 -07:00
ecc6fa9ae9 pack-bitmap: load writer config from repository parameter
In bitmap_writer_init(), we take a repository parameter but ever look at
it. Most of the initialization here is independent of the repository,
but we do load some config. So let's pass the repo we get down to
load_pseudo_merges_from_config(), which in turn can use repo_config(),
rather than depending on the_repository via git_config().

The outcome is the same, since all callers pass in the_repository
anyway. But it takes us a step closer to getting rid of the global, and
as a bonus it silences an unused parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:44:40 -07:00
65e7a4478c refs: drop some unused parameters from create_symref_lock()
This function was factored out in 57d0b1e2ea (files-backend: extract out
`create_symref_lock()`, 2024-05-07), but we never look at the ref_store
or refname parameters. We just need the path, which is already contained
in the lockfile struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:44:40 -07:00
b9849e4f76 Sync with 'maint'
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 12:57:37 -07:00
fa3b914457 Prepare for 2.46.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 12:52:53 -07:00
b3d175409d Merge branch 'sj/ref-fsck'
"git fsck" infrastructure has been taught to also check the sanity
of the ref database, in addition to the object database.

* sj/ref-fsck:
  fsck: add ref name check for files backend
  files-backend: add unified interface for refs scanning
  builtin/refs: add verify subcommand
  refs: set up ref consistency check infrastructure
  fsck: add refs report function
  fsck: add a unified interface for reporting fsck messages
  fsck: make "fsck_error" callback generic
  fsck: rename objects-related fsck error functions
  fsck: rename "skiplist" to "skip_oids"
2024-08-16 12:51:51 -07:00
d07bb0cd2a Merge branch 'ps/p4-tests-updates' into maint-2.46
Perforce tests have been updated.
cf. <na5mwletzpnacietbc7pzqcgb622mvrwgrkjgjosysz3gvjcso@gzxxi7d7icr7>

* ps/p4-tests-updates:
  t98xx: mark Perforce tests as memory-leak free
  ci: update Perforce version to r23.2
  t98xx: fix Perforce tests with p4d r23 and newer
2024-08-16 12:50:56 -07:00
e6698fbfa9 Merge branch 'ks/unit-test-comment-typofix' into maint-2.46
Typofix.

* ks/unit-test-comment-typofix:
  unit-tests/test-lib: fix typo in check_pointer_eq() description
2024-08-16 12:50:56 -07:00
2ad2f2f751 Merge branch 'dh/encoding-trace-optim' into maint-2.46
An expensive operation to prepare tracing was done in re-encoding
code path even when the tracing was not requested, which has been
corrected.

* dh/encoding-trace-optim:
  convert: return early when not tracing
2024-08-16 12:50:55 -07:00
c09721cb63 Merge branch 'dd/notes-empty-no-edit-by-default' into maint-2.46
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.

* dd/notes-empty-no-edit-by-default:
  notes: do not trigger editor when adding an empty note
2024-08-16 12:50:55 -07:00
9dd837e64f Merge branch 'jc/doc-rebase-fuzz-vs-offset-fix' into maint-2.46
"git rebase --help" referred to "offset" (the difference between
the location a change was taken from and the change gets replaced)
incorrectly and called it "fuzz", which has been corrected.

* jc/doc-rebase-fuzz-vs-offset-fix:
  doc: difference in location to apply is "offset", not "fuzz"
2024-08-16 12:50:55 -07:00
b74d885b11 Merge branch 'tn/doc-commit-fix' into maint-2.46
Docfix.

* tn/doc-commit-fix:
  doc: remove dangling closing parenthesis
2024-08-16 12:50:54 -07:00
72a50fa03b Merge branch 'pw/add-patch-with-suppress-blank-empty' into maint-2.46
"git add -p" by users with diff.suppressBlankEmpty set to true
failed to parse the patch that represents an unmodified empty line
with an empty line (not a line with a single space on it), which
has been corrected.

* pw/add-patch-with-suppress-blank-empty:
  add-patch: use normalize_marker() when recounting edited hunk
  add-patch: handle splitting hunks with diff.suppressBlankEmpty
2024-08-16 12:50:54 -07:00
fca5ece278 Merge branch 'jt/doc-post-receive-hook-update' into maint-2.46
Doc update.

* jt/doc-post-receive-hook-update:
  doc: clarify post-receive hook behavior
2024-08-16 12:50:53 -07:00
8ad56325e9 Merge branch 'jc/how-to-maintain-updates' (early part) into maint-2.46
* 'jc/how-to-maintain-updates' (early part):
  howto-maintain: update daily tasks
  howto-maintain: cover a whole development cycle
2024-08-16 12:50:52 -07:00
cb9c47ca2b Merge branch 'jc/doc-one-shot-export-with-shell-func' into maint-2.46
It has been documented that we avoid "VAR=VAL shell_func" and why.

* jc/doc-one-shot-export-with-shell-func:
  CodingGuidelines: document a shell that "fails" "VAR=VAL shell_func"
2024-08-16 12:50:52 -07:00
bb250b5378 Merge branch 'jc/checkout-no-op-switch-errors' into maint-2.46
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users.  It now says that the
user needs to give pathspec to specify what paths to checkout.

* jc/checkout-no-op-switch-errors:
  checkout: special case error messages during noop switching
2024-08-16 12:50:51 -07:00
d2511eeae5 setup: make ref storage format configurable via config
Similar to the preceding commit, introduce a new "init.defaultRefFormat"
config that allows the user to globally set the ref storage format used
by newly created repositories.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:22 -07:00
0c22e09b73 setup: make object format configurable via config
The object format for repositories can either be configured explicitly
by passing the `--object-format=` option to git-init(1) or git-clone(1),
or globally by setting the `GIT_DEFAULT_HASH` environment variable.
While the former makes sense, setting random environment variables is
not really a good user experience in case someone decides to only use
SHA256 repositories.

It is only natural to expect for a user that things like this can also
be configured via their config. As such, introduce a new config
"init.defaultObjectFormat", similar to "init.defaultBranch", that allows
the user to configure the default object format when creating new repos.

The precedence order now is the following, where the first one wins:

  1. The `--object-format=` switch.

  2. The `GIT_DEFAULT_HASH` environment variable.

  3. The `init.defaultObjectFormat` config variable.

This matches the typical precedence order we use in Git. We typically
let the environment override the config such that the latter can easily
be overridden on an ephemeral basis, for example by scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:21 -07:00
39e15b789a setup: merge configuration of repository formats
The configuration of repository formats is split up across two functions
`validate_hash_algorithm()` and `validate_ref_storage_format()`. This is
fine as-is, but we are about to extend the logic to also read default
values from the config. With the logic split across two functions, we
would either have to pass in additional parameters read from the config,
or read the config multiple times. Both of these options feel a bit
unwieldy.

Merge the code into a new function `repository_format_configure()` that
is responsible for configuring the whole repository's format. Like this,
we can easily read the config in a single place, only.

Furthermore, move the calls to `repo_set_ref_storage_format()` and
`repo_set_hash_algo()` into this new function as well, such that all the
logic to configure the repository format is self-contained here.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:21 -07:00
7689f6cbd1 t0001: delete repositories when object format tests finish
The object format tests create one-shot repositories that are only used
by the respective test, but never delete them. This makes it hard to
pick a proper repository name in subsequent tests, as more and more
names are taken already.

Delete these repositories via `test_when_finished`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:21 -07:00
05d20915bc t0001: exercise initialization with ref formats more thoroughly
While our object format tests for git-init(1) exercise tests with all
known formats in t0001, the tests for the ref format don't. This leads
to some missing test coverage for interesting cases, like whether or not
a non-default ref storage format causes us to bump the repository format
version. We also don't test for the precedence of the `--ref-format=`
and the `GIT_DEFAULT_REF_FORMAT=` environment variable.

Extend the test suite to cover more scenarios related to the ref format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:21 -07:00
e3209bd4df builtin/stash: fix --keep-index --include-untracked with empty HEAD
It was reported that creating a stash with `--keep-index
--include-untracked` causes an error when HEAD points to a commit whose
tree is empty:

    $ git stash push --keep-index --include-untracked
    error: pathspec ':/' did not match any file(s) known to git

This error comes from `git checkout --no-overlay $i_tree -- :/`, which
we execute to reset the working tree to the state in our index. As the
tree generated from the index is empty in our case, ':/' does not match
any files and thus causes git-checkout(1) to error out.

Fix the issue by skipping the checkout when the index tree is empty. As
explained in the in-code comment, this should be the correct thing to do
as there is nothing that we'd have to reset in the first place.

Reported-by: Piotr Siupa <piotrsiupa@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:50:33 -07:00
98077d06b2 run-command: fix detaching when running auto maintenance
In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.

Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.

The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself, but instead it continues to be git-gc(1) which does.

When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each others feet.

Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, git-maintenance(1) now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background when running as part of our auto
maintenance. This should continue to behave the same for all users which
use the git-gc(1) task, only. For others though, it means that we now
properly perform all tasks in the background. The default behaviour of
git-maintenance(1) when executed by the user does not change, it will
remain in the foreground unless they pass the `--detach` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
a6affd3343 builtin/maintenance: add a --detach flag
Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
c7185df01b builtin/gc: add a --detach flag
When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.

Add a `--[no-]detach` flag that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
9b6b994f90 builtin/gc: stop processing log file on signal
When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.

This logic is severly broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.

The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.

The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
0ce44e2293 builtin/gc: fix leaking config values
We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.

Note that there is one small gotcha here with the "--prune" option. Next
to passing a string, this option also accepts the "--no-prune" option
that overrides the default or configured value. We thus need to discern
between the option not having been passed by the user and the negative
variant of it. This is done by using a simple sentinel value that lets
us discern these cases.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
d1ae15d68b builtin/gc: refactor to read config into structure
The git-gc(1) command knows to read a bunch of config keys to tweak its
own behaviour. The values are parsed into global variables, which makes
it hard to correctly manage the lifecycle of values that may require a
memory allocation.

Refactor the code to use a `struct gc_config` that gets populated and
passed around. For one, this makes previously-implicit dependencies on
these config values clear. Second, it will allow us to properly manage
the lifecycle in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
a70a9bf6ee config: fix constness of out parameter for git_config_get_expiry()
The type of the out parameter of `git_config_get_expiry()` is a pointer
to a constant string, which creates the impression that ownership of the
returned data wasn't transferred to the caller. This isn't true though
and thus quite misleading.

Adapt the parameter to be of type `char **` and adjust callers
accordingly. While at it, refactor `get_shared_index_expire_date()` to
drop the static `shared_index_expire` variable. It is only used in that
function, and furthermore we would only hit the code where we parse the
expiry date a single time because we already use a static `prepared`
variable to track whether we did parse it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:24 -07:00
87a1768b93 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 13:22:16 -07:00
0da7673a51 Merge branch 'xx/diff-tree-remerge-diff-fix'
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.

* xx/diff-tree-remerge-diff-fix:
  diff-tree: fix crash when used with --remerge-diff
2024-08-15 13:22:16 -07:00
e7f86cb69d Merge branch 'jc/refs-symref-referent'
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.

* jc/refs-symref-referent:
  ref-filter: populate symref from iterator
  refs: add referent to each_ref_fn
  refs: keep track of unresolved reference value in iterators
2024-08-15 13:22:15 -07:00
88457a6151 Merge branch 'ps/submodule-ref-format'
Support to specify ref backend for submodules has been enhanced.

* ps/submodule-ref-format:
  object: fix leaking packfiles when closing object store
  submodule: fix leaking seen submodule names
  submodule: fix leaking fetch tasks
  builtin/submodule: allow "add" to use different ref storage format
  refs: fix ref storage format for submodule ref stores
  builtin/clone: propagate ref storage format to submodules
  builtin/submodule: allow cloning with different ref storage format
  git-submodule.sh: break overly long command lines
2024-08-15 13:22:14 -07:00
6891103f72 Merge branch 'ag/t7004-modernize'
Coding style fixes to a test script.

* ag/t7004-modernize:
  t7004: make use of write_script
  t7004: use single quotes instead of double quotes
  t7004: begin the test body on the same line as test_expect_success
  t7004: description on the same line as test_expect_success
  t7004: do not prepare things outside test_expect_success
  t7004: use indented here-doc
  t7004: one command per line
  t7004: remove space after redirect operators
2024-08-15 13:22:14 -07:00
69b737999c Merge branch 'ps/reftable-stack-compaction'
The code paths to compact multiple reftable files have been updated
to correctly deal with multiple compaction triggering at the same
time.

* ps/reftable-stack-compaction:
  reftable/stack: handle locked tables during auto-compaction
  reftable/stack: fix corruption on concurrent compaction
  reftable/stack: use lock_file when adding table to "tables.list"
  reftable/stack: do not die when fsyncing lock file files
  reftable/stack: simplify tracking of table locks
  reftable/stack: update stats on failed full compaction
  reftable/stack: test compaction with already-locked tables
  reftable/stack: extract function to setup stack with N tables
  reftable/stack: refactor function to gather table sizes
2024-08-15 13:22:13 -07:00
2b9b229cb4 Merge branch 'es/doc-platform-support-policy'
A policy document that describes platform support levels and
expectation on platform stakeholders has been introduced.

* es/doc-platform-support-policy:
  Documentation: add platform support policy
2024-08-15 13:22:13 -07:00
a3d71f2076 Merge branch 'gt/unit-test-hashmap'
An existing test of hashmap API has been rewritten with the
unit-test framework.

* gt/unit-test-hashmap:
  t: port helper/test-hashmap.c to unit-tests/t-hashmap.c
2024-08-15 13:22:12 -07:00
f6df5e2d05 Merge branch 'jc/t3206-test-when-finished-fix'
Test clean-up.

* jc/t3206-test-when-finished-fix:
  t3206: test_when_finished before dirtying operations, not after
2024-08-15 13:22:12 -07:00
402f36f33e Merge branch 'rs/t-example-simplify'
Unit test simplification.

* rs/t-example-simplify:
  t-example-decorate: remove test messages
2024-08-15 13:22:11 -07:00
0ed3dde067 Merge branch 'jc/safe-directory'
Follow-up on 2.45.1 regression fix.

* jc/safe-directory:
  safe.directory: setting safe.directory="." allows the "current" directory
  safe.directory: normalize the configured path
  safe.directory: normalize the checked path
  safe.directory: preliminary clean-up
2024-08-15 13:22:11 -07:00
a72dfab8b8 pseudo-merge.c: ensure pseudo-merge groups are closed
When generating pseudo-merge bitmaps, it's possible that concurrent
reference updates may reveal some pseudo-merge candidates which reach
objects that are not contained in the bitmap's pack or pseudo-pack
order (in the case of MIDX bitmaps).

The latter case is relatively easy to demonstrate: if we generate a MIDX
bitmap with only half of the repository packed, then the unpacked
contents are not part of the MIDX's object order.

If we happen to select one or more commit(s) from the unpacked portion
of the repository for inclusion in a pseudo-merge, we'll get the
following message when trying to generate its bitmap:

    $ git multi-pack-index write --bitmap
    [...]
    Selecting pseudo-merge commits: 100% (1/1), done.
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object ... is missing)
    Building bitmaps:  50% (1/2), done.
    error: could not write multi-pack bitmap

, and the attempted bitmap write will fail, leaving the repository
without a current bitmap.

Rectify this by ensuring that the commits which are pseudo-merge
candidates can only be so if they appear somewhere in the packing order.

This is sufficient, since we know that the original packing order is
closed under reachability, so if a commit appears in that list as a
potential pseudo-merge candidate, we know that everything reachable from
it also appears in the list (and thus the candidate is a good one).

Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:32:28 -07:00
25b78668de pseudo-merge.c: do not generate empty pseudo-merge commits
The previous commit demonstrated it is possible to generate empty
pseudo-merge commits, which is not useful as such pseudo-merges carry no
information.

Ensure that we only generate non-empty groups by not pushing a new
commit onto the bitmap_writer when that commit has no parents.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:29:15 -07:00
42f80e361c t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
Demonstrate that it is possible to generate empty pseudo-merge commits
in certain cases.

In the below instance, we generate one non-empty pseudo-merge
(containing commit "base"), and one empty pseudo-merge group
(corresponding to the unstable commits within that group).

(In my testing, the pseudo-merge machinery seems to handle empty groups
just fine, but generating them is pointless as they carry no
information.)

This commit (introducing a deliberate "test_expect_failure") is split
out from the actual fix (which will appear in the following commit) to
demonstrate that the failure is correctly induced.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:26:35 -07:00
187504f9b2 pack-bitmap-write.c: select pseudo-merges even for small bitmaps
Ordinarily, the pack-bitmap machinery will select some subset of
reachable commits to receive bitmaps. But when there are fewer than 100
commits indexed in the first place, they will all receive bitmaps as a
special case.

When this happens, pseudo-merges are not generated, making it impossible
to test pseudo-merge corner cases with fewer than 100 commits.

Select pseudo-merges even for bitmaps with fewer than 100 commits to
make such testing easier. In practice, this should not make a difference
to non-testing bitmaps, as they are unlikely to be used when a
repository has so few commits to begin with.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:25:02 -07:00
11a08e8332 pack-bitmap: drop redundant args from bitmap_writer_finish()
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_finish()` function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:23:15 -07:00
f00dda4849 pack-bitmap: drop redundant args from bitmap_writer_build()
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_build()` function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:22:27 -07:00
125ee4ae80 pack-bitmap: drop redundant args from bitmap_writer_build_type_index()
The previous commit ensures that the bitmap_writer's "to_pack" field is
initialized early on, so the "to_pack" and "index_nr" arguments to
`bitmap_writer_build_type_index()` are redundant.

Drop them and adjust the callers accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:20:24 -07:00
01e9d12939 pack-bitmap: initialize bitmap_writer_init() with packing_data
In order to determine its object order, the pack-bitmap machinery keeps
a 'struct packing_data' corresponding to the pack or pseudo-pack (when
writing a MIDX bitmap) being written.

The to_pack field is provided to the bitmap machinery by callers of
bitmap_writer_build() and assigned to the bitmap_writer struct at that
point.

But a subsequent commit will want to have access to that data earlier on
during commit selection. Prepare for that by adding a 'to_pack' argument
to 'bitmap_writer_init()', and initializing the field during that
function.

Subsequent commits will clean up other functions which take
now-redundant arguments (like nr_objects, which is equivalent to
pdata->objects_nr, or pdata itself).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:18:04 -07:00
49e5cc5b26 t4129: fix racy index when calling chmod after git-add
This patch fixes a racy test failure in t4129.

The deletion test added by e95d515141 (apply: canonicalize modes read
from patches, 2024-08-05) wants to make sure that git-apply does not
complain about a non-canonical mode in the patch, even if that mode does
not match the working tree file. So it does this:

	echo content >non-canon &&
	git add non-canon &&
	chmod 666 non-canon &&

This is wrong, because running chmod will update the ctime on the file,
making it stat-dirty and causing git-apply to refuse to apply the patch.
But this only happens sometimes, since it depends on the timestamps
crossing a second boundary (but it triggers pretty quickly when run with
--stress).

We can fix this by doing the chmod before updating the index. The order
isn't important here, as the mode will be canonicalized to 100644 in the
index anyway (in fact, the chmod is not even that important in the first
place, since git-apply will only look at the index; I only added it as
an extra confirmation that git-apply would not be confused by it).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 09:41:11 -07:00
9b4df82634 Merge branch 'ps/reftable-stack-compaction' into ps/reftable-drop-generic
* ps/reftable-stack-compaction:
  reftable/stack: handle locked tables during auto-compaction
  reftable/stack: fix corruption on concurrent compaction
  reftable/stack: use lock_file when adding table to "tables.list"
  reftable/stack: do not die when fsyncing lock file files
  reftable/stack: simplify tracking of table locks
  reftable/stack: update stats on failed full compaction
  reftable/stack: test compaction with already-locked tables
  reftable/stack: extract function to setup stack with N tables
  reftable/stack: refactor function to gather table sizes
2024-08-15 08:22:03 -07:00
90934966bb git-gui: strip commit messages less aggressively
We would strip all leading and trailing whitespace, which git commit
does not. Let's be consistent here.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-08-15 16:10:23 +02:00
1ae85ff6d4 git-gui: strip comments and consecutive empty lines from commit messages
This is also known as "washing". This is consistent with the behavior of
interactive git commit, which we should emulate as closely as possible
to avoid usability problems. This way commit message templates and
prepare hooks can be used properly, and comments from conflicted rebases
and merges are cleaned up without having to introduce special handling
for them.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-08-15 16:10:23 +02:00
983555a1f2 howto-maintain: mention preformatted docs
Forgot to mention that the preformatted documentation repositories
are updated every time the master branch of the project advances.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 16:04:18 -07:00
be9bd463f1 git-svn: mention svn:global-ignores in help+docs
Git-SVN was previously taught to use the svn:global-ignores property as
well as svn:ignore when creating or showing .gitignore files from a
Subversion repository. However, the documentation and help message still
only mentioned svn:ignore. Update Git-SVN's documentation and help
command to mention support for the new property. Also capitalize the
help message for the 'mkdirs' command, for consistency.

Signed-off-by: Alex Galvin <agalvin@comqi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 15:10:24 -07:00
477ce5ccd6 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 14:54:58 -07:00
d639123742 Merge branch 'tb/t7704-deflake'
A test that fails on an unusually slow machine was found, and made
less likely to cause trouble by lengthening the expiry value it
uses.

* tb/t7704-deflake:
  t/t7704-repack-cruft.sh: avoid failures during long-running tests
2024-08-14 14:54:58 -07:00
dd59778f76 Merge branch 'jc/document-use-of-local'
Doc update.

* jc/document-use-of-local:
  doc: note that AT&T ksh does not work with our test suite
2024-08-14 14:54:58 -07:00
4a443f00c4 Merge branch 'rs/use-decimal-width'
Code clean-up.

* rs/use-decimal-width:
  log-tree: use decimal_width()
2024-08-14 14:54:57 -07:00
81903d0472 Merge branch 'ss/packed-ref-store-leakfix'
Leakfix.

* ss/packed-ref-store-leakfix:
  refs/files: prevent memory leak by freeing packed_ref_store
2024-08-14 14:54:57 -07:00
7b11e20bff Merge branch 'cp/unit-test-reftable-tree'
A test in reftable library has been rewritten using the unit test
framework.

* cp/unit-test-reftable-tree:
  t-reftable-tree: improve the test for infix_walk()
  t-reftable-tree: add test for non-existent key
  t-reftable-tree: split test_tree() into two sub-test functions
  t: move reftable/tree_test.c to the unit testing framework
  reftable: remove unnecessary curly braces in reftable/tree.c
2024-08-14 14:54:56 -07:00
61fd5de05f Merge branch 'kl/test-fixes'
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-14 14:54:55 -07:00
494c9788e4 Merge branch 'jc/reflog-expire-lookup-commit-fix'
"git reflog expire" failed to honor annotated tags when computing
reachable commits.

* jc/reflog-expire-lookup-commit-fix:
  Revert "reflog expire: don't use lookup_commit_reference_gently()"
2024-08-14 14:54:55 -07:00
7a95eceb6b Merge branch 'jr/ls-files-expand-literal-doc'
Docfix.

* jr/ls-files-expand-literal-doc:
  doc: fix hex code escapes in git-ls-files
2024-08-14 14:54:54 -07:00
c147b41f4c Merge branch 'jc/leakfix-mailmap'
Leakfix.

* jc/leakfix-mailmap:
  mailmap: plug memory leak in read_mailmap_blob()
2024-08-14 14:54:54 -07:00
dfaa04f3c6 Merge branch 'jc/leakfix-hashfile'
Leakfix.

* jc/leakfix-hashfile:
  csum-file: introduce discard_hashfile()
2024-08-14 14:54:53 -07:00
44773b9f70 Merge branch 'jc/patch-id'
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.

* jc/patch-id:
  patch-id: tighten code to detect the patch header
  patch-id: rewrite code that detects the beginning of a patch
  patch-id: make get_one_patchid() more extensible
  patch-id: call flush_current_id() only when needed
  t4204: patch-id supports various input format
2024-08-14 14:54:53 -07:00
c7ca437d9f Merge branch 'ps/refs-wo-the-repository'
In the refs subsystem, implicit reliance of the_repository has been
eliminated; the repository associated with the ref store object is
used instead.

* ps/refs-wo-the-repository:
  refs/reftable: stop using `the_repository`
  refs/packed: stop using `the_repository`
  refs/files: stop using `the_repository`
  refs/files: stop using `the_repository` in `parse_loose_ref_contents()`
  refs: stop using `the_repository`
2024-08-14 14:54:52 -07:00
5a74eb07ca Merge branch 'jc/jl-git-no-advice-fix'
Remove leftover debugging cruft from a test script.

* jc/jl-git-no-advice-fix:
  t0018: remove leftover debugging cruft
2024-08-14 14:54:51 -07:00
4cf2f1be56 Merge branch 'tb/config-fixed-value-with-valueless-true'
"git config --value=foo --fixed-value section.key newvalue" barfed
when the existing value in the configuration file used the
valueless true syntax, which has been corrected.

* tb/config-fixed-value-with-valueless-true:
  config.c: avoid segfault with --fixed-value and valueless config
2024-08-14 14:54:51 -07:00
0b2c4bc3ff Merge branch 'jk/apply-patch-mode-check-fix'
The patch parser in 'git apply' has been a bit more lenient against
unexpected mode bits, like 100664, recorded on extended header lines.

* jk/apply-patch-mode-check-fix:
  apply: canonicalize modes read from patches
2024-08-14 14:54:50 -07:00
505312a83f Merge branch 'ps/ref-api-cleanup'
Code clean-up.

* ps/ref-api-cleanup:
  refs: drop `ref_store`-less functions
2024-08-14 14:54:50 -07:00
760348212b Merge branch 'ps/ls-remote-out-of-repo-fix'
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.

* ps/ls-remote-out-of-repo-fix:
  builtin/ls-remote: fall back to SHA1 outside of a repo
2024-08-14 14:54:49 -07:00
ecbed3ff45 Merge branch 'jc/transport-leakfix'
Leakfix.

* jc/transport-leakfix:
  transport: fix leak with transport helper URLs
2024-08-14 14:54:49 -07:00
4bad0119f2 Merge branch 'rh/http-proxy-path'
The value of http.proxy can have "path" at the end for a socks
proxy that listens to a unix-domain socket, but we started to
discard it when we taught proxy auth code path to use the
credential helpers, which has been corrected.

* rh/http-proxy-path:
  http: do not ignore proxy path
2024-08-14 14:54:49 -07:00
d65332f241 Merge branch 'cp/unit-test-reftable-pq'
The tests for "pq" part of reftable library got rewritten to use
the unit test framework.

* cp/unit-test-reftable-pq:
  t-reftable-pq: add tests for merged_iter_pqueue_top()
  t-reftable-pq: add test for index based comparison
  t-reftable-pq: make merged_iter_pqueue_check() callable by reference
  t-reftable-pq: make merged_iter_pqueue_check() static
  t: move reftable/pq_test.c to the unit testing framework
  reftable: change the type of array indices to 'size_t' in reftable/pq.c
  reftable: remove unnecessary curly braces in reftable/pq.c
2024-08-14 14:54:48 -07:00
6c3c451fb6 Merge branch 'jk/osxkeychain-username-is-nul-terminated'
The credential helper to talk to OSX keychain sometimes sent
garbage bytes after the username, which has been corrected.

* jk/osxkeychain-username-is-nul-terminated:
  credential/osxkeychain: respect NUL terminator in username
2024-08-14 14:54:48 -07:00
4385f8a52d Merge branch 'ps/leakfixes-part-3'
More leakfixes.

* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-14 14:54:47 -07:00
bbc04b0094 t9001-send-email.sh: update alias list used for pine test
The set of aliases used for the pine --dump-aliases test do not
perfectly mesh with the way the pine address book is defined. While
technically all valid, there are some oddities including bob's name
being partially split so that the actual address is returned as
"Bobbyton <bob@example.com>". A strict reading of the pine documentation
indicates that the address should either be of the form
"address@domain" or a comma separated list of address, name/address
pairs, or other aliases enclosed by ().

The parsing implementation in git-send-email is not as strict, but it
makes sense to ensure the test data used is. Although the --dump-aliases
test does not make use of the address data, it is helpful to avoid
giving future developers the wrong impression of the file format.

Also add an alias which translates to multiple addresses using the ()
format.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 12:13:00 -07:00
4b707a6e99 p1500: add is-base performance tests
The previous two changes introduced a commit walking heuristic for finding
the most likely base branch for a given source. This algorithm walks
first-parent histories until reaching a collision.

This walk _should_ be very fast. Exceptions include cases where a
commit-graph file does not exist, leading to a full walk of all reachable
commits to compute generation numbers, or a case where no collision in the
first-parent history exists, leading to a walk of all first-parent history
to the root commits.

The p1500 test script guarantees a complete commit-graph file during its
setup, so we will not test that scenario. Do create a new root commit in an
effort to test the scenario of parallel first-parent histories.

Even with the extra root commit, these tests take no longer than 0.02
seconds on my machine for the Git repository. However, the results are
slightly more interesting in a copy of the Linux kernel repository:

Test
---------------------------------------------------------------
1500.2: ahead-behind counts: git for-each-ref              0.12
1500.3: ahead-behind counts: git branch                    0.12
1500.4: ahead-behind counts: git tag                       0.12
1500.5: contains: git for-each-ref --merged                0.04
1500.6: contains: git branch --merged                      0.04
1500.7: contains: git tag --merged                         0.04
1500.8: is-base check: test-tool reach (refs)              0.03
1500.9: is-base check: test-tool reach (tags)              0.03
1500.10: is-base check: git for-each-ref                   0.03
1500.11: is-base check: git for-each-ref (disjoint-base)   0.07

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:06 -07:00
9c1732ca11 for-each-ref: add 'is-base' token
The previous change introduced the get_branch_base_for_tip() method in
commit-reach.c. The motivation of that change was about using a heuristic to
deteremine the base branch for a source commit from a list of candidate
commit tips. This change makes that algorithm visible to users via a new
atom in the 'git for-each-ref' format. This change is very similar to the
chang in 49abcd21da (for-each-ref: add ahead-behind format atom,
2023-03-20).

Introduce the 'is-base:<source>' atom, which will indicate that the
algorithm should be computed and the result of the algorithm is reported
using an indicator of the form '(<source>)'. For example, using
'%(is-base:HEAD)' would result in one line having the token '(HEAD)'.

Use the sorted order of refs included in the ref filter to break ties in the
algorithm's heuristic. In the previous change, the motivating examples
include using an L0 trunk, long-lived L1 branches, and temporary release
branches. A caller could communicate the ordered preference among these
categories using the input refpecs and avoiding a different sort mechanism.
This sorting behavior is tested in the test scripts.

It is important to include this atom as a special case to
can_do_iterative_format() to match the expectations created in bd98f9774e
(ref-filter.c: filter & format refs in the same callback, 2023-11-14). The
ahead-behind atom was one of the special cases, and this similarly requires
using an algorithm across all input refs before starting the format of any
single ref.

In the test script, the format tokens use colons or lack whitespace to avoid
Git complaining about trailing whitespace errors.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:06 -07:00
69020d034b commit: add gentle reference lookup method
The lookup_commit_reference_by_name() method uses lookup_commit_reference()
without an option to use lookup_commit_reference_gently(). Create a gentle
version of the method so it can be used in locations where non-commits may
be found but error messages should be silenced.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:05 -07:00
e32eaf73b0 commit-reach: add get_branch_base_for_tip
Add a new reachability algorithm that intends to discover (from a heuristic)
which branch was used as the starting point for a given commit. Add focused
tests using the 'test-tool reach' command.

In repositories that use pull requests (or merge requests) to advance one or
more "protected" branches, the history of that reference can be recovered by
following the first-parent history in most cases. Most are completed using
no-fast-forward merges, though squash merges are quite common. Less common
is rebase-and-merge, which still validates this assumption. Finally, the
case that breaks this assumption is the fast-forward update (with potential
rebasing).  Even in this case, the previous commit commonly appears in the
first-parent history of the branch.

Similar assumptions can be made for a topic branch created by a single user
with the intention to merge back into another branch. Using 'git commit',
'git merge', and 'git cherry-pick' from HEAD will default to having the
first-parent commit be the previous commit at HEAD. This history changes
only with commands such as 'git reset' or 'git rebase', where the command
names also imply that the branch is starting from a new location.

With this movement of branches in mind, the following heuristic is proposed
as a way to determine the base branch for a given source branch:

  Among a list of candidate base branches, select the candidate that
  minimizes the number of commits in the first-parent history of the source
  that are not in the first-parent history of the candidate.

Prior third-party solutions to this problem have used this optimization
criteria, but have relied upon extracting the first-parent history and
comparing those lists as tables instead of using commit-graph walks.

Given current command-line interface options, this optimization criteria is
not easy to detect directly. Even using the command

  git rev-list --count --first-parent <base>..<source>

does not measure this count, as it uses full reachability from <base> to
determine which commits to remove from the range '<base>..<source>'. This
may lead to one asking if we should instead be using the full reachability
of the candidate and only the first-parent history of the source. This,
unfortunately, does not work for repositories that use long-lived branches
and automation to merge across those branches.

In extremely large repositories, merging into a single trunk may not be
feasible.  This is usually due to the desired frequency of updates
(thousands of engineers doing daily work) combined with the time required to
perform a validation build.  These factors combine to create significant
risk of semantic merge conflicts, leading to build breaks on the trunk. In
response, repository maintainers can create a single Level Zero (L0) trunk
and multiple Level One (L1) branches. By partitioning the engineers by
organization, these engineers may see lower risk of semantic merge conflicts
as well as be protected against build breaks in other L1 branches. The key
to making this system work is a semi-automated process of merging L1
branches into the L0 trunk and vice-versa.  In a large enough organization,
these L1 branches may further split into L2 or L3 branches, but the same
principles apply for merging across deeper levels.

If these automated merges use a typical merge with the second parent
bringing in the "new" content, then each L0 and L1 branch can track its
previous positions by following first-parent history, which appear as
parallel paths (until reaching the first place where the branches diverged).
If we also walk to second parents, then the histories overlap significantly
and cannot be distinguished except for very-recent changes.

For this reason, the first-parent condition should be symmetrical across the
base and source branches.

Another common case for desiring the result of this optimization method is
the use of release branches. When releasing a version of a repository, a
branch can be used to track that release. Any updates that are worth fixing
in that release can be merged to the release branch and shipped with only
the necessary fixes without any new features introduced in the trunk branch.
The 'maint-2.<X>' branches represent this pattern in the Git project. The
microsoft/git fork uses 'vfs-2.<X>.<Y>' branches to track the changes that
are custom to that fork on top of each upstream Git release 2.<X>.<Y>. This
application doesn't need the symmetrical first-parent condition, but the use
of first-parent histories does not change the results for these branches.

To determine the base branch from a list of candidates, create a new method
in commit-reach.c that performs a single* commit-graph walk. The core
concept is to walk first-parents starting at the candidate bases and the
source, tracking the "best" base to reach a given commit. Use generation
numbers to ensure that a commit is walked at most once and all children have
been explored before visiting it.  When reaching a commit that is reachable
from both a base and the source, we will then have a guarantee that this is
the closest intersection of first-parent histories. Track the best base to
reach that commit and return it as a result. In rare cases involving
multiple root commits, the first-parent history of the source may never
intersect any of the candidates and thus a null result is returned.

* There are up to two walks, since we require all commits to have a computed
  generation number in order to avoid incorrect results. This is similar to
  the need for computed generation numbers in ahead_behind() as implemented
  in fd67d149bd (commit-reach: implement ahead_behind() logic, 2023-03-20).

In order to track the "best" base, use a new commit slab that stores an
integer.  This value defaults to zero upon initialization, so use -1 to
track that the source commit can reach this commit and use 'i + 1' to track
that the ith base can reach this commit. When multiple bases can reach a
commit, minimize the index to break ties. This allows the caller to specify
an order to the bases that determines some amount of preference when the
heuristic does not result in a unique result.

The trickiest part of the integer slab is what happens when reaching a
collision among the histories of the bases and the history of the source.
This is noticed when viewing the first parent and seeing that it has a slab
value that differs in sign (negative or positive). In this case, the
collision commit is stored in the method variable 'branch_point' and its
slab value is set to -1. The index of the best base (so far) is stored in
the method variable 'best_index'. It is possible that there are multiple
commits that have the branch_point as its first parent, leading to multiple
updates of best_index.  The result is determined when 'branch_point' is
visited in the commit walk, giving the guarantee that all commits that could
reach 'branch_point' were visited.

Several interesting cases of collisions and different results are tested in
the t6600-test-reach.sh script. Recall that this script also tests the
algorithm in three possible states involving the commit-graph file and how
many commits are written in the file. This provides some coverage of the
need (and lack of need) for the ensure_generations_valid() method.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:05 -07:00
77d4b3dd73 builtin/diff: free symmetric diff members
We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:02 -07:00
36f971f861 diff: free state populated via options
The `objfind` and `anchors` members of `struct diff_options` are
populated via option parsing, but are never freed in `diff_free()`. Fix
this to plug those memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
0aaca0ec09 builtin/log: fix leak when showing converted blob contents
In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
38678e5df5 userdiff: fix leaking memory for configured diff drivers
The userdiff structures may be initialized either statically on the
stack or dynamically via configuration keys. In the latter case we end
up leaking memory because we didn't have any infrastructure to discern
those strings which have been allocated statically and those which have
been allocated dynamically.

Refactor the code such that we have two pointers for each of these
strings: one that holds the value as accessed by other subsystems, and
one that points to the same string in case it has been allocated. Like
this, we can safely free the second pointer and thus plug those memory
leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
1bc158e750 builtin/format-patch: fix various trivial memory leaks
There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
6b15d9ca7f diff: fix leak when parsing invalid ignore regex option
When parsing invalid ignore regexes passed via the `-I` option we don't
free already-allocated memory, leading to a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
4dfd4f1dfe unpack-trees: clear index when not propagating it
When provided a pointer to a destination index, then `unpack_trees()`
will end up copying its `o->internal.result` index into the provided
pointer. In those cases it is thus not necessary to free the index, as
we have transferred ownership of it.

There are cases though where we do not end up transferring ownership of
the memory, but `clear_unpack_trees_porcelain()` will never discard the
index in that case and thus cause a memory leak. And right now it cannot
do so in the first place because we have no indicator of whether we did
or didn't transfer ownership of the index.

Adapt the code to zero out the index in case we transfer its ownership.
Like this, we can now unconditionally discard the index when being asked
to clear the `unpack_trees_options`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
2f07d228c3 sequencer: release todo list on error paths
We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
hitting an error path. Restructure the function to have a common exit
path such that we can easily clean up the list and thus plug this memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
de54b450a3 merge-ort: unconditionally release attributes index
We conditionally release the index used for reading gitattributes in
merge-ort based on whether or the index has been populated. This check
uses `cache_nr` as a condition. This isn't sufficient though, as the
variable may be zero even when some other parts of the index have been
populated. This leads to memory leaks when sparse checkouts are in use,
as we may not end up releasing the sparse checkout patterns.

Fix this issue by unconditionally releasing the index.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
a0b82622cb builtin/fast-export: plug leaking tag names
When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
8ed4e96b5b builtin/fast-export: fix leaking diff options
Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
such that its contents aren't getting freed inside of `handle_commit()`.
We never unset that flag though, which means that the structure's
allocated resources will ultimately leak.

Fix this by unsetting the flag after the loop such that we release its
resources via `release_revisions()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
0662f0dacb builtin/fast-import: plug trivial memory leaks
Plug some trivial memory leaks in git-fast-import(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
187b623eef builtin/notes: fix leaking struct notes_tree when merging notes
We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
1ca57bea4a builtin/rebase: fix leaking commit.gpgsign value
In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.

Let's plug the memory leak by freeing the value before we overwrite it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
648abbe22d config: fix leaking comment character config
When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_to_free`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_to_free`.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
5f6519b62c submodule-config: fix leaking name entry when traversing submodules
We traverse through submodules in the tree via `tree_entry()`, passing
to it a `struct name_entry` that it is supposed to populate with the
tree entry's contents. We unnecessarily allocate this variable instead
of passing a variable that is allocated on the stack, and the ultimately
don't even free that variable. This is unnecessary and leaks memory.

Convert the variable to instead be allocated on the stack to plug the
memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
d1c53f6703 read-cache: fix leaking hashfile when writing index fails
In `do_write_index()`, we use a `struct hashfile` to write the index
with a trailer hash. In case the write fails though, we never clean up
the allocated `hashfile` state and thus leak memory.

Refactor the code to have a common exit path where we can free this and
other allocated memory. While at it, refactor our use of `strbuf`s such
that we reuse the same buffer to avoid some unneeded allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
c81dcf630c bulk-checkin: fix leaking state TODO
When flushing a bulk-checking to disk we also reset the `struct
bulk_checkin_packfile` state. But while we free some of its members,
others aren't being free'd, leading to memory leaks:

  - The temporary packfile name is not getting freed.

  - The `struct hashfile` only gets freed in case we end up calling
    `finalize_hashfile()`. There are code paths though where that is not
    the case, namely when nothing has been written. For this, we need to
    make `free_hashfile()` public.

Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:57 -07:00
9ddd5f755d object-name: fix leaking symlink paths in object context
The object context may be populated with symlink contents when reading a
symlink, but the associated strbuf doesn't ever get released when
releasing the object context, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:57 -07:00
aa9ef614dc object-file: fix memory leak when reading corrupted headers
When reading corrupt object headers in `read_loose_object()`, we bail
out immediately. This causes a memory leak though because we would have
already initialized the zstream in `unpack_loose_header()`, and it is
the callers responsibility to finish the zstream even on error. While
this feels weird, other callsites do it correctly already.

Fix this leak by ending the zstream even on errors. We may want to
revisit this interface in the future such that the callee handles this
for us already when there was an error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:57 -07:00
ce15f9eb9e git: fix leaking system paths
Git has some flags to make it output system paths as they have been
compiled into Git. This is done by calling `system_path()`, which
returns an allocated string. This string isn't ever free'd though,
creating a memory leak.

Plug those leaks. While they are surfaced by t0211, there are more
memory leaks looming exposed by that test suite and it thus does not yet
pass with the memory leak checker enabled.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:56 -07:00
ce01f92889 remote: plug memory leak when aliasing URLs
When we have a `url.*.insteadOf` configuration, then we end up aliasing
URLs when populating remotes. One place where this happens is in
`alias_all_urls()`, where we loop through all remotes and then alias
each of their URLs. The actual aliasing logic is then contained in
`alias_url()`, which returns an allocated string that contains the new
URL. This URL replaces the old URL that we have in the strvec that
contains all remote URLs.

We replace the remote URLs via `strvec_replace()`, which does not hand
over ownership of the new string to the vector. Still, we didn't free
the aliased URL and thus have a memory leak here. Fix it by freeing the
aliased string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:56 -07:00
16d89aa975 t9001-send-email.sh: fix quoting for mailrc --dump-aliases test
The .mailrc alias file format documents that multiple addresses are
separated by spaces. The alias file used in the t9001 --dump-aliases
mailrc test have addresses which include both a name and email. These
are unquoted, so git send-email will parse this as an alias that
translates to multiple independent addresses.

The existing test does not care about this, as --dump-aliases only dumps
the alias and not the address. However, it is incorrect for a future
where --dump-aliases could also dump the mail addresses.

Fix the test to quote the aliases properly, so that they translate to a
single address.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 09:55:03 -07:00
1784522a1f midx: drop unused parameters from add_midx_to_chain()
When loading a chained midx, we build up an array of hashes, one per
layer of the chain. But since the chain is also represented by the
linked list of multi_pack_index structs, nobody actually reads this
array. We pass it to add_midx_to_chain(), but the parameters are
completely ignored.

So we can drop those unused parameters. And then we can see that its
sole caller, load_midx_chain_fd_st(), only cares about one layer hash at a
time (for parsing each line and feeding it to the single-layer midx
code). So we can replace the array with a single object_id on the stack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:36:34 -07:00
96a9a3e42e bundle: default to SHA1 when reading bundle headers
We hit a segfault when trying to open a bundle via `git bundle
list-heads` when running outside of a repository. This is caused by
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), which stopped setting the default object hash so that
`the_hash_algo` is a `NULL` pointer when running outside of any repo.

This is only a symptom of a deeper issue though. Bundles default to the
SHA1 object format unless they advertise an "@object-format=" header.
Consequently, it has been wrong in the first place to use the object
format used by the current repository when parsing bundles. The
consequence is that trying to open a bundle that uses a different object
hash than the current repository will fail:

    $ git bundle list-heads sha1.bundle
    error: unrecognized header: ee4b540943284700a32591ad09f7e15bdeb2a10c HEAD (45)

Fix the bug by defaulting to the SHA1 object hash. We already handle the
"@object-format=" header as expected, so we don't need to adapt this
part.

Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:26:44 -07:00
7298bcc573 builtin/bundle: have unbundle check for repo before opening its bundle
The `git bundle unbundle` subcommand requires a repository to unbundle
the contents into. As thus, the subcommand checks whether we have a
startup repository in the first place, and if not it dies.

This check happens after we have already opened the bundle though. This
causes a segfault when running outside of a repository starting with
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07) because we have no hash function set up, but we do try to
parse refs advertised by the bundle's header.

The next commit will fix that underlying issue by defaulting to the SHA1
object format for bundles, which will also fix the described segfault here.
But as we know that we will die anyway, we can do better than that and
avoid some vain work by moving the check for a repository before we try
to open the bundle.

Reported-by: ArcticLampyrid <ArcticLampyrid@outlook.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:26:20 -07:00
5e440bf7f1 t-reftable-readwrite: add test for known error
When using reftable_writer_add_ref() to add a ref record to a
reftable writer, The update_index of the ref record must be within
the limits set by reftable_writer_set_limits(), or REFTABLE_API_ERROR
is returned. This scenario is currently left untested. Add a test
case for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:08:03 -07:00
12f9ea473f t-reftable-readwrite: use 'for' in place of infinite 'while' loops
Using a for loop with an empty conditional statement is more concise
and easier to read than an infinite 'while' loop in instances
where we need a loop variable. Hence, replace such instances of a
'while' loop with the equivalent 'for' loop.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:08:03 -07:00
3dd4fb13a0 t-reftable-readwrite: use free_names() instead of a for loop
free_names() as defined by reftable/basics.{c,h} frees a NULL
terminated array of malloced strings along with the array itself.
Use this function instead of a for loop to free such an array.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:08:02 -07:00
5b539a5361 t: move reftable/readwrite_test.c to the unit testing framework
reftable/readwrite_test.c exercises the functions defined in
reftable/reader.{c,h} and reftable/writer.{c,h}. Migrate
reftable/readwrite_test.c to the unit testing framework. Migration
involves refactoring the tests to use the unit testing framework
instead of reftable's test framework and renaming the tests to
align with unit-tests' naming conventions.

Since some tests in reftable/readwrite_test.c use the functions
set_test_hash(), noop_flush() and strbuf_add_void() defined in
reftable/test_framework.{c,h} but these files are not #included
in the ported unit test, copy these functions in the new test file.

While at it, ensure structs are 0-initialized with '= { 0 }'
instead of '= { NULL }'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:08:02 -07:00
036876a106 config: hide functions using the_repository by default
The config subsystem provides a bunch of legacy functions that read or
set configuration for `the_repository`. The use of those functions is
discouraged, and it is easy to miss the implicit dependency on
`the_repository` that calls to those functions may cause.

Move all config-related functions that use `the_repository` into a block
that gets only conditionally compiled depending on whether or not the
macro has been defined. This also removes all dependencies on that
variable in "config.c", allowing us to remove the definition of said
preprocessor macro.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:05 -07:00
219de841d9 global: prepare for hiding away repo-less config functions
We're about to hide config functions that implicitly depend on
`the_repository` behind the `USE_THE_REPOSITORY_VARIABLE` macro. This
will uncover a bunch of dependents that transitively relied on the
global variable, but didn't define the macro yet.

Adapt them such that we define the macro to prepare for this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:05 -07:00
f7d61c4135 config: don't depend on the_repository with branch conditions
When computing branch "includeIf" conditions we use `the_repository` to
obtain the main ref store. We really shouldn't depend on this global
repository though, but should instead use the repository that is being
passed to us via `struct config_include_data`. Otherwise, when parsing
configuration of e.g. submodules, we may end up evaluating the condition
the via the wrong refdb.

Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:05 -07:00
c2ba4e3b5c config: don't have setters depend on the_repository
Some of the setters that accept a `struct repository` still implicitly
rely on `the_repository` via `git_config_set_multivar_in_file()`. While
this function would typically use the caller-provided path, it knows to
fall back to using the configuration path indicated by `the_repository`.

Adapt those functions to instead use the caller-provided repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:04 -07:00
76fc9906f2 config: pass repo to functions that rename or copy sections
Refactor functions that rename or copy config sections to accept a
`struct repository` such that we can get rid of the implicit dependency
on `the_repository`. Rename the functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:04 -07:00
0c2c37d16b config: pass repo to git_die_config()
Refactor `git_die_config()` to accept a `struct repository` such that we
can get rid of the implicit dependency on `the_repository`. Rename the
function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:04 -07:00
44ebcd6254 config: pass repo to git_config_get_expiry_in_days()
Refactor `git_config_get_expiry_in_days()` to accept a `struct
repository` such that we can get rid of the implicit dependency on
`the_repository`. Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
87aace129e config: pass repo to git_config_get_expiry()
Refactor `git_config_get_expiry()` to accept a `struct repository` such
that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
d8b772182c config: pass repo to git_config_get_max_percent_split_change()
Refactor `git_config_get_max_percent_split_change()` to accept a `struct
repository` such that we can get rid of the implicit dependency on
`the_repository`. Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
be7537e6a9 config: pass repo to git_config_get_split_index()
Refactor `git_config_get_split_index()` to accept a `struct repository`
such that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
1870cc30d4 config: pass repo to git_config_get_index_threads()
Refactor `git_config_get_index_threads()` to accept a `struct
repository` such that we can get rid of the implicit dependency on
`the_repository`. Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:02 -07:00
2ea8536468 config: expose repo_config_clear()
While we already have `repo_config_clear()` as an alternative to
`git_config_clear()` that doesn't rely on `the_repository`, it is not
exposed to callers outside of the config subsystem. Do so.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:02 -07:00
909a2bfb1f config: introduce missing setters that take repo as parameter
While we already provide some of the config-setting interfaces with a
`struct repository` as parameter, others only have a variant that
implicitly depends on `the_repository`. Fill in those gaps such that we
can start to deprecate the repo-less variants.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:02 -07:00
7ac16649ec path: hide functions using the_repository by default
The path subsystem provides a bunch of legacy functions that compute
paths relative to the "gitdir" and "commondir" directories of the global
`the_repository` variable. Use of those functions is discouraged, and it
is easy to miss the implicit dependency on `the_repository` that calls
to those functions may cause.

With `USE_THE_REPOSITORY_VARIABLE`, we have recently introduced a tool
that allows us to get rid of such functions over time. With this macro,
we can hide away functions that have such implicit dependency such that
other subsystems that want to be free of `the_repository` will not use
them by accident.

Move all path-related functions that use `the_repository` into a block
that gets only conditionally compiled depending on whether or not the
macro has been defined. This also removes all dependencies on that
variable in "path.c", allowing us to remove the definition of said
preprocessor macro.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
a973f60dc7 path: stop relying on the_repository in worktree_git_path()
When not provided a worktree, then `worktree_git_path()` will fall back
to returning a path relative to the main repository. In this case, we
implicitly rely on `the_repository` to derive the path. Remove this
dependency by passing a `struct repository` as parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
78f2210b3c path: stop relying on the_repository when reporting garbage
We access `the_repository` in `report_linked_checkout_garbage()` both
directly and indirectly via `get_git_dir()`. Remove this dependency by
instead passing a `struct repository` as parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
169c979771 hooks: remove implicit dependency on the_repository
We implicitly depend on `the_repository` in our hook subsystem because
we use `strbuf_git_path()` to compute hook paths. Remove this dependency
by accepting a `struct repository` as parameter instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
419dbb29d8 editor: do not rely on the_repository for interactive edits
We implicitly rely on `the_repository` when editing a file interactively
because we call `git_path()`. Adapt the function to instead take a
`struct repository` as a parameter so that we can remove this hidden
dependency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:00 -07:00
61419a42f6 path: expose do_git_common_path() as repo_common_pathv()
With the same reasoning as the preceding commit, expose the function
`do_git_common_path()` as `repo_common_pathv()`. While at it, reorder
parameters such that they match the order we have in `repo_git_pathv()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:00 -07:00
b6c6bfef31 path: expose do_git_path() as repo_git_pathv()
We're about to move functions of the "path" subsytem that do not use a
`struct repository` into "path.h" as static inlined functions. This will
require us to call `do_git_path()`, which is internal to "path.c".

Expose the function as `repo_git_pathv()` to prepare for the change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:00 -07:00
4460e052e0 remerge-diff: clean up temporary objdir at a central place
After running a diff between two things, or a series of diffs while
walking the history, the diff computation is concluded by a call to
diff_result_code() to extract the exit status of the diff machinery.

The function can work on "struct diffopt", but all the callers
historically and currently pass "struct diffopt" that is embedded in
the "struct rev_info" that is used to hold the remerge_diff bit and
the remerge_objdir variable that points at the temporary object
directory in use.

Redefine diff_result_code() to take the whole "struct rev_info" to
give it an access to these members related to remerge-diff, so that
it can get rid of the temporary object directory for any and all
callers that used the feature.  We can lose the equivalent code to
do so from the code paths for individual commands, diff-tree, diff,
and log.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 15:42:40 -07:00
245cac5c33 remerge-diff: lazily prepare temporary objdir on demand
It is error prone for each caller that sets revs.remerge_diff bit
to be responsible for preparing a temporary object directory and
rotate it into the list of alternate object stores, making it the
primary object store.

Instead, remove the code to set up and arrange the temporary object
directory from the current callers and implement it in the code that
runs remerge-diff logic.  The code to undo the futzing of the list
of alternate object store is still spread across the callers, but we
will deal with it in future steps.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 15:42:35 -07:00
170cdfc5a4 doc: grammofix in git-diff-tree
Describe in present tense what the option does when it is given.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 10:15:31 -07:00
9a91f7a4de tutorial: grammofix
We say "these", so "range notations" must be plural.

Reported-by: Furkan Akkurt
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 10:14:48 -07:00
a30ce14a80 ref-filter: populate symref from iterator
With a previous commit, the reference the symbolic ref points to is saved
in the ref iterator records. Instead of making a separate call to
resolve_refdup() each time, we can just populate the ref_array_item with
the value from the iterator.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:34 -07:00
e8207717f1 refs: add referent to each_ref_fn
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:34 -07:00
cfd971520e refs: keep track of unresolved reference value in iterators
Since ref iterators do not hold onto the direct value of a reference
without resolving it, the only way to get ahold of a direct value of a
symbolic ref is to make a separate call to refs_read_symbolic_ref.

To make accessing the direct value of a symbolic ref more efficient,
let's save the direct value of the ref in the iterators for both the
files backend and the reftable backend.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:33 -07:00
a77554ea09 diff-tree: fix crash when used with --remerge-diff
When using "git-diff-tree" to get the tree diff for merge commits with
the diff format set to `remerge`, a bug is triggered as shown below:

  $ git diff-tree -r --remerge-diff 363337e6eb
  363337e6eb
  BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?

This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
added in commit 7b90ab467a (log: clean unneeded objects during log
--remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
attempting to clean up temporary objects generated during the remerge
process.

After some further digging, I find that the remerge-related diff options
were introduced in db757e8b8d (show, log: provide a --remerge-diff
capability, 2022-02-02), which also affect the setup of `rev_info` for
"git-diff-tree", but were not accounted for in the original
implementation (inferred from the commit message).

Elijah Newren, the author of the remerge diff feature, notes that other
callers of `log-tree.c:log_tree_commit` (the only caller of
`log-tree.c:do_remerge_diff`) also exist, but:

  `builtin/am.c`: manually sets all flags; remerge_diff is not among them
  `sequencer.c`: manually sets all flags; remerge_diff is not among them

so `builtin/diff-tree.c` really is the only caller that was overlooked
when remerge-diff functionality was added.

This commit resolves the crash by adding `remerge_objdir` setup logic to
`builtin/diff-tree.c`, mirroring `builtin/log.c:cmd_log_walk_no_free`.
It also includes the necessary cleanup for `remerge_objdir`.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:07:44 -07:00
0d66f601a9 tests: drop use of 'tee' that hides exit status
A few tests have "| tee output" downstream of a git command, and
then inspect the contents of the file.  The net effect is that we
use an extra process, and hide the exit status from the upstream git
command.

In any of these tests, I do not see a reason why we want to hide a
possible failure from these git commands.  Replace the use of tee
with a plain simple redirection.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 18:08:10 -07:00
25673b1c47 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:41:21 -07:00
598422337c Merge branch 'ps/p4-tests-updates'
Perforce tests have been updated.

* ps/p4-tests-updates:
  t98xx: mark Perforce tests as memory-leak free
  ci: update Perforce version to r23.2
  t98xx: fix Perforce tests with p4d r23 and newer
2024-08-08 10:41:21 -07:00
3e12106370 Merge branch 'dh/encoding-trace-optim'
An expensive operation to prepare tracing was done in re-encoding
code path even when the tracing was not requested, which has been
corrected.

* dh/encoding-trace-optim:
  convert: return early when not tracing
2024-08-08 10:41:20 -07:00
536695cabe Merge branch 'ps/doc-more-c-coding-guidelines'
Some project conventions have been added to CodingGuidelines.

* ps/doc-more-c-coding-guidelines:
  Documentation: consistently use spaces inside initializers
  Documentation: document idiomatic function names
  Documentation: document naming schema for structs and their functions
  Documentation: clarify indentation style for C preprocessor directives
  clang-format: fix indentation width for preprocessor directives
2024-08-08 10:41:20 -07:00
984ab11337 Merge branch 'rs/grep-omit-blank-lines-after-function-at-eof'
"git grep -W" omits blank lines that follow the found function at
the end of the file, just like it omits blank lines before the next
function.

* rs/grep-omit-blank-lines-after-function-at-eof:
  grep: -W: skip trailing empty lines at EOF, too
2024-08-08 10:41:19 -07:00
028cf22904 Merge branch 'dd/notes-empty-no-edit-by-default'
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.

* dd/notes-empty-no-edit-by-default:
  notes: do not trigger editor when adding an empty note
2024-08-08 10:41:19 -07:00
c2058b2a85 Merge branch 'es/shell-check-updates'
Test script linter has been updated to catch an attempt to use
one-shot export construct "VAR=VAL func" for shell functions (which
does not work for some shells) better.

* es/shell-check-updates:
  check-non-portable-shell: improve `VAR=val shell-func` detection
  check-non-portable-shell: suggest alternative for `VAR=val shell-func`
  check-non-portable-shell: loosen one-shot assignment error message
  t4034: fix use of one-shot variable assignment with shell function
  t3430: drop unnecessary one-shot "VAR=val shell-func" invocation
2024-08-08 10:41:18 -07:00
d70f3208bc Merge branch 'rj/add-p-pager'
A 'P' command to "git add -p" that passes the patch hunk to the
pager has been added.

* rj/add-p-pager:
  add-patch: render hunks through the pager
  pager: introduce wait_for_pager
  pager: do not close fd 2 unnecessarily
  add-patch: test for 'p' command
2024-08-08 10:41:18 -07:00
f250b51b49 Merge branch 'ks/unit-test-comment-typofix'
Typofix.

* ks/unit-test-comment-typofix:
  unit-tests/test-lib: fix typo in check_pointer_eq() description
2024-08-08 10:41:17 -07:00
203a9bf091 t7004: make use of write_script
Use write_script which takes care of emitting the `#!/bin/sh` line
and the `chmod +x`.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:01 -07:00
2f44f11b0a t7004: use single quotes instead of double quotes
Some test bodies and test description are surrounded with double
quotes instead of single quotes, violating our coding style.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:01 -07:00
c07b695c15 t7004: begin the test body on the same line as test_expect_success
Test body should begin with a single quote right after the test
description instead of backslash followed by new line.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:01 -07:00
8975df91ff t7004: description on the same line as test_expect_success
There are several tests in t7004 where the test description that
follows `test_expect_success` is on a separate line, violating our
coding style. Adapt these to be on the same line.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:01 -07:00
c4e00c1c6b t7004: do not prepare things outside test_expect_success
Do not prepare expect and other things outside test_expect_success.
If such code fails for some reason, we won't necessarily hear about
it in a timely fashion (or perhaps at all). By placing all code inside
`test_expect_success` it ensures that we know immediately if it fails.

Also add '\' before EOF to avoid shell interpolation and '-' to allow
indentation of the body.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:01 -07:00
52a6674a4d t7004: use indented here-doc
Use <<-\EOF instead of <<\EOF where the latter allows us to indent
the body of the here-doc.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:00 -07:00
95fc11b6fd t7004: one command per line
One of the tests in t7004 has multiple commands on a single line,
which is discouraged. Adapt these by splitting up these into one
line per command.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:00 -07:00
ea62c4f947 t7004: remove space after redirect operators
Modernize 't7004' by removing whitespace after redirect operators.

Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:27:00 -07:00
f234df07f6 reftable/stack: handle locked tables during auto-compaction
When compacting tables, it may happen that we want to compact a set of
tables which are already locked by a concurrent process that compacts
them. In the case where we wanted to perform a full compaction of all
tables it is sensible to bail out in this case, as we cannot fulfill the
requested action.

But when performing auto-compaction it isn't necessarily in our best
interest of us to abort the whole operation. For example, due to the
geometric compacting schema that we use, it may be that process A takes
a lot of time to compact the bulk of all tables whereas process B
appends a bunch of new tables to the stack. B would in this case also
notice that it has to compact the tables that process A is compacting
already and thus also try to compact the same range, probably including
the new tables it has appended. But because those tables are locked
already, it will fail and thus abort the complete auto-compaction. The
consequence is that the stack will grow longer and longer while A isn't
yet done with compaction, which will lead to a growing performance
impact.

Instead of aborting auto-compaction altogether, let's gracefully handle
this situation by instead compacting tables which aren't locked. To do
so, instead of locking from the beginning of the slice-to-be-compacted,
we start locking tables from the end of the slice. Once we hit the first
table that is locked already, we abort. If we succeeded to lock two or
more tables, then we simply reduce the slice of tables that we're about
to compact to those which we managed to lock.

This ensures that we can at least make some progress for compaction in
said scenario. It also helps in other scenarios, like for example when a
process died and left a stale lockfile behind. In such a case we can at
least ensure some compaction on a best-effort basis.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:43 -07:00
ed1ad6b44d reftable/stack: fix corruption on concurrent compaction
The locking employed by compaction uses the following schema:

  1. Lock "tables.list" and verify that it matches the version we have
     loaded in core.

  2. Lock each of the tables in the user-supplied range of tables that
     we are supposed to compact. These locks prohibit any concurrent
     process to compact those tables while we are doing that.

  3. Unlock "tables.list". This enables concurrent processes to add new
     tables to the stack, but also allows them to compact tables outside
     of the range of tables that we have locked.

  4. Perform the compaction.

  5. Lock "tables.list" again.

  6. Move the compacted table into place.

  7. Write the new order of tables, including the compacted table, into
     the lockfile.

  8. Commit the lockfile into place.

Letting concurrent processes modify the "tables.list" file while we are
doing the compaction is very much part of the design and thus expected.
After all, it may take some time to compact tables in the case where we
are compacting a lot of very large tables.

But there is a bug in the code. Suppose we have two processes which are
compacting two slices of the table. Given that we lock each of the
tables before compacting them, we know that the slices must be disjunct
from each other. But regardless of that, compaction performed by one
process will always impact what the other process needs to write to the
"tables.list" file.

Right now, we do not check whether the "tables.list" has been changed
after we have locked it for the second time in (5). This has the
consequence that we will always commit the old, cached in-core tables to
disk without paying to respect what the other process has written. This
scenario would then lead to data loss and corruption.

This can even happen in the simpler case of one compacting process and
one writing process. The newly-appended table by the writing process
would get discarded by the compacting process because it never sees the
new table.

Fix this bug by re-checking whether our stack is still up to date after
locking for the second time. If it isn't, then we adjust the indices of
tables to replace in the updated stack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:43 -07:00
128b9aa3e9 reftable/stack: use lock_file when adding table to "tables.list"
When modifying "tables.list", we need to lock the list before updating
it to ensure that no concurrent writers modify the list at the same
point in time. While we do this via the `lock_file` subsystem when
compacting the stack, we manually handle the lock when adding a new
table to it. While not wrong, it is at least inconsistent.

Refactor the code to consistently lock "tables.list" via the `lock_file`
subsytem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:43 -07:00
7ee307da1b reftable/stack: do not die when fsyncing lock file files
We use `fsync_component_or_die()` when committing an addition to the
"tables.list" lock file, which unsurprisingly dies in case the fsync
fails. Given that this is part of the reftable library, we should never
die and instead let callers handle the error.

Adapt accordingly and use `fsync_component()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:43 -07:00
558f6fbeb1 reftable/stack: simplify tracking of table locks
When compacting tables, we store the locks of all tables we are about to
compact in the `table_locks` array. As we currently only ever compact
all tables in the user-provided range or none, we simply track those
locks via the indices of the respective tables in the merged stack.

This is about to change though, as we will introduce a mode where auto
compaction gracefully handles the case of already-locked files. In this
case, it may happen that we only compact a subset of the user-supplied
range of tables. In this case, the indices will not necessarily match
the lock indices anymore.

Refactor the code such that we track the number of locks via a separate
variable. The resulting code is expected to perform the same, but will
make it easier to perform the described change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:42 -07:00
5f0ed603a1 reftable/stack: update stats on failed full compaction
When auto-compaction fails due to a locking error, we update the
statistics to indicate this failure. We're not doing the same when
performing a full compaction.

Fix this inconsistency by using `stack_compact_range_stats()`, which
handles the stat update for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:42 -07:00
8030100bda reftable/stack: test compaction with already-locked tables
We're lacking test coverage for compacting tables when some of the
tables that we are about to compact are locked. Add two tests that
exercise this, one for auto-compaction and one for full compaction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:42 -07:00
9a833ca35d reftable/stack: extract function to setup stack with N tables
We're about to add two tests, and both of them will want to initialize
the reftable stack with a set of N tables. Introduce a new function that
handles this and refactor existing tests that use such a setup to use
it.

Note that this changes the exact records contained in the preexisting
tests. This is fine though as we only care about the shape of the stack
here, not the shape of each table.

Furthermore, with this change we now start to disable auto compaction
when writing the tables, as otherwise we might not end up with the
expected amount of new tables added. This also slightly changes the
behaviour of these tests, but the properties we care for remain intact.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:42 -07:00
ed7d2f4770 reftable/stack: refactor function to gather table sizes
Refactor the function that gathers table sizes to be more idiomatic. For
one, use `REFTABLE_CALLOC_ARRAY()` instead of `reftable_calloc()`.
Second, avoid using an integer to iterate through the tables in the
reftable stack given that `stack_len` itself is using a `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 10:14:41 -07:00
1c31be45b3 fsck: add ref name check for files backend
The git-fsck(1) only implicitly checks the reference, it does not fully
check refs with bad format name such as standalone "@".

However, a file ending with ".lock" should not be marked as having a bad
ref name. It is expected that concurrent writers may have such lock files.
We currently ignore this situation. But for bare ".lock" file, we will
report it as error.

In order to provide such checks, add a new fsck message id "badRefName"
with default ERROR type. Use existing "check_refname_format" to explicit
check the ref name. And add a new unit test to verify the functionality.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00
a7600b8481 files-backend: add unified interface for refs scanning
For refs and reflogs, we need to scan its corresponding directories to
check every regular file or symbolic link which shares the same pattern.
Introduce a unified interface for scanning directories for
files-backend.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00
bf061d26c7 builtin/refs: add verify subcommand
Introduce a new subcommand "verify" in git-refs(1) to allow the user to
check the reference database consistency and also this subcommand will
be used as the entry point of checking refs for "git-fsck(1)".

Add "verbose" field into "fsck_options" to indicate whether we should
print verbose messages when checking refs and objects consistency.

Remove bit-field for "strict" field, this is because we cannot take
address of a bit-field which makes it unhandy to set member variables
when parsing the command line options.

The "git-fsck(1)" declares "fsck_options" variable with "static"
identifier which avoids complaint by the leak-checker. However, in
"git-refs verify", we need to do memory clean manually. Thus add
"fsck_options_clear" function in "fsck.c" to provide memory clean
operation.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00
ab6f79d8df refs: set up ref consistency check infrastructure
The "struct ref_store" is the base class which contains the "be" pointer
which provides backend-specific functions whose interfaces are defined
in the "ref_storage_be". We could reuse this polymorphism to define only
one interface. For every backend, we need to provide its own function
pointer.

The interfaces defined in the `ref_storage_be` are carefully structured
in semantic. It's organized as the five parts:

1. The name and the initialization interfaces.
2. The ref transaction interfaces.
3. The ref internal interfaces (pack, rename and copy).
4. The ref filesystem interfaces.
5. The reflog related interfaces.

To keep consistent with the git-fsck(1), add a new interface named
"fsck_refs_fn" to the end of "ref_storage_be". This semantic cannot be
grouped into any above five categories. Explicitly add blank line to
make it different from others.

Last, implement placeholder functions for each ref backends.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00
2de307cdb2 fsck: add refs report function
Introduce a new struct "fsck_ref_report" to contain the information we
need when reporting refs-related messages.

With the new "fsck_vreport" function, add a new function
"fsck_report_ref" to report refs-related fsck error message. Unlike
"report" function uses the exact parameters, we simply pass "struct
fsck_ref_report *report" as the parameter. This is because at current we
don't know exactly how many fields we need. By passing this parameter,
we don't need to change this function prototype when we want to add more
information into "fsck_ref_report".

We have introduced "fsck_report_ref" function to report the error
message for refs. We still need to add the corresponding callback
function. Create refs-specific "error_func" callback
"fsck_refs_error_function".

Last, add "FSCK_REFS_OPTIONS_DEFAULT" macro to create default options
when checking ref consistency.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
3473d18fad fsck: add a unified interface for reporting fsck messages
The static function "report" provided by "fsck.c" aims at checking error
type and calling the callback "error_func" to report the message. Both
refs and objects need to check the error type of the current fsck
message. In order to extract this common behavior, create a new function
"fsck_vreport". Instead of using "...", provide "va_list" to allow more
flexibility.

Instead of changing "report" prototype to be align with the
"fsck_vreport" function, we leave the "report" prototype unchanged due
to the reason that there are nearly 62 references about "report"
function. Simply change "report" function to use "fsck_vreport" to
report objects related messages.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
0ec5dfe8c4 fsck: make "fsck_error" callback generic
The "fsck_error" callback is designed to report the objects-related
error messages. It accepts two parameter "oid" and "object_type" which
is not generic. In order to provide a unified callback which can report
either objects or refs, remove the objects-related parameters and add
the generic parameter "void *fsck_report".

Create a new "fsck_object_report" structure which incorporates the
removed parameters "oid" and "object_type". Then change the
corresponding references to adapt to new "fsck_error" callback.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
8cd4a447b8 fsck: rename objects-related fsck error functions
The names of objects-related fsck error functions are generic. It's OK
when there is only object database check. However, we are going to
introduce refs database check report function. To avoid ambiguity,
rename object-related fsck error functions to explicitly indicate these
functions are used to report objects-related messages.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
2d79aa9095 fsck: rename "skiplist" to "skip_oids"
The "skiplist" field in "fsck_options" is related to objects. Because we
are going to introduce ref consistency check, the "skiplist" name is too
general which will make the caller think "skiplist" is related to both
the refs and objects.

It may seem that for both refs and objects, we should provide a general
"skiplist" here. However, the type for "skiplist" is `struct oidset`
which is totally unsuitable for refs.

To avoid above ambiguity, rename "skiplist" to "skip_oids".

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
6f1e9394e2 object: fix leaking packfiles when closing object store
When calling `raw_object_store_clear()`, we close and free several
resources associated with the object store. Part of that is to close and
free all the packfiles, which is handled by `close_object_store()`.

That function really only ends up closing the packfiles though, but it
doesn't free them. And in fact it can't, as that function is being
called via `run_command()` when `close_object_store = 1`, which is done
e.g. when we execute git-maintenance(1). At that point, other structures
may still have references on those packfiles, and thus we cannot free
them here. So while it is in fact intentional that we really only close
them, the result is a memory leak because `raw_object_store_clear()`
does not free them, either.

Fix the leak by freeing the packfiles in `raw_object_store_clear()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:22:21 -07:00
fa0f27a19d submodule: fix leaking seen submodule names
We keep track of submodules we have already seen via a string map such
that we don't process the same submodule twice. We never free that map
though, causing a memory leak.

Fix this leak by clearing the map.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:22:21 -07:00
1a7e5efdb0 submodule: fix leaking fetch tasks
When done with a fetch task used for parallel fetches of submodules, we
need to both call `fetch_task_release()` to release the task's contents
and `free()` to release the task itself. Most sites do this already, but
some only call `fetch_task_release()` and thus leak memory.

While we could trivially fix this by adding the two missing calls to
free(3P), the result would be that we always call both functions. Let's
thus refactor the code such that `fetch_task_release()` also frees the
structure itself. Rename it to `fetch_task_free()` accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:22:21 -07:00
c369fc46d0 builtin/submodule: allow "add" to use different ref storage format
Same as with "clone", users may want to add a submodule to a repository
with a non-default ref storage format. Wire up a new `--ref-format=`
option that works the same as for `git submodule clone`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:22:21 -07:00
fb99dded31 refs: fix ref storage format for submodule ref stores
When opening a submodule ref storage we accidentally use the ref storage
format of the owning repository, not of the submodule repository. As
submodules may have a different storage format than their parent repo
this can lead to bugs when trying to access the submodule ref storage
from the parent repository.

One such bug was reported when performing a recursive pull with mixed
ref stores, which fails with:

    $ git pull --recursive
    fatal: Unable to find current revision in submodule path 'path/to/sub'

The same issue occurs when adding a repository contained in the working
tree with a different ref storage format via `git submodule add`.

Fix the bug by using the submodule repository's ref storage format
instead and add some tests. Note that the test for `git submodule
status` was included as a precaution, only. The command worked alright
even without the bugfix.

Reported-by: Jeppe Øland <joland@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:22:21 -07:00
69814846ab builtin/clone: propagate ref storage format to submodules
When recursively cloning a repository with a non-default ref storage
format, e.g. by passing the `--ref-format=` option, then only the
top-level repository will end up using that ref storage format, and
all recursively cloned submodules will instead use the default format.

While mixed-format constellations are expected to work alright, the
outcome still is somewhat surprising as we have essentially ignored
the user's request.

Fix this by propagating the requested ref format to cloned submodules.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:21:39 -07:00
5ac781ad62 builtin/submodule: allow cloning with different ref storage format
As submodules are proper self-contained repositories, it is perfectly
valid for them to have a different ref storage format than their parent
repository. There is no obvious way for users to ask for the ref storage
format when initializing submodules though. Whether the setup of such
mixed-ref-storage-format constellations is all that useful remains to be
seen. But there is no good reason to not expose such an option, and we
will require it in a subsequent patch.

Introduce a new `--ref-format=` option for git-submodule(1) that allows
the user to pick the ref storage format. This option will also be used
in a subsequent commit, where we start to propagate the same flag from
git-clone(1) to cloning submodules with the `--recursive` switch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:20:49 -07:00
d9ab8788e1 git-submodule.sh: break overly long command lines
For most of the subcommands of git-submodule(1), we end up passing a
bunch of arguments to the submodule helper. This quickly leads to overly
long lines, where it becomes hard to spot what has changed when one
needs to modify them.

Break up these lines into one argument per line, similarly to how it is
done for the "clone" subcommand already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:20:48 -07:00
6ce8ffe30e transport: mark more tests leak-free
After fixing a transport leak, a few more tests have become
leak-free.  Mark them as such.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:16:21 -07:00
448d51d549 transport: fix leak with transport helper URLs
Transport URLs can be prefixed with "foo::", which would tell us that
the transport uses a remote helper called "foo". We extract the helper
name by `xstrndup()`ing the prefix before the double-colons, but never
free that string.

Fix this leak by assigning the result to a separate local variable that
we can then free upon returning.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-07 17:38:31 -07:00
92a29c2c39 Merge branch 'ps/refs-wo-the-repository' into ps/config-wo-the-repository
* ps/refs-wo-the-repository:
  refs/reftable: stop using `the_repository`
  refs/packed: stop using `the_repository`
  refs/files: stop using `the_repository`
  refs/files: stop using `the_repository` in `parse_loose_ref_contents()`
  refs: stop using `the_repository`
2024-08-07 14:13:20 -07:00
90b801d8ff Merge branch 'ps/leakfixes-part-3' into ps/leakfixes-part-4
* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-06 12:40:41 -07:00
fcb2205b77 midx: implement support for writing incremental MIDX chains
Now that the rest of the MIDX subsystem and relevant callers have been
updated to learn about how to read and process incremental MIDX chains,
let's finally update the implementation in `write_midx_internal()` to be
able to write incremental MIDX chains.

This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:

    $ git multi-pack-index write --incremental

The implementation for doing so is relatively straightforward, and boils
down to a handful of different kinds of changes implemented in this
patch:

  - The `compute_sorted_entries()` function is taught to reject objects
    which appear in any existing MIDX layer.

  - Functions like `write_midx_revindex()` are adjusted to write
    pack_order values which are offset by the number of objects in the
    base MIDX layer.

  - The end of `write_midx_internal()` is adjusted to move
    non-incremental MIDX files when necessary (i.e. when creating an
    incremental chain with an existing non-incremental MIDX in the
    repository).

There are a handful of other changes that are introduced, like new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).

The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:

  1. Most of the "interesting" behavior is already thoroughly covered in
     t5319-multi-pack-index.sh, which handles the core logic of reading
     objects through a MIDX.

     The new tests in t5334-incremental-multi-pack-index.sh are mostly
     focused on creating and destroying incremental MIDXs, as well as
     stitching their results together across layers.

  2. A new GIT_TEST environment variable is added called
     "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
     entire test suite to write incremental MIDXs after repacking when
     combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.

     This exercises the long tail of other interesting behavior that is
     defined implicitly throughout the rest of the CI suite. It is
     likewise added to the linux-TEST-vars job.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:39 -07:00
147c3f6740 t/t5313-pack-bounds-checks.sh: prepare for sub-directories
Prepare for sub-directories to appear in $GIT_DIR/objects/pack by
adjusting the copy, remove, and chmod invocations to perform their
behavior recursively.

This prepares us for the new $GIT_DIR/objects/pack/multi-pack-index.d
directory which will be added in a following commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:39 -07:00
9552c3595a t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
Two years ago, commit ff1e653c8e (midx: respect
'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP', 2021-08-31) introduced a new
environment variable which caused the test suite to write MIDX bitmaps
after any 'git repack' invocation.

At the time, this was done to help flush out any bugs with MIDX bitmaps
that weren't explicitly covered in the t5326-multi-pack-bitmap.sh
script.

Two years later, that flag has served us well and is no longer providing
meaningful coverage, as the script in t5326 has matured substantially
and covers many more interesting cases than it did back when ff1e653c8e
was originally written.

Remove the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable
as it is no longer serving a useful purpose. More importantly, removing
this variable clears the way for us to introduce a new one to help
similarly flush out bugs related to incremental MIDX chains.

Because these incremental MIDX chains are (for now) incompatible with
MIDX bitmaps, we cannot have both.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
3592796d0a midx: implement verification support for incremental MIDXs
Teach the verification implementation used by `git multi-pack-index
verify` to perform verification for incremental MIDX chains by
independently validating each layer within the chain.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
b80236d0e3 midx: support reading incremental MIDX chains
Now that the MIDX machinery's internals have been taught to understand
incremental MIDXs over the previous handful of commits, the MIDX
machinery itself can begin reading incremental MIDXs.

(Note that while the on-disk format for incremental MIDXs has been
defined, the writing end has not been implemented. This will take place
in the commit after next.)

The core of this change involves following the order specified in the
MIDX chain in reverse and opening up MIDXs in the chain one-by-one,
adding them to the previous layer's `->base_midx` pointer at each step.

In order to implement this, the `load_multi_pack_index()` function is
taught to call a new `load_multi_pack_index_chain()` function if loading
a non-incremental MIDX failed via `load_multi_pack_index_one()`.

When loading a MIDX chain, `load_midx_chain_fd_st()` reads each line in
the file one-by-one and dispatches calls to
`load_multi_pack_index_one()` to read each layer of the MIDX chain. When
a layer was successfully read, it is added to the MIDX chain by calling
`add_midx_to_chain()` which validates the contents of the `BASE` chunk,
performs some bounds checks on the number of combined packs and objects,
and attaches the new MIDX by assigning its `base_midx` pointer to the
existing part of the chain.

As a supplement to this, introduce a new mode in the test-read-midx
test-tool which allows us to read the information for a specific MIDX in
the chain by specifying its trailing checksum via the command-line
arguments like so:

    $ test-tool read-midx .git/objects [checksum]

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
97fd770ea1 midx: teach midx_fanout_add_midx_fanout() about incremental MIDXs
The function `midx_fanout_add_midx_fanout()` is used to help construct
the fanout table when generating a MIDX by reusing data from an existing
MIDX.

Prepare this function to work with incremental MIDXs by making a few
changes:

  - The bounds checks need to be adjusted to start object lookups taking
    into account the number of objects in the previous MIDX layer (i.e.,
    by starting the lookups at position `m->num_objects_in_base` instead
    of position 0).

  - Likewise, the bounds checks need to end at `m->num_objects_in_base`
    objects after `m->num_objects`.

  - Finally, `midx_fanout_add_midx_fanout()` needs to recur on earlier
    MIDX layers when dealing with an incremental MIDX chain by calling
    itself when given a MIDX with a non-NULL `base_midx`.

Note that after 0c5a62f14b (midx-write.c: do not read existing MIDX with
`packs_to_include`, 2024-06-11), we do not use this function with an
existing MIDX (incremental or not) when generating a MIDX with
--stdin-packs, and likewise for incremental MIDXs.

But it is still used when adding the fanout table from an incremental
MIDX when generating a non-incremental MIDX (without --stdin-packs, of
course).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
b31f2aac56 midx: teach midx_preferred_pack() about incremental MIDXs
The function `midx_preferred_pack()` is used to determine the identity
of the preferred pack, which is the identity of a unique pack within
the MIDX which is used as a tie-breaker when selecting from which pack
to represent an object that appears in multiple packs within the MIDX.

Historically we have said that the MIDX's preferred pack has the unique
property that all objects from that pack are represented in the MIDX.
But that isn't quite true: a more precise statement would be that all
objects from that pack *which appear in the MIDX* are selected from that
pack.

This helps us extend the concept of preferred packs across a MIDX chain,
where some object(s) in the preferred pack may appear in other packs
in an earlier MIDX layer, in which case those object(s) will not appear
in a subsequent MIDX layer from either the preferred pack or any other
pack.

Extend the concept of preferred packs by using the pack which represents
the object at the first position in MIDX pseudo-pack order belonging to
the current MIDX layer (i.e., at position 'm->num_objects_in_base').

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
853165c50a midx: teach midx_contains_pack() about incremental MIDXs
Now that the `midx_contains_pack()` versus `midx_locate_pack()` debacle
has been cleaned up, teach the former about how to operate in an
incremental MIDX-aware world in a similar fashion as in previous
commits.

Instead of using either of the two `midx_for_object()` or
`midx_for_pack()` helpers, this function is split into two: one that
determines whether a pack is contained in a single MIDX, and another
which calls the former in a loop over all MIDXs.

This approach does not require that we change any of the implementation
in what is now `midx_contains_pack_1()` as it still operates over a
single MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
5d0ee3f675 midx: remove unused midx_locate_pack()
Commit 307d75bbe6 (midx: implement `midx_locate_pack()`, 2023-12-14)
introduced `midx_locate_pack()`, which was described at the time as a
complement to the function `midx_contains_pack()` which allowed
callers to determine where in the MIDX lexical order a pack appeared, as
opposed to whether or not it was simply contained.

307d75bbe6 suggests that future patches would be added which would
introduce callers for this new function, but none ever were, meaning the
function has gone unused since its introduction.

Clean this up by in effect reverting 307d75bbe6, which removes the
unused functions and inlines its definition back into
`midx_contains_pack()`.

(Looking back through the list archives when 307d75bbe6 was written,
this was in preparation for this[1] patch from back when we had the
concept of "disjoint" packs while developing multi-pack verbatim reuse.
That concept was abandoned before the series was merged, but I never
dropped what would become 307d75bbe6 from the series, leading to the
state prior to this commit).

[1]: https://lore.kernel.org/git/3019738b52ba8cd78ea696a3b800fa91e722eb66.1701198172.git.me@ttaylorr.com/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
3b00e35108 midx: teach fill_midx_entry() about incremental MIDXs
In a similar fashion as previous commits, teach the `fill_midx_entry()`
function to work in a incremental MIDX-aware fashion.

This function, unlike others which accept an index into either the
lexical order of objects or packs, takes in an object_id, and attempts
to fill a caller-provided 'struct pack_entry' with the remaining pieces
of information about that object from the MIDX.

The function uses `bsearch_midx()` which fills out the frame-local 'pos'
variable, recording the given object_id's lexical position within the
MIDX chain, if found (if no matching object ID was found, we'll return
immediately without filling out the `pack_entry` structure).

Once given that position, we jump back through the `->base_midx` pointer
to ensure that our `m` points at the MIDX layer which contains the given
object_id (and not an ancestor or descendant of it in the chain). Note
that we can drop the bounds check "if (pos >= m->num_objects)" because
`midx_for_object()` performs this check for us.

After that point, we only need to make two special considerations within
this function:

  - First, the pack_int_id returned to us by `nth_midxed_pack_int_id()`
    is a position in the concatenated lexical order of packs, so we must
    ensure that we subtract `m->num_packs_in_base` before accessing the
    MIDX-local `packs` array.

  - Second, we must avoid translating the `pos` back to a MIDX-local
    index, since we use it as an argument to `nth_midxed_offset()` which
    expects a position relative to the concatenated lexical order of
    objects.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
df7ede83be midx: teach nth_midxed_offset() about incremental MIDXs
In a similar fashion as in previous commits, teach the function
`nth_midxed_offset()` about incremental MIDXs.

The given object `pos` is used to find the containing MIDX, and
translated back into a MIDX-local position by assigning the return value
of `midx_for_object()` to it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
88f309e095 midx: teach bsearch_midx() about incremental MIDXs
Now that the special cases callers of `bsearch_midx()` have been dealt
with, teach `bsearch_midx()` to handle incremental MIDX chains.

The incremental MIDX-aware version of `bsearch_midx()` works by
repeatedly searching for a given OID in each layer along the
`->base_midx` pointer, stopping either when an exact match is found, or
the end of the chain is reached.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
3f5f1cff92 midx: introduce bsearch_one_midx()
The `bsearch_midx()` function will be extended in a following commit to
search for the location of a given object ID across all MIDXs in a chain
(or the single non-chain MIDX if no chain is available).

While most callers will naturally want to use the updated
`bsearch_midx()` function, there are a handful of special cases that
will want finer control and will only want to search through a single
MIDX.

For instance, the object abbreviation code, which cares about object IDs
near to where we'd expect to find a match in a MIDX. In that case, we
want to look at the nearby matches in each layer of the MIDX chain, not
just a single one).

Split the more fine-grained control out into a separate function called
`bsearch_one_midx()` which searches only a single MIDX.

At present both `bsearch_midx()` and `bsearch_one_midx()` have identical
behavior, but the following commit will rewrite the former to be aware
of incremental MIDXs for the remaining non-special case callers.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
60750e1eb9 midx: teach nth_bitmapped_pack() about incremental MIDXs
In a similar fashion as in previous commits, teach the function
`nth_bitmapped_pack()` about incremental MIDXs by translating the given
`pack_int_id` from the concatenated lexical order to a MIDX-local
lexical position.

When accessing the containing MIDX's array of packs, use the local pack
ID. Likewise, when reading the 'BTMP' chunk, use the MIDX-local offset
when accessing the data within that chunk.

(Note that the both the call to prepare_midx_pack() and the assignment
of bp->pack_int_id both care about the global pack_int_id, so avoid
shadowing the given 'pack_int_id' parameter).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
26afb5afa1 midx: teach nth_midxed_object_oid() about incremental MIDXs
The function `nth_midxed_object_oid()` returns the object ID for a given
object position in the MIDX lexicographic order.

Teach this function to instead operate over the concatenated
lexicographic order defined in an earlier step so that it is able to be
used with incremental MIDXs.

To do this, we need to both (a) adjust the bounds check for the given
'n', as well as record the MIDX-local position after chasing the
`->base_midx` pointer to find the MIDX which contains that object.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
1820bd878c midx: teach prepare_midx_pack() about incremental MIDXs
The function `prepare_midx_pack()` is part of the midx.h API and
loads the pack identified by the MIDX-local 'pack_int_id'. This patch
prepares that function to be aware of an incremental MIDX world.

To do this, introduce the second of the two general purpose helpers
mentioned in the previous commit. This commit introduces
`midx_for_pack()`, which is the pack-specific analog of
`midx_for_object()`, and works in the same fashion.

Like `midx_for_object()`, this function chases down the '->base_midx'
field until it finds the MIDX layer within the chain that contains the
given pack.

Use this function within `prepare_midx_pack()` so that the `pack_int_id`
it expects is now relative to the entire MIDX chain, and that it
prepares the given pack in the appropriate MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
19419821ba midx: teach nth_midxed_pack_int_id() about incremental MIDXs
The function `nth_midxed_pack_int_id()` takes in a object position in
MIDX lexicographic order and returns an identifier of the pack from
which that object was selected in the MIDX.

Currently, the given object position is an index into the lexicographic
order of objects in a single MIDX. Change this position to instead refer
into the concatenated lexicographic order of all MIDXs in a MIDX chain.

This has two visible effects within the implementation of
`prepare_midx_pack()`:

  - First, the given position is now an index into the concatenated
    lexicographic order of all MIDXs in the order in which they appear
    in the MIDX chain.

  - Second the pack ID returned from this function is now also in the
    concatenated order of packs among all layers of the MIDX chain in
    the same order that they appear in the MIDX chain.

To do this, introduce the first of two general purpose helpers, this one
being `midx_for_object()`. `midx_for_object()` takes a double pointer to
a `struct multi_pack_index` as well as an object `pos` in terms of the
entire MIDX chain[^1].

The function chases down the '->base_midx' field until it finds the MIDX
layer within the chain that contains the given object. It then:

  - modifies the double pointer to point to the containing MIDX, instead
    of the tip of the chain, and

  - returns the MIDX-local position[^2] at which the given object can be
    found.

Use this function within `nth_midxed_pack_int_id()` so that the `pos` it
expects is now relative to the entire MIDX chain, and that it returns
the appropriate pack position for that object.

[^1]: As a reminder, this means that the object is identified among the
  objects contained in all layers of the incremental MIDX chain, not any
  particular layer. For example, consider MIDX chain with two individual
  MIDXs, one with 4 objects and another with 3 objects. If the MIDX with
  4 objects appears earlier in the chain, then asking for object 6 would
  return the second object in the MIDX with 3 objects.

[^2]: Building on the previous example, asking for object 6 in a MIDX
  chain with (4, 3) objects, respectively, this would set the double
  pointer to point at the MIDX containing three objects, and would
  return an index to the second object within that MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
2678a73009 midx: add new fields for incremental MIDX chains
The incremental MIDX chain feature is designed around the idea of
indexing into a concatenated lexicographic ordering of object IDs
present in the MIDX.

When given an object position, the MIDX machinery needs to be able to
locate both (a) which MIDX layer contains the given object, and (b) at
what position *within that MIDX layer* that object appears.

To do this, three new fields are added to the `struct multi_pack_index`:

  - struct multi_pack_index *base_midx;
  - uint32_t num_objects_in_base;
  - uint32_t num_packs_in_base;

These three fields store the pieces of information suggested by their
respective field names. In turn, the `num_objects_in_base` and
`num_packs_in_base` fields are used to crawl backwards along the
`base_midx` pointer to locate the appropriate position for a given
object within the MIDX that contains it.

The following commits will update various parts of the MIDX machinery
(as well as their callers from outside of midx.c and midx-write.c) to be
aware and make use of these fields when performing object lookups.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
6eb1a7d7b0 Documentation: describe incremental MIDX format
Prepare to implement incremental multi-pack indexes (MIDXs) over the
next several commits by first describing the relevant prerequisites
(like a new chunk in the MIDX format, the directory structure for
incremental MIDXs, etc.)

The format is described in detail in the patch contents below, but the
high-level description is as follows.

Incremental MIDXs live in $GIT_DIR/objects/pack/multi-pack-index.d, and
each `*.midx` within that directory has a single "parent" MIDX, which is
the MIDX layer immediately before it in the MIDX chain. The chain order
resides in a file 'multi-pack-index-chain' in the same directory.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:35 -07:00
6caa96c204 t3206: test_when_finished before dirtying operations, not after
Many existing tests in this script perform operation(s) and then use
test_when_finished to define how to undo the effect of the
operation(s).

This is backwards.  When your operation(s) fail before you manage to
successfully call test_when_finished (remember, that these commands
must be all &&-chained, so a failure of an earlier operation mean
your test_when_finished may not be executed at all).  You must
establish how to clean up your mess with test_when_finished before
you create the mess to be cleaned up.

Also make sure that the body of test_when_finished deals with case
where the cruft it wants to remove failed to be created, by using
"rm -f" (instead of "rm") to remove potential cruft files, and
having "|| :" after "git notes remove" to remove potential cruft
notes---both of these by default fail when asked to remove something
that does not exist, instead of being silently idempotent no-ops.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 10:05:05 -07:00
3469a23659 t: port helper/test-hashmap.c to unit-tests/t-hashmap.c
helper/test-hashmap.c along with t0011-hashmap.sh test the hashmap.h
library. Migrate them to the unit testing framework for better
debugging, runtime performance and concise code.

Along with the migration, make 'add' tests from the shell script order
agnostic in unit tests, since they iterate over entries with the same
keys and we do not guarantee the order. This was already done for the
'iterate' tests[1].

The helper/test-hashmap.c is still not removed because it contains a
performance test meant to be run by the user directly (not used in
t/perf). And it makes sense for such a utility to be a helper.

[1]: e1e7a77141 (t: sort output of hashmap iteration, 2019-07-30)

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Josh Steadmon <steadmon@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 09:25:54 -07:00
ac91586ae5 t/t7704-repack-cruft.sh: avoid failures during long-running tests
On systems where running t7704.09 takes longer than 10 seconds, the test
can fail.

The test works by doing the following:

  - First write three unreachable objects, backdating the mtime for a
    single object ($foo) which we expect to prune.

  - Repack the repository into a pack containing reachable objects, and
    another three cruft packs, each containing one of the objects
    written in the previous step.

  - Backdate the mtimes of the cruft pack *.mtimes files themselves.
    (Note that this does not affect what is pruned further down in the
    test, but is done to ensure that the cruft packs are rewritten
    during that step).

  - Then repack with --cruft-expiration=10.seconds.ago, expecting to
    prune one of the three unreachable objects written in the first
    step.

  - Assert that the surviving cruft packs were rewritten, object $foo is
    pruned, and unreachable objects $bar, and $baz remain in the
    repository.

If longer than 10 seconds pass between writing the three unreachable
objects (the first step) and the "git repack --cruft" (the fourth step),
we will mistakenly prune more objects than expected, causing the test to
fail.

The $foo object which we expect to prune has its mtime set back to
10,000 seconds relative to the current time, but we prune it with a
cutoff of 10.seconds.ago.

Instead, set the cutoff to be 1,000 seconds to give the test much longer
time to run without failing. This helps platforms where running
individual tests can perform slowly, on my machine this test runs much
more quickly:

    $ hyperfine './t7704-repack-cruft.sh --run=9'
    Benchmark 1: ./t7704-repack-cruft.sh --run=9
      Time (mean ± σ):     647.4 ms ±  30.7 ms    [User: 528.5 ms, System: 124.1 ms]
      Range (min … max):   594.1 ms … 696.5 ms    10 runs

Reported-by: Randall Becker <randall.becker@nexbridge.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 12:44:54 -07:00
ec60bb9fc4 t6421: fix test to work when repo dir contains d0
The `grep` statement in this test looks for `d0.*<string>`, attempting
to filter to only show lines that had tabular output where the 2nd
column had `d0` and the final column had a substring of
[`git -c `]`fetch.negotiationAlgorithm`. These lines also have
`child_start` in the 4th column, but this isn't part of the condition.

A subsequent line will have `d1` in the 2nd column, `start` in the 4th
column, and `/path/to/git/git -c fetch.negotiationAlgorihm` in the final
column. If `/path/to/git/git` contains the substring `d0`, then this
line is included by `grep` as well as the desired line, leading to an
effective doubling of the number of lines, and test failures.

Tighten the grep expression to require `d0` to be surrounded by spaces,
and to have the `child_start` label.

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 10:59:21 -07:00
b928d57ca9 set errno=0 before strtoX calls
To detect conversion failure after calls to functions like `strtod`, one
can check `errno == ERANGE`. These functions are not guaranteed to set
`errno` to `0` on successful conversion, however. Manual manipulation of
`errno` can likely be avoided by checking that the output pointer
differs from the input pointer, but that's not how other locations, such
as parse.c:139, handle this issue; they set errno to 0 prior to
executing the function.

For every place I could find a strtoX function with an ERANGE check
following it, set `errno = 0;` prior to executing the conversion
function.

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 10:59:20 -07:00
0c4d5aa22d log-tree: use decimal_width()
Reduce code duplication by calling decimal_width() to count the digits
in the number of commits instead of calculating it locally.

It also has the advantage of returning int, which is the exact type
expected by the printf()-like function strbuf_addf() for field width
arguments.

Additionally, decimal_width() supports numbers bigger than 1410065407,
which is (hopefully) just a theoretical advantage.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 08:59:40 -07:00
e2e373ba82 refs/files: prevent memory leak by freeing packed_ref_store
This complements 64a6dd8ffc (refs: implement removal of ref storages,
2024-06-06).

Signed-off-by: Sven Strickroth <email@cs-ware.de>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 08:58:41 -07:00
e95d515141 apply: canonicalize modes read from patches
Git stores only canonical modes for blobs. So for a regular file, we
care about only "100644" or "100755" (depending only on the executable
bit), but never modes where the group or other permissions are more
exotic. So never "100664", "100700", etc. When a file in the working
tree has such a mode, we quietly turn it into one of the two canonical
modes, and that's what is stored both in the index and in tree objects.

However, we don't canonicalize modes we read from incoming patches in
git-apply. These may appear in a few lines:

  - "old mode" / "new mode" lines for mode changes

  - "new file mode" lines for newly created files

  - "deleted file mode" for removing files

For "new mode" and for "new file mode", this is harmless. The patch is
asking the result to have a certain mode, but:

  - when we add an index entry (for --index or --cached), it is
    canonicalized as we create the entry, via create_ce_mode().

  - for a working tree file, try_create_file() passes either 0777 or
    0666 to open(), so what you get depends only on your umask, not any
    other bits (aside from the executable bit) in the original mode.

However, for "old mode" and "deleted file mode", there is a minor
annoyance. We compare the patch's expected preimage mode with the
current state. But that current state is always going to be a canonical
mode itself:

  - updating an index entry via --cached will have the canonical mode in
    the index

  - for updating a working tree file, check_preimage() runs the mode
    through ce_mode_from_stat(), which does the usual canonicalization

So if the patch feeds a non-canonical mode, it's impossible for it to
match, and we will always complain with something like:

  file has type 100644, expected 100664

Since this is just a warning, the operation proceeds, but it's
confusing and annoying.

These cases should be pretty rare in practice. Git would never produce a
patch with non-canonical modes itself (since it doesn't store them).
And while we do accept patches from other programs, all of those lines
were invented by Git. So you'd need a program trying to be Git
compatible, but not handling canonicalization the same way. Reportedly
"quilt" is such a program.

We should canonicalize the modes as we read them so that the user never
sees the useless warning.

A few notes on the tests:

  - I've covered instances of all lines for completeness, even though
    the "new mode" / "new file mode" ones behave OK currently.

  - the tests apply patches to both the index and working tree, and
    check the result of both. Again, we know that all of these paths
    canonicalize anyway, but it's giving us extra coverage (although we
    are even less likely to have such a bug now since we canonicalize up
    front).

  - the test patches are missing "index" lines, which is also something
    Git would never produce. But they don't matter for the test, they do
    match the case from quilt we saw in the wild, and they avoid some
    sha1/sha256 complexity.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 08:50:43 -07:00
3a498b49d1 t-reftable-tree: improve the test for infix_walk()
In the current testing setup for infix_walk(), the following
properties of an infix traversal of a tree remain untested:
- every node of the tree must be visited
- every node must be visited exactly once
In fact, only the property 'traversal in increasing order' is tested.
Modify test_infix_walk() to check for all the properties above.

This can be achieved by storing the nodes' keys linearly, in a nullified
buffer, as we visit them and then checking the input keys against this
buffer in increasing order. By checking that the element just after
the last input key is 'NULL' in the output buffer, we ensure that
every node is traversed exactly once.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-04 09:50:27 -07:00
c70022c1b9 t-reftable-tree: add test for non-existent key
In the current testing setup for tree_search(), the case for
non-existent key is not exercised. Improve this by adding a
test-case for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-04 09:50:27 -07:00
abf1a96773 t-reftable-tree: split test_tree() into two sub-test functions
In the current testing setup, tests for both tree_search() and
infix_walk() defined by reftable/tree.{c, h} are performed by
a single test function, test_tree(). Split tree_test() into
test_tree_search() and test_infix_walk() responsible for
independently testing tree_search() and infix_walk() respectively.
This improves the overall readability of the test file as well as
simplifies debugging.

Note that the last parameter in the tree_search() functiom is
'int insert' which when set, inserts the key if it is not found
in the tree. Otherwise, the function returns NULL for such cases.

While at it, use 'func' to pass function pointers and not '&func'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-04 09:50:26 -07:00
ec9c0704fc t: move reftable/tree_test.c to the unit testing framework
reftable/tree_test.c exercises the functions defined in
reftable/tree.{c, h}. Migrate reftable/tree_test.c to the unit
testing framework. Migration involves refactoring the tests to use
the unit testing framework instead of reftable's test framework and
renaming the tests to align with unit-tests' standards.

Also add a comment to help understand the test routine.

Note that this commit mostly moves the test from reftable/ to
t/unit-tests/ and most of the refactoring is performed by the
trailing commits.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-04 09:50:26 -07:00
e5a0f7076f reftable: remove unnecessary curly braces in reftable/tree.c
According to Documentation/CodingGuidelines, single-line control-flow
statements must omit curly braces (except for some special cases).
Make reftable/tree.c adhere to this guideline.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-04 09:50:18 -07:00
f823de75a1 git-gui: Remove forced rescan of stat-dirty files.
It is possible that stat information of tracked files is modified without
actually modifying the content. Plumbing commands would detect such files
as modified, so that Git GUI runs `git update-info --refresh` in order to
synchronize the cached stat info with the reality. However, this can be
an expensive operation in large repositories. As remediation,
e534f3a886 (git-gui: Allow the user to disable update-index --refresh
during rescan, 2006-11-07) introduced an option to skip the expensive
part.

The option was named "trust file modification timestamp". But the catch
is that sometimes file timestamps can't be trusted. In this case, a file
would remain listed in Unstaged Changes although there are no changes.
So 16403d0b1f (git-gui: Refresh a file if it has an empty diff,
2006-11-11) introduced a popup message informing the user about the
situation and then removed the file from the Unstaged Changes list.

Now users had to click away the message box for every file that was
stat-dirty. Under the assumption that a file in such a state is not
the only one, 124355d32c (git-gui: Always start a rescan on an empty
diff, 2007-01-22) introduced a forced (potentially expensive) refresh
that would de-list all stat-dirty files after the first notification was
dismissed.

Along came 6c510bee20 (Lazy man's auto-CRLF, 2007-02-13) in Git. It
introduced a new case where a file in the worktree can have no essential
differences to the staged version, but still be detected as modified by
plumbing commands. This time, however, the index cannot be synchronized
fully by `git update-index --refresh`, so that the file remains listed
in Unstaged Changes until it is staged manually.

Needless to say that the message box now becomes an annoyance, because
it must be dismissed every time an affected file is selected, and the
file remains listed nevertheless.

Remove the message box. Write the notice that no differences were found
in the diff panel instead. Also include a link that, when clicked,
initiates the rescan. With this scheme, the rescan does not happen
automatically anymore, but requires an additional click. (This is now
two clicks in total for users who encounter stat-dirty files after
enabling the "trust file modification timestamps" option.) However,
users whom the rescan does not help (autocrlf-related dirty files) save
half the clicks because there is no message box to dismiss.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-08-03 18:56:35 +02:00
d53db106e0 Documentation: add platform support policy
Supporting many platforms is only possible when we have the right tools to
ensure that support.

Teach platform maintainers how they can help us to help them, by
explaining what kind of tooling support we would like to have, and what
level of support becomes available as a result. Provide examples so that
platform maintainers can see what we're asking for in practice.

With this policy in place, we can make changes with stronger assurance
that we are not breaking anybody we promised not to. Instead, we can
feel confident that our existing testing and integration practices
protect those who care from breakage.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-02 16:27:15 -07:00
91d351ec88 refs: drop ref_store-less functions
In c8f815c208 (refs: remove functions without ref store, 2024-05-07), we
have removed functions of the refs subsystem that do not take a ref
store as input parameter. In order to make it easier for folks to figure
out how to replace calls to such functions in in-flight patch series, we
kept their definitions around in an ifdeffed block.

Now that Git v2.46 is out, it is rather unlikely that anybody still has
references to these old functions in their unreleased patches. Let's
thus drop them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-02 08:54:32 -07:00
0ca365c2ed http: do not ignore proxy path
The documentation for `http.proxy` describes that option, and the
environment variables it overrides, as supporting "the syntax understood
by curl". curl allows SOCKS proxies to use a path to a Unix domain
socket, like `socks5h://localhost/path/to/socket.sock`. Git should
therefore include, if present, the path part of the proxy URL in what it
passes to libcurl.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-02 08:30:08 -07:00
9e89dcb66a builtin/ls-remote: fall back to SHA1 outside of a repo
In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have stopped setting the default hash algorithm for
`the_repository`. Consequently, code that relies on `the_hash_algo` will
now crash when it hasn't explicitly been initialized, which may be the
case when running outside of a Git repository.

It was reported that git-ls-remote(1) may crash in such a way when using
a remote helper that advertises refspecs. This is because the refspec
announced by the helper will get parsed during capability negotiation.
At that point we haven't yet figured out what object format the remote
uses though, so when run outside of a repository then we will fail.

The course of action is somewhat dubious in the first place. Ideally, we
should only parse object IDs once we have asked the remote helper for
the object format. And if the helper didn't announce the "object-format"
capability, then we should always assume SHA256. But instead, we used to
take either SHA1 if there was no repository, or we used the hash of the
local repository, which is wrong.

Arguably though, crashing hard may not be in the best interest of our
users, either. So while the old behaviour was buggy, let's restore it
for now as a short-term fix. We should eventually revisit, potentially
by deferring the point in time when we parse the refspec until after we
have figured out the remote's object hash.

Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-02 08:24:05 -07:00
7c7516b8db t0018: remove leftover debugging cruft
The actual file is copied out to /tmp, presumably so that the tester
can inspect it after the test is done, which may have been a useful
debugging aid.

But in the final shape of the test suite, such a code should not
exist.  We cannot even assume that we are allowed to write into /tmp
(our TMPDIR may not even be pointing at it) or read from it for that
matter.

Noticed-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 11:52:11 -07:00
615d2de3b4 config.c: avoid segfault with --fixed-value and valueless config
When using `--fixed-value` with a key whose value is left empty (implied
as being "true"), 'git config' may crash when invoked like either of:

    $ git config set --file=config --value=value --fixed-value \
        section.key pattern
    $ git config --file=config --fixed-value section.key value pattern

The original bugreport[1] bisects to 00bbdde141 (builtin/config:
introduce "set" subcommand, 2024-05-06), which is a red-herring, since
the original bugreport uses the new 'git config set' invocation.

The behavior likely bisects back to c90702a1f6 (config: plumb
--fixed-value into config API, 2020-11-25), which introduces the new
--fixed-value option in the first place.

Looking at the relevant frame from a failed process's coredump, the
crash appears in config.c::matches() like so:

    (gdb) up
    #1  0x000055b3e8b06022 in matches (key=0x55b3ea894360 "section.key", value=0x0,
        store=0x7ffe99076eb0) at config.c:2884
    2884			return !strcmp(store->fixed_value, value);

where we are trying to compare the `--fixed-value` argument to `value`,
which is NULL.

Avoid attempting to match `--fixed-value` for configuration keys with no
explicit value. A future patch could consider the empty value to mean
"true", "yes", "on", etc. when invoked with `--type=bool`, but let's
punt on that for now in the name of avoiding the segfault.

[1]: https://lore.kernel.org/git/CANrWfmTek1xErBLrnoyhHN+gWU+rw14y6SQ+abZyzGoaBjmiKA@mail.gmail.com/

Reported-by: Han Jiang <jhcarl0814@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 10:48:15 -07:00
406f326d27 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 10:18:13 -07:00
363337e6eb Merge branch 'as/show-ref-option-help-update'
A few descriptions in "git show-ref -h" have been clarified.

* as/show-ref-option-help-update:
  show-ref: improve short help messages of options
2024-08-01 10:18:12 -07:00
f08cd19dca Merge branch 'jc/doc-reviewing-guidelines-positive-reviews'
The reviewing guidelines document now explicitly encourages people
to give positive reviews and how.

* jc/doc-reviewing-guidelines-positive-reviews:
  ReviewingGuidelines: encourage positive reviews more
2024-08-01 10:18:12 -07:00
5617a8eee8 Merge branch 'jc/doc-rebase-fuzz-vs-offset-fix'
"git rebase --help" referred to "offset" (the difference between
the location a change was taken from and the change gets replaced)
incorrectly and called it "fuzz", which has been corrected.

* jc/doc-rebase-fuzz-vs-offset-fix:
  doc: difference in location to apply is "offset", not "fuzz"
2024-08-01 10:18:11 -07:00
0dc84a806c t-reftable-pq: add tests for merged_iter_pqueue_top()
merged_iter_pqueue_top() as defined by reftable/pq.{c, h} returns
the element at the top of a priority-queue's heap without removing
it. Since there are no tests for this function in the existing
setup, add tests for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:29 -07:00
c2f831cdfc t-reftable-pq: add test for index based comparison
When comparing two entries, the priority queue as defined by
reftable/pq.{c, h} first compares the entries on the basis of
their ref-record's keys. If the keys turn out to be equal, the
comparison is then made on the basis of their update indices
(which are never equal).

In the current testing setup, only the case for comparison on
the basis of ref-record's keys is exercised. Add a test for
index-based comparison as well. Rename the existing test to
reflect its nature of only testing record-based comparison.

While at it, replace 'strbuf_detach' with 'xstrfmt' to assign
refnames in the existing test. This makes the test conciser.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:29 -07:00
b37b71b129 t-reftable-pq: make merged_iter_pqueue_check() callable by reference
merged_iter_pqueue_check() checks the validity of a priority queue
represented by a merged_iter_pqueue struct by asserting the
parent-child relation in the struct's heap. Explicity passing a
struct to this function means a copy of the entire struct is created,
which is inefficient.

Make the function accept a pointer to the struct instead. This is
safe to do since the function doesn't modify the struct in any way.
Make the function parameter 'const' to assert immutability.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:29 -07:00
2e707447e1 t-reftable-pq: make merged_iter_pqueue_check() static
merged_iter_pqueue_check() is a function previously defined in
reftable/pq_test.c (now t/unit-tests/t-reftable-pq.c) and used in
the testing of a priority queue as defined by reftable/pq.{c, h}.
As such, this function is only called by reftable/pq_test.c and it
makes little sense to expose it to non-testing code via reftable/pq.h.

Hence, make this function static and remove its prototype from
reftable/pq.h.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:29 -07:00
a08ea27cd0 t: move reftable/pq_test.c to the unit testing framework
reftable/pq_test.c exercises a priority queue defined by
reftable/pq.{c, h}. Migrate reftable/pq_test.c to the unit testing
framework. Migration involves refactoring the tests to use the unit
testing framework instead of reftable's test framework, and
renaming the tests to align with unit-tests' standards.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:29 -07:00
2a85906348 reftable: change the type of array indices to 'size_t' in reftable/pq.c
The variables 'i', 'j', 'k' and 'min' are used as indices for
'pq->heap', which is an array. Additionally, 'pq->len' is of
type 'size_t' and is often used to assign values to these
variables. Hence, change the type of these variables from 'int'
to 'size_t'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:28 -07:00
f1b60b7c66 reftable: remove unnecessary curly braces in reftable/pq.c
According to Documentation/CodingGuidelines, control-flow statements
with a single line as their body must omit curly braces. Make
reftable/pq.c conform to this guideline. Besides that, remove
unnecessary newlines and variable assignment.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:28 -07:00
b201316835 credential/osxkeychain: respect NUL terminator in username
This patch fixes a case where git-credential-osxkeychain might output
uninitialized bytes to stdout.

We need to get the username string from a system API using
CFStringGetCString(). To do that, we get the max size for the string
from CFStringGetMaximumSizeForEncoding(), allocate a buffer based on
that, and then read into it. But then we print the entire buffer to
stdout, including the trailing NUL and any extra bytes which were not
needed. Instead, we should stop at the NUL.

This code comes from 9abe31f5f1 (osxkeychain: replace deprecated
SecKeychain API, 2024-02-17). The bug was probably overlooked back then
because this code is only used as a fallback when we can't get the
string via CFStringGetCStringPtr(). According to Apple's documentation:

  Whether or not this function returns a valid pointer or NULL depends
  on many factors, all of which depend on how the string was created and
  its properties.

So it's not clear how we could make a test for this, and we'll have to
rely on manually testing on a system that triggered the bug in the first
place.

Reported-by: Hong Jiang <ilford@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Tested-by: Hong Jiang <ilford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:54:47 -07:00
f30bfafcd4 commit-reach: fix trivial memory leak when computing reachability
We don't free the local `stack` commit list that we use to compute
reachability of multiple commits at once. Do so.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:38 -07:00
9642479a2b convert: fix leaking config strings
In `read_convert_config()`, we end up reading some string values into
variables. We don't free any potentially-existing old values though,
which will result in a memory leak in case the same key has been defined
multiple times.

Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:38 -07:00
1f08999781 entry: fix leaking pathnames during delayed checkout
When filtering files during delayed checkout, we pass a string list to
`async_query_available_blobs()`. This list is initialized with NODUP,
and thus inserted strings will not be owned by the list. In the latter
function we then try to hand over ownership by passing an `xstrup()`'d
value to `string_list_insert()`. But this is not how this works: a NODUP
list does not take ownership of allocated strings and will never free
them for the caller.

Fix this issue by initializing the list as `DUP` instead and dropping
the explicit call to `xstrdup()`. This is okay to do given that this is
the single callsite of `async_query_available_blobs()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
57fb139b5e object-name: fix leaking commit list items
When calling `get_oid_oneline()`, we pass in a `struct commit_list` that
gets modified by the function. This creates a weird situation where the
commit list may sometimes be empty after returning, but sometimes it
will continue to carry additional commits. In those cases the remainder
of the list leaks.

Ultimately, the design where we only pass partial ownership to
`get_oid_oneline()` feels shoddy. Refactor the code such that we only
pass a constant pointer to the list, creating a local copy as needed.
Callers are thus always responsible for freeing the commit list, which
then allows us to plug a bunch of memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
11f841c1cc t/test-repository: fix leaking repository
The test-repository test helper zeroes out `the_repository` such that it
can be sure that our codebase only ends up using the supplied repository
that we initialize in the respective helper functions. This does cause
memory leaks though as the data that `the_repository` has been holding
onto is not referenced anymore.

Fix this by calling `repo_clear()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
145c979020 builtin/credential-cache: fix trivial leaks
There are two trivial leaks in git-credential-cache(1):

  - We leak the child process in `spawn_daemon()`. As we do not call
    `finish_command()` and instead let the created process daemonize, we
    have to clear the process manually.

  - We do not free the computed socket path in case it wasn't given via
    `--socket=`.

Plug both of these memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
cd6d7630fa builtin/worktree: fix leaking derived branch names
There are several heuristics that git-worktree(1) uses to derive the
name of the newly created branch when not given explicitly. These
heuristics all allocate a new string, but we only end up freeing that
string in a subset of cases.

Fix the remaining cases where we didn't yet free the derived branch
names. While at it, also free `opt_track`, which is being populated via
an `OPT_PASSTHRU()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
06da42beec builtin/shortlog: fix various trivial memory leaks
There is a trivial memory leak in git-shortlog(1). Fix it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
50ef4e09c3 builtin/rerere: fix various trivial memory leaks
There are multiple trivial memory leaks in git-rerere(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
1d615afa8d builtin/credential-store: fix leaking credential
We never free credentials read by the credential store, leading to a
memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
11d6a81c01 builtin/show-branch: fix several memory leaks
There are several memory leaks in git-show-branch(1). Fix them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
2d197e4a0f builtin/rev-parse: fix memory leak with --parseopt
The `--parseopt` mode allows shell scripts to have the same option
parsing mode as we have in C builtins. It soaks up a set of option
descriptions via stdin and massages them into proper `struct option`s
that we can then use to parse a set of arguments.

We only partially free those options when done though, creating a memory
leak. Interestingly, we only end up free'ing the first option's help,
which is of course wrong.

Fix this by freeing all option's help fields as well as their `argh`
fields to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
2e875b6cb4 builtin/stash: fix various trivial memory leaks
There are multiple trivial memory leaks in git-stash(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
fc68633352 builtin/remote: fix various trivial memory leaks
There are multiple trivial memory leaks in git-remote(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
e06c1d1640 builtin/remote: fix leaking strings in branch_list
The `struct string_list branch_list` is declared as `NODUP`, which makes
it not copy strings inserted into it. This causes memory leaks though,
as this means it also won't be responsible for _freeing_ inserted
strings. Thus, every branch we add to this will leak.

Fix this by marking the list as `DUP` instead and free the local copy we
have of the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
4119fc08e2 builtin/ls-remote: fix leaking pattern strings
Users can pass patterns to git-ls-remote(1), which allows them to filter
the list of printed references. We assemble those patterns into an array
and prefix them with "*/", but never free either the array nor the
allocated strings.

Refactor the code to use a `struct strvec` instead of manually tracking
the strings in an array. Like this, we can easily use `strvec_clear()`
to release both the vector and the contained string for us, plugging the
leak.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
6771e2012e builtin/submodule--helper: fix leaking buffer in is_tip_reachable
The `rev` buffer in `is_tip_reachable()` is being populated with the
output of git-rev-list(1) -- if either the command fails or the buffer
contains any data, then the input commit is not reachable.

The buffer isn't used for anything else, but neither do we free it,
causing a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
5535b3f3d3 builtin/submodule--helper: fix leaking clone depth parameter
The submodule helper supports a `--depth` parameter for both its "add"
and "clone" subcommands, which in both cases end up being forwarded to
git-clone(1). But while the former subcommand uses an `OPT_INTEGER()` to
parse the depth, the latter uses `OPT_STRING()`. Consequently, it is
possible to pass non-integer input to "--depth" when calling the "clone"
subcommand, where the value will then ultimately cause git-clone(1) to
bail out.

Besides the fact that the parameter verification should happen earlier,
the submodule helper infrastructure also internally tracks the depth via
a string. This requires us to convert the integer in the "add"
subcommand into an allocated string, and this string ultimately leaks.

Refactor the code to consistently track the clone depth as an integer.
This plugs the memory leak, simplifies the code and allows us to use
`OPT_INTEGER()` instead of `OPT_STRING()`, validating the input before
we shell out to git--clone(1).

Original-patch-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
ac3b143370 builtin/name-rev: fix various trivial memory leaks
There are several structures that we don't release after
`cmd_name_rev()` is done. Plug those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
ed041007f0 builtin/describe: fix trivial memory leak when describing blob
We never free the `struct strvec args` variable in `describe_blob()`,
which thus causes a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
5a1e1e5d40 builtin/describe: fix leaking array when running diff-index
When running git-describe(1) with `--dirty`, we will set up a `struct
rev_info` with arguments for git-diff-index(1). The way we assemble the
arguments it causes two memory leaks though:

  - We never release the `struct strvec`.

  - `setup_revisions()` may end up removing some entries from the
    `strvec`, which we wouldn't free even if we released the struct.

While we could plug those leaks, this is ultimately unnecessary as the
arguments we pass are part of a static array anyway. So instead,
refactor the code to drop the `struct strvec` and just pass this static
array directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
8e2e28799d builtin/describe: fix memory leak with --contains=
When calling `git describe --contains=`, we end up invoking
`cmd_name_rev()` with some munged argv array. This array may contain
allocated strings and furthermore will likely be modified by the called
function. This results in two memory leaks:

  - First, we leak the array that we use to assemble the arguments.

  - Second, we leak the allocated strings that we may have put into the
    array.

Fix those leaks by creating a separate copy of the array that we can
hand over to `cmd_name_rev()`. This allows us to free all strings
contained in the `strvec`, as the original vector will not be modified
anymore.

Furthermore, free both the `strvec` and the copied array to fix the
first memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
7935a02613 builtin/log: fix leaking branch name when creating cover letters
When calling `make_cover_letter()` without a branch name, we try to
derive the branch name by calling `find_branch_name()`. But while this
function returns an allocated string, we never free the result and thus
have a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
34968e56de builtin/replay: plug leaking advance_name variable
The `advance_name` variable can either contain a static string when
parsed via the `--advance` command line option or it may be an allocated
string when set via `determine_replay_mode()`. Because we cannot be sure
whether it is allocated or not we just didn't free it at all, resulting
in a memory leak.

Split up the variables such that we can track the static and allocated
strings separately and then free the allocated one to fix the memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:34 -07:00
891ee3b9db Start the 2.47 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-31 13:34:21 -07:00
3ff9ceca89 Merge branch 'jc/how-to-maintain-updates'
Doc update.

* jc/how-to-maintain-updates:
  howto-maintain: update daily tasks
  howto-maintain: cover a whole development cycle
2024-07-31 13:34:21 -07:00
d18eb5ba79 Merge branch 'tn/doc-commit-fix'
Docfix.

* tn/doc-commit-fix:
  doc: remove dangling closing parenthesis
2024-07-31 13:34:20 -07:00
ca9221c17d Merge branch 'jc/doc-one-shot-export-with-shell-func'
It has been documented that we avoid "VAR=VAL shell_func" and why.

* jc/doc-one-shot-export-with-shell-func:
  CodingGuidelines: document a shell that "fails" "VAR=VAL shell_func"
2024-07-31 13:34:20 -07:00
6c70d65712 Merge branch 'cp/unit-test-reftable-merged'
Another reftable test has been ported to use the unit test framework.

* cp/unit-test-reftable-merged:
  t-reftable-merged: add test for REFTABLE_FORMAT_ERROR
  t-reftable-merged: use reftable_ref_record_equal to compare ref records
  t-reftable-merged: add tests for reftable_merged_table_max_update_index
  t-reftable-merged: improve the const-correctness of helper functions
  t-reftable-merged: improve the test t_merged_single_record()
  t: harmonize t-reftable-merged.c with coding guidelines
  t: move reftable/merged_test.c to the unit testing framework
2024-07-31 13:34:19 -07:00
468ebc52f3 Merge branch 'kn/ci-clang-format'
A CI job that use clang-format to check coding style issues in new
code has been added.

* kn/ci-clang-format:
  ci/style-check: add `RemoveBracesLLVM` in CI job
  check-whitespace: detect if no base_commit is provided
  ci: run style check on GitHub and GitLab
  clang-format: formalize some of the spacing rules
  clang-format: avoid spacing around bitfield colon
  clang-format: indent preprocessor directives after hash
2024-07-31 13:34:18 -07:00
90139ae377 Merge branch 'jc/checkout-no-op-switch-errors'
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users.  It now says that the
user needs to give pathspec to specify what paths to checkout.

* jc/checkout-no-op-switch-errors:
  checkout: special case error messages during noop switching
2024-07-31 13:34:18 -07:00
d71121c060 Merge branch 'pw/add-patch-with-suppress-blank-empty'
"git add -p" by users with diff.suppressBlankEmpty set to true
failed to parse the patch that represents an unmodified empty line
with an empty line (not a line with a single space on it), which
has been corrected.

* pw/add-patch-with-suppress-blank-empty:
  add-patch: use normalize_marker() when recounting edited hunk
  add-patch: handle splitting hunks with diff.suppressBlankEmpty
2024-07-31 13:34:17 -07:00
2794ac123d Merge branch 'rj/make-cleanup'
A build tweak knob has been simplified by not setting the value
that is already the default; another unused one has been removed.

* rj/make-cleanup:
  config.mak.uname: remove unused uname_P variable
  Makefile: drop -Wno-universal-initializer from SP_EXTRA_FLAGS
2024-07-31 13:34:17 -07:00
f31e901332 Merge branch 'jt/doc-post-receive-hook-update'
Doc update.

* jt/doc-post-receive-hook-update:
  doc: clarify post-receive hook behavior
2024-07-31 13:34:16 -07:00
f084c50de6 Merge branch 'ad/merge-with-diff-algorithm'
Many Porcelain commands that internally use the merge machinery
were taught to consistently honor the diff.algorithm configuration.

* ad/merge-with-diff-algorithm:
  merge-recursive: honor diff.algorithm
2024-07-31 13:34:16 -07:00
6a52f307af Merge branch 'rs/t-strvec-use-test-msg'
Unit test clean-up.

* rs/t-strvec-use-test-msg:
  t-strvec: fix type mismatch in check_strvec
  t-strvec: improve check_strvec() output
  t-strvec: use test_msg()
2024-07-31 13:34:15 -07:00
63ee933383 t98xx: mark Perforce tests as memory-leak free
All the Perforce tests are free of memory leaks. This went unnoticed
because most folks do not have p4 and p4d installed on their computers.
Consequently, given that the prerequisites for running those tests
aren't fulfilled, `TEST_PASSES_SANITIZE_LEAK=check` won't notice that
those tests are indeed memory leak free.

Mark those tests accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-31 10:05:18 -07:00
d707d23d2c ci: update Perforce version to r23.2
Update our Perforce version from r21.2 to r23.2. Note that the updated
version is not the newest version. Instead, it is the last version where
the way that Perforce is being distributed remains the same as in r21.2.
Newer releases stopped distributing p4 and p4d executables as well as
the macOS archives directly and would thus require more work.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-31 10:05:18 -07:00
49f4fd901a t98xx: fix Perforce tests with p4d r23 and newer
Some of the tests in t98xx modify the Perforce depot in ways that the
tool wouldn't normally allow. This is done to test behaviour of git-p4
in certain edge cases that we have observed in the wild, but which
should in theory not be possible.

Naturally, modifying the depot on disk directly is quite intimate with
the tool and thus prone to breakage when Perforce updates the way that
data is stored. And indeed, those tests are broken nowadays with r23 of
Perforce. While a file revision was previously stored as a plain file
"depot/file,v", it is now stored in a directory "depot/file,d" with
compression.

Adapt those tests to handle both old- and new-style depot layouts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-31 10:05:18 -07:00
63ad8dbf16 convert: return early when not tracing
When Git adds a file requiring encoding conversion and tracing of encoding
conversion is not requested via the GIT_TRACE_WORKING_TREE_ENCODING
environment variable, the `trace_encoding()` function still allocates &
prepares "human readable" copies of the file contents before and after
conversion to show in the trace. This results in a high memory footprint
and increased runtime without providing any user-visible benefit.

This fix introduces an early exit from the `trace_encoding()` function
when tracing is not requested, preventing unnecessary memory allocation
and processing.

Signed-off-by: D Harithamma <harithamma.d@ibm.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-31 08:54:13 -07:00
6cda597283 Documentation: consistently use spaces inside initializers
Our coding guide is inconsistent with how it uses spaces inside of
initializers (`struct foo bar = { something }`). While we mostly carry
the space between open and closing braces and the initialized members,
in one case we don't.

Fix this one instance such that we consistently carry the space. This is
also consistent with how clang-format formats such initializers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:50:25 -07:00
10f0723c8d Documentation: document idiomatic function names
We semi-regularly have discussions around whether a function shall be
named `S_release()`, `S_clear()` or `S_free()`. Indeed, it may not be
obvious which of these is preferable as we never really defined what
each of these variants means exactly.

Carve out a space where we can add idiomatic names for common functions
in our coding guidelines and define each of those functions. Like this,
we can get to a shared understanding of their respective semantics and
can easily point towards our style guide in future discussions such that
our codebase becomes more consistent over time.

Note that the intent is not to rename all functions which violate these
semantics right away. Rather, the intent is to slowly converge towards a
common style over time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:50:25 -07:00
541204aabe Documentation: document naming schema for structs and their functions
We nowadays have a proper mishmash of struct-related functions that are
called `<verb>_<struct>` (e.g. `clear_prio_queue()`) versus functions
that are called `<struct>_<verb>` (e.g. `strbuf_clear()`). While the
former style may be easier to tie into a spoken conversation, most of
our communication happens in text anyway. Furthermore, prefixing
functions with the name of the structure they operate on makes it way
easier to group them together, see which functions are related, and will
also help folks who are using code completion.

Let's thus settle on one style, namely the one where functions start
with the name of the structure they operate on.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:50:25 -07:00
7df3f55b92 Documentation: clarify indentation style for C preprocessor directives
In the preceding commit, we have settled on using a single space per
nesting level to indent preprocessor directives. Clarify our coding
guidelines accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:50:25 -07:00
395726717b clang-format: fix indentation width for preprocessor directives
In [1], we have improved our clang-format configuration to also specify
the style for how to indent preprocessor directives. But while we have
settled the question of where to put the indentation, either before or
after the hash sign, we didn't specify exactly how to indent.

With the current configuration, clang-format uses tabs to indent each
level of nested preprocessor directives, which is in fact unintentional
and never done in our codebase. Instead, we use a mixture of indenting
by either one or two spaces, where using a single space is somewhat more
common.

Adapt our clang-format configuration accordingly by specifying an
indentation width of one space.

[1]: <20240708092317.267915-1-karthik.188@gmail.com>

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:50:25 -07:00
b4e8a8c163 Merge branch 'kn/ci-clang-format' into ps/doc-more-c-coding-guidelines
* kn/ci-clang-format:
  ci/style-check: add `RemoveBracesLLVM` in CI job
  check-whitespace: detect if no base_commit is provided
  ci: run style check on GitHub and GitLab
  clang-format: formalize some of the spacing rules
  clang-format: avoid spacing around bitfield colon
  clang-format: indent preprocessor directives after hash
2024-07-30 13:47:26 -07:00
9d36dbd1ff refs/reftable: stop using the_repository
Convert the reftable ref backend to stop using `the_repository` in favor
of the repo that gets passed in via `struct ref_store`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:41:24 -07:00
79e54c6a4e refs/packed: stop using the_repository
Convert the packed ref backend to stop using `the_repository` in favor
of the repo that gets passed in via `struct ref_store`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:41:24 -07:00
a6ebc2c6d1 refs/files: stop using the_repository
Convert the files ref backend to stop using `the_repository` in favor of
the repo that gets passed in via `struct ref_store`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:41:23 -07:00
080b068ffb refs/files: stop using the_repository in parse_loose_ref_contents()
We implicitly rely on `the_repository` in `parse_loose_ref_contents()`
by calling `parse_oid_hex()`. Convert the function to instead use
`parse_oid_hex_algop()` and have callers pass in the hash algorithm to
use.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:41:23 -07:00
f777f4d884 refs: stop using the_repository
Convert "refs.c" to stop using `the_repository` in favor of the repo
that gets passed in via `struct ref_store`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:41:23 -07:00
9ddec6b79a t-strvec: use if_test
The macro TEST takes a single expression.  If a test requires multiple
statements then they need to be placed in a function that's called in
the TEST expression.

Remove the cognitive overhead of defining and calling single-use
functions by using if_test instead.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:44 -07:00
2c4a6a8d9c t-reftable-basics: use if_test
The macro TEST takes a single expression.  If a test requires multiple
statements then they need to be placed in a function that's called in
the TEST expression.

Remove the overhead of defining and calling single-use functions by
using if_test instead.

Run the tests in the order of definition.  We can reorder them like that
because they are independent.  Technically this changes the output, but
retains the meaning of a full run and allows for easier review e.g. with
diff option --ignore-all-space.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:44 -07:00
e51d7ef940 t-ctype: use if_test
Use the documented macro if_test instead of the internal functions
test__run_begin() and test__run_end(), which are supposed to be private
to the unit test framework.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:43 -07:00
96c6304c18 unit-tests: add if_test
The macro TEST only allows defining a test that consists of a single
expression.  Add a new macro, if_test, which provides a way to define
unit tests that are made up of one or more statements.

if_test allows defining self-contained tests en bloc, a bit like
test_expect_success does for regular tests.  It acts like a conditional;
the test body is executed if test_skip_all() had not been called before.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:43 -07:00
1f452d6c68 unit-tests: show location of checks outside of tests
Checks outside of tests are caught at runtime and reported like this:

 Assertion failed: (ctx.running), function test_assert, file test-lib.c, line 267.

The assert() call aborts the unit test and doesn't reveal the location
or even the type of the offending check, as test_assert() is called by
all of them.

Handle it like the opposite case, a test without any checks: Don't
abort, but report the location of the actual check, along with a message
explaining the situation.  The output for example above becomes:

 # BUG: check outside of test at t/helper/test-example-tap.c:75

... and the unit test program continues and indicates the error in its
exit code at the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:43 -07:00
4575ba6a7c t0080: use here-doc test body
Improve the readability of the expected output by using a here-doc for
the test body and replacing the unwieldy ${SQ} references with literal
single quotes.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:42 -07:00
098be29f5b t-example-decorate: remove test messages
The test_msg() calls only repeat information already present in test
descriptions and check definitions, which are shown automatically if
the checks fail.  Remove the redundant messages to simplify the tests
and their output.  Here it is with all of them failing before:

 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:18
 # when adding a brand-new object, NULL should be returned
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:21
 # when adding a brand-new object, NULL should be returned
 not ok 1 - Add 2 objects, one with a non-NULL decoration and one with a NULL decoration.
 # check "ret == &vars->decoration_a" failed at t/unit-tests/t-example-decorate.c:29
 # when readding an already existing object, existing decoration should be returned
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:32
 # when readding an already existing object, existing decoration should be returned
 not ok 2 - When re-adding an already existing object, the old decoration is returned.
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:40
 # lookup should return added declaration
 # check "ret == &vars->decoration_b" failed at t/unit-tests/t-example-decorate.c:43
 # lookup should return added declaration
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:46
 # lookup for unknown object should return NULL
 not ok 3 - Lookup returns the added declarations, or NULL if the object was never added.
 # check "objects_noticed == 2" failed at t/unit-tests/t-example-decorate.c:58
 #    left: 1
 #   right: 2
 # should have 2 objects
 not ok 4 - The user can also loop through all entries.
 1..4

... and here with the patch applied:

 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:18
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:20
 not ok 1 - Add 2 objects, one with a non-NULL decoration and one with a NULL decoration.
 # check "ret == &vars->decoration_a" failed at t/unit-tests/t-example-decorate.c:27
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:29
 not ok 2 - When re-adding an already existing object, the old decoration is returned.
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:36
 # check "ret == &vars->decoration_b" failed at t/unit-tests/t-example-decorate.c:38
 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:40
 not ok 3 - Lookup returns the added declarations, or NULL if the object was never added.
 # check "objects_noticed == 2" failed at t/unit-tests/t-example-decorate.c:51
 #    left: 1
 #   right: 2
 not ok 4 - The user can also loop through all entries.
 1..4

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:31:46 -07:00
ee0be850b0 safe.directory: setting safe.directory="." allows the "current" directory
When "git daemon" enters a repository, it chdir's to the requested
repository and then uses "." (the curent directory) to consult the
"is this repository considered safe?" when it is not owned by the
same owner as the process.

Make sure this access will be allowed by setting safe.directory to
".", as that was once advertised on the list as a valid workaround
to the overly tight safe.directory settings introduced by 2.45.1
(cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>).

Also add simlar test to show what happens in the same setting if the
safe.directory is set to "*" instead of "."; in short, "." is a bit
tighter (as it is custom designed for git-daemon situation) than
"anything goes" settings given by "*".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 11:47:59 -07:00
dc0edbb01c safe.directory: normalize the configured path
The pathname of a repository comes from getcwd() and it could be a
path aliased via symbolic links, e.g., the real directory may be
/home/u/repository but a symbolic link /home/u/repo may point at it,
and the clone request may come as "git clone file:///home/u/repo/"

A request to check if /home/u/repository is safe would be rejected
if the safe.directory configuration allows /home/u/repo/ but not its
alias /home/u/repository/.  Normalize the paths configured for the
safe.directory configuration variable before comparing them with the
path being checked.

Two and a half things to note, compared to the previous step to
normalize the actual path of the suspected repository, are:

 - A configured safe.directory may be coming from .gitignore in the
   home directory that may be shared across machines.  The path
   meant to match with an entry may not necessarily exist on all of
   such machines, so not being able to convert them to real path on
   this machine is *not* a condition that is worthy of warning.
   Hence, we ignore a path that cannot be converted to a real path.

 - A configured safe.directory is essentially a random string that
   user throws at us, written completely unrelated to the directory
   the current process happens to be in.  Hence it makes little
   sense to give a non-absolute path.  Hence we ignore any
   non-absolute paths, except for ".".

 - The safe.directory set to "." was once advertised on the list as
   a valid workaround for the regression caused by the overly tight
   safe.directory check introduced in 2.45.1; we treat it to mean
   "if we are at the top level of a repository, it is OK".
   (cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>).

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 11:47:59 -07:00
7f547c99a6 safe.directory: normalize the checked path
The pathname of a repository comes from getcwd() and it could be a
path aliased via symbolic links, e.g., the real directory may be
/home/u/repository but a symbolic link /home/u/repo may point at it,
and the clone request may come as "git clone file:///home/u/repo/".

A request to check if /home/u/repo is safe would be rejected if the
safe.directory configuration allows /home/u/repository/ but not its
alias /home/u/repo/.  Normalize the path being checked before
comparing with safe.directory value(s).

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 11:47:59 -07:00
1048aa8b7a safe.directory: preliminary clean-up
The paths given in the safe.directory configuration variable are
allowed to contain "~user" (which interpolates to user's home
directory) and "%(prefix)" (which interpolates to the installation
location in RUNTIME_PREFIX-enabled builds, and a call to the
git_config_pathname() function is tasked to obtain a copy of the
path with these constructs interpolated.

The function, when it succeeds, always yields an allocated string in
the location given as the out-parameter; even when there is nothing
to interpolate in the original, a literal copy is made.  The code
path that contains this caller somehow made two contradicting and
incorrect assumptions of the behaviour when there is no need for
interpolation, and was written with extra defensiveness against
two phantom risks that do not exist.

One wrong assumption was that the function might yield NULL when
there is no interpolation.  This led to the use of an extra "check"
variable, conditionally holding either the interpolated or the
original string.  The assumption was with us since 8959555c
(setup_git_directory(): add an owner check for the top-level
directory, 2022-03-02) originally introduced the safe.directory
feature.

Another wrong assumption was that the function might yield the same
pointer as the input when there is no interpolation.  This led to a
conditional free'ing of the interpolated copy, that the conditional
never skipped, as we always received an allocated string.

Simplify the code by removing the extra defensiveness.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 11:47:58 -07:00
8e5dd94e68 grep: -W: skip trailing empty lines at EOF, too
4aa2c4753d (grep: -W: don't extend context to trailing empty lines,
2016-05-28) stopped showing empty lines at the end of function context
when using -W.  Do the same for trailing empty lines at the end of
files, for consistency -- it doesn't matter whether a function section
is ended by the next function or the end of the file.

Test it by adding a trailing empty line to the file used by the test
"grep -W" and leave its expected output the same.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 09:59:04 -07:00
a6e9429f72 patch-id: tighten code to detect the patch header
The get_one_patchid() function unconditionally takes a line that
matches the patch header (namely, a line that begins with a full
object name, possibly prefixed by "commit" or "From" plus a space)
as the beginning of a patch.  Even when it is *not* looking for one
(namely, when the previous call found the patch header and returned,
and then we are called again to skip the log message and process the
patch whose header was found by the previous invocation).

As a consequence, a line in the commit log message that begins with
one of these patterns can be mistaken to start another patch, with
current message entirely skipped (because we haven't even reached
the patch at all).

Allow the caller to tell us if it called us already and saw the
patch header (in which case we shouldn't be looking for another one,
until we see the "diff" part of the patch; instead we simply should
be skipping these lines as part of the commit log message), and skip
the header processing logic when that is the case.  In the helper
function, it also needs to flip this "are we looking for a header?"
bit, once it finished skipping the commit log message and started
processing the patches, as the patch header of the _next_ message is
the only clue in the input that the current patch is done.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
3f288b6faf patch-id: rewrite code that detects the beginning of a patch
The get_one_patchid() function reads input lines until it finds a
patch header (the line that begins a patch), whose beginning is one
of:

 (1) an "<object name>", which is what "git diff-tree --stdin" shows;
 (2) "commit <object name>", which is what "git log" shows; or
 (3) "From <object name>",  which is what "git log --format=email" shows.

When it finds such a line, it returns to the caller, reporting the
<object name> it found, and the size of the "patch" it processed.

The caller then calls the function again, which then ignores the
commit log message, and then processes the lines in the patch part
until it hits another "beginning of a patch".

The above logic was fairly easy to see until 2bb73ae8 (patch-id: use
starts_with() and skip_prefix(), 2016-05-28) reorganized the code,
which made another logic that has nothing to do with the "where does
the next patch begin?" logic, which came from 2485eab5
(git-patch-id: do not trip over "no newline" markers, 2011-02-17)
that ignores the "\ No newline at the end", rolled into the same
single if() statement.

Let's split it out.  The "\ No newline at the end" marker is part of
the patch, should not appear before we start reading the patch part,
and does not belong to the detection of patch header.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
2438294a13 patch-id: make get_one_patchid() more extensible
We pass two independent Boolean flags (i.e. do we want the stable
variant of patch-id?  do we want to hash the stuff verbatim?) into
the function as two separate parameters.  Before adding the third
one and make the interface even wider, let's consolidate them into
a single flag word.

No changes in behaviour.  Just a trivial interface change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
c92f3195ad patch-id: call flush_current_id() only when needed
The caller passes a flag that is used to become no-op when calling
flush_current_id().  Instead of calling something that becomes a
no-op, teach the caller not to call it in the first place.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
4210ea6f0f t4204: patch-id supports various input format
"git patch-id" was first developed to read from "git diff-tree
--stdin -p" output.  Later it was enhanced to read from "git
diff-tree --stdin -p -v", which was the downstream of an early
imitation of "git log" ("git rev-list" run in the upstream of a pipe
to feed the "diff-tree").  These days, we also read from "git
format-patch".

Their output begins slightly differently, but the patch-id computed
over them for the same commit should be the same.  Ensure that we
won't accidentally break this expectation.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
8b426c84f3 notes: do not trigger editor when adding an empty note
With "git notes add -C $blob", the given blob contents are to be made
into a note without involving an editor.  But when "--allow-empty" is
given, the editor is invoked, which can cause problems for
non-interactive callers[1].

This behaviour started with 90bc19b3ae (notes.c: introduce
'--separator=<paragraph-break>' option, 2023-05-27), which changed
editor invocation logic to check for a zero length note_data buffer.

Restore the original behaviour of "git note" that takes the contents
given via the "-m", "-C", "-F" options without invoking an editor, by
checking for any prior parameter callbacks, indicated by a non-zero
note_data.msg_nr.  Remove the now-unneeded note_data.given flag.

Add a test for this regression by checking whether GIT_EDITOR is
invoked alongside "git notes add -C $empty_blob --allow-empty"

[1] https://github.com/ddiss/icyci/issues/12

Signed-off-by: David Disseldorp <ddiss@suse.de>
[jc: enhanced the test with -m/-F options]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 15:31:30 -07:00
6e71d6ac7c unit-tests/test-lib: fix typo in check_pointer_eq() description
The comment surrounding check_pointer_eq() should explain about what
this function does instead of explaining check_int().  Correct this.

Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 14:23:14 -07:00
39bf06adf9 Git 2.46
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 07:14:09 -07:00
2ab3396b4e Merge tag 'l10n-2.46.0-rnd2' of https://github.com/git-l10n/git-po
l10n-2.46.0-rnd2

* tag 'l10n-2.46.0-rnd2' of https://github.com/git-l10n/git-po:
  l10n: zh_CN: updated translation for 2.46
  l10n: sv.po: Update Swedish translation
  l10n: zh_TW: Git 2.46
  l10n: Update German translation
  l10n: vi: Updated translation for 2.46
  l10n: uk: v2.46 update
  l10n: bg.po: Updated Bulgarian translation (5734t)
  l10n: fr: v2.46.0
  l10n: tr: Update Turkish translations
  l10n: po-id for 2.46
2024-07-29 07:11:16 -07:00
de86879ace l10n: zh_CN: updated translation for 2.46
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Co-authored-by: 依云 <lilydjwg@gmail.com>
Reviewed-by: 依云 <lilydjwg@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2024-07-28 19:52:41 +08:00
c28545a6e2 l10n: sv.po: Update Swedish translation
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2024-07-27 17:08:01 +08:00
5b29a57f54 Merge branch 'l10n/zh-TW/2024-07-24' of github.com:l10n-tw/git-po
* 'l10n/zh-TW/2024-07-24' of github.com:l10n-tw/git-po:
  l10n: zh_TW: Git 2.46
2024-07-27 16:27:25 +08:00
d02895cecc Merge branch 'l10n-de-2.46' of github.com:ralfth/git
* 'l10n-de-2.46' of github.com:ralfth/git:
  l10n: Update German translation
2024-07-27 16:25:13 +08:00
c7dce0fde1 Merge branch 'vi-2.46' of github.com:Nekosha/git-po
* 'vi-2.46' of github.com:Nekosha/git-po:
  l10n: vi: Updated translation for 2.46
2024-07-27 16:24:48 +08:00
d8e2f4d1b1 Merge branch '2.46-uk-update' of github.com:arkid15r/git-ukrainian-l10n
* '2.46-uk-update' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: uk: v2.46 update
2024-07-27 16:21:09 +08:00
caa3bf1503 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5734t)
2024-07-27 16:20:29 +08:00
c3dad83ba6 Merge branch 'l10N_fr_2.46' of github.com:jnavila/git
* 'l10N_fr_2.46' of github.com:jnavila/git:
  l10n: fr: v2.46.0
2024-07-27 16:18:53 +08:00
b81d65b6ad Merge branch 'tr-l10n' of github.com:bitigchi/git-po
* 'tr-l10n' of github.com:bitigchi/git-po:
  l10n: tr: Update Turkish translations
2024-07-27 16:17:45 +08:00
a956262045 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.46
2024-07-27 16:16:43 +08:00
15b02a3d4b l10n: zh_TW: Git 2.46
Co-authored-by: Lumynous <lumynou5.tw@gmail.com>
Co-authored-by: Ngoo Ka-iu <willy04wu69@gmail.com>
Co-authored-by: Nightfeather Chen <slat@nightfeather.me>
Co-authored-by: Kisaragi Hiu <mail@kisaragi-hiu.com>
Co-authored-by: hms5232 <hms5232@hhming.moe>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2024-07-27 15:01:30 +08:00
7b9e54714a check-non-portable-shell: improve VAR=val shell-func detection
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell, thus should be avoided.

As such, check-non-portable-shell.pl warns when it detects such usage.
However, a limitation of the check is that it only detects such
invocations when variable assignment (i.e. `VAR=val`) is the first thing
on the line. Thus, it can easily be fooled by an invocation such as:

    echo X | VAR=val shell-func

Address this shortcoming by loosening the check so that the variable
assignment can be recognized even when not at the beginning of the line.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 22:49:40 -07:00
7bd0cd0e7b check-non-portable-shell: suggest alternative for VAR=val shell-func
Most problems reported by check-non-portable-shell are accompanied by
advice suggesting how the test author can repair the problem. For
instance:

    error: egrep/fgrep obsolescent (use grep -E/-F)

However, when one-shot variable assignment is detected when calling a
shell function (i.e. `VAR=val shell-func`), the problem is reported, but
no advice is given. The lack of advice is particularly egregious since
neither the problem nor the workaround are likely well-known by
newcomers to the project writing tests for the first time. Address this
shortcoming by recommending the use of `test_env` which is tailor made
for this specific use-case.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 22:49:40 -07:00
a7fa609747 check-non-portable-shell: loosen one-shot assignment error message
When a0a630192d (t/check-non-portable-shell: detect "FOO=bar
shell_func", 2018-07-13) added the check for one-shot environment
variable assignment for shell functions, the primary reason given for
avoiding them was that, under some shells, the assignment outlives the
invocation of the shell function, thus could potentially negatively
impact subsequent commands in the same test, as well as subsequent
tests.

However, it has recently become apparent that this is not the only
potential problem with one-shot assignments and shell functions. Another
problem is that some shells do not actually export the variable to
commands which the function invokes[1]. More significantly, however, the
behavior of one-shot assignments with shell functions is not specified
by POSIX[2].

Given this new understanding, the presented error message ("assignment
extends beyond 'shell_func'") is too specific and potentially
misleading. Address this by emitting a less specific error message.

(Note that the wording "is not portable" is chosen over the more
specific "behavior not specified by POSIX" for consistency with almost
all other error message issued by this "lint" script.)

[1]: https://lore.kernel.org/git/xmqqbk2p9lwi.fsf_-_@gitster.g/
[2]: https://lore.kernel.org/git/xmqq34o19jj1.fsf@gitster.g/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 22:49:40 -07:00
5e91056a1b t4034: fix use of one-shot variable assignment with shell function
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell, thus should be avoided.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 22:49:40 -07:00
a26d7004f7 t3430: drop unnecessary one-shot "VAR=val shell-func" invocation
The behavior of a one-shot environment variable assignment of the form
"VAR=val cmd" is unspecified according to POSIX when "cmd" is a shell
function. Indeed the behavior differs between shell implementations and
even different versions of the same shell. One such problematic behavior
is that, with some shells, the assignment will outlive the invocation of
the function, thus may potentially impact subsequent commands in the
test, as well as subsequent tests. A common way to work around the
problem is to wrap a subshell around the one-shot assignment, thus
ensuring that the assignment is short-lived.

In this test, the subshell is employed precisely for this purpose; other
side-effects of the subshell, such as losing the effect of `test_tick`
which is invoked by `test_commit`, are immaterial.

These days, we can take advantage of `test_commit --author` to more
clearly convey that the test is interested only in overriding the author
of the commit.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 22:49:40 -07:00
c199707496 doc: fix hex code escapes in git-ls-files
The --format option on the git-ls-files man page states that `%xx`
interpolates to the character with hex code `xx`. This mirrors the
documentation and behavior of `git for-each-ref --format=...`. However,
in reality it requires the character with code `XX` to be specified as
`%xXX`, mirroring the behaviour of  `git log --format`.

Signed-off-by: Jayson Rhynas <jayrhynas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 10:53:21 -07:00
c3d034df16 csum-file: introduce discard_hashfile()
The hashfile API is used to write out a "hashfile", which has a
final checksum (typically SHA-1) at the end.  An in-core hashfile
structure has up to two file descriptors and a few buffers that can
only be freed by calling a helper function that is private to the
csum-file implementation.

The usual flow of a user of the API is to first open a file
descriptor for writing, obtain a hashfile associated with that write
file descriptor by calling either hashfd() or hashfd_check(), call
hashwrite() number of times to write data to the file, and then call
finalize_hashfile(), which appends th checksum to the end of the
file, closes file descriptors and releases associated buffers.

But what if a caller finds some error after calling hashfd() to
start the process and/or hashwrite() to send some data to the file,
and wants to abort the operation?  The underlying file descriptor is
often managed by the tempfile API, so aborting will clean the file
out of the filesystem, but the resources associated with the in-core
hashfile structure is lost.

Introduce discard_hashfile() API function to allow them to release
the resources held by a hashfile structure the callers want to
dispose of, and use that in read-cache.c:do_write_index(), which is
a central place that writes the index file.

Mark t2107 as leak-free, as this leak in "update-index --cacheinfo"
test that deliberately makes it fail is now plugged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 09:04:02 -07:00
be784de1c4 l10n: Update German translation
Reviewed-by: Matthias Rüster <matthias.ruester@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2024-07-26 17:48:59 +02:00
d98d9c77e5 mailmap: plug memory leak in read_mailmap_blob()
When a named object to read mailmap from is not a blob, the code
correctly errors out, but it forgot to free the object data before
doing so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-26 08:00:09 -07:00
db5104501b l10n: vi: Updated translation for 2.46
Signed-off-by: Vũ Tiến Hưng <newcomerminecraft@gmail.com>
2024-07-26 11:06:37 +07:00
70058db385 doc: difference in location to apply is "offset", not "fuzz"
The documentation to "git rebase" says that the line numbers (in the
rebased change) may not exactly be the same as the line numbers the
change gets replayed on top of the new base, but uses a wrong noun
"fuzz".  It should have said "offset".

They are both terms of art.  "fuzz" is about context lines not
exactly matching.  "offset" is about the difference in the location
that a change was taken from the original and the change gets
replayed on the target.  "offset" is often inevitable and part of
normal life.  "fuzz" on the other hand is often a sign of trouble
(and indeed "Git" refuses to apply a change with "fuzz", except
there are options to be fuzzy about whitespaces).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 10:28:58 -07:00
fc87b2f7c1 add-patch: render hunks through the pager
Make the print command trigger the pager when invoked using a capital
'P', to make it easier for the user to review long hunks.

Note that if the PAGER ends unexpectedly before we've been able to send
the payload, perhaps because the user is not interested in the whole
thing, we might receive a SIGPIPE, which would abruptly and unexpectedly
terminate the interactive session for the user.

Therefore, we need to ignore a possible SIGPIPE signal.  Add a test for
this, in addition to the test for normal operation.

For the SIGPIPE test, we need to make sure that we completely fill the
operating system's buffer, otherwise we might not trigger the SIGPIPE
signal.  The normal size of this buffer in different OSs varies from a
few KBs to 1MB.  Use a payload large enough to guarantee that we exceed
this limit.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 09:03:00 -07:00
e8bd8883fe pager: introduce wait_for_pager
Since f67b45f862 (Introduce trivial new pager.c helper infrastructure,
2006-02-28) we have the machinery to send our output to a pager.

That machinery, once set up, does not allow us to regain the original
stdio streams.

In the interactive commands (i.e.: add -p) we want to use the pager for
some output, while maintaining the interaction with the user.

Modify the pager machinery so that we can use `setup_pager()` and, once
we've finished sending the desired output for the pager, wait for the
pager termination using a new function `wait_for_pager()`.  Make this
function reset the pager machinery before returning.

One specific point to note is that we avoid forking the pager in
`setup_pager()` if the configured pager is an empty string [*1*] or
simply "cat" [*2*].  In these cases, `setup_pager()` does nothing and
therefore `wait_for_pager()` should not be called.

We could modify `setup_pager()` to return an indication of these
situations, so we could avoid calling `wait_for_pager()`.

However, let's avoid transferring that responsibility to the caller and
instead treat the call to `wait_for_pager()` as a no-op when we know we
haven't forked the pager.

   1.- 402461aab1 (pager: do not fork a pager if PAGER is set to empty.,
                   2006-04-16)

   2.- caef71a535 (Do not fork PAGER=cat, 2006-04-16)

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 09:03:00 -07:00
da9ef60c8f pager: do not close fd 2 unnecessarily
We send errors to the pager since 61b80509e3 (sending errors to stdout
under $PAGER, 2008-02-16).

In a8335024c2 (pager: do not dup2 stderr if it is already redirected,
2008-12-15) an exception was introduced to avoid redirecting stderr if
it is not connected to a terminal.

In such exceptional cases, the close(STDERR_FILENO) we're doing in
close_pager_fds, is unnecessary.

Furthermore, in a subsequent commit we're going to introduce changes
that will involve using close_pager_fds multiple times.

With this in mind, controlling when we want to close stderr, become
sensible.

Let's close(STDERR_FILENO) only when necessary, and pave the way for the
upcoming changes.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 09:03:00 -07:00
7309be1fc5 add-patch: test for 'p' command
Add a test for the 'p' command, which was introduced in 66c14ab592
(add-patch: introduce 'p' in interactive-patch, 2024-03-29).

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 09:03:00 -07:00
92e24c8b79 ReviewingGuidelines: encourage positive reviews more
I saw some contributors hesitate to give a positive review on
patches by their coworkers.  When written well, a positive review
does not have to be a hollow "looks good" that rubber stamps an
useless approval on a topic that is not interesting to others.

Let's add a few paragraphs to encourage positive reviews, which is a
bit harder to give than a review to point out things to improve.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 08:50:09 -07:00
9885871248 show-ref: improve short help messages of options
Trivial change to indicate that branches and tags are real options
that can be used combined to get more information.  This helps with
linting translations and prompting the user that the terms represent
options.

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 08:04:34 -07:00
dadb75a2dd l10n: uk: v2.46 update
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2024-07-24 14:34:25 -07:00
ad57f148c6 Git 2.46-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 16:54:35 -07:00
c89facd58e Merge branch 'ps/ref-storage-migration-fix'
Hotfix for a topic already in -rc.

* ps/ref-storage-migration-fix:
  refs: fix format migration on Cygwin
2024-07-23 16:54:34 -07:00
6fcd72d5ad Merge branch 'js/doc-markup-updates-fix'
Work around asciidoctor's css that renders `monospace` material
in the SYNOPSIS section of manual pages as block elements.

* js/doc-markup-updates-fix:
  Doc: fix Asciidoctor css workaround
  asciidoctor: fix `synopsis` rendering
2024-07-23 16:54:34 -07:00
37b959ecfb Merge branch 'ja/doc-markup-updates-fix'
Fix documentation mark-up regression in 2.45.

* ja/doc-markup-updates-fix:
  doc: git-clone fix discrepancy between asciidoc and asciidoctor
2024-07-23 16:54:33 -07:00
ec9d46588e Merge branch 'ds/midx-write-repack-fix'
Repacking a repository with multi-pack index started making stupid
pack selections in Git 2.45, which has been corrected.

* ds/midx-write-repack-fix:
  midx-write: revert use of --stdin-packs
  t5319: add failing test case for repack/expire
2024-07-23 16:54:33 -07:00
d44ce6ddd5 Doc: fix Asciidoctor css workaround
The previous step introduced docinfo.html to be used to tweak the
CSS used by the asciidoctor, that by default renders <code> inside
<pre> as a block element, breaking the SYNOPSIS section of a few
pages that adopted a new convention we use since Git 2.45.

But in this project, HTML files are all generated.  We do not force
any human to write HTML by hand, which is an unusual and cruel
punishment.  "*.html" is in the .gitignore file, and "make clean"
removes them.  Having a tracked .html file makes "make clean" make
the tree dirty by removing the tracked docinfo.html file.

Let's do an obvious, minimum and stupid workaround to generate that
file at runtime instead.  The mark-up is being rethought in a major
way for the next development cycle, and the CSS workaround we added
in the previous step may have to adjusted, possibly in a large way,
anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 11:02:52 -07:00
1b8f306612 ci/style-check: add RemoveBracesLLVM in CI job
For 'clang-format', setting 'RemoveBracesLLVM' to 'true', adds a check
to ensure we avoid curly braces for single-statement bodies in
conditional blocks.

However, the option does come with two warnings [1]:

    This option will be renamed and expanded to support other styles.

and

    Setting this option to true could lead to incorrect code formatting
    due to clang-format’s lack of complete semantic information. As
    such, extra care should be taken to review code changes made by
    this option.

The latter seems to be of concern. While we want to experiment with the
rule, adding it to the in-tree '.clang-format' could affect end-users.
Let's only add it to the CI jobs for now. With time, we can evaluate
its efficacy and decide if we want to add it to '.clang-format' or
retract it entirely. We do so, by adding the existing rules in
'.clang-format' and this rule to a temp file outside the working tree,
which is then used by 'git clang-format'. This ensures we don't murk
with files in-tree.

[1]: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#removebracesllvm

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 09:56:50 -07:00
30c4f7e350 check-whitespace: detect if no base_commit is provided
The 'check-whitespace' CI script exits gracefully if no base commit is
provided or if an invalid revision is provided. This is not good because
if a particular CI provides an incorrect base_commit, it would fail
successfully.

This is exactly the case with the GitLab CI. The CI is using the
"$CI_MERGE_REQUEST_TARGET_BRANCH_SHA" variable to get the base commit
SHA, but variable is only defined for _merged_ pipelines. So it is empty
for regular pipelines [1]. This should've failed the check-whitespace
job.

Let's fallback to 'CI_MERGE_REQUEST_DIFF_BASE_SHA' if
"CI_MERGE_REQUEST_TARGET_BRANCH_SHA" isn't available in GitLab CI,
similar to the previous commit. Let's also add a check for incorrect
base_commit in the 'check-whitespace.sh' script. While here, fix a small
typo too.

[1]: https://docs.gitlab.com/ee/ci/variables/predefined_variables.html#predefined-variables-for-merge-request-pipelines

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 09:56:50 -07:00
bce7e52d4e ci: run style check on GitHub and GitLab
We don't run style checks on our CI, even though we have a
'.clang-format' setup in the repository. Let's add one, the job will
validate only against the new commits added and will only run on merge
requests. Since we're introducing it for the first time, let's allow
this job to fail, so we can validate if this is useful and eventually
enforce it.

For GitHub, we allow the job to pass by adding 'continue-on-error: true'
to the workflow. This means the job would show as passed, even if the
style check failed. To know the status of the job, users have to
manually check the logs.

For GitLab, we allow the job to pass by adding 'allow_failure: true', to
the job. Unlike GitHub, here the job will show as failed with a yellow
warning symbol, but the pipeline would still show as passed.

Also for GitLab, we use the 'CI_MERGE_REQUEST_TARGET_BRANCH_SHA'
variable by default to obtain the base SHA of the merged pipeline (which
is only available for merged pipelines [1]). Otherwise we use the
'CI_MERGE_REQUEST_DIFF_BASE_SHA' variable.

[1]: https://docs.gitlab.com/ee/ci/variables/predefined_variables.html#predefined-variables-for-merge-request-pipelines

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 09:56:50 -07:00
1993918b9f clang-format: formalize some of the spacing rules
There are some spacing rules that we follow in the project and it makes
sense to formalize them:
* Ensure there is no space inserted after the logical not '!' operator.
* Ensure there is no space before the case statement's colon.
* Ensure there is no space before the first bracket '[' of an array.
* Ensure there is no space in empty blocks.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 09:56:50 -07:00
5e7eee46a3 clang-format: avoid spacing around bitfield colon
The spacing around colons is currently not standardized and as such we
have the following practices in our code base:
- Spacing around the colon `int bf : 1`: 146 instances
- No spacing around the colon `int bf:1`: 148 instances
- Spacing before the colon `int bf :1`: 6 instances
- Spacing after the colon `int bf: 1`: 12 instances

Let's formalize this by picking the most followed pattern and add the
corresponding style to '.clang-format'.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 09:56:49 -07:00
e3ea432528 clang-format: indent preprocessor directives after hash
We do not have a rule around the indentation of preprocessor directives.
This was also discussed on the list [1], noting how there is often
inconsistency in the styling. While there was discussion, there was no
conclusion around what is the preferred style here. One style being
indenting after the hash:

    #if FOO
    #  if BAR
    #    include <foo>
    #  endif
    #endif

The other being before the hash:

    #if FOO
      #if BAR
        #include <foo>
      #endif
    #endif

Let's pick the former and add 'IndentPPDirectives: AfterHash' value to
our '.clang-format'. There is no clear reason to pick one over the
other, but it would definitely be nicer to be consistent.

[1]: https://lore.kernel.org/r/xmqqwmmm1bw6.fsf@gitster.g

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 09:56:49 -07:00
09c817383f refs: fix format migration on Cygwin
It was reported that t1460-refs-migrate.sh fails when using Cygwin with
errors like the following:

    error: could not link file '.git/ref_migration.sr9pEF/reftable' to '.git/reftable': Permission denied

As some debugging surfaced, the root cause of this is that some files of
the newly-initialized ref store are still open when the target format is
the "reftable" format, and Cygwin refuses to rename open files.

Fix this issue by closing the new ref store before renaming its files
into place. This is a slight change in behaviour compared to before,
where we kept the new ref store open and then updated the repository's
ref store to point to it.

While we could re-open the new ref store after we have moved files
around, this is ultimately unnecessary. We know that the only user of
`repo_migrate_ref_storage_format()` is the git-refs(1) command, and it
won't access the ref store after it has been migrated anyway. So
reinitializing the ref store would be a waste of time. Regardless of
that it is still sensible to leave the repository in a consistent state.
But instead of reinitializing the ref store, we can simply unset the
repo's ref store altogether and let `get_main_ref_store()` lazily
initialize the new ref store as required.

Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 08:58:03 -07:00
728a1962cd CodingGuidelines: document a shell that "fails" "VAR=VAL shell_func"
Over the years, we accumulated the community wisdom to avoid the
common "one-short export" construct for shell functions, but seem to
have lost on which exact platform it is known to fail.  Now during
an investigation on a breakage for a recent topic, we found one
example of failing shell.  Let's document that.

This does *not* mean that we can freely start using the construct
once Ubuntu 20.04 is retired.  But it does mean that we cannot use
the construct until Ubuntu 20.04 is fully retired from the machines
that matter.  Moreover, posix explicitly says that the behaviour for
the construct is unspecified.

Helped-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-23 08:28:43 -07:00
1c473dd6af doc: remove dangling closing parenthesis
The second line of the synopsis, starting with [--dry-run] has a
dangling closing paren in the second optional group. Probably added by
mistake, so remove it.

Signed-off-by: Tomas Nordin <tomasn@posteo.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-22 17:32:36 -07:00
8bfc3e47a7 asciidoctor: fix synopsis rendering
Since 76880f0510 (doc: git-clone: apply new documentation formatting
guidelines, 2024-03-29), the synopsis of `git clone`'s manual page is
rendered differently than before; Its parent commit did the same for
`git init`.

The result looks quite nice. When rendered with AsciiDoc, that is. When
rendered using AsciiDoctor and displayed in a graphical web browser such
as Firefox, Chrome, Edge, etc, the result is quite unpleasant to my eye,
reading something like this:

	SYNOPSIS

	 git clone
	  [
	 --template=
	 <template-directory>]
		  [
	 -l
	 ] [
	 -s
	 ] [
	 --no-hardlinks
	 ] [
	 -q
	 ] [
	[... continuing like this ...]

The reason is that AsciiDoctor's default style sheet contains this (see
https://github.com/asciidoctor/asciidoctor/blob/854923b15533/src/stylesheets/asciidoctor.css#L519-L521
for context):

	pre > code {
	  display: block;
	}

It is this `display: block` that forces the parts that are enclosed in
`<code>` tags (such as the `git clone` or the `--template=` part) to be
rendered on their own line.

Side note: This seems not to affect console web browsers like `lynx` or
`w3m`, most likely because most style sheet directions cannot be
respected in text terminals and therefore they seem to punt on style
sheets altogether.

To fix this, let's apply the method recommended by AsciiDoctor in
https://docs.asciidoctor.org/asciidoctor/latest/html-backend/default-stylesheet/#customize-docinfo
to partially override AsciiDoctor's default style sheet so that the
`<code>` sections of the synopsis are no longer each rendered on their
own, individual lines.

This fixes https://github.com/git-for-windows/git/issues/5063.

Even on the Git home page, where AsciiDoctor's default stylesheet is
_not_ used, this change resulted in some unpleasant rendering where not
only the font is changed for the `<code>` sections of the synopsis, but
padding and a different background color make the visual impression
quite uneven. This has been addressed in the meantime, via
https://github.com/git/git-scm.com/commit/a492d0565512.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-22 14:13:44 -07:00
9200fe2a93 l10n: bg.po: Updated Bulgarian translation (5734t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2024-07-21 22:31:10 +03:00
60cf761ed1 add-patch: use normalize_marker() when recounting edited hunk
After the user has edited a hunk the number of lines in the pre- and
post- image lines is recounted the hunk header can be updated before
passing the hunk to "git apply". The recounting code correctly handles
empty context lines where the leading ' ' is omitted by treating '\n'
and '\r' as context lines.

Update this code to use normalize_marker() so that the handling of empty
context lines is consistent with the rest of the hunk parsing
code. There is a small change in behavior as normalize_marker() only
treats "\r\n" as an empty context line rather than any line starting
with '\r'. This should not matter in practice as Macs have used Unix
line endings since MacOs 10 was released in 2001 and if it transpires
that someone is still using an earlier version of MacOs where lines end
with '\r' then we will need to change the handling of '\r' in
normalize_marker() anyway.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-20 16:29:15 -07:00
39bdd84eaf add-patch: handle splitting hunks with diff.suppressBlankEmpty
When "add -p" parses diffs, it looks for context lines starting with a
single space. But when diff.suppressBlankEmpty is in effect, an empty
context line will omit the space, giving us a true empty line. This
confuses the parser, which is unable to split based on such a line.

It's tempting to say that we should just make sure that we generate a
diff without that option.  However, although we do not parse hunks that
the user has manually edited with parse_diff() we do allow the user
to split such hunks. As POSIX calls the decision of whether to print the
space here "implementation-defined" we need to handle edited hunks where
empty context lines omit the space.

So let's handle both cases: a context line either starts with a space or
consists of a totally empty line by normalizing the first character to a
space when we parse them. Normalizing the first character rather than
changing the code to check for a space or newline will hopefully future
proof against introducing similar bugs if the code is changed.

Reported-by: Ilya Tumaykin <itumaykin@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-20 16:29:14 -07:00
6474da0aa4 doc: git-clone fix discrepancy between asciidoc and asciidoctor
Asciidoc.py does not have the concept of generalized roles, whereas
asciidoctor interprets [foo]`blah` as blah with role foo in the
synopsis, making in effect foo disappear in the output. Note that
square brackets not directly followed by an inline markup do not
define a role, which is why we do not have the issue on other parts of
the documentation.

In order to get a consistant result across asciidoctor and
asciidoc.py, the hack is to use the {empty} entity
to split the bracket part from the inline format part.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-20 16:24:12 -07:00
bb0498b1bb howto-maintain: update daily tasks
Some "implementation details" of how I perform these integration
tasks day to day have changed since the document was originally
written.  Update to reflect the way things are currently done.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-19 13:53:55 -07:00
c93dda2e78 howto-maintain: cover a whole development cycle
The "policy" part is more important than the "daily operation" part
in that it establishes why certain maintainer tasks exist and are
performed the way they are.

The text briefly touches the role each integration branches play in
the workflow, but does not give the whole picture of what happens in
a single development cycle using these branches.  Extend the
description to describe a whole development cycle.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-19 13:53:53 -07:00
ebe8720ed4 l10n: fr: v2.46.0
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2024-07-19 20:26:18 +02:00
8fb6d11fad midx-write: revert use of --stdin-packs
This reverts b7d6f23a17 (midx-write.c: use `--stdin-packs` when
repacking, 2024-04-01) and then marks the test created in the previous
change as passing.

The fundamental issue with the reverted change is that the focus on
pack-files separates the object selection from how the multi-pack-index
selects a single pack-file for an object ID with multiple copies among
the tracked pack-files.

The change was made with the intention of improving delta compression in
the resulting pack-file, but that can be resolved with the existing
object list mechanism. There are other potential pitfalls of doing an
object walk at this time if the repository is a blobless partial clone,
and that will require additional testing on top of the one that changes
here.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-19 07:19:01 -07:00
ebd7b1ebd5 l10n: tr: Update Turkish translations
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2024-07-19 14:03:18 +03:00
68f66648de l10n: po-id for 2.46
Update following components:

  * builtin/clone.c
  * builtin/config.c
  * builtin/for-each-repo.c
  * builtin/refs.c
  * command-list.h
  * commit-graph.c
  * http.c
  * pack-bitmap-write.c
  * pack-bitmap.c
  * promisor-remote.c
  * refs.c
  * sequencer.c

Translate following new components:

  * pseudo-merge.c
  * refs/files-backend.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2024-07-19 14:28:41 +07:00
d7969a5127 git-svn: use svn:global-ignores to create .gitignore
`svn:global-ignores` contains a list of file patterns that should
not be tracked in version control. The syntax of these patterns is
the same as `svn:ignore`. Their semantics differ: patterns in
`svn:global-ignores` apply to all paths under the directory where
they apply, while `svn:ignore` only applies to the directory's
immediate children.

Signed-off-by: Alex Galvin <agalvin@comqi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-18 15:48:06 -07:00
5c5877b93c git-svn: add public property svn:global-ignores
Subversion 1.8 added a new property `svn:global-ignores`. It
contains a list of patterns used to determine what files should
be ignored. If Git-SVN is going to ignore these files as well, it
is important that we do not skip over directories that have this
property set.

Signed-off-by: Alex Galvin <agalvin@comqi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-18 15:48:06 -07:00
738fab524c t5319: add failing test case for repack/expire
Git 2.45.0 included the change b7d6f23a17 (midx-write.c: use
`--stdin-packs` when repacking, 2024-04-01) which caused the 'git
multi-pack-index repack' command to use 'git pack-objects --stdin-packs'
instead of listing the objects to repack. While this change was
motivated by efficient cross-process communication and the ability to
improve delta compression, it breaks a fundamental function of the
'incremental-repack' task that is enabled by default in Scalar clones or
Git repositories that run 'git maintenance start'.

The 'incremental-repack' task performs a two-step process of the
'expire' and 'repack' subcommands of the 'git multi-pack-index' builtin.
The 'expire' command removes any pack-files listed in the
multi-pack-index but without any referenced objects. The 'repack' task
then finds a batch of pack-files to repack and sends their objects to
'git pack-objects'. Both the pack-files chosen for the batch and the
objects chosen to repack are based on the ones that the multi-pack-index
references. Objects that appear in a pack-file but have a duplicate copy
in a newer pack-file are not considered in this case. Since the
multi-pack-index references only the newest copy of an object, this
allows the next 'incremental-repack' task to remove the pack-files in
the next 'expire' task. This delay is intentional due to how Windows
handles may block deletion of files with open read handles.

However, the mentioned commit changed this behavior to divorce the set
of objects referenced by the multi-pack-index and instead use a set of
"included" and "excluded" pack-files in the 'git pack-objects' builtin.
When a pack-file is selected as "included", only the objects it contains
but are not in any "excluded" pack-files are considered for repacking.
This has led to client repositories failing to remove old pack-files as
they still have some referenced objects. This grows over time until the
point that Git is trying to repack the same pack-files over and over.

For now, create a test case that demonstrates the expected behavior, but
also fails in its final line. The setup here it attempting to recreate a
typical situation for a repository that uses a blobless partial clone.
There would be a large initial pack-file from the clone that is never
selected in the 'repack' batch. There are other pack-files that have a
combination of new objects from incremental fetches and possibly blobs
that are not connected to those incremental fetches; these blobs could
be filled in from commands like 'git checkout' or 'git blame'. The
pack-files also have some overlap on purpose so test-1 has some
duplicates in test-2 and test-2 has some duplicates in test-3.

At the end of the test, the test-2 pack-file still exists though it
should have been expired. This test will pass when reverting the
offending commit.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-18 14:53:27 -07:00
d19b6cd2dd Git 2.46-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-18 08:30:28 -07:00
1aac20a4b0 Merge branch 'jk/am-retry'
Test fix as a follow-up to an already graduated topic.

* jk/am-retry:
  t4153: stop redirecting input from /dev/zero
2024-07-18 08:30:27 -07:00
d07b5d9ad5 Merge branch 'tb/pseudo-merge-reachability-bitmap'
Doc update.

* tb/pseudo-merge-reachability-bitmap:
  Documentation/gitpacking: make sample configs listing blocks
2024-07-18 08:30:27 -07:00
ef2447d97c Merge branch 'ps/pseudo-ref-terminology'
Doc update.

* ps/pseudo-ref-terminology:
  Documentation/glossary: fix double word
2024-07-18 08:30:26 -07:00
ca12618b7b Merge branch 'tb/doc-max-tree-depth-fix'
Doc update.

* tb/doc-max-tree-depth-fix:
  Documentation: fix default value for core.maxTreeDepth
2024-07-18 08:30:26 -07:00
f9e4f2599c Merge branch 'ch/refs-without-the-repository-fix'
Comment fix.

* ch/refs-without-the-repository-fix:
  refs: correct the version numbers in a comment
2024-07-18 08:30:25 -07:00
220adb16e4 config.mak.uname: remove unused uname_P variable
The uname_P make variable was added in commit e15f545155 ("Makefile
tweaks: Solaris 9+ dont need iconv / move up uname variables",
2006-02-20), but it seems to never have been used (even in that original
commit). The man page for 'uname' notes that the '-p' processor option
is non-portable (the 'uname_M' variable is used by the Makefile for that
purpose).

Remove the unused 'uname_P' make variable.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-18 07:01:39 -07:00
f78e2dd88a Makefile: drop -Wno-universal-initializer from SP_EXTRA_FLAGS
Commit 1c96642326 ("sparse: allow '{ 0 }' to be used without warnings",
2020-05-22) added -Wno-universal-initializer to the SP_EXTRA_FLAGS in
order to suppress potential sparse warnings from using '{0}' as an
aggregate initializer. At that time, the default was for sparse to
issue warnings (i.e. the default was -Wuniversal-initializer) if such
an initializer was used to initialize an aggregate whose first member
was a pointer type. However, this default was changed just a few days
later to -Wno-universal-initializer (first released in sparse v0.6.2)
and has been so in all subsequent release versions of sparse.  Thus,
including -Wno-universal-initializer in the SP_EXTRA_FLAGS variable is
redundant.

Remove the unnecessary warning flag from SP_EXTRA_FLAGS, essentially
reverting commit 1c96642326.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-18 07:01:38 -07:00
1c4a234a1c Post 2.46-rc0 batch #3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-17 10:47:27 -07:00
219719cc55 Merge branch 'js/unit-test-oidtree-cmake-fix'
Build fix.

* js/unit-test-oidtree-cmake-fix:
  cmake: fix build of `t-oidtree`
2024-07-17 10:47:27 -07:00
76e018b9a1 Merge branch 'js/var-git-shell-path'
"git var GIT_SHELL_PATH" should report the path to the shell used
to spawn external commands, but it didn't do so on Windows, which
has been corrected.

* js/var-git-shell-path:
  var(win32): do report the GIT_SHELL_PATH that is actually used
  run-command: declare the `git_shell_path()` function globally
  run-command(win32): resolve the path to the Unix shell early
  mingw(is_msys2_sh): handle forward slashes in the `sh.exe` path, too
  win32: override `fspathcmp()` with a directory separator-aware version
  strvec: declare the `strvec_push_nodup()` function globally
  run-command: refactor getting the Unix shell path into its own function
2024-07-17 10:47:27 -07:00
c7e8aaee98 Merge branch 'ps/doc-http-empty-cookiefile'
What happens when http.cookieFile gets the special value "" has
been clarified in the documentation.

* ps/doc-http-empty-cookiefile:
  doc: update http.cookieFile with in-memory cookie processing
2024-07-17 10:47:26 -07:00
e13feda98f Merge branch 'kn/push-empty-fix'
"git push '' HEAD:there" used to hit a BUG(); it has been corrected
to die with "fatal: bad repository ''".

* kn/push-empty-fix:
  builtin/push: call set_refspecs after validating remote
2024-07-17 10:47:26 -07:00
dd6d10285b Merge branch 'jc/http-cookiefile'
The http.cookieFile and http.saveCookies configuration variables
have a few values that need to be avoided, which are now ignored
with warning messages.

* jc/http-cookiefile:
  http.c: cookie file tightening
2024-07-17 10:47:26 -07:00
b19a8c00c6 Merge branch 'jk/test-body-in-here-doc'
The test framework learned to take the test body not as a single
string but as a here-document.

* jk/test-body-in-here-doc:
  t/.gitattributes: ignore whitespace in chainlint expect files
  t: convert some here-doc test bodies
  test-lib: allow test snippets as here-docs
  chainlint.pl: add tests for test body in heredoc
  chainlint.pl: recognize test bodies defined via heredoc
  chainlint.pl: check line numbers in expected output
  chainlint.pl: force CRLF conversion when opening input files
  chainlint.pl: do not spawn more threads than we have scripts
  chainlint.pl: only start threads if jobs > 1
  chainlint.pl: add test_expect_success call to test snippets
2024-07-17 10:47:25 -07:00
6da44da936 Merge branch 'rj/test-sanitize-leak-log-fix'
Tests that use GIT_TEST_SANITIZE_LEAK_LOG feature got their exit
status inverted, which has been corrected.

* rj/test-sanitize-leak-log-fix:
  test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by default
  test-lib: fix GIT_TEST_SANITIZE_LEAK_LOG
2024-07-17 10:47:24 -07:00
616e94ca24 Documentation: fix default value for core.maxTreeDepth
When `core.maxTreeDepth` was originally introduced via be20128bfa (add
core.maxTreeDepth config, 2023-08-31), its default value was 4096.

There have since been a couple of updates to its default value that were
not reflected in the documentation for `core.maxTreeDepth`:

  - 4d5693ba05 (lower core.maxTreeDepth default to 2048, 2023-08-31)
  - b64d78ad02 (max_tree_depth: lower it for MSVC to avoid stack
    overflows, 2023-11-01)

Commit 4d5693ba05 lowers the default to 2048 for platforms with smaller
stack sizes, and commit b64d78ad02 lowers the default even further when
Git is compiled with MSVC.

Neither of these changes were reflected in the documentation, which I
noticed while merging newer releases back into GitHub's private fork
(which contained the original implementation of `core.maxTreeDepth`).

Update the documentation to reflect what the platform-specific default
values are.

Noticed-by: Keith W. Campbell <keithc@ca.ibm.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-17 08:51:14 -07:00
b25a2e8f37 Documentation/glossary: fix double word
Remove a spurious "that".

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-17 08:49:09 -07:00
df8b05672c Documentation/gitpacking: make sample configs listing blocks
This document contains a few sample config snippets. At least with
Asciidoctor, the section headers are rendered *more* indented than the
variables that follow:

       [bitmapPseudoMerge "all"]
    pattern = "refs/"
    ...

To address this, wrap these listings in AsciiDoc listing blocks. Remove
the indentation from the section headings. This is similar to how we
handle such sample config elsewhere, e.g., in config.txt.

While we're here, fix the nearby "wiht" typo.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-17 08:48:30 -07:00
2a959ec21a t4153: stop redirecting input from /dev/zero
Commit 852a171018 (am: let command-line options override saved options,
2015-08-04) redirected a few "git am" invocations from /dev/zero, even
though it did not expect "am" to read the input. This was necessary at
the time because those tests used test_terminal, and as described in
18d8c26930 (test_terminal: redirect child process' stdin to a pty,
2015-08-04):

  Note that due to the way the code is structured, the child's stdin
  pseudo-tty will be closed when we finish reading from our stdin. This
  means that in the common case, where our stdin is attached to /dev/null,
  the child's stdin pseudo-tty will be closed immediately. Some operations
  like isatty(), which git-am uses, require the file descriptor to be
  open, and hence if the success of the command depends on such functions,
  test_terminal's stdin should be redirected to a source with large amount
  of data to ensure that the child's stdin is not closed, e.g.

              test_terminal git am --3way </dev/zero

But we later dropped the use of test_terminal in 53ce2e3f0a (am: add
explicit "--retry" option, 2024-06-06). That commit dropped one of the
redirections from /dev/zero but not the other.

In theory the remaining one should not cause any problems, but it turns
out that at least one platform (NonStop) does not have /dev/zero at all.
We never noticed before because it also did not pass the TTY prereq,
meaning these tests were not run at all there until 53ce2e3f0a.

So let's drop the useless /dev/zero mention. There are others in the
test suite, but they are run only for tests marked with EXPENSIVE (so
not typically by default).

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-17 08:31:27 -07:00
5133ead528 Revert "reflog expire: don't use lookup_commit_reference_gently()"
During Git 2.35 timeframe, daf1d828 (reflog expire: don't use
lookup_commit_reference_gently(), 2021-12-22) replaced a call to
lookup_commit_reference_gently() with a call to lookup_commit().

What it failed to consider was that our refs do not necessarily
point at commits (most notably, we have annotated and signed tags),
and more importantly that lookup_commit() does not dereference a tag
to return a commit; instead it returns NULL when a tag is given.

Since the commit returned is used as a starting point for the
reachability check, this ejected the commits that are reachable only
by an annotated tag out of the set of reachable commits, breaking
the computation to correctly implement the "--expire-unreachable"
option.  We also started giving an error message that the API
function expected to be fed a commit object.

This problem hasn't been reported or noticed for a long time,
probably because the "refs/tags/" hierarchy by default is not
covered by reflogs, as nobody usually moves tags.

Revert the change to correctly find the commit pointed at by the ref
to restore the previous behaviour, but do so only in a more modern
codebase, as we had significant code churn since then and it is not
grave enough to worry about for older maintenance tracks.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-16 14:15:35 -07:00
04f5a52757 Post 2.46-rc0 batch #2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-16 11:18:58 -07:00
d6c86368c8 Merge branch 'bc/gitfaq-more'
A handful of entries are added to the GitFAQ document.

* bc/gitfaq-more:
  doc: mention that proxies must be completely transparent
  gitfaq: add entry about syncing working trees
  gitfaq: give advice on using eol attribute in gitattributes
  gitfaq: add documentation on proxies
2024-07-16 11:18:58 -07:00
fe5ba894ec Merge branch 'bc/http-proactive-auth'
The http transport can now be told to send request with
authentication material without first getting a 401 response.

* bc/http-proactive-auth:
  http: allow authenticating proactively
2024-07-16 11:18:57 -07:00
12d49fd028 Merge branch 'jc/where-is-bash-for-ci'
Shell script clean-up.

* jc/where-is-bash-for-ci:
  ci: unify bash calling convention
2024-07-16 11:18:57 -07:00
5d71940dda Merge branch 'ds/advice-sparse-index-expansion'
A new warning message is issued when a command has to expand a
sparse index to handle working tree cruft that are outside of the
sparse checkout.

* ds/advice-sparse-index-expansion:
  advice: warn when sparse index expands
2024-07-16 11:18:56 -07:00
f4c6a0e275 Merge branch 'cb/send-email-sanitize-trailer-addresses'
Address-looking strings found on the trailer are now placed on the
Cc: list after running through sanitize_address by "git send-email".

* cb/send-email-sanitize-trailer-addresses:
  git-send-email: use sanitized address when reading mbox body
2024-07-16 11:18:56 -07:00
ffc8f1142c Merge branch 'en/ort-inner-merge-error-fix'
The "ort" merge backend saw one bugfix for a crash that happens
when inner merge gets killed, and assorted code clean-ups.

* en/ort-inner-merge-error-fix:
  merge-ort: fix missing early return
  merge-ort: convert more error() cases to path_msg()
  merge-ort: upon merge abort, only show messages causing the abort
  merge-ort: loosen commented requirements
  merge-ort: clearer propagation of failure-to-function from merge_submodule
  merge-ort: fix type of local 'clean' var in handle_content_merge ()
  merge-ort: maintain expected invariant for priv member
  merge-ort: extract handling of priv member into reusable function
2024-07-16 11:18:55 -07:00
78687168bc t-strvec: fix type mismatch in check_strvec
Cast i from size_t to uintmax_t to match the format string.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-16 09:30:30 -07:00
730914ed7e refs: correct the version numbers in a comment
The paragraph talks about a change made in c8f815c2 (refs: remove
functions without ref store, 2024-05-07), which is v2.46.0-rc0~119^2
and will be published as part of v2.46, not v2.45.

Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-16 09:06:22 -07:00
8db8786fc2 doc: clarify post-receive hook behavior
The `githooks` documentation mentions that the post-receive hook
executes once after git-receive-pack(1) updates all references and that
it also receives the same information as the pre-receive hook on
standard input. This is misleading though because the hook only
executes once if at least one of the attempted reference updates is
successful. Also, while each line provided on standard input is in the
same format as the pre-receive hook, the information received only
includes the set of references that were successfully updated.

Update the documentation to clarify these points and also provide a
reference to the post-receive hook section of the `git-receive-pack`
documentation which has additional information.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-15 11:43:38 -07:00
67be8c4de5 doc: note that AT&T ksh does not work with our test suite
The scripted Porcelain commands do not allow use of "local" because
it is not universally supported, but we use it liberally in our test
scripts, which means some POSIX compliant shells (like "ksh93") can
not be used to run our tests.

Document the status quo, to help the next person who gets perplexed
seeing our tests fail.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-15 10:14:52 -07:00
ad850ef1cf Post 2.46-rc0 batch #1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-15 10:11:44 -07:00
9118e46e81 Merge branch 'cp/unit-test-reftable-record'
A test in reftable library has been rewritten using the unit test
framework.

* cp/unit-test-reftable-record:
  t-reftable-record: add tests for reftable_log_record_compare_key()
  t-reftable-record: add tests for reftable_ref_record_compare_name()
  t-reftable-record: add index tests for reftable_record_is_deletion()
  t-reftable-record: add obj tests for reftable_record_is_deletion()
  t-reftable-record: add log tests for reftable_record_is_deletion()
  t-reftable-record: add ref tests for reftable_record_is_deletion()
  t-reftable-record: add comparison tests for obj records
  t-reftable-record: add comparison tests for index records
  t-reftable-record: add comparison tests for ref records
  t-reftable-record: add reftable_record_cmp() tests for log records
  t: move reftable/record_test.c to the unit testing framework
2024-07-15 10:11:44 -07:00
f582dc3c5a Merge branch 'jc/disable-push-nego-for-deletion'
"git push" that pushes only deletion gave an unnecessary and
harmless error message when push negotiation is configured, which
has been corrected.

* jc/disable-push-nego-for-deletion:
  push: avoid showing false negotiation errors
2024-07-15 10:11:43 -07:00
fbeed643b9 Merge branch 'ri/doc-show-branch-fix'
Docfix.

* ri/doc-show-branch-fix:
  doc: fix the max number of branches shown by "show-branch"
2024-07-15 10:11:43 -07:00
d319ad5704 Merge branch 'tb/dev-build-pedantic-fix'
Developer build procedure fix.

* tb/dev-build-pedantic-fix:
  config.mak.dev: fix typo when enabling -Wpedantic
2024-07-15 10:11:42 -07:00
76f49679b1 Merge branch 'rs/clang-format-updates'
Custom control structures we invented more recently have been
taught to the clang-format file.

* rs/clang-format-updates:
  clang-format: include kh_foreach* macros in ForEachMacros
2024-07-15 10:11:42 -07:00
ccb74f51c9 Merge branch 'am/gitweb-feed-use-committer-date'
GitWeb update to use committer date consistently in rss/atom feeds.

* am/gitweb-feed-use-committer-date:
  gitweb: rss/atom change published/updated date to committer date
2024-07-15 10:11:41 -07:00
820e796984 Merge branch 'jk/tests-without-dns'
Test suite has been taught not to unnecessarily rely on DNS failing
a bogus external name.

* jk/tests-without-dns:
  t/lib-bundle-uri: use local fake bundle URLs
  t5551: do not confirm that bogus url cannot be used
  t5553: use local url for invalid fetch
2024-07-15 10:11:41 -07:00
cda729581b Merge branch 'gt/unit-test-oidmap'
An existing test of oidmap API has been rewritten with the
unit-test framework.

* gt/unit-test-oidmap:
  t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
2024-07-15 10:11:40 -07:00
b227482ea0 Merge branch 'as/describe-broken-refresh-index-fix'
"git describe --dirty --broken" forgot to refresh the index before
seeing if there is any chang, ("git describe --dirty" correctly did
so), which has been corrected.

* as/describe-broken-refresh-index-fix:
  describe: refresh the index when 'broken' flag is used
2024-07-15 10:11:40 -07:00
d8b9b1fc81 Merge branch 'rj/t0613-no-longer-leaks'
A test that no longer leaks has been marked as such.

* rj/t0613-no-longer-leaks:
  t0613: mark as leak-free
2024-07-15 10:11:39 -07:00
84fc58f24b Merge branch 'rj/t0612-no-longer-leaks'
A test that no longer leaks has been marked as such.

* rj/t0612-no-longer-leaks:
  t0612: mark as leak-free
2024-07-15 10:11:39 -07:00
141e13ee1a t-strvec: improve check_strvec() output
The macro check_strvec calls the function check_strvec_loc(), which
performs the actual checks.  They report the line number inside that
function on error, which is not very helpful.  Before the previous
patch half of them triggered an assertion that reported the caller's
line number using a custom message, which was more useful, but a bit
awkward.

Improve the output by getting rid of check_strvec_loc() and performing
all checks within check_strvec, as they then report the line number of
the call site, aiding in finding the broken test.  Determine the number
of items and check it up front to avoid having to do them both in the
loop and at the end.

Sanity check the expected items to make sure there are any and that the
last one is NULL, as the compiler no longer does that for us with the
removal of the function attribute LAST_ARG_MUST_BE_NULL.

Use only the actual strvec name passed to the macro, the internal
"expect" array name and an index "i" in the output, for clarity.  While
"expect" does not exist at the call site, it's reasonably easy to infer
that it's referring to the NULL-terminated list of expected strings,
converted to an array.

Here's the output with less items than expected in the strvec before:

 # check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
 #    left: 1
 #   right: 1

... and with the patch:

 # check "(&vec)->nr == ARRAY_SIZE(expect) - 1" failed at t/unit-tests/t-strvec.c:53
 #    left: 1
 #   right: 2

With too many items in the strvec we got before:

 # check "vec->nr == nr" failed at t/unit-tests/t-strvec.c:34
 #    left: 1
 #   right: 0
 # check "vec->v[nr] == NULL" failed at t/unit-tests/t-strvec.c:36
 #    left: 0x6000004b8010
 #   right: 0x0

... and with the patch:

 # check "(&vec)->nr == ARRAY_SIZE(expect) - 1" failed at t/unit-tests/t-strvec.c:53
 #    left: 1
 #   right: 0

A broken alloc value was reported like this:

 # check "vec->alloc > nr" failed at t/unit-tests/t-strvec.c:20
 #    left: 0
 #   right: 0

 ... and with the patch:

 # check "(&vec)->nr <= (&vec)->alloc" failed at t/unit-tests/t-strvec.c:56
 #    left: 2
 #   right: 0

An unexpected string value was reported like this:

 # check "!strcmp(vec->v[nr], str)" failed at t/unit-tests/t-strvec.c:24
 #    left: "foo"
 #   right: "bar"
 #      nr: 0

... and with the patch:

 # check "!strcmp((&vec)->v[i], expect[i])" failed at t/unit-tests/t-strvec.c:53
 #    left: "foo"
 #   right: "bar"
 #       i: 0

If the strvec is not NULL terminated, we got:

 # check "vec->v[nr] == NULL" failed at t/unit-tests/t-strvec.c:36
 #    left: 0x102c3abc8
 #   right: 0x0

... and with the patch we get the line number of the caller:

 # check "!strcmp((&vec)->v[i], expect[i])" failed at t/unit-tests/t-strvec.c:53
 #    left: "bar"
 #   right: NULL
 #       i: 1

check_strvec calls without a trailing NULL were detected at compile time
before:

 t/unit-tests/t-strvec.c:71:2: error: missing sentinel in function call [-Werror,-Wsentinel]

... and with the patch it's only found at runtime:

 # check "expect[ARRAY_SIZE(expect) - 1] == NULL" failed at t/unit-tests/t-strvec.c:53
 #    left: 0x100e5a663
 #   right: 0x0

We can let check_strvec add the terminating NULL for us and remove it
from callers, making it impossible to forget.  Leave that conversion for
a future patch, though, since this reimplementation is already intrusive
enough.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-15 07:39:12 -07:00
9c93ba4d0a merge-recursive: honor diff.algorithm
The documentation claims that "recursive defaults to the diff.algorithm
config setting", but this is currently not the case. This fixes it,
ensuring that diff.algorithm is used when -Xdiff-algorithm is not
supplied. This affects the following porcelain commands: "merge",
"rebase", "cherry-pick", "pull", "stash", "log", "am" and "checkout".
It also affects the "merge-tree" ancillary interrogator.

This change refactors the initialization of merge options to introduce
two functions, "init_merge_ui_options" and "init_merge_basic_options"
instead of just one "init_merge_options". This design follows the
approach used in diff.c, providing initialization methods for
porcelain and plumbing commands respectively. Thanks to that, the
"replay" and "merge-recursive" plumbing commands remain unaffected by
diff.algorithm.

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 18:10:49 -07:00
9ed143ee33 var(win32): do report the GIT_SHELL_PATH that is actually used
On Windows, Unix-like paths like `/bin/sh` make very little sense. In
the best case, they simply don't work, in the worst case they are
misinterpreted as absolute paths that are relative to the drive
associated with the current directory.

To that end, Git does not actually use the path `/bin/sh` that is
recorded e.g. when `run_command()` is called with a Unix shell
command-line. Instead, as of 776297548e (Do not use SHELL_PATH from
build system in prepare_shell_cmd on Windows, 2012-04-17), it
re-interprets `/bin/sh` as "look up `sh` on the `PATH` and use the
result instead".

This is the logic users expect to be followed when running `git var
GIT_SHELL_PATH`.

However, when 1e65721227 (var: add support for listing the shell,
2023-06-27) introduced support for `git var GIT_SHELL_PATH`, Windows was
not special-cased as above, which is why it outputs `/bin/sh` even
though that disagrees with what Git actually uses.

Let's fix this by using the exact same logic as `prepare_shell_cmd()`,
adjusting the Windows-specific `git var GIT_SHELL_PATH` test case to
verify that it actually finds a working executable.

Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:37 -07:00
877da5e208 run-command: declare the git_shell_path() function globally
The intention is to use it in `git var GIT_SHELL_PATH`, therefore we
need this function to stop being file-local only.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:37 -07:00
92fe7c7d42 run-command(win32): resolve the path to the Unix shell early
In 776297548e (Do not use SHELL_PATH from build system in
prepare_shell_cmd on Windows, 2012-04-17), the hard-coded path to the
Unix shell was replaced by passing `sh` instead when executing Unix
shell scripts in Git.

This was done because the hard-coded path to the Unix shell is incorrect
on Windows because it not only is a Unix-style absolute path instead of
a Windows one, but Git uses the runtime prefix feature on Windows, i.e.
the correct path cannot be hard-coded.

Naturally, the `sh` argument will be resolved to the full path of said
executable eventually.

To help fixing the bug where `git var GIT_SHELL_PATH` currently does not
reflect that logic, but shows that incorrect hard-coded Unix-style
absolute path, let's resolve the full path to the `sh` executable early
in the `git_shell_path()` function so that we can use it in `git var`,
too, and be sure that the output is equivalent to what `run_command()`
does when it is asked to execute a command-line using a Unix shell.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:37 -07:00
f1ed769a3b mingw(is_msys2_sh): handle forward slashes in the sh.exe path, too
Whether the full path to the MSYS2 Bash is specified using backslashes
or forward slashes, in either case the command-line arguments need to be
quoted in the MSYS2-specific manner instead of using regular Win32
command-line quoting rules.

In preparation for `prepare_shell_cmd()` to use the full path to
`sh.exe` (with forward slashes for consistency), let's teach the
`is_msys2_sh()` function about this; Otherwise 5580.4 'clone with
backslashed path' would fail once `prepare_shell_cmd()` uses the full
path instead of merely `sh`.

This patch relies on the just-introduced fix where `fspathcmp()` handles
backslashes and forward slashes as equivalent on Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:37 -07:00
193eda7507 win32: override fspathcmp() with a directory separator-aware version
On Windows, the backslash is the directory separator, even if the
forward slash can be used, too, at least since Windows NT.

This means that the paths `a/b` and `a\b` are equivalent, and
`fspathcmp()` needs to be made aware of that fact.

Note that we have to override both `fspathcmp()` and `fspathncmp()`, and
the former cannot be a mere pre-processor constant that transforms calls
to `fspathcmp(a, b)` into `fspathncmp(a, b, (size_t)-1)` because the
function `report_collided_checkout()` in `unpack-trees.c` wants to
assign `list.cmp = fspathcmp`.

Also note that `fspatheq()` does _not_ need to be overridden because it
calls `fspathcmp()` internally.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:36 -07:00
ce68178a0a strvec: declare the strvec_push_nodup() function globally
This function differs from `strvec_push()` in that it takes ownership of
the allocated string that is passed as second argument.

This is useful when appending elements to the string array that have
been freshly allocated and serve no further other purpose after that.

Without declaring this function globally, call sites would allocate the
memory, only to have `strvec_push()` duplicate the string, and then the
first copy would need to be released. Having this function globally
avoids that kind of unnecessary work.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:36 -07:00
0593c1ea30 run-command: refactor getting the Unix shell path into its own function
This encapsulates the platform-specific logic better.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:36 -07:00
872721538c cmake: fix build of t-oidtree
When the `oidtree` test helper was turned into a unit test, a new
`lib-oid` source file was added as dependency. This was only done in the
Makefile so far, but also needs to be done in the CMake definition.

This is a companion of ed54840872 (t/: migrate helper/test-oidtree.c
to unit-tests/t-oidtree.c, 2024-06-08).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 14:32:52 -07:00
9a1fb8af98 t-reftable-merged: add test for REFTABLE_FORMAT_ERROR
When calling reftable_new_merged_table(), if the hash ID of the
passed reftable_table parameter doesn't match the passed hash_id
parameter, a REFTABLE_FORMAT_ERROR is thrown. This case is
currently left unexercised, so add a test for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:41 -07:00
40c80eab83 t-reftable-merged: use reftable_ref_record_equal to compare ref records
In the test t_merged_single_record() defined in t-reftable-merged.c,
the 'input' and 'expected' ref records are checked for equality
by comparing their update indices. It is very much possible for
two different ref records to have the same update indices. Use
reftable_ref_record_equal() instead for a stronger check.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:41 -07:00
84958ec754 t-reftable-merged: add tests for reftable_merged_table_max_update_index
reftable_merged_table_max_update_index() as defined by reftable/
merged.{c, h} returns the maximum update index in a merged table.
Since this function is currently unexercised, add tests for it.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:40 -07:00
8d4f8165d8 t-reftable-merged: improve the const-correctness of helper functions
In t-reftable-merged.c, a number of helper functions used by the
tests can be re-defined with parameters made 'const' which makes
it easier to understand if they're read-only or not. Re-define
these functions along these lines.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:40 -07:00
c755c2f351 t-reftable-merged: improve the test t_merged_single_record()
In t-reftable-merged.c, the test t_merged_single_record() ensures
that a ref ('a') which occurs in only one of the records ('r2')
can be retrieved. Improve this test by adding another record 'r3'
to ensure that ref 'a' only occurs in 'r2' and that merged tables
don't simply read the last record.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:40 -07:00
e8ed7d1974 t: harmonize t-reftable-merged.c with coding guidelines
Harmonize the newly ported test unit-tests/t-reftable-merged.c
with the following guidelines:
- Single line control flow statements like 'for' and 'if'
  must omit curly braces.
- Structs must be 0-initialized with '= { 0 }' instead of '= { NULL }'.
- Array indices should preferably be of type 'size_t', not 'int'.
- It is fine to use C99 initial declaration in 'for' loop.

While at it, use 'ARRAY_SIZE(x)' to store the number of elements
in an array instead of hardcoding them.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:39 -07:00
9cdfd1d7df t: move reftable/merged_test.c to the unit testing framework
reftable/merged_test.c exercises the functions defined in
reftable/merged.{c, h}. Migrate reftable/merged_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests according to unit-tests' naming
conventions.

Also, move strbuf_add_void() and noop_flush() from
reftable/test_framework.c to the ported test. This is because
both these functions are used in the merged tests and
reftable/test_framework.{c, h} is not #included in the ported test.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:39 -07:00
757c6ee7a3 builtin/push: call set_refspecs after validating remote
When an end-user runs "git push" with an empty string for the remote
repository name, e.g.

    $ git push '' main

"git push" fails with a BUG(). Even though this is a nonsense request
that we want to fail, we shouldn't hit a BUG().  Instead we want to give
a sensible error message, e.g., 'bad repository'".

This is because since 9badf97c42 (remote: allow resetting url list,
2024-06-14), we reset the remote URL if the provided URL is empty. When
a user of 'remotes_remote_get' tries to fetch a remote with an empty
repo name, the function initializes the remote via 'make_remote'. But
the remote is still not a valid remote, since the URL is empty, so it
tries to add the URL alias using 'add_url_alias'. This in-turn will call
'add_url', but since the URL is empty we call 'strvec_clear' on the
`remote->url`. Back in 'remotes_remote_get', we again check if the
remote is valid, which fails, so we return 'NULL' for the 'struct
remote *' value.

The 'builtin/push.c' code, calls 'set_refspecs' before validating the
remote. This worked with empty repo names earlier since we would get a
remote, albeit with an empty URL. With the new changes, we get a 'NULL'
remote value, this causes the check for remote to fail and raises the
BUG in 'set_refspecs'.

Do a simple fix by doing remote validation first. Also add a test to
validate the bug fix. With this, we can also now directly pass remote to
'set_refspecs' instead of it trying to lazily obtain it.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:14:11 -07:00
a7dae3bdc8 Git 2.46-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 08:41:58 -07:00
e6ae4d6efe Merge branch 'rs/simplify-submodule-helper-super-prefix-invocation'
Code clean-up.

* rs/simplify-submodule-helper-super-prefix-invocation:
  submodule--helper: use strvec_pushf() for --super-prefix
2024-07-12 08:41:58 -07:00
7c01dcd018 Merge branch 'as/pathspec-h-typofix'
Typofix.

* as/pathspec-h-typofix:
  pathspec: fix typo "glossary-context.txt" -> "glossary-content.txt"
2024-07-12 08:41:57 -07:00
8d20119551 doc: update http.cookieFile with in-memory cookie processing
Documentation only mentions how to read cookies from the given file
and how to save them to the file using http.saveCookies.

But underlying libcURL allows the HTTP cookies used only in memory;
cookies from the server will be accepted and sent back in successive
requests within same connection, by using an empty string as the
filename.  Document this.

Signed-off-by: Piotr Szlazak <piotr.szlazak@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-11 08:50:30 -07:00
8c1d6691bc test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by default
As we currently describe in t/README, it can happen that:

    Some tests run "git" (or "test-tool" etc.) without properly checking
    the exit code, or git will invoke itself and fail to ferry the
    abort() exit code to the original caller.

Therefore, GIT_TEST_SANITIZE_LEAK_LOG=true is needed to be set to
capture all memory leaks triggered by our tests.

It seems unnecessary to force users to remember this option, as
forgetting it could lead to missed memory leaks.

We could solve the problem by making it "true" by default, but that
might suggest we think "false" makes sense, which isn't the case.

Therefore, the best approach is to remove the option entirely while
maintaining the capability to detect memory leaks in blind spots of our
tests.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-11 08:37:44 -07:00
55fe61559e t/.gitattributes: ignore whitespace in chainlint expect files
The ".expect" files in t/chainlint/ are snippets of expected output from
the chainlint script, and do not necessarily conform to our usual code
style. Especially with the recent change to retain line numbers, blank
lines in the input script end up with trailing whitespace as we print
"3 " for line 3, for example. The point of these files is to match the
output verbatim, so let's not complain about the trailing spaces.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:15:40 -07:00
f6b75726b2 t: convert some here-doc test bodies
The t1404 script checks a lot of output from Git which contains single
quotes. Because the test snippets are themselves wrapped in the same
single-quotes, we have to resort to using $SQ to match them.  This is
error-prone and makes the tests harder to read.

Instead, let's use the new here-doc feature added in the previous
commit, which lets us write anything in the test body we want (except
the here-doc end marker on a line by itself, of course).

Note that we do use "\" in our marker to avoid interpolation (which is
the whole point). But we don't use "<<-", as we want to preserve
whitespace in the snippet (and running with "-v" before and after shows
that we produce the exact same output, except with the ugly $SQ
references fixed).

I just converted every test here, even though only some of them use
$SQ. But it would be equally correct to mix-and-match styles if we don't
mind the inconsistency.

I've also converted a few tests in t0600 which were moved from t1404 (I
had written this patch before they were moved, but it seemed worth
porting over the changes rather than losing them).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:23 -07:00
1d133ae91f test-lib: allow test snippets as here-docs
Most test snippets are wrapped in single quotes, like:

  test_expect_success 'some description' '
          do_something
  '

This sometimes makes the snippets awkward to write, because you can't
easily use single quotes within them. We sometimes work around this with
$SQ, or by loosening regexes to use "." instead of a literal quote, or
by using double quotes when we'd prefer to use single-quotes (and just
adding extra backslash-escapes to avoid interpolation).

This commit adds another option: feeding the snippet via the function's
stdin. This doesn't conflict with anything the snippet would want to do,
because we always redirect its stdin from /dev/null anyway (which we'll
continue to do).

A few notes on the implementation:

  - it would be nice to push this down into test_run_, but we can't, as
    test_expect_success and test_expect_failure want to see the actual
    script content to report it for verbose-mode. A helper function
    limits the amount of duplication in those callers here.

  - The helper function is a little awkward to call, as you feed it the
    name of the variable you want to set. The more natural thing in
    shell would be command substitution like:

      body=$(body_or_stdin "$2")

    but that loses trailing whitespace. There are tricks around this,
    like:

      body=$(body_or_stdin "$2"; printf .)
      body=${body%.}

    but we'd prefer to keep such tricks in the helper, not in each
    caller.

  - I implemented the helper using a sequence of "read" calls. Together
    with "-r" and unsetting the IFS, this preserves incoming whitespace.
    An alternative is to use "cat" (which then requires the gross "."
    trick above). But this saves us a process, which is probably a good
    thing. The "read" builtin does use more read() syscalls than
    necessary (one per byte), but that is almost certainly a win over a
    separate process.

    Both are probably slower than passing a single-quoted string, but
    the difference is lost in the noise for a script that I converted as
    an experiment.

  - I handle test_expect_success and test_expect_failure here. If we
    like this style, we could easily extend it to other spots (e.g.,
    lazy_prereq bodies) on top of this patch.

  - even though we are using "local", we have to be careful about our
    variable names. Within test_expect_success, any variable we declare
    with local will be seen as local by the test snippets themselves (so
    it wouldn't persist between tests like normal variables would).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:23 -07:00
0c7d630220 chainlint.pl: add tests for test body in heredoc
The chainlint.pl script recently learned about the upcoming:

  test_expect_success 'some test' - <<\EOT
	TEST_BODY
  EOT

syntax, where TEST_BODY should be checked in the usual way. Let's make
sure this works by adding a few tests. The "here-doc-body" file tests
the basic syntax, including an embedded here-doc which we should still
be able to recognize.

Likewise the "here-doc-body-indent" checks the same thing, but using the
"<<-" operator. We wouldn't expect this to be used normally, but we
would not want to accidentally miss a body that uses it. The
"pathological" variant checks the opposite: we don't get confused by an
indented tag within the here-doc body.

The "here-doc-double" tests the handling of two here-doc tags on the
same line. This is not something we'd expect anybody to do in practice,
but the code was written defensively to handle this, so let's make sure
it works.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
a4a5f282f5 chainlint.pl: recognize test bodies defined via heredoc
In order to check tests for semantic problems, chainlint.pl scans test
scripts, looking for tests defined as:

    test_expect_success [prereq] title '
        body
    '

where `body` is a single string which is then treated as a standalone
chunk of code and "linted" to detect semantic issues. (The same happens
for `test_expect_failure` definitions.)

The introduction of test definitions in which the test body is instead
presented via a heredoc rather than as a single string creates a blind
spot in the linting process since such invocations are not recognized by
chainlint.pl.

Prepare for this new style by also recognizing tests defined as:

    test_expect_success [prereq] title - <<\EOT
        body
    EOT

A minor complication is that chainlint.pl has never considered heredoc
bodies significant since it doesn't scan them for semantic problems,
thus it has always simply thrown them away. However, with the new
`test_expect_success` calling sequence, heredoc bodies become
meaningful, thus need to be captured.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
03763e68fb chainlint.pl: check line numbers in expected output
While working on chainlint.pl recently, we introduced some bugs that
showed incorrect line numbers in the output. But it was hard to notice,
since we sanitize the output by removing all of the line numbers! It
would be nice to retain these so we can catch any regressions.

The main reason we sanitize is for maintainability: we concatenate all
of the test snippets into a single file, so it's hard for each ".expect"
file to know at which offset its test input will be found. We can handle
that by storing the per-test line numbers in the ".expect" files, and
then dynamically offsetting them as we build the concatenated test and
expect files together.

The changes to the ".expect" files look like tedious boilerplate, but it
actually makes adding new tests easier. You can now just run:

  perl chainlint.pl chainlint/foo.test |
  tail -n +2 >chainlint/foo.expect

to save the output of the script minus the comment headers (after
checking that it is correct, of course). Whereas before you had to strip
the line numbers. The conversions here were done mechanically using
something like the script above, and then spot-checked manually.

It would be possible to do all of this in shell via the Makefile, but it
gets a bit complicated (and requires a lot of extra processes). Instead,
I've written a short perl script that generates the concatenated files
(we already depend on perl, since chainlint.pl uses it). Incidentally,
this improves a few other things:

  - we incorrectly used $(CHAINLINTTMP_SQ) inside a double-quoted
    string. So if your test directory required quoting, like:

       make "TEST_OUTPUT_DIRECTORY=/tmp/h'orrible"

    we'd fail the chainlint tests.

  - the shell in the Makefile didn't handle &&-chaining correctly in its
    loops (though in practice the "sed" and "cat" invocations are not
    likely to fail).

  - likewise, the sed invocation to strip numbers was hiding the exit
    code of chainlint.pl itself. In practice this isn't a big deal;
    since there are linter violations in the test files, we expect it to
    exit non-zero. But we could later use exit codes to distinguish
    serious errors from expected ones.

  - we now use a constant number of processes, instead of scaling with
    the number of test scripts. So it should be a little faster (on my
    machine, "make check-chainlint" goes from 133ms to 73ms).

There are some alternatives to this approach, but I think this is still
a good intermediate step:

  1. We could invoke chainlint.pl individually on each test file, and
     compare it to the expected output (and possibly using "make" to
     avoid repeating already-done checks). This is a much bigger change
     (and we'd have to figure out what to do with the "# LINT" lines in
     the inputs). But in this case we'd still want the "expect" files to
     be annotated with line numbers. So most of what's in this patch
     would be needed anyway.

  2. Likewise, we could run a single chainlint.pl and feed it all of the
     scripts (with "--jobs=1" to get deterministic output). But we'd
     still need to annotate the scripts as we did here, and we'd still
     need to either assemble the "expect" file, or break apart the
     script output to compare to each individual ".expect" file.

So we may pursue those in the long run, but this patch gives us more
robust tests without too much extra work or moving in a useless
direction.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
382f6edaee chainlint.pl: force CRLF conversion when opening input files
The lexer in chainlint.pl can't handle CRLF line endings; it complains
about an internal error in scan_token() if we see one. For example, in
our Windows CI environment:

  $ perl chainlint.pl chainlint/for-loop.test | cat -v
  Thread 2 terminated abnormally: internal error scanning character '^M'

This doesn't break "make check-chainlint" (yet), because we assemble a
concatenated input by passing the contents of each file through "sed".
And the "sed" we use will strip out the CRLFs. But the next patch is
going to rework this a bit, which does break check-chainlint on Windows.
Plus it's probably nicer to folks on Windows who might work on chainlint
itself and write new tests.

In theory we could fix the parser to handle this, but it's not really
worth the trouble. We should be able to ask the input layer to translate
the line endings for us. In fact, I'd expect this to happen by default,
as perl's documentation claims Win32 uses the ":unix:crlf" PERLIO layer
by default ("unix" here just refers to using read/write syscalls, and
then "crlf" layers the translation on top). However, this doesn't seem
to be the case in our Windows CI environment. I didn't dig into the
exact reason, but it is perhaps because we are using an msys build of
perl rather than a "true" Win32 build.

At any rate, it is easy-ish to just ask explicitly for the conversion.
In the above example, setting PERLIO=crlf in the environment is enough
to make it work. Curiously, though, this doesn't work when invoking
chainlint via "make". Again, I didn't dig into it, but it may have to do
with msys programs calling Windows programs or vice versa.

We can make it work consistently by just explicitly asking for CRLF
translation when we open the files. This will even work on non-Windows
platforms, though we wouldn't really expect to find CRLF files there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
d558509e25 chainlint.pl: do not spawn more threads than we have scripts
The chainlint.pl script spawns worker threads to check many scripts in
parallel. This is good if you feed it a lot of scripts. But if you give
it few (or one), then the overhead of spawning the threads dominates. We
can easily notice that we have fewer scripts than threads and scale back
as appropriate.

This patch reduces the time to run:

  time for i in chainlint/*.test; do
	perl chainlint.pl $i
  done >/dev/null

on my system from ~4.1s to ~1.1s, where I have 8+8 cores.

As with the previous patch, this isn't the usual way we run chainlint
(we feed many scripts at once, which is why it supports threading in the
first place). So this won't make a big difference in the real world, but
it may help us out in the future, and it makes experimenting with and
debugging the chainlint tests a bit more pleasant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:21 -07:00
a7c1c10256 chainlint.pl: only start threads if jobs > 1
If the system supports threads, chainlint.pl will always spawn worker
threads to do the real work. But when --jobs=1, this is pointless, since
we could just do the work in the main thread. And spawning even a single
thread has a high overhead. For example, on my Linux system, running:

  for i in chainlint/*.test; do
	perl chainlint.pl --jobs=1 $i
  done >/dev/null

takes ~1.7s without this patch, and ~1.1s after. We don't usually spawn
a bunch of individual chainlint.pl processes (instead we feed several
scripts at once, and the parallelism outweighs the setup cost). But it's
something we've considered doing, and since we already have fallback
code for systems without thread support, it's pretty easy to make this
work.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:21 -07:00
a5e450144d chainlint.pl: add test_expect_success call to test snippets
The chainlint tests are a series of individual files, each holding a
test body. The "make check-chainlint" target assembles them into a
single file, adding a "test_expect_success" function call around each.
Let's instead include that function call in the files themselves. This
is a little more boilerplate, but has several advantages:

  1. You can now run chainlint manually on snippets with just "perl
     chainlint.perl chainlint/foo.test". This can make developing and
     debugging a little easier.

  2. Many of the tests implicitly relied on the syntax of the lines
     added by the Makefile (in particular the use of single-quotes).
     This assumption is much easier to see when the single-quotes are
     alongside the test body.

  3. We had no way to test how the chainlint program handled
     various test_expect_success lines themselves. Now we'll be able to
     check variations.

The change to the .test files was done mechanically, using the same
test names they would have been assigned by the Makefile (this is
important to match the expected output). The Makefile has the minimal
change to drop the extra lines; there are more cleanups possible but a
future patch in this series will rewrite this substantially anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:21 -07:00
4f5822076f http.c: cookie file tightening
The http.cookiefile configuration variable is used to call
curl_easy_setopt() to set CURLOPT_COOKIEFILE and if http.savecookies
is set, the same value is used for CURLOPT_COOKIEJAR.  The former is
used only to read cookies at startup, the latter is used to write
cookies at the end.

The manual pages https://curl.se/libcurl/c/CURLOPT_COOKIEFILE.html
and https://curl.se/libcurl/c/CURLOPT_COOKIEJAR.html talk about two
interesting special values.

 * "" (an empty string) given to CURLOPT_COOKIEFILE means not to
   read cookies from any file upon startup.

 * It is not specified what "" (an empty string) given to
   CURLOPT_COOKIEJAR does; presumably open a file whose name is an
   empty string and write cookies to it?  In any case, that is not
   what we want to see happen, ever.

 * "-" (a dash) given to CURLOPT_COOKIEFILE makes cURL read cookies
   from the standard input, and given to CURLOPT_COOKIEJAR makes
   cURL write cookies to the standard output.  Neither of which we
   want ever to happen.

So, let's make sure we avoid these nonsense cases.  Specifically,
when http.cookies is set to "-", ignore it with a warning, and when
it is set to "" and http.savecookies is set, ignore http.savecookies
with a warning.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:28:38 -07:00
610cbc1dfb http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response.  Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.

However, there may be times when a user would like to authenticate
nevertheless.  For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use.  Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.

Let's make this possible with a new option, "http.proactiveAuth".  This
option specifies a type of authentication which can be used to
authenticate against the host in question.  This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.

If we're in auto mode and we got a username and password, set the
authentication scheme to Basic.  libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves.  In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.

Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided.  Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it.  While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:27:51 -07:00
70405acf60 doc: mention that proxies must be completely transparent
We already document in the FAQ that proxies must be completely
transparent and not modify the request or response in any way, but add
similar documentation to the http.proxy entry.  We know that while the
FAQ is very useful, users sometimes are less likely to read in favor of
the documentation specific to an option or command, so adding it in both
places will help users be adequately informed.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
804ecbcfd1 gitfaq: add entry about syncing working trees
Users very commonly want to sync their working tree with uncommitted
changes across machines, often to carry across in-progress work or
stashes.  Despite this not being a recommended approach, users want to
do it and are not dissuaded by suggestions not to, so let's recommend a
sensible technique.

The technique that many users are using is their preferred cloud syncing
service, which is a bad idea.  Users have reported problems where they
end up with duplicate files that won't go away (with names like "file.c
2"), broken references, oddly named references that have date stamps
appended to them, missing objects, and general corruption and data loss.
That's because almost all of these tools sync file by file, which is a
great technique if your project is a single word processing document or
spreadsheet, but is utterly abysmal for Git repositories because they
don't necessarily snapshot the entire repository correctly.  They also
tend to sync the files immediately instead of when the repository is
quiescent, so writing multiple files, as occurs during a commit or a gc,
can confuse the tools and lead to corruption.

We know that the old standby, rsync, is up to the task, provided that
the repository is quiescent, so let's suggest that and dissuade people
from using cloud syncing tools.  Let's tell people about common things
they should be aware of before doing this and that this is still
potentially risky.  Additionally, let's tell people that Git's security
model does not permit sharing working trees across users in case they
planned to do that.  While we'd still prefer users didn't try to do
this, hopefully this will lead them in a safer direction.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
c98f78b806 gitfaq: give advice on using eol attribute in gitattributes
In the FAQ, we tell people how to use the text attribute, but we fail to
explain what to do with the eol attribute.  As we ourselves have
noticed, most shell implementations do not care for carriage returns,
and as such, people will practically always want them to use LF endings.
Similar things can be said for batch files on Windows, except with CRLF
endings.

Since these are common things to have in a repository, let's help users
make a good decision by recommending that they use the gitattributes
file to correctly check out the endings.

In addition, let's correct the cross-reference to this question, which
originally referred to "the following entry", even though a new entry
has been inserted in between.  The cross-reference notation should
prevent this from occurring and provide a link in formats, such as HTML,
which support that.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
2101341484 gitfaq: add documentation on proxies
Many corporate environments and local systems have proxies in use.  Note
the situations in which proxies can be used and how to configure them.
At the same time, note what standards a proxy must follow to work with
Git.  Explicitly call out certain classes that are known to routinely
have problems reported various places online, including in the Git for
Windows issue tracker and on Stack Overflow, and recommend against the
use of such software, noting that they are associated with myriad
security problems (including, for example, breaking sandboxing and image
integrity[0], and, for TLS middleboxes, the use of insecure protocols
and ciphers and lack of certificate verification[1]). Don't mention the
specific nature of these security problems in the FAQ entry because they
are extremely numerous and varied and we wish to keep the FAQ entry
relatively brief.

[0] https://issues.chromium.org/issues/40285192
[1] https://faculty.cc.gatech.edu/~mbailey/publications/ndss17_interception.pdf

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
58696bfcaa ci: unify bash calling convention
Under ci/ hierarchy, we run scripts under either "sh" (any Bourne
compatible POSIX shell would work) or specifically "bash" (as they
require features from bash, e.g., ${parameter/pattern/string}
expansion).  As we have the CI environment under our control, we can
expect that /bin/sh will always be fine to run the scripts that only
require a Bourne shell, but we may not know where "bash" is
installed depending on the distro used.

So let's make sure we start these scripts with either one of these:

	#!/bin/sh
	#!/usr/bin/env bash

Yes, the latter has to assume that everybody installs "env" at that
path and not as /bin/env or /usr/local/bin/env, but this currently
is the best we could do.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 16:23:05 -07:00
557ae147e6 The ninteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 14:53:11 -07:00
a43b001cce Merge branch 'ds/sparse-lstat-caching'
The code to deal with modified paths that are out-of-cone in a
sparsely checked out working tree has been optimized.

* ds/sparse-lstat-caching:
  sparse-index: improve lstat caching of sparse paths
  sparse-index: count lstat() calls
  sparse-index: use strbuf in path_found()
  sparse-index: refactor path_found()
  sparse-checkout: refactor skip worktree retry logic
2024-07-08 14:53:11 -07:00
125e389470 Merge branch 'xx/bundie-uri-fixes'
When bundleURI interface fetches multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.

* xx/bundie-uri-fixes:
  unbundle: extend object verification for fetches
  fetch-pack: expose fsckObjects configuration logic
  bundle-uri: verify oid before writing refs
2024-07-08 14:53:11 -07:00
3997614c24 Merge branch 'ps/leakfixes-more'
More memory leaks have been plugged.

* ps/leakfixes-more: (29 commits)
  builtin/blame: fix leaking ignore revs files
  builtin/blame: fix leaking prefixed paths
  blame: fix leaking data for blame scoreboards
  line-range: plug leaking find functions
  merge: fix leaking merge bases
  builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
  sequencer: fix memory leaks in `make_script_with_merges()`
  builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
  apply: fix leaking string in `match_fragment()`
  sequencer: fix leaking string buffer in `commit_staged_changes()`
  commit: fix leaking parents when calling `commit_tree_extended()`
  config: fix leaking "core.notesref" variable
  rerere: fix various trivial leaks
  builtin/stash: fix leak in `show_stash()`
  revision: free diff options
  builtin/log: fix leaking commit list in git-cherry(1)
  merge-recursive: fix memory leak when finalizing merge
  builtin/merge-recursive: fix leaking object ID bases
  builtin/difftool: plug memory leaks in `run_dir_diff()`
  object-name: free leaking object contexts
  ...
2024-07-08 14:53:10 -07:00
ecf7fc600a Merge branch 'tb/path-filter-fix'
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.

* tb/path-filter-fix:
  bloom: introduce `deinit_bloom_filters()`
  commit-graph: reuse existing Bloom filters where possible
  object.h: fix mis-aligned flag bits table
  commit-graph: new Bloom filter version that fixes murmur3
  commit-graph: unconditionally load Bloom filters
  bloom: prepare to discard incompatible Bloom filters
  bloom: annotate filters with hash version
  repo-settings: introduce commitgraph.changedPathsVersion
  t4216: test changed path filters with high bit paths
  t/helper/test-read-graph: implement `bloom-filters` mode
  bloom.h: make `load_bloom_filter_from_graph()` public
  t/helper/test-read-graph.c: extract `dump_graph_info()`
  gitformat-commit-graph: describe version 2 of BDAT
  commit-graph: ensure Bloom filters are read with consistent settings
  revision.c: consult Bloom filters for root commits
  t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
2024-07-08 14:53:10 -07:00
6f75d230a1 Merge branch 'db/date-underflow-fix'
date parser updates to be more careful about underflowing epoch
based timestamp.

* db/date-underflow-fix:
  date: detect underflow/overflow when parsing dates with timezone offset
  t0006: simplify prerequisites
2024-07-08 14:53:09 -07:00
4e18cd5ef7 Merge branch 'rj/pager-die-upon-exec-failure'
When GIT_PAGER failed to spawn, depending on the code path taken,
we failed immediately (correct) or just spew the payload to the
standard output (incorrect).  The code now always fail immediately
when GIT_PAGER fails.

* rj/pager-die-upon-exec-failure:
  pager: die when paging to non-existing command
2024-07-08 14:53:08 -07:00
2fa5ae30da Merge branch 'ss/doc-eol-attr-fix'
Doc update.

* ss/doc-eol-attr-fix:
  doc: fix case error of eol attribute in example
2024-07-08 14:53:08 -07:00
87f4164124 Merge branch 'jc/archive-prefix-with-add-virtual-file'
"git archive --add-virtual-file=<path>:<contents>" never paid
attention to the --prefix=<prefix> option but the documentation
said it would. The documentation has been corrected.

* jc/archive-prefix-with-add-virtual-file:
  archive: document that --add-virtual-file takes full path
2024-07-08 14:53:07 -07:00
9479a31d60 advice: warn when sparse index expands
Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.

When users are in this state, there is usually a modified or untracked
file outside of the sparse-checkout mentioned by the output of 'git
status'. There are a number of reasons why this is insufficient:

 1. Users may not have a full understanding of which files are inside or
    outside of their sparse-checkout. This is more common in monorepos
    that manage the sparse-checkout using custom tools that map build
    dependencies into sparse-checkout definitions.

 2. In some cases, an empty directory could exist outside the
    sparse-checkout and these empty directories are not reported by 'git
    status' and friends.

 3. If the user has '.gitignore' or 'exclude' files, then 'git status'
    will squelch the warnings and not demonstrate any problems.

In order to help users who are in this state, add a new advice message
to indicate that a sparse index is expanded to a full index. This
message should be written at most once per process, so add a static
global 'give_advice_on_expansion' to sparse-index.c. Further, there is a
case in 'git sparse-checkout set' that uses the sparse index as an
in-memory data structure (even when writing a full index) so we need to
disable the message in that kind of case.

The t1092-sparse-checkout-compatibility.sh test script compares the
behavior of several Git commands across full and sparse repositories,
including sparse repositories with and without a sparse index. We need
to disable the advice in the sparse-index repo to avoid differences in
stderr. By leaving the advice on in the sparse-checkout repo (without
the sparse index), we can test the behavior of disabling the advice in
convert_to_sparse(). (Indeed, these tests are how that necessity was
discovered.) Add a test that reenables the advice and demonstrates that
the message is output.

The advice message is defined outside of expand_index() to avoid super-
wide lines. It is also defined as a macro to avoid compile issues with
-Werror=format-security.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 12:23:59 -07:00
428c40da61 doc: fix the max number of branches shown by "show-branch"
The number to be displayed is calculated by the following defined in
object.h:

    #define REV_SHIFT        2
    #define MAX_REVS        (FLAG_BITS - REV_SHIFT)

FLAG_BITS is currently 28, so 26 is the correct number.

Signed-off-by: Rikita Ishikawa <lagrange.resolvent@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 08:26:46 -07:00
cf6ead095b gitweb: rss/atom change published/updated date to committer date
The author date is used for published/updated date in the rss/atom
feed stream.  Change it to the committer date that reflects the
"published/updated" definition better and makes rss/atom feeds more
linear.  Gitlab/Github rss/atom feeds use the committer date.

Additionally, to be consistent, also use the committer date to
determine the date of the last commit to send in the feed
instead of the author date.

Signed-off-by: Jesús Ariel Cabello Mateos <080ariel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-07 23:04:41 -07:00
5c9be4c9d6 Merge https://github.com/j6t/git-gui
* https://github.com/j6t/git-gui:
  git-gui: fix inability to quit after closing another instance
  git-gui: sv.po: Update Swedish translation (576t0f0u)
  git-gui: note the new maintainer
  Makefile(s): do not enforce "all indents must be done with tab"
  Makefile(s): avoid recipe prefix in conditional statements
  doc: switch links to https
  doc: update links to current pages
  git-gui: po: fix typo in French "aperçu"
2024-07-07 22:50:59 -07:00
2864e85593 Merge branch 'os/catch-rename'
The problem can be reproduced on Linux with this sequence:

1. Run git gui from a terminal.
2. Edit the commit message and wait for at least 2 seconds.
3. Terminate the instance from the terminal, for example with Ctrl-C,
   to simulate crash. This leaves the file .git/GITGUI_BCK behind.
4. Start two instances of git gui &.

At this point the first instance can be closed (it renames
.git/GITGUI_BCK to .git/GITGUI_MSG), but the seconds brings an error
message about the absent file and cannot be closed thereafter and must
be killed from the command line.

The renaming that happens by the first instance is the correct action
and need not be repeated by the second instance. It is the correct
action to ignore the failed renaming.

On the other hand, the second instance could just edit the commit
message again, wait 2 seconds to write GITGUI_BCK, and then can be
closed without failing. At this point, since the user has edited the
message, it is again correct to preserve the edited version in
GITGUI_MSG.

* os/catch-rename:
  git-gui: fix inability to quit after closing another instance
2024-07-07 14:14:59 +02:00
1457dff9be clang-format: include kh_foreach* macros in ForEachMacros
The command for generating the list of ForEachMacros searches for
macros whose name contains the string "for_each".  Include those whose
name contains "foreach" as well.  That brings in kh_foreach and
kh_foreach_value from khash.h.

Regenerating the list also brings in hashmap-based macros added by
87571c3f71 (hashmap: use *_entry APIs for iteration, 2019-10-06),
f0e63c4113 (hashmap: use *_entry APIs to wrap container_of, 2019-10-06),
4fa1d501f7 (strmap: add functions facilitating use as a string->int map,
2020-11-05), b70c82e6ed (strmap: add more utility functions,
2020-11-05), and 1201eb628a (strmap: add a strset sub-type, 2020-11-06).

for_each_abbrev is no longer found because its definition was removed by
d850b7a545 (cocci: apply the "cache.h" part of "the_repository.pending",
2023-03-28).  Note that it had been a false positive, though, as it had
been a function wrapper, not a for-like macro.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 15:12:36 -07:00
df32729866 config.mak.dev: fix typo when enabling -Wpedantic
In ebd2e4a13a (Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format
better, 2021-09-28), we tightened our Makefile's behavior to only enable
-Wpedantic when compiling with either gcc5/clang4 or greater as older
compiler versions did not have support for -Wpedantic.

Commit ebd2e4a13a was looking for either "gcc5" or "clang4" to appear in
the COMPILER_FEATURES variable, combining the two "$(filter ...)"
searches with an "$(or ...)".

But ebd2e4a13a has a typo where instead of writing:

    ifneq ($(or ($filter ...),$(filter ...)),)

we wrote:

    ifneq (($or ($filter ...),$(filter ...)),)

Causing our Makefile (when invoked with DEVELOPER=1, and a sufficiently
recent compiler version) to barf:

    $ make DEVELOPER=1
    config.mak.dev:13: extraneous text after 'ifneq' directive
    [...]

Correctly combine the results of the two "$(filter ...)" operations by
using "$(or ...)", not "$or".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 15:10:29 -07:00
c6f35e529e t-strvec: use test_msg()
check_strvec_loc() checks each strvec item by looping through them and
comparing them with expected values.  If a check fails then we'd like
to know which item is affected.  It reports that information by building
a strbuf and delivering its contents using a failing assertion, e.g.
if there are fewer items in the strvec than expected:

   # check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
   #    left: 1
   #   right: 1
   # check "strvec index 1" failed at t/unit-tests/t-strvec.c:71

Note that the index variable is "nr" and thus the interesting value is
reported twice in that example (in lines three and four).

Stop printing the index explicitly for checks that already report it.
The message for the same condition as above becomes:

   # check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
   #    left: 1
   #   right: 1

For the string comparison, whose error message doesn't include the
index, report it using the simpler and more appropriate test_msg()
instead.  Report the index using its actual variable name and format the
line like the preceding ones.  The message for an unexpected string
value becomes:

   # check "!strcmp(vec->v[nr], str)" failed at t/unit-tests/t-strvec.c:24
   #    left: "foo"
   #   right: "bar"
   #      nr: 0

Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 15:01:13 -07:00
fcf59ac136 merge-ort: fix missing early return
One of the conversions in f19b9165 (merge-ort: convert more error()
cases to path_msg(), 2024-06-19) accidentally lost the early return.

Restore it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 10:47:00 -07:00
28c1c07700 t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.

Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-03 09:12:14 -07:00
4d8ee0317f push: avoid showing false negotiation errors
When "git push" is configured to use the push negotiation, a push of
deletion of a branch (without pushing anything else) may end up not
having anything to negotiate for the common ancestor discovery.

In such a case, we end up making an internal invocation of "git
fetch --negotiate-only" without any "--negotiate-tip" parameters
that stops the negotiate-only fetch from being run, which by itself
is not a bad thing (one fewer round-trip), but the end-user sees a
"fatal: --negotiate-only needs one or more --negotiation-tip=*"
message that the user cannot act upon.

Teach "git push" to notice the situation and omit performing the
negotiate-only fetch to begin with.  One fewer process spawned, one
fewer "alarming" message given the user.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 15:06:13 -07:00
d1e6c61272 checkout: special case error messages during noop switching
"git checkout" ran with no branch and no pathspec behaves like
switching the branch to the current branch (in other words, a
no-op, except that it gives a side-effect "here are the modified
paths" report).  But unlike "git checkout HEAD" or "git checkout
main" (when you are on the 'main' branch), the user is much less
conscious that they are "switching" to the current branch.

This twists end-user expectation in a strange way.  There are
options (like "--ours") that make sense only when we are checking
out paths out of either the tree-ish or out of the index.  So the
error message the command below gives

    $ git checkout --ours
    fatal: '--ours/theirs' cannot be used with switching branches

is technically correct, but because the end-user may not even be
aware of the fact that the command they are issuing is about no-op
branch switching [*], they may find the error confusing.

Let's refactor the code to make it easier to special case the "no-op
branch switching" situation, and then customize the exact error
message for "--ours/--theirs".  Since it is more likely that the
end-user forgot to give pathspec that is required by the option,
let's make it say

    $ git checkout --ours
    fatal: '--ours/theirs' needs the paths to check out

instead.

Among the other options that are incompatible with branch switching,
there may be some that benefit by having messages tweaked when a
no-op branch switching is done, but I'll leave them as #leftoverbits
material.

[Footnote]

 * Yes, the end-users are irrational.  When they did not give
   "--ours", they take it granted that "git checkout" gives a short
   status, e.g..

    $ git checkout
    M	builtin/checkout.c
    M	t/t7201-co.sh

   exactly as a branch switching command.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 13:53:56 -07:00
06e570c0df Sync with 'maint' 2024-07-02 10:01:10 -07:00
c2ad9d68d6 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 09:59:02 -07:00
2d97b4e235 Merge branch 'rs/diff-color-moved-w-no-ext-diff-fix'
"git diff --no-ext-diff" when diff.external is configured ignored
the "--color-moved" option.

* rs/diff-color-moved-w-no-ext-diff-fix:
  diff: allow --color-moved with --no-ext-diff
2024-07-02 09:59:02 -07:00
ca349c387b Merge branch 'ew/object-convert-leakfix'
Leakfix.

* ew/object-convert-leakfix:
  object-file: fix leak on conversion failure
2024-07-02 09:59:01 -07:00
ca463101c8 Merge branch 'jk/remote-wo-url'
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.

* jk/remote-wo-url:
  remote: drop checks for zero-url case
  remote: always require at least one url in a remote
  t5801: test remote.*.vcs config
  t5801: make remote-testgit GIT_DIR setup more robust
  remote: allow resetting url list
  config: document remote.*.url/pushurl interaction
  remote: simplify url/pushurl selection
  remote: use strvecs to store remote url/pushurl
  remote: transfer ownership of memory in add_url(), etc
  remote: refactor alias_url() memory ownership
  archive: fix check for missing url
2024-07-02 09:59:01 -07:00
24cbd29164 Merge branch 'jc/fuzz-sans-curl'
CI job to build minimum fuzzers learned to pass NO_CURL=NoThanks to
the build procedure, as its build environment does not offer, or
the rest of the build needs, anything cURL.

* jc/fuzz-sans-curl:
  fuzz: minimum fuzzers environment lacks libcURL
2024-07-02 09:59:01 -07:00
43fab448cf Merge branch 'rb/build-options-w-lib-versions'
"git version --build-options" reports the version information of
OpenSSL and other libraries (if used) in the build.

* rb/build-options-w-lib-versions:
  version: teach --build-options to reports zlib version information
  version: teach --build-options to reports libcurl version information
  version: --build-options reports OpenSSL version information
2024-07-02 09:59:00 -07:00
7b472da915 Merge branch 'ps/use-the-repository'
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.

* ps/use-the-repository:
  hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
  t/helper: remove dependency on `the_repository` in "proc-receive"
  t/helper: fix segfault in "oid-array" command without repository
  t/helper: use correct object hash in partial-clone helper
  compat/fsmonitor: fix socket path in networked SHA256 repos
  replace-object: use hash algorithm from passed-in repository
  protocol-caps: use hash algorithm from passed-in repository
  oidset: pass hash algorithm when parsing file
  http-fetch: don't crash when parsing packfile without a repo
  hash-ll: merge with "hash.h"
  refs: avoid include cycle with "repository.h"
  global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
  hash: require hash algorithm in `empty_tree_oid_hex()`
  hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
  hash: make `is_null_oid()` independent of `the_repository`
  hash: convert `oidcmp()` and `oideq()` to compare whole hash
  global: ensure that object IDs are always padded
  hash: require hash algorithm in `oidread()` and `oidclr()`
  hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
  hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
2024-07-02 09:59:00 -07:00
ae447ed130 Merge branch 'ew/cat-file-unbuffered-tests'
The output from "git cat-file --batch-check" and "--batch-command
(info)" should not be unbuffered, for which some tests have been
added.

* ew/cat-file-unbuffered-tests:
  t1006: ensure cat-file info isn't buffered by default
  Git.pm: use array in command_bidi_pipe example
2024-07-02 09:58:59 -07:00
c2b3f2b3cd Yet another batch of post 2.45.2 updates from the 'master' front
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 09:27:59 -07:00
2c2ddfb725 Merge branch 'rs/remove-unused-find-header-mem' into maint-2.45
Code clean-up.

* rs/remove-unused-find-header-mem:
  commit: remove find_header_mem()
2024-07-02 09:27:59 -07:00
ae46703d1e Merge branch 'jc/worktree-git-path' into maint-2.45
Code cleanup.

* jc/worktree-git-path:
  worktree_git_path(): move the declaration to path.h
2024-07-02 09:27:58 -07:00
5cf6e9b022 Merge branch 'jk/fetch-pack-fsck-wo-lock-pack' into maint-2.45
"git fetch-pack -k -k" without passing "--lock-pack" (which we
never do ourselves) did not work at all, which has been corrected.

* jk/fetch-pack-fsck-wo-lock-pack:
  fetch-pack: fix segfault when fscking without --lock-pack
2024-07-02 09:27:58 -07:00
77a6c4c730 Merge branch 'jk/t5500-typofix' into maint-2.45
A helper function shared between two tests had a copy-paste bug,
which has been corrected.

* jk/t5500-typofix:
  t5500: fix mistaken $SERVER reference in helper function
2024-07-02 09:27:58 -07:00
c061c1d78f Merge branch 'js/mingw-remove-unused-extern-decl' into maint-2.45
An unused extern declaration for mingw has been removed to prevent
it from causing build failure.

* js/mingw-remove-unused-extern-decl:
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
2024-07-02 09:27:57 -07:00
00e1848087 Merge branch 'jc/no-default-attr-tree-in-bare' into maint-2.45
Earlier we stopped using the tree of HEAD as the default source of
attributes in a bare repository, but failed to document it.  This
has been corrected.

* jc/no-default-attr-tree-in-bare:
  attr.tree: HEAD:.gitattributes is no longer the default in a bare repo
2024-07-02 09:27:57 -07:00
df98236ca4 Merge branch 'tb/precompose-getcwd' into maint-2.45
We forgot to normalize the result of getcwd() to NFC on macOS where
all other paths are normalized, which has been corrected.  This still
does not address the case where core.precomposeUnicode configuration
is not defined globally.

* tb/precompose-getcwd:
  macOS: ls-files path fails if path of workdir is NFD
2024-07-02 09:27:56 -07:00
3e50dfdfc9 Merge branch 'pw/rebase-i-error-message' into maint-2.45
When the user adds to "git rebase -i" instruction to "pick" a merge
commit, the error experience is not pleasant.  Such an error is now
caught earlier in the process that parses the todo list.

* pw/rebase-i-error-message:
  rebase -i: improve error message when picking merge
  rebase -i: pass struct replay_opts to parse_insn_line()
2024-07-02 09:27:56 -07:00
f13710e32e Merge branch 'ds/format-patch-rfc-and-k' into maint-2.45
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".

* ds/format-patch-rfc-and-k:
  format-patch: ensure that --rfc and -k are mutually exclusive
2024-07-02 09:27:56 -07:00
b942fda670 t-reftable-record: add tests for reftable_log_record_compare_key()
reftable_log_record_compare_key() is a function defined by
reftable/record.{c, h} and is used to compare the keys of two
log records when sorting multiple log records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:26 -07:00
f7ec13b538 t-reftable-record: add tests for reftable_ref_record_compare_name()
reftable_ref_record_compare_name() is a function defined by
reftable/record.{c, h} and is used to compare the refname of two
ref records when sorting multiple ref records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:26 -07:00
8a1f1f88bb t-reftable-record: add index tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for index records.

Add tests for this function in the case of index records.
Note that since index records cannot be of type deletion, this function
must always return '0' when called on an index record.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:25 -07:00
9aa3814b2f t-reftable-record: add obj tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for two of the four record types (obj, index).

Add tests for this function in the case of obj records.
Note that since obj records cannot be of type deletion, this function
must always return '0' when called on an obj record.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:25 -07:00
09ca34799b t-reftable-record: add log tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for three of the four record types (log, obj, index).

Add tests for this function in the case of log records.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:25 -07:00
aa3fef4ff3 t-reftable-record: add ref tests for reftable_record_is_deletion()
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for all the four record types (ref, log, obj, index).

Add tests for this function in the case of ref records.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:25 -07:00
abb1834f2a t-reftable-record: add comparison tests for obj records
In the current testing setup for obj records, the comparison
functions for obj records, reftable_obj_record_cmp_void() and
reftable_obj_record_equal_void() are left untested.

Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_index_record_cmp_void() and reftable_index_record_equal_void()
respectively.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:25 -07:00
85ca39e79b t-reftable-record: add comparison tests for index records
In the current testing setup for index records, the comparison
functions for index records, reftable_index_record_cmp() and
reftable_index_record_equal() are left untested.

Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_index_record_cmp() and reftable_index_record_equal()
respectively.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:25 -07:00
b7bbb58c14 t-reftable-record: add comparison tests for ref records
In the current testing setup for ref records, the comparison
functions for ref records, reftable_ref_record_cmp_void() and
reftable_ref_record_equal() are left untested.

Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_ref_record_cmp_void() and reftable_ref_record_equal()
respectively.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:25 -07:00
9008b8a6e8 t-reftable-record: add reftable_record_cmp() tests for log records
In the current testing setup for log records, only
reftable_log_record_equal() among log record's comparison functions
is tested.

Modify the existing tests to exercise reftable_log_record_cmp_void()
(using the wrapper function reftable_record_cmp()) alongside
reftable_log_record_equal().
Note that to achieve this, we'll need to replace instances of
reftable_log_record_equal() with the wrapper function
reftable_record_equal().

Rename the now modified test to reflect its nature of exercising
all comparison operations, not just equality.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:24 -07:00
ba9661b457 t: move reftable/record_test.c to the unit testing framework
reftable/record_test.c exercises the functions defined in
reftable/record.{c, h}. Migrate reftable/record_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework, and renaming the tests to fit unit-tests' naming scheme.

While at it, change the type of index variable 'i' to 'size_t'
from 'int'. This is because 'i' is used in comparison against
'ARRAY_SIZE(x)' which is of type 'size_t'.

Also, use set_hash() which is defined locally in the test file
instead of set_test_hash() which is defined by
reftable/test_framework.{c, h}. This is fine to do as both these
functions are similarly implemented, and
reftable/test_framework.{c, h} is not #included in the ported test.

Get rid of reftable_record_print() from the tests as well, because
it clutters the test framework's output and we have no way of
verifying the output.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:24 -07:00
03930f93c4 t0612: mark as leak-free
A quick test tells us that t0612 does not trigger any leak:

    $ make SANITIZE=leak test GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true GIT_TEST_OPTS=-i T=t0612-reftable-jgit-compatibility.sh
    [...]
    *** t0612-reftable-jgit-compatibility.sh ***
    in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
    ok 1 - CGit repository can be read by JGit
    ok 2 - JGit repository can be read by CGit
    ok 3 - mixed writes from JGit and CGit
    ok 4 - JGit can read multi-level index
    # passed all 4 test(s)
    1..4
    # faking up non-zero exit with --invert-exit-code
    make[2]: *** [Makefile:75: t0612-reftable-jgit-compatibility.sh] Error 1

Let's mark it as leak-free to silence the machinery activated by
`GIT_TEST_PASSING_SANITIZE_LEAK=check`.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 15:11:05 -07:00
47c6d4dad2 test-lib: fix GIT_TEST_SANITIZE_LEAK_LOG
When a test that leaks runs with GIT_TEST_SANITIZE_LEAK_LOG=true,
the test returns zero, which is not what we want.

In the if-else's chain we have in "check_test_results_san_file_", we
consider three variables: $passes_sanitize_leak, $sanitize_leak_check
and, implicitly, GIT_TEST_SANITIZE_LEAK_LOG (always set to "true" at
that point).

For the first two variables we have different considerations depending
on the value of $test_failure, which makes sense.  However, for the
third, GIT_TEST_SANITIZE_LEAK_LOG, we don't;  regardless of
$test_failure, we use "invert_exit_code=t" to produce a non-zero
return value.

That assumes "$test_failure" is always zero at that point.  But it
may not be:

   $ git checkout v2.40.1
   $ make test SANITIZE=leak T=t3200-branch.sh # this fails
   $ make test SANITIZE=leak GIT_TEST_SANITIZE_LEAK_LOG=true T=t3200-branch.sh # this succeeds
   [...]
   With GIT_TEST_SANITIZE_LEAK_LOG=true, our logs revealed a memory leak, exiting with a non-zero status!
   # faked up failures as TODO & now exiting with 0 due to --invert-exit-code

We need to use "invert_exit_code=t" only when "$test_failure" is zero.

Let's add the missing conditions in the if-else's chain to make it work
as expected.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 15:09:07 -07:00
d0b38a27c6 t0613: mark as leak-free
We can mark t0613 as leak-free:

    $ make test SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true T=t0613-reftable-write-options.sh
    [...]
    *** t0613-reftable-write-options.sh ***
    in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
    ok 1 - default write options
    ok 2 - disabled reflog writes no log blocks
    ok 3 - many refs results in multiple blocks
    ok 4 - tiny block size leads to error
    ok 5 - small block size leads to multiple ref blocks
    ok 6 - small block size fails with large reflog message
    ok 7 - block size exceeding maximum supported size
    ok 8 - restart interval at every single record
    ok 9 - restart interval exceeding maximum supported interval
    ok 10 - object index gets written by default with ref index
    ok 11 - object index can be disabled
    # passed all 11 test(s)
    1..11
    # faking up non-zero exit with --invert-exit-code
    make[2]: *** [Makefile:75: t0613-reftable-write-options.sh] Error 1

Do it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 12:29:01 -07:00
231cf7370e pathspec: fix typo "glossary-context.txt" -> "glossary-content.txt"
The pathspec syntax is explained in the file "glossary-content.txt".
Moreover, no file named "glossary-context.txt" exists in the repository.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 12:19:26 -07:00
4b837f821e submodule--helper: use strvec_pushf() for --super-prefix
Use the strvec_pushf() call that already appends a slash to also produce
the stuck form of the option --super-prefix instead of adding the option
name in a separate call of strvec_push() or strvec_pushl().  This way we
can more easily see that these parts make up a single option with its
argument and save a function call.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 12:18:22 -07:00
c852531f45 git-send-email: use sanitized address when reading mbox body
Addresses that are mentioned on the trailers in the commit log
messages (e.g., "Reviewed-by") are added to the "Cc:" list by "git
send-email".  These hand-written addresses, however, may be
malformed (e.g., having unquoted "." and other punctutation marks in
the display-name part) and can upset MTA.

The code does use the sanitize_address() helper on these
address-looking strings to turn them into valid addresses, but it is
used only to see if the address should be suppressed.  The original
string taken from the message is added to the @cc list if the code
decides the address is not suppressed.

Because the addresses on trailer lines are hand-written and more
likely to contain malformed addresses, when adding to the @cc list,
use the result from sanitize_address, not the original.  Note that
we do not modify the behaviour for addresses taken from the e-mail
headers, as they are more likely to be machine generated and
well-formed.

Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 11:38:29 -07:00
f402c7941f git-gui: fix inability to quit after closing another instance
If you open 2 git gui instances in the same directory, then close one
of them and try to close the other, an error message pops up, saying:
'error renaming ".git/GITGUI_BCK": no such file or directory', and it
is no longer possible to close the window ever.

Fix by catching this error, and proceeding even if the file no longer
exists.

Signed-off-by: Orgad Shaneh <orgads@gmail.com>
2024-06-30 09:15:04 +03:00
790a17fb19 Sync with 'maint' 2024-06-28 16:03:59 -07:00
09e5e7f718 More post 2.45.2 updates from the 'master' front
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 15:53:19 -07:00
5d5675515e Merge branch 'ds/ahead-behind-fix' into maint-2.45
Fix for a progress bar.

* ds/ahead-behind-fix:
  commit-graph: increment progress indicator
2024-06-28 15:53:19 -07:00
112bd6a67c Merge branch 'ds/doc-add-interactive-singlekey' into maint-2.45
Doc update.

* ds/doc-add-interactive-singlekey:
  doc: interactive.singleKey is disabled by default
2024-06-28 15:53:18 -07:00
ce359a4dcc Merge branch 'jc/varargs-attributes' into maint-2.45
Varargs functions that are unannotated as printf-like or execl-like
have been annotated as such.

* jc/varargs-attributes:
  __attribute__: add a few missing format attributes
  __attribute__: mark some functions with LAST_ARG_MUST_BE_NULL
  __attribute__: remove redundant attribute declaration for git_die_config()
  __attribute__: trace2_region_enter_printf() is like "printf"
2024-06-28 15:53:18 -07:00
b2a62b6a42 Merge branch 'ps/ci-fix-detection-of-ubuntu-20' into maint-2.45
Fix for an embarrassing typo that prevented Python2 tests from running
anywhere.

* ps/ci-fix-detection-of-ubuntu-20:
  ci: fix check for Ubuntu 20.04
2024-06-28 15:53:17 -07:00
f30e5332e4 Merge branch 'jk/cap-exclude-file-size' into maint-2.45
An overly large ".gitignore" files are now rejected silently.

* jk/cap-exclude-file-size:
  dir.c: reduce max pattern file size to 100MB
  dir.c: skip .gitignore, etc larger than INT_MAX
2024-06-28 15:53:17 -07:00
ce75d32b99 Merge branch 'jc/safe-directory-leading-path' into maint-2.45
The safe.directory configuration knob has been updated to
optionally allow leading path matches.

* jc/safe-directory-leading-path:
  safe.directory: allow "lead/ing/path/*" match
2024-06-28 15:53:17 -07:00
7b7db54b83 Merge branch 'rs/difftool-env-simplify' into maint-2.45
Code simplification.

* rs/difftool-env-simplify:
  difftool: add env vars directly in run_file_diff()
2024-06-28 15:53:16 -07:00
6e3eb346ed Merge branch 'ps/fix-reinit-includeif-onbranch' into maint-2.45
"git init" in an already created directory, when the user
configuration has includeif.onbranch, started to fail recently,
which has been corrected.

* ps/fix-reinit-includeif-onbranch:
  setup: fix bug with "includeIf.onbranch" when initializing dir
2024-06-28 15:53:16 -07:00
903b4da27f Merge branch 'es/chainlint-ncores-fix' into maint-2.45
The chainlint script (invoked during "make test") did nothing when
it failed to detect the number of available CPUs.  It now falls
back to 1 CPU to avoid the problem.

* es/chainlint-ncores-fix:
  chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
  chainlint.pl: fix incorrect CPU count on Linux SPARC
  chainlint.pl: make CPU count computation more robust
2024-06-28 15:53:15 -07:00
2988b82b87 Merge branch 'jc/rev-parse-fatal-doc' into maint-2.45
Doc update.

* jc/rev-parse-fatal-doc:
  rev-parse: document how --is-* options work outside a repository
2024-06-28 15:53:14 -07:00
0d56a5946a Merge branch 'jc/doc-diff-name-only' into maint-2.45
The documentation for "git diff --name-only" has been clarified
that it is about showing the names in the post-image tree.

* jc/doc-diff-name-only:
  diff: document what --name-only shows
2024-06-28 15:53:14 -07:00
db9d38d9bb Merge branch 'mt/t0211-typofix' into maint-2.45
Test fix.

* mt/t0211-typofix:
  t/t0211-trace2-perf.sh: fix typo patern -> pattern
2024-06-28 15:53:13 -07:00
db15f4d794 Merge branch 'dg/fetch-pack-code-cleanup' into maint-2.45
Code clean-up to remove an unused struct definition.

* dg/fetch-pack-code-cleanup:
  fetch-pack: remove unused 'struct loose_object_iter'
2024-06-28 15:53:13 -07:00
0c6c514c50 Merge branch 'dm/update-index-doc-fix' into maint-2.45
Doc fix.

* dm/update-index-doc-fix:
  documentation: git-update-index: add --show-index-version to synopsis
2024-06-28 15:53:12 -07:00
b608b33f3d Merge branch 'ds/scalar-reconfigure-all-fix' into maint-2.45
Scalar fix.

* ds/scalar-reconfigure-all-fix:
  scalar: avoid segfault in reconfigure --all
2024-06-28 15:53:12 -07:00
abfdc596d8 Merge branch 'vd/doc-merge-tree-x-option' into maint-2.45
Doc update.

* vd/doc-merge-tree-x-option:
  Documentation/git-merge-tree.txt: document -X
2024-06-28 15:53:11 -07:00
0d23421e2a Merge branch 'fa/p4-error' into maint-2.45
P4 update.

* fa/p4-error:
  git-p4: show Perforce error to the user
2024-06-28 15:53:11 -07:00
6840423c6f Merge branch 'tb/attr-limits' into maint-2.45
The maximum size of attribute files is enforced more consistently.

* tb/attr-limits:
  attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()
2024-06-28 15:53:10 -07:00
a5adab9b16 Merge branch 'rs/diff-parseopts-cleanup' into maint-2.45
Code clean-up to remove code that is now a noop.

* rs/diff-parseopts-cleanup:
  diff-lib: stop calling diff_setup_done() in do_diff_cache()
2024-06-28 15:53:10 -07:00
fc636b413b Merge branch 'dk/zsh-git-repo-path-fix' into maint-2.45
Command line completion support for zsh (in contrib/) has been
updated to stop exposing internal state to end-user shell
interaction.

* dk/zsh-git-repo-path-fix:
  completion: zsh: stop leaking local cache variable
2024-06-28 15:53:09 -07:00
079323dc6d Merge branch 'bc/zsh-compatibility' into maint-2.45
zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.

* bc/zsh-compatibility:
  vimdiff: make script and tests work with zsh
  t4046: avoid continue in &&-chain for zsh
2024-06-28 15:53:09 -07:00
1b1b4d490d Merge branch 'js/for-each-repo-keep-going' into maint-2.45
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out.  Now it keeps going.

* js/for-each-repo-keep-going:
  maintenance: running maintenance should not stop on errors
  for-each-repo: optionally keep going on an error
2024-06-28 15:53:08 -07:00
2a78de0d9f Merge branch 'aj/stash-staged-fix' into maint-2.45
"git stash -S" did not handle binary files correctly, which has
been corrected.

* aj/stash-staged-fix:
  stash: fix "--staged" with binary files
2024-06-28 15:53:07 -07:00
a41463e437 Merge branch 'xx/disable-replace-when-building-midx' into maint-2.45
The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.

* xx/disable-replace-when-building-midx:
  midx: disable replace objects
2024-06-28 15:53:07 -07:00
332bcf74ea Merge branch 'pw/rebase-m-signoff-fix' into maint-2.45
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.

* pw/rebase-m-signoff-fix:
  rebase -m: fix --signoff with conflicts
  sequencer: store commit message in private context
  sequencer: move current fixups to private context
  sequencer: start removing private fields from public API
  sequencer: always free "struct replay_opts"
2024-06-28 15:53:06 -07:00
114bff72ac sparse-index: improve lstat caching of sparse paths
The clear_skip_worktree_from_present_files() method was first introduced
in af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files
present in worktree, 2022-01-14) to allow better interaction with the
working directory in the presence of paths outside of the
sparse-checkout. The initial implementation would lstat() every single
SKIP_WORKTREE path to see if it existed; if it ran across a sparse
directory that existed (when a sparse index was in use), then it would
expand the index and then check every SKIP_WORKTREE path.

Since these lstat() calls were very expensive, this was improved in
d79d299352 (Accelerate clear_skip_worktree_from_present_files() by
caching, 2022-01-14) by caching directories that do not exist so it
could avoid lstat()ing any files under such directories. However, there
are some inefficiencies in that caching mechanism.

The caching mechanism stored only the parent directory as not existing,
even if a higher parent directory also does not exist. This means that
wasted lstat() calls would occur when the paths passed to path_found()
change immediate parent directories but within the same parent directory
that does not exist.

To create an example repository that demonstrates this problem, it helps
to have a directory outside of the sparse-checkout that contains many
deep paths. In particular, the first paths (in lexicographic order)
underneath the sparse directory should have deep directory structures,
maximizing the difference between the old caching algorithm that looks
to a single parent and the new caching algorithm that looks to the
top-most missing directory.

The performance test script p2000-sparse-operations.sh takes the sample
repository and copies its HEAD to several copies nested in directories
of the form f<i>/f<j>/f<k> where i, j, and k are numbers from 1 to 4.
The sparse-checkout cone is then selected as "f2/f4/". Creating "f1/f1/"
will trigger the behavior and also lead to some interesting cases for
the caching algorithm since "f1/f1/" exists but "f1/f2/" and "f3/" do
not.

This is difficult to notice when running performance tests using the Git
repository (or a blow-up of the Git repository, as in
p2000-sparse-operations.sh) because Git has a very shallow directory
structure.

This change reorganizes the caching algorithm to focus on storing the
highest level leading directory that does not exist; specifically this
means that that directory's parent _does_ exist. By doing a little extra
work on a path passed to path_found(), we can short-circuit all of the
paths passed to path_found() afterwards that match a prefix with that
non-existing directory. When in a repository where the first sparse file
is likely to have a much deeper path than the first non-existing
directory, this can realize significant gains.

The details of this algorithm require careful attention, so the new
implementation of path_found() has detailed comments, including the use
of a new max_common_dir_prefix() method that may be of independent
interest.

It's worth noting that this is not universally positive, since we are
doing extra lstat() calls to establish the exact path to cache. In the
blow-up of the Git repository, we can see that the lstat count
_increases_ from 28 to 31. However, these numbers were already
artificially low.

Contributor Elijah Newren created a publicly-available test repository
that demonstrates the difference in these caching algorithms in the most
extreme way. To test, follow these steps:

  git clone --sparse https://github.com/newren/gvfs-like-git-bomb
  cd gvfs-like-git-bomb
  ./runme.sh                   # NOTE: check scripts before running!

At this point, assuming you do not have index.sparse=true set globally,
the index has one million paths with the SKIP_WORKTREE bit and they will
all be sent to path_found() in the sparse loop. You can measure this by
running 'git status' with GIT_TRACE2_PERF=1:

    Sparse files in the index: 1,000,000
  sparse_lstat_count (before):   200,000
   sparse_lstat_count (after):         2

And here are the performance numbers:

  Benchmark 1: old
    Time (mean ± σ):     397.5 ms ±   4.1 ms
    Range (min … max):   391.2 ms … 404.8 ms    10 runs

  Benchmark 2: new
    Time (mean ± σ):     252.7 ms ±   3.1 ms
    Range (min … max):   249.4 ms … 259.5 ms    11 runs

  Summary
    'new' ran
      1.57 ± 0.02 times faster than 'old'

By modifying this example further, we can demonstrate a more realistic
example and include the sparse index expansion. Continue by creating
this directory, confusing both caching algorithms somewhat:

  mkdir -p bomb/d/e/f/a/a

Then re-run the 'git status' tests to see these statistics:

    Sparse files in the index: 1,000,000
  sparse_lstat_count (before):   724,010
   sparse_lstat_count (after):       106

  Benchmark 1: old
    Time (mean ± σ):     753.0 ms ±   3.5 ms
    Range (min … max):   749.7 ms … 760.9 ms    10 runs

  Benchmark 2: new
    Time (mean ± σ):     201.4 ms ±   3.2 ms
    Range (min … max):   196.0 ms … 207.9 ms    14 runs

  Summary
    'new' ran
      3.74 ± 0.06 times faster than 'old'

Note that if this repository had a sparse index enabled, the additional
cost of expanding the sparse index affects the total time of these
commands by over four seconds, significantly diminishing the benefit of
the caching algorithm. Having existing paths outside of the
sparse-checkout is a known performance issue for the sparse index and is
a known trade-off for the performance benefits given when no such paths
exist.

Using an internal monorepo with over two million paths at HEAD and a
typical sparse-checkout cone such that the sparse index contains
~190,000 entries (including over two thousand sparse trees), I was able
to measure these lstat counts when one sparse directory actually exists
on disk:

  Sparse files in expanded index: 1,841,997
       full_lstat_count (before): 1,188,161
       full_lstat_count  (after):     4,404

This resulted in this absolute time change, on a warm disk:

      Time in full loop (before): 13.481 s
      Time in full loop  (after):  0.081 s

(These times were calculated on a Windows machine, where lstat() is
slower than a similar Linux machine.)

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:12 -07:00
c4e8c42c19 sparse-index: count lstat() calls
The clear_skip_worktree.. methods already report some statistics about
how many cache entries are checked against path_found() due to having
the skip-worktree bit set. However, due to path_found() performing some
caching, this isn't the only information that would be helpful to
report.

Add a new lstat_count member to the path_found_data struct to count the
number of times path_found() calls lstat(). This will be helpful to help
explain performance problems in this method as well as to demonstrate
future changes to the caching algorithm in a more concrete way than
end-to-end timings.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:12 -07:00
23dd6f8bcc sparse-index: use strbuf in path_found()
The path_found() method previously reused strings from the cache entries
the calling methods were using. This prevents string manipulation in
place and causes some odd reallocation before the final lstat() call in
the method.

Refactor the method to use strbufs and copy the path into the strbuf,
but also only the parent directory and not the whole path. This looks
like extra copying when assigning the path to the strbuf, but we save an
allocation by dropping the 'tmp' string, and we are "reusing" the copy
from 'tmp' to put the data in the strbuf.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:11 -07:00
b746a85d9a sparse-index: refactor path_found()
In advance of changing the behavior of path_found(), take all of the
intermediate data values and group them into a single struct. This
simplifies the method prototype as well as the initialization. Future
changes can be made directly to the struct and method without changing
the callers with this approach.

Note that the clear_path_found_data() method is currently empty, as
there is nothing to free. This method is a placeholder for future
changes that require a non-trivial implementation. Its stub is created
now so consumers could call it now and not change in future changes.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:11 -07:00
532e216986 sparse-checkout: refactor skip worktree retry logic
The clear_skip_worktree_from_present_files() method was introduced in
af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present
in worktree, 2022-01-14) to help cases where sparse-checkout is enabled
but some paths outside of the sparse-checkout also exist on disk.  This
operation can be slow as it needs to check path existence in a way not
stored in the index, so caching was introduced in d79d299352 (Accelerate
clear_skip_worktree_from_present_files() by caching, 2022-01-14).

This check is particularly confusing in the presence of a sparse index,
as a sparse tree entry corresponding to an existing directory must first
be expanded to a full index before examining the paths within. This is
currently implemented using a 'goto' and a boolean variable to ensure we
restart only once.

Even with that caching, it was noticed that this could take a long time
to execute. 89aaab11a3 (index: add trace2 region for clear skip
worktree, 2022-11-03) introduced trace2 regions to measure this time.
Further, the way the loop repeats itself was slightly confusing and
prone to breakage, so a BUG() statement was added in 8c7abdc596 (index:
raise a bug if the index is materialised more than once, 2022-11-03) to
be sure that the second run of the loop does not hit any sparse trees.

One thing that can be confusing about the current setup is that the
trace2 regions nest and it is not clear that a second loop is running
after a sparse index is expanded. Here is an example of what the regions
look like in a typical case:

| region_enter | ... | label:clear_skip_worktree_from_present_files
| region_enter | ... | ..label:update
| region_leave | ... | ..label:update
| region_enter | ... | ..label:ensure_full_index
| region_enter | ... | ....label:update
| region_leave | ... | ....label:update
| region_leave | ... | ..label:ensure_full_index
| data         | ... | ..sparse_path_count:1
| data         | ... | ..sparse_path_count_full:269538
| region_leave | ... | label:clear_skip_worktree_from_present_files

One thing that is particularly difficult to understand about these
regions is that most of the time is spent between the close of the
ensure_full_index region and the reporting of the end data. This is
because of the restart of the loop being within the same region as the
first iteration of the loop.

This change refactors the method into two separate methods that are
traced separately. This will be more important later when we change
other features of the methods, but for now the only functional change is
the difference in the structure of the trace regions.

After this change, the same telemetry section is split into three
distinct chunks:

| region_enter | ... | label:clear_skip_worktree_from_present_files_sparse
| data         | ... | ..sparse_path_count:1
| region_leave | ... | label:clear_skip_worktree_from_present_files_sparse
| region_enter | ... | label:update
| region_leave | ... | label:update
| region_enter | ... | label:ensure_full_index
| region_enter | ... | ..label:update
| region_leave | ... | ..label:update
| region_leave | ... | label:ensure_full_index
| region_enter | ... | label:clear_skip_worktree_from_present_files_full
| data         | ... | ..full_path_count:269538
| region_leave | ... | label:clear_skip_worktree_from_present_files_full

Here, we see the sparse loop terminating early with its first sparse
path being a sparse directory containing a file. Then, that loop's
region terminates before ensure_full_index begins (in this case, the
cache-tree must also be computed). Then, _after_ the index is expanded,
the full loop begins with its own region.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:10 -07:00
daed0c68e9 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-27 09:20:00 -07:00
b781a3e08e Merge branch 'jk/fetch-pack-fsck-wo-lock-pack'
"git fetch-pack -k -k" without passing "--lock-pack" (which we
never do ourselves) did not work at all, which has been corrected.

* jk/fetch-pack-fsck-wo-lock-pack:
  fetch-pack: fix segfault when fscking without --lock-pack
2024-06-27 09:19:59 -07:00
5dce36e04f Merge branch 'rs/remove-unused-find-header-mem'
Code clean-up.

* rs/remove-unused-find-header-mem:
  commit: remove find_header_mem()
2024-06-27 09:19:59 -07:00
b8d1a1b06c Merge branch 'jk/t5500-typofix'
A helper function shared between two tests had a copy-paste bug,
which has been corrected.

* jk/t5500-typofix:
  t5500: fix mistaken $SERVER reference in helper function
2024-06-27 09:19:59 -07:00
424a13db64 Merge branch 'js/mingw-remove-unused-extern-decl'
An unused extern declaration for mingw has been removed to prevent
it from causing build failure.

* js/mingw-remove-unused-extern-decl:
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
2024-06-27 09:19:58 -07:00
6c0bfce914 Merge branch 'kz/merge-fail-early-upon-refresh-failure'
When "git merge" sees that the index cannot be refreshed (e.g. due
to another process doing the same in the background), it died but
after writing MERGE_HEAD etc. files, which was useless for the
purpose to recover from the failure.

* kz/merge-fail-early-upon-refresh-failure:
  merge: avoid write merge state when unable to write index
2024-06-27 09:19:58 -07:00
407cdbd271 t/lib-bundle-uri: use local fake bundle URLs
A few of the bundle URI tests point config at a fake bundle; they care
only that the client has been configured with _some_ bundle, but it
doesn't have to actually contain objects.

For the file:// tests, we use "$BUNDLE_URI_REPO_URI/fake.bdl", a
non-existent file inside the actual remote repo. But for git:// and
http:// tests, we use "https://example.com/fake.bdl". This works OK in
practice, but it means we actually make a request to example.com (which
returns a placeholder HTML response). That can be annoying when running
the test suite on a spotty network (it doesn't produce a wrong result,
since we expect it to fail, but it may introduce delays).

We can reduce our dependency on the outside world by using a local URL.
It would work to just do "file://$PWD/fake.bdl" here, since the bundle
code does not care about the actual location. But in the long run I
suspect we may have more restrictions on which protocols can be passed
around as bundle URIs. So instead, let's stick with the file:// repo's
pattern and just point to a bogus name based on the remote repo's URL.

For http this makes perfect sense; we'll make a request to the local
http server and find that there's nothing there. For git:// it's a
little weird, as you wouldn't normally access a bundle file over git://
at all. But it's probably the most reasonable guess we can make for now,
and anybody who tightens protocol selection later will know better
what's the best path forward.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 14:31:18 -07:00
e6653ec3c6 t5551: do not confirm that bogus url cannot be used
t5551 tries to access a URL with a bogus hostname and confirms that
http.curloptResolve lets us use this otherwise unresolvable name.

Before doing so, though, we confirm that trying to access the bogus
hostname without http.curloptResolve fails as expected. This isn't
testing Git at all, but is confirming the test's assumptions. That's
often a good thing to do, but in this case it means that we'll actually
try to resolve the external name. Even though it's unlikely that
"gitbogusexamplehost.invalid" would ever resolve, the DNS lookup itself
may take time.

It's probably reasonable to just assume that this obviously-bogus name
would not actually resolve in practice, which lets us reduce our test
suite's dependency on the outside world.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 14:31:18 -07:00
63ec97faf7 t5553: use local url for invalid fetch
We test how "fetch --set-upstream" behaves when given an invalid URL,
using the bogus URL "http://nosuchdomain.example.com". But finding out
that it is invalid requires an actual DNS lookup.

Reduce our dependency on external factors by using an invalid local
filesystem URL, which works just as well for our purposes.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 14:31:17 -07:00
b8ae42e292 describe: refresh the index when 'broken' flag is used
When describe is run with 'dirty' flag, we refresh the index
to make sure it is in sync with the filesystem before
determining if the working tree is dirty.  However, this is
not done for the codepath where the 'broken' flag is used.

This causes `git describe --broken --dirty` to false
positively report the worktree being dirty if a file has
different stat info than what is recorded in the index.
Running `git update-index -q --refresh` to refresh the index
before running diff-index fixes the problem.

Also add tests to deliberately update stat info of a
file before running describe to verify it behaves correctly.

Reported-by: Paul Millar <paul.millar@desy.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 13:04:08 -07:00
72c282098d archive: document that --add-virtual-file takes full path
Tom Scogland noticed that `--add-virtual-file` option uses the path
specified as its value as-is, without prepending any value given to
the `--prefix` option like `--add-file` does.

The behaviour has always been that way since the option was
introduced, but the documentation has always been wrong and said
that it would use the value of `--prefix` just like `--add-file`
does.

We could modify the behaviour to make it literally work like the
documentation said, but it would break existing scripts the users
use.

Noticed-by: Tom Scogland <scogland1@llnl.gov>
Acked-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 12:56:45 -07:00
9d69789770 date: detect underflow/overflow when parsing dates with timezone offset
Overriding the date of a commit to be close to "1970-01-01 00:00:00"
with a large enough positive timezone for the equivelant GMT time to be
before the epoch is considered valid by `parse_date_basic`. Similar
behaviour occurs when using a date close to "2099-12-31 23:59:59" (the
maximum date allowed by `tm_to_time_t`) with a large enough negative
timezone offset.

This leads to an integer underflow or underflow respectively in the
commit timestamp, which is not caught by `git-commit`, but will cause
other services to fail, such as `git-fsck`, which, for the first case,
reports "badDateOverflow: invalid author/committer line - date causes
integer overflow".

Instead check the timezone offset and fail if the resulting time comes
before the epoch "1970-01-01T00:00:00Z" or after the maximum date
"2099-12-31T23:59:59Z".

Using the REQUIRE_64BIT_TIME prerequisite, make sure that the tests
near the end of Git time (aka end of year 2099) are not attempted on
purely 32-bit systems, as they cannot express timestamp beyond 2038
anyway.

Signed-off-by: Darcy Burke <acednes@gmail.com>
[jc: fixups for 32-bit platforms]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 17:07:41 -07:00
a59275d5d6 t0006: simplify prerequisites
The system must support 64-bit time and its time_t must be 64-bit
wide to pass these tests.  Combine these two prerequisites together
to simplify the tests.  In theory, they could be fulfilled
independently and tests could require only one without the other,
but in practice, these must come hand-in-hand.

Update the "check_parse" test helper to pay attention to the
REQUIRE_64BIT_TIME variable, which can be set to the HAVE_64BIT_TIME
prerequisite so that a parse test can be skipped on 32-bit systems.
This will be used in the next step to skip tests for timestamps near
the end of year 2099, as 32-bit systems will not be able to express
a timestamp beyond 2038 anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 17:07:26 -07:00
9c8a9ec787 bloom: introduce deinit_bloom_filters()
After we are done using Bloom filters, we do not currently clean up any
memory allocated by the commit slab used to store those filters in the
first place.

Besides the bloom_filter structures themselves, there is mostly nothing
to free() in the first place, since in the read-only path all Bloom
filter's `data` members point to a memory mapped region in the
commit-graph file itself.

But when generating Bloom filters from scratch (or initializing
truncated filters) we allocate additional memory to store the filter's
data.

Keep track of when we need to free() this additional chunk of memory by
using an extra pointer `to_free`. Most of the time this will be NULL
(indicating that we are representing an existing Bloom filter stored in
a memory mapped region). When it is non-NULL, free it before discarding
the Bloom filters slab.

Suggested-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
5421e7c3a1 commit-graph: reuse existing Bloom filters where possible
In an earlier commit, a bug was described where it's possible for Git to
produce non-murmur3 hashes when the platform's "char" type is signed,
and there are paths with characters whose highest bit is set (i.e. all
characters >= 0x80).

That patch allows the caller to control which version of Bloom filters
are read and written. However, even on platforms with a signed "char"
type, it is possible to reuse existing Bloom filters if and only if
there are no changed paths in any commit's first parent tree-diff whose
characters have their highest bit set.

When this is the case, we can reuse the existing filter without having
to compute a new one. This is done by marking trees which are known to
have (or not have) any such paths. When a commit's root tree is verified
to not have any such paths, we mark it as such and declare that the
commit's Bloom filter is reusable.

Note that this heuristic only goes in one direction. If neither a commit
nor its first parent have any paths in their trees with non-ASCII
characters, then we know for certain that a path with non-ASCII
characters will not appear in a tree-diff against that commit's first
parent. The reverse isn't necessarily true: just because the tree-diff
doesn't contain any such paths does not imply that no such paths exist
in either tree.

So we end up recomputing some Bloom filters that we don't strictly have
to (i.e. their bits are the same no matter which version of murmur3 we
use). But culling these out is impossible, since we'd have to perform
the full tree-diff, which is the same effort as computing the Bloom
filter from scratch.

But because we can cache our results in each tree's flag bits, we can
often avoid recomputing many filters, thereby reducing the time it takes
to run

    $ git commit-graph write --changed-paths --reachable

when upgrading from v1 to v2 Bloom filters.

To benchmark this, let's generate a commit-graph in linux.git with v1
changed-paths in generation order[^1]:

    $ git clone git@github.com:torvalds/linux.git
    $ cd linux
    $ git commit-graph write --reachable --changed-paths
    $ graph=".git/objects/info/commit-graph"
    $ mv $graph{,.bak}

Then let's time how long it takes to go from v1 to v2 filters (with and
without the upgrade path enabled), resetting the state of the
commit-graph each time:

    $ git config commitGraph.changedPathsVersion 2
    $ hyperfine -p 'cp -f $graph.bak $graph' -L v 0,1 \
        'GIT_TEST_UPGRADE_BLOOM_FILTERS={v} git.compile commit-graph write --reachable --changed-paths'

On linux.git (where there aren't any non-ASCII paths), the timings
indicate that this patch represents a speed-up over recomputing all
Bloom filters from scratch:

    Benchmark 1: GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths
      Time (mean ± σ):     124.873 s ±  0.316 s    [User: 124.081 s, System: 0.643 s]
      Range (min … max):   124.621 s … 125.227 s    3 runs

    Benchmark 2: GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths
      Time (mean ± σ):     79.271 s ±  0.163 s    [User: 74.611 s, System: 4.521 s]
      Range (min … max):   79.112 s … 79.437 s    3 runs

    Summary
      'GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths' ran
        1.58 ± 0.01 times faster than 'GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths'

On git.git, we do have some non-ASCII paths, giving us a more modest
improvement from 4.163 seconds to 3.348 seconds, for a 1.24x speed-up.
On my machine, the stats for git.git are:

  - 8,285 Bloom filters computed from scratch
  - 10 Bloom filters generated as empty
  - 4 Bloom filters generated as truncated due to too many changed paths
  - 65,114 Bloom filters were reused when transitioning from v1 to v2.

[^1]: Note that this is is important, since `--stdin-packs` or
  `--stdin-commits` orders commits in the commit-graph by their pack
  position (with `--stdin-packs`) or in the raw input (with
  `--stdin-commits`).

  Since we compute Bloom filters in the same order that commits appear
  in the graph, we must see a commit's (first) parent before we process
  the commit itself. This is only guaranteed to happen when sorting
  commits by their generation number.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
df3df2dcf4 object.h: fix mis-aligned flag bits table
Bit position 23 is one column too far to the left.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
ba5a81d52b commit-graph: new Bloom filter version that fixes murmur3
The murmur3 implementation in bloom.c has a bug when converting series
of 4 bytes into network-order integers when char is signed (which is
controllable by a compiler option, and the default signedness of char is
platform-specific). When a string contains characters with the high bit
set, this bug causes results that, although internally consistent within
Git, does not accord with other implementations of murmur3 (thus,
the changed path filters wouldn't be readable by other off-the-shelf
implementatios of murmur3) and even with Git binaries that were compiled
with different signedness of char. This bug affects both how Git writes
changed path filters to disk and how Git interprets changed path filters
on disk.

Therefore, introduce a new version (2) of changed path filters that
corrects this problem. The existing version (1) is still supported and
is still the default, but users should migrate away from it as soon
as possible.

Because this bug only manifests with characters that have the high bit
set, it may be possible that some (or all) commits in a given repo would
have the same changed path filter both before and after this fix is
applied. However, in order to determine whether this is the case, the
changed paths would first have to be computed, at which point it is not
much more expensive to just compute a new changed path filter.

So this patch does not include any mechanism to "salvage" changed path
filters from repositories. There is also no "mixed" mode - for each
invocation of Git, reading and writing changed path filters are done
with the same version number; this version number may be explicitly
stated (typically if the user knows which version they need) or
automatically determined from the version of the existing changed path
filters in the repository.

There is a change in write_commit_graph(). graph_read_bloom_data()
makes it possible for chunk_bloom_data to be non-NULL but
bloom_filter_settings to be NULL, which causes a segfault later on. I
produced such a segfault while developing this patch, but couldn't find
a way to reproduce it neither after this complete patch (or before),
but in any case it seemed like a good thing to include that might help
future patch authors.

The value in t0095 was obtained from another murmur3 implementation
using the following Go source code:

  package main

  import "fmt"
  import "github.com/spaolacci/murmur3"

  func main() {
          fmt.Printf("%x\n", murmur3.Sum32([]byte("Hello world!")))
          fmt.Printf("%x\n", murmur3.Sum32([]byte{0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}))
  }

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
638e1702d7 commit-graph: unconditionally load Bloom filters
In an earlier commit, we began ignoring the Bloom data ("BDAT") chunk
for commit-graphs whose Bloom filters were computed using a hash version
incompatible with the value of `commitGraph.changedPathVersion`.

Now that the Bloom API has been hardened to discard these incompatible
filters (with the exception of low-level APIs), we can safely load these
Bloom filters unconditionally.

We no longer want to return early from `graph_read_bloom_data()`, and
similarly do not want to set the bloom_settings' `hash_version` field as
a side-effect. The latter is because we want to wait until we know which
Bloom settings we're using (either the defaults, from the GIT_TEST
variables, or from the previous commit-graph layer) before deciding what
hash_version to use.

If we detect an existing BDAT chunk, we'll infer the rest of the
settings (e.g., number of hashes, bits per entry, and maximum number of
changed paths) from the earlier graph layer. The hash_version will be
inferred from the previous layer as well, unless one has already been
specified via configuration.

Once all of that is done, we normalize the value of the hash_version to
either "1" or "2".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
b2cf331057 bloom: prepare to discard incompatible Bloom filters
Callers use the inline `get_bloom_filter()` implementation as a thin
wrapper around `get_or_compute_bloom_filter()`. The former calls the
latter with a value of "0" for `compute_if_not_present`, making
`get_bloom_filter()` the default read-only path for fetching an existing
Bloom filter.

Callers expect the value returned from `get_bloom_filter()` is usable,
that is that it's compatible with the configured value corresponding to
`commitGraph.changedPathsVersion`.

This is OK, since the commit-graph machinery only initializes its BDAT
chunk (thereby enabling it to service Bloom filter queries) when the
Bloom filter hash_version is compatible with our settings. So any value
returned by `get_bloom_filter()` is trivially useable.

However, subsequent commits will load the BDAT chunk even when the Bloom
filters are built with incompatible hash versions. Prepare to handle
this by teaching `get_bloom_filter()` to discard filters that are
incompatible with the configured hash version.

Callers who wish to read incompatible filters (e.g., for upgrading
filters from v1 to v2) may use the lower level routine,
`get_or_compute_bloom_filter()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
5b5d5b598c bloom: annotate filters with hash version
In subsequent commits, we will want to load existing Bloom filters out
of a commit-graph, even when the hash version they were computed with
does not match the value of `commitGraph.changedPathVersion`.

In order to differentiate between the two, add a "version" field to each
Bloom filter.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
ea0024deb9 repo-settings: introduce commitgraph.changedPathsVersion
A subsequent commit will introduce another version of the changed-path
filter in the commit graph file. In order to control which version to
write (and read), a config variable is needed.

Therefore, introduce this config variable. For forwards compatibility,
teach Git to not read commit graphs when the config variable
is set to an unsupported version. Because we teach Git this,
commitgraph.readChangedPaths is now redundant, so deprecate it and
define its behavior in terms of the config variable we introduce.

This commit does not change the behavior of writing (Git writes changed
path filters when explicitly instructed regardless of any config
variable), but a subsequent commit will restrict Git such that it will
only write when commitgraph.changedPathsVersion is a recognized value.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
08b6ae38c6 t4216: test changed path filters with high bit paths
Subsequent commits will teach Git another version of changed path
filter that has different behavior with paths that contain at least
one character with its high bit set, so test the existing behavior as
a baseline.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
57982b8f2a t/helper/test-read-graph: implement bloom-filters mode
Implement a mode of the "read-graph" test helper to dump out the
hexadecimal contents of the Bloom filter(s) contained in a commit-graph.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
a09858d43d bloom.h: make load_bloom_filter_from_graph() public
Prepare for a future commit to use the load_bloom_filter_from_graph()
function directly to load specific Bloom filters out of the commit-graph
for manual inspection (to be used during tests).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:05 -07:00
460b15699d t/helper/test-read-graph.c: extract dump_graph_info()
Prepare for the 'read-graph' test helper to perform other tasks besides
dumping high-level information about the commit-graph by extracting its
main routine into a separate function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:05 -07:00
23e91c0ca3 gitformat-commit-graph: describe version 2 of BDAT
The code change to Git to support version 2 will be done in subsequent
commits.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:05 -07:00
cf73936ddf commit-graph: ensure Bloom filters are read with consistent settings
The changed-path Bloom filter mechanism is parameterized by a couple of
variables, notably the number of bits per hash (typically "m" in Bloom
filter literature) and the number of hashes themselves (typically "k").

It is critically important that filters are read with the Bloom filter
settings that they were written with. Failing to do so would mean that
each query is liable to compute different fingerprints, meaning that the
filter itself could return a false negative. This goes against a basic
assumption of using Bloom filters (that they may return false positives,
but never false negatives) and can lead to incorrect results.

We have some existing logic to carry forward existing Bloom filter
settings from one layer to the next. In `write_commit_graph()`, we have
something like:

    if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) {
        struct commit_graph *g = ctx->r->objects->commit_graph;

        /* We have changed-paths already. Keep them in the next graph */
        if (g && g->chunk_bloom_data) {
            ctx->changed_paths = 1;
            ctx->bloom_settings = g->bloom_filter_settings;
        }
    }

, which drags forward Bloom filter settings across adjacent layers.

This doesn't quite address all cases, however, since it is possible for
intermediate layers to contain no Bloom filters at all. For example,
suppose we have two layers in a commit-graph chain, say, {G1, G2}. If G1
contains Bloom filters, but G2 doesn't, a new G3 (whose base graph is
G2) may be written with arbitrary Bloom filter settings, because we only
check the immediately adjacent layer's settings for compatibility.

This behavior has existed since the introduction of changed-path Bloom
filters. But in practice, this is not such a big deal, since the only
way up until this point to modify the Bloom filter settings at write
time is with the undocumented environment variables:

  - GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY
  - GIT_TEST_BLOOM_SETTINGS_NUM_HASHES
  - GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS

(it is still possible to tweak MAX_CHANGED_PATHS between layers, but
this does not affect reads, so is allowed to differ across multiple
graph layers).

But in future commits, we will introduce another parameter to change the
hash algorithm used to compute Bloom fingerprints itself. This will be
exposed via a configuration setting, making this foot-gun easier to use.

To prevent this potential issue, validate that all layers of a split
commit-graph have compatible settings with the newest layer which
contains Bloom filters.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Original-test-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:05 -07:00
1343c89313 revision.c: consult Bloom filters for root commits
The commit-graph stores changed-path Bloom filters which represent the
set of paths included in a tree-level diff between a commit's root tree
and that of its parent.

When a commit has no parents, the tree-diff is computed against that
commit's root tree and the empty tree. In other words, every path in
that commit's tree is stored in the Bloom filter (since they all appear
in the diff).

Consult these filters during pathspec-limited traversals in the function
`rev_same_tree_as_empty()`. Doing so yields a performance improvement
where we can avoid enumerating the full set of paths in a parentless
commit's root tree when we know that the path(s) of interest were not
listed in that commit's changed-path Bloom filter.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Original-patch-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:05 -07:00
f88611c6d0 t/t4216-log-bloom.sh: harden test_bloom_filters_not_used()
The existing implementation of test_bloom_filters_not_used() asserts
that the Bloom filter sub-system has not been initialized at all, by
checking for the absence of any data from it from trace2.

In the following commit, it will become possible to load Bloom filters
without using them (e.g., because the `commitGraph.changedPathVersion`
introduced later in this series is incompatible with the hash version
with which the commit-graph's Bloom filters were written).

When this is the case, it's possible to initialize the Bloom filter
sub-system, while still not using any Bloom filters. When this is the
case, check that the data dump from the Bloom sub-system is all zeros,
indicating that no filters were used.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:05 -07:00
78f0a5d187 pager: die when paging to non-existing command
When trying to execute a non-existent program from GIT_PAGER, we display
an error.  However, we also send the complete text to the terminal
and return a successful exit code.  This can be confusing for the user
and the displayed error could easily become obscured by a lengthy
text.

For example, here the error message would be very far above after
sending 50 MB of text:

    $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
    error: cannot run non-existent: No such file or directory
    50314363

Let's make the error clear by aborting the process and return an error
so that the user can easily correct their mistake.

This will be the result of the change:

    $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
    error: cannot run non-existent: No such file or directory
    fatal: unable to execute pager 'non-existent'
    0

The behavior change we're introducing in this commit affects two tests
in t7006, which is a good sign regarding test coverage and requires us
to address it.

The first test is 'git skips paging non-existing command'.  This test
comes from f7991f01f2 (t7006: clean up SIGPIPE handling in trace2 tests,
2021-11-21,) where a modification was made to a test that was originally
introduced in c24b7f6736 (pager: test for exit code with and without
SIGPIPE, 2021-02-02).  That original test was, IMHO, in the same
direction we're going in this commit.

At any rate, this test obviously needs to be adjusted to check the new
behavior we are introducing.  Do it.

The second test being affected is: 'non-existent pager doesnt cause
crash', introduced in f917f57f40 (pager: fix crash when pager program
doesn't exist, 2021-11-24).  As its name states, it has the intention of
checking that we don't introduce a regression that produces a crash when
GIT_PAGER points to a nonexistent program.

This test could be considered redundant nowadays, due to us already
having several tests checking implicitly what a non-existent command in
GIT_PAGER produces.  However, let's maintain a good belt-and-suspenders
strategy; adapt it to the new world.

Finally, it's worth noting that we are not changing the behavior if the
command specified in GIT_PAGER is a shell command.  In such cases, it
is:

    $ GIT_PAGER=:\;non-existent t/test-terminal.perl git log
    :;non-existent: 1: non-existent: not found
    died of signal 13 at t/test-terminal.perl line 33.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:47:13 -07:00
00f3661a0a doc: fix case error of eol attribute in example
The eol attribute only accepts "crlf" and "lf",
but the example incorrectly capitalizes "crlf".

References:

- https://git-scm.com/docs/gitattributes#_eol
- https://github.com/git/git/blob/v2.45.2/convert.c#L1278

Signed-off-by: Shane Sun <github@waterlemons2k.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-24 21:49:03 -07:00
1e1586e4ed The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-24 16:39:16 -07:00
532083fd16 Merge branch 'kl/attr-read-attr-fromindex-msan-workaround'
Code clarification to avoid an appearance of using an uninitialized
variable.

* kl/attr-read-attr-fromindex-msan-workaround:
  attr: fix msan issue in read_attr_from_index
2024-06-24 16:39:15 -07:00
107ed55103 Merge branch 'jc/worktree-git-path'
Code cleanup.

* jc/worktree-git-path:
  worktree_git_path(): move the declaration to path.h
2024-06-24 16:39:15 -07:00
e5ff701d4c Merge branch 'tb/commit-graph-use-tempfile'
"git update-server-info" and "git commit-graph --write" have been
updated to use the tempfile API to avoid leaving cruft after
failing.

* tb/commit-graph-use-tempfile:
  server-info.c: remove temporary info files on exit
  commit-graph.c: remove temporary graph layers on exit
2024-06-24 16:39:15 -07:00
2c4aa7ad74 Merge branch 'jc/add-i-retire-usebuiltin-config'
For over a year, setting add.interactive.useBuiltin configuration
variable did nothing but giving a "this does not do anything"
warning.  Finally remove it.

* jc/add-i-retire-usebuiltin-config:
  add-i: finally retire add.interactive.useBuiltin
2024-06-24 16:39:14 -07:00
ae2f21b560 Merge branch 'jc/no-default-attr-tree-in-bare'
Earlier we stopped using the tree of HEAD as the default source of
attributes in a bare repository, but failed to document it.  This
has been corrected.

* jc/no-default-attr-tree-in-bare:
  attr.tree: HEAD:.gitattributes is no longer the default in a bare repo
2024-06-24 16:39:14 -07:00
f0a462ecd5 Merge branch 'tb/precompose-getcwd'
We forgot to normalize the result of getcwd() to NFC on macOS where
all other paths are normalized, which has been corrected.  This still
does not address the case where core.precomposeUnicode configuration
is not defined globally.

* tb/precompose-getcwd:
  macOS: ls-files path fails if path of workdir is NFD
2024-06-24 16:39:14 -07:00
ffa47b75cf Merge branch 'tb/pseudo-merge-reachability-bitmap'
The pseudo-merge reachability bitmap to help more efficient storage
of the reachability bitmap in a repository with too many refs has
been added.

* tb/pseudo-merge-reachability-bitmap: (26 commits)
  pack-bitmap.c: ensure pseudo-merge offset reads are bounded
  Documentation/technical/bitmap-format.txt: add missing position table
  t/perf: implement performance tests for pseudo-merge bitmaps
  pseudo-merge: implement support for finding existing merges
  ewah: `bitmap_equals_ewah()`
  pack-bitmap: extra trace2 information
  pack-bitmap.c: use pseudo-merges during traversal
  t/test-lib-functions.sh: support `--notick` in `test_commit_bulk()`
  pack-bitmap: implement test helpers for pseudo-merge
  ewah: implement `ewah_bitmap_popcount()`
  pseudo-merge: implement support for reading pseudo-merge commits
  pack-bitmap.c: read pseudo-merge extension
  pseudo-merge: scaffolding for reads
  pack-bitmap: extract `read_bitmap()` function
  pack-bitmap-write.c: write pseudo-merge table
  pseudo-merge: implement support for selecting pseudo-merge commits
  config: introduce `git_config_double()`
  pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public
  pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()`
  pack-bitmap-write: support storing pseudo-merge commits
  ...
2024-06-24 16:39:13 -07:00
0f4b0d4cf0 diff: allow --color-moved with --no-ext-diff
We ignore the option --color-moved if an external diff program is
configured, presumably because its overhead is unnecessary in that case.
Respect the option if we don't actually use the external diff, though.

Reported-by: lolligerhans@gmx.de
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-24 13:49:41 -07:00
493fdae046 object-file: fix leak on conversion failure
I'm not sure exactly how to trigger the leak, but it seems fairly
obvious that the `content' buffer should be freed even if
convert_object_file() fails.  Noticed while working in this area
on unrelated things.

Signed-off-by: Eric Wong <e@80x24.org>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-24 09:07:21 -07:00
c1db988093 Merge branch 'pk/swedish-translation'
* pk/swedish-translation:
  git-gui: sv.po: Update Swedish translation (576t0f0u)
2024-06-23 10:25:57 +02:00
1f9693afb2 Merge branch 'bc/french-translation'
* bc/french-translation:
  git-gui: po: fix typo in French "aperçu"
2024-06-23 10:25:41 +02:00
4e66b5a990 fuzz: minimum fuzzers environment lacks libcURL
The "fuzz smoke test" job compiles various .o files to create
libgit.a and others, but the final build product of the fuzzer build
is *not* "git".  Since the job is not interested in building a
working "git", it does not define any build flags, and among the
notable ones that are missing is NO_CURL---even though the CI
environment that runs the job does not have libcURL development
package installed.

This obviously leads to a build failure.

Pass NO_CURL=NoThanks to "make" to make sure things will build
correctly, if we add any conditional compilation with "#ifdef
NO_CURL ... #endif" in the codebase.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-21 22:12:13 -07:00
57139818bf version: teach --build-options to reports zlib version information
Show ZLIB_VERSION, if defined, in "git version --build-options"
output.

Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-21 16:43:58 -07:00
2e2203163d version: teach --build-options to reports libcurl version information
Show LIBCURL_VERSION, if defined, in "git version --build-options"
output.

Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-21 16:40:43 -07:00
9005149a4a The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 15:45:17 -07:00
892fd8b89f Merge branch 'jc/heads-are-branches'
The "--heads" option of "ls-remote" and "show-ref" has been been
deprecated; "--branches" replaces "--heads".

* jc/heads-are-branches:
  show-ref: introduce --branches and deprecate --heads
  ls-remote: introduce --branches and deprecate --heads
  refs: call branches branches
2024-06-20 15:45:17 -07:00
166cdd8915 Merge branch 'ps/document-breaking-changes'
The structure of the document that records longer-term project
decisions to deprecate/remove/update various behaviour has been
outlined.

* ps/document-breaking-changes:
  BreakingChanges: document that we do not plan to deprecate git-checkout
  BreakingChanges: document removal of grafting
  BreakingChanges: document upcoming change from "sha1" to "sha256"
  docs: introduce document to announce breaking changes
2024-06-20 15:45:16 -07:00
83ac567781 Merge branch 'pw/rebase-i-error-message'
When the user adds to "git rebase -i" instruction to "pick" a merge
commit, the error experience is not pleasant.  Such an error is now
caught earlier in the process that parses the todo list.

* pw/rebase-i-error-message:
  rebase -i: improve error message when picking merge
  rebase -i: pass struct replay_opts to parse_insn_line()
2024-06-20 15:45:15 -07:00
e4ecba994c Merge branch 'ds/ahead-behind-fix'
Fix for a progress bar.

* ds/ahead-behind-fix:
  commit-graph: increment progress indicator
2024-06-20 15:45:14 -07:00
4401639f96 Merge branch 'ps/abbrev-length-before-setup-fix'
Setting core.abbrev too early before the repository set-up
(typically in "git clone") caused segfault, which as been
corrected.

* ps/abbrev-length-before-setup-fix:
  object-name: don't try to abbreviate to lengths greater than hexsz
  parse-options-cb: stop clamping "--abbrev=" to hash length
  config: fix segfault when parsing "core.abbrev" without repo
2024-06-20 15:45:13 -07:00
9071453ef6 Merge branch 'rj/format-patch-auto-cover-with-interdiff'
"git format-patch --interdiff" for multi-patch series learned to
turn on cover letters automatically (unless told never to enable
cover letter with "--no-cover-letter" and such).

* rj/format-patch-auto-cover-with-interdiff:
  format-patch: assume --cover-letter for diff in multi-patch series
  t4014: cleanups in a few tests
2024-06-20 15:45:12 -07:00
5f14d20984 Merge branch 'kn/update-ref-symref'
"git update-ref --stdin" learned to handle transactional updates of
symbolic-refs.

* kn/update-ref-symref:
  update-ref: add support for 'symref-update' command
  reftable: pick either 'oid' or 'target' for new updates
  update-ref: add support for 'symref-create' command
  update-ref: add support for 'symref-delete' command
  update-ref: add support for 'symref-verify' command
  refs: specify error for regular refs with `old_target`
  refs: create and use `ref_update_expects_existing_old_ref()`
2024-06-20 15:45:12 -07:00
c1322ca474 Merge branch 'gt/unit-test-oidtree'
"oidtree" tests were rewritten to use the unit test framework.

* gt/unit-test-oidtree:
  t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c
2024-06-20 15:45:10 -07:00
393879d473 Merge branch 'tb/multi-pack-reuse-fix'
Assorted fixes to multi-pack-index code paths.

* tb/multi-pack-reuse-fix:
  pack-revindex.c: guard against out-of-bounds pack lookups
  pack-bitmap.c: avoid uninitialized `pack_int_id` during reuse
  midx-write.c: do not read existing MIDX with `packs_to_include`
2024-06-20 15:45:10 -07:00
f4788a577b Merge branch 'ps/make-append-to-cflags'
To help developers, the build procedure now allows builders to use
CFLAGS_APPEND to specify additional CFLAGS.

* ps/make-append-to-cflags:
  Makefile: add ability to append to CFLAGS and LDFLAGS
2024-06-20 15:45:09 -07:00
8ba7dbdefb Merge branch 'rs/diff-exit-code-with-external-diff'
"git diff --exit-code --ext-diff" learned to take the exit status
of the external diff driver into account when deciding the exit
status of the overall "git diff" invocation when configured to do
so.

* rs/diff-exit-code-with-external-diff:
  diff: let external diffs report that changes are uninteresting
  userdiff: add and use struct external_diff
  t4020: test exit code with external diffs
2024-06-20 15:45:08 -07:00
e631115ae5 Merge branch 'ds/doc-add-interactive-singlekey'
Doc update.

* ds/doc-add-interactive-singlekey:
  doc: interactive.singleKey is disabled by default
2024-06-20 15:45:08 -07:00
8b731b8d06 version: --build-options reports OpenSSL version information
This change uses the OpenSSL supplied OPENSSL_VERSION_TEXT #define supplied
for this purpose by that project. If the #define is not present, the version
is not reported.

Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 13:02:31 -07:00
28dc26dc33 commit: remove find_header_mem()
cfc5cf428b (receive-pack.c: consolidate find header logic, 2022-01-06)
introduced find_header_mem() and turned find_commit_header() into a thin
wrapper.  Since then, the latter has become the last remaining caller of
the former.  Remove it to restore find_commit_header() to the state
before cfc5cf428b, get rid of a strlen(3) call and resolve a NEEDSWORK
note in the process.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 11:12:40 -07:00
40d817875d t5500: fix mistaken $SERVER reference in helper function
The end of t5500 contains two tests which use a single helper function,
fetch_filter_blob_limit_zero(). It takes a parameter to point to the
path of the server repository, which we store locally as $SERVER. The
first caller uses the relative path "server", while the second points
into the httpd document root.

Commit 07ef3c6604 (fetch test: use more robust test for filtered
objects, 2019-12-23) refactored some lines, but accidentally switched
"$SERVER" to "server" in one spot. That means the second caller is
looking at the server directory from the previous test rather than its
own.

This happens to work out because the "server" directory from the first
test is still hanging around, and the contents of the two are identical.
But it was clearly not the intended behavior, and is fragile to cleaning
up the leftovers from the first test.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 11:06:45 -07:00
3c295c87c2 mingw: drop bogus (and unneeded) declaration of _pgmptr
In 08809c09aa (mingw: add a helper function to attach GDB to the
current process, 2020-02-13), I added a declaration that was not needed.
Back then, that did not matter, but now that the declaration of that
symbol was changed in mingw-w64's headers, it causes the following
compile error:

      CC compat/mingw.o
  compat/mingw.c: In function 'open_in_gdb':
  compat/mingw.c:35:9: error: function declaration isn't a prototype [-Werror=strict-prototypes]
     35 |         extern char *_pgmptr;
        |         ^~~~~~
  In file included from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/mm_malloc.h:27,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/xmmintrin.h:34,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/immintrin.h:31,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/lib/gcc/x86_64-w64-mingw32/14.1.0/include/x86intrin.h:32,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/winnt.h:1658,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/minwindef.h:163,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/windef.h:9,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/windows.h:69,
                   from C:/git-sdk-64/usr/src/git/build-installers/mingw64/include/winsock2.h:23,
                   from compat/../git-compat-util.h:215,
                   from compat/mingw.c:1:
  compat/mingw.c:35:22: error: '__p__pgmptr' redeclared without dllimport attribute: previous dllimport ignored [-Werror=attributes]
     35 |         extern char *_pgmptr;
        |                      ^~~~~~~

Let's just drop the declaration and get rid of this compile error.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:59:42 -07:00
96a6621d25 fetch-pack: fix segfault when fscking without --lock-pack
The fetch-pack internals have multiple options related to creating
".keep" lock-files for the received pack:

  - if args.lock_pack is set, then we tell index-pack to create a .keep
    file. In the fetch-pack plumbing command, this is triggered by
    passing "-k" twice.

  - if the caller passes in a pack_lockfiles string list, then we use it
    to record the path of the keep-file created by index-pack. We get
    that name by reading the stdout of index-pack. In the fetch-pack
    command, this is triggered by passing the (undocumented) --lock-pack
    option; without it, we pass in a NULL string list.

So it's possible to ask index-pack to create the lock-file (using "-k
-k") but not ask to record it (by avoiding "--lock-pack"). This worked
fine until 5476e1efde (fetch-pack: print and use dangling .gitmodules,
2021-02-22), but now it causes a segfault.

Before that commit, if pack_lockfiles was NULL, we wouldn't bother
reading the output from index-pack at all. But since that commit,
index-pack may produce extra output if we asked it to fsck. So even if
nobody cares about the lockfile path, we still need to read it to skip
to the output we do care about.

We correctly check that we didn't get a NULL lockfile path (which can
happen if we did not ask it to create a .keep file at all), but we
missed the case where the lockfile path is not NULL (due to "-k -k") but
the pack_lockfiles string_list is NULL (because nobody passed
"--lock-pack"), and segfault trying to add to the NULL string-list.

We can fix this by skipping the append to the string list when either
the value or the list is NULL. In that case we must also free the
lockfile path to avoid leaking it when it's non-NULL.

Nobody noticed the bug for so long because the transport code used by
"git fetch" always passes in a pack_lockfiles pointer, and remote-curl
(the main user of the fetch-pack plumbing command) always passes
--lock-pack.

Reported-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:58:00 -07:00
f19b916535 merge-ort: convert more error() cases to path_msg()
merge_submodule() stores errors using path_msg(), whereas other call
sites make use of the error() function.  This is inconsistent, and
moving towards path_msg() seems more friendly for libification efforts
since it will allow the caller to determine whether the error messages
need to be printed.

Note that this deferred handling of error messages changes the error
message in a recursive merge from
  error: failed to execute internal merge
to
  From inner merge:  error: failed to execute internal merge
which provides a little more information about the error which may be
useful.  Since the recursive merge strategy still only shows the older
error, we had to adjust the new testcase introduced a few commits ago to
just search for the older message somewhere in the output.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:35:25 -07:00
14949d91b6 merge-ort: upon merge abort, only show messages causing the abort
When something goes wrong enough that we need to abort early and not
even attempt merging the remaining files, it probably does not make
sense to report conflicts messages for the subset of files we processed
before hitting the fatal error.  Instead, only show the messages
associated with paths where we hit the fatal error.  Also, print these
messages to stderr rather than stdout.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:35:25 -07:00
c55c3f20b1 merge-ort: loosen commented requirements
The comment above type_short_descriptions claimed that the order had to
match what was found in the conflict_info_and_types enum.  Since
type_short_descriptions uses designated initializers, the order should
not actually matter; I am guessing that positional initializers may have
been under consideration when that comment was added, but the comment
was not updated when designated initializers were chosen.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:35:25 -07:00
5fadf1f933 merge-ort: clearer propagation of failure-to-function from merge_submodule
The 'clean' member variable is somewhat of a tri-state (1 = clean, 0 =
conflicted, -1 = failure-to-determine), but we often like to think of
it as binary (ignoring the possibility of a negative value) and use
constructs like '!clean' to reflect this.  However, these constructs
can make codepaths more difficult to understand, unless we handle the
negative case early and return pre-emptively; do that in
handle_content_merge() to make the code a bit easier to read.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:35:24 -07:00
9ed8e17d8a merge-ort: fix type of local 'clean' var in handle_content_merge ()
handle_content_merge() returns an int.  Every caller of
handle_content_merge() expects an int.  However, we declare a local
variable 'clean' that we use for the return value to be unsigned.  To
make matters worse, we also assign 'clean' the return value of
merge_submodule() in one codepath, which is defined to return an int.
It seems that the only reason to have 'clean' be unsigned was to allow a
cutesy bit manipulation operation to be well-defined.  Fix the type of
the 'clean' local in handle_content_merge().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:35:24 -07:00
0b4f726cde merge-ort: maintain expected invariant for priv member
The calling convention for the merge machinery is
   One call to          init_merge_options()
   One or more calls to merge_incore_[non]recursive()
   One call to          merge_finalize()
      (possibly indirectly via merge_switch_to_result())
Both merge_switch_to_result() and merge_finalize() expect
   opt->priv == NULL && result->priv != NULL
which is supposed to be set up by our move_opt_priv_to_result_priv()
function.  However, two codepaths dealing with error cases did not
execute this necessary logic, which could result in assertion failures
(or, if assertions were compiled out, could result in segfaults).  Fix
the oversight and add a test that would have caught one of these
problems.

While at it, also tighten an existing test for a non-recursive merge
to verify that it fails with appropriate status.  Most merge tests in
the testsuite check either for success or conflicts; those testing for
neither are rare and it is good to ensure they support the invariant
assumed by builtin/merge.c in this comment:
    /*
     * The backend exits with 1 when conflicts are
     * left to be resolved, with 2 when it does not
     * handle the given merge at all.
     */
So, explicitly check for the exit status of 2 in these cases.

Reported-by: Matt Cree <matt.cree@gearset.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:35:24 -07:00
e79bdb426c merge-ort: extract handling of priv member into reusable function
In preparation for a subsequent commit which will ensure we do not
forget to maintain our invariants for the priv member in error
codepaths, extract the necessary functionality out into a separate
function.  This change is cosmetic at this point, and introduces no
changes beyond an extra assertion sanity check.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:35:24 -07:00
63d903ff52 unbundle: extend object verification for fetches
The existing fetch.fsckObjects and transfer.fsckObjects configurations
were not fully applied to bundle-involved fetches, including direct
bundle fetches and bundle-uri enabled fetches. Furthermore, there was no
object verification support for unbundle.

This commit extends object verification support in `bundle.c:unbundle`
by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When
this option is enabled, we append the `--fsck-objects` flag to
`git-index-pack`.

The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches,
where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether
to enable this option for `bundle.c:unbundle`, specifically in:

- `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
- `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.

This addition ensures a consistent logic for object verification during
fetches. Tests have been added to confirm functionality in the scenarios
mentioned above.

Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:30:08 -07:00
d0cbc75680 fetch-pack: expose fsckObjects configuration logic
Currently, we can use "transfer.fsckObjects" and the more specific
"fetch.fsckObjects" to control checks for broken objects in received
packs during fetches. However, these configurations were only
acknowledged by `fetch-pack.c:get_pack` and did not take effect in
direct bundle fetches or fetches with _bundle-uri_ enabled.

This commit exposes the fetch-then-transfer configuration logic by
adding a new function `fetch_pack_fsck_objects` in fetch-pack.h. This
new function is used to replace the assignment for `fsck_objects` in
`fetch-pack.c:get_pack`. In the next commit, this function will also be
used to extend fsck support for bundle-involved fetches.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:30:07 -07:00
3079026fc1 bundle-uri: verify oid before writing refs
When using the bundle-uri mechanism with a bundle list containing
multiple interrelated bundles, we encountered a bug where tips from
downloaded bundles were not discovered, thus resulting in rather slow
clones. This was particularly problematic when employing the
"creationTokens" heuristic.

To reproduce this issue, consider a repository with a single branch
"main" pointing to commit "A". Firstly, create a base bundle with:

  git bundle create base.bundle main

Then, add a new commit "B" on top of "A", and create an incremental
bundle for "main":

  git bundle create incr.bundle A..main

Now, generate a bundle list with the following content:

  [bundle]
      version = 1
      mode = all
      heuristic = creationToken

  [bundle "base"]
      uri = base.bundle
      creationToken = 1

  [bundle "incr"]
      uri = incr.bundle
      creationToken = 2

A fresh clone with the bundle list above should result in a reference
"refs/bundles/main" pointing to "B" in the new repository. However, git
would still download everything from the server, as if it had fetched
nothing locally.

So why the "refs/bundles/main" is not discovered? After some digging I
found that:

1. Bundles in bundle list are downloaded to local files via
   `bundle-uri.c:download_bundle_list` or via
   `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
   heuristic.
2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which
   is called by `bundle-uri.c:unbundle_all_bundles` or called within
   `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
   heuristic.
3. To get all prerequisites of the bundle, the bundle header is read
   inside `bundle-uri.c:unbundle_from_file` to by calling
   `bundle.c:read_bundle_header`.
4. Then it calls `bundle.c:unbundle`, which calls
   `bundle.c:verify_bundle` to ensure the repository contains all the
   prerequisites.
5. `bundle.c:verify_bundle` calls `parse_object`, which eventually
   invokes `packfile.c:prepare_packed_git` or
   `packfile.c:reprepare_packed_git`, filling
   `raw_object_store->packed_git` and setting `packed_git_initialized`.
6. If `bundle.c:unbundle` succeeds, it writes refs via
   `refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here
   bundle refs which can target arbitrary objects are written to the
   repository.
7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions
   `fetch-pack.c:mark_complete_and_common_ref` and
   `fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to
   find local tips for negotiation. The `OBJECT_INFO_QUICK` flag
   prevents `packfile.c:reprepare_packed_git` from being called,
   resulting in failures to parse OIDs that reside only in the latest
   bundle.

In the example above, when unbunding "incr.bundle", "base.pack" is added
to `packed_git` due to prerequisites verification. However, "B" cannot
be found for negotiation because it exists in "incr.pack", which is not
included in `packed_git`.

Fix the bug by removing `REF_SKIP_OID_VERIFICATION` flag when writing
bundle refs. When `refs.c:refs_update_ref` is called to write the
corresponding bundle refs, it triggers `refs.c:ref_transaction_commit`.
This, in turn, invokes `refs.c:ref_transaction_prepare`, which calls
`transaction_prepare` of the refs storage backend. For files backend, it
is `files-backend.c:files_transaction_prepare`, and for reftable
backend, it is `reftable-backend.c:reftable_be_transaction_prepare`.
Both functions eventually call `object.c:parse_object`, which can invoke
`packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures
that bundle refs point to valid objects and that all tips from bundle
refs are correctly parsed during subsequent negotiations.

A set of negotiation-related tests for cloning with bundle-uri has been
included to demonstrate that downloaded bundles are utilized to
accelerate fetching.

Additionally, another test has been added to show that bundles with
incorrect headers, where refs point to non-existent objects, do not
result in any bundle refs being created in the repository.

Reviewed-by: Karthik Nayak <karthik.188@gmail.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:30:07 -07:00
75daa42ddf t1006: ensure cat-file info isn't buffered by default
While working on buffering changes to `git cat-file' in a
separate patch, I inadvertently made the output of --batch-check
and the `info' command of --batch-command buffered as if
opt->buffer_output is turned on by default.

Buffering by default breaks some 3rd-party Perl scripts using
cat-file, but this breakage was not detected anywhere in our
test suite.  Add a small Perl snippet to test this problem since
(AFAIK) other equivalent ways to test this behavior from Bourne
shell and/or awk would require racy sleeps, non-portable FIFOs
or tedious C code.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:28:46 -07:00
47d2691ae9 git-gui: sv.po: Update Swedish translation (576t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-06-20 19:27:35 +02:00
2e5a636593 merge: avoid write merge state when unable to write index
Writing the merge state after the index write fails is meaningless and
could potentially cause Git to lose changes.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-18 08:13:35 -07:00
66ac6e4bcd The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-17 15:55:59 -07:00
4216329457 Merge branch 'ps/no-writable-strings'
Building with "-Werror -Wwrite-strings" is now supported.

* ps/no-writable-strings: (27 commits)
  config.mak.dev: enable `-Wwrite-strings` warning
  builtin/merge: always store allocated strings in `pull_twohead`
  builtin/rebase: always store allocated string in `options.strategy`
  builtin/rebase: do not assign default backend to non-constant field
  imap-send: fix leaking memory in `imap_server_conf`
  imap-send: drop global `imap_server_conf` variable
  mailmap: always store allocated strings in mailmap blob
  revision: always store allocated strings in output encoding
  remote-curl: avoid assigning string constant to non-const variable
  send-pack: always allocate receive status
  parse-options: cast long name for OPTION_ALIAS
  http: do not assign string constant to non-const field
  compat/win32: fix const-correctness with string constants
  pretty: add casts for decoration option pointers
  object-file: make `buf` parameter of `index_mem()` a constant
  object-file: mark cached object buffers as const
  ident: add casts for fallback name and GECOS
  entry: refactor how we remove items for delayed checkouts
  line-log: always allocate the output prefix
  line-log: stop assigning string constant to file parent buffer
  ...
2024-06-17 15:55:58 -07:00
72576d139d Merge branch 'jk/imap-send-plug-all-msgs-leak'
A leak in "git imap-send" that somehow escapes LSan has been
plugged.

* jk/imap-send-plug-all-msgs-leak:
  imap-send: free all_msgs strbuf in "out" label
2024-06-17 15:55:58 -07:00
42b8b5bfd0 Merge branch 'jk/am-retry'
"git am" has a safety feature to prevent it from starting a new
session when there already is a session going.  It reliably
triggers when a mbox is given on the command line, but it has to
rely on the tty-ness of the standard input.  Add an explicit way to
opt out of this safety with a command line option.

* jk/am-retry:
  test-terminal: drop stdin handling
  am: add explicit "--retry" option
2024-06-17 15:55:56 -07:00
cff3b034d5 Merge branch 'jc/varargs-attributes'
Varargs functions that are unannotated as printf-like or execl-like
have been annotated as such.

* jc/varargs-attributes:
  __attribute__: add a few missing format attributes
  __attribute__: mark some functions with LAST_ARG_MUST_BE_NULL
  __attribute__: remove redundant attribute declaration for git_die_config()
  __attribute__: trace2_region_enter_printf() is like "printf"
2024-06-17 15:55:55 -07:00
40a163f217 Merge branch 'ps/ref-storage-migration'
A new command has been added to migrate a repository that uses the
files backend for its ref storage to use the reftable backend, with
limitations.

* ps/ref-storage-migration:
  builtin/refs: new command to migrate ref storage formats
  refs: implement logic to migrate between ref storage formats
  refs: implement removal of ref storages
  worktree: don't store main worktree twice
  reftable: inline `merged_table_release()`
  refs/files: fix NULL pointer deref when releasing ref store
  refs/files: extract function to iterate through root refs
  refs/files: refactor `add_pseudoref_and_head_entries()`
  refs: allow to skip creation of reflog entries
  refs: pass storage format to `ref_store_init()` explicitly
  refs: convert ref storage format to an enum
  setup: unset ref storage when reinitializing repository version
2024-06-17 15:55:55 -07:00
dfd668fa84 Merge branch 'ps/check-docs-fix'
"make check-docs" noticed problems and reported to its output but
failed to signal its findings with its exit status, which has been
corrected.

* ps/check-docs-fix:
  ci/test-documentation: work around SyntaxWarning in Python 3.12
  gitlab-ci: add job to run `make check-docs`
  Documentation/lint-manpages: bubble up errors
  Makefile: extract script to lint missing/extraneous manpages
2024-06-17 15:55:54 -07:00
4551858c18 Merge branch 'ps/ci-fix-detection-of-ubuntu-20'
Fix for an embarrassing typo that prevented Python2 tests from running
anywhere.

* ps/ci-fix-detection-of-ubuntu-20:
  ci: fix check for Ubuntu 20.04
2024-06-17 15:55:53 -07:00
7e2d0348d8 Merge branch 'ap/credential-clear-fix'
Upon expiration event, the credential subsystem forgot to clear
in-core authentication material other than password (whose support
was added recently), which has been corrected.

* ap/credential-clear-fix:
  credential: clear expired c->credential, unify secret clearing
2024-06-17 15:55:53 -07:00
4d8ae4d3ca Merge branch 'jc/format-patch-with-range-diff'
The inter/range-diff output has been moved to the end of the patch
when format-patch adds it to a single patch, instead of writing it
before the patch text, to be consistent with what is done for a
cover letter for a multi-patch series.

* jc/format-patch-with-range-diff:
  format-patch: move range/inter diff at the end of a single patch output
  show_log: factor out interdiff/range-diff generation
2024-06-17 15:55:52 -07:00
8270201971 Git.pm: use array in command_bidi_pipe example
command_bidi_pipe takes the git command and optional arguments as an
array, not a string.  Make sure the documentation example is usable
code.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-17 13:41:51 -07:00
34d982caaf attr: fix msan issue in read_attr_from_index
Memory sanitizer (msan) is detecting a use of an uninitialized variable
(`size`) in `read_attr_from_index`:

    ==2268==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x5651f3416504 in read_attr_from_index git/attr.c:868:11
    #1 0x5651f3415530 in read_attr git/attr.c
    #2 0x5651f3413d74 in bootstrap_attr_stack git/attr.c:968:6
    #3 0x5651f3413d74 in prepare_attr_stack git/attr.c:1004:2
    #4 0x5651f3413d74 in collect_some_attrs git/attr.c:1199:2
    #5 0x5651f3413144 in git_check_attr git/attr.c:1345:2
    #6 0x5651f34728da in convert_attrs git/convert.c:1320:2
    #7 0x5651f3473425 in would_convert_to_git_filter_fd git/convert.c:1373:2
    #8 0x5651f357a35e in index_fd git/object-file.c:2630:34
    #9 0x5651f357aa15 in index_path git/object-file.c:2657:7
    #10 0x5651f35db9d9 in add_to_index git/read-cache.c:766:7
    #11 0x5651f35dc170 in add_file_to_index git/read-cache.c:799:9
    #12 0x5651f321f9b2 in add_files git/builtin/add.c:346:7
    #13 0x5651f321f9b2 in cmd_add git/builtin/add.c:565:18
    #14 0x5651f321d327 in run_builtin git/git.c:474:11
    #15 0x5651f321bc9e in handle_builtin git/git.c:729:3
    #16 0x5651f321a792 in run_argv git/git.c:793:4
    #17 0x5651f321a792 in cmd_main git/git.c:928:19
    #18 0x5651f33dde1f in main git/common-main.c:62:11

The issue exists because `size` is an output parameter from
`read_blob_data_from_index`, but it's only modified if
`read_blob_data_from_index` returns non-NULL. The read of `size` when
calling `read_attr_from_buf` unconditionally may read from an
uninitialized value. `read_attr_from_buf` checks that `buf` is non-NULL
before reading from `size`, but by then it's already too late: the
uninitialized read will have happened already. Furthermore, there's no
guarantee that the compiler won't reorder things so that it checks
`size` before checking `!buf`.

Make the call to `read_attr_from_buf` conditional on `buf` being
non-NULL, ensuring that `size` is not read if it's never set.

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-17 13:32:42 -07:00
a83e21de6b pack-bitmap.c: ensure pseudo-merge offset reads are bounded
After reading the pseudo-merge extension's metadata table, we allocate
an array to store information about each pseudo-merge, including its
byte offset within the .bitmap file itself.

This is done like so:

    pseudo_merge_ofs = index_end - 24 -
            (index->pseudo_merges.nr * sizeof(uint64_t));
    for (i = 0; i < index->pseudo_merges.nr; i++) {
            index->pseudo_merges.v[i].at = get_be64(pseudo_merge_ofs);
            pseudo_merge_ofs += sizeof(uint64_t);
    }

But if the pseudo-merge table is corrupt, we'll keep calling get_be64()
past the end of the pseudo-merge extension, potentially reading off the
end of the mmap'd region.

Prevent this by ensuring that we have at least `table_size - 24` many
bytes available to read (adding 24 to the left-hand side of our
inequality to account for the length of the metadata component).

This is sufficient to prevent us from reading off the end of the
pseudo-merge extension, and ensures that all of the get_be64() calls
below are in bounds.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 14:19:27 -07:00
20c49432e4 Documentation/technical/bitmap-format.txt: add missing position table
While investigating a benign Coverity warning on the new pseudo-merge
implementation, I was struggling to understand the (paraphrased) below:

    ofs = index_end - 24 - (index->pseudo_merges.nr * sizeof(uint64_t));
    for (i = 0; i < index->pseudo_merges.nr; i++) {
            index->pseudo_merges.v[i].at = get_be64(ofs);
            ofs += sizeof(uint64_t);
    }

, in pack-bitmap.c::load_bitmap_header(). Looking at the documentation,
the diagram describing the on-disk format (prior to this patch)
suggested that the optional extended lookup table immediately preceded
the trailing metadata portion.

If that were the case, that would make the above code from
load_bitmap_header() incorrect, as we'd be blindly reading into the
extended offset table.

But later on in the documentation there is a description of the
pseudo-merge position table as immediately preceding the trailing
metadata portion of the extension. And indeed, we do write the position
table in pack-bitmap-write.c:

    /* write positions for all pseudo merges */
    for (i = 0; i < writer->pseudo_merges_nr; i++)
            hashwrite_be64(f, pseudo_merge_ofs[i]);

    hashwrite_be32(f, writer->pseudo_merges_nr);
    hashwrite_be32(f, kh_size(writer->pseudo_merge_commits));
    hashwrite_be64(f, table_start - start);
    hashwrite_be64(f, hashfile_total(f) - start + sizeof(uint64_t));

So this is purely a case of the diagram being out of sync with the
textual description and actual implementation of the format
specification.

Add the missing component back to the format diagram to avoid further
confusion in this area.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 14:19:26 -07:00
dc89b7d522 hex: guard declarations with USE_THE_REPOSITORY_VARIABLE
Guard declarations of functions that implicitly use `the_repository`
with `USE_THE_REPOSITORY_VARIABLE` such that callers don't accidentally
rely on that global variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:35 -07:00
912d4756cd t/helper: remove dependency on the_repository in "proc-receive"
The "proc-receive" test helper implicitly relies on `the_repository` via
`parse_oid_hex()`. This isn't necessary though, and in fact the whole
command does not depend on `the_repository` at all.

Stop setting up `the_repository` and use `parse_oid_hex_any()` to parse
object IDs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:35 -07:00
8e9a1d0dc2 t/helper: fix segfault in "oid-array" command without repository
The "oid-array" test helper can supposedly work without a Git
repository, but will in fact crash because `the_repository->hash_algo`
is not initialized. This is because `oid_pos()`, which is used by
`oid_array_lookup()`, depends on `the_hash_algo->rawsz`.

Ideally, we'd adapt `oid_pos()` to not depend on `the_hash_algo`
anymore. That is a bigger untertaking though, so instead we fall back to
SHA1 when there is no repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
fa9e009aa7 t/helper: use correct object hash in partial-clone helper
The `object_info()` function of the partial-clone helper is responsible
for checking the object ID of a repository other than `the_repository`.
We use `parse_oid_hex()` in this function though, which means that we
still depend on `the_repository->hash_algo`.

Fix this by using the object hash of the function-local repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
2a0e11479f compat/fsmonitor: fix socket path in networked SHA256 repos
The IPC socket used by the fsmonitor on Darwin is usually contained in
the Git repository itself. When the repository is hosted on a networked
filesystem though, we instead create the socket path in the user's home
directory or the socket directory. In that case, we derive the path by
hashing the repository path.

But while we always use SHA1 to hash the repository path, we then end up
using `hash_to_hex()` to append the computed hash to the socket path.
This is wrong because `hash_to_hex()` uses the hash algorithm configured
in `the_repository`, which may not be SHA1. The consequence is that we
may append uninitialized bytes to the path when operating in a SHA256
repository.

Fix this bug by using `hash_to_hex_algop()` with SHA1.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
99cf4d6d35 replace-object: use hash algorithm from passed-in repository
In `register_replace_ref()`, we pass in a repository but then use
`get_oid_hex()` to parse passed-in object IDs, which implicitly uses
`the_repository`. Fix this by using the hash algorithm from the
passed-in repository instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
58650befd9 protocol-caps: use hash algorithm from passed-in repository
In `send_info()`, we pass in a repository but then use `get_oid_hex()`
to parse passed-in object IDs, which implicitly uses `the_repository`.
Fix this by using the hash algorithm from the passed-in repository
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
f2c32a66f5 oidset: pass hash algorithm when parsing file
The `oidset_parse_file_carefully()` function implicitly depends on
`the_repository` when parsing object IDs. Fix this by having callers
pass in the hash algorithm to use.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
afa2c6ddc8 http-fetch: don't crash when parsing packfile without a repo
The git-http-fetch(1) command accepts a `--packfile=` option, which
allows the user to specify that it shall fetch a specific packfile,
only. The parameter here is the hash of the packfile, which is specific
to the object hash used by the repository. This requirement is implicit
though via our use of `parse_oid_hex()`, which internally uses
`the_repository`.

The git-http-fetch(1) command allows for there to be no repository
though, which only exists such that we can show usage via the "-h"
option. In that case though, starting with c8aed5e8da (repository: stop
setting SHA1 as the default object hash, 2024-05-07), `the_repository`
does not have its object hash initialized anymore and thus we would
crash when trying to parse the object ID outside of a repository.

Fix this issue by dying immediately when we see a "--packfile="
parameter when outside a Git repository. This is not a functional
regression as we would die later on with the same error anyway.

Add a test to detect the segfault. We use the "nongit" function to do
so, which we need to allow-list in `test_must_fail ()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
8a676bdc5c hash-ll: merge with "hash.h"
The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split
out of hash.h to remove dependency on repository.h, 2023-04-22) to make
explicit the split between hash-related functions that rely on the
global `the_repository`, and those that don't. This split is no longer
necessary now that we we have removed the reliance on `the_repository`.

Merge "hash-ll.h" back into "hash.h". This causes some code units to not
include "repository.h" anymore, which requires us to add some forward
declarations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
36026a0f30 refs: avoid include cycle with "repository.h"
There is an include cycle between "refs.h" and "repository.h" via
"commit.h", "object.h" and "hash.h". This has the effect that several
definitions of structs and enums will not be visible once we merge
"hash-ll.h" back into "hash.h" in the next commit.

The only reason that "repository.h" includes "refs.h" is the definition
of `enum ref_storage_format`. Move it into "repository.h" and have
"refs.h" include "repository.h" instead to fix the cycle.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
e7da938570 global: introduce USE_THE_REPOSITORY_VARIABLE macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
7abbca0e74 hash: require hash algorithm in empty_tree_oid_hex()
The `empty_tree_oid_hex()` function use `the_repository` to derive the
hash function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.

While at it, remove the unused `empty_blob_oid_hex()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
9c34eb93fb hash: require hash algorithm in is_empty_{blob,tree}_oid()
Both functions `is_empty_{blob,tree}_oid()` use `the_repository` to
derive the hash function that shall be used. Require callers to pass in
the hash algorithm to get rid of this implicit dependency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
861e8c76f6 hash: make is_null_oid() independent of the_repository
The function `is_null_oid()` uses `oideq(oid, null_oid())` to check
whether a given object ID is the all-zero object ID. `null_oid()`
implicitly relies on `the_repository` though to return the correct null
object ID.

Get rid of this dependency by always comparing the complete hash array
for being all-zeroes. This is possible due to the refactoring of object
IDs so that their hash arrays are always fully initialized.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
d4d364b2c7 hash: convert oidcmp() and oideq() to compare whole hash
With the preceding commit, the hash array of object IDs is now fully
zero-padded even when the hash algorithm's output is smaller than the
array length. With that, we can now adapt both `oidcmp()` and `oideq()`
to unconditionally memcmp(3P) the whole array instead of depending on
the hash size.

While it may feel inefficient to compare unused bytes for e.g. SHA-1, in
practice the compiler should now be able to produce code that is better
optimized both because we have no branch anymore, but also because the
size to compare is now known at compile time. Goldbolt spits out the
following assembly on an x86_64 platform with GCC 14.1 for the old and
new implementations of `oidcmp()`:

    oidcmp_old:
            movsx   rax, DWORD PTR [rdi+32]
            test    eax, eax
            jne     .L2
            mov     rax, QWORD PTR the_repository[rip]
            cmp     QWORD PTR [rax+16], 32
            je      .L6
    .L4:
            mov     edx, 20
            jmp     memcmp
    .L2:
            lea     rdx, [rax+rax*2]
            lea     rax, [rax+rdx*4]
            lea     rax, hash_algos[0+rax*8]
            cmp     QWORD PTR [rax+16], 32
            jne     .L4
    .L6:
            mov     edx, 32
            jmp     memcmp

    oidcmp_new:
            mov     edx, 32
            jmp     memcmp

The new implementation gets ridi of all the branches and effectively
only ends setting up `edx` for `memcmp()` and then calling it.

And for `oideq()`:

    oideq_old:
            movsx   rcx, DWORD PTR [rdi+32]
            mov     rax, rdi
            mov     rdx, rsi
            test    ecx, ecx
            jne     .L2
            mov     rcx, QWORD PTR the_repository[rip]
            cmp     QWORD PTR [rcx+16], 32
            mov     rcx, QWORD PTR [rax]
            je      .L12
    .L4:
            mov     rsi, QWORD PTR [rax+8]
            xor     rcx, QWORD PTR [rdx]
            xor     rsi, QWORD PTR [rdx+8]
            or      rcx, rsi
            je      .L13
    .L8:
            mov     eax, 1
            test    eax, eax
            sete    al
            movzx   eax, al
            ret
    .L2:
            lea     rsi, [rcx+rcx*2]
            lea     rcx, [rcx+rsi*4]
            lea     rcx, hash_algos[0+rcx*8]
            cmp     QWORD PTR [rcx+16], 32
            mov     rcx, QWORD PTR [rax]
            jne     .L4
    .L12:
            mov     rsi, QWORD PTR [rax+8]
            xor     rcx, QWORD PTR [rdx]
            xor     rsi, QWORD PTR [rdx+8]
            or      rcx, rsi
            jne     .L8
            mov     rcx, QWORD PTR [rax+16]
            mov     rax, QWORD PTR [rax+24]
            xor     rcx, QWORD PTR [rdx+16]
            xor     rax, QWORD PTR [rdx+24]
            or      rcx, rax
            jne     .L8
            xor     eax, eax
    .L14:
            test    eax, eax
            sete    al
            movzx   eax, al
            ret
    .L13:
            mov     edi, DWORD PTR [rdx+16]
            cmp     DWORD PTR [rax+16], edi
            jne     .L8
            xor     eax, eax
            jmp     .L14

    oideq_new:
            mov     rax, QWORD PTR [rdi]
            mov     rdx, QWORD PTR [rdi+8]
            xor     rax, QWORD PTR [rsi]
            xor     rdx, QWORD PTR [rsi+8]
            or      rax, rdx
            je      .L5
    .L2:
            mov     eax, 1
            xor     eax, 1
            ret
    .L5:
            mov     rax, QWORD PTR [rdi+16]
            mov     rdx, QWORD PTR [rdi+24]
            xor     rax, QWORD PTR [rsi+16]
            xor     rdx, QWORD PTR [rsi+24]
            or      rax, rdx
            jne     .L2
            xor     eax, eax
            xor     eax, 1
            ret

Interestingly, the compiler decides to split the comparisons into two so
that it first compares the lower half of the object ID for equality and
then the upper half. If the first check shows a difference, then we
wouldn't even end up comparing the second half.

In both cases, the new generated code is significantly shorter and has
way less branches. While I didn't benchmark the change, I'd be surprised
if the new code was slower.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:32 -07:00
c98d762ed9 global: ensure that object IDs are always padded
The `oidcmp()` and `oideq()` functions only compare the prefix length as
specified by the given hash algorithm. This mandates that the object IDs
have a valid hash algorithm set, or otherwise we wouldn't be able to
figure out that prefix. As we do not have a hash algorithm in many
cases, for example when handling null object IDs, this assumption cannot
always be fulfilled. We thus have a fallback in place that instead uses
`the_repository` to derive the hash function. This implicit dependency
is hidden away from callers and can be quite surprising, especially in
contexts where there may be no repository.

In theory, we can adapt those functions to always memcmp(3P) the whole
length of their hash arrays. But there exist a couple of sites where we
populate `struct object_id`s such that only the prefix of its hash that
is actually used by the hash algorithm is populated. The remaining bytes
are left uninitialized. The fact that those bytes are uninitialized also
leads to warnings under Valgrind in some places where we copy those
bytes.

Refactor callsites where we populate object IDs to always initialize all
bytes. This also allows us to get rid of `oidcpy_with_padding()`, for
one because the input is now fully initialized, and because `oidcpy()`
will now always copy the whole hash array.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:32 -07:00
9da95bda74 hash: require hash algorithm in oidread() and oidclr()
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:32 -07:00
f4836570a7 hash: require hash algorithm in hasheq(), hashcmp() and hashclr()
Many of our hash functions have two variants, one receiving a `struct
git_hash_algo` and one that derives it via `the_repository`. Adapt all
of those functions to always require the hash algorithm as input and
drop the variants that do not accept one.

As those functions are now independent of `the_repository`, we can move
them from "hash.h" to "hash-ll.h".

Note that both in this and subsequent commits in this series we always
just pass `the_repository->hash_algo` as input even if it is obvious
that there is a repository in the context that we should be using the
hash from instead. This is done to be on the safe side and not introduce
any regressions. All callsites should eventually be amended to use a
repo passed via parameters, but this is outside the scope of this patch
series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:32 -07:00
129cb1b99d hash: drop (mostly) unused is_empty_{blob,tree}_sha1() functions
The functions `is_empty_{blob,tree}_sha1()` are mostly unused, except
for a single callsite in "read-cache.c". Most callsites have long since
been converted to use the equivalents that accept a `struct object_id`
instead of a string.

Adapt the remaining callsite and drop those functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:32 -07:00
aecd794fca remote: drop checks for zero-url case
Now that the previous commit removed the possibility that a "struct
remote" will ever have zero url fields, we can drop a number of
redundant checks and untriggerable code paths.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:39 -07:00
ffce821880 remote: always require at least one url in a remote
When we return a struct from remote_get(), the result _almost_ always
has at least one url. In remotes_remote_get_1(), we do this:

        if (name_given && !valid_remote(ret))
                add_url_alias(remote_state, ret, name);
        if (!valid_remote(ret))
                return NULL;

So if the remote doesn't have a url, we give it one based on the name
(this is how unconfigured urls are used as remotes). And if that doesn't
work, we return NULL.

But there's a catch: valid_remote() checks that we have at least one url
_unless_ the remote.*.vcs field is set. This comes from c578f51d52 (Add
a config option for remotes to specify a foreign vcs, 2009-11-18), and
the whole idea was to support remote helpers that don't have their own
url.

However, that mode has been broken since 25d5cc488a (Pass unknown
protocols to external protocol handlers, 2009-12-09)! That commit
unconditionally looks at the url in get_helper(), causing a segfault
with something like:

  git -c remote.foo.vcs=bar fetch foo

We could fix that now, of course. But given that it has been broken for
almost 15 years and nobody noticed, there's a better option. This weird
"there might not be a url" special case requires checks all over the
code base, and it's not clear if there are other similar segfaults
lurking. It would be nice if we could drop that special case.

So instead, let's let the "the remote name is the url" code kick in. If
you have "remote.foo.vcs", then your url (unless otherwise configured)
is "foo". This does have a visible effect compared to what 25d5cc488a
was trying to do. The idea back then is that for a remote without a url,
we'd run:

   # only one command-line option!
   git-remote-bar foo

whereas with our default url, now we'll run:

  git-remote-bar foo foo

Again, in practice nobody can be relying on this because it has been
segfaulting for 15 years. We should consider just removing this "vcs"
config option entirely, but that would be a user-visible breakage. So by
fixing it this way, we can keep things working that have been working,
and simplify away one special case inside our code.

This fixes the segfault from 25d5cc488a (demonstrated by the test), and
we can build further cleanups on top.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
7384e75618 t5801: test remote.*.vcs config
The usual way to trigger a remote helper is to use the "::" syntax from:
87422439d1 (Allow specifying the remote helper in the url, 2009-11-18).
Doing:

  git config remote.origin.url hg::https://example.com/repo

will run "git-remote-hg origin https://example.com/repo". Or you can
use the fallback handling from 25d5cc488a (Pass unknown protocols to
external protocol handlers, 2009-12-09):

  git config remote.origin.url "foo://bar"

which will run "git-remote-foo origin foo://bar".

But there's a third way, from c578f51d52 (Add a config option for
remotes to specify a foreign vcs, 2009-11-18):

  git config remote.origin.vcs foo
  git config remote.origin.url bar

which will run "git-remote-foo origin bar". This is mostly redundant
with the other methods, except that it is supposed to allow you to run
without a URL at all. So:

  git config remote.origin.vcs foo

would run "git-remote-foo origin" with no extra URL parameter (under the
assumption that the helper somehow knows how to access the remote repo).
However, this mode has been broken since 25d5cc488a, shortly after it
was added! That commit taught the transport code to always look at the
URL string to parse off the "foo::" bits, meaning it would always
segfault in the no-url case. You can see that with:

  git -c remote.foo.vcs=bar fetch foo

Nobody seems to have noticed in the almost 15 years since, so presumably
it's not a well-used feature. And without that, arguably the whole
remote.*.vcs feature could be removed entirely, as it isn't offering
anything you couldn't do with the "helper::" syntax. But it _does_ work
if you have a URL, and it has been advertised in the documentation for
all that time. So we shouldn't just remove it without warning.

Likewise, even if we were going to deprecate it, we should avoid
breaking it in the meantime. Since there are no tests for it at all,
let's add a few basic ones:

  - this syntax doesn't work well with "git clone" (another point
    against it versus "helper::"). But we can use "clone -c" to set up
    the config manually, passing the URL as usual to clone. This does
    work, though note that I had to use --no-local in the test to avoid
    broken interactions between the local code and the helper. In the
    real world this would be a non-issue, since the remote URL would
    generally not also be a local Git repo!

  - likewise, we should be able to set up the config manually and fetch
    into a repository. This also works.

  - we can simulate a vcs that has no URL support by stuffing the remote
    path into another environment variable. This should work, but
    doesn't (it hits the segfault mentioned above).

In the first two cases, I took the extra step of checking GIT_TRACE
output to confirm that we actually ran the helper (since the URL is a
valid Git repo, the clone/fetch would appear to work even if we
didn't use the helper at all!).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
e2269a2b59 t5801: make remote-testgit GIT_DIR setup more robust
Our tests use a fake helper that just imports from an existing Git
repository. We're fed the path to that repo on the command line, and
derive the GIT_DIR by tacking on "/.git".

This is wrong if the path is a bare repository, but that's OK since this
is just a limited test. But it's also wrong if the transport code feeds
us the actual .git directory itself (i.e., we expect "/path/to/repo" but
it gives us "/path/to/repo/.git"). None of the current tests do that,
but let's future-proof ourselves against adding a test that does.

We can instead ask "rev-parse" to set our GIT_DIR. Note that we have to
first unset other git variables from our environment. Coming into this
script, we'll have GIT_DIR set to the fetching repository, and we need
to "switch" to the remote one.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
9badf97c42 remote: allow resetting url list
Because remote.*.url is treated as a multi-valued key, there is no way
to override previous config. So for example if you have
remote.origin.url set to some wrong value, doing:

  git -c remote.origin.url=right fetch

would not work. It would append "right" to the list, which means we'd
still fetch from "wrong" (since subsequent values are used only as push
urls).

Let's provide a mechanism to reset the list, like we do for other
multi-valued keys (e.g., credential.helper, http.extraheaders, and
merge.suppressDest all use this "empty string means reset" pattern).

Reported-by: Mathew George <mathewegeorge@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
bd1b88dc7a config: document remote.*.url/pushurl interaction
The documentation for these keys gives a very terse definition and
points you to the fetch/push manpages. But from reading those pages it
was not at all obvious to me that:

  - these are keys that can be defined multiple times with meaningful
    behavior (especially remote.*.url)

  - the way that pushurl overrides url (the git-push page does mention
    that "pushurl defaults to url", but it is not immediately clear what
    a multi-valued url would do in that situation).

Let's try to summarize the current behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
b68118d2e8 remote: simplify url/pushurl selection
When we want to know the push urls for a remote, there is some simple
logic:

  - if the user configured any remote.*.pushurl keys, then those make
    the complete set of push urls

  - otherwise we push to all urls in remote.*.url

Many spots implement this with a level of indirection, assigning to a
local url/url_nr pair. But since both arrays are now strvecs, we can
just use a pointer to select the appropriate strvec, shortening the code
a bit.

Even though this is now a one-liner, since it is application logic that
is present in so many places, it's worth abstracting a helper function.
In fact, we already have such a function, but it's local to
builtin/push.c. So we'll just make it available everywhere via remote.h.

There are two spots to pay special attention to here:

  1. in builtin/remote.c's get_url(), we are selecting first based on
     push_mode and then falling back to "url" when we're in push_mode
     but no pushurl is defined. The updated code makes that much more
     clear, compared to the original which had an "else" fall-through.

  2. likewise in that file's set_url(), we _only_ respect push_mode,
     sine the point is that we are adding to pushurl in that case
     (whether it is empty or not). And thus it does not use our helper
     function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
8e804415fd remote: use strvecs to store remote url/pushurl
Now that the url/pushurl fields of "struct remote" own their strings, we
can switch from bare arrays to strvecs. This has a few advantages:

  - push/clear are now one-liners

  - likewise the free+assigns in alias_all_urls() can use
    strvec_replace()

  - we now use size_t for storage, avoiding possible overflow

  - this will enable some further cleanups in future patches

There's quite a bit of fallout in the code that reads these fields, as
it tends to access these arrays directly. But it's mostly a mechanical
replacement of "url_nr" with "url.nr", and "url[i]" with "url.v[i]",
with a few variations (e.g. "*url" could become "*url.v", but I used
"url.v[0]" for consistency).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
52595c155a remote: transfer ownership of memory in add_url(), etc
Many of the internal functions in remote.c take const strings and store
them forever in instances of "struct remote". Since the functions are
internal and callers are aware of the convention, this seems to mostly
work and not cause leaks. But there are some issues:

  - it's impossible to clear any of the arrays, because the data
    dependencies between them are too muddled (if you free() a string,
    it might also be referenced from another array, causing a
    user-after-free; but if you don't, that might be the last reference,
    causing a leak).

    This is mostly of interest for further refactoring and features, but
    there's at least one spot that's already a problem. In alias_all_urls(),
    we replace elements of remote->url and remote->pushurl with their
    aliased forms, dropping references to the original.

  - sometimes strings from outside callers make their way in. For
    example, calling remote_get("foo") when there is no configured "foo"
    remote will create a remote struct with the single url "foo". But
    we'll do so by holding on to the string passed to remote_get()
    forever.

    In practice I think this works out because we'd usually pass in a
    string that lasts the length of the program (a string literal, or
    argv reference, or other data structure allocated in the main
    function). But it's a rather subtle requirement.

Instead, let's have remote->url and remote->pushurl own their string
memory. They'll copy the const strings that are passed in, and callers
can stop making their own copies. Likewise, when we overwrite an entry,
we can free the memory it points to, fixing the leak mentioned above.

We'll leave the struct members as "const" since they are visible to the
outside world, and shouldn't usually be touched. This requires casting
on free() for now, but we'll clean that further in a future patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:37 -07:00
aa0595fbd6 remote: refactor alias_url() memory ownership
The alias_url() function may return either a newly allocated string
(which the caller must take ownership of), or the original const "url"
parameter that was passed in.

This often works OK because callers are generally passing in a "url"
that they expect to retain ownership of anyway. So whether we got back
the original or a new string, we're always interested in storing it
forever. But I suspect there are some possible leaks here (e.g.,
add_url_alias() may end up discarding the original "url").

Whether there are active leaks or not, this is a confusing setup that
makes further refactoring of memory ownership harder. So instead of
returning the original string, return NULL, forcing callers to decide
what to do with it explicitly. We can then build further cleanups on top
of that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:37 -07:00
0295ce7cbf archive: fix check for missing url
Running "git archive --remote" checks that we have at least one url for
the remote. It does so by looking at remote.url[0], but that won't work;
if we have no url at all, then remote.url will be NULL, and we'll
segfault.

Check url_nr instead, which is a more direct way of asking what we
want.

You can trigger the segfault like this:

  git -c remote.foo.vcs=bar archive --remote=foo

but I didn't bother adding a test. This is the tip of the iceberg for
no-url remotes, and a later patch will improve that situation. I just
wanted to clean up this bug so it didn't make further refactoring of
this code more confusing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:37 -07:00
028bb23a61 BreakingChanges: document that we do not plan to deprecate git-checkout
The git-checkout(1) command is seen by many as hard to understand
because it connects two somewhat unrelated features: switching between
branches and restoring worktree files from arbitrary revisions. In 2019,
we thus implemented two new commands git-switch(1) and git-restore(1) to
split out these separate concerns into standalone functions.

This "replacement" of git-checkout(1) has repeatedly triggered concerns
for our userbase that git-checkout(1) will eventually go away. This is
not the case though: the use of that command is still widespread, and it
is not expected that this will change anytime soon.

Document that all three commands will remain for the foreseeable future.
This decision may be revisited in case we ever figure out that most
everyone has given up on any of the commands.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:08:52 -07:00
fcf0f4801d BreakingChanges: document removal of grafting
The grafting mechanism for objects has been deprecated in e650d0643b
(docs: mark info/grafts as outdated, 2014-03-05), which is more than a
decade ago. The mechanism can lead to hard-to-debug issues and has a
superior replacement with replace refs.

Follow through with the deprecation and mark grafts for removal in Git
3.0.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:08:52 -07:00
6ccf041d1d BreakingChanges: document upcoming change from "sha1" to "sha256"
Starting with 8e42eb0e9a (doc: sha256 is no longer experimental,
2023-07-31), the "sha256" object format is no longer considered to be
experimental. Furthermore, the SHA-1 hash function is actively
recommended against by for example NIST and FIPS 140-2, and attacks
against it are becoming more practical both due to new weaknesses
(SHAppening, SHAttered, Shambles) and due to the ever-increasing
computing power. It is only a matter of time before it can be considered
to be broken completely.

Let's plan for this event by being active instead of waiting for it to
happend and announce that the default object format is going to change
from "sha1" to "sha256" with Git 3.0.

All major Git implementations (libgit2, JGit, go-git) support the
"sha256" object format and are thus prepared for this change. The most
important missing piece in the puzzle is support in forges. But while
GitLab recently gained experimental support for the "sha256" object
format though, to the best of my knowledge GitHub doesn't support it
yet. Ideally, announcing this upcoming change will encourage forges to
start building that support.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:08:52 -07:00
57ec9254eb docs: introduce document to announce breaking changes
Over time, Git has grown quite a lot. With this evolution, many ideas
that were sensible at the time they were introduced are not anymore and
are thus considered to be deprecated. And while some deprecations may be
noted in manpages, most of them are actually deprecated in the "hive
mind" of the Git community, only.

Introduce a new document that tracks such breaking changes, but also
deprecations which we are not willing to go through with, to address
this issue. This document serves multiple purposes:

  - It is a way to facilitate discussion around proposed deprecations.

  - It allows users to learn about deprecations and speak up in case
    they have good reasons why a certain feature should not be
    deprecated.

  - It states intent and documents where the Git project wants to go,
    both in the case where we want to deprecate, but also in the case
    where we don't want to deprecate a specific feature.

The document is _not_ intended to cast every single discussion into
stone. It is supposed to be a living document that may change over time
when there are good reasons for it to change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:08:52 -07:00
10aa7c74a2 Merge branch 'gt/unit-test-oidtree' into ps/use-the-repository
* gt/unit-test-oidtree:
  t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c
2024-06-13 09:39:46 -07:00
092b33da2b Merge branch 'ps/ref-storage-migration' into ps/use-the-repository
* ps/ref-storage-migration:
  builtin/refs: new command to migrate ref storage formats
  refs: implement logic to migrate between ref storage formats
  refs: implement removal of ref storages
  worktree: don't store main worktree twice
  reftable: inline `merged_table_release()`
  refs/files: fix NULL pointer deref when releasing ref store
  refs/files: extract function to iterate through root refs
  refs/files: refactor `add_pseudoref_and_head_entries()`
  refs: allow to skip creation of reflog entries
  refs: pass storage format to `ref_store_init()` explicitly
  refs: convert ref storage format to an enum
  setup: unset ref storage when reinitializing repository version
2024-06-13 09:39:08 -07:00
f1160393c1 commit-graph: increment progress indicator
This fixes a bug that was introduced by 368d19b0b7 (commit-graph:
refactor compute_topological_levels(), 2023-03-20): Previously, the
progress indicator was updated from `i + 1` where `i` is the loop
variable of the enclosing `for` loop. After this patch, the update used
`info->progress_cnt + 1` instead, however, unlike `i`, the
`progress_cnt` attribute was not incremented. Let's increment it.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: squashed in a test update from Patrick Steinhardt]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 13:52:14 -07:00
d63586cb31 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 13:37:18 -07:00
2a061a62e2 Merge branch 'gt/decorate-unit-test'
A test helper that essentially is unit tests on the "decorate"
logic has been rewritten using the unit-tests framework.

* gt/decorate-unit-test:
  t/: migrate helper/test-example-decorate to the unit testing framework
2024-06-12 13:37:18 -07:00
51ea70c18a Merge branch 'jk/sparse-leakfix'
Many memory leaks in the sparse-checkout code paths have been
plugged.

* jk/sparse-leakfix:
  sparse-checkout: free duplicate hashmap entries
  sparse-checkout: free string list after displaying
  sparse-checkout: free pattern list in sparse_checkout_list()
  sparse-checkout: free sparse_filename after use
  sparse-checkout: refactor temporary sparse_checkout_patterns
  sparse-checkout: always free "line" strbuf after reading input
  sparse-checkout: reuse --stdin buffer when reading patterns
  dir.c: always copy input to add_pattern()
  dir.c: free removed sparse-pattern hashmap entries
  sparse-checkout: clear patterns when init() sees existing sparse file
  dir.c: free strings in sparse cone pattern hashmaps
  sparse-checkout: pass string literals directly to add_pattern()
  sparse-checkout: free string list in write_cone_to_file()
2024-06-12 13:37:17 -07:00
c2f79440ac Merge branch 'jk/cap-exclude-file-size'
An overly large ".gitignore" files are now rejected silently.

* jk/cap-exclude-file-size:
  dir.c: reduce max pattern file size to 100MB
  dir.c: skip .gitignore, etc larger than INT_MAX
2024-06-12 13:37:17 -07:00
b8bdb2f283 Merge branch 'jc/safe-directory-leading-path'
The safe.directory configuration knob has been updated to
optionally allow leading path matches.

* jc/safe-directory-leading-path:
  safe.directory: allow "lead/ing/path/*" match
2024-06-12 13:37:16 -07:00
22cf18fd9e Merge branch 'gt/t-hash-unit-test'
A pair of test helpers that essentially are unit tests on hash
algorithms have been rewritten using the unit-tests framework.

* gt/t-hash-unit-test:
  t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash
  strbuf: introduce strbuf_addstrings() to repeatedly add a string
2024-06-12 13:37:15 -07:00
56346ba24e Merge branch 'cp/reftable-unit-test'
Basic unit tests for reftable have been reimplemented under the
unit test framework.

* cp/reftable-unit-test:
  t: improve the test-case for parse_names()
  t: add test for put_be16()
  t: move tests from reftable/record_test.c to the new unit test
  t: move tests from reftable/stack_test.c to the new unit test
  t: move reftable/basics_test.c to the unit testing framework
2024-06-12 13:37:14 -07:00
a39e28ace7 Merge branch 'jc/t1517-more'
A new test was added to ensure git commands that are designed to
run outside repositories do work.

* jc/t1517-more:
  imap-send: minimum leakfix
  t1517: more coverage for commands that work without repository
2024-06-12 13:37:14 -07:00
ed54840872 t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c
helper/test-oidtree.c along with t0069-oidtree.sh test the oidtree.h
library, which is a wrapper around crit-bit tree. Migrate them to
the unit testing framework for better debugging and runtime
performance. Along with the migration, add an extra check for
oidtree_each() test, which showcases how multiple expected matches can
be given to check_each() helper.

To achieve this, introduce a new library called 'lib-oid.h'
exclusively for the unit tests to use. It currently mainly includes
utility to generate object_id from an arbitrary hex string
(i.e. '12a' -> '12a0000000000000000000000000000000000000'). This also
handles the hash algo selection based on GIT_TEST_DEFAULT_HASH.
This library will also be helpful when we port other unit tests such
as oid-array, oidset etc.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
[jc: small fixlets squashed in]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 13:33:20 -07:00
037df60013 object-name: don't try to abbreviate to lengths greater than hexsz
When given a length that equals the current hash algorithm's hex size,
then `repo_find_unique_abbrev_r()` exits early without trying to find an
abbreviation. This is only sensible because there is nothing to
abbreviate in the first place, so searching through objects to find a
unique prefix would be a waste of compute.

What we don't handle though is the case where the user passes a length
greater than the hash length. This is fine in practice as we still
compute the correct result. But at the very least, this is a waste of
resources as we try to abbreviate a value that cannot be abbreviated,
which causes us to hit the object database.

Start to explicitly handle values larger than hexsz to avoid this
performance penalty, which leads to a measureable speedup. The following
benchmark has been executed in linux.git:

  Benchmark 1: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD~)
    Time (mean ± σ):     12.812 s ±  0.040 s    [User: 12.225 s, System: 0.554 s]
    Range (min … max):   12.723 s … 12.857 s    10 runs

  Benchmark 2: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD)
    Time (mean ± σ):     11.095 s ±  0.029 s    [User: 10.546 s, System: 0.521 s]
    Range (min … max):   11.037 s … 11.122 s    10 runs

  Summary
    git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD) ran
      1.15 ± 0.00 times faster than git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 12:57:18 -07:00
59ff92c516 parse-options-cb: stop clamping "--abbrev=" to hash length
The `OPT__ABBREV()` option allows the user to specify the length that
object hashes shall be abbreviated to. This length needs to be in the
range of `(MIN_ABBREV, the_hash_algo->hexsz)`, which is why we clamp the
value as required. While this makes sense in the case of `MIN_ABBREV`,
it is unnecessary for the upper boundary as the value is eventually
passed down to `repo_find_unnique_abbrev_r()`, which handles values
larger than the current hash length just fine.

In the preceding commit, we have changed parsing of the "core.abbrev"
config to stop clamping to the upper boundary. Let's do the same here so
that the code becomes simpler, we are consistent with how we treat the
"core.abbrev" config and so that we stop depending on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 12:57:18 -07:00
524c0183c9 config: fix segfault when parsing "core.abbrev" without repo
The "core.abbrev" config allows the user to specify the minimum length
when abbreviating object hashes. Next to the values "auto" and "no",
this config also accepts a concrete length that needs to be bigger or
equal to the minimum length and smaller or equal to the hash algorithm's
hex length. While the former condition is trivial, the latter depends on
the object format used by the current repository. It is thus a variable
upper boundary that may either be 40 (SHA-1) or 64 (SHA-256).

This has two major downsides. First, the user that specifies this config
must be aware of the object hashes that its repository use. If they want
to configure the value globally, then they cannot pick any value in the
range `[41, 64]` if they have any repository that uses SHA-1. If they
did, Git would error out when parsing the config.

Second, and more importantly, parsing "core.abbrev" crashes when outside
of a Git repository because we dereference `the_hash_algo` to figure out
its hex length. Starting with c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) though, we stopped initializing
`the_hash_algo` outside of Git repositories.

Fix both of these issues by not making it an error anymore when the
given length exceeds the hash length. Instead, leave the abbreviated
length intact. `repo_find_unique_abbrev_r()` handles this just fine
except for a performance penalty which we will fix in a subsequent
commit.

Reported-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 12:57:18 -07:00
1d969afb78 Makefile: add ability to append to CFLAGS and LDFLAGS
There are some usecases where we may want to append CFLAGS to the
default CFLAGS set by Git. This could for example be to enable or
disable specific compiler warnings or to change the optimization level
that code is compiled with. This cannot be done without overriding the
complete CFLAGS value though and thus requires the user to redeclare the
complete defaults used by Git.

Introduce a new variable `CFLAGS_APPEND` that gets appended to the
default value of `CFLAGS`. As compiler options are last-one-wins, this
fulfills both of the usecases mentioned above. It's also common practice
across many other projects to have such a variable.

While at it, also introduce a matching `LDFLAGS_APPEND` variable. While
there isn't really any need for this variable as there are no default
`LDFLAGS`, users may expect this variable to exist, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:11:43 -07:00
e162aed591 pack-revindex.c: guard against out-of-bounds pack lookups
The function midx_key_to_pack_pos() is a helper function used by
midx_to_pack_pos() and midx_pair_to_pack_pos() to translate a (pack,
offset) tuple into a position into the MIDX pseudo-pack order.

Ensure that the pack ID given to midx_pair_to_pack_pos() is bounded by
the number of packs within the MIDX to prevent, for instance,
uninitialized memory from being used as a pack ID.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:08:28 -07:00
ed4a1d6ae1 pack-bitmap.c: avoid uninitialized pack_int_id during reuse
When performing multi-pack reuse, reuse_partial_packfile_from_bitmap()
is responsible for generating an array of bitmapped_pack structs from
which to perform reuse.

In the multi-pack case, we loop over the MIDXs packs and copy the result
of calling `nth_bitmapped_pack()` to construct the list of reusable
paths.

But we may also want to do pack-reuse over a single pack, either because
we only had one pack to perform reuse over (in the case of single-pack
bitmaps), or because we explicitly asked to do single pack reuse even
with a MIDX[^1].

When this is the case, the array we generate of reusable packs contains
only a single element, which is either (a) the pack attached to the
single-pack bitmap, or (b) the MIDX's preferred pack.

In 795006fff4 (pack-bitmap: gracefully handle missing BTMP chunks,
2024-04-15), we refactored the reuse_partial_packfile_from_bitmap()
function and stopped assigning the pack_int_id field when reusing only
the MIDX's preferred pack. This results in an uninitialized read down in
try_partial_reuse() like so:

    ==7474==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55c5cd191dde in try_partial_reuse pack-bitmap.c:1887:8
    #1 0x55c5cd191dde in reuse_partial_packfile_from_bitmap_1 pack-bitmap.c:2001:8
    #2 0x55c5cd191dde in reuse_partial_packfile_from_bitmap pack-bitmap.c:2105:3
    #3 0x55c5cce0bd0e in get_object_list_from_bitmap builtin/pack-objects.c:4043:3
    #4 0x55c5cce0bd0e in get_object_list builtin/pack-objects.c:4156:27
    #5 0x55c5cce0bd0e in cmd_pack_objects builtin/pack-objects.c:4596:3
    #6 0x55c5ccc8fac8 in run_builtin git.c:474:11

which happens when try_partial_reuse() tries to call
midx_pair_to_pack_pos() when it tries to reject cross-pack deltas.

Avoid the uninitialized read by ensuring that the pack_int_id field is
set in the single-pack reuse case by setting it to either the MIDX
preferred pack's pack_int_id, or '-1', in the case of single-pack
bitmaps.  In the latter case, we never read the pack_int_id field, so
the choice of '-1' is intentional as a "garbage in, garbage out"
measure.

Guard against further regressions in this area by adding a test which
ensures that we do not throw out deltas from the preferred pack as
"cross-pack" due to an uninitialized pack_int_id.

[^1]: This can happen for a couple of reasons, either because the
  repository is configured with 'pack.allowPackReuse=(true|single)', or
  because the MIDX was generated prior to the introduction of the BTMP
  chunk, which contains information necessary to perform multi-pack
  reuse.

Reported-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:08:28 -07:00
0c5a62f14b midx-write.c: do not read existing MIDX with packs_to_include
Commit d6a8c58675 (midx-write.c: support reading an existing MIDX with
`packs_to_include`, 2024-05-29) changed the MIDX generation machinery to
support reading from an existing MIDX when writing a new one.

Unfortunately, the rest of the MIDX generation machinery is not prepared
to deal with such a change. For instance, the function responsible for
adding to the object ID fanout table from a MIDX source
(midx_fanout_add_midx_fanout()) will gladly add objects from an existing
MIDX for some fanout level regardless of whether or not those objects
came from packs that are to be included in the subsequent MIDX write.

This results in broken pseudo-pack object order (leading to incorrect
object traversal results) and segmentation faults, like so (generated by
running the added test prior to the changes in midx-write.c):

    #0  0x000055ee31393f47 in midx_pack_order (ctx=0x7ffdde205c70) at midx-write.c:590
    #1  0x000055ee31395a69 in write_midx_internal (object_dir=0x55ee32570440 ".git/objects",
        packs_to_include=0x7ffdde205e20, packs_to_drop=0x0, preferred_pack_name=0x0,
        refs_snapshot=0x0, flags=15) at midx-write.c:1171
    #2  0x000055ee31395f38 in write_midx_file_only (object_dir=0x55ee32570440 ".git/objects",
        packs_to_include=0x7ffdde205e20, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15)
        at midx-write.c:1274
    [...]

In stack frame #0, the code on midx-write.c:590 is using the new pack ID
corresponding to some object which was added from the existing MIDX.
Importantly, the pack from which that object was selected in the
existing MIDX does not appear in the new MIDX as it was excluded via
`--stdin-packs`.

In this instance, the pack in question had pack ID "1" in the existing
MIDX, but since it was excluded from the new MIDX, we never filled in
that entry in the pack_perm table, resulting in:

    (gdb) p *ctx->pack_perm@2
    $1 = {0, 1515870810}

Which is what causes the segfault above when we try and read:

    struct pack_info *pack = &ctx->info[ctx->pack_perm[i]];
    if (pack->bitmap_pos == BITMAP_POS_UNKNOWN)
        pack->bitmap_pos = 0;

Fundamentally, we should be able to read information from an existing
MIDX when generating a new one. But in practice the midx-write.c code
assumes that we won't run into issues like the above with incongruent
pack IDs, and often makes those assumptions in extremely subtle and
fragile ways.

Instead, let's avoid reading from an existing MIDX altogether, and stick
with the pre-d6a8c58675 implementation. Harden against any regressions
in this area by adding a test which demonstrates these issues.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:08:28 -07:00
fbf7a46d88 builtin/blame: fix leaking ignore revs files
When parsing the blame configuration we add "blame.ignoreRevsFile"
configs to a string list. This string list is declared as with `NODUP`,
and thus we hand over the allocated string to that list. We eventually
end up calling `string_list_clear()` on that list, but due to it being
declared as `NODUP` we will not release the associated strings and thus
leak memory.

Fix this issue by setting up the list as `DUP` instead and free the
config string after insertion.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
3332f35577 builtin/blame: fix leaking prefixed paths
In `cmd_blame()` we compute prefixed paths by calling `add_prefix()`,
which itself calls `prefix_path()`. While `prefix_path()` returns an
allocated string, `add_prefix()` pretends to return a constant string.
Consequently, this path never gets freed.

Fix the return type to be `char *` and free the path to plug the memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
ee6a998583 blame: fix leaking data for blame scoreboards
There are some memory leaks when cleaning up blame scoreboards. Fix
those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
4b4f5a911c line-range: plug leaking find functions
In `parse_range_funcname()` we may end up allocating a "find function",
but never free it. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
44ec7c575f merge: fix leaking merge bases
When calling either the recursive or the ORT merge machineries we need
to provide a list of merge bases. The ownership of that parameter is
then implicitly transferred to the callee, which is somewhat fishy.
Furthermore, that list may leak in some cases where the merge machinery
runs into an error, thus causing a memory leak.

Refactor the code such that we stop transferring ownership. Instead, the
merge machinery will now create its own local copies of the passed in
list as required if they need to modify the list. Free the list at the
callsites as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
77241a6b5e builtin/merge: fix leaking struct cmdnames in get_strategy()
In "builtin/merge.c" we use the helper infrastructure to figure out what
merge strategies there are. We never free contents of the `cmdnames`
structures though and thus leak their memory.

Fix this by exposing the already existing `clean_cmdnames()` function to
release their memory. As this name isn't quite idiomatic, rename it to
`cmdnames_release()` while at it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
6e95f4ee03 sequencer: fix memory leaks in make_script_with_merges()
Fix some trivial memory leaks in `make_script_with_merges()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
8909d6e1a1 builtin/clone: plug leaking HEAD ref in wanted_peer_refs()
In `wanted_peer_refs()` we first create a copy of the "HEAD" ref. This
copy may not actually be passed back to the caller, but is not getting
freed in this case. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
4806c55c86 apply: fix leaking string in match_fragment()
Before calling `update_pre_post_images()`, we call `strbuf_detach()` to
put its buffer into a new string variable that we then pass to that
function. Besides being rather pointless, it also causes us to leak
memory of that variable because we never free it.

Get rid of the variable altogether and instead reach into the `strbuf`
directly. While at it, refactor the code to have a common exit path and
mark string that do not contain allocated memory as constant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
1e5c1601f9 sequencer: fix leaking string buffer in commit_staged_changes()
We're leaking the `rev` string buffer in various call paths. Refactor
the function to have a common exit path so that we can release its
memory reliably.

This fixes a subset of tests failing with the memory sanitizer in t3404.
But as there are more failures, we cannot yet mark the whole test suite
as passing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
63c9bd372e commit: fix leaking parents when calling commit_tree_extended()
When creating commits via `commit_tree_extended()`, the caller passes in
a string list of parents. This call implicitly transfers ownership of
that list to the function, which is quite surprising to begin with. But
to make matters worse, `commit_tree_extended()` doesn't even bother to
free the list of parents in error cases. The result is a memory leak,
and one that the caller cannot fix by themselves because they do not
know whether parts of the string list have already been released.

Refactor the code such that callers can keep ownership of the list of
parents, which is getting indicated by parameter being a constant
pointer now. Free the lists at the calling site and add a common exit
path to those sites as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
c6eb58bfb1 config: fix leaking "core.notesref" variable
The variable used to track the "core.notesref" config is not getting
freed before we assign to it and thus leaks. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
f46ede661f rerere: fix various trivial leaks
We leak various different string lists in the rerere code. Free those to
plug them.

Note that the `merge_rr` variable is intentionally being free'd with the
`free_util` parameter set to 1. The `util` field is used there to store
the IDs of every rerere item and thus needs to be freed, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
748bd0943b builtin/stash: fix leak in show_stash()
We leak the `revision_args()` variable. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
a90a089611 revision: free diff options
There is a todo comment in `release_revisions()` that mentions that we
need to free the diff options, which was added via 54c8a7c379 (revisions
API: add a TODO for diff_free(&revs->diffopt), 2022-04-14). Releasing
the diff options wasn't quite feasible at that time because some call
sites rely on its contents to remain even after the revisions have been
released.

In fact, there really only are a couple of callsites that misbehave
here:

  - `cmd_shortlog()` releases the revisions, but continues to access its
    file pointer.

  - `do_diff_cache()` creates a shallow copy of `struct diff_options`,
    but does not set the `no_free` member. Consequently, we end up
    releasing resources of the caller-provided diff options.

  - `diff_free()` and friends do not play nice when being called
    multiple times as they don't unset data structures that they have
    just released.

Fix all of those cases and enable the call to `diff_free()`, which plugs
a bunch of memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
a282dbeba7 builtin/log: fix leaking commit list in git-cherry(1)
We're storing the list of commits that git-cherry(1) is about to print
into a temporary list. This list is never getting free'd and thus leaks.
Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
8ff6bd4750 merge-recursive: fix memory leak when finalizing merge
We do not free some members of `struct merge_options`' private data.
Fix this to plug those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
3199b22e7d builtin/merge-recursive: fix leaking object ID bases
In `cmd_merge_recursive()` we have a static array of object ID bases
that we pass to `merge_recursive_generic()`. This interface is somewhat
weird though because the latter function accepts a pointer to a pointer
of object IDs, which requires us to allocate the object IDs on the heap.
And as we never free those object IDs, the end result is a leak.

While we can easily solve this leak by just freeing the respective
object IDs, the whole calling convention is somewhat weird. Instead,
refactor `merge_recursive_generic()` to accept a plain pointer to object
IDs so that we can avoid allocating them altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
9e903a5531 builtin/difftool: plug memory leaks in run_dir_diff()
We're leaking a bunch of memory leaks in `run_dir_diff()`. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
f87c55c264 object-name: free leaking object contexts
While it is documented in `struct object_context::path` that this
variable needs to be released by the caller, this fact is rather easy to
miss given that we do not ever provide a function to release the object
context. And of course, while some callers dutifully release the path,
many others don't.

Introduce a new `object_context_release()` function that releases the
path. Convert callsites that used to free the path to use that new
function and add missing calls to callsites that were leaking memory.
Refactor those callsites as required to have a single return path, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
61f8bb1ec1 builtin/rev-list: fix leaking bitmap index when calculating disk usage
git-rev-list(1) can speed up its object size calculations for reachable
objects via a bitmap walk, if there is any bitmap. This is done in
`try_bitmap_disk_usage()`, which tries to optimistically load the bitmap
and then use it, if available. It never frees it though, leading to a
memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
f644dc8494 notes: fix memory leak when pruning notes
In `prune_notes()` we first store the notes that are to be deleted in a
local list, and then iterate through that list to delete those notes one
by one. We never free the list though and thus leak its memory. Fix
this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
9748537437 revision: fix leaking display notes
We never free the display notes options embedded into `struct revision`.
Implement a new function `release_display_notes()` that we can call in
`release_revisions()` to fix this.

There is another gotcha here though: we play some games with the string
list used to track extra notes refs, where we sometimes set the bit that
indicates that strings should be strdup'd and sometimes unset it. This
dance is done to avoid a copy of an already-allocated string when we
call `enable_ref_display_notes()`. But this dance is rather pointless as
we can instead call `string_list_append_nodup()` to transfer ownership
of the allocated string to the list.

Refactor the code to do so and drop the `strdup_strings` dance.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
3d31d38255 merge-recursive: fix leaking rename conflict info
When computing rename conflicts in our recursive merge algorithm we set
up `struct rename_conflict_info`s to track that information. We never
free those data structures though and thus leak memory.

We need to be a bit more careful here though because the same rename
conflict info can be assigned to multiple structures. Accommodate for
this by introducing a `rename_conflict_info_owned` bit that we can use
to steer whether or not the rename conflict info shall be free'd.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
afb0653d23 biultin/rev-parse: fix memory leaks in --parseopt mode
We have a bunch of memory leaks in git-rev-parse(1)'s `--parseopt` mode.
Refactor the code to use `struct strvec`s to make it easier for us to
track the lifecycle of those leaking variables and then free them.

While at it, remove the unneeded static lifetime for some of the
variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
11ee9a75e7 bundle: plug leaks in create_bundle()
When creating a bundle, we set up a revision walk, but never release
data associated with it. Furthermore, we create a mostly-shallow copy of
that revision walk where we only adapt its pending objects such that we
can reuse the walk. While that copy must not be released, the pending
objects array need to be.

Plug those memory leaks by releasing the revision walk and the pending
objects of the copied revision walk.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
bb8c43d5cd notes-utils: free note trees when releasing copied notes
While we clear most of the members of `struct notes_rewrite_cfg` in
`finish_copy_notes_for_rewrite()`, we do not clear the notes tree. Fix
this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
14da26230a parse-options: fix leaks for users of OPT_FILENAME
The `OPT_FILENAME()` option will, if set, put an allocated string into
the user-provided variable. Consequently, that variable thus needs to be
free'd by the caller of `parse_options()`. Some callsites don't though
and thus leak memory. Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:04 -07:00
56931c4d89 revision: fix memory leak when reversing revisions
When reversing revisions in a rev walk, `get_revision()` will allocate a
new commit list and assign it to `revs->commits`. It does not free the
old list though, which makes it leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:04 -07:00
8d94cfb545 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 10:30:39 -07:00
5235e56ea5 Merge branch 'jk/leakfixes'
Memory leaks in "git mv" has been plugged.

* jk/leakfixes:
  mv: replace src_dir with a strvec
  mv: factor out empty src_dir removal
  mv: move src_dir cleanup to end of cmd_mv()
  t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
  t-strvec: use va_end() to match va_start()
2024-06-10 10:30:39 -07:00
718b50e3bf Merge branch 'iw/trace-argv-on-alias'
The alias-expanded command lines are logged to the trace output.

* iw/trace-argv-on-alias:
  run-command: show prepared command
  Documentation: alias: add notes on shell expansion
  Documentation: alias: rework notes into points
2024-06-10 10:30:38 -07:00
d7b97b7185 diff: let external diffs report that changes are uninteresting
The options --exit-code and --quiet instruct git diff to indicate
whether it found any significant changes by exiting with code 1 if it
did and 0 if there were none.  Currently this doesn't work if external
diff programs are involved, as we have no way to learn what they found.

Add that ability in the form of the new configuration options
diff.trustExitCode and diff.<driver>.trustExitCode and the environment
variable GIT_EXTERNAL_DIFF_TRUST_EXIT_CODE.  They pair with the config
options diff.external and diff.<driver>.command and the environment
variable GIT_EXTERNAL_DIFF, respectively.

The new options are off by default, keeping the old behavior.  Enabling
them indicates that the external diff returns exit code 1 if it finds
significant changes and 0 if it doesn't, like diff(1).

The name of the new options is taken from the git difftool and mergetool
options of similar purpose.  (There they enable passing on the exit code
of a diff tool and to infer whether a merge done by a merge tool is
successful.)

The new feature sets the diff flag diff_from_contents in
diff_setup_done() if we need the exit code and are allowed to call
external diffs.  This disables the optimization that avoids calling the
program with --quiet.  Add it back by skipping the call if the external
diff is not able to report empty diffs.  We can only do that check after
evaluating the file-specific attributes in run_external_diff().

If we do run the external diff with --quiet, send its output to
/dev/null.

I considered checking the output of the external diff to check whether
its empty.  It was added as 11be65cfa4 (diff: fix --exit-code with
external diff, 2024-05-05) and quickly reverted, as it does not work
with external diffs that do not write to stdout.  There's no reason why
a graphical diff tool would even need to write anything there at all.

I also considered using a non-zero exit code for empty diffs, which
could be done without adding new configuration options.  We'd need to
disable the optimization that allows git diff --quiet to skip calling
external diffs, though -- that might be quite surprising if graphical
diff programs are involved.  And assigning the opposite meaning of the
exit codes compared to diff(1) and git diff --exit-code to the external
diff can cause unnecessary confusion.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:20:46 -07:00
54443bbfc3 userdiff: add and use struct external_diff
Wrap the string specifying the external diff command in a new struct to
simplify adding attributes, which the next patch will do.

Make sure external_diff() still returns NULL if neither the environment
variable GIT_EXTERNAL_DIFF nor the configuration option diff.external is
set, to continue allowing its use in a boolean context.

Use a designated initializer for the default builtin userdiff driver to
adjust to the type change of the second struct member.  Spelling out
only the non-zero members improves readability as a nice side-effect.

No functional change intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:19:20 -07:00
33be6cf51a t4020: test exit code with external diffs
Add tests to check the exit code of git diff with its options --quiet
and --exit-code when using an external diff program.  Currently we
cannot tell whether it found significant changes or not.

While at it, document briefly that --quiet turns off execution of
external diff programs because that behavior surprised me for a moment
while writing the tests.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:19:20 -07:00
99c7de732e __attribute__: add a few missing format attributes
A public function mem_pool_strfmt() takes printf like parameters,
but is not given an attribute as such.  Also a few file-scope static
functions were missing their format attribute.

Add them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:16:30 -07:00
ba744647ea __attribute__: mark some functions with LAST_ARG_MUST_BE_NULL
Some varargs functions that use NULL-terminated parameter list were
missing __attributes__ ((sentinel)) aka LAST_ARG_MUST_BE_NULL.

Add them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:16:30 -07:00
f52c9a2a28 __attribute__: remove redundant attribute declaration for git_die_config()
The convention is to declare the function attribute to an extern
function together with its declaration in the header file, without
repeating the attribute declaration with its definition in the .c
source file (a file-scope static function declares its attribute
together with its definition in the .c file it is defined, as there
is no other place to do so).

The definition of git_die_config() in config.c did not follow the
convention and had its attribute declared with both its declaration
in the header and its definition in the .c source file.

Remove the one in the config.c to match everybody else.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:16:30 -07:00
89e78c7cda __attribute__: trace2_region_enter_printf() is like "printf"
The last part of the parameter list the function takes is like
parameters to printf.  Mark it as such.

An existing call that formats a value of type size_t using "%d" was
found by the compiler with the help with this annotation; fix it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:16:19 -07:00
bf6a86236e worktree_git_path(): move the declaration to path.h
The definition of this function is in path.c but its declaration is
in worktree.h, which is something unexpected.  The function is
explained as "Similar to git_path()"; declaring it next to where
git_path() is declared would make more sense.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-08 11:42:37 -07:00
e83055ecb0 doc: interactive.singleKey is disabled by default
Make it clear that the interactive.singleKey configuration option is
disabled by default, using rather subtle wording that avoids an
emphasis on the actual default value.  This should eliminate any
associated doubts.

While there, touch up the remaining wording of the description a bit.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 15:27:41 -07:00
f96c385449 format-patch: assume --cover-letter for diff in multi-patch series
When we deal with a multi-patch series in git-format-patch(1), if we see
`--interdiff` or `--range-diff` but no `--cover-letter`, we return with
an error, saying:

    fatal: --range-diff requires --cover-letter or single patch

or:

    fatal: --interdiff requires --cover-letter or single patch

This makes sense because the cover-letter is where we place the diff
from the previous version.

However, considering that `format-patch` generates a multi-patch as
needed, let's adopt a similar "cover as necessary" approach when using
`--interdiff` or `--range-diff`.

Therefore, relax the requirement for an explicit `--cover-letter` in a
multi-patch series when the user says `--iterdiff` or `--range-diff`.

Still, if only to return the error, respect "format.coverLetter=no" and
`--no-cover-letter`.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 14:02:13 -07:00
bc665cdab7 t4014: cleanups in a few tests
Arrange things we are going to create to be removed at end, and then
start creating them.  That way, we will clean them up even if we fail
after creating some but before the end of the command.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 14:02:12 -07:00
1b76f06508 Merge branch 'tb/midx-write-cleanup'
Code clean-up around writing the .midx files.

* tb/midx-write-cleanup:
  pack-bitmap.c: reimplement `midx_bitmap_filename()` with helper
  midx: replace `get_midx_rev_filename()` with a generic helper
  midx-write.c: support reading an existing MIDX with `packs_to_include`
  midx-write.c: extract `fill_packs_from_midx()`
  midx-write.c: extract `should_include_pack()`
  midx-write.c: pass `start_pack` to `compute_sorted_entries()`
  midx-write.c: reduce argument count for `get_sorted_entries()`
  midx-write.c: tolerate `--preferred-pack` without bitmaps
2024-06-07 10:57:23 -07:00
e3d2364c45 imap-send: free all_msgs strbuf in "out" label
We read stdin into a strbuf, but most code paths never release it,
causing a leak (albeit a minor one, as we leak only when exiting from
the main function of the program).

Commit 56f4f4a29d (imap-send: minimum leakfix, 2024-06-04) did the
minimum to plug the one instance we see in the test suite, when we read
an empty input. But it was sufficient only because aside from this noop
invocation, we don't test imap-send at all!

The right spot to free is in the "out" label, which is hit by all code
paths before leaving the function. We couldn't do that in 56f4f4a29d
because there was no unified exit path. That came separately in
3aca5f7fb0 (imap-send: fix leaking memory in `imap_server_conf`,
2024-06-04), which cleaned up many other leaks (but not this one).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:32:53 -07:00
f5598fcb7b Merge branch 'jc/t1517-more' into jk/imap-send-plug-all-msgs-leak
* jc/t1517-more:
  imap-send: minimum leakfix
  t1517: more coverage for commands that work without repository
2024-06-07 10:32:20 -07:00
7986451963 Merge branch 'ps/no-writable-strings' into jk/imap-send-plug-all-msgs-leak
* ps/no-writable-strings: (46 commits)
  config.mak.dev: enable `-Wwrite-strings` warning
  builtin/merge: always store allocated strings in `pull_twohead`
  builtin/rebase: always store allocated string in `options.strategy`
  builtin/rebase: do not assign default backend to non-constant field
  imap-send: fix leaking memory in `imap_server_conf`
  imap-send: drop global `imap_server_conf` variable
  mailmap: always store allocated strings in mailmap blob
  revision: always store allocated strings in output encoding
  remote-curl: avoid assigning string constant to non-const variable
  send-pack: always allocate receive status
  parse-options: cast long name for OPTION_ALIAS
  http: do not assign string constant to non-const field
  compat/win32: fix const-correctness with string constants
  pretty: add casts for decoration option pointers
  object-file: make `buf` parameter of `index_mem()` a constant
  object-file: mark cached object buffers as const
  ident: add casts for fallback name and GECOS
  entry: refactor how we remove items for delayed checkouts
  line-log: always allocate the output prefix
  line-log: stop assigning string constant to file parent buffer
  ...
2024-06-07 10:32:02 -07:00
d66fe0726b config.mak.dev: enable -Wwrite-strings warning
Writing to string constants is undefined behaviour and must be avoided
in C. Even so, the compiler does not help us with this by default
because those constants are not in fact marked as `const`. This makes it
rather easy to accidentally assign a constant to a non-const variable or
field and then later on try to either free it or write to it.

Enable `-Wwrite-strings` to catch such mistakes. With this warning
enabled, the type of string constants is changed to `const char[]` and
will thus cause compiler warnings when being assigned to non-const
fields and variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:56 -07:00
71e01a0ebd builtin/merge: always store allocated strings in pull_twohead
The `pull_twohead` configuration may sometimes contain an allocated
string, and sometimes it may contain a string constant. Refactor this to
instead always store an allocated string such that we can release its
resources without risk.

While at it, manage the lifetime of other config strings, as well. Note
that we explicitly don't free `cleanup_arg` here. This is because the
variable may be assigned a string constant via command line options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:56 -07:00
fc06676766 builtin/rebase: always store allocated string in options.strategy
The `struct rebase_options::strategy` field is a `char *`, but we do end
up assigning string constants to it in two cases:

  - When being passed a `--strategy=` option via the command line.

  - When being passed a strategy option via `--strategy-option=`, but
    not a strategy.

This will cause warnings once we enable `-Wwrite-strings`.

Ideally, we'd just convert the field to be a `const char *`. But we also
assign to this field via the GIT_TEST_MERGE_ALGORITHM envvar, which we
have to strdup(3P) into it.

Instead, refactor the code to make sure that we only ever assign
allocated strings to this field.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:55 -07:00
25a47ffac0 builtin/rebase: do not assign default backend to non-constant field
The `struct rebase_options::default_backend` field is a non-constant
string, but is being assigned a constant via `REBASE_OPTIONS_INIT`.
Fix this by using `xstrdup()` to assign the variable and introduce a new
function `rebase_options_release()` that releases memory held by the
structure, including the newly-allocated variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:55 -07:00
6d1f198f34 imap-send: fix leaking memory in imap_server_conf
We never free any of the config strings that we populate into the
`struct imap_server_conf`. Fix this by creating a common exit path where
we can free resources.

While at it, drop the unused member `imap_server_conf::name`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:55 -07:00
cea1ff7f1f imap-send: drop global imap_server_conf variable
In "imap-send.c", we have a global `sturct imap_server_conf` variable
that keeps track of the configuration of the IMAP server. This variable
is being populated mostly via the Git configuration.

Refactor the code to allocate the structure on the stack instead of
having it globally. This change allows us to track its lifetime more
closely.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:54 -07:00
c77756015e mailmap: always store allocated strings in mailmap blob
Same as with the preceding commit, the `git_mailmap_blob` may sometimes
contain an allocated string and sometimes it may contain a string
constant. This is risky and can easily lead to bugs in case the variable
is getting re-assigned, where the code may then try to free the previous
value to avoid memory leaks.

Safeguard the code by always storing allocated strings in the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:54 -07:00
844d190677 revision: always store allocated strings in output encoding
The `git_log_output_encoding` variable can be set via the `--encoding=`
option. When doing so, we conditionally either assign it to the passed
value, or if the value is "none" we assign it the empty string.
Depending on which of the both code paths we pick though, the variable
may end up being assigned either an allocated string or a string
constant.

This is somewhat risky and may easily lead to bugs when a different code
path may want to reassign a new value to it, freeing the previous value.
We already to this when parsing the "i18n.logoutputencoding" config in
`git_default_i18n_config()`. But because the config is typically parsed
before we parse command line options this has been fine so far.

Regardless of that, safeguard the code such that the variable always
contains an allocated string. While at it, also free the old value in
case there was any to plug a potential memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:54 -07:00
a3da6948c3 remote-curl: avoid assigning string constant to non-const variable
When processing remote options, we split the option line into two by
searching for a space. If there is one, we replace the space with '\0',
otherwise we implicitly assume that the value is "true" and thus assign
a string constant.

As the return value of strchr(3P) weirdly enough is a `char *` even
though it gets a `const char *` as input, the assigned-to variable also
is a non-constant. This is fine though because the argument is in fact
an allocated string, and thus we are allowed to modify it. But this will
break once we enable `-Wwrite-strings`.

Refactor the code stop splitting the fields with '\0' altogether.
Instead, we can pass the length of the option name to `set_option()` and
then use strncmp(3P) instead of strcmp(3P).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:53 -07:00
5bd0851d97 send-pack: always allocate receive status
In `receive_status()`, we record the reason why ref updates have been
rejected by the remote via the `remote_status`. But while we allocate
the assigned string when a reason was given, we assign a string constant
when no reason was given.

This has been working fine so far due to two reasons:

  - We don't ever free the refs in git-send-pack(1)'

  - Remotes always give a reason, at least as implemented by Git proper.

Adapt the code to always allocate the receive status string and free the
refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:53 -07:00
e463c5e8a0 parse-options: cast long name for OPTION_ALIAS
We assign the long name for OPTION_ALIAS options to a non-constant value
field. We know that the variable will never be written to, but this will
cause warnings once we enable `-Wwrite-strings`.

Cast away the constness to be prepared for this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:53 -07:00
8d3a7ce441 http: do not assign string constant to non-const field
In `write_accept_language()`, we put all acceptable languages into an
array. While all entries in that array are allocated strings, the final
entry in that array is a string constant. This is fine because we
explicitly skip over the last entry when freeing the array, but will
cause warnings once we enable `-Wwrite-strings`.

Adapt the code to also allocate the final entry.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:53 -07:00
e7b40195ae compat/win32: fix const-correctness with string constants
Adjust various places in our Win32 compatibility layer where we are not
assigning string constants to `const char *` variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:52 -07:00
9c076c32fb pretty: add casts for decoration option pointers
The `struct decoration_options` have a prefix and suffix field which are
both non-constant, but we assign a constant pointer to them. This is
safe to do because we pass them to `format_decorations()`, which never
modifies these pointers, and then immediately discard the structure. Add
explicit casts to avoid compilation warnings with `-Wwrite-strings`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:52 -07:00
9f03e4813a object-file: make buf parameter of index_mem() a constant
The `buf` parameter of `index_mem()` is a non-constant string. This will
break once we enable `-Wwrite-strings` because we also pass constants
from at least one callsite.

Adapt the parameter to be a constant. As we cannot free the buffer
without casting now, this also requires us to move the lifetime of the
nested buffer around.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:52 -07:00
724b6d1e18 object-file: mark cached object buffers as const
The buffers of cached objects are never modified, but are still stored
as a non-constant pointer. This will cause a compiler warning once we
enable the `-Wwrite-strings` compiler warning as we assign an empty
constant string when initializing the static `empty_tree` cached object.

Convert the field to be constant. This requires us to shuffle around
the code a bit because we memcpy(3P) into the allocated buffer in
`pretend_object_file()`. This is easily fixed though by allocating the
buffer into a temporary variable first.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:51 -07:00
32f9929109 ident: add casts for fallback name and GECOS
In `xgetpwuid_self()`, we return a fallback identity when it was not
possible to look up the current identity. This fallback identity needs
to be internal and must never be written to by the calles as specified
by getpwuid(3P). As both the `pw_name` and `pw_gecos` fields are marked
as non-constant though, it will cause a warning to assign constant
strings to them once compiling with `-Wwrite-strings`.

Add explicit casts to avoid the warning.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:51 -07:00
b31607a3e0 entry: refactor how we remove items for delayed checkouts
When finalizing a delayed checkout, we sort out several strings from the
passed-in string list by first assigning the empty string to those
filters and then calling `string_list_remove_empty_items()`. Assigning
the empty string will cause compiler warnings though as the string is
a `char *` once we enable `-Wwrite-strings`.

Refactor the code to use a `NULL` pointer with `filter_string_list()`
instead to avoid this warning.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:51 -07:00
394affd46d line-log: always allocate the output prefix
The returned string by `output_prefix()` is sometimes a string constant
and sometimes an allocated string. This has been fine until now because
we always leak the allocated strings, and thus we never tried to free
the string constant.

Fix the code to always return an allocated string and free the returned
value at all callsites.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:51 -07:00
42d2ad5556 line-log: stop assigning string constant to file parent buffer
Stop assigning a string constant to the file parent buffer and instead
assign an allocated string. While the code is fine in practice, it will
break once we compile with `-Wwrite-strings`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:50 -07:00
86badd4d0a diff: cast string constant in fill_textconv()
The `fill_textconv()` function is responsible for converting an input
file with a textconv driver, which is then passed to the caller. Weirdly
though, the function also handles the case where there is no textconv
driver at all. In that case, it will return either the contents of the
populated filespec, or an empty string if the filespec is invalid.

These two cases have differing memory ownership semantics. When there is
a textconv driver, then the result is an allocated string. Otherwise,
the result is either a string constant or owned by the filespec struct.
All callers are in fact aware of this weirdness and only end up freeing
the output buffer when they had a textconv driver.

Ideally, we'd split up this interface to only perform the conversion via
the textconv driver, and BUG in case the caller didn't provide one. This
would make memory ownership semantics much more straight forward. For
now though, let's simply cast the empty string constant to `char *` to
avoid a warning with `-Wwrite-strings`. This is equivalent to the same
cast that we already have in `fill_mmfile()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:50 -07:00
81654d27bf builtin/remote: cast away constness in get_head_names()
In `get_head_names()`, we assign the "refs/heads/*" string constant to
`struct refspec_item::{src,dst}`, which are both non-constant pointers.
Ideally, we'd refactor the code such that both of these fields were
constant. But `struct refspec_item` is used for two different usecases
with conflicting requirements:

  - To query for a source or destination based on the given refspec. The
    caller either sets `src` or `dst` as the branch that we want to
    search for, and the respective other field gets populated. The
    fields should be constant when being used as a query parameter,
    which is owned by the caller, and non-constant when being used as an
    out parameter, which is owned by the refspec item. This is is
    contradictory in itself already.

  - To store refspec items with their respective source and destination
    branches, in which case both fields should be owned by the struct.

Ideally, we'd split up this interface to clearly separate between
querying and storing, which would enable us to clarify lifetimes of the
strings. This would be a much bigger undertaking though.

Instead, accept the status quo for now and cast away the constness of
the source and destination patterns. We know that those are not being
written to or freed, so while this is ugly it certainly is fine for now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:50 -07:00
235ac3f81a refspec: remove global tag refspec structure
We have a global tag refspec structure that is used by both git-clone(1)
and git-fetch(1). Initialization of the structure will break once we
enable `-Wwrite-strings`, even though the breakage is harmless. While we
could just add casts, the structure isn't really required in the first
place as we can simply initialize the structures at the respective
callsites.

Refactor the code accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:49 -07:00
66f892bb07 reftable: cast away constness when assigning constants to records
The reftable records are used in multiple ways throughout the reftable
library. In many of those cases they merely act as input to a function
without getting modified by it at all. Most importantly, this happens
when writing records and when querying for records.

We rely on this in our tests and thus assign string constants to those
fields, which is about to generate warnings as those fields are of type
`char *`. While we could go through the process and instead allocate
those strings in all of our tests, this feels quite unnecessary.

Instead, add casts to `char *` for all of those strings. As this is part
of our tests, this also nicely serves as a demonstration that nothing
writes or frees those string constants, which would otherwise lead to
segfaults.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:49 -07:00
23c32511b3 refs/reftable: stop micro-optimizing refname allocations on copy
When copying refs, we execute `write_copy_table()` to write the new
table. As the names are given to us via `arg->newname` and
`arg->oldname`, respectively, we optimize away some allocations by
assigning those fields to the reftable records we are about to write
directly, without duplicating them. This requires us to cast the input
to `char *` pointers as they are in fact constant strings. Later on, we
then unset the refname for all of the records before calling
`reftable_log_record_release()` on them.

We also do this when assigning the "HEAD" constant, but here we do not
cast because its type is `char[]` by default. It's about to be turned
into `const char *` though once we enable `-Wwrite-strings` and will
thus cause another warning.

It's quite dubious whether this micro-optimization really helps. We're
about to write to disk anyway, which is going to be way slower than a
small handful of allocations. Let's drop the optimization altogther and
instead copy arguments to simplify the code and avoid the future warning
with `-Wwrite-strings`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:48 -07:00
c113c5df79 global: convert intentionally-leaking config strings to consts
There are multiple cases where we intentionally leak config strings:

  - `struct gpg_format` is used to track programs that can be used for
    signing commits, either via gpg(1), gpgsm(1) or ssh-keygen(1). The
    user can override the commands via several config variables. As the
    array is populated once, only, and the struct memers are never
    written to or free'd.

  - `struct ll_merge_driver` is used to track merge drivers. Same as
    with the GPG format, these drivers are populated once and then
    reused. Its data is never written to or free'd, either.

  - `struct userdiff_funcname` and `struct userdiff_driver` can be
    configured via `diff.<driver>.*` to add additional drivers. Again,
    these have a global lifetime and are never written to or free'd.

All of these are intentionally kept alive and are never written to.
Furthermore, all of these are being assigned both string constants in
some places, and allocated strings in other places. This will cause
warnings once we enable `-Wwrite-strings`, so let's mark the respective
fields as `const char *` and cast away the constness when assigning
those values.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:48 -07:00
b567004b4b global: improve const correctness when assigning string constants
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:48 -07:00
7dd4051b01 update-ref: add support for 'symref-update' command
Add 'symref-update' command to the '--stdin' mode of 'git-update-ref' to
allow updates of symbolic refs. The 'symref-update' command takes in a
<new-target>, which the <ref> will be updated to. If the <ref> doesn't
exist it will be created.

It also optionally takes either an `ref <old-target>` or `oid
<old-oid>`. If the <old-target> is provided, it checks to see if the
<ref> targets the <old-target> before the update. If <old-oid> is provided
it checks <ref> to ensure that it is a regular ref and <old-oid> is the
OID before the update. This by extension also means that this when a
zero <old-oid> is provided, it ensures that the ref didn't exist before.

The divergence in syntax from the regular `update` command is because if
we don't use a `(ref | oid)` prefix for the old_value, then there is
ambiguity around if the value provided should be treated as an oid or a
reference. This is more so the reason, because we allow anything
committish to be provided as an oid. While 'symref-verify' and
'symref-delete' also take in `<old-target>` we do not have this
divergence there as those commands only work with symrefs. Whereas
'symref-update' also works with regular refs and allows users to convert
regular refs to symrefs.

The command allows users to perform symbolic ref updates within a
transaction. This provides atomicity and allows users to perform a set
of operations together.

This command supports deref mode, to ensure that we can update
dereferenced regular refs to symrefs.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:45 -07:00
f1dcdd6deb reftable: pick either 'oid' or 'target' for new updates
When creating a reference transaction update, we can provide the old/new
oid/target for the update. We have checks in place to ensure that for
each old/new, either oid or target is set and not both.

In the reftable backend, when dealing with updates without the
`REF_NO_DEREF` flag, we don't selectively propagate data as needed.
Since there are no active users of the path, this is not caught. As we
want to introduce the 'symref-update' command in the upcoming commit,
which would use this flow, correct it.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:45 -07:00
ed3272720e update-ref: add support for 'symref-create' command
Add 'symref-create' command to the '--stdin' mode 'git-update-ref' to
allow creation of symbolic refs in a transaction. The 'symref-create'
command takes in a <new-target>, which the created <ref> will point to.

Also, support the 'core.prefersymlinkrefs' config, wherein if the config
is set and the filesystem supports symlinks, we create the symbolic ref
as a symlink. We fallback to creating a regular symref if creating the
symlink is unsuccessful.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:45 -07:00
2343720967 update-ref: add support for 'symref-delete' command
Add a new command 'symref-delete' to allow deletions of symbolic refs in
a transaction via the '--stdin' mode of the 'git-update-ref' command.
The 'symref-delete' command can, when given an <old-target>, delete the
provided <ref> only when it points to <old-target>.

This command is only compatible with the 'no-deref' mode because we
optionally want to check the 'old_target' of the ref being deleted.
De-referencing a symbolic ref would provide a regular ref and we already
have the 'delete' command for regular refs.

While users can also use 'git symbolic-ref -d' to delete symbolic refs,
the 'symref-delete' command in 'git-update-ref' allows users to do so
within a transaction, which promises atomicity of the operation and can
be batched with other commands.

When no 'old_target' is provided it can also delete regular refs,
similar to how the 'delete' command can delete symrefs when no 'old_oid'
is provided.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
1451ac734f update-ref: add support for 'symref-verify' command
The 'symref-verify' command allows users to verify if a provided <ref>
contains the provided <old-target> without changing the <ref>. If
<old-target> is not provided, the command will verify that the <ref>
doesn't exist.

The command allows users to verify symbolic refs within a transaction,
and this means users can perform a set of changes in a transaction only
when the verification holds good.

Since we're checking for symbolic refs, this command will only work with
the 'no-deref' mode. This is because any dereferenced symbolic ref will
point to an object and not a ref and the regular 'verify' command can be
used in such situations.

Add required tests for symref support in 'verify'. Since we're here,
also add reflog checks for the pre-existing 'verify' tests, there is no
divergence from behavior, but we never tested to ensure that reflog
wasn't affected by the 'verify' command.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
aa6e99f122 refs: specify error for regular refs with old_target
When a reference update tries to update a symref, but the ref in
question is actually a regular ref, we raise an error. However the error
raised in this situation is:

  verifying symref target: '<ref>': reference is missing but expected <old-target>

which is very generic and doesn't indicate the mismatch of types. Let's
make this error more specific:

  cannot lock ref '<ref>': expected symref with target '<old-target>': but is a regular ref

so that users have a clearer understanding.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
aba381c090 refs: create and use ref_update_expects_existing_old_ref()
The files and reftable backend, need to check if a ref must exist, so
that the required validation can be done. A ref must exist only when the
`old_oid` value of the update has been explicitly set and it is not the
`null_oid` value.

Since we also support symrefs now, we need to ensure that even when
`old_target` is set a ref must exist. While this was missed when we
added symref support in transactions, there are no active users of this
path. As we introduce the 'symref-verify' command in the upcoming
commits, it is important to fix this.

So let's export this to a function called
`ref_update_expects_existing_old_ref()` and expose it internally via
'refs-internal.h'.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
8981dca8bc server-info.c: remove temporary info files on exit
The update_info_file() function within server-info.c is responsible for
moving the info/refs and info/packs files around when updating server
info.

These updates are staged into a temporary file and then moved into place
atomically to avoid race conditions when reading those files. However,
the temporary file used to stage these changes is managed outside of the
tempfile.h API, and thus survives process death.

Manage these files instead with the tempfile.h API so that they are
automatically cleaned up upon abnormal process death.

Unfortunately, and unlike in the previous step, there isn't a
straightforward way to inject a failure into the update-server-info step
that causes us to die() rather than take the cleanup path in label
'out', hence the lack of a test here.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 08:40:50 -07:00
c195ecda77 commit-graph.c: remove temporary graph layers on exit
Since the introduction of split commit graph layers in 92b1ea66b9
(Merge branch 'ds/commit-graph-incremental', 2019-07-19), the function
write_commit_graph_file() has done the following when writing an
incremental commit-graph layer:

  - used a lock_file to control access to the commit-graph-chain file

  - used an auxiliary file (whose descriptor was stored in 'fd') to
    write the new commit-graph layer itself

Using a lock_file to control access to the commit-graph-chain is
sensible, since only one writer may modify it at a time. Likewise, when
the commit-graph machinery is writing out a single layer, the lock_file
structure is used to modify the commit-graph itself. This is also
sensible, since the non-incremental commit-graph may also have at most
one writer.

However, using an auxiliary temporary file without using the tempfile.h
API means that writes that fail after the temporary graph layer has been
created will leave around a file in

    $GIT_DIR/objects/info/commit-graphs/tmp_graph_XXXXXX

The commit-graph-chain file and non-incremental commit-graph do not
suffer from this problem as the lockfile.h API uses the tempfile.h API
transparently, so processes that died before moving those finals into
their final location cleaned up after themselves.

Ensure that the temporary file used to write incremental commit-graphs
is also managed with the tempfile.h API, to ensure that we do not ever
leave tmp_graph_XXXXXX files laying around.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 08:40:48 -07:00
cd77e87115 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 12:49:25 -07:00
9d8e7d2ef7 Merge branch 'mt/openindiana-scalar'
Avoid removing the $(cwd) for portability.

* mt/openindiana-scalar:
  scalar: make enlistment delete to work on all POSIX platforms
2024-06-06 12:49:25 -07:00
df5c2c4962 Merge branch 'rs/difftool-env-simplify'
Code simplification.

* rs/difftool-env-simplify:
  difftool: add env vars directly in run_file_diff()
2024-06-06 12:49:24 -07:00
d11b0c75ec Merge branch 'th/quiet-lazy-fetch-from-promisor'
The promisor.quiet configuration knob can be set to true to make
lazy fetching from promisor remotes silent.

* th/quiet-lazy-fetch-from-promisor:
  promisor-remote: add promisor.quiet configuration option
2024-06-06 12:49:24 -07:00
cf792653ad Merge branch 'ps/leakfixes'
Leakfixes.

* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-06-06 12:49:23 -07:00
27db485c34 credential: clear expired c->credential, unify secret clearing
When a struct credential expires, credential_fill() clears c->password
so that clients don't try to use it later. However, a struct cred that
uses an alternate authtype won't have a password, but might have a
credential stored in c->credential.

This is a problem, for example, when an OAuth2 bearer token is used. In
the system I'm using, the OAuth2 configuration generates and caches a
bearer token that is valid for an hour. After the token expires, git
needs to call back into the credential helper to use a stored refresh
token to get a new bearer token. But if c->credential is still non-NULL,
git will instead try to use the expired token and fail with an error:

 fatal: Authentication failed for 'https://<oauth2-enabled-server>/repository'

And on the server:

 [auth_openidc:error] [client <ip>:34012] oidc_proto_validate_exp: "exp" validation failure (1717522989): JWT expired 224 seconds ago

Fix this by clearing both c->password and c->credential for an expired
struct credential. While we're at it, use credential_clear_secrets()
wherever both c->password and c->credential are being cleared.

Update comments in credential.h to mention the new struct fields.

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 11:42:40 -07:00
62c71ace44 test-terminal: drop stdin handling
Since 18d8c26930 (test_terminal: redirect child process' stdin to a pty,
2015-08-04), we set up a pty and copy stdin to the child program. But
this ends up being racy; once we send all of the bytes and close the
descriptor, the child program will no longer see a terminal! isatty()
will return 0, and trying to read may return EIO, even if we didn't yet
get all of the bytes.

This was mentioned even in the commit message of 18d8c26930, but we
hacked around it by just sending an infinite input from /dev/zero (in
the intended case, we only cared about isatty(0), not reading actual
input).

And it came up again recently in:

  https://lore.kernel.org/git/d42a55b1-1ba9-4cfb-9c3d-98ea4d86da33@gmail.com/

where we tried to actually send bytes, but they don't always all come
through. So this interface is somewhat of an accident waiting to happen;
a caller might not even care about stdin being a tty, but will get bit
by the flaky behavior.

One solution would probably be to avoid closing test_terminal's end of
the pty altogether. But then the other side would never see EOF on its
stdin.  That may be OK for some cases, but it's another gotcha that
might cause races or deadlocks, depending on what the child expects to
read.

Let's instead just drop test_terminal's stdin feature completely. Since
the previous commit dropped the two cases from t4153 for which the
feature was originally added, there are no callers left that need it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 10:07:41 -07:00
53ce2e3f0a am: add explicit "--retry" option
After a patch fails, you can ask "git am" to try applying it again with
new options by running without any of the resume options. E.g.:

  git am <patch
  # oops, it failed; let's try again
  git am --3way

But since this second command has no explicit resume option (like
"--continue"), it looks just like an invocation to read a fresh patch
from stdin. To avoid confusing the two cases, there are some heuristics,
courtesy of 8d18550318 (builtin-am: reject patches when there's a
session in progress, 2015-08-04):

	if (in_progress) {
		/*
		 * Catch user error to feed us patches when there is a session
		 * in progress:
		 *
		 * 1. mbox path(s) are provided on the command-line.
		 * 2. stdin is not a tty: the user is trying to feed us a patch
		 *    from standard input. This is somewhat unreliable -- stdin
		 *    could be /dev/null for example and the caller did not
		 *    intend to feed us a patch but wanted to continue
		 *    unattended.
		 */
		if (argc || (resume_mode == RESUME_FALSE && !isatty(0)))
			die(_("previous rebase directory %s still exists but mbox given."),
				state.dir);

		if (resume_mode == RESUME_FALSE)
			resume_mode = RESUME_APPLY;
		[...]

So if no resume command is given, then we require that stdin be a tty,
and otherwise complain about (potentially) receiving an mbox on stdin.
But of course you might not actually have a terminal available! And
sadly there is no explicit way to hit this same code path; this is the
only place that sets RESUME_APPLY. So you're stuck, and scripts like our
test suite have to bend over backwards to create a pseudo-tty.

Let's provide an explicit option to trigger this mode. The code turns
out to be quite simple; just setting "resume_mode" to RESUME_FALSE is
enough to dodge the tty check, and then our state is the same as it
would be with the heuristic case (which we'll continue to allow).

When we don't have a session in progress, there's already code to
complain when resume_mode is set (but we'll add a new test to cover
that).

To test the new option, we'll convert the existing tests that rely on
the fake stdin tty. That lets us test them on more platforms, and will
let us simplify test_terminal a bit in a future patch.

It does, however, mean we're not testing the tty heuristic at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 10:07:41 -07:00
df651330ab ci: fix check for Ubuntu 20.04
In 5ca0c455f1 (ci: fix Python dependency on Ubuntu 24.04, 2024-05-06),
we made the use of Python 2 conditional on whether or not the CI job
runs Ubuntu 20.04. There was a brown-paper-bag-style bug though, where
the condition forgot to invoke the `test` builtin. The result of it is
that the check always fails, and thus all of our jobs run with Python 3
by accident.

Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:59:27 -07:00
25a0023f28 builtin/refs: new command to migrate ref storage formats
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is due to two reasons:

  - There is no good place to put the migration logic in existing
    commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
    not the correct place to put it, either.

  - I had it in my mind to create a new low-level command for accessing
    refs for quite a while already. git-refs(1) is that command and can
    over time grow more functionality relating to refs. This should help
    discoverability by consolidating low-level access to refs into a
    single executable.

As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:34 -07:00
6d6a3a99c7 refs: implement logic to migrate between ref storage formats
With the introduction of the new "reftable" backend, users may want to
migrate repositories between the backends without having to recreate the
whole repository. Add the logic to do so.

The implementation is generic and works with arbitrary ref storage
formats so that a backend does not need to implement any migration
logic. It does have a few limitations though:

  - We do not migrate repositories with worktrees, because worktrees
    have separate ref storages. It makes the overall affair more complex
    if we have to migrate multiple storages at once.

  - We do not migrate reflogs, because we have no interfaces to write
    many reflog entries.

  - We do not lock the repository for concurrent access, and thus
    concurrent writes may end up with weird in-between states. There is
    no way to fully lock the "files" backend for writes due to its
    format, and thus we punt on this topic altogether and defer to the
    user to avoid those from happening.

In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
often have neither worktrees nor reflogs. But it will not work for many
other repositories without some preparations. These limitations are not
set into stone though, and ideally we will eventually address them over
time.

The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:33 -07:00
64a6dd8ffc refs: implement removal of ref storages
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.

Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:33 -07:00
1339cb3c47 worktree: don't store main worktree twice
In `get_worktree_ref_store()` we either return the repository's main ref
store, or we look up the ref store via the map of worktree ref stores.
Which of these worktrees gets picked depends on the `is_current` bit of
the worktree, which indicates whether the worktree is the one that
corresponds to `the_repository`.

The bit is getting set in `get_worktrees()`, but only after we have
computed the list of all worktrees. This is too late though, because at
that time we have already called `get_worktree_ref_store()` on each of
the worktrees via `add_head_info()`. The consequence is that the current
worktree will not have been marked accordingly, which means that we did
not use the main ref store, but instead created a new ref store. We thus
have two separate ref stores now that map to the same ref database.

Fix this by setting `is_current` before we call `add_head_info()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:33 -07:00
b5d7db9e83 reftable: inline merged_table_release()
The function `merged_table_release()` releases a merged table, whereas
`reftable_merged_table_free()` releases a merged table and then also
free's its pointer. But all callsites of `merged_table_release()` are in
fact followed by `reftable_merged_table_free()`, which is redundant.

Inline `merged_table_release()` into `reftable_merged_table_free()` to
get rid of this redundance.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:32 -07:00
b3e098d6e7 refs/files: fix NULL pointer deref when releasing ref store
The `free_ref_cache()` function is not `NULL` safe and will thus
segfault when being passed such a pointer. This can easily happen when
trying to release a partially initialized "files" ref store. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:32 -07:00
120b67172f refs/files: extract function to iterate through root refs
Extract a new function that can be used to iterate through all root refs
known to the "files" backend. This will be used in the next commit,
where we start to teach ref backends to remove themselves.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:32 -07:00
66275a6311 refs/files: refactor add_pseudoref_and_head_entries()
The `add_pseudoref_and_head_entries()` function accepts both the ref
store as well as a directory name as input. This is unnecessary though
as the ref store already uniquely identifies the root directory of the
ref store anyway.

Furthermore, the function is misnamed now that we have clarified the
meaning of pseudorefs as it doesn't add pseudorefs, but root refs.
Rename it accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:32 -07:00
fbd1a693c7 refs: allow to skip creation of reflog entries
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.

Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:31 -07:00
6e1683ace9 refs: pass storage format to ref_store_init() explicitly
We're about to introduce logic to migrate refs from one storage format
to another one. This will require us to initialize a ref store with a
different format than the one used by the passed-in repository.

Prepare for this by accepting the desired ref storage format as
parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:31 -07:00
318efb966b refs: convert ref storage format to an enum
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.

Convert the ref storage format to instead be an enum.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:31 -07:00
a83f7f51e1 setup: unset ref storage when reinitializing repository version
When reinitializing a repository's version we may end up unsetting the
hash algorithm when it matches the default hash algorithm. If we didn't
do that then the previously configured value might remain intact.

While the same issue exists for the ref storage extension, we don't do
this here. This has been fine for most of the part because it is not
supported to re-initialize a repository with a different ref storage
format anyway. We're about to introduce a new command to migrate ref
storages though, so this is about to become an issue there.

Prepare for this and unset the ref storage format when reinitializing a
repository with the "files" format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:31 -07:00
f60fec6a16 ci/test-documentation: work around SyntaxWarning in Python 3.12
In Python 3.6, unrecognized escape sequences in regular expressions
started to produce a DeprecationWarning [1]. In Python 3.12, this was
upgraded to a SyntaxWarning and will eventually be raised even further
to a SyntaxError. We indirectly hit such unrecognized escape sequences
via Asciidoc, which results in a bunch of warnings:

    $ asciidoc -o /dev/null git-cat-file.txt
    <unknown>:1: SyntaxWarning: invalid escape sequence '\S'
    <unknown>:1: SyntaxWarning: invalid escape sequence '\S'

This in turn causes our "ci/test-documentation.sh" script to fail, as it
checks that stderr of `make doc` is empty.

These escape sequences seem to be part of Asciidoc itself. In the long
term, we should probably consider dropping support for Asciidoc in favor
of Asciidoctor. Upstream also considers itself to be legacy software and
recommends to move away from it [2]:

    It is suggested that unless you specifically require the AsciiDoc.py
    toolchain, you should find a processor that handles the modern
    AsciiDoc syntax.

For now though, let's expand its lifetime a little bit more by filtering
out these new warnings. We should probably reconsider once the warnings
are upgraded to errors by Python.

[1]: https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
[2]: 6d9f76cff0/README.md (asciidocpy)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 08:20:51 -07:00
401151de9e gitlab-ci: add job to run make check-docs
Add another job to execute `make check-docs`, which lints our
documentation and makes sure that expected manpages exist. This job
mirrors the same job that we already have for GitHub Actions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 08:20:51 -07:00
6423920974 Documentation/lint-manpages: bubble up errors
The "lint-manpages.sh" script does not return an error in case any of
its checks fail. While this is faithful to the implementation that we
had as part of the "check-docs" target before the preceding commit, it
makes it hard to spot any violations of the rules via the corresponding
CI job, which will of course exit successfully, too.

Adapt the script to bubble up errors.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 08:20:51 -07:00
2dd100c513 Makefile: extract script to lint missing/extraneous manpages
The "check-docs" target of our top-level Makefile fulfills two different
roles. For one it runs the "lint-docs" target of the "Documentation/"
Makefile. And second it performs some checks of whether there are any
manpages that are missing or extraneous via some inline scripts.

The second set of checks feels quite misplaced in the top-level Makefile
as it would fit in much better with our "lint-docs" target. Back when
the checks were introduced in 8c989ec528 (Makefile: $(MAKE) check-docs,
2006-04-13), that target did not yet exist though.

Furthermore, the script makes use of several Makefile variables which
are defined in the top-level Makefile, which makes it hard to access
their contents from elsewhere. There is a trick though that we already
use in "check-builtins.sh" to gain access: we can create an ad-hoc
Makefile that has an extra target to print those variables.

Pull out the script into a separate "lint-manpages.sh" script by using
that trick. Wire up that script via the "lint-docs" target. For one,
normal shell scripts are way easier to reason about than those which are
embedded in a Makefile. Second, it allows one to easily execute the
script standalone without any of the other checks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 08:20:50 -07:00
a74c0686fa add-i: finally retire add.interactive.useBuiltin
The configuration variable stopped doing anything (other than
announcing itself as a variable that does not do anything useful,
when it is used) in Git 2.40.

At this point, it is not even worth giving the warning, which was
meant to be a way to help users notice they are carrying unused
cruft in their configuration files and give them a chance to
clean-up.

Let's remove the warning and documentation for it, and truly stop
paying attention to it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
               ---
 Documentation/config/add.txt |  6 ------
 builtin/add.c                |  6 +-----
 t/t3701-add-interactive.sh   | 15 ---------------
 3 files changed, 1 insertion(+), 26 deletions(-)
2024-06-05 14:53:26 -07:00
5c71d6b63a attr.tree: HEAD:.gitattributes is no longer the default in a bare repo
51441e64 (stop using HEAD for attributes in bare repository by
default, 2024-05-03) has addressed a recent performance regression
by partially reverting a topic that was merged at 26dd307c (Merge
branch 'jc/attr-tree-config', 2023-10-30).  But it forgot to update
the documentation to remove the mention of a special case in bare
repositories.

Let's update the document before the update hits the next release.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 14:52:53 -07:00
6d107751b2 sparse-checkout: free duplicate hashmap entries
In insert_recursive_pattern(), we create a new pattern_entry to insert
into the parent_hashmap. If we find that the same entry already exists
in the hashmap, we skip adding the new one. But we forget to free the new
one, creating a leak.

We can fix it by cleaning up the discarded entry. It would probably be
possible to avoid creating it in the first place, but it's non-trivial.
We'd have to define a "keydata" struct that lets us compare the existing
entries to the broken-out fields. It's probably not worth the
complexity, so we'll punt on that for now.

There is one subtlety here: our insertion is happening in a loop, with
each iteration looking at the pattern we just inserted (hence the
"recursive" in the name). So if we skip insertion, what do we look at?

The obvious answer is that we should remember the existing duplicate we
found and use that. But I _think_ in that case, we probably already have
all of the recursive bits already (from when the original entry was
added). And so just breaking out of the loop would be correct. But I'm
not 100% sure on that; after all, the original leaky code could have
done the same break, but it didn't.

So I went with the "obvious answer" above, which has no chance of
changing the behavior aside from fixing the leak.

With this patch, t1091 can now be marked leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
a544b7da2c sparse-checkout: free string list after displaying
In sparse_checkout_list(), we put the hashmap entries into a string_list
so we can sort them. But after printing, we forget to free the list.

This patch drops 5 leaks from t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
521e04e6e8 sparse-checkout: free pattern list in sparse_checkout_list()
In sparse_checkout_list(), we create a pattern_list that needs to
eventually be cleared. We remember to do so in the regular code path,
but the cone-mode path does an early return, and forgets to clean up.

We could fix the leak by adding a new call to clear_pattern_list(). But
we can simplify even further by just skipping the early return, pushing
the other code path (which consists now of only one line!) into an else
block. That also matches the same cone/non-cone if/else used in some
other functions.

This fixes 15 leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
008f59d2d6 sparse-checkout: free sparse_filename after use
We allocate a heap buffer via get_sparse_checkout_filename(). Most calls
remember to free it, but sparse_checkout_init() forgets to, causing a
leak. Ironically, it remembers to do so in the error return paths, but
not in the path that makes it all the way to the function end!

Fixing this clears up 6 leaks from t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
a14d49ca84 sparse-checkout: refactor temporary sparse_checkout_patterns
In update_working_directory(), we take in a pattern_list, attach it to
the repository index by assigning it to index->sparse_checkout_patterns,
and then call unpack_trees. Afterwards, we remove it by setting
index->sparse_checkout_patterns back to NULL.

But there are two possible leaks here:

  1. If the index already had a populated sparse_checkout_patterns,
     we've obliterated it. We can fix this by saving and restoring it,
     rather than always setting it back to NULL.

  2. We may call the function with a NULL pattern_list, expecting it to
     use the on-disk sparse file. In that case, the index routines will
     lazy-load the sparse patterns automatically. But now at the end of
     the function when we restore the patterns, we'll leak those
     lazy-loaded ones!

     We can fix this by freeing the pattern list before overwriting its
     pointer whenever it does not match what was passed in (in practice
     this should only happen when the passed-in list is NULL, but this
     is erring on the defensive side).

Together these remove 48 indirect leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
d765fa0331 sparse-checkout: always free "line" strbuf after reading input
In add_patterns_from_input(), we may read lines from a file with a loop
like this:

  while (!strbuf_getline(&line, file)) {
	...
	strbuf_to_cone_pattern(&line, pl);
  }
  /* we don't strbuf_release(&line) here! */

This generally is OK because strbuf_to_cone_pattern() consumes the
buffer via strbuf_detach(). But we can leak in a few cases:

  1. We don't always consume the buffer! If the line ends up empty after
     trimming, we leave strbuf_to_cone_pattern() without detaching. In
     most cases this is OK, because a subsequent getline() call will use
     the same buffer. But if you had an empty line at the end of file,
     for example, it would leak.

  2. Even if strbuf_to_cone_pattern() always consumed the buffer,
     there's a subtle issue with strbuf_getline(). As we saw in
     94e2aa555e (strbuf: fix leak when `appendwholeline()` fails with
     EOF, 2024-05-27), it's possible for it to return EOF with an
     allocated buffer (e.g., if the underlying getdelim() call saw an
     error). So we should always strbuf_release() after finishing a read
     loop like this.

Note that even the code to read patterns from argv has the same problem.
Because that also uses strbuf_to_cone_pattern(), we stuff each argv
entry into a strbuf. It uses the same "line" strbuf as the getline code,
but we should position the strbuf_release() to cover both code paths.

This fixes at least 9 leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
c3324649ed sparse-checkout: reuse --stdin buffer when reading patterns
When we read patterns from --stdin, we loop on strbuf_getline(), and
detach each line we read to pass into add_pattern(). This used to be
necessary because add_pattern() required that the pattern strings remain
valid while the pattern_list was in use. But it also created a leak,
since we didn't record the detached buffers anywhere else.

Now that add_pattern() has been modified to make its own copy of the
strings, we can stop detaching and fix the leak. This fixes 4 leaks
detected in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:42 -07:00
eed1fbe73b dir.c: always copy input to add_pattern()
The add_pattern() function has a subtle and undocumented gotcha: the
pattern string you pass in must remain valid as long as the pattern_list
is in use (and nor do we take ownership of it). This is easy to get
wrong, causing either subtle bugs (because you free or reuse the string
buffer) or leaks (because you copy the string, but don't track ownership
separately).

All of this "pattern" code was originally the "exclude" mechanism. So
this _usually_ works OK because you add entries in one of two ways:

  1. From the command-line (e.g., "--exclude"), in which case we're
     pointing to an argv entry which remains valid for the lifetime of
     the program.

  2. From a file (e.g., ".gitignore"), in which case we read the whole
     file into a buffer, attach it to the pattern_list's "filebuf"
     entry, then parse the buffer in-place (adding NULs). The strings
     point into the filebuf, which is cleaned up when the whole
     pattern_list goes away.

But other code, like sparse-checkout, reads individual lines from stdin
and passes them one by one to add_pattern(), leaking each. We could fix
this by refactoring it to take in the whole buffer at once, like (2)
above, and stuff it in "filebuf". But given how subtle the interface is,
let's just fix it to always copy the string.

That seems at first like we'd be wasting extra memory, but we can
mitigate that:

  a. The path_pattern struct already uses a FLEXPTR, since we sometimes
     make a copy (when we see "foo/", we strip off the trailing slash,
     requiring a modifiable copy of the string).

     Since we'll now always embed the string inside the struct, we can
     switch to the regular FLEX_ARRAY pattern, saving us 8 bytes of
     pointer. So patterns with a trailing slash and ones under 8 bytes
     actually get smaller.

  b. Now that we don't need the original string to hang around, we can
     get rid of the "filebuf" mechanism entirely, and just free the file
     contents after parsing. Since files are the sources we'd expect to
     have the largest pattern sets, we should mostly break even on
     stuffing the same data into the individual structs.

This patch just adjusts the add_pattern() interface; it doesn't fix any
leaky callers yet.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:42 -07:00
e7c3d1ddba dir.c: reduce max pattern file size to 100MB
In a2bc523e1e (dir.c: skip .gitignore, etc larger than INT_MAX,
2024-05-31) we put capped the size of some files whose parsing code and
data structures used ints. Setting the limit to INT_MAX was a natural
spot, since we know the parsing code would misbehave above that.

But it also leaves the possibility of overflow errors when we multiply
that limit to allocate memory. For instance, a file consisting only of
"a\na\n..." could have INT_MAX/2 entries. Allocating an array of
pointers for each would need INT_MAX*4 bytes on a 64-bit system, enough
to overflow a 32-bit int.

So let's give ourselves a bit more safety margin by giving a much
smaller limit. The size 100MB is somewhat arbitrary, but is based on the
similar value for attribute files added by 3c50032ff5 (attr: ignore
overly large gitattributes files, 2022-12-01).

There's no particular reason these have to be the same, but the idea is
that they are in the ballpark of "so huge that nobody would care, but
small enough to avoid malicious overflow". So lacking a better guess, it
makes sense to use the same value. The implementation here doesn't share
the same constant, but we could change that later (or even give it a
runtime config knob, though nobody has complained yet about the
attribute limit).

And likewise, let's add a few tests that exercise the limits, based on
the attr ones. In this case, though, we never read .gitignore from the
index; the blob code is exercised only for sparse filters. So we'll
trigger it that way.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:23:42 -07:00
607c3d372e show-ref: introduce --branches and deprecate --heads
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.

Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.

We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
b773fb8822 ls-remote: introduce --branches and deprecate --heads
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.

Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.

We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
a096e70c78 refs: call branches branches
These things in refs/heads/ hierarchy are called "branches" in human
parlance.  Replace REF_HEADS with REF_BRANCHES to make it clearer.

No end-user visible change intended at this step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
56f4f4a29d imap-send: minimum leakfix
EVen with the minimum "no-op" invocation t1517 makes, "git imap-send"
leaks an empty strbuf it used to read a 0-byte string into.

There are a few other topics cooking in 'next' that plugs many
other leaks in this program, so let's minimally fix this one, barely
enough to make CI pass, leaving the rest for the other topic.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 11:48:20 -07:00
4c844c2f49 dir.c: free removed sparse-pattern hashmap entries
In add_pattern_to_hashsets(), we remove entries from the
recursive_hashmap when adding similar ones to the parent_hashmap. I
won't pretend to understand all of what's going on here, but there's an
obvious leak: whatever we removed from recursive_hashmap is not
referenced anywhere else, and is never free()d.

We can easily fix this by asking the hashmap to return a pointer to the
old entry. This makes t7002 now completely leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
db83b64cda sparse-checkout: clear patterns when init() sees existing sparse file
In sparse_checkout_init(), we first try to load patterns from an
existing file. If we found any, we return immediately, but end up
leaking the patterns we parsed. Fixing this reduces the number of leaks
in t7002 from 9 down to 5.

Note that there are two other exits from the function, but they don't
need the same treatment:

  - if we can't resolve HEAD, we write out a hard-coded sparse file and
    return. But we know the pattern list is empty there, since we didn't
    find any in the on-disk file and we haven't yet added any of our
    own.

  - otherwise, we do populate the list and then tail-call into
    write_patterns_and_update(). But that function frees the
    pattern_list itself, so we don't need to.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
4318d3ab65 dir.c: free strings in sparse cone pattern hashmaps
The pattern_list structs used for cone-mode sparse lookups use a few
extra hashmaps. These store pattern_entry structs, each of which has its
own heap-allocated pattern string. When we clean up the hashmaps, we
free the individual pattern_entry structs, but forget to clean up the
embedded strings, causing memory leaks.

We can fix this by iterating over the hashmaps to free the extra
strings. This reduces the numbers of leaks in t7002 from 22 to 9.

One alternative here would be to make the string a FLEX_ARRAY member of
the pattern_entry. Then there's no extra free() required, and as a bonus
it would be a little more efficient. However, some of the refactoring
gets awkward, as we are often assigning strings allocated by helper
functions. So let's just fix the leak for now, and we can explore bigger
refactoring separately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
4d7f95ed1f sparse-checkout: pass string literals directly to add_pattern()
The add_pattern() function takes a pattern string, but neither makes a
copy of it nor takes ownership of the memory. So it is the caller's
responsibility to make sure the string hangs around as long as the
pattern_list which references it.

There are a few cases in sparse-checkout where we use string literal
patterns by stuffing them into a strbuf, detaching the buffer, and then
passing the result into add_pattern(). This creates a leak when the
pattern_list is eventually cleared, since we don't retain a copy of the
detached buffer to free.

But we can observe that the whole strbuf dance is unnecessary. The point
was presumably[1] to satisfy the lifetime requirement of the string. But
string literals have static duration; we can count on them lasting for
the whole program.

So we can fix the leak by just passing them directly. And as a bonus,
that simplifies the code. The leaks can be seen in t7002, which drops
from 25 leaks to 22 with this patch. It also makes t3602 and t1090
leak-free.

In the long run, we will also want to clean up this (undocumented!)
memory lifetime requirement of add_pattern(). But that can come in a
later patch; passing the string literals directly will be the right
thing either way.

[1] The code in question comes from 416adc8711 (sparse-checkout: update
    working directory in-process for 'init', 2019-11-21) and 99dfa6f970
    (sparse-checkout: use in-process update for disable subcommand,
    2019-11-21), but I didn't see anything in their commit messages or
    on the list explaining the strbufs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
2181fe6e46 sparse-checkout: free string list in write_cone_to_file()
We use a string list to hold sorted and de-duped patterns, but don't
free it before leaving the function, causing a leak.

This drops the number of leaks found in t7002 from 27 to 25.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:22 -07:00
7b0defb391 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-03 13:14:52 -07:00
eb6392fb4f Merge branch 'th/push-local-ff-check-without-lazy-fetch'
When "git push" notices that the commit at the tip of the ref on
the other side it is about to overwrite does not exist locally, it
used to first try fetching it if the local repository is a partial
clone. The command has been taught not to do so and immediately
fail instead.

* th/push-local-ff-check-without-lazy-fetch:
  push: don't fetch commit object when checking existence
2024-06-03 13:11:12 -07:00
5c7c063c1f Merge branch 'ps/fix-reinit-includeif-onbranch'
"git init" in an already created directory, when the user
configuration has includeif.onbranch, started to fail recently,
which has been corrected.

* ps/fix-reinit-includeif-onbranch:
  setup: fix bug with "includeIf.onbranch" when initializing dir
2024-06-03 13:11:11 -07:00
03b0e7d3a7 Merge branch 'ps/leakfixes' into ps/leakfixes-more
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-06-03 13:08:33 -07:00
9eaef5822c Sync with 'maint' 2024-05-31 15:50:54 -07:00
291ef5b61c run-command: show prepared command
This adds a trace point in start_command so we can see the full
command invocation without having to resort to strace/code inspection.
For example:

 $ GIT_TRACE=1 git test foo
 git.c:755               trace: exec: git-test foo
 run-command.c:657       trace: run_command: git-test foo
 run-command.c:657       trace: run_command: 'echo $*' foo
 run-command.c:749       trace: start_command: /bin/sh -c 'echo $* "$@"' 'echo $*' foo

Prior changes have made the documentation around the internals of the
alias command execution clearer, but I have still found this detailed
view of the aliased command being run helpful for debugging purposes.

A test case is added to ensure the full command output is present in
the execution flow.

Signed-off-by: Ian Wienand <iwienand@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-31 15:47:55 -07:00
d35a743659 Documentation: alias: add notes on shell expansion
When writing inline shell for shell-expansion aliases (i.e. prefixed
with "!"), there are some caveats around argument parsing to be aware
of.  This series of notes attempts to explain what is happening more
clearly.

Signed-off-by: Ian Wienand <iwienand@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-31 15:47:55 -07:00
a2bc523e1e dir.c: skip .gitignore, etc larger than INT_MAX
We use add_patterns() to read .gitignore, .git/info/exclude, etc, as
well as other pattern-like files like sparse-checkout. The parser for
these uses an "int" as an index, meaning that files over 2GB will
generally cause signed integer overflow and out-of-bounds access.

This is unlikely to happen in any real files, but we do read .gitignore
files from the tree. A malicious tree could cause an out-of-bounds read
and segfault (we also write NULs over newlines, so in theory it could be
an out-of-bounds write, too, but as we go char-by-char, the first thing
that happens is trying to read a negative 2GB offset).

We could fix the most obvious issue by replacing one "int" with a
"size_t". But there are tons of "int" sprinkled throughout this code for
things like pattern lengths, number of patterns, and so on. Since nobody
would actually want a 2GB .gitignore file, an easy defensive measure is
to just refuse to parse them.

The "int" in question is in add_patterns_from_buffer(), so we could
catch it there. But by putting the checks in its two callers, we can
produce more useful error messages.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-31 15:30:32 -07:00
715ae27382 Post 2.45.2 updates
Merge down a handful of topics to adjust tests and CI to make them
work better, without changing Git itself, and a bit of developer
docs update:

 * Tests that try to corrupt in-repository files in chunked format did
   not work well on macOS due to its broken "mv", which has been
   worked around.

 * Unbreak CI jobs so that we do not attempt to use Python 2 that has
   been removed from the platform.

 * Git 2.43 started using the tree of HEAD as the source of attributes
   in a bare repository, which has severe performance implications.
   For now, revert the change, without ripping out a more explicit
   support for the attr.tree configuration variable.

 * Windows CI running in GitHub Actions started complaining about the
   order of arguments given to calloc(); the imported regex code uses
   the wrong order almost consistently, which has been corrected.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-31 15:28:22 -07:00
8211adfaba Merge branch 'jk/ci-macos-gcc13-fix' into maint-2.45
CI fix.

* jk/ci-macos-gcc13-fix:
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
2024-05-31 15:28:22 -07:00
6e90e88de9 Merge branch 'ma/win32-unix-domain-socket' into maint-2.45
Build fix.

* ma/win32-unix-domain-socket:
  win32: fix building with NO_UNIX_SOCKETS
2024-05-31 15:28:21 -07:00
104cf1422c Merge branch 'jt/doc-submitting-rerolled-series' into maint-2.45
Developer doc update.

* jt/doc-submitting-rerolled-series:
  doc: clarify practices for submitting updated patch versions
2024-05-31 15:28:21 -07:00
2e416ef066 Merge branch 'jc/doc-manpages-l10n' into maint-2.45
The SubmittingPatches document now refers folks to manpages
translation project.

* jc/doc-manpages-l10n:
  SubmittingPatches: advertise git-manpages-l10n project a bit
2024-05-31 15:28:20 -07:00
73049492d5 Merge branch 'jc/compat-regex-calloc-fix' into maint-2.45
Windows CI running in GitHub Actions started complaining about the
order of arguments given to calloc(); the imported regex code uses
the wrong order almost consistently, which has been corrected.

* jc/compat-regex-calloc-fix:
  compat/regex: fix argument order to calloc(3)
2024-05-31 15:28:20 -07:00
1258fc2b08 Merge branch 'jc/no-default-attr-tree-in-bare' into maint-2.45
Git 2.43 started using the tree of HEAD as the source of attributes
in a bare repository, which has severe performance implications.
For now, revert the change, without ripping out a more explicit
support for the attr.tree configuration variable.

* jc/no-default-attr-tree-in-bare:
  stop using HEAD for attributes in bare repository by default
2024-05-31 15:28:19 -07:00
1b2e9068f8 Merge branch 'ps/ci-python-2-deprecation' into maint-2.45
Unbreak CI jobs so that we do not attempt to use Python 2 that has
been removed from the platform.

* ps/ci-python-2-deprecation:
  ci: fix Python dependency on Ubuntu 24.04
2024-05-31 15:28:19 -07:00
0d7b7484c9 Merge branch 'jc/test-workaround-broken-mv' into maint-2.45
Tests that try to corrupt in-repository files in chunked format did
not work well on macOS due to its broken "mv", which has been
worked around.

* jc/test-workaround-broken-mv:
  t/lib-chunk: work around broken "mv" on some vintage of macOS
2024-05-31 15:28:18 -07:00
7482bc956c Merge branch 'jc/git-gui-maintainer-update' into maint-2.45
* jc/git-gui-maintainer-update:
  SubmittingPatches: welcome the new maintainer of git-gui part
2024-05-31 15:28:18 -07:00
71fa8d2212 macOS: ls-files path fails if path of workdir is NFD
Under macOS, `git ls-files path` does not work (gives an error)
if the absolute 'path' contains characters in NFD (decomposed).
This happens when core.precomposeunicode is true, which is the
most common case. The bug report says:

$ cd somewhere          # some safe place, /tmp or ~/tmp etc.
$ mkdir $'u\xcc\x88'    # ü in NFD
$ cd ü                  # or cd $'u\xcc\x88' or cd $'\xc3\xbc'
$ git init
$ git ls-files $'/somewhere/u\xcc\x88'   # NFD
  fatal: /somewhere/ü: '/somewhere/ü' is outside repository at '/somewhere/ü'
$ git ls-files $'/somewhere/\xc3\xbc'    # NFC
(the same error as above)

In the 'fatal:' error message, there are three ü;
the 1st and 2nd are in NFC, the 3rd is in NFD.

Add test cases that follows the bug report, with the simplification
that the 'ü' is replaced by an 'ä', which is already used as NFD and
NFC in t3910.

The solution is to add a call to precompose_string_if_needed()
to this code in setup.c :
`work_tree = precompose_string_if_needed(get_git_work_tree());`

There is, however, a limitation with this very usage of Git:
The (repo) local .gitconfig file is not used, only the global
"core.precomposeunicode" is taken into account, if it is set (or not).
To set it to true is a good recommendation anyway, and here is the
analyzes from Jun T :

The problem is the_repository->config->hash_initialized
is set to 1 before the_repository->commondir is set to ".git".
Due to this, .git/config is never read, and precomposed_unicode
is never set to 1 (remains -1).

run_builtin() {
    setup_git_directory() {
        strbuf_getcwd() {   # setup.c:1542
            precompose_{strbuf,string}_if_needed() {
                # precomposed_unicode is still -1
                git_congig_get_bool("core.precomposeunicode") {
                    git_config_check_init() {
                        repo_read_config() {
                            git_config_init() {
                                # !!!
                                the_repository->config->hash_initialized=1
                                # !!!
                            }
                            # does not read .git/config since
                            # the_repository->commondir is still NULL
                        }
                    }
                }
                returns without converting to NFC
            }
            returns cwd in NFD
        }

        setup_discovered_git_dir() {
            set_git_work_tree(".") {
                repo_set_worktree() {
                    # this function indirectly calls strbuf_getcwd()
                    # --> precompose_{strbuf,string}_if_needed() -->
                    # {git,repo}_config_get_bool("core.precomposeunicode"),
                    # but does not try to read .git/config since
                    # the_repository->config->hash_initialized
                    # is already set to 1 above. And it will not read
                    # .git/config even if hash_initialized is 0
                    # since the_repository->commondir is still NULL.

                    the_repository->worktree = NFD
                }
            }
        }

        setup_git_env() {
            repo_setup_gitdir() {
                repo_set_commondir() {
                    # finally commondir is set here
                    the_repository->commondir = ".git"
                }
            }
        }

    } // END setup_git_directory

Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-31 13:13:40 -07:00
94d25d3254 Merge branch 'jk/leakfixes' into jk/sparse-leakfix
* jk/leakfixes:
  mv: replace src_dir with a strvec
  mv: factor out empty src_dir removal
  mv: move src_dir cleanup to end of cmd_mv()
  t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
  t-strvec: use va_end() to match va_start()
2024-05-31 08:55:34 -07:00
b25ec8b8d5 t1517: more coverage for commands that work without repository
While most of the commands in Git suite are designed to do useful
things in Git repositories, some commands are also usable outside
any repository.  Building on top of an earlier work abece6e9 (t1517:
test commands that are designed to be run outside repository,
2024-05-20) that adds tests for such commands, let's give coverage
to some more commands.

This patch covers commands whose code has hits for

    $ git grep setup_git_directory_gently

and passes a pointer to nongit_ok variable it uses to allow it to
run outside a Git repository, but mostly they are tested only to see
that they start up (as opposed to dying with "not in a git
repository" complaint).  We may want to update them to actually do
something useful later, but this would at least help us catch
regressions by mistake.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-31 07:51:01 -07:00
c3ebe91b40 Sync with Git 2.45.2 2024-05-30 17:25:37 -07:00
bea9ecd24b Git 2.45.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 17:18:43 -07:00
f8c58f24cc Merge branch 'jc/fix-2.45.1-and-friends-for-maint' into maint-2.45
* jc/fix-2.45.1-and-friends-for-maint:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 17:17:21 -07:00
46698a8ea1 Git 2.44.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 17:16:34 -07:00
d103d3d282 Merge branch 'fixes/2.45.1/2.44' into maint-2.44
* fixes/2.45.1/2.44:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 17:11:02 -07:00
337b4d4000 Git 2.43.5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 17:06:24 -07:00
5eebceaafa Merge branch 'fixes/2.45.1/2.43' into maint-2.43
* fixes/2.45.1/2.43:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 17:04:37 -07:00
239bd35bd2 Git 2.42.3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 17:03:31 -07:00
18df122d3d Merge branch 'fixes/2.45.1/2.42' into maint-2.42
* fixes/2.45.1/2.42:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 17:00:57 -07:00
0dc9cad22d Git 2.41.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 17:00:29 -07:00
f20b96a798 Merge branch 'fixes/2.45.1/2.41' into maint-2.41
* fixes/2.45.1/2.41:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 16:58:12 -07:00
dbecc617f7 Git 2.40.3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 16:57:31 -07:00
75e7cd2bd0 Merge branch 'fixes/2.45.1/2.40' into maint-2.40
* fixes/2.45.1/2.40:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 16:54:42 -07:00
cc7d11c167 Git 2.39.5 2024-05-30 16:52:52 -07:00
7eb91521fd Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into maint-2.39
* jc/fix-2.45.1-and-friends-for-2.39:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 16:38:58 -07:00
58bac47f8e The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 14:15:17 -07:00
f8da12adcf Merge branch 'jc/fix-2.45.1-and-friends-for-maint'
Adjust jc/fix-2.45.1-and-friends-for-2.39 for more recent
maintenance track.

* jc/fix-2.45.1-and-friends-for-maint:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 14:15:17 -07:00
d019b80d4f Merge branch 'jc/add-patch-enforce-single-letter-input'
"git add -p" learned to complain when an answer with more than one
letter is given to a prompt that expects a single letter answer.

* jc/add-patch-enforce-single-letter-input:
  add-patch: enforce only one-letter response to prompts
2024-05-30 14:15:16 -07:00
99d3cbe21b Merge branch 'gt/unit-test-strcmp-offset'
The strcmp-offset tests have been rewritten using the unit test
framework.

* gt/unit-test-strcmp-offset:
  t/: port helper/test-strcmp-offset.c to unit-tests/t-strcmp-offset.c
2024-05-30 14:15:15 -07:00
b3ba0f2133 Merge branch 'es/chainlint-ncores-fix'
The chainlint script (invoked during "make test") did nothing when
it failed to detect the number of available CPUs.  It now falls
back to 1 CPU to avoid the problem.

* es/chainlint-ncores-fix:
  chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
  chainlint.pl: fix incorrect CPU count on Linux SPARC
  chainlint.pl: make CPU count computation more robust
2024-05-30 14:15:15 -07:00
6c5be97e4e Merge branch 'jc/undecided-is-not-necessarily-sha1-fix'
The base topic started to make it an error for a command to leave
the hash algorithm unspecified, which revealed a few commands that
were not ready for the change.  Give users a knob to revert back to
the "default is sha-1" behaviour as an escape hatch, and start
fixing these breakages.

* jc/undecided-is-not-necessarily-sha1-fix:
  apply: fix uninitialized hash function
  builtin/hash-object: fix uninitialized hash function
  builtin/patch-id: fix uninitialized hash function
  t1517: test commands that are designed to be run outside repository
  setup: add an escape hatch for "no more default hash algorithm" change
2024-05-30 14:15:14 -07:00
b7544a1d50 Merge branch 'js/doc-decisions'
The project decision making policy has been documented.

* js/doc-decisions:
  doc: describe the project's decision-making process
2024-05-30 14:15:14 -07:00
988499e295 Merge branch 'ps/refs-without-the-repository-updates'
Further clean-up the refs subsystem to stop relying on
the_repository, and instead use the repository associated to the
ref_store object.

* ps/refs-without-the-repository-updates:
  refs/packed: remove references to `the_hash_algo`
  refs/files: remove references to `the_hash_algo`
  refs/files: use correct repository
  refs: remove `dwim_log()`
  refs: drop `git_default_branch_name()`
  refs: pass repo when peeling objects
  refs: move object peeling into "object.c"
  refs: pass ref store when detecting dangling symrefs
  refs: convert iteration over replace refs to accept ref store
  refs: retrieve worktree ref stores via associated repository
  refs: refactor `resolve_gitlink_ref()` to accept a repository
  refs: pass repo when retrieving submodule ref store
  refs: track ref stores via strmap
  refs: implement releasing ref storages
  refs: rename `init_db` callback to avoid confusion
  refs: adjust names for `init` and `init_db` callbacks
2024-05-30 14:15:13 -07:00
67ce50ba26 Merge branch 'ps/reftable-reusable-iterator'
Code clean-up to make the reftable iterator closer to be reusable.

* ps/reftable-reusable-iterator:
  reftable/merged: adapt interface to allow reuse of iterators
  reftable/stack: provide convenience functions to create iterators
  reftable/reader: adapt interface to allow reuse of iterators
  reftable/generic: adapt interface to allow reuse of iterators
  reftable/generic: move seeking of records into the iterator
  reftable/merged: simplify indices for subiterators
  reftable/merged: split up initialization and seeking of records
  reftable/reader: set up the reader when initializing table iterator
  reftable/reader: inline `reader_seek_internal()`
  reftable/reader: separate concerns of table iter and reftable reader
  reftable/reader: unify indexed and linear seeking
  reftable/reader: avoid copying index iterator
  reftable/block: use `size_t` to track restart point index
2024-05-30 14:15:12 -07:00
23528d352a Merge branch 'ps/reftable-write-options'
The knobs to tweak how reftable files are written have been made
available as configuration variables.

* ps/reftable-write-options:
  refs/reftable: allow configuring geometric factor
  reftable: make the compaction factor configurable
  refs/reftable: allow disabling writing the object index
  refs/reftable: allow configuring restart interval
  reftable: use `uint16_t` to track restart interval
  refs/reftable: allow configuring block size
  reftable/dump: support dumping a table's block structure
  reftable/writer: improve error when passed an invalid block size
  reftable/writer: drop static variable used to initialize strbuf
  reftable: pass opts as constant pointer
  reftable: consistently refer to `reftable_write_options` as `opts`
2024-05-30 14:15:11 -07:00
a60c21b720 Merge branch 'ps/undecided-is-not-necessarily-sha1'
Before discovering the repository details, We used to assume SHA-1
as the "default" hash function, which has been corrected. Hopefully
this will smoke out codepaths that rely on such an unwarranted
assumptions.

* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes
2024-05-30 14:15:11 -07:00
4cac79a50e pack-bitmap.c: reimplement midx_bitmap_filename() with helper
Now that we have the `get_midx_filename_ext()` helper, we can
reimplement the `midx_bitmap_filename()` function in terms of it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:52 -07:00
defba632c1 midx: replace get_midx_rev_filename() with a generic helper
Commit f894081dea (pack-revindex: read multi-pack reverse indexes,
2021-03-30) introduced the `get_midx_rev_filename()` helper (later
modified by commit 60980aed78 (midx.c: write MIDX filenames to
strbuf, 2021-10-26)).

This function returns the location of the classic ".rev" files we used
to write for MIDXs (prior to 95e8383bac (midx.c: make changing the
preferred pack safe, 2022-01-25)), which is always of the form:

    $GIT_DIR/objects/pack/multi-pack-index-$HASH.rev

Replace this function with a generic helper that populates a strbuf with
the above form, replacing the ".rev" extension with a caller-provided
argument.

This will allow us to remove a similarly-defined function in the
pack-bitmap code (used to determine the location of a MIDX .bitmap file)
by reimplementing it in terms of `get_midx_filename_ext()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:52 -07:00
d6a8c58675 midx-write.c: support reading an existing MIDX with packs_to_include
Avoid unconditionally copying all packs from an existing MIDX into a new
MIDX by checking that packs added via `fill_packs_from_midx()` don't
appear in the `to_include` set, if one was provided.

Do so by calling `should_include_pack()` from both `add_pack_to_midx()`
and `fill_packs_from_midx()`.

In order to make this work, teach `should_include_pack()` a new
"exclude_from_midx" parameter, which allows skipping the first check.
This is done so that the caller in `fill_packs_from_midx()` doesn't
reject all of the packs it provided since they appear in an existing
MIDX by definition.

The sum total of this change is that we are now able to read and
reference objects in an existing MIDX even when given a non-NULL
`packs_to_include`. This is a prerequisite step for incremental MIDXs,
which need to load any existing MIDX (if one is present) in order to
determine whether or not an object already appears in an earlier portion
of the MIDX to avoid duplicating it across multiple portions.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:51 -07:00
c5e204af1f midx-write.c: extract fill_packs_from_midx()
When write_midx_internal() loads an existing MIDX, all packs are copied
forward into the new MIDX. Improve the readability of
write_midx_internal() by extracting this functionality out into a
separate function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:51 -07:00
364c0ffc5a midx-write.c: extract should_include_pack()
The add_pack_to_midx() callback used via for_each_file_in_pack_dir() is
used to add packs with .idx files to the MIDX being written.

Within this function, we have a pair of checks that discards packs
which:

  - appear in an existing MIDX, if we successfully read an existing MIDX
    from disk

  - or, appear in the "to_include" list, if invoking the MIDX write
    machinery with the `--stdin-packs` command-line argument.

A future commit will want to call a slight variant of these checks from
the code that reuses all packs from an existing MIDX, as well as the
current location via add_pack_to_midx(). The latter will be modified in
subsequent commits to only reuse packs which appear in the to_include
list, if one was given.

Prepare for that step by extracting these checks as a subroutine that
may be called from both places.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:51 -07:00
33e9218ffb midx-write.c: pass start_pack to compute_sorted_entries()
The function `compute_sorted_entries()` is broadly responsible for
building an array of the objects to be written into a MIDX based on the
provided list of packs.

If we have loaded an existing MIDX, however, we may not use all of its
packs, despite loading them into the ctx->info array.

The existing implementation simply skips past the first
ctx->m->num_packs (if ctx->m is non-NULL, indicating that we loaded an
existing MIDX). This is because we read objects in packs from an
existing MIDX via the MIDX itself, rather than from the pack-level
fanout to guarantee a de-duplicated result (see: a40498a126 (midx: use
existing midx when writing new one, 2018-07-12)).

Future changes (outside the scope of this patch series) to the MIDX code
will require us to skip *at most* that number[^1].

We could tag each pack with a bit that indicates the pack's contents
should be included in the MIDX. But we can just as easily determine the
number of packs to skip by passing in the number of packs we learned
about after processing an existing MIDX.

[^1]: Kind of. The real number will be bounded by the number of packs in
  a MIDX layer, and the number of packs in its base layer(s), but that
  concept hasn't been fully defined yet.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:51 -07:00
3eac5e1ff1 midx-write.c: reduce argument count for get_sorted_entries()
The function `midx-write.c::get_sorted_entries()` is responsible for
constructing the array of OIDs from a given list of packs which will
comprise the MIDX being written.

The singular call-site for this function looks something like:

    ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr,
                                     &ctx.entries_nr,
                                     ctx.preferred_pack_idx);

This function has five formal arguments, all of which are members of the
shared `struct write_midx_context` used to track various pieces of
information about the MIDX being written.

The function `get_sorted_entries()` dates back to fe1ed56f5e (midx:
sort and deduplicate objects from packfiles, 2018-07-12), which came
shortly after 396f257018 (multi-pack-index: read packfile list,
2018-07-12). The latter patch introduced the `pack_list` structure,
which was a precursor to the structure we now know as
`write_midx_context` (c.f. 577dc49696 (midx: rename pack_info to
write_midx_context, 2021-02-18)).

At the time, `get_sorted_entries()` likely could have used the pack_list
structure introduced earlier in 396f257018, but understandably did not
since the structure only contained three fields (only two of which were
relevant to `get_sorted_entries()`) at the time.

Simplify the declaration of this function by taking a single pointer to
the whole `struct write_midx_context` instead of various members within
it. Since this function is now computing the entire result (populating
both `ctx->entries`, and `ctx->entries_nr`), rename it to something that
doesn't start with "get_" to make clear that this function has a
side-effect.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:50 -07:00
23532be8e9 midx-write.c: tolerate --preferred-pack without bitmaps
When passing a preferred pack to the MIDX write machinery, we ensure
that the given preferred pack is non-empty since 5d3cd09a80 (midx:
reject empty `--preferred-pack`'s, 2021-08-31).

However packs are only loaded (via `write_midx_internal()`, though a
subsequent patch will refactor this code out to its own function) when
the `MIDX_WRITE_REV_INDEX` flag is set.

So if a caller runs:

    $ git multi-pack-index write --preferred-pack=...

with both (a) an existing MIDX, and (b) specifies a pack from that MIDX
as the preferred one, without passing `--bitmap`, then the check added
in 5d3cd09a80 will result in a segfault.

Note that packs loaded from disk which don't appear in an existing MIDX
do not trigger this issue, as those packs are loaded unconditionally. We
conditionally load packs from a MIDX since we tolerate MIDXs whose
packs do not resolve (i.e., via the MIDX write after removing
unreferenced packs via 'git multi-pack-index expire').

In practice, this isn't possible to trigger when running `git
multi-pack-index write` from `git repack`, as the latter always passes
`--stdin-packs`, which prevents us from loading an existing MIDX, as it
forces all packs to be read from disk.

But a future commit in this series will change that behavior to
unconditionally load an existing MIDX, even with `--stdin-packs`, making
this behavior trigger-able from 'repack' much more easily.

Prevent this from being an issue by removing the segfault altogether by
calling `prepare_midx_pack()` on packs loaded from an existing MIDX when
either the `MIDX_WRITE_REV_INDEX` flag is set *or* we specified a
`--preferred-pack`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 13:43:50 -07:00
4c063c82e9 rebase -i: improve error message when picking merge
The only todo commands that accept a merge commit are "merge" and
"reset". All the other commands like "pick" or "reword" fail when they
try to pick a a merge commit and print the message

    error: commit abc123 is a merge but no -m option was given.

followed by a hint about the command being rescheduled. This message is
designed to help the user when they cherry-pick a merge and forget to
pass "-m". For users who are rebasing the message is confusing as there
is no way for rebase to cherry-pick the merge.

Improve the user experience by detecting the error and printing some
advice on how to fix it when the todo list is parsed rather than waiting
for the "pick" command to fail. The advice recommends "merge" rather
than "exec git cherry-pick -m ..." on the assumption that cherry-picking
merges is relatively rare and it is more likely that the user chose
"pick" by a mistake.

It would be possible to support cherry-picking merges by allowing the
user to pass "-m" to "pick" commands but that adds complexity to do
something that can already be achieved with

    exec git cherry-pick -m1 abc123

Reported-by: Stefan Haller <lists@haller-berlin.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 10:02:58 -07:00
0c26738aa4 rebase -i: pass struct replay_opts to parse_insn_line()
This new parameter will be used in the next commit. As adding the
parameter requires quite a few changes to plumb it through the call
chain these are separated into their own commit to avoid cluttering up
the next commit with incidental changes.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 10:02:56 -07:00
64f8502b40 mv: replace src_dir with a strvec
We manually manage the src_dir array with ALLOC_GROW. Using a strvec is
a little more ergonomic, and makes the memory ownership more clear. It
does mean that we copy the strings (which were otherwise just pointers
into the "sources" strvec), but using the same rationale as 9fcd9e4e72
(builtin/mv duplicate string list memory, 2024-05-27), it's just not
enough to be worth worrying about here.

As a bonus, this gets rid of some "int"s used for allocation management
(though in practice these were limited to command-line sizes and thus
not overflowable).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
d58a687705 mv: factor out empty src_dir removal
This pulls the loop added by b6f51e3db9 (mv: cleanup empty
WORKING_DIRECTORY, 2022-08-09) into a sub-function. That reduces clutter
in cmd_mv() and makes it easier to see that the lifetime of the
a_src_dir strbuf is limited to this code (and thus its cleanup doesn't
need to go after the "out" label).

Another option would be to just declare the strbuf inside the loop,
since it is only used there. But this refactor retains the existing
property that we can reuse the allocated buffer for each iteration of
the loop. That optimization is probably overkill, but I think the
sub-function is more readable anyway, and then keeping the optimization
is basically free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
cc65e085e4 mv: move src_dir cleanup to end of cmd_mv()
Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09)
added an auxiliary array where we store directory arguments that we see
while processing the incoming arguments. After actually moving things,
we then use that array to remove now-empty directories, and then
immediately free the array.

But if the actual move queues any errors in only_match_skip_worktree,
that can cause us to jump straight to the "out" label to clean up,
skipping the free() and leaking the array.

Let's push the free() down past the "out" label so that we always clean
up (the array is initialized to NULL, so this is always safe). We'll
hold on to the memory a little longer than necessary, but clarity is
more important than micro-optimizing here.

Note that the adjacent "a_src_dir" strbuf does not suffer the same
problem; it is only allocated during the removal step.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
34eb843721 t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
This will let the compiler catch a problem like:

  /* oops, we forgot the NULL */
  check_strvec(&vec, "foo");

rather than triggering undefined behavior at runtime.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
b639884f9a t-strvec: use va_end() to match va_start()
Our check_strvec_loc() helper uses a variable argument list. When we
va_start(), we must be sure to va_end() before leaving the function.
This is required by the standard (though the effect of forgetting will
vary between platforms).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
a3f0e2a064 Merge branch 'ps/leakfixes' into jk/leakfixes
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-05-30 08:54:58 -07:00
efa8786800 t: improve the test-case for parse_names()
In the existing test-case for parse_names(), the fact that empty
lines should be ignored is not obvious because the empty line is
immediately followed by end-of-string. This can be mistaken as the
empty line getting replaced by NULL. Improve this by adding a
non-empty line after the empty one to demonstrate the intended behavior.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 07:30:10 -07:00
e31efffc28 t: add test for put_be16()
put_be16() is a function defined in reftable/basics.{c, h} for which
there are no tests in the current setup. Add a test for the same.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 07:30:10 -07:00
afe5b9e7ec t: move tests from reftable/record_test.c to the new unit test
common_prefix_size(), get_be24() and put_be24() are functions defined
in reftable/basics.{c, h}. Move the tests for these functions from
reftable/record_test.c to the newly ported test.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 07:30:10 -07:00
f74e1865fe t: move tests from reftable/stack_test.c to the new unit test
parse_names() and names_equal() are functions defined in
reftable/basics.{c, h}. Move the tests for these functions from
reftable/stack_test.c to the newly ported test.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 07:30:10 -07:00
b34116a30c t: move reftable/basics_test.c to the unit testing framework
reftable/basics_test.c exercise the functions defined in
reftable/basics.{c, h}. Migrate reftable/basics_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 07:30:10 -07:00
313eec177a safe.directory: allow "lead/ing/path/*" match
When safe.directory was introduced in v2.30.3 timeframe, 8959555c
(setup_git_directory(): add an owner check for the top-level
directory, 2022-03-02), it only allowed specific opt-out
directories.  Immediately after an embargoed release that included
the change, 0f85c4a3 (setup: opt-out of check with safe.directory=*,
2022-04-13) was done as a response to loosen the check so that a
single '*' can be used to say "I trust all repositories" for folks
who host too many repositories to list individually.

Let's further loosen the check to allow people to say "everything
under this hierarchy is deemed safe" by specifying such a leading
directory with "/*" appended to it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-29 12:06:27 -07:00
5529cba09f Merge branch 'ps/leakfixes' into ps/no-writable-strings
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-05-29 09:32:24 -07:00
2794932548 t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash
t/helper/test-{sha1, sha256} and t/t0015-hash.sh test the hash
implementation of SHA-1 and SHA-256 in Git with basic hash values.
Migrate them to the new unit testing framework for better debugging
and runtime performance.

The 'sha1' and 'sha256' subcommands are still not removed due to
pack_trailer():lib-pack.sh's reliance on them. The 'sha1' subcommand
is also relied upon by t0013-sha1dc (which requires 'test-tool
sha1' dying when it is used on a file created to contain the
known sha1 attack).

Helped-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-authored-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-29 09:11:41 -07:00
a70f8f19ad strbuf: introduce strbuf_addstrings() to repeatedly add a string
In a following commit we are going to port code from
"t/helper/test-sha256.c", t/helper/test-hash.c and "t/t0015-hash.sh" to
a new "t/unit-tests/t-hash.c" file using the recently added unit test
framework.

To port code like: perl -e "$| = 1; print q{aaaaaaaaaa} for 1..100000;"
we are going to need a new strbuf_addstrings() function that repeatedly
adds the same string a number of times to a buffer.

Such a strbuf_addstrings() function would already be useful in
"json-writer.c" and "builtin/submodule-helper.c" as both of these files
already have code that repeatedly adds the same string. So let's
introduce such a strbuf_addstrings() function in "strbuf.{c,h}" and use
it in both "json-writer.c" and "builtin/submodule-helper.c".

We use the "strbuf_addstrings" name as this way strbuf_addstr() and
strbuf_addstrings() would be similar for strings as strbuf_addch() and
strbuf_addchars() for characters.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-authored-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-29 09:09:39 -07:00
456b4dce4c t/: migrate helper/test-example-decorate to the unit testing framework
helper/test-example-decorate.c along with t9004-example.sh provide
an example of how to use the functions in decorate.h (which provides
a data structure that associates Git objects to void pointers) and
also test their output.

Migrate them to the new unit testing framework for better debugging
and runtime performance.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-28 13:53:36 -07:00
3a57aa566a The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-28 11:17:11 -07:00
1a367763d0 Merge branch 'ps/leakfixes-base'
* ps/leakfixes-base:
  t: mark a bunch of tests as leak-free
  ci: add missing dependency for TTY prereq
2024-05-28 11:17:11 -07:00
2a1a882890 Merge branch 'kn/osxkeychain-skip-idempotent-store'
The credential helper that talks with osx keychain learned to avoid
storing back the authentication material it just got received from
the keychain.

* kn/osxkeychain-skip-idempotent-store:
  osxkeychain: state to skip unnecessary store operations
  osxkeychain: exclusive lock to serialize execution of operations
2024-05-28 11:17:11 -07:00
b32f298264 Merge branch 'jc/format-patch-more-aggressive-range-diff'
The default "creation-factor" used by "git format-patch" has been
raised to make it more aggressively find matching commits.

* jc/format-patch-more-aggressive-range-diff:
  format-patch: run range-diff with larger creation-factor
2024-05-28 11:17:10 -07:00
3acecc04c7 Merge branch 'jc/rev-parse-fatal-doc'
Doc update.

* jc/rev-parse-fatal-doc:
  rev-parse: document how --is-* options work outside a repository
2024-05-28 11:17:10 -07:00
dfe42162d9 Merge branch 'jc/t0017-clarify-bogus-expectation'
Test clean-up.

* jc/t0017-clarify-bogus-expectation:
  t0017: clarify dubious test set-up
2024-05-28 11:17:09 -07:00
789ec1d91d Merge branch 'ds/send-email-per-message-block'
Preliminary code clean-up for "git send-email".

* ds/send-email-per-message-block:
  send-email: move newline characters out of a few translatable strings
2024-05-28 11:17:09 -07:00
7a40196328 Merge branch 'ps/complete-config-w-subcommands'
The command line completion script (in contrib/) has been adjusted
to the recent update to "git config" that adopted subcommand based
UI.

* ps/complete-config-w-subcommands:
  completion: adapt git-config(1) to complete subcommands
2024-05-28 11:17:08 -07:00
6e95dce712 Merge branch 'jc/doc-diff-name-only'
The documentation for "git diff --name-only" has been clarified
that it is about showing the names in the post-image tree.

* jc/doc-diff-name-only:
  diff: document what --name-only shows
2024-05-28 11:17:08 -07:00
ee8537ebc9 Merge branch 'tb/pack-bitmap-write-cleanups'
The pack bitmap code saw some clean-up to prepare for a follow-up topic.

* tb/pack-bitmap-write-cleanups:
  pack-bitmap: introduce `bitmap_writer_free()`
  pack-bitmap-write.c: avoid uninitialized 'write_as' field
  pack-bitmap: drop unused `max_bitmaps` parameter
  pack-bitmap: avoid use of static `bitmap_writer`
  pack-bitmap-write.c: move commit_positions into commit_pos fields
  object.h: add flags allocated by pack-bitmap.h
2024-05-28 11:17:07 -07:00
00ffa1cb1c Merge branch 'ps/builtin-config-cleanup'
Code clean-up to reduce inter-function communication inside
builtin/config.c done via the use of global variables.

* ps/builtin-config-cleanup: (21 commits)
  builtin/config: pass data between callbacks via local variables
  builtin/config: convert flags to a local variable
  builtin/config: track "fixed value" option via flags only
  builtin/config: convert `key` to a local variable
  builtin/config: convert `key_regexp` to a local variable
  builtin/config: convert `regexp` to a local variable
  builtin/config: convert `value_pattern` to a local variable
  builtin/config: convert `do_not_match` to a local variable
  builtin/config: move `respect_includes_opt` into location options
  builtin/config: move default value into display options
  builtin/config: move type options into display options
  builtin/config: move display options into local variables
  builtin/config: move location options into local variables
  builtin/config: refactor functions to have common exit paths
  config: make the config source const
  builtin/config: check for writeability after source is set up
  builtin/config: move actions into `cmd_config_actions()`
  builtin/config: move legacy options into `cmd_config()`
  builtin/config: move subcommand options into `cmd_config()`
  builtin/config: move legacy mode into its own function
  ...
2024-05-28 11:17:07 -07:00
16a592f132 Merge branch 'ps/pseudo-ref-terminology'
Terminology to call various ref-like things are getting
straightened out.

* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-28 11:17:06 -07:00
3b1e3f02bf Merge branch 'kn/patch-iteration-doc'
Doc updates.

* kn/patch-iteration-doc:
  SubmittingPatches: add section for iterating patches
2024-05-28 11:17:06 -07:00
eeec143a37 Merge branch 'mt/t0211-typofix'
Test fix.

* mt/t0211-typofix:
  t/t0211-trace2-perf.sh: fix typo patern -> pattern
2024-05-28 11:17:05 -07:00
64a7424694 Merge branch 'jc/doc-manpages-l10n'
The SubmittingPatches document now refers folks to manpages
translation project.

* jc/doc-manpages-l10n:
  SubmittingPatches: advertise git-manpages-l10n project a bit
2024-05-28 11:17:05 -07:00
ebdbefa4fe builtin/mv: fix leaks for submodule gitfile paths
Similar to the preceding commit, we have effectively given tracking
memory ownership of submodule gitfile paths. Refactor the code to start
tracking allocated strings in a separate `struct strvec` such that we
can easily plug those leaks. Mark now-passing tests as leak free.

Note that ideally, we wouldn't require two separate data structures to
track those paths. But we do need to store `NULL` pointers for the
gitfile paths such that we can indicate that its corresponding entries
in the other arrays do not have such a path at all. And given that
`struct strvec`s cannot store `NULL` pointers we cannot use them to
store this information.

There is another small gotcha that is easy to miss: you may be wondering
why we don't want to store `SUBMODULE_WITH_GITDIR` in the strvec. This
is because this is a mere sentinel value and not actually a string at
all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:03 -07:00
52a7dab439 builtin/mv: refactor to use struct strvec
Memory allocation patterns in git-mv(1) are extremely hard to follow:
We copy around string pointers into manually-managed arrays, some of
which alias each other, but only sometimes, while we also drop some of
those strings at other times without ever daring to free them.

While this may be my own subjective feeling, it seems like others have
given up as the code has multiple calls to `UNLEAK()`. These are not
sufficient though, and git-mv(1) is still leaking all over the place
even with them.

Refactor the code to instead track strings in `struct strvec`. While
this has the effect of effectively duplicating some of the strings
without an actual need, it is way easier to reason about and fixes all
of the aliasing of memory that has been going on. It allows us to get
rid of the `UNLEAK()` calls and also fixes leaks that those calls did
not paper over.

Mark tests which are now leak-free accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
9fcd9e4e72 builtin/mv duplicate string list memory
makes the next patch easier, where we will migrate to the paths being
owned by a strvec. given that we are talking about command line
parameters here it's also not like we have tons of allocations that this
would save

while at it, fix a memory leak

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
3d231f7b82 builtin/mv: refactor add_slash() to always return allocated strings
The `add_slash()` function will only conditionally return an allocated
string when the passed-in string did not yet have a trailing slash. This
makes the memory ownership harder to track than really necessary.

It's dubious whether this optimization really buys us all that much. The
number of times we execute this function is bounded by the number of
arguments to git-mv(1), so in the typical case we may end up saving an
allocation or two.

Simplify the code to unconditionally return allocated strings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
11ce77b5cc strvec: add functions to replace and remove strings
Add two functions that allow to replace and remove strings contained in
the strvec. This will be used by a subsequent commit that refactors
git-mv(1).

While at it, add a bunch of unit tests that cover both old and new
functionality.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
3ef52dd112 submodule: fix leaking memory for submodule entries
In `free_one_config()` we never end up freeing the `url` and `ignore`
fields and thus leak memory. Fix those leaks and mark now-passing tests
as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:01 -07:00
ba9d029445 commit-reach: fix memory leak in ahead_behind()
We use a priority queue in `ahead_behind()` to compute the ahead/behind
count for commits. We may not iterate through all commits part of that
queue though in case all of its entries are stale. Consequently, as we
never make the effort to release the remaining commits, we end up
leaking bit arrays that we have allocated for each of the contained
commits.

Plug this leak and mark the corresponding test as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:01 -07:00
96c1655095 builtin/credential: clear credential before exit
We never release memory associated with `struct credential`. Fix this
and mark the corresponding test as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:01 -07:00
49eb597ce0 config: plug various memory leaks
Now that memory ownership rules around `git_config_string()` and
`git_config_pathname()` are clearer, it also got easier to spot that
the returned memory needs to be free'd. Plug a subset of those cases and
mark now-passing tests as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
1b261c20ed config: clarify memory ownership in git_config_string()
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
83024d98f7 builtin/log: stop using globals for format config
This commit does the exact same as the preceding commit, only for the
format configuration instead of the log configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
106a54aecb builtin/log: stop using globals for log config
We're using global variables to store the log configuration. Many of
these can be set both via the command line and via the config, and
depending on how they are being set, they may contain allocated strings.
This leads to hard-to-track memory ownership and memory leaks.

Refactor the code to instead use a `struct log_config` that is being
allocated on the stack. This allows us to more clearly scope the
variables, track memory ownership and ultimately release the memory.

This also prepares us for a change to `git_config_string()`, which will
be adapted to have a `char **` out parameter instead of `const char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
a6cb0cc610 convert: refactor code to clarify ownership of check_roundtrip_encoding
The `check_roundtrip_encoding` variable is tracked in a `const char *`
even though it may contain allocated strings at times. The result is
that those strings may be leaking because we never free them.

Refactor the code to always store allocated strings in this variable.
The default value is handled in `check_roundtrip()` now, which is the
only user of the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
f9c1989674 diff: refactor code to clarify memory ownership of prefixes
The source and destination prefixes are tracked in a `const char *`
array, but may at times contain allocated strings. The result is that
those strings may be leaking because we never free them.

Refactor the code to always store allocated strings in those variables,
freeing them as required. This requires us to handle the default values
a bit different compared to before. But given that there is only a
single callsite where we use the variables to `struct diff_options` it's
easy to handle the defaults there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
6073b3b5c3 config: clarify memory ownership in git_config_pathname()
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
f962ffc392 http: refactor code to clarify memory ownership
There are various variables assigned via `git_config_string()` and
`git_config_pathname()` which are never free'd. This bug is relatable
because the out parameter of those functions are a `const char **`, even
though memory ownership is transferred to the caller.

We're about to adapt the functions to instead use `char **`. Prepare the
code accordingly. Note that the `(const char **)` casts will go away
once we have adapted the functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:58 -07:00
cc395d6b47 checkout: clarify memory ownership in unique_tracking_name()
The function `unique_tracking_name()` returns an allocated string, but
does not clearly indicate this because its return type is `const char *`
instead of `char *`. This has led to various callsites where we never
free its returned memory at all, which causes memory leaks.

Plug those leaks and mark now-passing tests as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:58 -07:00
94e2aa555e strbuf: fix leak when appendwholeline() fails with EOF
In `strbuf_appendwholeline()` we call `strbuf_getwholeline()` with a
temporary buffer. In case the call returns an error we indicate this by
returning EOF, but never release the temporary buffer. This can cause a
leak though because `strbuf_getwholeline()` calls getline(3). Quoting
its documentation:

    If *lineptr was set to NULL before the call, then the buffer
    should be freed by the user program even on failure.

Consequently, the temporary buffer may hold allocated memory even when
the call to `strbuf_getwholeline()` fails.

Fix this by releasing the temporary buffer on error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:58 -07:00
fba95dad6a t: mark a bunch of tests as leak-free
There are a bunch of tests which do not have any leaks:

  - t0411: Introduced via 5c5a4a1c05 (t0411: add tests for cloning from
    partial repo, 2024-01-28), passes since its inception.

  - t0610: Introduced via 57db2a094d (refs: introduce reftable backend,
    2024-02-07), passes since its inception.

  - t2405: Passes since 6741e917de (repository: avoid leaking
    `fsmonitor` data, 2024-04-12).

  - t7423: Introduced via b20c10fd9b (t7423: add tests for symlinked
    submodule directories, 2024-01-28), passes since e8d0608944
    (submodule: require the submodule path to contain directories only,
    2024-03-26). The fix is not obviously related, but probably works
    because we now die early in many code paths.

  - t9xxx: All of these are exercising CVS-related tooling and pass
    since at least Git v2.40. It's likely that these pass for a long
    time already, but nobody ever noticed because Git developers do not
    tend to have CVS on their machines.

Mark all of these tests as passing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:57 -07:00
97613b9cb9 transport-helper: fix leaking helper name
When initializing the transport helper in `transport_get()`, we
allocate the name of the helper. We neither end up transferring
ownership of the name, nor do we free it. The associated memory thus
leaks.

Fix this memory leak by freeing the string at the calling side in
`transport_get()`. `transport_helper_init()` now creates its own copy of
the string and thus can free it as required.

An alterantive way to fix this would be to transfer ownership of the
string passed into `transport_helper_init()`, which would avoid the call
to xstrdup(1). But it does make for a more surprising calling convention
as we do not typically transfer ownership of strings like this.

Mark now-passing tests as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:57 -07:00
9fd369377d ci: add missing dependency for TTY prereq
In "t/lib-terminal.sh", we declare a lazy prerequisite for tests that
require a TTY. The prerequisite uses a Perl script to figure out whether
we do have a usable TTY or not and thus implicitly depends on the PERL
prerequisite, as well. Furthermore though, the script requires another
dependency that is easy to miss, namely on the IO::Pty module. If that
module is not installed, then the script will exit early due to an
reason unrelated to missing TTYs.

This easily leads to missing test coverage. But most importantly, our CI
systems are missing this dependency and thus don't execute those tests
at all. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:57 -07:00
174443ed3a Documentation: alias: rework notes into points
There are a number of caveats when using aliases.  Rather than
stuffing them all together in a paragraph, let's separate them out
into individual points to make it clearer what's going on.

Signed-off-by: Ian Wienand <iwienand@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 10:44:07 -07:00
36d900d2b0 difftool: add env vars directly in run_file_diff()
Add the environment variables of the child process directly using
strvec_push() instead of building an array out of them and then adding
that using strvec_pushv().  The new code is shorter and avoids magic
array index values and fragile array padding.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 08:55:59 -07:00
7e17d954d8 promisor-remote: add promisor.quiet configuration option
Add a configuration option to allow output from the promisor
fetching objects to be suppressed.

This allows us to stop commands like 'git blame' being swamped
with progress messages and gc notifications from the promisor
when used in a partial clone.

Signed-off-by: Tom Hughes <tom@compton.nu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-26 09:17:08 -07:00
d36cc0d5a4 Merge branch 'fixes/2.45.1/2.44' into jc/fix-2.45.1-and-friends-for-maint
* fixes/2.45.1/2.44:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:59:12 -07:00
863c0ed71e Merge branch 'fixes/2.45.1/2.43' into fixes/2.45.1/2.44
* fixes/2.45.1/2.43:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:35 -07:00
3c562ef2e6 Merge branch 'fixes/2.45.1/2.42' into fixes/2.45.1/2.43
* fixes/2.45.1/2.42:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:11 -07:00
73339e4dc2 Merge branch 'fixes/2.45.1/2.41' into fixes/2.45.1/2.42
* fixes/2.45.1/2.41:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:43 -07:00
4f215d214f Merge branch 'fixes/2.45.1/2.40' into fixes/2.45.1/2.41
* fixes/2.45.1/2.40:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:02 -07:00
2fa04cebfb format-patch: move range/inter diff at the end of a single patch output
When running "format-patch" on a multiple patch series, the output
coming from "--interdiff" and "--range-diff" options is inserted
after the "shortlog" list of commits and the overall diffstat.

The idea is that shortlog/diffstat are shorter and with denser
information content, which gives a better overview before the
readers dive into more details of range/inter diff.

When working on a single patch, however, we stuff the inter/range
diff output before the actual patch, next to the diffstat.  This
pushes down the patch text way down with inter/range diff output,
distracting readers.

Move the inter/range diff output to the very end of the output,
after all the patch text is shown.

As the inter/range diff is no longer part of the commentary block
(i.e., what comes after the log message and "---", but before the
patch text), stop producing "---" in the function that generates
them.  But to separate it out visually (note: this is not needed
to help tools like "git apply" that pay attention to the hunk
headers to figure out the length of the hunks), add an extra blank
line between the end of the patch text and the inter/range diff.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 16:26:57 -07:00
48440f60a7 Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into fixes/2.45.1/2.40
Revert overly aggressive "layered defence" that went into 2.45.1
and friends, which broke "git-lfs", "git-annex", and other use
cases, so that we can rebuild necessary counterparts in the open.

* jc/fix-2.45.1-and-friends-for-2.39:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 12:29:36 -07:00
0b7500dc66 t/perf: implement performance tests for pseudo-merge bitmaps
Implement a straightforward performance test demonstrating the benefit
of pseudo-merge bitmaps by measuring how long it takes to count
reachable objects in a few different scenarios:

  - without bitmaps, to demonstrate a reasonable baseline
  - with bitmaps, but without pseudo-merges
  - with bitmaps and pseudo-merges

Results from running this test on git.git are as follows:

    Test                                                                this tree
    -----------------------------------------------------------------------------------
    5333.2: git rev-list --count --all --objects (no bitmaps)           3.54(3.45+0.08)
    5333.3: git rev-list --count --all --objects (no pseudo-merges)     0.43(0.40+0.03)
    5333.4: git rev-list --count --all --objects (with pseudo-merges)   0.12(0.11+0.01)

On a private repository which is much larger, and has many spikey parts
of history that aren't merged into the 'master' branch, the results are
as follows:

    Test                                                                this tree
    ---------------------------------------------------------------------------------------
    5333.1: git rev-list --count --all --objects (no bitmaps)           122.29(121.31+0.97)
    5333.2: git rev-list --count --all --objects (no pseudo-merges)     21.88(21.30+0.58)
    5333.3: git rev-list --count --all --objects (with pseudo-merges)   5.05(4.77+0.28)

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:44 -07:00
7252d9a036 pseudo-merge: implement support for finding existing merges
This patch implements support for reusing existing pseudo-merge commits
when writing bitmaps when there is an existing pseudo-merge bitmap which
has exactly the same set of parents as one that we are about to write.

Note that unstable pseudo-merges are likely to change between
consecutive repacks, and so are generally poor candidates for reuse.
However, stable pseudo-merges (see the configuration option
'bitmapPseudoMerge.<name>.stableThreshold') are by definition unlikely
to change between runs (as they represent long-running branches).

Because there is no index from a *set* of pseudo-merge parents to a
matching pseudo-merge bitmap, we have to construct the bitmap
corresponding to the set of parents for each pending pseudo-merge commit
and see if a matching bitmap exists.

This is technically quadratic in the number of pseudo-merges, but is OK
in practice for a couple of reasons:

  - non-matching pseudo-merge bitmaps are rejected quickly as soon as
    they differ in a single bit

  - already-matched pseudo-merge bitmaps are discarded from subsequent
    rounds of search

  - the number of pseudo-merges is generally small, even for large
    repositories

In order to do this, implement (a) a function that finds a matching
pseudo-merge given some uncompressed bitset describing its parents, (b)
a function that computes the bitset of parents for a given pseudo-merge
commit, and (c) call that function before computing the set of reachable
objects for some pending pseudo-merge.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:44 -07:00
94c1addf86 ewah: bitmap_equals_ewah()
Prepare to reuse existing pseudo-merge bitmaps by implementing a
`bitmap_equals_ewah()` helper.

This helper will be used to see if a raw bitmap (containing the set of
parents for some pseudo-merge) is equal to any existing pseudo-merge's
commits bitmap (which are stored as EWAH-compressed bitmaps on disk).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:44 -07:00
25163f50a2 pack-bitmap: extra trace2 information
Add some extra trace2 lines to capture the number of bitmap lookups that
are hits versus misses, as well as the number of reachability roots that
have bitmap coverage (versus those that do not).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:44 -07:00
11d45a6e6a pack-bitmap.c: use pseudo-merges during traversal
Now that all of the groundwork has been laid to support reading and
using pseudo-merges, make use of that work in this commit by teaching
the pack-bitmap machinery to use pseudo-merge(s) when available during
traversal.

The basic operation is as follows:

  - When enumerating objects on either side of a reachability query,
    first see if any subset of the roots satisfies some pseudo-merge
    bitmap. If it does, apply that pseudo-merge bitmap.

  - If any pseudo-merge bitmap(s) were applied in the previous step, OR
    them into the result[^1]. Then repeat the process over all
    pseudo-merge bitmaps (we'll refer to this as "cascading"
    pseudo-merges). Once this is done, OR in the resulting bitmap.

  - If there is no fill-in traversal to be done, return the bitmap for
    that side of the reachability query. If there is fill-in traversal,
    then for each commit we encounter via show_commit(), check to see if
    any unsatisfied pseudo-merges containing that commit as one of its
    parents has been made satisfied by the presence of that commit.

    If so, OR in the object set from that pseudo-merge bitmap, and then
    cascade. If not, continue traversal.

A similar implementation is present in the boundary-based bitmap
traversal routines.

[^1]: Importantly, we cannot OR in the entire set of roots along with
  the objects reachable from whatever pseudo-merge bitmaps were
  satisfied.  This may leave some dangling bits corresponding to any
  unsatisfied root(s) getting OR'd into the resulting bitmap, tricking
  other parts of the traversal into thinking we already have a
  reachability closure over those commit(s) when we do not.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00
8e41468ef3 t/test-lib-functions.sh: support --notick in test_commit_bulk()
One of the tests we'll want to add for pseudo-merge bitmaps needs to be
able to generate a large number of commits at a specific date.

Support the `--notick` option (with identical semantics to the
`--notick` option for `test_commit()`) within `test_commit_bulk` as a
prerequisite for that. Callers can then set the various _DATE variables
themselves.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00
71eca9ab79 pack-bitmap: implement test helpers for pseudo-merge
Implement three new sub-commands for the "bitmap" test-helper:

  - t/helper test-tool bitmap dump-pseudo-merges
  - t/helper test-tool bitmap dump-pseudo-merge-commits <n>
  - t/helper test-tool bitmap dump-pseudo-merge-objects <n>

These three helpers dump the list of pseudo merges, the "parents" of the
nth pseudo-merges, and the set of objects reachable from those parents,
respectively.

These helpers will be useful in subsequent patches when we add test
coverage for pseudo-merge bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00
0481cbf912 ewah: implement ewah_bitmap_popcount()
Some of the pseudo-merge test helpers (which will be introduced in the
following commit) will want to indicate the total number of commits in
or objects reachable from a pseudo-merge.

Implement a popcount() function that operates on EWAH bitmaps to quickly
determine how many bits are set in each of the respective bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00
955747b4da pseudo-merge: implement support for reading pseudo-merge commits
Implement the basic API for reading pseudo-merge bitmaps, which consists
of four basic functions:

  - pseudo_merge_bitmap()
  - use_pseudo_merge()
  - apply_pseudo_merges_for_commit()
  - cascade_pseudo_merges()

These functions are all documented in pseudo-merge.h, but their rough
descriptions are as follows:

  - pseudo_merge_bitmap() reads and inflates the objects EWAH bitmap for
    a given pseudo-merge

  - use_pseudo_merge() does the same as pseudo_merge_bitmap(), but on
    the commits EWAH bitmap, not the objects bitmap

  - apply_pseudo_merges_for_commit() applies all satisfied pseudo-merge
    commits for a given result set, and cascades any yet-unsatisfied
    pseudo-merges if any were applied in the previous step

  - cascade_pseudo_merges() applies all pseudo-merges which are
    satisfied but have not been previously applied, repeating this
    process until no more pseudo-merges can be applied

The core of the API is the latter two functions, which are responsible
for applying pseudo-merges during the object traversal implemented in
the pack-bitmap machinery.

The other two functions (pseudo_merge_bitmap(), and use_pseudo_merge())
are low-level ways to interact with the pseudo-merge machinery, which
will be useful in future commits.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00
7c0fae8844 pack-bitmap.c: read pseudo-merge extension
Now that the scaffolding for reading the pseudo-merge extension has been
laid, teach the pack-bitmap machinery to read the pseudo-merge extension
when present.

Note that pseudo-merges themselves are not yet used during traversal,
this step will be taken by a future commit.

In the meantime, read the table and initialize the pseudo_merge_map
structure introduced by a previous commit. When the pseudo-merge
extension is present, `load_bitmap_header()` performs basic sanity
checks to make sure that the table is well-formed.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00
0f81b9cb2c pseudo-merge: scaffolding for reads
Implement scaffolding within the new pseudo-merge compilation unit
necessary to use the pseudo-merge API from within the pack-bitmap.c
machinery.

The core of this scaffolding is two-fold:

  - The `pseudo_merge` structure itself, which represents an individual
    pseudo-merge bitmap. It has fields for both bitmaps, as well as
    metadata about its position within the memory-mapped region, and
    a few extra bits indicating whether or not it is satisfied, and
    which bitmaps(s, if any) have been read, since they are initialized
    lazily.

  - The `pseudo_merge_map` structure, which holds an array of
    pseudo_merges, as well as a pointer to the memory-mapped region
    containing the pseudo-merge serialization from within a .bitmap
    file.

Note that the `bitmap_index` structure is defined statically within the
pack-bitmap.o compilation unit, so we can't take in a `struct
bitmap_index *`. Instead, wrap the primary components necessary to read
the pseudo-merges in this new structure to avoid exposing the
implementation details of the `bitmap_index` structure.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00
79621f3e41 pack-bitmap: extract read_bitmap() function
The pack-bitmap machinery uses the `read_bitmap_1()` function to read a
bitmap from within the mmap'd region corresponding to the .bitmap file.
As as side-effect of calling this function, `read_bitmap_1()` increments
the `index->map_pos` variable to reflect the number of bytes read.

Extract the core of this routine to a separate function (that operates
over a `const unsigned char *`, a `size_t` and a `size_t *` pointer)
instead of a `struct bitmap_index *` pointer.

This function (called `read_bitmap()`) is part of the pack-bitmap.h API
so that it can be used within the upcoming portion of the implementation
in pseduo-merge.ch.

Rewrite the existing function, `read_bitmap_1()`, in terms of its more
generic counterpart.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:42 -07:00
53ea3ec479 pack-bitmap-write.c: write pseudo-merge table
Now that the pack-bitmap writer machinery understands how to select and
store pseudo-merge commits, teach it how to write the new optional
pseudo-merge .bitmap extension.

No readers yet exist for this new extension to the .bitmap format. The
following commits will take any preparatory step(s) necessary before
then implementing the routines necessary to read this new table.

In the meantime, the new `write_pseudo_merges()` function implements
writing this new format as described by a previous commit in
Documentation/technical/bitmap-format.txt.

Writing this table is fairly straightforward and consists of a few
sub-components:

  - a pair of bitmaps for each pseudo-merge (one for the pseudo-merge
    "parents", and another for the objects reachable from those parents)

  - for each commit, the offset of either (a) the pseudo-merge it
    belongs to, or (b) an extended lookup table if it belongs to >1
    pseudo-merge groups

  - if there are any commits belonging to >1 pseudo-merge group, the
    extended lookup tables (which each consist of the number of
    pseudo-merge groups a commit appears in, and then that many 4-byte
    unsigned )

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:42 -07:00
faf558b23e pseudo-merge: implement support for selecting pseudo-merge commits
Teach the new pseudo-merge machinery how to select non-bitmapped commits
for inclusion in different pseudo-merge group(s) based on a handful of
criteria.

Note that the selected pseudo-merge commits aren't actually used or
written anywhere yet. This will be done in the following commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:42 -07:00
5831f8ac41 config: introduce git_config_double()
Future commits will want to parse a double-precision floating point
value from configuration, but we have no way to parse such a value prior
to this patch.

The core of the routine is implemented in git_parse_double(). Unlike
git_parse_unsigned() and git_parse_signed(), however, the function
implemented here only works on type "double", and not related types like
"float", or "long double".

This is because "float" and "long double" use different functions to
convert from ASCII strings to floating point values (strtof() and
strtold(), respectively). Likewise, there is no pointer type that can
assign to any of these values (except for "void *"), so the only way to
define this trio of functions would be with a macro expansion that is
parameterized over the floating point type and conversion function.

That is all doable, but likely to be overkill given our current needs,
which is only to parse double-precision floats.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:42 -07:00
c059c8795e pack-bitmap: make bitmap_writer_push_bitmapped_commit() public
The pseudo-merge selection code will be added in a subsequent commit,
and will need a way to push the allocated commit structures into the
bitmap writer from a separate compilation unit.

Make the `bitmap_writer_push_bitmapped_commit()` function part of the
pack-bitmap.h header in order to make this possible.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:42 -07:00
245a7f2e01 pack-bitmap: implement bitmap_writer_has_bitmapped_object_id()
Prepare to implement pseudo-merge bitmap selection by implementing a
necessary new function, `bitmap_writer_has_bitmapped_object_id()`.

This function returns whether or not the bitmap_writer selected the
given object ID for bitmapping. This will allow the pseudo-merge
machinery to reject candidates for pseudo-merges if they have already
been selected as an ordinary bitmap tip.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:42 -07:00
0d41b18317 pack-bitmap-write: support storing pseudo-merge commits
Prepare to write pseudo-merge bitmaps by annotating individual bitmapped
commits (which are represented by the `bitmapped_commit` structure) with
an extra bit indicating whether or not they are a pseudo-merge.

In subsequent commits, pseudo-merge bitmaps will be generated by
allocating a fake commit node with parents covering the full set of
commits represented by the pseudo-merge bitmap. These commits will be
added to the set of "selected" commits as usual, but will be written
specially instead of being included with the rest of the selected
commits.

Mechanically speaking, there are two parts of this change:

  - The bitmapped_commit struct gets a new bit indicating whether it is
    a pseudo-merge, or an ordinary commit selected for bitmaps.

  - A handful of changes to only write out the non-pseudo-merge commits
    when enumerating through the selected array (see the new
    `bitmap_writer_selected_nr()` function). Pseudo-merge commits appear
    after all non-pseudo-merge commits, so it is safe to enumerate
    through the selected array like so:

        for (i = 0; i < bitmap_writer_selected_nr(); i++)
          if (writer.selected[i].pseudo_merge)
            BUG("unexpected pseudo-merge");

    without encountering the BUG().

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:42 -07:00
89f47c45df pseudo-merge.ch: initial commit
Add a new (empty) header file to contain the implementation for
selecting, reading, and applying pseudo-merge bitmaps.

For now this header and its corresponding implementation are left
empty, but they will evolve over the course of subsequent commit(s).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
4722e06edc pack-bitmap: move some initialization to bitmap_writer_init()
The pack-bitmap-writer machinery uses a oidmap (backed by khash.h) to
map from commits selected for bitmaps (by OID) to a bitmapped_commit
structure (containing the bitmap itself, among other things like its XOR
offset, etc.)

This map was initialized at the end of `bitmap_writer_build()`. New
entries are added in `pack-bitmap-write.c::store_selected()`, which is
called by the bitmap_builder machinery (which is responsible for
traversing history and generating the actual bitmaps).

Reorganize when this field is initialized and when entries are added to
it so that we can quickly determine whether a commit is a candidate for
pseudo-merge selection, or not (since it was already selected to receive
a bitmap, and thus storing it in a pseudo-merge would be redundant).

The changes are as follows:

  - Introduce a new `bitmap_writer_init()` function which initializes
    the `writer.bitmaps` field (instead of waiting until the end of
    `bitmap_writer_build()`).

  - Add map entries in `push_bitmapped_commit()` (which is called via
    `bitmap_writer_select_commits()`) with OID keys and NULL values to
    track whether or not we *expect* to write a bitmap for some given
    commit.

  - Validate that a NULL entry is found matching the given key when we
    store a selected bitmap.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
10a96af8dc ewah: implement ewah_bitmap_is_subset()
In order to know whether a given pseudo-merge (comprised of a "parents"
and "objects" bitmaps) is "satisfied" and can be OR'd into the bitmap
result, we need to be able to quickly determine whether the "parents"
bitmap is a subset of the current set of objects reachable on either
side of a traversal.

Implement a helper function to prepare for that, which determines
whether an EWAH bitmap (the parents bitmap from the pseudo-merge) is a
subset of a non-EWAH bitmap (in this case, the results bitmap from
either side of the traversal).

This function makes use of the EWAH iterator to avoid inflating any part
of the EWAH bitmap after we determine it is not a subset of the non-EWAH
bitmap. This "fail-fast" allows us to avoid a potentially large amount
of wasted effort.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
2bfc24ecf6 Documentation/technical: describe pseudo-merge bitmaps format
Prepare to implement pseudo-merge bitmaps over the next several commits
by first describing the serialization format which will store the new
pseudo-merge bitmaps themselves.

This format is implemented as an optional extension within the bitmap v1
format, making it compatible with previous versions of Git, as well as
the original .bitmap implementation within JGit.

The format is described in detail in the patch contents below, but the
high-level description is as follows:

  - An array of pseudo-merge bitmaps, each containing a pair of EWAH
    bitmaps: one describing the set of pseudo-merge "parents", and
    another describing the set of object(s) reachable from those
    parents.

  - A lookup table to determine which pseudo-merge(s) a given commit
    appears in. An optional extended lookup table follows when there is
    at least one commit which appears in multiple pseudo-merge groups.

  - Trailing metadata, including the number of pseudo-merge(s), number
    of unique parents, the offset within the .bitmap file for the
    pseudo-merge commit lookup table, and the size of the optional
    extension itself.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
40864ac902 Documentation/gitpacking.txt: describe pseudo-merge bitmaps
Add some details to the gitpacking(7) manual page which motivate and
describe pseudo-merge bitmaps.

The exact on-disk format and many of the configuration knobs will be
described in subsequent commits.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
0074cc2994 Documentation/gitpacking.txt: initial commit
Introduce a new manual page, gitpacking(7) to collect useful information
about advanced packing concepts in Git.

In future commits in this series, this manual page will expand to
describe the new pseudo-merge bitmaps feature, as well as include
examples, relevant configuration bits, use-cases, and so on.

Outside of this series, this manual page may absorb similar pieces from
other parts of Git's documentation about packing.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
862f88cfaf Merge branch 'tb/pack-bitmap-write-cleanups' into tb/pseudo-merge-reachability-bitmap
* tb/pack-bitmap-write-cleanups:
  pack-bitmap: introduce `bitmap_writer_free()`
  pack-bitmap-write.c: avoid uninitialized 'write_as' field
  pack-bitmap: drop unused `max_bitmaps` parameter
  pack-bitmap: avoid use of static `bitmap_writer`
  pack-bitmap-write.c: move commit_positions into commit_pos fields
  object.h: add flags allocated by pack-bitmap.h
2024-05-24 11:40:34 -07:00
84ed505515 show_log: factor out interdiff/range-diff generation
The integration of "git range-diff" with "git format-patch" for a
single patch (i.e., not generating "range-diff" into the cover
letter) hooks into log-tree.c:show_log(), which is responsible for
writing the log message out and other stuff.  Essentially,
everything you see before the diffstat and the patch is generated
there.

Split out the code that spits out the interdiff/range-diff into a
separate helper function show_diff_of_diff().  Hopefully this will
make it easier to move things around in the output stream in the
future patches.

This is supposed to be a no-op refactoring.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-23 16:04:28 -07:00
b9cfe4845c The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-23 11:04:29 -07:00
8890b4f37e Merge branch 'mt/openindiana-portability'
Portability updates to various uses of grep and sed.

* mt/openindiana-portability:
  t/t9001-send-email.sh: sed - remove the i flag for s
  t/t9118-git-svn-funky-branch-names.sh: sed needs semicolon
  t/t1700-split-index.sh: mv -v is not portable
  t/t4202-log.sh: fix misspelled variable
  t/t0600-reffiles-backend.sh: rm -v is not portable
  t/t9902-completion.sh: backslashes in echo
  Switch grep from non-portable BRE to portable ERE
2024-05-23 11:04:29 -07:00
d365a27bf7 Merge branch 'dg/fetch-pack-code-cleanup'
Code clean-up to remove an unused struct definition.

* dg/fetch-pack-code-cleanup:
  fetch-pack: remove unused 'struct loose_object_iter'
2024-05-23 11:04:28 -07:00
daa00897d7 Merge branch 'dm/update-index-doc-fix'
Doc fix.

* dm/update-index-doc-fix:
  documentation: git-update-index: add --show-index-version to synopsis
2024-05-23 11:04:28 -07:00
d525723b99 Merge branch 'jc/patch-flow-updates'
Doc updates.

* jc/patch-flow-updates:
  SubmittingPatches: extend the "flow" section
  SubmittingPatches: move the patch-flow section earlier
2024-05-23 11:04:27 -07:00
86a49253a6 Merge branch 'it/refs-name-conflict'
Expose "name conflict" error when a ref creation fails due to D/F
conflict in the ref namespace, to improve an error message given by
"git fetch".

* it/refs-name-conflict:
  refs: return conflict error when checking packed refs
2024-05-23 11:04:27 -07:00
7593d66928 Merge branch 'la/hide-trailer-info'
The trailer API has been reshuffled a bit.

* la/hide-trailer-info:
  trailer unit tests: inspect iterator contents
  trailer: document parse_trailers() usage
  trailer: retire trailer_info_get() from API
  trailer: make trailer_info struct private
  trailer: make parse_trailers() return trailer_info pointer
  interpret-trailers: access trailer_info with new helpers
  sequencer: use the trailer iterator
  trailer: teach iterator about non-trailer lines
  trailer: add unit tests for trailer iterator
  Makefile: sort UNIT_TEST_PROGRAMS
2024-05-23 11:04:27 -07:00
939d49e9bd Merge branch 'kn/ref-transaction-symref' into kn/update-ref-symref
* kn/ref-transaction-symref:
  refs: remove `create_symref` and associated dead code
  refs: rename `refs_create_symref()` to `refs_update_symref()`
  refs: use transaction in `refs_create_symref()`
  refs: add support for transactional symref updates
  refs: move `original_update_refname` to 'refs.c'
  refs: support symrefs in 'reference-transaction' hook
  files-backend: extract out `create_symref_lock()`
  refs: accept symref values in `ref_transaction_update()`
2024-05-23 09:38:59 -07:00
0ff6d23a0f Merge branch 'ps/pseudo-ref-terminology' into ps/ref-storage-migration
* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-23 09:14:32 -07:00
e55f364398 Merge branch 'ps/refs-without-the-repository-updates' into ps/ref-storage-migration
* ps/refs-without-the-repository-updates:
  refs/packed: remove references to `the_hash_algo`
  refs/files: remove references to `the_hash_algo`
  refs/files: use correct repository
  refs: remove `dwim_log()`
  refs: drop `git_default_branch_name()`
  refs: pass repo when peeling objects
  refs: move object peeling into "object.c"
  refs: pass ref store when detecting dangling symrefs
  refs: convert iteration over replace refs to accept ref store
  refs: retrieve worktree ref stores via associated repository
  refs: refactor `resolve_gitlink_ref()` to accept a repository
  refs: pass repo when retrieving submodule ref store
  refs: track ref stores via strmap
  refs: implement releasing ref storages
  refs: rename `init_db` callback to avoid confusion
  refs: adjust names for `init` and `init_db` callbacks
2024-05-23 09:14:08 -07:00
1991703bdb Revert "fsck: warn about symlink pointing inside a gitdir"
This reverts commit a33fea08 (fsck: warn about symlink pointing
inside a gitdir, 2024-04-10), which warns against symbolic links
commonly created by git-annex.
2024-05-22 21:55:31 -07:00
407997c1dd setup: fix bug with "includeIf.onbranch" when initializing dir
It was reported that git-init(1) can fail when initializing an existing
directory in case the config contains an "includeIf.onbranch:"
condition:

    $ mkdir repo
    $ git -c includeIf.onbranch:main.path=nonexistent init repo
    BUG: refs.c:2056: reference backend is unknown

The same error can also be triggered when re-initializing an already
existing repository.

The bug has been introduced in 173761e21b (setup: start tracking ref
storage format, 2023-12-29), which wired up the ref storage format. The
root cause is in `init_db()`, which tries to read the config before we
have initialized `the_repository` and most importantly its ref storage
format. We eventually end up calling `include_by_branch()` and execute
`refs_resolve_ref_unsafe()`, but because we have not initialized the ref
storage format yet this will trigger the above bug.

Interestingly, `include_by_branch()` has a mechanism that will only
cause us to resolve the ref when `the_repository->gitdir` is set. This
is also the reason why this only happens when we initialize an already
existing directory or repository: `gitdir` is set in those cases, but
not when creating a new directory.

Now there are two ways to address the issue:

  - We can adapt `include_by_branch()` to also make the code conditional
    on whether `the_repository->ref_storage_format` is set.

  - We can shift around code such that we initialize the repository
    format before we read the config.

While the first approach would be safe, it may also cause us to paper
over issues where a ref store should have been set up. In our case for
example, it may be reasonable to expect that re-initializing the repo
will cause the "onbranch:" condition to trigger, but we would not do
that if the ref storage format was not set up yet. This also used to
work before the above commit that introduced this bug.

Rearrange the code such that we set up the repository format before
reading the config. This fixes the bug and ensures that "onbranch:"
conditions can trigger.

Reported-by: Heghedus Razvan <heghedus.razvan@protonmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Tested-by: Heghedus Razvan <heghedus.razvan@protonmail.com>
[jc: fixed a test and backported to v2.44.0 codebase]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-22 18:24:48 -07:00
d3f616a4e5 add-patch: enforce only one-letter response to prompts
In a "git add -p" session, especially when we are not using the
single-key mode, we may see 'qa' as a response to a prompt

  (1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?

and then just do the 'q' thing (i.e. quit the session), ignoring
everything other than the first byte.

If 'q' and 'a' are next to each other on the user's keyboard, there
is a plausible chance that we see 'qa' when the user who wanted to
say 'a' fat-fingered and we ended up doing the 'q' thing instead.

As we didn't think of a good reason during the review discussion why
we want to accept excess letters only to ignore them, it appears to
be a safe change to simply reject input that is longer than just one
byte.

The two exceptions are the 'g' command that takes a hunk number, and
the '/' command that takes a regular expression.  They have to be
accompanied by their operands (this makes me wonder how users who
set the interactive.singlekey configuration feed these operands---it
turns out that we notice there is no operand and give them another
chance to type the operand separately, without using single key
input this time), so we accept a string that is more than one byte
long.

Keep the "use only the first byte, downcased" behaviour when we ask
yes/no question, though.  Neither on Qwerty or on Dvorak, 'y' and
'n' are not close to each other.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-22 14:46:31 -07:00
6549c41ead push: don't fetch commit object when checking existence
If we're checking to see whether to tell the user to do a fetch
before pushing there's no need for us to actually fetch the object
from the remote if the clone is partial.

Because the promisor doesn't do negotiation actually trying to do
the fetch of the new head can be very expensive as it will try and
include history that we already have and it just results in rejecting
the push with a different message, and in behavior that is different
to a clone that is not partial.

Signed-off-by: Tom Hughes <tom@compton.nu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-22 13:46:08 -07:00
2e7e9205be chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
On Linux, ncores() computes the number of CPUs by counting the
"processor" or "CPU" lines emitted by /proc/cpuinfo. However, on some
platforms, /proc/cpuinfo does not enumerate the CPUs at all, but
instead merely mentions the total number of CPUs. In such cases, pluck
the CPU count directly from the /proc/cpuinfo line which reports the
number of active CPUs. (In particular, check for "cpus active: NN" and
"ncpus active: NN" since both variants have been seen in the
wild[1,2].)

[1]: https://lore.kernel.org/git/503a99f3511559722a3eeef15d31027dfe617fa1.camel@physik.fu-berlin.de/
[2]: https://lore.kernel.org/git/7acbd5c6c68bd7ba020e2d1cc457a8954fd6edf4.camel@physik.fu-berlin.de/

Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-22 11:58:56 -07:00
45db5ed3b2 chainlint.pl: fix incorrect CPU count on Linux SPARC
On SPARC systems running Linux, individual processors are denoted with
"CPUnn:" in /proc/cpuinfo instead of the usual "processor : NN". As a
result, the regexp in ncores() matches 0 times. Address this shortcoming
by extending the regexp to also match lines with "CPUnn:".

Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
[es: simplified regexp; tweaked commit message]
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Tested-by: John Paul Adrian Glaubitz  <glaubitz@physik.fu-berlin.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-22 11:58:40 -07:00
35dfccb2b4 Revert "Add a helper function to compare file contents"
Now that during a `git clone`, the hooks' contents are no longer
compared to the templates' files', the caller for which the
`do_files_match()` function was introduced is gone, and therefore this
function can be retired, too.

This reverts commit 584de0b4c2 (Add a helper function to compare file
contents, 2024-03-30).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
873a466ea3 clone: drop the protections where hooks aren't run
As part of the security bug-fix releases v2.39.4, ..., v2.45.1, I
introduced logic to safeguard `git clone` from running hooks that were
installed _during_ the clone operation.

The rationale was that Git's CVE-2024-32002, CVE-2021-21300,
CVE-2019-1354, CVE-2019-1353, CVE-2019-1352, and CVE-2019-1349 should
have been low-severity vulnerabilities but were elevated to
critical/high severity by the attack vector that allows a weakness where
files inside `.git/` can be inadvertently written during a `git clone`
to escalate to a Remote Code Execution attack by virtue of installing a
malicious `post-checkout` hook that Git will then run at the end of the
operation without giving the user a chance to see what code is executed.

Unfortunately, Git LFS uses a similar strategy to install its own
`post-checkout` hook during a `git clone`; In fact, Git LFS is
installing four separate hooks while running the `smudge` filter.

While this pattern is probably in want of being improved by introducing
better support in Git for Git LFS and other tools wishing to register
hooks to be run at various stages of Git's commands, let's undo the
clone protections to unbreak Git LFS-enabled clones.

This reverts commit 8db1e8743c (clone: prevent hooks from running
during a clone, 2024-03-28).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
c8f64781c8 tests: verify that clone -c core.hooksPath=/dev/null works again
As part of the protections added in Git v2.45.1 and friends,
repository-local `core.hooksPath` settings are no longer allowed, as a
defense-in-depth mechanism to prevent future Git vulnerabilities to
raise to critical level if those vulnerabilities inadvertently allow the
repository-local config to be written.

What the added protection did not anticipate is that such a
repository-local `core.hooksPath` can not only be used to point to
maliciously-placed scripts in the current worktree, but also to
_prevent_ hooks from being called altogether.

We just reverted the `core.hooksPath` protections, based on the Git
maintainer's recommendation in
https://lore.kernel.org/git/xmqq4jaxvm8z.fsf@gitster.g/ to address this
concern as well as related ones. Let's make sure that we won't regress
while trying to protect the clone operation further.

Reported-by: Brooke Kuhlmann <brooke@alchemists.io>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
75631a3cd8 Revert "core.hooksPath: add some protection while cloning"
This defense-in-depth was intended to protect the clone operation
against future escalations where bugs in `git clone` would allow
attackers to write arbitrary files in the `.git/` directory would allow
for Remote Code Execution attacks via maliciously-placed hooks.

However, it turns out that the `core.hooksPath` protection has
unintentional side effects so severe that they do not justify the
benefit of the protections. For example, it has been reported in
https://lore.kernel.org/git/FAFA34CB-9732-4A0A-87FB-BDB272E6AEE8@alchemists.io/
that the following invocation, which is intended to make `git clone`
safer, is itself broken by that protective measure:

	git clone --config core.hooksPath=/dev/null <url>

Since it turns out that the benefit does not justify the cost, let's revert
20f3588efc (core.hooksPath: add some protection while cloning,
2024-03-30).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
197a772c48 init: use the correct path of the templates directory again
In df93e407f0 (init: refactor the template directory discovery into its
own function, 2024-03-29), I refactored the way the templates directory
is discovered.

The refactoring was faithful, but missed a reference in the `Makefile`
where the `DEFAULT_GIT_TEMPLATE_DIR` constant is defined. As a
consequence, Git v2.45.1 and friends will always use the hard-coded path
`/usr/share/git-core/templates`.

Let's fix that by defining the `DEFAULT_GIT_TEMPLATE_DIR` when building
`setup.o`, where that constant is actually used.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
ee052533bb hook: plug a new memory leak
In 8db1e8743c (clone: prevent hooks from running during a clone,
2024-03-28), I introduced an inadvertent memory leak that was
unfortunately not caught before v2.45.1 was released. Here is a fix.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
d17d18f85a ci: stop installing "gcc-13" for osx-gcc
Our osx-gcc job explicitly asks to install gcc-13. But since the GitHub
runner image already comes with gcc-13 installed, this is mostly doing
nothing (or in some cases it may install an incremental update over the
runner image). But worse, it recently started causing errors like:

    ==> Fetching gcc@13
    ==> Downloading https://ghcr.io/v2/homebrew/core/gcc/13/blobs/sha256:fb2403d97e2ce67eb441b54557cfb61980830f3ba26d4c5a1fe5ecd0c9730d1a
    ==> Pouring gcc@13--13.2.0.ventura.bottle.tar.gz
    Error: The `brew link` step did not complete successfully
    The formula built, but is not symlinked into /usr/local
    Could not symlink bin/c++-13
    Target /usr/local/bin/c++-13
    is a symlink belonging to gcc. You can unlink it:
      brew unlink gcc

which cause the whole CI job to bail.

I didn't track down the root cause, but I suspect it may be related to
homebrew recently switching the "gcc" default to gcc-14. And it may even
be fixed when a new runner image is released. But if we don't need to
run brew at all, it's one less thing for us to worry about.

[jc: cherry-picked from v2.45.0-3-g7df2405b38]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
93ec0a7cbf ci: avoid bare "gcc" for osx-gcc job
On macOS, a bare "gcc" (without a version) will invoke a wrapper for
clang, not actual gcc. Even when gcc is installed via homebrew, that
only provides version-specific links in /usr/local/bin (like "gcc-13"),
and never a version-agnostic "gcc" wrapper.

As far as I can tell, this has been the case for a long time, and this
osx-gcc job has largely been doing nothing. We can point it at "gcc-13",
which will pick up the homebrew-installed version.

The fix here is specific to the github workflow file, as the gitlab one
does not have a matching job.

It's a little unfortunate that we cannot just ask for the latest version
of gcc which homebrew provides, but as far as I can tell there is no
easy alias (you'd have to find the highest number gcc-* in
/usr/local/bin yourself).

[jc: cherry-picked from v2.45.0-2-g11c7001e3d]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
b6b9fafacb ci: drop mention of BREW_INSTALL_PACKAGES variable
The last user of this variable went away in 4a6e4b9602 (CI: remove
Travis CI support, 2021-11-23), so it's doing nothing except making it
more confusing to find out which packages _are_ installed.

[jc: cherry-picked from v2.45.0-1-g9d4453e8d6]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
d11c51eec8 send-email: avoid creating more than one Term::ReadLine object
Every time git-send-email calls its ask() function to prompt the user,
we call term(), which instantiates a new Term::ReadLine object. But in
v1.46 of Term::ReadLine::Gnu (which provides the Term::ReadLine
interface on some platforms), its constructor refuses to create a second
instance[1]. So on systems with that version of the module, most
git-send-email instances will fail (as we usually prompt for both "to"
and "in-reply-to" unless the user provided them on the command line).

We can fix this by keeping a single instance variable and returning it
for each call to term(). In perl 5.10 and up, we could do that with a
"state" variable. But since we only require 5.008, we'll do it the
old-fashioned way, with a lexical "my" in its own scope.

Note that the tests in t9001 detect this problem as-is, since the
failure mode is for the program to die. But let's also beef up the
"Prompting works" test to check that it correctly handles multiple
inputs (if we had chosen to keep our FakeTerm hack in the previous
commit, then the failure mode would be incorrectly ignoring prompts
after the first).

[1] For discussion of why multiple instances are forbidden, see:
    https://github.com/hirooih/perl-trg/issues/16

[jc: cherry-picked from v2.42.0-rc2~6^2]

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:07 -07:00
fde2b4b9bc send-email: drop FakeTerm hack
Back in 280242d1cc (send-email: do not barf when Term::ReadLine does not
like your terminal, 2006-07-02), we added a fallback for when
Term::ReadLine's constructor failed: we'd have a FakeTerm object
instead, which would then die if anybody actually tried to call
readline() on it. Since we instantiated the $term variable at program
startup, we needed this workaround to let the program run in modes when
we did not prompt the user.

But later, in f4dc9432fd (send-email: lazily load modules for a big
speedup, 2021-05-28), we started loading Term::ReadLine lazily only when
ask() is called. So at that point we know we're trying to prompt the
user, and we can just die if ReadLine instantiation fails, rather than
making this fake object to lazily delay showing the error.

This should be OK even if there is no tty (e.g., we're in a cron job),
because Term::ReadLine will return a stub object in that case whose "IN"
and "OUT" functions return undef. And since 5906f54e47 (send-email:
don't attempt to prompt if tty is closed, 2009-03-31), we check for that
case and skip prompting.

And we can be sure that FakeTerm was not kicking in for such a
situation, because it has actually been broken since that commit! It
does not define "IN" or "OUT" methods, so perl would barf with an error.
If FakeTerm was in use, we were neither honoring what 5906f54e47 tried
to do, nor producing the readable message that 280242d1cc intended.

So we're better off just dropping FakeTerm entirely, and letting the
error reported by constructing Term::ReadLine through.

[jc: cherry-picked from v2.42.0-rc2~6^2~1]

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:07 -07:00
4674ab682d apply: fix uninitialized hash function
"git apply" can work outside a repository as a better "GNU patch",
but when it does so, it still assumed that it can access
the_hash_algo, which is no longer true in the new world order.

Make sure we explicitly fall back to SHA-1 algorithm for backward
compatibility.

It is of dubious value to make this configurable to other hash
algorithms, as the code does not use the_hash_algo for hashing
purposes when working outside a repository (which is how
the_hash_algo is left to NULL)---it is only used to learn the max
length of the hash when parsing the object names on the "index"
line, but failing to parse the "index" line is not a hard failure,
and the program does not support operations like applying binary
patches and --3way fallback that requires object access outside a
repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:07:48 -07:00
8d058b8024 builtin/hash-object: fix uninitialized hash function
The git-hash-object(1) command allows users to hash an object even
without a repository. Starting with c8aed5e8da (repository: stop setting
SHA1 as the default object hash, 2024-05-07), this will make us hit an
uninitialized hash function, which subsequently leads to a segfault.

Fix this by falling back to SHA-1 explicitly when running outside of
a Git repository. Users can use GIT_DEFAULT_HASH environment to
specify what hash algorithm they want, so arguably this code should
not be needed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:05:13 -07:00
4a1c95931f builtin/patch-id: fix uninitialized hash function
In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have adapted `initialize_repository()` to no longer set
up a default hash function. As this function is also used to set up
`the_repository`, the consequence is that `the_hash_algo` will now by
default be a `NULL` pointer unless the hash algorithm was configured
properly. This is done as a mechanism to detect cases where we may be
using the wrong hash function by accident.

This change now causes git-patch-id(1) to segfault when it's run outside
of a repository. As this command can read diffs from stdin, it does not
necessarily need a repository, but then relies on `the_hash_algo` to
compute the patch ID itself.

It is somewhat dubious that git-patch-id(1) relies on `the_hash_algo` in
the first place. Quoting its manpage:

    A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with line numbers ignored. As such, it’s
    "reasonably stable", but at the same time also reasonably unique,
    i.e., two patches that have the same "patch ID" are almost
    guaranteed to be the same thing.

We explicitly document patch IDs to be using SHA-1. Furthermore, patch
IDs are supposed to be stable for most of the part. But even with the
same input, the patch IDs will now be different depending on the repo's
configured object hash.

Work around the issue by setting up SHA-1 when there was no startup
repository for now. This is arguably not the correct fix, but for now we
rather want to focus on getting the segfault fixed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:05:13 -07:00
abece6e970 t1517: test commands that are designed to be run outside repository
A few commands, like "git apply" and "git patch-id", have been
broken with a recent change to stop setting the default hash
algorithm to SHA-1.  Test them and fix them in later commits.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:05:13 -07:00
d3b2ff75fd setup: add an escape hatch for "no more default hash algorithm" change
Partially revert c8aed5e8 (repository: stop setting SHA1 as the
default object hash, 2024-05-07), to keep end-user systems still
broken when we have gap in our test coverage but yet give them an
escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment
variable to "sha1" in order to revert to the previous behaviour, in
case we haven't done a thorough job in fixing the fallout from
c8aed5e8.  After we build confidence, we should remove the escape
hatch support, but we are not there yet after only fixing three
commands (hash-object, apply, and patch-id) in this series.

Due to the way the end-user facing GIT_DEFAULT_HASH environment
variable is used in our test suite, we unfortunately cannot reuse it
for this purpose.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:04:12 -07:00
4d00d948ff t/: port helper/test-strcmp-offset.c to unit-tests/t-strcmp-offset.c
In the recent codebase update (8bf6fbd (Merge branch
'js/doc-unit-tests', 2023-12-09)), a new unit testing framework was
merged, providing a standardized approach for testing C code. Prior to
this update, some unit tests relied on the test helper mechanism,
lacking a dedicated unit testing framework. It's more natural to perform
these unit tests using the new unit test framework.

Let's migrate the unit tests for strcmp-offset functionality from the
legacy approach using the test-tool command `test-tool strcmp-offset` in
helper/test-strcmp-offset.c to the new unit testing framework
(t/unit-tests/test-lib.h).

The migration involves refactoring the tests to utilize the testing
macros provided by the framework (TEST() and check_*()).

Helped-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-authored-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-20 13:39:49 -07:00
037348e99a chainlint.pl: make CPU count computation more robust
There have been reports[1,2] of chainlint.pl failing to produce output
when output is expected. In fact, the underlying problem is more severe:
in these cases, it isn't doing any work at all, thus not checking Git
tests for semantic problems. In the reported cases, the problem was
tracked down to ncores() returning 0 for the CPU count, which resulted
in chainlint.pl not performing any work (since it thought it had no
cores on which to process).

In the reported cases, the reason for the failure was that the regular
expression counting the number of processors reported by /proc/cpuinfo
failed to find any matches, hence it counted 0 processors. Although
fixing each case as it is reported allows chaining.pl to work correctly
on that architecture, it does nothing to improve the overall robustness
of the core count computation which may still return 0 on some yet
untested architecture.

Address this shortcoming by ensuring that ncores() returns a sensible
fallback value in all cases.

[1]: https://lore.kernel.org/git/pull.1385.git.git.1669148861635.gitgitgadget@gmail.com/
[2]: https://lore.kernel.org/git/8baa12f8d044265f1ddeabd64209e7ac0d3700ae.camel@physik.fu-berlin.de/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-20 12:36:41 -07:00
4365c6fcf9 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-20 11:20:05 -07:00
55f5476ce5 Merge branch 'jc/compat-regex-calloc-fix'
Windows CI running in GitHub Actions started complaining about the
order of arguments given to calloc(); the imported regex code uses
the wrong order almost consistently, which has been corrected.

* jc/compat-regex-calloc-fix:
  compat/regex: fix argument order to calloc(3)
2024-05-20 11:20:05 -07:00
4beb7a3b06 Merge branch 'kn/ref-transaction-symref'
Updates to symbolic refs can now be made as a part of ref
transaction.

* kn/ref-transaction-symref:
  refs: remove `create_symref` and associated dead code
  refs: rename `refs_create_symref()` to `refs_update_symref()`
  refs: use transaction in `refs_create_symref()`
  refs: add support for transactional symref updates
  refs: move `original_update_refname` to 'refs.c'
  refs: support symrefs in 'reference-transaction' hook
  files-backend: extract out `create_symref_lock()`
  refs: accept symref values in `ref_transaction_update()`
2024-05-20 11:20:04 -07:00
c82df70818 doc: describe the project's decision-making process
The Git project currently operates according to an informal
consensus-building process, which is currently described in the
SubmittingPatches document. However, that focuses on small/medium-scale
patch series. For larger-scale decisions, the process is not as well
described. Document what to expect so that we have something concrete to
help inform newcomers to the project.

This document explicitly does not aim to impose a formal process to
decision-making, nor to change pre-existing norms. Its only aim is to
describe how the project currently operates today.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 13:53:04 -07:00
72b8c934f2 scalar: make enlistment delete to work on all POSIX platforms
The ability to remove the current working directory is not guaranteed by
POSIX so it is better to go out of the directory we want to delete on
all platforms unconditionally.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:16:25 -07:00
bac28a942a t/t9001-send-email.sh: sed - remove the i flag for s
The 'i' flag for the 's' command of sed is not specified by POSIX so
it is not portable.  Replace its usage by different and portable
syntax.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:15:52 -07:00
22c22d30d3 t/t9118-git-svn-funky-branch-names.sh: sed needs semicolon
POSIX specifies that all editing commands between braces shall be
terminated by a <newline> or <semicolon>.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:15:51 -07:00
50acb48359 t/t1700-split-index.sh: mv -v is not portable
The -v option for mv is not specified by POSIX.  The illumos
implementation of mv does not support -v.  Since we do not need the
verbose mv output we just drop -v for mv.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:15:51 -07:00
05e5ff035f t/t4202-log.sh: fix misspelled variable
The GPGSSH_GOOD_SIGNATURE_TRUSTED variable was spelled as
GOOD_SIGNATURE_TRUSTED and so the grep was used the null RE that
matches everything.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:13:37 -07:00
ce09c692cd t/t0600-reffiles-backend.sh: rm -v is not portable
The -v option for rm is not specified by POSIX.  The illumos
implementation of rm does not support -v.  Since we do not need the
verbose rm output we just drop -v for rm.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:13:28 -07:00
ba1dec3257 t/t9902-completion.sh: backslashes in echo
The usage of backslashes in echo is not portable.  Since some tests
tries to output strings containing '\b' it is safer to use printf
here.  The usage of printf instead of echo is also preferred by POSIX.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:13:26 -07:00
0f063b6c76 Switch grep from non-portable BRE to portable ERE
This makes the grep usage fully POSIX compliant.  The ability to
enable ERE features in BRE using backslash is a GNU extension.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 12:13:18 -07:00
4986662cbc diff: document what --name-only shows
The "--name-only" option is about showing the name of each file in
the post-image tree that got changed and nothing else (like "was it
created?").  Unlike the "--name-status" option that tells how the
change happened (e.g., renamed with similarity), it does not give
anything else, like the name of the corresponding file in the old
tree.

For example, if you start from a clean checkout that has a file
whose name is COPYING, here is what you would see:

    $ git mv COPYING RENAMING
    $ git diff -M --name-only HEAD
    RENAMING
    $ git diff -M --name-status HEAD
    R100	COPYING	RENAMING

Lack of the description of this fact has confused readers in the
past.  Even back when dda2d79a ([PATCH] Clean up diff option
descriptions., 2005-07-13) documented "--name-only", "git diff"
already supported the renames, so in a sense, from day one, this
should have been documented more clearly but it wasn't.

Belatedly clarify it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 11:03:08 -07:00
558a5b8cd0 SubmittingPatches: advertise git-manpages-l10n project a bit
The project takes our AsciiDoc sources of documentation and actively
maintains the translations to various languages.

Let's give them enhanced visibility to help those who want to
volunteer find them.

Acked-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:35:58 -07:00
00892786b8 refs/packed: remove references to the_hash_algo
Remove references to `the_hash_algo` in favor of the hash algo specified
by the repository associated with the packed ref store.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:40 -07:00
c1026b9d7d refs/files: remove references to the_hash_algo
Remove references to `the_hash_algo` in favor of the hash algo specified
by the repository associated with the files ref store.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:40 -07:00
c9e9723e1f refs/files: use correct repository
There are several places in the "files" backend where we use
`the_repository` instead of the repository associated with the ref store
itself. Adapt those to use the correct repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:40 -07:00
2bb444b196 refs: remove dwim_log()
Remove `dwim_log()` in favor of `repo_dwim_log()` so that we can get rid
of one more dependency on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
97abaab5f6 refs: drop git_default_branch_name()
The `git_default_branch_name()` function is a thin wrapper around
`repo_default_branch_name()` with two differences:

  - We implicitly rely on `the_repository`.

  - We cache the default branch name.

None of the callsites of `git_default_branch_name()` are hot code paths
though, so the caching of the branch name is not really required.

Refactor the callsites to use `repo_default_branch_name()` instead and
drop `git_default_branch_name()`, thus getting rid of one more case
where we rely on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
30aaff437f refs: pass repo when peeling objects
Both `peel_object()` and `peel_iterated_oid()` implicitly rely on
`the_repository` to look up objects. Despite the fact that we want to
get rid of `the_repository`, it also leads to some restrictions in our
ref iterators when trying to retrieve the peeled value for a repository
other than `the_repository`.

Refactor these functions such that both take a repository as argument
and remove the now-unnecessary restrictions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
19c76e8235 refs: move object peeling into "object.c"
Peeling an object has nothing to do with refs, but we still have the
code in "refs.c". Move it over into "object.c", which is a more natural
place to put it.

Ideally, we'd also move `peel_iterated_oid()` over into "object.c". But
this function is tied to the refs interfaces because it uses a global
ref iterator variable to optimize peeling when the iterator already has
the peeled object ID readily available.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
330a2ae60b refs: pass ref store when detecting dangling symrefs
Both `warn_dangling_symref()` and `warn_dangling_symrefs()` derive the
ref store via `the_repository`. Adapt them to instead take in the ref
store as a parameter. While at it, rename the functions to have a `ref_`
prefix to align them with other functions that take a ref store.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
8378c9d27b refs: convert iteration over replace refs to accept ref store
The function `for_each_replace_ref()` is a bit of an oddball across the
refs interfaces as it accepts a pointer to the repository instead of a
pointer to the ref store. The only reason for us to accept a repository
is so that we can eventually pass it back to the callback function that
the caller has provided. This is somewhat arbitrary though, as callers
that need the repository can instead make it accessible via the callback
payload.

Refactor the function to instead accept the ref store and adjust callers
accordingly. This allows us to get rid of some of the boilerplate that
we had to carry to pass along the repository and brings us in line with
the other functions that iterate through refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
dc7fb4f72c refs: retrieve worktree ref stores via associated repository
Similar as with the preceding commit, the worktree ref stores are always
looked up via `the_repository`. Also, again, those ref stores are stored
in a global map.

Refactor the code so that worktrees have a pointer to their repository.
Like this, we can move the global map into `struct repository` and stop
using `the_repository`. With this change, we can now in theory look up
worktree ref stores for repositories other than `the_repository`. In
practice, the worktree code will need further changes to look up
arbitrary worktrees.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
e19488a60a refs: refactor resolve_gitlink_ref() to accept a repository
In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to
look up the submodule ref store. Now that we can look up submodule ref
stores for arbitrary repositories we can improve this function to
instead accept a repository as parameter for which we want to resolve
the gitlink.

Do so and adjust callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
965f8991e5 refs: pass repo when retrieving submodule ref store
Looking up submodule ref stores has two deficiencies:

  - The initialized subrepo will be attributed to `the_repository`.

  - The submodule ref store will be tracked in a global map.

This makes it impossible to have submodule ref stores for a repository
other than `the_repository`.

Modify the function to accept the parent repository as parameter and
move the global map into `struct repository`. Like this it becomes
possible to look up submodule ref stores for arbitrary repositories.

Note that this also adds a new reference to `the_repository` in
`resolve_gitlink_ref()`, which is part of the refs interfaces. This will
get adjusted in the next patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:37 -07:00
f1782d185b refs: track ref stores via strmap
The refs code has two global maps that track the submodule and worktree
ref stores. Even though both of these maps track values by strings, we
still use a `struct hashmap` instead of a `struct strmap`. This has the
benefit of saving us an allocation because we can combine key and value
in a single struct. But it does introduce significant complexity that is
completely unneeded.

Refactor the code to use `struct strmap`s instead to reduce complexity.
It's unlikely that this will have any real-world impact on performance
given that most repositories likely won't have all that many ref stores.
Furthermore, this refactoring allows us to de-globalize those maps and
move them into `struct repository` in a subsequent commit more easily.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:37 -07:00
71c871b48d refs: implement releasing ref storages
Ref storages are typically only initialized once for `the_repository`
and then never released. Until now we got away with that without causing
memory leaks because `the_repository` stays reachable, and because the
ref backend is reachable via `the_repository` its memory basically never
leaks.

This is about to change though because of the upcoming migration logic,
which will create a secondary ref storage. In that case, we will either
have to release the old or new ref storage to avoid leaks.

Implement a new `release` callback and expose it via a new
`ref_storage_release()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:37 -07:00
ed93ea1602 refs: rename init_db callback to avoid confusion
Reference backends have two callbacks `init` and `init_db`. The
similarity of these two callbacks has repeatedly confused me whenever I
was looking at them, where I always had to look up which of them does
what.

Rename the `init_db` callback to `create_on_disk`, which should
hopefully be clearer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:36 -07:00
1febabff7a refs: adjust names for init and init_db callbacks
The names of the functions that implement the `init` and `init_db`
callbacks in the "files" and "packed" backends do not match the names of
the callbacks, which is inconsistent. Rename them so that they match,
which makes it easier to discover their respective implementations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:36 -07:00
c397ddffc3 SubmittingPatches: add section for iterating patches
Add a section to explain how to work around other in-flight patches and
how to navigate conflicts which arise as a series is being iterated.
This provides the necessary steps that users can follow to reduce
friction with other ongoing topics and also provides guidelines on how
the users can also communicate this to the list efficiently.

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:31:45 -07:00
43e073bdb0 Merge branch 'jc/patch-flow-updates' into kn/patch-iteration-doc
* jc/patch-flow-updates:
  SubmittingPatches: extend the "flow" section
  SubmittingPatches: move the patch-flow section earlier
2024-05-17 10:31:38 -07:00
5dd5007f89 completion: adapt git-config(1) to complete subcommands
With fe3ccc7aab (Merge branch 'ps/config-subcommands', 2024-05-15),
git-config(1) has gained support for subcommands. These subcommands live
next to the old, action-based mode, so that both the old and new way
continue to work.

The manpage for this command has been updated to prominently show the
subcommands, and the action-based modes are marked as deprecated. Update
Bash completion scripts accordingly to advertise subcommands instead of
actions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 09:26:19 -07:00
22f13e0414 t0017: clarify dubious test set-up
1ff750b1 (tests: make GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21)
added this test, in which "test-tool -C" is fed a name of a
directory that does not exist, and expects that it dies because of a
failure to read the configuration file(s), because the configuration
setting is screwed up to contain mutual inclusion loop, before it
notices that the directory to chdir into does not exist and dies.

It is of dubious value to etch the current order of events, i.e.,
the configuration needs to be read that early (for initializing
trace2 subsystem) before we even notice the lack of the directory
and have a chance to fail, into stone.  Indeed, if you completely
compile out trace2 subsystem so that it does not even attempt to
read the configuration that early, we would die with a different
error message (i.e. "unable to chdir to 'cycle'") and this test will
fail.

At least give a bogus argument to "test-tool -C" a name that is
clearly bogus to make sure we can more easily see what is going on
with plenty of comments.

We may want to remove this test altogether, instead, though.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-16 10:29:24 -07:00
d8ab1d464d The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-16 10:11:24 -07:00
bca900904d Merge branch 'ps/refs-without-the-repository'
The refs API lost functions that implicitly assumes to work on the
primary ref_store by forcing the callers to pass a ref_store as an
argument.

* ps/refs-without-the-repository:
  refs: remove functions without ref store
  cocci: apply rules to rewrite callers of "refs" interfaces
  cocci: introduce rules to transform "refs" to pass ref store
  refs: add `exclude_patterns` parameter to `for_each_fullref_in()`
  refs: introduce missing functions that accept a `struct ref_store`
2024-05-16 10:10:14 -07:00
f0e2183768 Merge branch 'jl/git-no-advice'
A new global "--no-advice" option can be used to disable all advice
messages, which is meant to be used only in scripts.

* jl/git-no-advice:
  t0018: two small fixes
  advice: add --no-advice global option
  doc: add spacing around paginate options
  doc: clean up usage documentation for --no-* opts
2024-05-16 10:10:13 -07:00
db271e7bb6 Merge branch 'rs/external-diff-with-exit-code'
* rs/external-diff-with-exit-code:
  Revert "diff: fix --exit-code with external diff"
2024-05-16 10:09:23 -07:00
e37423f081 Revert "diff: fix --exit-code with external diff"
This reverts commit 11be65cfa4, per
original author's request to come up with a better strategy.
2024-05-16 10:08:35 -07:00
46536278a8 Merge branch 'ps/refs-without-the-repository' into ps/refs-without-the-repository-updates
* ps/refs-without-the-repository:
  refs: remove functions without ref store
  cocci: apply rules to rewrite callers of "refs" interfaces
  cocci: introduce rules to transform "refs" to pass ref store
  refs: add `exclude_patterns` parameter to `for_each_fullref_in()`
  refs: introduce missing functions that accept a `struct ref_store`
2024-05-16 09:48:46 -07:00
7150f140f9 t/t0211-trace2-perf.sh: fix typo patern -> pattern
The bug went unnoticed because grep with null RE matches everything.

Signed-off-by: Marcel Telka <marcel@telka.sk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-16 09:34:39 -07:00
e1ab45b2da osxkeychain: state to skip unnecessary store operations
git passes a credential that has been used successfully to the helpers
to record. If a credential is already stored,
"git-credential-osxkeychain store" just records the credential returned
by "git-credential-osxkeychain get", and unnecessary (sometimes
problematic) SecItemAdd() and/or SecItemUpdate() are performed.

We can skip such unnecessary operations by marking a credential returned
by "git-credential-osxkeychain get". This marking can be done by
utilizing the "state[]" feature:

- The "get" command sets the field "state[]=osxkeychain:seen=1".

- The "store" command skips its actual operation if the field
  "state[]=osxkeychain:seen=1" exists.

Introduce a new state "state[]=osxkeychain:seen=1".

Suggested-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 14:02:45 -07:00
fcf5b74e59 osxkeychain: exclusive lock to serialize execution of operations
git passes a credential that has been used successfully to the helpers
to record. If "git-credential-osxkeychain store" commands run in
parallel (with fetch.parallel configuration and/or by running multiple
git commands simultaneously), some of them may exit with the error
"failed to store: -25299". This is because SecItemUpdate() in
add_internet_password() may return errSecDuplicateItem (-25299) in this
situation. Apple's documentation [1] also states as below:

  In macOS, some of the functions of this API block while waiting for
  input from the user (for example, when the user is asked to unlock a
  keychain or give permission to change trust settings). In general, it
  is safe to use this API in threads other than your main thread, but
  avoid calling the functions from multiple operations, work queues, or
  threads concurrently. Instead, serialize function calls or confine
  them to a single thread.

The error has not been noticed before, because the former implementation
ignored the error.

Introduce an exclusive lock to serialize execution of operations.

[1] https://developer.apple.com/documentation/security/certificate_key_and_trust_services/working_with_concurrency

Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 14:02:44 -07:00
19fe900cfc The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 09:52:55 -07:00
1e00d22ec5 Merge branch 'ds/scalar-reconfigure-all-fix'
Scalar fix.

* ds/scalar-reconfigure-all-fix:
  scalar: avoid segfault in reconfigure --all
2024-05-15 09:52:55 -07:00
754ae50219 Merge branch 'vd/doc-merge-tree-x-option'
Doc update.

* vd/doc-merge-tree-x-option:
  Documentation/git-merge-tree.txt: document -X
2024-05-15 09:52:55 -07:00
068df18c90 Merge branch 'rs/external-diff-with-exit-code'
The "--exit-code" option of "git diff" command learned to work with
the "--ext-diff" option.

* rs/external-diff-with-exit-code:
  diff: fix --exit-code with external diff
  diff: report unmerged paths as changes in run_diff_cmd()
2024-05-15 09:52:54 -07:00
3fc99d037f Merge branch 'jt/port-ci-whitespace-check-to-gitlab'
The "whitespace check" task that was enabled for GitHub Actions CI
has been ported to GitLab CI.

* jt/port-ci-whitespace-check-to-gitlab:
  gitlab-ci: add whitespace error check
  ci: make the whitespace report optional
  ci: separate whitespace check script
  github-ci: fix link to whitespace error
  ci: pre-collapse GitLab CI sections
2024-05-15 09:52:54 -07:00
60521f6043 Merge branch 'ow/refspec-glossary-update'
Doc update.

* ow/refspec-glossary-update:
  Documentation: Mention that refspecs are explained elsewhere
2024-05-15 09:52:53 -07:00
f9d4eaf86c Merge branch 'jp/tag-trailer'
"git tag" learned the "--trailer" option to futz with the trailers
in the same way as "git commit" does.

* jp/tag-trailer:
  builtin/tag: add --trailer option
  builtin/commit: refactor --trailer logic
  builtin/commit: use ARGV macro to collect trailers
2024-05-15 09:52:53 -07:00
fe3ccc7aab Merge branch 'ps/config-subcommands'
The operation mode options (like "--get") the "git config" command
uses have been deprecated and replaced with subcommands (like "git
config get").

* ps/config-subcommands:
  builtin/config: display subcommand help
  builtin/config: introduce "edit" subcommand
  builtin/config: introduce "remove-section" subcommand
  builtin/config: introduce "rename-section" subcommand
  builtin/config: introduce "unset" subcommand
  builtin/config: introduce "set" subcommand
  builtin/config: introduce "get" subcommand
  builtin/config: introduce "list" subcommand
  builtin/config: pull out function to handle `--null`
  builtin/config: pull out function to handle config location
  builtin/config: use `OPT_CMDMODE()` to specify modes
  builtin/config: move "fixed-value" option to correct group
  builtin/config: move option array around
  config: clarify memory ownership when preparing comment strings
2024-05-15 09:52:53 -07:00
b7a1d47ba5 Merge branch 'js/unit-test-suite-runner'
The "test-tool" has been taught to run testsuite tests in parallel,
bypassing the need to use the "prove" tool.

* js/unit-test-suite-runner:
  cmake: let `test-tool` run the unit tests, too
  ci: use test-tool as unit test runner on Windows
  t/Makefile: run unit tests alongside shell tests
  unit tests: add rule for running with test-tool
  test-tool run-command testsuite: support unit tests
  test-tool run-command testsuite: remove hardcoded filter
  test-tool run-command testsuite: get shell from env
  t0080: turn t-basic unit test into a helper
2024-05-15 09:52:52 -07:00
8e4f5c2dc2 refs: refuse to write pseudorefs
Pseudorefs are not stored in the ref database as by definition, they
carry additional metadata that essentially makes them not a ref. As
such, writing pseudorefs via the ref backend does not make any sense
whatsoever as the ref backend wouldn't know how exactly to store the
data.

Restrict writing pseudorefs via the ref backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
f1701f279a ref-filter: properly distinuish pseudo and root refs
The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
993d57eded refs: pseudorefs are no refs
The `is_root_ref()` function will happily clarify a pseudoref as a root
ref, even though pseudorefs are no refs. Next to being wrong, it also
leads to inconsistent behaviour across ref backends: while the "files"
backend accidentally knows to parse those pseudorefs and thus yields
them to the caller, the "reftable" backend won't ever see the pseudoref
at all because they are never stored in the "reftable" backend.

Fix this issue by filtering out pseudorefs in `is_root_ref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
31951c2248 refs: classify HEAD as a root ref
Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.

Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:

  - `ref_kind_from_refname()` already handles "HEAD" explicitly before
    calling `is_root_ref()`.

  - The "files" and "reftable" backends explicitly call both
    `is_root_ref()` and `is_headref()` together.

This also aligns behaviour or `is_root_ref()` and `is_headref()` such
that we stop checking for ref existence. This changes semantics for our
backends:

  - In the reftable backend we already know that the ref must exist
    because `is_headref()` is called as part of the ref iterator. The
    existence check is thus redundant, and the change is safe to do.

  - In the files backend we use it when populating root refs, where we
    would skip adding the "HEAD" file if it was not possible to resolve
    it. The new behaviour is to instead mark "HEAD" as broken, which
    will cause us to emit warnings in various places.

As there are no callers of `is_headref()` left afer the refactoring, we
can absorb it completely into `is_root_ref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
afcd067dad refs: do not check ref existence in is_root_ref()
Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.

This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.

Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`,
but then read it via `is_null_oid()`.

We have now changed terminology to clarify that pseudorefs are really
only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
in the root of the ref hierarchy are just plain refs. Thus, we do not
need to check whether the ref is symbolic or not. In fact, we can now
avoid looking up the ref completely as the name is sufficient for us to
figure out whether something would be a root ref or not.

This change of course changes semantics for our callers. As there are
only three of them we can assess each of them individually:

  - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
    It's clear that the intent is to classify based on the ref name,
    only.

  - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
    filter root refs. Again, using existence checks is pointless here as
    the iterator has just surfaced the ref, so we know it does exist.

  - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
    determine whether it should add a ref to the root directory of its
    iterator. This had the effect that we skipped over any files that
    are either a symbolic ref, or which are not a ref at all.

    The new behaviour is to include symbolic refs know, which aligns us
    with the adapted terminology. Furthermore, files which look like
    root refs but aren't are now mark those as "broken". As broken refs
    are not surfaced by our tooling, this should not lead to a change in
    user-visible behaviour, but may cause us to emit warnings. This
    feels like the right thing to do as we would otherwise just silently
    ignore corrupted root refs completely.

So in all cases the existence check was either superfluous, not in line
with the adapted terminology or masked potential issues. This commit
thus changes the behaviour as proposed and drops the existence check
altogether.

Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
32019a7a76 refs: rename is_special_ref() to is_pseudo_ref()
Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
defined terminology in our gitglossary(7). Note that in the preceding
commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where
there may be confusion for in-flight patch series that add new calls to
`is_pseudoref()`. In order to intentionally break such patch series we
have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the
new name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:51 -07:00
f6936e62a5 refs: rename is_pseudoref() to is_root_ref()
Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
terminology in our gitglossary(7).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:51 -07:00
74b50a5881 Documentation/glossary: define root refs as refs
Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live
in the root of the ref hierarchy behave the exact same as normal refs.
They can be symbolic refs or direct refs and can be read, iterated over
and written via normal tooling. All of these refs are stored in the ref
backends, which further demonstrates that they are just normal refs.

Extend the definition of "ref" to also cover such root refs. The only
additional restriction for root refs is that they must conform to a
specific naming schema.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:51 -07:00
29be36a2ea Documentation/glossary: clarify limitations of pseudorefs
Clarify limitations that pseudorefs have:

  - They can be read via git-rev-parse(1) and similar tools.

  - They are not surfaced when iterating through refs, like when using
    git-for-each-ref(1). They are not refs, so iterating through refs
    should not surface them.

  - They cannot be written via git-update-ref(1) and related commands.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:51 -07:00
6fd8037564 Documentation/glossary: redefine pseudorefs as special refs
Nowadays, Git knows about three different kinds of refs. As defined in
gitglossary(7):

  - Regular refs that start with "refs/", like "refs/heads/main".

  - Pseudorefs, which live in the root directory. These must have
    all-caps names and must be a file that start with an object hash.
    Consequently, symbolic refs are not pseudorefs because they do not
    start with an object hash.

  - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".

This state is extremely confusing, and I would claim that most folks
don't fully understand what is what here. The current definitions also
have several problems:

  - Where does "HEAD" fit in? It's not a pseudoref because it can be
    a symbolic ref. It's not a regular ref because it does not start
    with "refs/". And it's not a special ref, either.

  - There is a strong overlap between pseudorefs and special refs. The
    pseudoref section for example mentions "MERGE_HEAD", even though it
    is a special ref. Is it thus both a pseudoref and a special ref?

  - Why do we even need to distinguish refs that live in the root from
    other refs when they behave just like a regular ref anyway?

In other words, the current state is quite a mess and leads to wild
inconsistencies without much of a good reason.

The original reason why pseudorefs were introduced is that there are
some refs that sometimes behave like a ref, even though they aren't a
ref. And we really only have two of these nowadays, namely "MERGE_HEAD"
and "FETCH_HEAD". Those files are never written via the ref backends,
but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
They contain additional metadata that highlights where a ref has been
fetched from or the list of commits that have been merged.

This original intent in fact matches the definition of special refs that
we have recently introduced in 8df4c5d205 (Documentation: add "special
refs" to the glossary, 2024-01-19). Due to the introduction of the new
reftable backend we were forced to distinguish those refs more clearly
such that we don't ever try to read or write them via the reftable
backend. In the same series, we also addressed all the other cases where
we used to write those special refs via the filesystem directly, thus
circumventing the ref backend, to instead write them via the backends.
Consequently, there are no other refs left anymore which are special.

Let's address this mess and return the pseudoref terminology back to its
original intent: a ref that sometimes behave like a ref, but which isn't
really a ref because it gets written to the filesystem directly. Or in
other words, let's redefine pseudorefs to match the current definition
of special refs. As special refs and pseudorefs are now the same per
definition, we can drop the "special refs" term again. It's not exposed
to our users and thus they wouldn't ever encounter that term anyway.

Refs that live in the root of the ref hierarchy but which are not
pseudorefs will be further defined in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:51 -07:00
9c62534377 builtin/config: pass data between callbacks via local variables
We use several global variables to pass data between callers and
callbacks in `get_color()` and `get_colorbool()`. Convert those to use
callback data structures instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
35a7cfda56 builtin/config: convert flags to a local variable
Both the `do_all` and `use_key_regexp` bits essentially act like flags
to `get_value()`. Let's convert them to actual flags so that we can get
rid of the last two remaining global variables that track options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
ab8bac8bb6 builtin/config: track "fixed value" option via flags only
We track the "fixed value" option via two separate bits: once via the
global variable `fixed_value`, and once via the CONFIG_FLAGS_FIXED_VALUE
bit in `flags`. This is confusing and may easily lead to issues when one
is not aware that this is tracked via two separate mechanisms.

Refactor the code to use the flag exclusively. We already pass it to all
the required callsites anyway, except for `collect_config()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
040b141df3 builtin/config: convert key to a local variable
The `key` variable is used by the `get_value()` function for two
purposes:

  - It is used to store the result of `git_config_parse_key()`, which is
    then passed on to `collect_config()`.

  - It is used as a store to convert the provided key to an
    all-lowercase key when `use_key_regexp` is set.

Neither of these cases warrant a global variable at all. In the former
case we can pass the key via `struct collect_config_data`. And in the
latter case we really only want to have it as a temporary local variable
such that we can free associated memory.

Refactor the code accordingly to reduce our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
fdfaaa1b68 builtin/config: convert key_regexp to a local variable
The `key_regexp` variable is used by the `format_config()` callback when
`use_key_regexp` is set. It is only ever set up by its only caller,
`collect_config()` and can thus easily be moved into the
`collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
4ff8feb307 builtin/config: convert regexp to a local variable
The `regexp` variable is used by the `format_config()` callback when
`CONFIG_FLAGS_FIXED_VALUE` is not set. It is only ever set up by its
only caller, `collect_config()` and can thus easily be moved into the
`collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
bfe45f83e7 builtin/config: convert value_pattern to a local variable
The `value_pattern` variable is used by the `format_config()` callback
when `CONFIG_FLAGS_FIXED_VALUE` is used. It is only ever set up by its
only caller, `collect_config()` and can thus easily be moved into the
`collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
65d197cffc builtin/config: convert do_not_match to a local variable
The `do_not_match` variable is used by the `format_config()` callback as
an indicator whether or not the passed regular expression is negated. It
is only ever set up by its only caller, `collect_config()` and can thus
easily be moved into the `collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
8c86981228 builtin/config: move respect_includes_opt into location options
The variable tracking whether or not we want to honor includes is
tracked via a global variable. Move it into the location options
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
4090a9c948 builtin/config: move default value into display options
The default value is tracked via a global variable. Move it into the
display options instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
94c4693079 builtin/config: move type options into display options
The type options are tracked via a global variable. Move it into the
display options instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
c0c1e26326 builtin/config: move display options into local variables
The display options are tracked via a set of global variables. Move
them into a self-contained structure so that we can easily parse all
relevant options and hand them over to the various functions that
require them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
ddb103c2c7 builtin/config: move location options into local variables
The location options are tracked via a set of global variables. Move
them into a self-contained structure so that we can easily parse all
relevant options and hand them over to the various functions that
require them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:53 -07:00
999425cb12 builtin/config: refactor functions to have common exit paths
Refactor functions to have a single exit path. This will make it easier
in subsequent commits to add common cleanup code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:53 -07:00
12b2306830 config: make the config source const
The `struct git_config_source` passed to `config_with_options()` is
never modified. Let's mark it as `const` to clarify.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:53 -07:00
e44b018c52 builtin/config: check for writeability after source is set up
The `check_write()` function verifies that we do not try to write to a
config source that cannot be written to, like for example stdin. But
while the new subcommands do call this function, they do so before
calling `handle_config_location()`. Consequently, we only end up
checking the default config location for writeability, not the location
that was actually specified by the caller of git-config(1).

Fix this by calling `check_write()` after `handle_config_location()`. We
will further clarify the relationship between those two functions in a
subsequent commit where we remove the global state that both implicitly
rely on.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
9cab5e8078 builtin/config: move actions into cmd_config_actions()
We only use actions in the legacy mode. Convert them to an enum and move
them into `cmd_config_actions()` to clearly demonstrate that they are
not used anywhere else.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
7d5387e263 builtin/config: move legacy options into cmd_config()
Move the legacy options as well some of the variables it references into
`cmd_config_action()`. This reduces our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
8b908f9dcf builtin/config: move subcommand options into cmd_config()
Move the subcommand options as well as the `subcommand` variable into
`cmd_config()`. This reduces our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
0336d0055c builtin/config: move legacy mode into its own function
In `cmd_config()` we first try to parse the provided arguments as
subcommands and, if this is successful, call the respective functions
of that subcommand. Otherwise we continue with the "legacy" mode that
uses implicit actions and/or flags.

Disentangle this by moving the legacy mode into its own function. This
allows us to move the options into the respective functions and clearly
separates concerns.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
a577d2f1a9 builtin/config: stop printing full usage on misuse
When invoking git-config(1) with a wrong set of arguments we end up
calling `usage_builtin_config()` after printing an error message that
says what was wrong. As that function ends up printing the full list of
options, which is quite long, the actual error message will be buried by
a wall of text. This makes it really hard to figure out what exactly
caused the error.

Furthermore, now that we have recently introduced subcommands, the usage
information may actually be misleading as we unconditionally print
options of the subcommand-less mode.

Fix both of these issues by just not printing the options at all
anymore. Instead, we call `usage()` that makes us report in a single
line what has gone wrong. This should be way more discoverable for our
users and addresses the inconsistency.

Furthermore, this change allow us to inline the options into the
respective functions that use them to parse the command line.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:51 -07:00
85f360fee5 pack-bitmap: introduce bitmap_writer_free()
Now that there is clearer memory ownership around the bitmap_writer
structure, introduce a bitmap_writer_free() function that callers may
use to free any memory associated with their instance of the
bitmap_writer structure.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:53:46 -07:00
f25e1f2a4d pack-bitmap-write.c: avoid uninitialized 'write_as' field
Prepare to free() memory associated with bitmapped_commit structs by
zero'ing the 'write_as' field.

In ideal cases, it is fine to do something like:

    for (i = 0; i < writer->selected_nr; i++) {
        struct bitmapped_commit *bc = &writer->selected[i];
        if (bc->write_as != bc->bitmap)
            ewah_free(bc->write_as);
        ewah_free(bc->bitmap);
    }

but if not all of the 'write_as' fields were populated (e.g., because
the packing_data given does not form a reachability closure), then we
may attempt to free uninitialized memory.

Guard against this by preemptively zero'ing this field just in case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:52:32 -07:00
9675b06917 pack-bitmap: drop unused max_bitmaps parameter
The `max_bitmaps` parameter in `bitmap_writer_select_commits()` was
introduced back in 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21), making it original to the bitmap implementation in Git
itself.

When that patch was merged via 0f9e62e084 (Merge branch
'jk/pack-bitmap', 2014-02-27), its sole caller in builtin/pack-objects.c
passed a value of "-1" for `max_bitmaps`, indicating no limit.

Since then, the only other caller (in midx.c, added via c528e17966
(pack-bitmap: write multi-pack bitmaps, 2021-08-31)) also uses a value
of "-1" for `max_bitmaps`.

Since no callers have needed a finite limit for the `max_bitmaps`
parameter in the nearly decade that has passed since 0f9e62e084, let's
remove the parameter and any dead pieces of code connected to it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:52:32 -07:00
07647c92ff pack-bitmap: avoid use of static bitmap_writer
The pack-bitmap machinery uses a structure called 'bitmap_writer' to
collect the data necessary to write out .bitmap files. Since its
introduction in 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21), there has been a single static bitmap_writer structure,
which is responsible for all bitmap writing-related operations.

In practice, this is OK, since we are only ever writing a single .bitmap
file in a single process (e.g., `git multi-pack-index write --bitmap`,
`git pack-objects --write-bitmap-index`, `git repack -b`, etc.).

However, having a single static variable makes issues like data
ownership unclear, when to free variables, what has/hasn't been
initialized unclear.

Refactor this code to be written in terms of a given bitmap_writer
structure instead of relying on a static global.

Note that this exposes the structure definition of the bitmap_writer at
the pack-bitmap.h level. We could work around this by, e.g., forcing
callers to declare their writers as:

    struct bitmap_writer *writer;
    bitmap_writer_init(&bitmap_writer);

and then declaring `bitmap_writer_init()` as taking in a double-pointer
like so:

    void bitmap_writer_init(struct bitmap_writer **writer);

which would avoid us having to expose the definition of the structure
itself. This patch takes a different approach, since future patches
(like for the ongoing pseudo-merge bitmaps work) will want to modify the
innards of this structure (in the previous example, via pseudo-merge.c).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:52:32 -07:00
94830fcacc pack-bitmap-write.c: move commit_positions into commit_pos fields
In 7cc8f97108 (pack-objects: implement bitmap writing, 2013-12-21), the
bitmapped_commit struct was introduced, including the 'commit_pos'
field, which has been unused ever since its introduction more than a
decade ago.

Instead, we have used the nearby `commit_positions` array leaving the
bitmapped_commit struct with an unused 4-byte field.

We could drop the `commit_pos` field as unused, and continue to store
the values in the auxiliary array. But we could also drop the array and
store the data for each bitmapped_commit struct inside of the structure
itself, which is what this patch does.

In any spot that we previously read `commit_positions[i]`, we can now
instead read `writer.selected[i].commit_pos`. There are a few spots that
need changing as a result:

  - write_selected_commits_v1() is a simple transformation, since we're
    just reading the field. As a result, the function no longer needs an
    explicit argument to pass the commit_positions array.

  - write_lookup_table() also no longer needs the explicit
    commit_positions array passed in as an argument. But it still needs
    to sort an array of indices into the writer.selected array to read
    them in commit_pos order, so table_cmp() is adjusted accordingly.

  - bitmap_writer_finish() no longer needs to allocate, populate, and
    free the commit_positions table. Instead, we can just write the data
    directly into each struct bitmapped_commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:52:31 -07:00
b174a97a54 object.h: add flags allocated by pack-bitmap.h
In commit 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21) the NEEDS_BITMAP flag was introduced into pack-bitmap.h, but
no object flags allocation table existed at the time.

In 208acbfb82 (object.h: centralize object flag allocation, 2014-03-25)
when that table was first introduced, we never added the flags from
7cc8f97108, which has remained the case since.

Rectify this by including the flag bit used by pack-bitmap.h into the
centralized table in object.h.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:52:31 -07:00
83f1add914 Sync with Git 2.45.1
* tag 'v2.45.1': (42 commits)
  Git 2.45.1
  Git 2.44.1
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  ...
2024-05-13 18:29:15 -07:00
369b84196e reftable/merged: adapt interface to allow reuse of iterators
Refactor the interfaces exposed by `struct reftable_merged_table` and
`struct merged_iter` such that they support iterator reuse. This is done
by separating initialization of the iterator and seeking on it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:19 -07:00
08efe69212 reftable/stack: provide convenience functions to create iterators
There exist a bunch of call sites in the reftable backend that want to
create iterators for a reftable stack. This is rather convoluted right
now, where you always have to go via the merged table. And it is about
to become even more convoluted when we split up iterator initialization
and seeking in the next commit.

Introduce convenience functions that allow the caller to create an
iterator from a reftable stack directly without going through the merged
table. Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:19 -07:00
0e7be2b3ea reftable/reader: adapt interface to allow reuse of iterators
Refactor the interfaces exposed by `struct reftable_reader` and `struct
table_iterator` such that they support iterator reuse. This is done by
separating initialization of the iterator and seeking on it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:18 -07:00
d76f0d3f57 reftable/generic: adapt interface to allow reuse of iterators
Refactor the interfaces exposed by `struct reftable_table` and `struct
reftable_iterator` such that they support iterator reuse. This is done
by separating initialization of the iterator and seeking on it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:18 -07:00
5bf96e0c39 reftable/generic: move seeking of records into the iterator
Reftable iterators are created by seeking on the parent structure of a
corresponding record. For example, to create an iterator for the merged
table you would call `reftable_merged_table_seek_ref()`. Most notably,
it is not posible to create an iterator and then seek it afterwards.

While this may be a bit easier to reason about, it comes with two
significant downsides. The first downside is that the logic to find
records is split up between the parent data structure and the iterator
itself. Conceptually, it is more straight forward if all that logic was
contained in a single place, which should be the iterator.

The second and more significant downside is that it is impossible to
reuse iterators for multiple seeks. Whenever you want to look up a
record, you need to re-create the whole infrastructure again, which is
quite a waste of time. Furthermore, it is impossible to optimize seeks,
such as when seeking the same record multiple times.

To address this, we essentially split up the concerns properly such that
the parent data structure is responsible for setting up the iterator via
a new `init_iter()` callback, whereas the iterator handles seeks via a
new `seek()` callback. This will eventually allow us to call `seek()` on
the iterator multiple times, where every iterator can potentially
optimize for certain cases.

Note that at this point in time we are not yet ready to reuse the
iterators. This will be left for a future patch series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:18 -07:00
701713a254 reftable/merged: simplify indices for subiterators
When seeking on a merged table, we perform the seek for each of the
subiterators. If the subiterator has the desired record we add it to the
priority queue, otherwise we skip it and don't add it to the stack of
subiterators hosted by the merged table.

The consequence of this is that the index of the subiterator in the
merged table does not necessarily correspond to the index of it in the
merged iterator. Next to being potentially confusing, it also means that
we won't easily be able to re-seek the merged iterator because we have
no clear connection between both of the data structures.

Refactor the code so that the index stays the same in both structures.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:18 -07:00
e08f49a4f5 reftable/merged: split up initialization and seeking of records
To initialize a `struct merged_iter`, we need to seek all subiterators
to the wanted record and then add their results to the priority queue
used to sort the records. This logic is split up across two functions,
`merged_table_seek_record()` and `merged_iter_init()`. The scope of
these functions is somewhat weird though, where `merged_iter_init()` is
only responsible for adding the records of the subiterators to the
priority queue.

Clarify the scope of those functions such that `merged_iter_init()` is
only responsible for initializing the iterator's structure. Performing
the subiterator seeks are now part of `merged_table_seek_record()`.

This step is required to move seeking of records into the generic
`struct reftable_iterator` infrastructure.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:17 -07:00
c82692f755 reftable/reader: set up the reader when initializing table iterator
All the seeking functions accept a `struct reftable_reader` as input
such that they can use the reader to look up the respective blocks.
Refactor the code to instead set up the reader as a member of `struct
table_iter` during initialization such that we don't have to pass the
reader on every single call.

This step is required to move seeking of records into the generic
`struct reftable_iterator` infrastructure.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:17 -07:00
f1e3c12196 reftable/reader: inline reader_seek_internal()
We have both `reader_seek()` and `reader_seek_internal()`, where the
former function only exists so that we can exit early in case the given
table has no records of the sought-after type.

Merge these two functions into one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:17 -07:00
81a03a3236 reftable/reader: separate concerns of table iter and reftable reader
In "reftable/reader.c" we implement two different interfaces:

  - The reftable reader contains the logic to read reftables.

  - The table iterator is used to iterate through a single reftable read
    by the reader.

The way those two types are used in the code is somewhat confusing
though because seeking inside a table is implemented as if it was part
of the reftable reader, even though it is ultimately more of a detail
implemented by the table iterator.

Make the boundary between those two types clearer by renaming functions
that seek records in a table such that they clearly belong to the table
iterator's logic.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:17 -07:00
dfdd1455bb reftable/reader: unify indexed and linear seeking
In `reader_seek_internal()` we either end up doing an indexed seek when
there is one or a linear seek otherwise. These two code paths are
disjunct without a good reason, where the indexed seek will cause us to
exit early.

Refactor the two code paths such that it becomes possible to share a bit
more code between them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:16 -07:00
9a59b65dba reftable/reader: avoid copying index iterator
When doing an indexed seek we need to walk down the multi-level index
until we finally hit a record of the desired indexed type. This loop
performs a copy of the index iterator on every iteration, which is both
hard to understand and completely unnecessary.

Refactor the code so that we use a single iterator to walk down the
indices, only.

Note that while this should improve performance, the improvement is
negligible in all but the most unreasonable repositories. This is
because the effect is only really noticeable when we have to walk down
many levels of indices, which is not something that a repository would
typically have. So the motivation for this change is really only about
readability.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:16 -07:00
d537ce6b9e reftable/block: use size_t to track restart point index
The function `block_reader_restart_offset()` gets the offset of the
`i`th restart point. `i` is a signed integer though, which is certainly
not the correct type to track indices like this. Furthermore, both
callers end up passing a `size_t`.

Refactor the code to use a `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:04:16 -07:00
f518d91a2b refs/reftable: allow configuring geometric factor
Allow configuring the geometric factor used by the auto-compaction
algorithm whenever a new table is appended to the stack of tables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:39 -07:00
f663d34306 reftable: make the compaction factor configurable
When auto-compacting, the reftable library packs references such that
the sizes of the tables form a geometric sequence. The factor for this
geometric sequence is hardcoded to 2 right now. We're about to expose
this as a config option though, so let's expose the factor via write
options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:39 -07:00
afbdbfae0b refs/reftable: allow disabling writing the object index
Besides the expected "ref" and "log" records, the reftable library also
writes "obj" records. These are basically a reverse mapping of object
IDs to their respective ref records so that it becomes efficient to
figure out which references point to a specific object. The motivation
for this data structure is the "uploadpack.allowTipSHA1InWant" config,
which allows a client to fetch any object by its hash that has a ref
pointing to it.

This reverse index is not used by Git at all though, and the expectation
is that most hosters nowadays use "uploadpack.allowAnySHA1InWant". It
may thus be preferable for many users to disable writing these optional
object indices altogether to safe some precious disk space.

Add a new config "reftable.indexObjects" that allows the user to disable
the object index altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
90db611c2a refs/reftable: allow configuring restart interval
Add a new option `reftable.restartInterval` that allows the user to
control the restart interval when writing reftable records used by the
reftable library.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
8e9e136d61 reftable: use uint16_t to track restart interval
The restart interval can at most be `UINT16_MAX` as specified in the
technical documentation of the reftable format. Furthermore, it cannot
ever be negative. Regardless of that we use an `int` to track the
restart interval.

Change the type to use an `uint16_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
831b366c24 refs/reftable: allow configuring block size
Add a new option `reftable.blockSize` that allows the user to control
the block size used by the reftable library.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
fcf341890e reftable/dump: support dumping a table's block structure
We're about to introduce new configs that will allow users to have more
control over how exactly reftables are written. To verify that these
configs are effective we will need to take a peak into the actual blocks
written by the reftable backend.

Introduce a new mode to the dumping logic that prints out the block
structure. This logic can be invoked via `test-tool dump-reftables -b`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
c22d75b027 reftable/writer: improve error when passed an invalid block size
The reftable format only supports block sizes up to 16MB. When the
writer is being passed a value bigger than that it simply calls
abort(3P), which isn't all that helpful due to the lack of a proper
error message.

Improve this by calling `BUG()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
e0cf3d8f8b reftable/writer: drop static variable used to initialize strbuf
We have a static variable in the reftable writer code that is merely
used to initialize the `last_key` of the writer. Convert the code to
instead use `strbuf_init()` and drop the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
799237852b reftable: pass opts as constant pointer
We sometimes pass the refatble write options as value and sometimes as a
pointer. This is quite confusing and makes the reader wonder whether the
options get modified sometimes.

In fact, `reftable_new_writer()` does cause the caller-provided options
to get updated when some values aren't set up. This is quite unexpected,
but didn't cause any harm until now.

Adapt the code so that we do not modify the caller-provided values
anymore. While at it, refactor the code to code to consistently pass the
options as a constant pointer to clarify that the caller-provided opts
will not ever get modified.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
4d35bb2aba reftable: consistently refer to reftable_write_options as opts
Throughout the reftable library the `reftable_write_options` are
sometimes referred to as `cfg` and sometimes as `opts`. Unify these to
consistently use `opts` to avoid confusion.

While at it, touch up the coding style a bit by removing unneeded braces
around one-line statements and newlines between variable declarations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:37 -07:00
c81ffcff83 documentation: git-update-index: add --show-index-version to synopsis
In 606e088d5d (update-index: add --show-index-version, 2023-09-12), we
added the new '--show-index-version' option to 'git-update-index' and
documented it, but forgot to add it to the synopsis section.

Add '--show-index-version' to the synopsis of 'git-update-index'.

Signed-off-by: Dov Murik <dov.murik@linux.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 16:57:17 -07:00
fc0202b0e9 fetch-pack: remove unused 'struct loose_object_iter'
'struct loose_object_iter' in fetch-pack.c is unused since commit
97b2fa08 (fetch-pack: drop custom loose object cache, 2018-11-12).

Remove it.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 16:55:20 -07:00
17bc3a4767 Merge branch 'ps/undecided-is-not-necessarily-sha1' into jc/undecided-is-not-necessarily-sha1-fix
* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes
2024-05-13 12:24:54 -07:00
3e4a232f6e The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 10:19:48 -07:00
39887d8abb Merge branch 'jc/git-gui-maintainer-update'
* jc/git-gui-maintainer-update:
  SubmittingPatches: welcome the new maintainer of git-gui part
2024-05-13 10:19:48 -07:00
bbffcd4514 Merge branch 'fa/p4-error'
P4 update.

* fa/p4-error:
  git-p4: show Perforce error to the user
2024-05-13 10:19:48 -07:00
235b9fb179 Merge branch 'ps/ci-fuzzers-at-gitlab-fix'
CI fix.

* ps/ci-fuzzers-at-gitlab-fix:
  gitlab-ci: fix installing dependencies for fuzz smoke tests
  gitlab-ci: add smoke test for fuzzers
2024-05-13 10:19:47 -07:00
537f17ec8b Merge branch 'jk/ci-test-with-jgit-fix'
CI fix.

* jk/ci-test-with-jgit-fix:
  ci: update coverity runs_on_pool reference
2024-05-13 10:19:47 -07:00
6cb0bd7fc3 Merge branch 'jk/ci-macos-gcc13-fix'
CI fix.

* jk/ci-macos-gcc13-fix:
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
2024-05-13 10:19:47 -07:00
b077cf2679 Merge branch 'jc/no-default-attr-tree-in-bare'
Git 2.43 started using the tree of HEAD as the source of attributes
in a bare repository, which has severe performance implications.
For now, revert the change, without ripping out a more explicit
support for the attr.tree configuration variable.

* jc/no-default-attr-tree-in-bare:
  stop using HEAD for attributes in bare repository by default
2024-05-13 10:19:46 -07:00
dddddea4b5 Merge branch 'ps/ci-python-2-deprecation'
Unbreak CI jobs so that we do not attempt to use Python 2 that has
been removed from the platform.

* ps/ci-python-2-deprecation:
  ci: fix Python dependency on Ubuntu 24.04
2024-05-13 10:19:46 -07:00
71bd0c8a61 Merge branch 'tb/attr-limits'
The maximum size of attribute files is enforced more consistently.

* tb/attr-limits:
  attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()
2024-05-13 10:19:46 -07:00
328f164496 Merge branch 'jc/test-workaround-broken-mv'
Tests that try to corrupt in-repository files in chunked format did
not work well on macOS due to its broken "mv", which has been
worked around.

* jc/test-workaround-broken-mv:
  t/lib-chunk: work around broken "mv" on some vintage of macOS
2024-05-13 10:19:45 -07:00
e05b9e9a39 Merge branch 'ma/win32-unix-domain-socket'
Build fix.

* ma/win32-unix-domain-socket:
  win32: fix building with NO_UNIX_SOCKETS
2024-05-13 10:19:45 -07:00
f01301aabe compat/regex: fix argument order to calloc(3)
Windows compiler suddenly started complaining that calloc(3) takes
its arguments in <nmemb, size> order.  Indeed, there are many calls
that has their arguments in a _wrong_ order.

Fix them all.

A sample breakage can be seen at

  https://github.com/git/git/actions/runs/9046793153/job/24857988702#step:4:272

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 10:19:08 -07:00
e18ad8eb26 SubmittingPatches: welcome the new maintainer of git-gui part
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-11 14:31:30 -07:00
83cf2847b0 git-gui: note the new maintainer
Pratyush Yadev has relinquished, and Johannes Sixt has taken over,
maintainership of Git-GUI.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-05-11 17:22:17 +02:00
9422e7169e Merge branch 'ps/config-subcommands' into ps/builtin-config-cleanup
* ps/config-subcommands:
  builtin/config: display subcommand help
  builtin/config: introduce "edit" subcommand
  builtin/config: introduce "remove-section" subcommand
  builtin/config: introduce "rename-section" subcommand
  builtin/config: introduce "unset" subcommand
  builtin/config: introduce "set" subcommand
  builtin/config: introduce "get" subcommand
  builtin/config: introduce "list" subcommand
  builtin/config: pull out function to handle `--null`
  builtin/config: pull out function to handle config location
  builtin/config: use `OPT_CMDMODE()` to specify modes
  builtin/config: move "fixed-value" option to correct group
  builtin/config: move option array around
  config: clarify memory ownership when preparing comment strings
2024-05-10 10:32:06 -07:00
120adc7d3c SubmittingPatches: extend the "flow" section
Explain a full lifecycle of a patch series upfront, so that it is
clear when key decisions to "accept" a series is made and how a new
patch series becomes a part of a new release.

Fold the "you need to monitor the progress of your topic" section
into the primary "patch lifecycle" section, as that is one of the
things the patch submitter is responsible for.  It is not like "I
sent a patch and responded to review messages, and now it is their
problem".  They need to see their patch through the patch life
cycle.

Earlier versions of this document outlined a slightly different
patch flow in an idealized world, where the original submitter
gathered agreements from the participants of the discussion and sent
the final "we all agreed that this is the good version--please
apply" patches to the maintainer.  In practice, this almost never
happened.  Instead, describe what flow was used in practice for the
past decade that worked well for us.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-10 10:26:14 -07:00
d58848fb21 SubmittingPatches: move the patch-flow section earlier
Before discussing the small details of how the patch gets sent, we'd
want to give people a larger picture first to set the expectation
straight.  The existing patch-flow section covers materials that are
suitable for that purpose, so move it to the beginning of the
document.  We'll update the contents of the section to clarify what
goal the patch submitter is working towards in the next step, which
will make it easier to understand the reason behind the individual
rules presented in latter parts of the document.

This step only moves two sections (patch-flow and patch-status)
without changing their contents, except that their section levels
are demoted from Level 1 to Level 2 to fit better in the document
structure at their new place.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-10 10:26:12 -07:00
7df2405b38 ci: stop installing "gcc-13" for osx-gcc
Our osx-gcc job explicitly asks to install gcc-13. But since the GitHub
runner image already comes with gcc-13 installed, this is mostly doing
nothing (or in some cases it may install an incremental update over the
runner image). But worse, it recently started causing errors like:

    ==> Fetching gcc@13
    ==> Downloading https://ghcr.io/v2/homebrew/core/gcc/13/blobs/sha256:fb2403d97e2ce67eb441b54557cfb61980830f3ba26d4c5a1fe5ecd0c9730d1a
    ==> Pouring gcc@13--13.2.0.ventura.bottle.tar.gz
    Error: The `brew link` step did not complete successfully
    The formula built, but is not symlinked into /usr/local
    Could not symlink bin/c++-13
    Target /usr/local/bin/c++-13
    is a symlink belonging to gcc. You can unlink it:
      brew unlink gcc

which cause the whole CI job to bail.

I didn't track down the root cause, but I suspect it may be related to
homebrew recently switching the "gcc" default to gcc-14. And it may even
be fixed when a new runner image is released. But if we don't need to
run brew at all, it's one less thing for us to worry about.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09 09:58:08 -07:00
11c7001e3d ci: avoid bare "gcc" for osx-gcc job
On macOS, a bare "gcc" (without a version) will invoke a wrapper for
clang, not actual gcc. Even when gcc is installed via homebrew, that
only provides version-specific links in /usr/local/bin (like "gcc-13"),
and never a version-agnostic "gcc" wrapper.

As far as I can tell, this has been the case for a long time, and this
osx-gcc job has largely been doing nothing. We can point it at "gcc-13",
which will pick up the homebrew-installed version.

The fix here is specific to the github workflow file, as the gitlab one
does not have a matching job.

It's a little unfortunate that we cannot just ask for the latest version
of gcc which homebrew provides, but as far as I can tell there is no
easy alias (you'd have to find the highest number gcc-* in
/usr/local/bin yourself).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09 09:57:32 -07:00
9d4453e8d6 ci: drop mention of BREW_INSTALL_PACKAGES variable
The last user of this variable went away in 4a6e4b9602 (CI: remove
Travis CI support, 2021-11-23), so it's doing nothing except making it
more confusing to find out which packages _are_ installed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09 09:57:04 -07:00
157ed03c83 ci: update coverity runs_on_pool reference
Commit 2d65e5b6a6 (ci: rename "runs_on_pool" to "distro", 2024-04-12)
renamed this variable for the main CI workflow, as well as in the ci/
scripts. Because the coverity workflow also relies on those scripts to
install dependencies, it needs to be updated, too. Without this patch,
the coverity build fails because we lack libcurl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09 09:38:43 -07:00
672cf2c870 gitlab-ci: fix installing dependencies for fuzz smoke tests
There was a semantic merge conflict between 9cdeb34b96 (ci: merge
scripts which install dependencies, 2024-04-12), which has merged
"ci/install-docker-dependencies.sh" into "ci/install-dependencies.sh"
and c7b228e000 (gitlab-ci: add smoke test for fuzzers, 2024-04-29),
which has added a new fuzz smoke test job that makes use of the
now-removed script.

Adapt the job to instead use the new script to install dependencies.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09 08:45:50 -07:00
21373511b9 Merge branch 'ps/ci-python-2-deprecation' into ps/ci-fuzzers-at-gitlab-fix
* ps/ci-python-2-deprecation:
  ci: fix Python dependency on Ubuntu 24.04
2024-05-09 08:45:36 -07:00
b664d36165 Merge branch 'ps/ci-enable-minimal-fuzzers-at-gitlab' into ps/ci-fuzzers-at-gitlab-fix
* ps/ci-enable-minimal-fuzzers-at-gitlab:
  gitlab-ci: add smoke test for fuzzers
2024-05-09 08:45:29 -07:00
55702c543e git-p4: show Perforce error to the user
During "git p4 clone" if p4 process returns an error from the server,
it will store the message in the 'err' variable. Then it will send a
text command "die-now" to git-fast-import. However, git-fast-import
raises an exception: "fatal: Unsupported command: die-now" and err is
never displayed. This patch ensures that err is shown to the end user.

Signed-off-by: Fahad Alrashed <fahad@keylock.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-08 15:44:14 -07:00
0f3415f1f8 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-08 10:18:47 -07:00
20ceead5c3 Merge branch 'bb/rgb-12-bit-colors'
The color parsing code learned to handle 12-bit RGB colors, spelled
as "#RGB" (in addition to "#RRGGBB" that is already supported).

* bb/rgb-12-bit-colors:
  color: add support for 12-bit RGB colors
  t/t4026-color: add test coverage for invalid RGB colors
  t/t4026-color: remove an extra double quote character
2024-05-08 10:18:47 -07:00
db05f61738 Merge branch 'rs/diff-parseopts-cleanup'
Code clean-up to remove code that is now a noop.

* rs/diff-parseopts-cleanup:
  diff-lib: stop calling diff_setup_done() in do_diff_cache()
2024-05-08 10:18:46 -07:00
97673bdea7 Merge branch 'dk/zsh-git-repo-path-fix'
Command line completion support for zsh (in contrib/) has been
updated to stop exposing internal state to end-user shell
interaction.

* dk/zsh-git-repo-path-fix:
  completion: zsh: stop leaking local cache variable
2024-05-08 10:18:46 -07:00
c2b36ab32e Merge branch 'bc/zsh-compatibility'
zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.

* bc/zsh-compatibility:
  vimdiff: make script and tests work with zsh
  t4046: avoid continue in &&-chain for zsh
2024-05-08 10:18:46 -07:00
80dbfac2aa Merge branch 'rj/add-p-typo-reaction'
When the user responds to a prompt given by "git add -p" with an
unsupported command, list of available commands were given, which
was too much if the user knew what they wanted to type but merely
made a typo.  Now the user gets a much shorter error message.

* rj/add-p-typo-reaction:
  add-patch: response to unknown command
  add-patch: do not show UI messages on stderr
2024-05-08 10:18:45 -07:00
34f34d63bb Merge branch 'jt/doc-submitting-rerolled-series'
Developer doc update.

* jt/doc-submitting-rerolled-series:
  doc: clarify practices for submitting updated patch versions
2024-05-08 10:18:45 -07:00
2c34e4e747 Merge branch 'rh/complete-symbolic-ref'
Command line completion script (in contrib/) learned to complete
"git symbolic-ref" a bit better (you need to enable plumbing
commands to be completed with GIT_COMPLETION_SHOW_ALL_COMMANDS).

* rh/complete-symbolic-ref:
  completion: add docs on how to add subcommand completions
  completion: improve docs for using __git_complete
  completion: add 'symbolic-ref'
2024-05-08 10:18:45 -07:00
f526a4f314 Merge branch 'ps/the-index-is-no-more'
The singleton index_state instance "the_index" has been eliminated
by always instantiating "the_repository" and replacing references
to "the_index"  with references to its .index member.

* ps/the-index-is-no-more:
  repository: drop `initialize_the_repository()`
  repository: drop `the_index` variable
  builtin/clone: stop using `the_index`
  repository: initialize index in `repo_init()`
  builtin: stop using `the_index`
  t/helper: stop using `the_index`
2024-05-08 10:18:44 -07:00
c5c9acf77d Merge branch 'bc/credential-scheme-enhancement'
The credential helper protocol, together with the HTTP layer, have
been enhanced to support authentication schemes different from
username & password pair, like Bearer and NTLM.

* bc/credential-scheme-enhancement:
  credential: add method for querying capabilities
  credential-cache: implement authtype capability
  t: add credential tests for authtype
  credential: add support for multistage credential rounds
  t5563: refactor for multi-stage authentication
  docs: set a limit on credential line length
  credential: enable state capability
  credential: add an argument to keep state
  http: add support for authtype and credential
  docs: indicate new credential protocol fields
  credential: add a field called "ephemeral"
  credential: gate new fields on capability
  credential: add a field for pre-encoded credentials
  http: use new headers for each object request
  remote-curl: reset headers on new request
  credential: add an authtype field
2024-05-08 10:18:44 -07:00
d25ad94df6 Merge branch 'ps/ci-test-with-jgit'
Tests to ensure interoperability between reftable written by jgit
and our code have been added and enabled in CI.

* ps/ci-test-with-jgit:
  t0612: add tests to exercise Git/JGit reftable compatibility
  t0610: fix non-portable variable assignment
  t06xx: always execute backend-specific tests
  ci: install JGit dependency
  ci: make Perforce binaries executable for all users
  ci: merge scripts which install dependencies
  ci: fix setup of custom path for GitLab CI
  ci: merge custom PATH directories
  ci: convert "install-dependencies.sh" to use "/bin/sh"
  ci: drop duplicate package installation for "linux-gcc-default"
  ci: skip sudo when we are already root
  ci: expose distro name in dockerized GitHub jobs
  ci: rename "runs_on_pool" to "distro"
2024-05-08 10:18:44 -07:00
5aec7231c8 Merge branch 'ps/reftable-write-optim'
Code to write out reftable has seen some optimization and
simplification.

* ps/reftable-write-optim:
  reftable/block: reuse compressed array
  reftable/block: reuse zstream when writing log blocks
  reftable/writer: reset `last_key` instead of releasing it
  reftable/writer: unify releasing memory
  reftable/writer: refactorings for `writer_flush_nonempty_block()`
  reftable/writer: refactorings for `writer_add_record()`
  refs/reftable: don't recompute committer ident
  reftable: remove name checks
  refs/reftable: skip duplicate name checks
  refs/reftable: perform explicit D/F check when writing symrefs
  refs/reftable: fix D/F conflict error message on ref copy
2024-05-08 10:18:43 -07:00
b64b0df9da scalar: avoid segfault in reconfigure --all
During the latest v2.45.0 update, 'scalar reconfigure --all' started to
segfault on my machine. Breaking it down via the debugger, it was
faulting on a NULL reference to the_hash_algo, which is a macro pointing
to the_repository->hash_algo.

In my case, this is due to one of my repositories having a detached HEAD,
which requires get_oid_hex() to parse that the HEAD reference is valid.
Another way to cause a failure is to use the "includeIf.onbranch" config
key, which will lead to a BUG() statement.

My first inclination was to try to refactor cmd_reconfigure() to execute
'git for-each-repo' instead of this loop. In addition to the difficulty
of executing 'scalar reconfigure' within 'git for-each-repo', it would
be difficult to perform the clean-up logic for non-existent repos if we
relied on that child process.

Instead, I chose to move the temporary repo to be within the loop and
reinstate the_repository to its old value after we are done performing
logic on the current array item.

Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with
multiple registered repos. There are two different ways that the old
use of the_repository could trigger bugs. These issues are being solved
independently to be more careful about the_repository being
uninitialized, but the change in this patch around the use of
the_repository is still a good safety precaution.

Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 17:51:12 -07:00
cbdc83f151 t0018: two small fixes
Even though the three tests that were recently added started their
here-doc with "<<-\EOF", it did not take advantage of that and
instead wrote the here-doc payload abut to the left edge.  Use a tabs
to indent these lines.

More importantly, because these all hardcode the expected output,
which contains the current branch name, they break the CI job that
uses 'main' as the default branch name.

Use

    GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=trunk
    export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

between the test_description line and ". ./test-lib.sh" line to
force the initial branch name to 'trunk' and expect it to show in
the output.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 17:50:22 -07:00
2566a77774 Documentation/git-merge-tree.txt: document -X
Add an entry in the 'merge-tree' builtin documentation for
-X/--strategy-option (added in 6a4c9e7b32 (merge-tree: add -X strategy
option, 2023-09-24)). The same option is documented for 'merge', 'rebase',
'revert', etc. in their respective Documentation/ files, so let's do the
same for 'merge-tree'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 15:36:08 -07:00
c8f815c208 refs: remove functions without ref store
The preceding commit has rewritten all callers of ref-related functions
to use the equivalents that accept a `struct ref_store`. Consequently,
the respective variants without the ref store are now unused. Remove
them.

There are likely patch series in-flight that use the now-removed
functions. To help the authors, the old implementations have been added
to "refs.c" in an ifdef'd section as a reference for how to migrate each
of the respective callers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
2e5c4758b7 cocci: apply rules to rewrite callers of "refs" interfaces
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
b198ee0b3d cocci: introduce rules to transform "refs" to pass ref store
Most of the functions in "refs.h" have two flavors: one that accepts a
`struct ref_store`, and one that figures it out via `the_repository`.
As part of the libification efforts we want to get rid of the latter
variant and stop relying on `the_repository` altogether.

Introduce a set of Coccinelle rules that transform callers of the "refs"
interfaces to pass a `struct ref_store`. These rules are not yet applied
by this patch so that it can be reviewed standalone more easily. This
will be done in the next patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
54876c6dfb refs: add exclude_patterns parameter to for_each_fullref_in()
The `for_each_fullref_in()` function is supposedly the ref-store-less
equivalent of `refs_for_each_fullref_in()`, but the latter has gained a
new parameter `exclude_patterns` over time. Bring these two functions
back in sync again by adding the parameter to the former function, as
well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
39a9ef8fc4 refs: introduce missing functions that accept a struct ref_store
While most of the functions in "refs.h" have a variant that accepts a
`struct ref_store`, some don't. Callers of these functions are thus
forced to implicitly rely on `the_repository` to figure out the ref
store that is to be used.

Introduce those missing functions to address this shortcoming.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:58 -07:00
066cef7707 builtin/tag: add --trailer option
git-tag supports interpreting trailers from an annotated tag message,
using --list --format="%(trailers)". However, the available methods to
add a trailer to a tag message (namely -F or --editor) are not as
ergonomic.

In a previous patch, we moved git-commit's implementation of its
--trailer option to the trailer.h API. Let's use that new function to
teach git-tag the same --trailer option, emulating as much of
git-commit's behavior as much as possible.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:03 -07:00
4a8618785e builtin/commit: refactor --trailer logic
git-commit adds user trailers to the commit message by passing its
`--trailer` arguments to a child process running `git-interpret-trailers
--in-place`. This logic is broadly useful, not just for git-commit but
for other commands constructing message bodies (e.g. git-tag).

Let's move this logic from git-commit to a new function in the trailer
API, so that it can be re-used in other commands.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:03 -07:00
56740f9910 builtin/commit: use ARGV macro to collect trailers
Replace git-commit's callback for --trailer with the standard
OPT_PASSTHRU_ARGV macro. The callback only adds its values to a strvec
and sanity-checks that `unset` is always false; both of these are
already implemented in the parse-option API.

Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:05:41 -07:00
4865707bda refs: remove create_symref and associated dead code
In the previous commits, we converted `refs_create_symref()` to utilize
transactions to perform symref updates. Earlier `refs_create_symref()`
used `create_symref()` to do the same.

We can now remove `create_symref()` and any code associated with it
which is no longer used. We remove `create_symref()` code from all the
reference backends and also remove it entirely from the `ref_storage_be`
struct.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:50 -07:00
f151dfe3c9 refs: rename refs_create_symref() to refs_update_symref()
The `refs_create_symref()` function is used to update/create a symref.
But it doesn't check the old target of the symref, if existing. It force
updates the symref. In this regard, the name `refs_create_symref()` is a
bit misleading. So let's rename it to `refs_update_symref()`. This is
akin to how 'git-update-ref(1)' also allows us to create apart from
update.

While we're here, rename the arguments in the function to clarify what
they actually signify and reduce confusion.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:50 -07:00
300b38e46f refs: use transaction in refs_create_symref()
The `refs_create_symref()` function updates a symref to a given new
target. To do this, it uses a ref-backend specific function
`create_symref()`.

In the previous commits, we introduced symref support in transactions.
This means we can now use transactions to perform symref updates and
don't have to resort to `create_symref()`. Doing this allows us to
remove and cleanup `create_symref()`, which we will do in the following
commit.

Modify the expected error message for a test in
't/t0610-reftable-basics.sh', since the error is now thrown from
'refs.c'. This is because in transactional updates, F/D conflicts are
caught before we're in the reference backend.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:50 -07:00
644daf7785 refs: add support for transactional symref updates
The reference backends currently support transactional reference
updates. While this is exposed to users via 'git-update-ref' and its
'--stdin' mode, it is also used internally within various commands.

However, we do not support transactional updates of symrefs. This commit
adds support for symrefs in both the 'files' and the 'reftable' backend.

Here, we add and use `ref_update_has_null_new_value()`, a helper
function which is used to check if there is a new_value in a reference
update. The new value could either be a symref target `new_target` or a
OID `new_oid`.

We also add another common function `ref_update_check_old_target` which
will be used to check if the update's old_target corresponds to a
reference's current target.

Now transactional updates (verify, create, delete, update) can be used
for:
- regular refs
- symbolic refs
- conversion of regular to symbolic refs and vice versa

This also allows us to expose this to users via new commands in
'git-update-ref' in the future.

Note that a dangling symref update does not record a new reflog entry,
which is unchanged before and after this commit.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:49 -07:00
e9965ba477 refs: move original_update_refname to 'refs.c'
The files backend and the reftable backend implement
`original_update_refname` to obtain the original refname of the update.
Move it out to 'refs.c' and only expose it internally to the refs
library. This will be used in an upcoming commit to also introduce
another common functionality for the two backends.

We also rename the function to `ref_update_original_update_refname` to
keep it consistent with the upcoming other 'ref_update_*' functions
that'll be introduced.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:49 -07:00
a8ae923f85 refs: support symrefs in 'reference-transaction' hook
The 'reference-transaction' hook runs whenever a reference update is
made to the system. In a previous commit, we added the `old_target` and
`new_target` fields to the `reference_transaction_update()`. In
following commits we'll also add the code to handle symref's in the
reference backends.

Support symrefs also in the 'reference-transaction' hook, by modifying
the current format:
    <old-oid> SP <new-oid> SP <ref-name> LF
to be be:
    <old-value> SP <new-value> SP <ref-name> LF
where for regular refs the output would not change and remain the same.
But when either 'old-value' or 'new-value' is a symref, we print the ref
as 'ref:<ref-target>'.

This does break backward compatibility, but the 'reference-transaction'
hook's documentation always stated that support for symbolic references
may be added in the future.

We do not add any tests in this commit since there is no git command
which activates this flow, in an upcoming commit, we'll start using
transaction based symref updates as the default, we'll add tests there
for the hook too.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:49 -07:00
57d0b1e2ea files-backend: extract out create_symref_lock()
The function `create_symref_locked()` creates a symref by creating a
'<symref>.lock' file and then committing the symref lock, which creates
the final symref.

Extract the early half of `create_symref_locked()` into a new helper
function `create_symref_lock()`. Because the name of the new function is
too similar to the original, rename the original to
`create_and_commit_symref()` to avoid confusion.

The new function `create_symref_locked()` can be used to create the
symref lock in a separate step from that of committing it. This allows
to add transactional support for symrefs, where the lock would be
created in the preparation step and the lock would be committed in the
finish step.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:49 -07:00
1bc4cc3fc4 refs: accept symref values in ref_transaction_update()
The function `ref_transaction_update()` obtains ref information and
flags to create a `ref_update` and add them to the transaction at hand.

To extend symref support in transactions, we need to also accept the
old and new ref targets and process it. This commit adds the required
parameters to the function and modifies all call sites.

The two parameters added are `new_target` and `old_target`. The
`new_target` is used to denote what the reference should point to when
the transaction is applied. Some functions allow this parameter to be
NULL, meaning that the reference is not changed.

The `old_target` denotes the value the reference must have before the
update. Some functions allow this parameter to be NULL, meaning that the
old value of the reference is not checked.

We also update the internal function `ref_transaction_add_update()`
similarly to take the two new parameters.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:49 -07:00
c8aed5e8da repository: stop setting SHA1 as the default object hash
During the startup of Git, we call `initialize_the_repository()` to set
up `the_repository` as well as `the_index`. Part of this setup is also
to set the default object hash of the repository to SHA1. This has the
effect that `the_hash_algo` is getting initialized to SHA1, as well.
This default hash algorithm eventually gets overridden by most Git
commands via `setup_git_directory()`, which also detects the actual hash
algorithm used by the repository.

There are some commands though that don't access a repository at all, or
at a later point only, and thus retain the default hash function for
some amount of time. As some of the the preceding commits demonstrate,
this can lead to subtle issues when we access `the_hash_algo` when no
repository has been set up.

Address this issue by dropping the set up of the default hash algorithm
completely. The effect of this is that `the_hash_algo` will map to a
`NULL` pointer and thus cause Git to crash when something tries to
access the hash algorithm without it being properly initialized. It thus
forces all Git commands to explicitly set up the hash algorithm in case
there is no repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:50 -07:00
781ba69d8b oss-fuzz/commit-graph: set up hash algorithm
Our fuzzing setups don't work in a proper repository, but only use the
in-memory configured `the_repository`. Consequently, we never go through
the full repository setup procedures and thus do not set up the hash
algo used by the repository.

The commit-graph fuzzer does rely on a properly initialized hash algo
though. Initialize it explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:50 -07:00
373bfa6077 builtin/shortlog: don't set up revisions without repo
It is possible to run git-shortlog(1) outside of a repository by passing
it output from git-log(1) via standard input. Obviously, as there is no
repository in that context, it is thus unsupported to pass any revisions
as arguments.

Regardless of that we still end up calling `setup_revisions()`. While
that works alright, it is somewhat strange. Furthermore, this is about
to cause problems when we unset the default object hash.

Refactor the code to only call `setup_revisions()` when we have a
repository. This is safe to do as we already verify that there are no
arguments when running outside of a repository anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:50 -07:00
ab274909d4 builtin/diff: explicitly set hash algo when there is no repo
The git-diff(1) command can be used outside repositories to diff two
files with each other. But even if there is no repository we will end up
hashing the files that we are diffing so that we can print the "index"
line:

    ```
    diff --git a/a b/b
    index 7898192..6178079 100644
    --- a/a
    +++ b/b
    @@ -1 +1 @@
    -a
    +b
    ```

We implicitly use SHA1 to calculate the hash here, which is because
`the_repository` gets initialized with SHA1 during the startup routine.
We are about to stop doing this though such that `the_repository` only
ever has a hash function when it was properly initialized via a repo's
configuration.

To give full control to our users, we would ideally add a new switch to
git-diff(1) that allows them to specify the hash function when executed
outside of a repository. But for now, we only convert the code to make
this explicit such that we can stop setting the default hash algorithm
for `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
332b56b762 builtin/bundle: abort "verify" early when there is no repository
Verifying a bundle requires us to have a repository. This is encoded in
`verify_bundle()`, which will return an error if there is no repository.
We call `open_bundle()` before we call `verify_bundle()` though, which
already performs some verifications even though we may ultimately abort
due to a missing repository.

This is problematic because `open_bundle()` already reads the bundle
header and verifies that it contains a properly formatted hash. When
there is no repository we have no clue what hash function to expect
though, so we always end up assuming SHA1 here, which may or may not be
correct. Furthermore, we are about to stop initializing `the_hash_algo`
when there is no repository, which will lead to segfaults.

Check early on whether we have a repository to fix this issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
ce992ce29a builtin/blame: don't access potentially unitialized the_hash_algo
We access `the_hash_algo` in git-blame(1) before we have executed
`parse_options_start()`, which may not be properly set up in case we
have no repository. This is fine for most of the part because all the
call paths that lead to it (git-blame(1), git-annotate(1) as well as
git-pick-axe(1)) specify `RUN_SETUP` and thus require a repository.

There is one exception though, namely when passing `-h` to print the
help. Here we will access `the_hash_algo` even if there is no repo.
This works fine right now because `the_hash_algo` gets sets up to point
to the SHA1 algorithm via `initialize_repository()`. But we're about to
stop doing this, and thus the code would lead to a `NULL` pointer
exception.

Prepare the code for this and only access `the_hash_algo` after we are
sure that there is a proper repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
07658e9ce5 builtin/rev-parse: allow shortening to more than 40 hex characters
The `--short=` option for git-rev-parse(1) allows the user to specify
to how many characters object IDs should be shortened to. The option is
broken though for SHA256 repositories because we set the maximum allowed
hash size to `the_hash_algo->hexsz` before we have even set up the repo.
Consequently, `the_hash_algo` will always be SHA1 and thus we truncate
every hash after at most 40 characters.

Fix this by accessing `the_hash_algo` only after we have set up the
repo.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
bd455cec37 remote-curl: fix parsing of detached SHA256 heads
The dumb HTTP transport tries to read the remote HEAD reference by
downloading the "HEAD" file and then parsing it via `http_fetch_ref()`.
This function will either parse the file as an object ID in case it is
exactly `the_hash_algo->hexsz` long, or otherwise it will check whether
the reference starts with "ref :" and parse it as a symbolic ref.

This is broken when parsing detached HEADs of a remote SHA256 repository
because we never update `the_hash_algo` to the discovered remote object
hash. Consequently, `the_hash_algo` will always be the fallback SHA1
hash algorithm, which will cause us to fail parsing HEAD altogteher when
it contains a SHA256 object ID.

Fix this issue by setting up `the_hash_algo` via `repo_set_hash_algo()`.
While at it, let's make the expected SHA1 fallback explicit in our code,
which also addresses an upcoming issue where we are going to remove the
SHA1 fallback for `the_hash_algo`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
813f17fd6b attr: fix BUG() when parsing attrs outside of repo
If either the `--attr-source` option or the `GIT_ATTR_SOURCE` envvar are
set, then `compute_default_attr_source()` will try to look up the value
as a treeish. It is possible to hit that function while outside of a Git
repository though, for example when using `git grep --no-index`. In that
case, Git will hit a bug because we try to look up the main ref store
outside of a repository.

Handle the case gracefully and detect when we try to look up an attr
source without a repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
bbb82f8dc8 attr: don't recompute default attribute source
The `default_attr_source()` function lazily computes the attr source
supposedly once, only. This is done via a static variable `attr_source`
that contains the resolved object ID of the attr source's tree. If the
variable is the null object ID then we try to look up the attr source,
otherwise we skip over it.

This approach is flawed though: the variable will never be set to
anything else but the null object ID in case there is no attr source.
Consequently, we re-compute the information on every call. And in the
worst case, when we silently ignore bad trees, this will cause us to try
and look up the treeish every single time.

Improve this by introducing a separate variable `has_attr_source` to
track whether we already computed the attr source and, if so, whether we
have an attr source or not.

This also allows us to convert the `ignore_bad_attr_tree` to not be
static anymore as the code will only be executed once anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:48 -07:00
b7afb46225 parse-options-cb: only abbreviate hashes when hash algo is known
The `OPT__ABBREV()` option can be used to add an option that abbreviates
object IDs. When given a length longer than `the_hash_algo->hexsz`, then
it will instead set the length to that maximum length.

It may not always be guaranteed that we have `the_hash_algo` initialized
properly as the hash algorithm can only be set up after we have set up
`the_repository`. In that case, the hash would always be truncated to
the hex length of SHA1, which may not be what the user desires.

In practice it's not a problem as all commands that use `OPT__ABBREV()`
also have `RUN_SETUP` set and thus cannot work without a repository.
Consequently, both `the_repository` and `the_hash_algo` would be
properly set up.

Regardless of that, harden the code to not truncate the length when we
didn't set up a repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:48 -07:00
0c6bd2b81d path: move validate_headref() to its only user
While `validate_headref()` is only called from `is_git_directory()` in
"setup.c", it is currently implemented in "path.c". Move it over such
that it becomes clear that it is only really used during setup in order
to discover repositories.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:48 -07:00
a0851cece5 path: harden validation of HEAD with non-standard hashes
The `validate_headref()` function takes a path to a supposed "HEAD" file
and checks whether its format is something that we understand. It is
used as part of our repository discovery to check whether a specific
directory is a Git directory or not.

Part of the validation is a check for a detached HEAD that contains a
plain object ID. To do this validation we use `get_oid_hex()`, which
relies on `the_hash_algo`. At this point in time the hash algo cannot
yet be initialized though because we didn't yet read the Git config.
Consequently, it will always be the SHA1 hash algorithm.

In practice this works alright because `get_oid_hex()` only ends up
checking whether the prefix of the buffer is a valid object ID. And
because SHA1 is shorter than SHA256, the function will successfully
parse SHA256 object IDs, as well.

It is somewhat fragile though and not really the intent to only check
for SHA1. With this in mind, harden the code to use `get_oid_hex_any()`
to check whether the "HEAD" file parses as any known hash.

One might be hard pressed to tighten the check even further and fully
validate the file contents, not only the prefix. In practice though that
wouldn't make a lot of sense as it could be that the repository uses a
hash function that produces longer hashes than SHA256, but which the
current version of Git doesn't understand yet. We'd still want to detect
the repository as proper Git repository in that case, and we will fail
eventually with a proper error message that the hash isn't understood
when trying to set up the repository format.

It follows that we could just leave the current code intact, as in
practice the code change doesn't have any user visible impact. But it
also prepares us for `the_hash_algo` being unset when there is no
repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:48 -07:00
3452c8ab8a Merge branch 'ps/the-index-is-no-more' into ps/undecided-is-not-necessarily-sha1
* ps/the-index-is-no-more:
  repository: drop `initialize_the_repository()`
  repository: drop `the_index` variable
  builtin/clone: stop using `the_index`
  repository: initialize index in `repo_init()`
  builtin: stop using `the_index`
  t/helper: stop using `the_index`
2024-05-06 22:50:29 -07:00
e9e8dd8801 Merge branch 'jc/no-default-attr-tree-in-bare' into ps/undecided-is-not-necessarily-sha1
* jc/no-default-attr-tree-in-bare:
  stop using HEAD for attributes in bare repository by default
2024-05-06 22:50:24 -07:00
951105664d cmake: let test-tool run the unit tests, too
The `test-tool` recently learned to run the unit tests. To this end, it
needs to link with `test-lib.c`, which was done in the `Makefile`, and
this patch does it in the CMake definition, too.

This is a companion of 44400f58407e (t0080: turn t-basic unit test into
a helper, 2024-02-02).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:11:45 -07:00
b121eed8d5 ci: use test-tool as unit test runner on Windows
Although the previous commit changed t/Makefile to run unit tests
alongside shell tests, the Windows CI still needs a separate unit-tests
step due to how the test sharding works.

We want to avoid using `prove` as a test running on Windows due to
performance issues [1], so use the new test-tool runner instead.

[1] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:06:35 -07:00
cc75e4a08f t/Makefile: run unit tests alongside shell tests
Add a wrapper script to allow `prove` to run both shell tests and unit
tests from a single invocation. This avoids issues around running prove
twice in CI, as discussed in [1].

Additionally, this moves the unit tests into the main dev workflow, so
that errors can be spotted more quickly. Accordingly, we remove the
separate unit tests step for Linux CI. (We leave the Windows CI
unit-test step as-is, because the sharding scheme there involves
selecting specific test files rather than running `make test`.)

[1] https://lore.kernel.org/git/pull.1613.git.1699894837844.gitgitgadget@gmail.com/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:06:35 -07:00
5bbc8c927f unit tests: add rule for running with test-tool
In the previous commit, we added support in test-tool for running
collections of unit tests. Now, add rules in t/Makefile for running in
this way.

This new rule can be executed from the top-level Makefile via
`make DEFAULT_UNIT_TEST_TARGET=unit-tests-test-tool unit-tests`, or by
setting DEFAULT_UNIT_TEST_TARGET in config.mak.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:06:35 -07:00
a2b55e2506 test-tool run-command testsuite: support unit tests
Teach the testsuite runner in `test-tool run-command testsuite` how to
run unit tests: if TEST_SHELL_PATH is not set, run the programs directly
from CWD, rather than defaulting to "sh" as an interpreter.

With this change, you can now use test-tool to run the unit tests:
$ make
$ cd t/unit-tests/bin
$ ../../helper/test-tool run-command testsuite

This should be helpful on Windows to allow running tests without
requiring Perl (for `prove`), as discussed in [1] and [2].

This again breaks backwards compatibility, as it is now required to set
TEST_SHELL_PATH properly for executing shell scripts, but again, as
noted in [2], there are no longer any such invocations in our codebase.

[1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/
[2] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:06:34 -07:00
d28c5a520f test-tool run-command testsuite: remove hardcoded filter
`test-tool run-command testsuite` currently assumes that it will only be
running the shell test suite, and therefore filters out anything that
does not match a hardcoded pattern of "t[0-9][0-9][0-9][0-9]-*.sh".

Later in this series, we'll adapt `test-tool run-command testsuite` to
also support unit tests, which do not follow the same naming conventions
as the shell tests, so this hardcoded pattern is inconvenient.

Since `testsuite` also allows specifying patterns on the command-line,
let's just remove this pattern. As noted in [1], there are no longer any
uses of `testsuite` in our codebase, it should be OK to break backwards
compatibility in this case. We also add a new filter to avoid trying to
execute "." and "..", so that users who wish to execute every test in a
directory can do so without specifying a pattern.

[1] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:06:34 -07:00
22f0df7a09 test-tool run-command testsuite: get shell from env
When running tests through `test-tool run-command testsuite`, we
currently hardcode `sh` as the command interpreter. As discussed in [1],
this is incorrect, and we should be using the shell set in
TEST_SHELL_PATH instead.

Add a shell_path field in struct testsuite so that we can pass this to
the task runner callback. If this is non-null, we'll use it as the
argv[0] of the subprocess. Otherwise, we'll just execute the test
program directly. We will use this feature in a later commit to enable
running binary executable unit tests.

However, for now when setting up the struct testsuite in testsuite(),
use the value of TEST_SHELL_PATH if it's set, otherwise keep the
original behavior by defaulting to `sh`.

[1] https://lore.kernel.org/git/20240123005913.GB835964@coredump.intra.peff.net/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:06:34 -07:00
80bb227e41 t0080: turn t-basic unit test into a helper
While t/unit-tests/t-basic.c uses the unit-test framework added in
e137fe3b29 (unit tests: add TAP unit test framework, 2023-11-09), it is
not a true unit test in that it intentionally fails in order to exercise
various codepaths in the unit-test framework. Thus, we intentionally
exclude it when running unit tests through the various t/Makefile
targets. Instead, it is executed by t0080-unit-test-output.sh, which
verifies its output follows the TAP format expected for the various
pass, skip, or fail cases.

As such, it makes more sense for t-basic to be a helper item for
t0080-unit-test-output.sh, so let's move it to
t/helper/test-example-tap.c and adjust Makefiles as necessary.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 14:06:34 -07:00
5ca0c455f1 ci: fix Python dependency on Ubuntu 24.04
Newer versions of Ubuntu have dropped Python 2 starting with Ubuntu
23.04. By default though, our CI setups will try to use that Python
version on all Ubuntu-based jobs except for the "linux-gcc" one.

We didn't notice this issue due to two reasons:

  - The "ubuntu:latest" tag always points to the latest LTS release.
    Until a few weeks ago this was Ubuntu 22.04, which still had Python
    2.

  - Our Docker-based CI jobs had their own script to install
    dependencies until 9cdeb34b96 (ci: merge scripts which install
    dependencies, 2024-04-12), where we didn't even try to install
    Python at all for many of them.

Since the CI refactorings have originally been implemented, Ubuntu
24.04 was released, and it being an LTS versions means that the "latest"
tag now points to that Python-2-less version. Consequently, those jobs
that use "ubuntu:latest" broke.

Address this by using Python 2 on Ubuntu 20.04, only, whereas we use
Python 3 on all other Ubuntu jobs. Eventually, we should think about
dropping support for Python 2 completely.

Reported-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 12:26:46 -07:00
0b8bd1959e Documentation: Mention that refspecs are explained elsewhere
The syntax for refspecs are explained in more detail in documention for
git-fetch and git-push. Give a hint to the user too look there more fore
information

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 12:12:16 -07:00
c22d41d641 format-patch: run range-diff with larger creation-factor
We see too often that a range-diff added to format-patch output
shows too many "unmatched" patches.  This is because the default
value for creation-factor is set to a relatively low value.

It may be justified for other uses (like you have a yet-to-be-sent
new iteration of your series, and compare it against the 'seen'
branch that has an older iteration, probably with the '--left-only'
option, to pick out only your patches while ignoring the others) of
"range-diff" command, but when the command is run as part of the
format-patch, the user _knows_ and expects that the patches in the
old and the new iterations roughly correspond to each other, so we
can and should use a much higher default.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:57:22 -07:00
c7b228e000 gitlab-ci: add smoke test for fuzzers
Our GitLab CI setup has a test gap where the fuzzers aren't exercised at
all. Add a smoke test, similar to the one we have in GitHub Workflows.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:52:24 -07:00
7b91d310ce builtin/config: display subcommand help
Until now, `git config -h` would have printed help for the old-style
syntax. Now that all modes have proper subcommands though it is
preferable to instead display the subcommand help.

Drop the `NO_INTERNAL_HELP` flag to do so. While at it, drop the help
mismatch in t0450 and add the `--get-colorbool` option to the usage such
that git-config(1)'s synopsis and `git config -h` match.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:10 -07:00
3cbace5ee0 builtin/config: introduce "edit" subcommand
Introduce a new "edit" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:10 -07:00
15dad20c3f builtin/config: introduce "remove-section" subcommand
Introduce a new "remove-section" subcommand to git-config(1). Please
refer to preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:10 -07:00
3418e96f37 builtin/config: introduce "rename-section" subcommand
Introduce a new "rename-section" subcommand to git-config(1). Please
refer to preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
95ea69c67b builtin/config: introduce "unset" subcommand
Introduce a new "unset" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
00bbdde141 builtin/config: introduce "set" subcommand
Introduce a new "set" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
4e51389000 builtin/config: introduce "get" subcommand
Introduce a new "get" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
14970509c6 builtin/config: introduce "list" subcommand
While git-config(1) has several modes, those modes are not exposed with
subcommands but instead by specifying action flags like `--unset` or
`--list`. This user interface is not really in line with how our more
modern commands work, where it is a lot more customary to say e.g. `git
remote list`. Furthermore, to add to the confusion, git-config(1) also
allows the user to request modes implicitly by just specifying the
correct number of arguments. Thus, `git config foo.bar` will retrieve
the value of "foo.bar" while `git config foo.bar baz` will set it to
"baz".

Overall, this makes for a confusing interface that could really use a
makeover. It hurts discoverability of what you can do with git-config(1)
and is comparatively easy to get wrong. Converting the command to have
subcommands instead would go a long way to help address these issues.

One concern in this context is backwards compatibility. Luckily, we can
introduce subcommands without breaking backwards compatibility at all.
This is because all the implicit modes of git-config(1) require that the
first argument is a properly formatted config key. And as config keys
_must_ have a dot in their name, any value without a dot would have been
discarded by git-config(1) previous to this change. Thus, given that
none of the subcommands do have a dot, they are unambiguous.

Introduce the first such new subcommand, which is "git config list". To
retain backwards compatibility we only conditionally use subcommands and
will fall back to the old syntax in case no subcommand was detected.
This should help to transition to the new-style syntax until we
eventually deprecate and remove the old-style syntax.

Note that the way we handle this we're duplicating some functionality
across old and new syntax. While this isn't pretty, it helps us to
ensure that there really is no change in behaviour for the old syntax.

Amend tests such that we run them both with old and new style syntax.
As tests are now run twice, state from the first run may be still be
around in the second run and thus cause tests to fail. Add cleanup logic
as required to fix such tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:08 -07:00
fee3796616 builtin/config: pull out function to handle --null
Pull out function to handle the `--null` option, which we are about to
reuse in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:08 -07:00
9dda6b72b7 builtin/config: pull out function to handle config location
There's quite a bunch of options to git-config(1) that allow the user to
specify which config location to use when reading or writing config
options. The logic to handle this is thus by necessity also quite
involved.

Pull it out into a separate function so that we can reuse it in
subsequent commits which introduce proper subcommands.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:08 -07:00
daa3325024 builtin/config: use OPT_CMDMODE() to specify modes
The git-config(1) command has various different modes which are
accessible via e.g. `--get-urlmatch` or `--unset-all`. These modes are
declared with `OPT_BIT()`, which causes two minor issues:

  - The respective modes also have a negated form `--no-get-urlmatch`,
    which is unintended.

  - We have to manually handle exclusiveness of the modes.

Switch these options to instead use `OPT_CMDMODE()`, which is made
exactly for this usecase. Remove the now-unneeded check that only a
single mode is given, which is now handled by the parse-options
interface.

While at it, format optional placeholders for arguments to conform to
our style guidelines by using `[<placeholder>]`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
8415507b32 builtin/config: move "fixed-value" option to correct group
The `--fixed-value` option can be used to alter how the value-pattern
parameter is interpreted for the various actions of git-config(1). But
while it is an option, it is currently listed as part of the actions
group, which is wrong.

Move the option to the "Other" group, which hosts the various options
known to git-config(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
424a29c3a7 builtin/config: move option array around
Move around the option array. This will help us with a follow-up commit
that introduces subcommands to git-config(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
a78b462976 config: clarify memory ownership when preparing comment strings
The ownership of memory returned when preparing a comment string is
quite intricate: when the returned value is different than the passed
value, then the caller is responsible to free the memory. This is quite
subtle, and it's even easier to miss because the returned value is in
fact a `const char *`.

Adapt the function to always return either `NULL` or a newly allocated
string. The function is called at most once per git-config(1), so it's
not like this micro-optimization really matters. Thus, callers are now
always responsible for freeing the value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
11be65cfa4 diff: fix --exit-code with external diff
You can ask the diff machinery to let the exit code indicate whether
there are changes, e.g. with --exit-code.  It as two ways to calculate
that bit: The quick one assumes blobs with different hashes have
different content, and the more elaborate way actually compares the
contents, possibly applying transformations like ignoring whitespace.

Always use the slower path by setting the flag diff_from_contents,
because any of the files could have an external diff driver set via an
attribute, which might consider binary differences irrelevant, like e.g.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 10:23:42 -07:00
7b30c3ad2d diff: report unmerged paths as changes in run_diff_cmd()
You can ask the diff machinery to let the exit code indicate whether
there are changes, e.g. with --quiet.  It as two ways to calculate that
bit: The quick one assumes blobs with different hashes have different
content, and the more elaborate way actually compares the contents,
possibly applying transformations like ignoring whitespace.

The quick way considers an unmerged file to be a change and reports
exit code 1, which makes sense.

The slower path uses the struct diff_options member found_changes to
indicate whether the blobs differ even with the transformations applied.
It's not set for unmerged files, though, resulting in exit code 0.

Set found_changes in run_diff_cmd() for unmerged files, for a consistent
exit code of 1 if there's an unmerged file, regardless of whether
whitespace is ignored.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 10:23:40 -07:00
9339fca23e refs: return conflict error when checking packed refs
The TRANSACTION_NAME_CONFLICT error code refers to a failure to create a
ref due to a name conflict with another ref. An example of this is a
directory/file conflict such as ref names A/B and A.

"git fetch" uses this error code to more accurately describe the error
by recommending to the user that they try running "git remote prune" to
remove any old refs that are deleted by the remote which would clear up
any directory/file conflicts.

This helpful error message is not displayed when the conflicted ref is
stored in packed refs. This change fixes this by ensuring error return
code consistency in `lock_raw_ref`.

Signed-off-by: Ivan Tse <ivan.tse1@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 08:48:25 -07:00
6a8c13e03d Makefile(s): do not enforce "all indents must be done with tab"
Our top-level Makefile follows our generic whitespace rule
established by the top-level .gitattributes file that does not
enforce indent-with-non-tab rule by default, but git-gui is set up
to enforce indent-with-non-tab by default.  With the upcoming change
to GNU make, we no longer can reject (and worse, "fix") a patch that
adds whitespace indented lines to the Makefile, so loosen the rule
there for git-gui/Makefile, too.

[j6t: cherry-picked from 227b8fd902]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-05-05 16:54:35 +02:00
1351570912 Makefile(s): avoid recipe prefix in conditional statements
In GNU Make commit 07fcee35 ([SV 64815] Recipe lines cannot contain
conditional statements, 2023-05-22) and following, conditional
statements may no longer be preceded by a tab character (which Make
refers to as the recipe prefix).

There are a handful of spots in our various Makefile(s) which will break
in a future release of Make containing 07fcee35. For instance, trying to
compile the pre-image of this patch with the tip of make.git results in
the following:

    $ make -v | head -1 && make
    GNU Make 4.4.90
    config.mak.uname:842: *** missing 'endif'.  Stop.

The kernel addressed this issue in 82175d1f9430 (kbuild: Replace tabs
with spaces when followed by conditionals, 2024-01-28). Address the
issues in Git's tree by applying the same strategy.

When a conditional word (ifeq, ifneq, ifdef, etc.) is preceded by one or
more tab characters, replace each tab character with 8 space characters
with the following:

    find . -type f -not -path './.git/*' -name Makefile -or -name '*.mak' |
      xargs perl -i -pe '
        s/(\t+)(ifn?eq|ifn?def|else|endif)/" " x (length($1) * 8) . $2/ge unless /\\$/
      '

The "unless /\\$/" removes any false-positives (like "\telse \"
appearing within a shell script as part of a recipe).

After doing so, Git compiles on newer versions of Make:

    $ make -v | head -1 && make
    GNU Make 4.4.90
    GIT_VERSION = 2.44.0.414.gfac1dc44ca9
    [...]

    $ echo $?
    0

[j6t: cherry-picked from 728b9ac0c3]

Reported-by: Dario Gjorgjevski <dario.gjorgjevski@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-05-05 16:54:35 +02:00
34a2498659 doc: switch links to https
These sites offer https versions of their content.
Using the https versions provides some protection for users.

[j6t: cherry-picked from d05b08cd52]

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-05-05 16:49:00 +02:00
f282df1ef8 doc: update links to current pages
It's somewhat traditional to respect sites' self-identification.

[j6t: cherry-picked from 65175d9ea2]

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-05-05 16:49:00 +02:00
f9a3e704ab Merge branch 'ml/git-gui-exec-path-fix'
* ml/git-gui-exec-path-fix:
  git-gui - use git-hook, honor core.hooksPath
  git-gui - re-enable use of hook scripts
2024-05-05 14:41:21 +02:00
fe475c4e2f git-gui: po: fix typo in French "aperçu"
The French word "aperçu", meaning "view" or "preview", contains only a
single letter "p".  Remove the extra letter, which is an obvious typo.

Reported-by: Léonard Michelet <leonard@lebasic.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2024-05-05 14:02:05 +02:00
c793f9cb08 attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()
Commit 3c50032ff5 (attr: ignore overly large gitattributes files,
2022-12-01) added a defense-in-depth check to ensure that .gitattributes
blobs read from the index do not exceed ATTR_MAX_FILE_SIZE (100 MB).

But there were two cases added shortly after 3c50032ff5 was written
which do not apply similar protections:

  - 47cfc9bd7d (attr: add flag `--source` to work with tree-ish,
    2023-01-14)

  - 4723ae1007 (attr.c: read attributes in a sparse directory,
    2023-08-11) added a similar

Ensure that we refuse to process a .gitattributes blob exceeding
ATTR_MAX_FILE_SIZE when reading from either an arbitrary tree object or
a sparse directory. This is done by pushing the ATTR_MAX_FILE_SIZE check
down into the low-level `read_attr_from_buf()`.

In doing so, plug a leak in `read_attr_from_index()` where we would
accidentally leak the large buffer upon detecting it is too large to
process.

(Since `read_attr_from_buf()` handles a NULL buffer input, we can remove
a NULL check before calling it in `read_attr_from_index()` as well).

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 12:44:16 -07:00
8f19e82c5b gitlab-ci: add whitespace error check
GitLab CI does not have a job to check for whitespace errors introduced
by a set of changes. Reuse the existing generic `whitespace-check.sh` to
create the job for GitLab pipelines.

Note that the `$CI_MERGE_REQUEST_TARGET_BRANCH_SHA` variable is only
available in GitLab merge request pipelines and therefore the CI job is
configured to only run as part of those pipelines.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 12:11:49 -07:00
9bef98096c ci: make the whitespace report optional
The `check-whitespace` CI job generates a formatted output file
containing whitespace error information. As not all CI providers support
rendering a formatted summary, make its generation optional.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 12:11:49 -07:00
66820fb7bf ci: separate whitespace check script
The `check-whitespace` CI job is only available as a GitHub action. To
help enable this job with other CI providers, first separate the logic
performing the whitespace check into its own script. In subsequent
commits, this script is further generalized allowing its reuse.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 12:11:49 -07:00
ecaacbc7a2 github-ci: fix link to whitespace error
When the `check-whitespace` CI job detects whitespace errors, a
formatted summary of the issue is generated. This summary contains links
to the commits and blobs responsible for the whitespace errors. The
generated links for blobs do not work and result in a 404.

Instead of using the reference name in the link, use the commit ID
directly. This fixes the broken link and also helps enable future
generalization of the script for other CI providers by removing one of
the GitHub specific CI variables used.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 12:11:49 -07:00
7789ea5842 ci: pre-collapse GitLab CI sections
Sections of CI output defined by `begin_group()` and `end_group()` are
expanded in GitLab pipelines by default. This can make CI job output
rather noisy and harder to navigate. Update the behavior for GitLab
pipelines to now collapse sections by default.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 12:11:49 -07:00
b79deeb554 advice: add --no-advice global option
Advice hints must be disabled individually by setting the relevant
advice.* variables to false in the Git configuration. For server-side
and scripted usages of Git where hints can be a hindrance, it can be
cumbersome to maintain configuration to ensure all advice hints are
disabled in perpetuity. This is a particular concern in tests, where
new or changed hints can result in failed assertions.

Add a --no-advice global option to disable all advice hints from being
displayed. This is independent of the toggles for individual advice
hints. Use an internal environment variable (GIT_ADVICE) to ensure this
configuration is propagated to the usage site, even if it executes in a
subprocess.

Signed-off-by: James Liu <james@jamesliu.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 10:36:59 -07:00
5bd8811a73 doc: add spacing around paginate options
Make the documentation page consistent with the usage string printed by
"git help git" and consistent with the description of "[-v | --version]"
option.

Signed-off-by: James Liu <james@jamesliu.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 10:33:12 -07:00
9b715ad926 doc: clean up usage documentation for --no-* opts
We'll be adding another option to the --no-* class of options soon.

Clean up the existing options by grouping them together in the OPTIONS
section, and adding missing ones to the SYNOPSIS.

Signed-off-by: James Liu <james@jamesliu.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 10:32:06 -07:00
51441e6460 stop using HEAD for attributes in bare repository by default
With 23865355 (attr: read attributes from HEAD when bare repo,
2023-10-13), we started to use the HEAD tree as the default
attribute source in a bare repository.  One argument for such a
behaviour is that it would make things like "git archive" run in
bare and non-bare repositories for the same commit consistent.
This changes was merged to Git 2.43 but without an explicit mention
in its release notes.

It turns out that this change destroys performance of shallowly
cloning from a bare repository.  As the "server" installations are
expected to be mostly bare, and "git pack-objects", which is the
core of driving the other side of "git clone" and "git fetch" wants
to see if a path is set not to delta with blobs from other paths via
the attribute system, the change forces the server side to traverse
the tree of the HEAD commit needlessly to find if each and every
paths the objects it sends out has the attribute that controls the
deltification.  Given that (1) most projects do not configure such
an attribute, and (2) it is dubious for the server side to honor
such an end-user supplied attribute anyway, this was a poor choice
of the default.

To mitigate the current situation, let's revert the change that uses
the tree of HEAD in a bare repository by default as the attribute
source.  This will help most people who have been happy with the
behaviour of Git 2.42 and before.

Two things to note:

 * If you are stuck with versions of Git 2.43 or newer, that is
   older than the release this fix appears in, you can explicitly
   set the attr.tree configuration variable to point at an empty
   tree object, i.e.

	$ git config attr.tree 4b825dc642

 * If you like the behaviour we are reverting, you can explicitly
   set the attr.tree configuration variable to HEAD, i.e.

	$ git config attr.tree HEAD

The right fix for this is to optimize the code paths that allow
accesses to attributes in tree objects, but that is a much more
involved change and is left as a longer-term project, outside the
scope of this "first step" fix.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 09:15:33 -07:00
395c130fd8 win32: fix building with NO_UNIX_SOCKETS
After 2406bf5f (Win32: detect unix socket support at runtime,
2024-04-03), it fails with:

compat/mingw.c:4160:5: error: no previous prototype for function 'mingw_have_unix_sockets' [-Werror,-Wmissing-prototypes]
   4160 | int mingw_have_unix_sockets(void)
        |     ^

because the prototype is behind `ifndef NO_UNIX_SOCKETS`.

Signed-off-by: Mike Hommey <mh@glandium.org>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 08:42:50 -07:00
861dc19ba8 t/lib-chunk: work around broken "mv" on some vintage of macOS
When the destination is read-only, "mv" on some version of macOS
asks whether to replace the destination even though in the test its
stdin is not a terminal (and thus doesn't conform to POSIX[1]).

The helper to corrupt a chunk-file is designed to work on the
files like commit-graph and multi-pack-index files that are
generally read-only, so use "mv -f" to work around this issue.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 13:16:42 -07:00
dc88e5279a trailer unit tests: inspect iterator contents
Previously we only checked whether we would iterate a certain (expected)
number of times.

Also check the parsed "raw", "key" and "val" fields during each
iteration.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
5f800603a9 trailer: document parse_trailers() usage
Explain how to use parse_trailers(), because earlier we made the
trailer_info struct opaque. That is, because clients can no longer peek
inside it, we should give them guidance about how the (pointer to the)
opaque struct can still be useful to them.

Rename "head" struct to "trailer_objects" to make the wording of the new
comments a bit easier to read (because "head" itself doesn't really have
any domain-specific meaning here).

Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
cf5c9349de trailer: retire trailer_info_get() from API
Make trailer_info_get() "static" to be file-scoped to trailer.c, because
no one outside of trailer.c uses it. Remove its declaration from
<trailer.h>.

We have to also reposition it to be above parse_trailers(), which
depends on it.

Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
c1e4b2b18e trailer: make trailer_info struct private
In 13211ae23f (trailer: separate public from internal portion of
trailer_iterator, 2023-09-09) we moved trailer_info behind an anonymous
struct to discourage use by trailer.h API users. However it still left
open the possibility of external use of trailer_info itself. Now that
there are no external users of trailer_info, we can make this struct
private.

Make this struct private by putting its definition inside trailer.c.
This has two benefits:

  (1) it makes the surface area of the public facing
      interface (trailer.h) smaller, and

  (2) external API users are unable to peer inside this struct (because
      it is only ever exposed as an opaque pointer).

There are a few disadvantages:

  (A) every time the member of the struct is accessed an extra pointer
      dereference must be done, and

  (B) for users of trailer_info outside trailer.c, this struct can no
      longer be allocated on the stack and may only be allocated on the
      heap (because its definition is hidden away in trailer.c) and
      appropriately deallocated by the user, and

  (C) without good documentation on the API, the opaque struct is
      hostile to programmers by going opposite to the "Show me your
      data structures, and I won't usually need your code; it'll
      be obvious." mantra [2].

(The disadvantages have already been observed in the two preparatory
commits that precede this one.) This commit believes that the benefits
outweigh the disadvantages for designing APIs, as explained below.

Making trailer_info private exposes existing deficiencies in the API.
This is because users of this struct had full access to its internals,
so there wasn't much need to actually design it to be "complete" in the
sense that API users only needed to use what was provided by the API.
For example, the location of the trailer block (start/end offsets
relative to the start of the input text) was accessible by looking at
these struct members directly. Now that the struct is private, we have
to expose new API functions to allow clients to access this
information (see builtin/interpret-trailers.c).

The idea in this commit to hide implementation details behind an "opaque
pointer" is also known as the "pimpl" (pointer to implementation) idiom
in C++ and is a common pattern in that language (where, for example,
abstract classes only have pointers to concrete classes).

However, the original inspiration to use this idiom does not come from
C++, but instead the book "C Interfaces and Implementations: Techniques
for Creating Reusable Software" [1]. This book recommends opaque
pointers as a good design principle for designing C libraries, using the
term "interface" as the functions defined in *.h (header) files and
"implementation" as the corresponding *.c file which define the
interfaces.

The book says this about opaque pointers:

    ... clients can manipulate such pointers freely, but they can’t
    dereference them; that is, they can’t look at the innards of the
    structure pointed to by them. Only the implementation has that
    privilege. Opaque pointers hide representation details and help
    catch errors.

In our case, "struct trailer_info" is now hidden from clients, and the
ways in which this opaque pointer can be used is limited to the richness
of <trailer.h>. In other words, <trailer.h> exclusively controls exactly
how "trailer_info" pointers are to be used.

[1] Hanson, David R. "C Interfaces and Implementations: Techniques for
    Creating Reusable Software". Addison Wesley, 1997. p. 22

[2] Raymond, Eric S. "The Cathedral and the Bazaar: Musings on Linux and
    Open Source by an Accidental Revolutionary". O'Reilly, 1999.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
24a25c630c trailer: make parse_trailers() return trailer_info pointer
This is the second and final preparatory commit for making the
trailer_info struct private to the trailer implementation.

Make trailer_info_get() do the actual work of allocating a new
trailer_info struct, and return a pointer to it. Because
parse_trailers() wraps around trailer_info_get(), it too can return this
pointer to the caller. From the trailer API user's perspective, the call
to trailer_info_new() can be replaced with parse_trailers(); do so in
interpret-trailers.

Because trailer_info_new() is no longer called by interpret-trailers,
remove this function from the trailer API.

With this change, we no longer allocate trailer_info on the stack ---
all uses of it are via a pointer where the actual data is always
allocated at runtime through trailer_info_new(). Make
trailer_info_release() free this dynamically allocated memory.

Finally, due to the way the function signatures of parse_trailers() and
trailer_info_get() have changed, update the callsites in
format_trailers_from_commit() and trailer_iterator_init() accordingly.

Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
655eb65d48 interpret-trailers: access trailer_info with new helpers
Instead of directly accessing trailer_info members, access them
indirectly through new helper functions exposed by the trailer API.

This is the first of two preparatory commits which will allow us to
use the so-called "pimpl" (pointer to implementation) idiom for the
trailer API, by making the trailer_info struct private to the trailer
implementation (and thus hidden from the API).

Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
2ade05431e sequencer: use the trailer iterator
Instead of calling "trailer_info_get()", which is a low-level function
in the trailers implementation (trailer.c), call
trailer_iterator_advance(), which was specifically designed for public
consumption in f0939a0eb1 (trailer: add interface for iterating over
commit trailers, 2020-09-27).

Avoiding "trailer_info_get()" means we don't have to worry about options
like "no_divider" (relevant for parsing trailers). We also don't have to
check for things like "info.trailer_start == info.trailer_end" to see
whether there were any trailers (instead we can just check to see
whether the iterator advanced at all).

Note how we have to use "iter.raw" in order to get the same behavior as
before when we iterated over the unparsed string array (char **trailers)
in trailer_info.

Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
3be65e6ee2 trailer: teach iterator about non-trailer lines
Previously the iterator did not iterate over non-trailer lines. This was
somewhat unfortunate, because trailer blocks could have non-trailer
lines in them since 146245063e (trailer: allow non-trailers in trailer
block, 2016-10-21), which was before the iterator was created in
f0939a0eb1 (trailer: add interface for iterating over commit trailers,
2020-09-27).

So if trailer API users wanted to iterate over all lines in a trailer
block (including non-trailer lines), they could not use the iterator and
were forced to use the lower-level trailer_info struct directly (which
provides a raw string array that includes all lines in the trailer
block).

Change the iterator's behavior so that we also iterate over non-trailer
lines, instead of skipping over them. The new "raw" member of the
iterator allows API users to access previously inaccessible non-trailer
lines. Reword the variable "trailer" to just "line" because this
variable can now hold both trailer lines _and_ non-trailer lines.

The new "raw" member is important because anyone currently not using the
iterator is using trailer_info's raw string array directly to access
lines to check what the combined key + value looks like. If we didn't
provide a "raw" member here, iterator users would have to re-construct
the unparsed line by concatenating the key and value back together again

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
56b04883f0 trailer: add unit tests for trailer iterator
Test the number of trailers found by the iterator (to be more precise,
the parsing mechanism which the iterator just walks over) when given
some arbitrary log message.

We test the iterator because it is a public interface function exposed
by the trailer API (we generally don't want to test internal
implementation details which are, unlike the API, subject to drastic
changes).

Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:03 -07:00
704b59099e Makefile: sort UNIT_TEST_PROGRAMS
Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:53:51 -07:00
7b97dfe47b color: add support for 12-bit RGB colors
RGB color parsing currently supports 24-bit values in the form #RRGGBB.

As in Cascading Style Sheets (CSS [1]), also allow to specify an RGB color
using only three digits with #RGB.

In this shortened form, each of the digits is – again, as in CSS –
duplicated to convert the color to 24 bits, e.g. #f1b specifies the same
color as #ff11bb.

In color.h, remove the '0x' prefix in the example to match the actual
syntax.

[1] https://developer.mozilla.org/en-US/docs/Web/CSS/hex-color

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:30:38 -07:00
d78d692efc t/t4026-color: add test coverage for invalid RGB colors
Make sure that the RGB color parser rejects invalid characters and
invalid lengths.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:30:38 -07:00
e95af749a2 t/t4026-color: remove an extra double quote character
This is most probably just an editing left-over from cb357221a4 (t4026:
test "normal" color, 2014-11-20) which added this test.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:30:37 -07:00
d424488901 rev-parse: document how --is-* options work outside a repository
When "git rev-parse" is run with the "--is-inside-work-tree" option
and friends outside a Git repository, the command exits with a
non-zero status and says "fatal: not a repository".  While it is not
wrong per-se, in the sense that it is useless to learn if we are
inside or outside a working tree in the first place when we are not
even in a repository, it could be argued that they should emit
"false" and exit with status 0, as they cannot possibly be "true".

As the current behaviour has been with us for a decade or more
since it was introduced in Git 1.5.3 timeframe, it is too late to
change it.

And arguably, the current behaviour is easier to use if you want to
distinguish among three states, i.e.,

 (1) the cwd is not controlled by Git at all
 (2) the cwd is inside a working tree
 (3) the cwd is not inside a working tree (e.g., .git/hooks/)

with a single invocation of the command by doing

    if inout=$(git rev-parse --is-inside-work-tree)
    then
        case "$inout" in
        true)   : in a working tree ;;
        false)  : not in a working tree ;;
        esac
    else
        : not in a repository
    fi

So, let's document clearly that the command will die() when run
outside a repository in general, unless in some special cases like
when the command is in the --parseopt mode.

While at it, update the introductory text that makes it sound as if
the primary operating mode is the only operating mode of the
command, which was written long before we added "--parseopt" and
"--sq-quote" modes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-01 12:56:09 -07:00
a5a4cb7b27 diff-lib: stop calling diff_setup_done() in do_diff_cache()
d44e5267ea (diff-lib: plug minor memory leaks in do_diff_cache(),
2020-11-14) added the call to diff_setup_done() to release the memory
of the parseopt member of struct diff_options that repo_init_revisions()
had allocated via repo_diff_setup() and prep_parse_options().

189e97bc4b (diff: remove parseopts member from struct diff_options,
2022-12-01) did away with that allocation; diff_setup_done() doesn't
release any memory anymore.  So stop calling this function on the blank
diffopt member before it is overwritten, as this is no longer necessary.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-01 09:44:10 -07:00
3c20acdf46 completion: zsh: stop leaking local cache variable
Completing commands like "git rebase" in one repository will leak the
local __git_repo_path into the shell's environment so that completing
commands after changing to a different repository will give the old
repository's references (or none at all).

The bug report on the mailing list [1] suggests one simple way to observe
this yourself:

Enter the following commands from some directory:
  mkdir a b b/c
  for d (a b); git -C $d init && git -C $d commit --allow-empty -m init
  cd a
  git branch foo
  pushd ../b/c
  git branch bar

Now type these:
  git rebase <TAB>… # completion for bar available; C-c to abort
  declare -p __git_repo_path # outputs /path/to/b/.git
  popd
  git branch # outputs foo, main
  git rebase <TAB>… # completion candidates are bar, main!

Ideally, the last typed <TAB> should be yielding foo, main.

Commit beb6ee7163 (completion: extract repository discovery from
__gitdir(), 2017-02-03) anticipated this problem by marking
__git_repo_path as local in __git_main and __gitk_main for Bash
completion but did not give the same mark to _git for Zsh completion.
Thus make __git_repo_path local for Zsh completion, too.

[1]: https://lore.kernel.org/git/CALnO6CBv3+e2WL6n6Mh7ZZHCX2Ni8GpvM4a-bQYxNqjmgZdwdg@mail.gmail.com/

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-30 15:24:56 -07:00
d4cc1ec35f Start the 2.46 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-30 14:52:20 -07:00
75b182d34e Merge branch 'js/for-each-repo-keep-going'
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out.  Now it keeps going.

* js/for-each-repo-keep-going:
  maintenance: running maintenance should not stop on errors
  for-each-repo: optionally keep going on an error
2024-04-30 14:49:45 -07:00
473dcb4d89 Merge branch 'js/build-fuzz-more-often'
In addition to building the objects needed, try to link the objects
that are used in fuzzer tests, to make sure at least they build
without bitrot, in Linux CI runs.

* js/build-fuzz-more-often:
  fuzz: link fuzz programs with `make all` on Linux
2024-04-30 14:49:44 -07:00
07410bb4e8 Merge branch 'la/doc-use-of-contacts-when-contributing'
Advertise "git contacts", a tool for newcomers to find people to
ask review for their patches, a bit more in our developer
documentation.

* la/doc-use-of-contacts-when-contributing:
  SubmittingPatches: demonstrate using git-contacts with git-send-email
  SubmittingPatches: add heading for format-patch and send-email
  SubmittingPatches: dedupe discussion of security patches
  SubmittingPatches: discuss reviewers first
  SubmittingPatches: quote commands
  SubmittingPatches: mention GitGitGadget
  SubmittingPatches: clarify 'git-contacts' location
  MyFirstContribution: mention contrib/contacts/git-contacts
2024-04-30 14:49:44 -07:00
90f6b5a597 Merge branch 'aj/stash-staged-fix'
"git stash -S" did not handle binary files correctly, which has
been corrected.

* aj/stash-staged-fix:
  stash: fix "--staged" with binary files
2024-04-30 14:49:43 -07:00
708e9257f8 Merge branch 'jc/format-patch-rfc-more'
The "--rfc" option of "git format-patch" learned to take an
optional string value to be used in place of "RFC" to tweak the
"[PATCH]" on the subject header.

* jc/format-patch-rfc-more:
  format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
  format-patch: allow --rfc to optionally take a value, like --rfc=WIP
2024-04-30 14:49:43 -07:00
07fc8275e1 Merge branch 'ds/format-patch-rfc-and-k'
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".

* ds/format-patch-rfc-and-k:
  format-patch: ensure that --rfc and -k are mutually exclusive
2024-04-30 14:49:42 -07:00
55e5548a0f Merge branch 'xx/disable-replace-when-building-midx'
The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.

* xx/disable-replace-when-building-midx:
  midx: disable replace objects
2024-04-30 14:49:42 -07:00
c9f43012a1 Merge branch 'pw/rebase-m-signoff-fix'
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.

* pw/rebase-m-signoff-fix:
  rebase -m: fix --signoff with conflicts
  sequencer: store commit message in private context
  sequencer: move current fixups to private context
  sequencer: start removing private fields from public API
  sequencer: always free "struct replay_opts"
2024-04-30 14:49:41 -07:00
26998ed2a2 add-patch: response to unknown command
When the user gives an unknown command to the "add -p" prompt, the list
of accepted commands with their explanation is given.  This is the same
output they get when they say '?'.

However, the unknown command may be due to a user input error rather
than the user not knowing the valid command.

To reduce the likelihood of user confusion and error repetition, instead
of displaying the list of accepted commands, display a short error
message with the unknown command received, as feedback to the user.

Include a reminder about the current command '?' in the new message, to
guide the user if they want help.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-30 12:02:50 -07:00
9d225b025d add-patch: do not show UI messages on stderr
There is no need to show some UI messages on stderr, and yet doing so
may produce some undesirable results, such as messages appearing in an
unexpected order.

Let's use stdout for all UI messages, and adjusts the tests accordingly.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-30 12:02:39 -07:00
2c7b491c1d Git 2.45.1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-29 20:42:46 +02:00
1c00f92eb5 Sync with 2.44.1
* maint-2.44: (41 commits)
  Git 2.44.1
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  ...
2024-04-29 20:42:30 +02:00
786a3e4b8d Git 2.45
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-29 07:30:29 -07:00
f4edad9530 Merge tag 'l10n-2.45.0-rnd1' of https://github.com/git-l10n/git-po
l10n-2.45.0-rnd1

* tag 'l10n-2.45.0-rnd1' of https://github.com/git-l10n/git-po:
  l10n: tr: Update Turkish translations
  l10n: zh_CN: for git 2.45 rounds
  l10n: zh-TW: Git 2.45
  l10n: vi: Updated translation for 2.45
  l10n: TEAMS: retire l10n teams no update in 1 year
  l10n: uk: v2.45 update
  l10n: sv.po: Update Swedish translation
  l10n: Update German translation
  l10n: po-id for 2.45
  l10n: bg.po: Updated Bulgarian translation (5652t)
  l10n: fr: v2.45.0
  l10n: Update Vietnamese team contact
2024-04-29 07:29:35 -07:00
2cf631412d Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5652t)
2024-04-29 14:50:23 +08:00
afb6f74b96 Merge branch 'fr_v2.45.0' of github.com:jnavila/git
* 'fr_v2.45.0' of github.com:jnavila/git:
  l10n: fr: v2.45.0
2024-04-29 14:49:44 +08:00
c994a2c5ea l10n: tr: Update Turkish translations
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2024-04-29 01:12:09 +03:00
aa5ce16a4f Merge branch 'l10n/zh-TW/240428' of github.com:l10n-tw/git-po
* 'l10n/zh-TW/240428' of github.com:l10n-tw/git-po:
  l10n: zh-TW: Git 2.45
2024-04-28 20:36:57 +08:00
1919aa01b5 Merge branch 'tl/zh_CN_2.45.0_rnd' of github.com:dyrone/git
* 'tl/zh_CN_2.45.0_rnd' of github.com:dyrone/git:
  l10n: zh_CN: for git 2.45 rounds
2024-04-28 20:35:54 +08:00
b705d3a745 l10n: zh_CN: for git 2.45 rounds
Signed-off-by: Teng Long <dyroneteng@gmail.com>
2024-04-28 20:31:55 +08:00
ef7ba0e1f2 l10n: zh-TW: Git 2.45
Co-Authored-By: Lumynous <lumynou5.tw@gmail.com>
Co-Authored-By: Kisaragi Hiu <mail@kisaragi-hiu.com>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2024-04-28 18:54:03 +08:00
7ddd462820 Merge branch 'update-teams' of https://github.com/Nekosha/git-po
* 'update-teams' of https://github.com/Nekosha/git-po:
  l10n: Update Vietnamese team contact
2024-04-28 18:28:48 +08:00
562f54eb3d l10n: vi: Updated translation for 2.45
Signed-off-by: Vũ Tiến Hưng <newcomerminecraft@gmail.com>
2024-04-28 14:05:51 +07:00
900af19275 l10n: TEAMS: retire l10n teams no update in 1 year
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2024-04-28 07:33:25 +08:00
9a9872ad87 Merge branch 'l10n/uk/2.45-uk-update'
* '2.45-uk-update' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: uk: v2.45 update
2024-04-28 07:30:08 +08:00
1b632c84d7 Merge branch 'l10n-de-2.45' of github.com:ralfth/git
* 'l10n-de-2.45' of github.com:ralfth/git:
  l10n: Update German translation
2024-04-28 07:25:22 +08:00
155ceb38ce Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.45
2024-04-28 07:23:52 +08:00
e35a8c9a52 l10n: uk: v2.45 update
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2024-04-27 11:41:08 -07:00
7607417b23 l10n: sv.po: Update Swedish translation
Also fix some inconsistencies, and fix issue reported by
Anders Jonsson <anders.jonsson@norsjovallen.se>.

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2024-04-27 15:21:53 +01:00
fedd5c79ff vimdiff: make script and tests work with zsh
When we process the $LAYOUT variable through sed, the result will end
with the character "#".  We then split it at the shell using IFS so that
we can process it a character at a time.

POSIX specifies that only "IFS white space shall be ignored at the
beginning and end of the input".  The hash mark is not a white space
character, so it is not ignored at the beginning and end of the input.

POSIX then specifies that "[e]ach occurrence in the input of an IFS
character that is not IFS white space, along with any adjacent IFS white
space, shall delimit a field, as described previously."  Thus, the final
hash mark delimits a field, and the final field is the empty string.

zsh implements this behavior strictly in compliance with POSIX (and
differently from most other shells), such that we end up with a trailing
empty field.  We don't want this empty field and processing it in the
normal way causes us to fail to parse properly and fail the tests with
"ERROR" entries, so let's just ignore it instead.  This is the behavior
of bash and dash anyway and what was clearly intended, so this is a
reasonable thing to do.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-26 16:31:46 -07:00
058b8dc2c2 t4046: avoid continue in &&-chain for zsh
zsh has a bug in which the keyword "continue" within an &&-chain is not
effective and the code following it is executed nonetheless.
Fortunately, this bug has been fixed upstream in 12e5db145 ("51608:
Don't execute commands after "continue &&"", 2023-03-29).  However, zsh
releases very infrequently, so it is not present in a stable release
yet.

That, combined with the fact that almost all zsh users get their shell
from their OS vendor, means that it will likely be a long time before
this problem is fixed for most users.  We have other workarounds in
place for FreeBSD ash and dash, so it shouldn't be too difficult to add
one here, either.

Replace the existing code with a test and if-block, which comes only at
the cost of an additional indentation, and leaves the code a little more
idiomatic anyway.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-26 16:31:46 -07:00
3a8a93672b l10n: Update German translation
Reviewed-by: Matthias Rüster <matthias.ruester@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2024-04-26 16:24:36 +02:00
4c4e43e736 l10n: po-id for 2.45
Translate following new components:

  * refs/reftable-backend.c

Update following components:

  * branch.c
  * builtin/column.c
  * builtin/config.c
  * builtin/for-each-ref.c
  * builtin/pack-refs.c
  * revision.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2024-04-26 15:52:10 +07:00
4cf6e7bf5e doc: clarify practices for submitting updated patch versions
The `SubmittingPatches` documentation briefly mentions that related
patches should be grouped together in their own e-mail thread. Expand on
this to explicitly state that updated versions of a patch series should
also follow this. Also provide add a link to existing documentation from
`MyFirstContribution` that provides detailed instructions on how to do
this via `git-send-email(1)`.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-25 14:49:06 -07:00
e326e52010 Merge branch 'rj/add-i-leak-fix'
Leakfix.

* rj/add-i-leak-fix:
  add: plug a leak on interactive_add
  add-patch: plug a leak handling the '/' command
  add-interactive: plug a leak in get_untracked_files
  apply: plug a leak in apply_data
2024-04-25 10:34:24 -07:00
c9d1ee7cdf Merge branch 'rs/vsnprintf-failure-is-not-a-bug'
Demote a BUG() to an die() when the failure from vsnprintf() may
not be due to a programmer error.

* rs/vsnprintf-failure-is-not-a-bug:
  don't report vsnprintf(3) error as bug
2024-04-25 10:34:23 -07:00
6b7c45e8c9 completion: add docs on how to add subcommand completions
Signed-off-by: Roland Hieber <rhi@pengutronix.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-25 09:23:27 -07:00
d13a295074 completion: improve docs for using __git_complete
It took me more than a few tries and a good lecture of __git_main to
understand that the two paragraphs really only refer to adding
completion functions for executables that are not called through git's
subcommand magic. Improve the docs and be more specific.

Signed-off-by: Roland Hieber <rhi@pengutronix.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-25 09:23:26 -07:00
cb85fdf4a4 completion: add 'symbolic-ref'
Even 'symbolic-ref' is only completed when
GIT_COMPLETION_SHOW_ALL_COMMANDS=1 is set, it currently defaults to
completing file names, which is not very helpful. Add a simple
completion function which completes options and refs.

Signed-off-by: Roland Hieber <rhi@pengutronix.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-25 09:23:26 -07:00
8427b7e72b fuzz: link fuzz programs with make all on Linux
Since 5e47215080 (fuzz: add basic fuzz testing target., 2018-10-12), we
have compiled object files for the fuzz tests as part of the default
'make all' target. This helps prevent bit-rot in lesser-used parts of
the codebase, by making sure that incompatible changes are caught at
build time.

However, since we never linked the fuzzer executables, this did not
protect us from link-time errors. As of 8b9a42bf48 (fuzz: fix fuzz test
build rules, 2024-01-19), it's now possible to link the fuzzer
executables without using a fuzzing engine and a variety of
compiler-specific (and compiler-version-specific) flags, at least on
Linux. So let's add a platform-specific option in config.mak.uname to
link the executables as part of the default `make all` target.

Since linking the fuzzer executables without a fuzzing engine does not
require a C++ compiler, we can change the FUZZ_PROGRAMS build rule to
use $(CC) by default. This avoids compiler mis-match issues when
overriding $(CC) but not $(CXX). When we *do* want to actually link with
a fuzzing engine, we can set $(FUZZ_CXX). The build instructions in the
CI fuzz-smoke-test job and in the Makefile comment have been updated
accordingly.

While we're at it, we can consolidate some of the fuzzer build
instructions into one location in the Makefile.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-24 11:56:40 -07:00
c75662bfc9 maintenance: running maintenance should not stop on errors
In https://github.com/microsoft/git/issues/623, it was reported that
maintenance stops on a missing repository, omitting the remaining
repositories that were scheduled for maintenance.

This is undesirable, as it should be a best effort type of operation.

It should still fail due to the missing repository, of course, but not
leave the non-missing repositories in unmaintained shapes.

Let's use `for-each-repo`'s shiny new `--keep-going` option that we just
introduced for that very purpose.

This change will be picked up when running `git maintenance start`,
which is run implicitly by `scalar reconfigure`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-24 10:46:03 -07:00
12c2ee5fbd for-each-repo: optionally keep going on an error
In https://github.com/microsoft/git/issues/623, it was reported that
the regularly scheduled maintenance stops if one repo in the middle of
the list was found to be missing.

This is undesirable, and points out a gap in the design of `git
for-each-repo`: We need a mode where that command does not stop on an
error, but continues to try running the specified command with the other
repositories.

Imitating the `--keep-going` option of GNU make, this commit teaches
`for-each-repo` the same trick: to continue with the operation on all
the remaining repositories in case there was a problem with one
repository, still setting the exit code to indicate an error occurred.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-24 10:46:03 -07:00
9f32d8da7a Documentation/RelNotes/2.45.0.txt: fix typo
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-24 10:32:55 -07:00
bf995e7a4f Git 2.45-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23 15:05:56 -07:00
5c7ffafcea Merge branch 'ps/run-auto-maintenance-in-receive-pack'
The "receive-pack" program (which responds to "git push") was not
converted to run "git maintenance --auto" when other codepaths that
used to run "git gc --auto" were updated, which has been corrected.

* ps/run-auto-maintenance-in-receive-pack:
  builtin/receive-pack: convert to use git-maintenance(1)
  run-command: introduce function to prepare auto-maintenance process
2024-04-23 15:05:56 -07:00
5b78774820 Merge branch 'pk/bisect-use-show'
When "git bisect" reports the commit it determined to be the
culprit, we used to show it in a format that does not honor common
UI tweaks, like log.date and log.decorate.  The code has been
taught to use "git show" to follow more customizations.

* pk/bisect-use-show:
  bisect: report the found commit with "show"
2024-04-23 15:05:56 -07:00
10f1281498 A bit more topics before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23 11:52:42 -07:00
b0679fa2b8 Merge branch 'rs/apply-reject-long-name'
The filename used for rejected hunks "git apply --reject" creates
was limited to PATH_MAX, which has been lifted.

* rs/apply-reject-long-name:
  apply: avoid using fixed-size buffer in write_out_one_reject()
2024-04-23 11:52:42 -07:00
7b66f5dd8b Merge branch 'mr/rerere-crash-fix'
When .git/rr-cache/ rerere database gets corrupted or rerere is fed to
work on a file with conflicted hunks resolved incompletely, the rerere
machinery got confused and segfaulted, which has been corrected.

* mr/rerere-crash-fix:
  rerere: fix crashes due to unmatched opening conflict markers
2024-04-23 11:52:41 -07:00
fb9f603f3c Merge branch 'rs/imap-send-simplify-cmd-issuing-codepath'
Code simplification.

* rs/imap-send-simplify-cmd-issuing-codepath:
  imap-send: increase command size limit
2024-04-23 11:52:41 -07:00
9cb0bbf0b4 Merge branch 'xx/rfc2822-date-format-in-doc'
Docfix.

* xx/rfc2822-date-format-in-doc:
  Documentation: fix typos describing date format
2024-04-23 11:52:40 -07:00
567293123d Merge branch 'ps/missing-btmp-fix'
GIt 2.44 introduced a regression that makes the updated code to
barf in repositories with multi-pack index written by older
versions of Git, which has been corrected.

* ps/missing-btmp-fix:
  pack-bitmap: gracefully handle missing BTMP chunks
2024-04-23 11:52:40 -07:00
c9f1f88bb0 Merge branch 'la/format-trailer-info'
The code to format trailers have been cleaned up.

* la/format-trailer-info:
  trailer: finish formatting unification
  trailer: begin formatting unification
  format_trailer_info(): append newline for non-trailer lines
  format_trailer_info(): drop redundant unfold_value()
  format_trailer_info(): use trailer_item objects
2024-04-23 11:52:39 -07:00
b258237f4d Merge branch 'dd/t9604-use-posix-timezones'
The cvsimport tests required that the platform understands
traditional timezone notations like CST6CDT, which has been
updated to work on those systems as long as they understand
POSIX notation with explicit tz transition dates.

* dd/t9604-use-posix-timezones:
  t9604: Fix test for musl libc and new Debian
2024-04-23 11:52:39 -07:00
5615be39bc Merge branch 'rj/launch-editor-error-message'
Git writes a "waiting for your editor" message on an incomplete
line after launching an editor, and then append another error
message on the same line if the editor errors out.  It now clears
the "waiting for..." line before giving the error message.

* rj/launch-editor-error-message:
  launch_editor: waiting message on error
2024-04-23 11:52:39 -07:00
7f49008602 Merge branch 'yb/replay-doc-linkfix'
Docfix.

* yb/replay-doc-linkfix:
  Documentation: fix linkgit reference
2024-04-23 11:52:38 -07:00
ec465fcb75 Merge branch 'rs/no-openssl-compilation-fix-on-macos'
Build fix.

* rs/no-openssl-compilation-fix-on-macos:
  git-compat-util: fix NO_OPENSSL on current macOS
2024-04-23 11:52:38 -07:00
050e334979 Merge branch 'ta/fast-import-parse-path-fix'
The way "git fast-import" handles paths described in its input has
been tightened up and more clearly documented.

* ta/fast-import-parse-path-fix:
  fast-import: make comments more precise
  fast-import: forbid escaped NUL in paths
  fast-import: document C-style escapes for paths
  fast-import: improve documentation for path quoting
  fast-import: remove dead strbuf
  fast-import: allow unquoted empty path for root
  fast-import: directly use strbufs for paths
  fast-import: tighten path unquoting
2024-04-23 11:52:37 -07:00
33bbc21c92 Merge branch 'ps/reftable-block-iteration-optim'
The code to iterate over reftable blocks has seen some optimization
to reduce memory allocation and deallocation.

* ps/reftable-block-iteration-optim:
  reftable/block: avoid copying block iterators on seek
  reftable/block: reuse `zstream` state on inflation
  reftable/block: open-code call to `uncompress2()`
  reftable/block: reuse uncompressed blocks
  reftable/reader: iterate to next block in place
  reftable/block: move ownership of block reader into `struct table_iter`
  reftable/block: introduce `block_reader_release()`
  reftable/block: better grouping of functions
  reftable/block: merge `block_iter_seek()` and `block_reader_seek()`
  reftable/block: rename `block_reader_start()`
2024-04-23 11:52:37 -07:00
ce36894509 format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
In the previous step, the "--rfc" option of "format-patch" learned
to take an optional string value to prepend to the subject prefix,
so that --rfc=WIP can give "[WIP PATCH]".

There may be cases in which the extra string wants to come after the
subject prefix.  Extend the mechanism to allow "--rfc=-(WIP)" [*] to
signal that the extra string is to be appended instead of getting
prepended, resulting in "[PATCH (WIP)]".

In the documentation, discourage (ab)using "--rfc=-RFC" to say
"[PATCH RFC]" just to be different, when "[RFC PATCH]" is the norm.

[Footnote]

 * The syntax takes inspiration from Perl's open syntax that opens
   pipes "open fh, '|-', 'cmd'", where the dash signals "the other
   stuff comes here".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23 11:00:39 -07:00
ce48fb2eab format-patch: allow --rfc to optionally take a value, like --rfc=WIP
With the "--rfc" option, we can tweak the "[PATCH]" (or whatever
string specified with the "--subject-prefix" option, instead of
"PATCH") that we prefix the title of the commit with into "[RFC
PATCH]", but some projects may want "[rfc PATCH]".  Adding a new
option, e.g., "--rfc-lowercase", to support such need every time
somebody wants to use different strings would lead to insanity of
accumulating unbounded number of such options.

Allow an optional value specified for the option, so that users can
use "--rfc=rfc" (think of "--rfc" without value as a short-hand for
"--rfc=RFC") if they wanted to.

This can of course be (ab)used to make the prefix "[WIP PATCH]" by
passing "--rfc=WIP".  Passing an empty string, i.e., "--rfc=", is
the same as "--no-rfc" to override an option given earlier on the
same command line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23 11:00:38 -07:00
16727404c4 add: plug a leak on interactive_add
Plug a leak we have since 5a76aff1a6 (add: convert to use
parse_pathspec, 2013-07-14).

This leak can be triggered with:
    $ git add -p anything

Fixing this leak allows us to mark as leak-free the following tests:

    + t3701-add-interactive.sh
    + t7514-commit-patch.sh

Mark them with "TEST_PASSES_SANITIZE_LEAK=true" to notice and fix
promply any new leak that may be introduced and triggered by them in the
future.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 16:27:43 -07:00
ec9b74b18e add-patch: plug a leak handling the '/' command
Plug a leak we have since d6cf873340 (built-in add -p: implement the '/'
("search regex") command, 2019-12-13).

This leak can be triggered with:

    $ printf "A\n\nB\n" >file
    $ git add file && git commit -m file
    $ printf "AA\n\nBB\n" >file
    $ printf "s\n/ .\n" >lines
    $ git add -p <lines

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 16:27:42 -07:00
5861aa84a7 add-interactive: plug a leak in get_untracked_files
Plug a leak we have since ab1e1cccaf (built-in add -i: re-implement
`add-untracked` in C, 2019-11-29).

This leak can be triggered with:

	$ echo a | git add -i

As a curiosity, we have a somewhat similar function in builtin/stash.c,
which correctly frees the memory.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 16:27:42 -07:00
71c7916053 apply: plug a leak in apply_data
We have an execution path in apply_data that leaks the local struct
image.  Plug it.

This leak can be triggered with:

    $ echo foo >file
    $ git add file && git commit -m file
    $ echo bar >file
    $ git diff file >diff
    $ sed s/foo/frotz/ <diff >baddiff
    $ git apply --cached <baddiff

Fixing this leak allows us to mark as leak-free the following tests:

    + t2016-checkout-patch.sh
    + t4103-apply-binary.sh
    + t4104-apply-boundary.sh
    + t4113-apply-ending.sh
    + t4117-apply-reject.sh
    + t4123-apply-shrink.sh
    + t4252-am-options.sh
    + t4258-am-quoted-cr.sh

Mark them with "TEST_PASSES_SANITIZE_LEAK=true" to notice and fix
promply any new leak that may be introduced and triggered by them in the
future.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 16:27:42 -07:00
5fb7686409 stash: fix "--staged" with binary files
"git stash --staged" errors out when given binary files, after saving the
stash.

This behaviour dates back to the addition of the feature in 41a28eb6c1
(stash: implement '--staged' option for 'push' and 'save', 2021-10-18).
Adding the "--binary" option of "diff-tree" fixes this. The "diff-tree" call
in stash_patch() also omits "--binary", but that is fine since binary files
cannot be selected interactively.

Helped-By: Jeff King <peff@peff.net>
Helped-By: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Adam Johnson <me@adamj.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 13:57:18 -07:00
00e10ef10e docs: address typos in Git v2.45 changelog
Address some typos in the Git v2.45 changelog.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 08:54:01 -07:00
bbeb79789c docs: improve changelog entry for git pack-refs --auto
The changelog entry for the new `git pack-refs --auto` mode only says
that the new flag is useful, but doesn't really say what it does. Add
some more information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 08:54:01 -07:00
bf3fe4f1a2 docs: remove duplicate entry and fix typo in 2.45 changelog
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 08:53:41 -07:00
0283cd5161 don't report vsnprintf(3) error as bug
strbuf_addf() has been reporting a negative return value of vsnprintf(3)
as a bug since f141bd804d (Handle broken vsnprintf implementations in
strbuf, 2007-11-13).  Other functions copied that behavior:

7b03c89ebd (add xsnprintf helper function, 2015-09-24)
5ef264dbdb (strbuf.c: add `strbuf_insertf()` and `strbuf_vinsertf()`, 2019-02-25)
8d25663d70 (mem-pool: add mem_pool_strfmt(), 2024-02-25)

However, vsnprintf(3) can legitimately return a negative value if the
formatted output would be longer than INT_MAX.  Stop accusing it of
being broken and just report the fact that formatting failed.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-21 12:27:07 -07:00
d35a5cf850 l10n: bg.po: Updated Bulgarian translation (5652t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2024-04-21 17:00:36 +02:00
aa7b8b7567 l10n: fr: v2.45.0
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2024-04-20 17:16:20 +08:00
7be7783164 l10n: Update Vietnamese team contact
The previous team has not maintained the translation since 2.37. Leader
has agreed to transfer leadership to me.

Signed-off-by: Vũ Tiến Hưng <newcomerminecraft@gmail.com>
2024-04-20 12:02:27 +07:00
ae3196a5ea Git 2.45-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-19 09:13:47 -07:00
6c69d3a91f Merge branch 'la/mailmap-entry'
Update contact address for Linus Arver.

* la/mailmap-entry:
  mailmap: change primary address for Linus Arver
2024-04-19 09:13:47 -07:00
18dd9301a2 Merge branch 'pf/commitish-committish'
Spellfix.

* pf/commitish-committish:
  typo: replace 'commitish' with 'committish'
2024-04-19 09:13:47 -07:00
cadcf58085 format-patch: ensure that --rfc and -k are mutually exclusive
Fix a bug that allows the "--rfc" and "-k" options to be specified together
when "git format-patch" is executed, which was introduced in the commit
e0d7db7423 ("format-patch: --rfc honors what --subject-prefix sets").

Add a couple of additional tests to t4014, to cover additional cases of
the mutual exclusivity between different "git format-patch" options.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-19 08:40:57 -07:00
10dc9846b8 Git 2.44.1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:56 +02:00
e5e6663e69 Sync with 2.43.4
* maint-2.43: (40 commits)
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  ...
2024-04-19 12:38:54 +02:00
1f2e64e22d Git 2.43.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:52 +02:00
8e97ec3662 Sync with 2.42.2
* maint-2.42: (39 commits)
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  ...
2024-04-19 12:38:50 +02:00
babb4e5d71 Git 2.42.2
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:48 +02:00
be348e9815 Sync with 2.41.1
* maint-2.41: (38 commits)
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  docs: document security issues around untrusted .git dirs
  ...
2024-04-19 12:38:46 +02:00
0f15832059 Git 2.41.1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:43 +02:00
f5b2af06f5 Sync with 2.40.2
* maint-2.40: (39 commits)
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  docs: document security issues around untrusted .git dirs
  upload-pack: disable lazy-fetching by default
  ...
2024-04-19 12:38:42 +02:00
b9b439e0e3 Git 2.40.2
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:38 +02:00
93a88f42db Sync with 2.39.4
* maint-2.39: (38 commits)
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  docs: document security issues around untrusted .git dirs
  upload-pack: disable lazy-fetching by default
  fetch/clone: detect dubious ownership of local repositories
  ...
2024-04-19 12:38:37 +02:00
47b6d90e91 Git 2.39.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:33 +02:00
9e65df5eab Merge branch 'ownership-checks-in-local-clones'
This topic addresses two CVEs:

- CVE-2024-32020:

  Local clones may end up hardlinking files into the target repository's
  object database when source and target repository reside on the same
  disk. If the source repository is owned by a different user, then
  those hardlinked files may be rewritten at any point in time by the
  untrusted user.

- CVE-2024-32021:

  When cloning a local source repository that contains symlinks via the
  filesystem, Git may create hardlinks to arbitrary user-readable files
  on the same filesystem as the target repository in the objects/
  directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:32 +02:00
2b3d38a6b1 Merge branch 'defense-in-depth'
This topic branch adds a couple of measures designed to make it much
harder to exploit any bugs in Git's recursive clone machinery that might
be found in the future.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:29 +02:00
a33fea0886 fsck: warn about symlink pointing inside a gitdir
In the wake of fixing a vulnerability where `git clone` mistakenly
followed a symbolic link that it had just written while checking out
files, writing into a gitdir, let's add some defense-in-depth by
teaching `git fsck` to report symbolic links stored in its trees that
point inside `.git/`.

Even though the Git project never made any promises about the exact
shape of the `.git/` directory's contents, there are likely repositories
out there containing symbolic links that point inside the gitdir. For
that reason, let's only report these as warnings, not as errors.
Security-conscious users are encouraged to configure
`fsck.symlinkPointsToGitDir = error`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:25 +02:00
20f3588efc core.hooksPath: add some protection while cloning
Quite frequently, when vulnerabilities were found in Git's (quite
complex) clone machinery, a relatively common way to escalate the
severity was to trick Git into running a hook which is actually a script
that has just been laid on disk as part of that clone. This constitutes
a Remote Code Execution vulnerability, the highest severity observed in
Git's vulnerabilities so far.

Some previously-fixed vulnerabilities allowed malicious repositories to
be crafted such that Git would check out files not in the worktree, but
in, say, a submodule's `<git>/hooks/` directory.

A vulnerability that "merely" allows to modify the Git config would
allow a related attack vector, to manipulate Git into looking in the
worktree for hooks, e.g. redirecting the location where Git looks for
hooks, via setting `core.hooksPath` (which would be classified as
CWE-427: Uncontrolled Search Path Element and CWE-114: Process Control,
for more details see https://cwe.mitre.org/data/definitions/427.html and
https://cwe.mitre.org/data/definitions/114.html).

To prevent that attack vector, let's error out and complain loudly if an
active `core.hooksPath` configuration is seen in the repository-local
Git config during a `git clone`.

There is one caveat: This changes Git's behavior in a slightly
backwards-incompatible manner. While it is probably a rare scenario (if
it exists at all) to configure `core.hooksPath` via a config in the Git
templates, it _is_ conceivable that some valid setup requires this to
work. In the hopefully very unlikely case that a user runs into this,
there is an escape hatch: set the `GIT_CLONE_PROTECTION_ACTIVE=false`
environment variable. Obviously, this should be done only with utmost
caution.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:24 +02:00
4412a04fe6 init.templateDir: consider this config setting protected
The ability to configuring the template directory is a delicate feature:
It allows defining hooks that will be run e.g. during a `git clone`
operation, such as the `post-checkout` hook.

As such, it is of utmost importance that Git would not allow that config
setting to be changed during a `git clone` by mistake, allowing an
attacker a chance for a Remote Code Execution, allowing attackers to run
arbitrary code on unsuspecting users' machines.

As a defense-in-depth measure, to prevent minor vulnerabilities in the
`git clone` code from ballooning into higher-serverity attack vectors,
let's make this a protected setting just like `safe.directory` and
friends, i.e. ignore any `init.templateDir` entries from any local
config.

Note: This does not change the behavior of any recursive clone (modulo
bugs), as the local repository config is not even supposed to be written
while cloning the superproject, except in one scenario: If a config
template is configured that sets the template directory. This might be
done because `git clone --recurse-submodules --template=<directory>`
does not pass that template directory on to the submodules'
initialization.

Another scenario where this commit changes behavior is where
repositories are _not_ cloned recursively, and then some (intentional,
benign) automation configures the template directory to be used before
initializing the submodules.

So the caveat is that this could theoretically break existing processes.

In both scenarios, there is a way out, though: configuring the template
directory via the environment variable `GIT_TEMPLATE_DIR`.

This change in behavior is a trade-off between security and
backwards-compatibility that is struck in favor of security.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:24 +02:00
8db1e8743c clone: prevent hooks from running during a clone
Critical security issues typically combine relatively common
vulnerabilities such as case confusion in file paths with other
weaknesses in order to raise the severity of the attack.

One such weakness that has haunted the Git project in many a
submodule-related CVE is that any hooks that are found are executed
during a clone operation. Examples are the `post-checkout` and
`fsmonitor` hooks.

However, Git's design calls for hooks to be disabled by default, as only
disabled example hooks are copied over from the templates in
`<prefix>/share/git-core/templates/`.

As a defense-in-depth measure, let's prevent those hooks from running.

Obviously, administrators can choose to drop enabled hooks into the
template directory, though, _and_ it is also possible to override
`core.hooksPath`, in which case the new check needs to be disabled.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:23 +02:00
584de0b4c2 Add a helper function to compare file contents
In the next commit, Git will learn to disallow hooks during `git clone`
operations _except_ when those hooks come from the templates (which are
inherently supposed to be trusted). To that end, we add a function to
compare the contents of two files.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:19 +02:00
61e124bb2d SubmittingPatches: demonstrate using git-contacts with git-send-email
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:11 -07:00
bf96614541 SubmittingPatches: add heading for format-patch and send-email
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:11 -07:00
01ea2b2836 SubmittingPatches: dedupe discussion of security patches
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:11 -07:00
e2663c4597 SubmittingPatches: discuss reviewers first
No matter how well someone configures their email tooling, understanding
who to send the patches to is something that must always be considered.
So discuss it first instead of at the end.

In the following commit we will clean up the (now redundant) discussion
about sending security patches to the Git Security mailing list.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:10 -07:00
c8d6a54a07 SubmittingPatches: quote commands
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:10 -07:00
84b91fc465 SubmittingPatches: mention GitGitGadget
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:10 -07:00
824503ce88 SubmittingPatches: clarify 'git-contacts' location
Use a dash ("git-contacts", not "git contacts") because the script is
not installed as part of "git" toolset. This also puts the script on
one line, which should make it easier to grep for with a loose search
query, such as

    $ git grep git.contacts Documentation

Also add a footnote to describe where the script is located, to help
readers who may not be familiar with such "contrib" scripts (and how
they are not accessible with the usual "git <subcommand>" syntax).

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:10 -07:00
7e50b3f5df MyFirstContribution: mention contrib/contacts/git-contacts
Although we've had this script since 4d06402b1b (contrib: add
git-contacts helper, 2013-07-21), we don't mention it in our
introductory docs. Do so now.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 14:55:09 -07:00
a6c2654f83 rebase -m: fix --signoff with conflicts
When rebasing with "--signoff" the commit created by "rebase --continue"
after resolving conflicts or editing a commit fails to add the
"Signed-off-by:" trailer. This happens because the message from the
original commit is reused instead of the one that would have been used
if the sequencer had not stopped for the user interaction. The correct
message is stored in ctx->message and so with a couple of exceptions
this is written to rebase_path_message() when stopping for user
interaction instead. The exceptions are (i) "fixup" and "squash"
commands where the file is written by error_failed_squash() and (ii)
"edit" commands that are fast-forwarded where the original message is
still reused. The latter is safe because "--signoff" will never
fast-forward.

Note this introduces a change in behavior as the message file now
contains conflict comments. This is safe because commit_staged_changes()
passes an explicit cleanup flag when not editing the message and when
the message is being edited it will be cleaned up automatically. This
means user now sees the same message comments in editor with "rebase
--continue" as they would if they ran "git commit" themselves before
continuing the rebase. It also matches the behavior of "git
cherry-pick", "git merge" etc. which all list the files with merge
conflicts.

The tests are extended to check that all commits made after continuing a
rebase have a "Signed-off-by:" trailer. Sadly there are a couple of
leaks in apply.c which I've not been able to track down that mean this
test file is no-longer leak free when testing "git rebase --apply
--signoff" with conflicts.

Reported-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 13:33:41 -07:00
53f6746615 sequencer: store commit message in private context
Add an strbuf to "struct replay_ctx" to hold the current commit
message. This does not change the behavior but it will allow us to fix a
bug with "git rebase --signoff" in the next commit. A future patch
series will use the changes here to avoid writing the commit message to
disc unless there are conflicts or the commit is being reworded.

The changes in do_pick_commit() are a mechanical replacement of "msgbuf"
with "ctx->message". In do_merge() the code to write commit message to
disc is factored out of the conditional now that both branches store the
message in the same buffer.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 13:33:41 -07:00
497a01a2d3 sequencer: move current fixups to private context
The list of current fixups is an implementation detail of the sequencer
and so it should not be stored in the public options struct.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 13:33:41 -07:00
a3152edc97 sequencer: start removing private fields from public API
"struct replay_opts" has a number of fields that are for internal
use. While they are marked as private having them in a public struct is
a distraction for callers and means that every time the internal details
are changed we have to recompile all the files that include sequencer.h
even though the public API is unchanged. This commit starts the process
of removing the private fields by adding an opaque pointer to a "struct
replay_ctx" to "struct replay_opts" and moving the "reflog_message"
member to the new private struct.

The sequencer currently updates the state files on disc each time it
processes a command in the todo list. This is an artifact of the
scripted implementation and makes the code hard to reason about as it is
not possible to get a complete view of the state in memory. In the
future we will add new members to "struct replay_ctx" to remedy this and
avoid writing state to disc unless the sequencer stops for user
interaction.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 13:33:41 -07:00
42aae6a49a sequencer: always free "struct replay_opts"
sequencer_post_commit_cleanup() initializes an instance of "struct
replay_opts" but does not call replay_opts_release(). Currently this
does not leak memory because the code paths called don't allocate any of
the struct members. That will change in the next commit so add call to
replay_opts_release() to prevent a memory leak in "git commit" that
breaks all of the leak free tests.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 13:33:41 -07:00
2a60cb766e Merge branch 'pw/t3428-cleanup' into pw/rebase-m-signoff-fix
* pw/t3428-cleanup:
  t3428: restore coverage for "apply" backend
  t3428: use test_commit_message
  t3428: modernize test setup
2024-04-18 13:33:37 -07:00
0c47355790 repository: drop initialize_the_repository()
Now that we have dropped `the_index`, `initialize_the_repository()`
doesn't really do a lot anymore except for setting up the pointer for
`the_repository` and then calling `initialize_repository()`. The former
can be replaced by statically initializing the pointer though, which
basically makes this function moot.

Convert callers to instead call `initialize_repository(the_repository)`
and drop `initialize_thee_repository()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:43 -07:00
19fa8cd48c repository: drop the_index variable
All users of `the_index` have been converted to use either a custom
`struct index_state *` or the index provided by `the_repository`. We can
thus drop the globally-accessible declaration of this variable. In fact,
we can go further than that and drop `the_index` completely now and have
it be allocated dynamically in `initialize_repository()` as all the
other data structures in it are.

This concludes the quest to make Git `the_index` free, which has started
with 4aab5b46f4 (Make read-cache.c "the_index" free., 2007-04-01).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:42 -07:00
9ee6d63bab builtin/clone: stop using the_index
Convert git-clone(1) to use `the_repository->index` instead of
`the_index`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:42 -07:00
66bce9d00b repository: initialize index in repo_init()
When Git starts, one of the first things it will do is to call
`initialize_the_repository()`. This function sets up both the global
`the_repository` and `the_index` variables as required. Part of that
setup is also to set `the_repository.index = &the_index` so that the
index can be accessed via the repository.

When calling `repo_init()` on a repository though we set the complete
struct to all-zeroes, which will also cause us to unset the `index`
pointer. And as we don't re-initialize the index in that function, we
will end up with a `NULL` pointer here.

This has been fine until now becaues this function is only used to
create a new repository. git-init(1) does not access the index at all
after initializing the repository, whereas git-checkout(1) only uses
`the_index` directly. We are about to remove `the_index` though, which
will uncover this partially-initialized repository structure.

Refactor the code and create a common `initialize_repository()` function
that gets called from `repo_init()` and `initialize_the_repository()`.
This function sets up both the repository and the index as required.
Like this, we can easily special-case when `repo_init()` gets called
with `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:42 -07:00
f59aa5e0a9 builtin: stop using the_index
Convert builtins to use `the_repository->index` instead of `the_index`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:42 -07:00
319ba14407 t/helper: stop using the_index
Convert test-helper tools to use `the_repository->index` instead of
`the_index`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:41 -07:00
86cb6a3f05 Merge branch 'icasefs-symlink-confusion'
This topic branch fixes two vulnerabilities:

- Recursive clones on case-insensitive filesystems that support symbolic
  links are susceptible to case confusion that can be exploited to
  execute just-cloned code during the clone operation.

- Repositories can be configured to execute arbitrary code during local
  clones. To address this, the ownership checks introduced in v2.30.3
  are now extended to cover cloning local repositories.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:24 +02:00
df93e407f0 init: refactor the template directory discovery into its own function
We will need to call this function from `hook.c` to be able to prevent
hooks from running that were written as part of a `clone` but did not
originate from the template directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:10 +02:00
48c171d927 find_hook(): refactor the STRIP_EXTENSION logic
When looking for a hook and not finding one, and when `STRIP_EXTENSION`
is available (read: if we're on Windows and `.exe` is the required
extension for executable programs), we want to look also for a hook with
that extension.

Previously, we added that handling into the conditional block that was
meant to handle when no hook was found (possibly providing some advice
for the user's benefit). If the hook with that file extension was found,
we'd return early from that function instead of writing out said advice,
of course.

However, we're about to introduce a safety valve to prevent hooks from
being run during a clone, to reduce the attack surface of bugs that
allow writing files to be written into arbitrary locations.

To prepare for that, refactor the logic to avoid the early return, by
separating the `STRIP_EXTENSION` handling from the conditional block
handling the case when no hook was found.

This commit is best viewed with `--patience`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:09 +02:00
31572dc420 clone: when symbolic links collide with directories, keep the latter
When recursively cloning a repository with submodules, we must ensure
that the submodules paths do not suddenly contain symbolic links that
would let Git write into unintended locations. We just plugged that
vulnerability, but let's add some more defense-in-depth.

Since we can only keep one item on disk if multiple index entries' paths
collide, we may just as well avoid keeping a symbolic link (because that
would allow attack vectors where Git follows those links by mistake).

Technically, we handle more situations than cloning submodules into
paths that were (partially) replaced by symbolic links. This provides
defense-in-depth in case someone finds a case-folding confusion
vulnerability in the future that does not even involve submodules.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:08 +02:00
850c3a220e entry: report more colliding paths
In b878579ae7 (clone: report duplicate entries on case-insensitive
filesystems, 2018-08-17) code was added to warn about index entries that
resolve to the same file system entity (usually the cause is a
case-insensitive filesystem).

In Git for Windows, where inodes are not trusted (because of a
performance trade-off, inodes are equal to 0 by default), that check
does not compare inode numbers but the verbatim path.

This logic works well when index entries' paths differ only in case.

However, for file/directory conflicts only the file's path was reported,
leaving the user puzzled with what that path collides.

Let's try ot catch colliding paths even if one path is the prefix of the
other. We do this also in setups where the file system is case-sensitive
because the inode check would not be able to catch those collisions.

While not a complete solution (for example, on macOS, Unicode
normalization could also lead to file/directory conflicts but be missed
by this logic), it is at least another defensive layer on top of what
the previous commits added.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:07 +02:00
e4930e86c0 t5510: verify that D/F confusion cannot lead to an RCE
The most critical vulnerabilities in Git lead to a Remote Code Execution
("RCE"), i.e. the ability for an attacker to have malicious code being
run as part of a Git operation that is not expected to run said code,
such has hooks delivered as part of a `git clone`.

A couple of parent commits ago, a bug was fixed that let Git be confused
by the presence of a path `a-` to mistakenly assume that a directory
`a/` can safely be created without removing an existing `a` that is a
symbolic link.

This bug did not represent an exploitable vulnerability on its
own; Let's make sure it stays that way.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:06 +02:00
e8d0608944 submodule: require the submodule path to contain directories only
Submodules are stored in subdirectories of their superproject. When
these subdirectories have been replaced with symlinks by a malicious
actor, all kinds of mayhem can be caused.

This _should_ not be possible, but many CVEs in the past showed that
_when_ possible, it allows attackers to slip in code that gets executed
during, say, a `git clone --recursive` operation.

Let's add some defense-in-depth to disallow submodule paths to have
anything except directories in them.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:04 +02:00
eafffd9ad4 clone_submodule: avoid using access() on directories
In 0060fd1511 (clone --recurse-submodules: prevent name squatting on
Windows, 2019-09-12), I introduced code to verify that a git dir either
does not exist, or is at least empty, to fend off attacks where an
inadvertently (and likely maliciously) pre-populated git dir would be
used while cloning submodules recursively.

The logic used `access(<path>, X_OK)` to verify that a directory exists
before calling `is_empty_dir()` on it. That is a curious way to check
for a directory's existence and might well fail for unwanted reasons.
Even the original author (it was I ;-) ) struggles to explain why this
function was used rather than `stat()`.

This code was _almost_ copypastad in the previous commit, but that
`access()` call was caught during review.

Let's use `stat()` instead also in the code that was almost copied
verbatim. Let's not use `lstat()` because in the unlikely event that
somebody snuck a symbolic link in, pointing to a crafted directory, we
want to verify that that directory is empty.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:03 +02:00
9706576133 submodules: submodule paths must not contain symlinks
When creating a submodule path, we must be careful not to follow
symbolic links. Otherwise we may follow a symbolic link pointing to
a gitdir (which are valid symbolic links!) e.g. while cloning.

On case-insensitive filesystems, however, we blindly replace a directory
that has been created as part of the `clone` operation with a symlink
when the path to the latter differs only in case from the former's path.

Let's simply avoid this situation by expecting not ever having to
overwrite any existing file/directory/symlink upon cloning. That way, we
won't even replace a directory that we just created.

This addresses CVE-2024-32002.

Reported-by: Filip Hejsek <filip.hejsek@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:02 +02:00
9cf8547320 clone: prevent clashing git dirs when cloning submodule in parallel
While it is expected to have several git dirs within the `.git/modules/`
tree, it is important that they do not interfere with each other. For
example, if one submodule was called "captain" and another submodule
"captain/hooks", their respective git dirs would clash, as they would be
located in `.git/modules/captain/` and `.git/modules/captain/hooks/`,
respectively, i.e. the latter's files could clash with the actual Git
hooks of the former.

To prevent these clashes, and in particular to prevent hooks from being
written and then executed as part of a recursive clone, we introduced
checks as part of the fix for CVE-2019-1387 in a8dee3ca61 (Disallow
dubiously-nested submodule git directories, 2019-10-01).

It is currently possible to bypass the check for clashing submodule
git dirs in two ways:

1. parallel cloning
2. checkout --recurse-submodules

Let's check not only before, but also after parallel cloning (and before
checking out the submodule), that the git dir is not clashing with
another one, otherwise fail. This addresses the parallel cloning issue.

As to the parallel checkout issue: It requires quite a few manual steps
to create clashing git dirs because Git itself would refuse to
initialize the inner one, as demonstrated by the test case.

Nevertheless, let's teach the recursive checkout (namely, the
`submodule_move_head()` function that is used by the recursive checkout)
to be careful to verify that it does not use a clashing git dir, and if
it does, disable it (by deleting the `HEAD` file so that subsequent Git
calls won't recognize it as a git dir anymore).

Note: The parallel cloning test case contains a `cat err` that proved to
be highly useful when analyzing the racy nature of the operation (the
operation can fail with three different error messages, depending on
timing), and was left on purpose to ease future debugging should the
need arise.

Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:01 +02:00
b20c10fd9b t7423: add tests for symlinked submodule directories
Submodule operations must not follow symlinks in working tree, because
otherwise files might be written to unintended places, leading to
vulnerabilities.

Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:30:00 +02:00
c30a574a0b has_dir_name(): do not get confused by characters < '/'
There is a bug in directory/file ("D/F") conflict checking optimization:
It assumes that such a conflict cannot happen if a newly added entry's
path is lexicgraphically "greater than" the last already-existing index
entry _and_ contains a directory separator that comes strictly after the
common prefix (`len > len_eq_offset`).

This assumption is incorrect, though: `a-` sorts _between_ `a` and
`a/b`, their common prefix is `a`, the slash comes after the common
prefix, and there is still a file/directory conflict.

Let's re-design this logic, taking these facts into consideration:

- It is impossible for a file to sort after another file with whose
  directory it conflicts because the trailing NUL byte is always smaller
  than any other character.

- Since there are quite a number of ASCII characters that sort before
  the slash (e.g. `-`, `.`, the space character), looking at the last
  already-existing index entry is not enough to determine whether there
  is a D/F conflict when the first character different from the
  existing last index entry's path is a slash.

  If it is not a slash, there cannot be a file/directory conflict.

  And if the existing index entry's first different character is a
  slash, it also cannot be a file/directory conflict because the
  optimization requires the newly-added entry's path to sort _after_ the
  existing entry's, and the conflicting file's path would not.

So let's fall back to the regular binary search whenever the newly-added
item's path differs in a slash character. If it does not, and it sorts
after the last index entry, there is no D/F conflict and the new index
entry can be safely appended.

This fix also nicely simplifies the logic and makes it much easier to
reason about, while the impact on performance should be negligible:
After this fix, the optimization will be skipped only when index
entry's paths differ in a slash and a space, `!`,  `"`,  `#`,  `$`,
`%`, `&`,  `'`,  | (  `)`,  `*`,  `+`,  `,`,  `-`, or  `.`, which should
be a rare situation.

Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:29:58 +02:00
e69ac42fcc docs: document security issues around untrusted .git dirs
For a long time our general philosophy has been that it's unsafe to run
arbitrary Git commands if you don't trust the hooks or config in .git,
but that running upload-pack should be OK. E.g., see 1456b043fc (Remove
post-upload-hook, 2009-12-10), or the design of uploadpack.packObjectsHook.

But we never really documented this (and even the discussions that led
to 1456b043fc were not on the public list!). Let's try to make our
approach more clear, but also be realistic that even upload-pack carries
some risk.

Helped-by: Filip Hejsek <filip.hejsek@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:29:57 +02:00
7b70e9efb1 upload-pack: disable lazy-fetching by default
The upload-pack command tries to avoid trusting the repository in which
it's run (e.g., by not running any hooks and not using any config that
contains arbitrary commands). But if the server side of a fetch or a
clone is a partial clone, then either upload-pack or its child
pack-objects may run a lazy "git fetch" under the hood. And it is very
easy to convince fetch to run arbitrary commands.

The "server" side can be a local repository owned by someone else, who
would be able to configure commands that are run during a clone with the
current user's permissions. This issue has been designated
CVE-2024-32004.

The fix in this commit's parent helps in this scenario, as well as in
related scenarios using SSH to clone, where the untrusted .git directory
is owned by a different user id. But if you received one as a zip file,
on a USB stick, etc, it may be owned by your user but still untrusted.

This has been designated CVE-2024-32465.

To mitigate the issue more completely, let's disable lazy fetching
entirely during `upload-pack`. While fetching from a partial repository
should be relatively rare, it is certainly not an unreasonable workflow.
And thus we need to provide an escape hatch.

This commit works by respecting a GIT_NO_LAZY_FETCH environment variable
(to skip the lazy-fetch), and setting it in upload-pack, but only when
the user has not already done so (which gives us the escape hatch).

The name of the variable is specifically chosen to match what has
already been added in 'master' via e6d5479e7a (git: extend
--no-lazy-fetch to work across subprocesses, 2024-02-27). Since we're
building this fix as a backport for older versions, we could cherry-pick
that patch and its earlier steps. However, we don't really need the
niceties (like a "--no-lazy-fetch" option) that it offers. By using the
same name, everything should just work when the two are eventually
merged, but here are a few notes:

  - the blocking of the fetch in e6d5479e7a is incomplete! It sets
    fetch_if_missing to 0 when we setup the repository variable, but
    that isn't enough. pack-objects in particular will call
    prefetch_to_pack() even if that variable is 0. This patch by
    contrast checks the environment variable at the lowest level before
    we call the lazy fetch, where we can be sure to catch all code
    paths.

    Possibly the setting of fetch_if_missing from e6d5479e7a can be
    reverted, but it may be useful to have. For example, some code may
    want to use that flag to change behavior before it gets to the point
    of trying to start the fetch. At any rate, that's all outside the
    scope of this patch.

  - there's documentation for GIT_NO_LAZY_FETCH in e6d5479e7a. We can
    live without that here, because for the most part the user shouldn't
    need to set it themselves. The exception is if they do want to
    override upload-pack's default, and that requires a separate
    documentation section (which is added here)

  - it would be nice to use the NO_LAZY_FETCH_ENVIRONMENT macro added by
    e6d5479e7a, but those definitions have moved from cache.h to
    environment.h between 2.39.3 and master. I just used the raw string
    literals, and we can replace them with the macro once this topic is
    merged to master.

At least with respect to CVE-2024-32004, this does render this commit's
parent commit somewhat redundant. However, it is worth retaining that
commit as defense in depth, and because it may help other issues (e.g.,
symlink/hardlink TOCTOU races, where zip files are not really an
interesting attack vector).

The tests in t0411 still pass, but now we have _two_ mechanisms ensuring
that the evil command is not run. Let's beef up the existing ones to
check that they failed for the expected reason, that we refused to run
upload-pack at all with an alternate user id. And add two new ones for
the same-user case that both the restriction and its escape hatch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:29:56 +02:00
f4aa8c8bb1 fetch/clone: detect dubious ownership of local repositories
When cloning from somebody else's repositories, it is possible that,
say, the `upload-pack` command is overridden in the repository that is
about to be cloned, which would then be run in the user's context who
started the clone.

To remind the user that this is a potentially unsafe operation, let's
extend the ownership checks we have already established for regular
gitdir discovery to extend also to local repositories that are about to
be cloned.

This protection extends also to file:// URLs.

The fixes in this commit address CVE-2024-32004.

Note: This commit does not touch the `fetch`/`clone` code directly, but
instead the function used implicitly by both: `enter_repo()`. This
function is also used by `git receive-pack` (i.e. pushes), by `git
upload-archive`, by `git daemon` and by `git http-backend`. In setups
that want to serve repositories owned by different users than the
account running the service, this will require `safe.*` settings to be
configured accordingly.

Also note: there are tiny time windows where a time-of-check-time-of-use
("TOCTOU") race is possible. The real solution to those would be to work
with `fstat()` and `openat()`. However, the latter function is not
available on Windows (and would have to be emulated with rather
expensive low-level `NtCreateFile()` calls), and the changes would be
quite extensive, for my taste too extensive for the little gain given
that embargoed releases need to pay extra attention to avoid introducing
inadvertent bugs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:29:54 +02:00
5c5a4a1c05 t0411: add tests for cloning from partial repo
Cloning from a partial repository must not fetch missing objects into
the partial repository, because that can lead to arbitrary code
execution.

Add a couple of test cases, pretending to the `upload-pack` command (and
to that command only) that it is working on a repository owned by
someone else.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 22:29:53 +02:00
93e2ae1c95 midx: disable replace objects
We observed a series of clone failures arose in a specific set of
repositories after we fully enabled the MIDX bitmap feature within our
Codebase service. These failures were accompanied with error messages
such as:

    Cloning into bare repository 'clone.git'...
    remote: Enumerating objects: 8, done.
    remote: Total 8 (delta 0), reused 0 (delta 0), pack-reused 8 (from 1)
    Receiving objects: 100% (8/8), done.
    fatal: did not receive expected object ...
    fatal: fetch-pack: invalid index-pack output

Temporarily disabling the MIDX feature eliminated the reported issues.
After some investigation we found that all repositories experiencing
failures contain replace references, which seem to be improperly
acknowledged by the MIDX bitmap generation logic.

A more thorough explanation about the root cause from Taylor Blau says:

Indeed, the pack-bitmap-write machinery does not itself call
disable_replace_refs(). So when it generates a reachability bitmap, it
is doing so with the replace refs in mind. You can see that this is
indeed the cause of the problem by looking at the output of an
instrumented version of Git that indicates what bits are being set
during the bitmap generation phase.

With replace refs (incorrectly) enabled, we get:

    [2, 4, 6, 8, 13, 3, 6, 7, 3, 4, 6, 8]

and doing the same after calling disable_replace_refs(), we instead get:

    [2, 5, 6, 13, 3, 6, 7, 3, 4, 6, 8]

Single pack bitmaps are unaffected by this issue because we generate
them from within pack-objects, which does call disable_replace_refs().

This patch updates the MIDX logic to disable replace objects within the
multi-pack-index builtin, and a test showing a clone (which would fail
with MIDX bitmap) is added to demonstrate the bug.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 12:35:41 -07:00
7bf3057d9c builtin/receive-pack: convert to use git-maintenance(1)
In 850b6edefa (auto-gc: extract a reusable helper from "git fetch",
2020-05-06), we have introduced a helper function `run_auto_gc()` that
kicks off `git gc --auto`. The intent of this function was to pass down
the "--quiet" flag to git-gc(1) as required without duplicating this at
all callsites. In 7c3e9e8cfb (auto-gc: pass --quiet down from am,
commit, merge and rebase, 2020-05-06) we then converted callsites that
need to pass down this flag to use the new helper function. This has the
notable omission of git-receive-pack(1), which is the only remaining
user of `git gc --auto` that sets up the proccess manually. This is
probably because it unconditionally passes down the `--quiet` flag and
thus didn't benefit much from the new helper function.

In a95ce12430 (maintenance: replace run_auto_gc(), 2020-09-17) we then
replaced `run_auto_gc()` with `run_auto_maintenance()` which invokes
git-maintenance(1) instead of git-gc(1). This command is the modern
replacement for git-gc(1) and is both more thorough and also more
flexible because administrators can configure which tasks exactly to run
during maintenance.

But due to git-receive-pack(1) not using `run_auto_gc()` in the first
place it did not get converted to use git-maintenance(1) like we do
everywhere else now. Address this oversight and start to use the newly
introduced function `prepare_auto_maintenance()`. This will also make it
easier for us to adapt this code together with all the other callsites
that invoke auto-maintenance in the future.

This removes the last internal user of `git gc --auto`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:42:26 -07:00
b396ee6bed run-command: introduce function to prepare auto-maintenance process
The `run_auto_maintenance()` function is responsible for spawning a new
`git maintenance run --auto` process. To do so, it sets up the `sturct
child_process` and then runs it by executing `run_command()` directly.
This is rather inflexible in case callers want to modify the child
process somewhat, e.g. to redirect stderr or stdout.

Introduce a new `prepare_auto_maintenance()` function to plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:42:26 -07:00
ffff4ac065 credential: add method for querying capabilities
Right now, there's no specific way to determine whether a credential
helper or git credential itself supports a given set of capabilities.
It would be helpful to have such a way, so let's let credential helpers
and git credential take an argument, "capability", which has it list the
capabilities and a version number on standard output.

Specifically choose a format that is slightly different from regular
credential output and assume that no capabilities are supported if a
non-zero exit status occurs or the data deviates from the format.  It is
common for users to write small shell scripts as the argument to
credential.helper, which will almost never be designed to emit
capabilities.  We want callers to gracefully handle this case by
assuming that they are not capable of extended support because that is
almost certainly the case, and specifying the error behavior up front
does this and preserves backwards compatibility in a graceful way.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:08 -07:00
40220f48b1 credential-cache: implement authtype capability
Now that we have full support in Git for the authtype capability, let's
add support to the cache credential helper.

When parsing data, we always set the initial capabilities because we're
the helper, and we need both the initial and helper capabilities to be
set in order to have the helper capabilities take effect.

When emitting data, always emit the supported capability and make sure
we emit items only if we have them and they're supported by the caller.
Since we may no longer have a username or password, be sure to emit
those conditionally as well so we don't segfault on a NULL pointer.
Similarly, when comparing credentials, consider both the password and
credential fields when we're matching passwords.

Adjust the partial credential detection code so that we can store
credentials missing a username or password as long as they have an
authtype and credential.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:08 -07:00
30c0a3036f t: add credential tests for authtype
It's helpful to have some basic tests for credential helpers supporting
the authtype and credential fields.  Let's add some tests for this case
so that we can make sure newly supported helpers work correctly.

Note that we explicitly check that credential helpers can produce
different sets of authtype and credential values based on the username.
While the username is not used in the HTTP protocol with authtype and
credential, it can still be specified in the URL and thus may be part of
the protocol.  Additionally, because it is common for users to have
multiple accounts on one service (say, both personal and professional
accounts), it's very helpful to be able to store different credentials
for different accounts in the same helper, and that doesn't become less
useful if one is using, say, Bearer authentication instead of Basic.
Thus, credential helpers should be expected to support this
functionality as basic functionality, so verify here that they do so.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:08 -07:00
ac4c7cbfaa credential: add support for multistage credential rounds
Over HTTP, NTLM and Kerberos require two rounds of authentication on the
client side.  It's possible that there are custom authentication schemes
that also implement this same approach.  Since these are tricky schemes
to implement and the HTTP library in use may not always handle them
gracefully on all systems, it would be helpful to allow the credential
helper to implement them instead for increased portability and
robustness.

To allow this to happen, add a boolean flag, continue, that indicates
that instead of failing when we get a 401, we should retry another round
of authentication.  However, this necessitates some changes in our
current credential code so that we can make this work.

Keep the state[] headers between iterations, but only use them to send
to the helper and only consider the new ones we read from the credential
helper to be valid on subsequent iterations.  That avoids us passing
stale data when we finally approve or reject the credential.  Similarly,
clear the multistage and wwwauth[] values appropriately so that we
don't pass stale data or think we're trying a multiround response when
we're not.  Remove the credential values so that we can actually fill a
second time with new responses.

Limit the number of iterations of reauthentication we do to 3.  This
means that if there's a problem, we'll terminate with an error message
instead of retrying indefinitely and not informing the user (and
possibly conducting a DoS on the server).

In our tests, handle creating multiple response output files from our
helper so we can verify that each of the messages sent is correct.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:08 -07:00
37417b7717 t5563: refactor for multi-stage authentication
Some HTTP authentication schemes, such as NTLM- and Kerberos-based
options, require more than one round trip to authenticate.  Currently,
these can only be supported in libcurl, since Git does not have support
for this in the credential helper protocol.

However, in a future commit, we'll add support for this functionality
into the credential helper protocol and Git itself. Because we don't
really want to implement either NTLM or Kerberos, both of which are
complex protocols, we'll want to test this using a fake credential
authentication scheme.  In order to do so, update t5563 and its backend
to allow us to accept multiple sets of credentials and respond with
different behavior in each case.

Since we can now provide any number of possible status codes, provide a
non-specific reason phrase so we don't have to generate a more specific
one based on the response.  The reason phrase is mandatory according to
the status-line production in RFC 7230, but clients SHOULD ignore it,
and curl does (except to print it).

Each entry in the authorization and challenge fields contains an ID,
which indicates a corresponding credential and response.  If the
response is a 200 status, then we continue to execute git-http-backend.
Otherwise, we print the corresponding status and response.  If no ID is
matched, we use the default response with a status of 401.

Note that there is an implicit order to the parameters.  The ID is
always first and the creds or response value is always last, and
therefore may contain spaces, equals signs, or other arbitrary data.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:08 -07:00
bd590bde58 docs: set a limit on credential line length
We recently introduced a way for credential helpers to add arbitrary
state as part of the protocol.  Set some limits on line length to avoid
helpers passing extremely large amounts of data.  While Git doesn't have
a fixed parsing length, there are other tools which support this
protocol and it's kind to allow them to use a reasonable fixed-size
buffer for parsing.  In addition, we would like to be moderate in our
memory usage and imposing reasonable limits is helpful for that purpose.

In the event a credential helper is incapable of storing its serialized
state in 64 KiB, it can feel free to serialize it on disk and store a
reference instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:07 -07:00
36f7d865e3 credential: enable state capability
Now that we've implemented the state capability, let's send it along by
default when filling credentials so we can make use of it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:07 -07:00
8470c94be3 credential: add an argument to keep state
Until now, our credential code has mostly deal with usernames and
passwords and we've let libcurl deal with the variant of authentication
to be used.  However, now that we have the credential value, the
credential helper can take control of the authentication, so the value
provided might be something that's generated, such as a Digest hash
value.

In such a case, it would be helpful for a credential helper that gets an
erase or store command to be able to keep track of an identifier for the
original secret that went into the computation.  Furthermore, some types
of authentication, such as NTLM and Kerberos, actually need two round
trips to authenticate, which will require that the credential helper
keep some state.

In order to allow for these use cases and others, allow storing state in
a field called "state[]".  This value is passed back to the credential
helper that created it, which avoids confusion caused by parsing values
from different helpers.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:07 -07:00
ad9bb6dfe6 http: add support for authtype and credential
Now that we have the credential helper code set up to handle arbitrary
authentications schemes, let's add support for this in the HTTP code,
where we really want to use it.  If we're using this new functionality,
don't set a username and password, and instead set a header wherever
we'd normally do so, including for proxy authentication.

Since we can now handle this case, ask the credential helper to enable
the appropriate capabilities.

Finally, if we're using the authtype value, set "Expect: 100-continue".
Any type of authentication that requires multiple rounds (such as NTLM
or Kerberos) requires a 100 Continue (if we're larger than
http.postBuffer) because otherwise we send the pack data before we're
authenticated, the push gets a 401 response, and we can't rewind the
stream.  We don't know for certain what other custom schemes might
require this, the HTTP/1.1 standard has required handling this since
1999, the broken HTTP server for which we disabled this (Google's) is
now fixed and has been for some time, and libcurl has a 1-second
fallback in case the HTTP server is still broken.  In addition, it is
not unreasonable to require compliance with a 25-year old standard to
use new Git features.  For all of these reasons, do so here.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:07 -07:00
5af5cc68aa docs: indicate new credential protocol fields
Now that we have new fields (authtype and credential), let's document
them for users and credential helper implementers.

Indicate specifically what common values of authtype are and what values
are allowed.  Note that, while common, digest and NTLM authentication
are insecure because they require unsalted, uniterated password hashes
to be stored.

Tell users that they can continue to use a username and password even if
the new capability is supported.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:07 -07:00
2ae6dc686d credential: add a field called "ephemeral"
Now that we have support for a wide variety of types of authentication,
it's important to indicate to other credential helpers whether they
should store credentials, since not every credential helper may
intuitively understand all possible values of the authtype field.  Do so
with a boolean field called "ephemeral", to indicate whether the
credential is expected to be temporary.

For example, in HTTP Digest authentication, the Authorization header
value is based off a nonce.  It isn't useful to store this value
for later use because reusing the credential long term will not result
in successful authentication due to the nonce necessarily differing.

An additional case is potentially short-lived credentials, which may
last only a few hours.  It similarly wouldn't be helper for other
credential helpers to attempt to provide these much later.

We do still pass the value to "git credential store" or "git credential
erase", since it may be helpful to the original helper to know whether
the operation was successful.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:07 -07:00
ca9ccbf674 credential: gate new fields on capability
We support the new credential and authtype fields, but we lack a way to
indicate to a credential helper that we'd like them to be used.  Without
some sort of indication, the credential helper doesn't know if it should
try to provide us a username and password, or a pre-encoded credential.
For example, the helper might prefer a more restricted Bearer token if
pre-encoded credentials are possible, but might have to fall back to
more general username and password if not.

Let's provide a simple way to indicate whether Git (or, for that matter,
the helper) is capable of understanding the authtype and credential
fields.  We send this capability when we generate a request, and the
other side may reply to indicate to us that it does, too.

For now, don't enable sending capabilities for the HTTP code.  In a
future commit, we'll introduce appropriate handling for that code,
which requires more in-depth work.

The logic for determining whether a capability is supported may seem
complex, but it is not.  At each stage, we emit the capability to the
following stage if all preceding stages have declared it.  Thus, if the
caller to git credential fill didn't declare it, then we won't send it
to the helper, and if fill's caller did send but the helper doesn't
understand it, then we won't send it on in the response.  If we're an
internal user, then we know about all capabilities and will request
them.

For "git credential approve" and "git credential reject", we set the
helper capability before calling the helper, since we assume that the
input we're getting from the external program comes from a previous call
to "git credential fill", and thus we'll invoke send a capability to the
helper if and only if we got one from the standard input, which is the
correct behavior.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:06 -07:00
6a6d6fb12e credential: add a field for pre-encoded credentials
At the moment, our credential code wants to find a username and password
for access, which, for HTTP, it will pass to libcurl to encode and
process.  However, many users want to use authentication schemes that
libcurl doesn't support, such as Bearer authentication.  In these
schemes, the secret is not a username and password pair, but some sort
of token that meets the production for authentication data in the RFC.

In fact, in general, it's useful to allow our credential helper to have
knowledge about what specifically to put in the protocol header.  Thus,
add a field, credential, which contains data that's preencoded to be
suitable for the protocol in question.  If we have such data, we need
neither a username nor a password, so make that adjustment as well.

It is in theory possible to reuse the password field for this.  However,
if we do so, we must know whether the credential helper supports our new
scheme before sending it data, which necessitates some sort of
capability inquiry, because otherwise an uninformed credential helper
would store our preencoded data as a password, which would fail the next
time we attempted to connect to the remote server.  This design is
substantially simpler, and we can hint to the credential helper that we
support this approach with a simple new field instead of needing to
query it first.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:06 -07:00
d01c76f1cf http: use new headers for each object request
Currently we create one set of headers for all object requests and reuse
it.  However, we'll need to adjust the headers for authentication
purposes in the future, so let's create a new set for each request so
that we can adjust them if the authentication changes.

Note that the cost of allocation here is tiny compared to the fact that
we're making a network call, not to mention probably a full TLS
connection, so this shouldn't have a significant impact on performance.
Moreover, nobody who cares about performance is using the dumb HTTP
protocol anyway, since it often makes huge numbers of requests compared
to the smart protocol.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:06 -07:00
90765ea81e remote-curl: reset headers on new request
When we retry a post_rpc request, we currently reuse the same headers as
before.  In the future, we'd like to be able to modify them based on the
result we get back, so let's reset them on each retry so we can avoid
sending potentially duplicate headers if the values change.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:06 -07:00
7046f1d572 credential: add an authtype field
When Git makes an HTTP request, it can negotiate the type of
authentication to use with the server provided the authentication scheme
is one of a few well-known types (Basic, Digest, NTLM, or Negotiate).
However, some servers wish to use other types of authentication, such as
the Bearer type from OAuth2.  Since libcurl doesn't natively support
this type, it isn't possible to use it, and the user is forced to
specify the Authorization header using the http.extraheader setting.

However, storing a plaintext token in the repository configuration is
not very secure, especially if a repository can be shared by multiple
parties.  We already have support for many types of secure credential
storage by using credential helpers, so let's teach credential helpers
how to produce credentials for an arbitrary scheme.

If the credential helper specifies an authtype field, then it specifies
an authentication scheme (e.g., Bearer) and the password field specifies
the raw authentication token, with any encoding already specified.  We
reuse the password field for this because some credential helpers store
the metadata without encryption even though the password is encrypted,
and we'd like to avoid insecure storage if an older version of the
credential helper gets ahold of the data.

The username is not used in this case, but it is still preserved for the
purpose of finding the right credential if the user has multiple
accounts.

If the authtype field is not specified, then the password behaves as
normal and it is passed along with the username to libcurl.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:39:06 -07:00
8882ee9d68 mailmap: change primary address for Linus Arver
Linus will lose access to his work email soon.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 22:25:11 -07:00
1204e1a824 builtin/clone: refuse local clones of unsafe repositories
When performing a local clone of a repository we end up either copying
or hardlinking the source repository into the target repository. This is
significantly more performant than if we were to use git-upload-pack(1)
and git-fetch-pack(1) to create the new repository and preserves both
disk space and compute time.

Unfortunately though, performing such a local clone of a repository that
is not owned by the current user is inherently unsafe:

  - It is possible that source files get swapped out underneath us while
    we are copying or hardlinking them. While we do perform some checks
    here to assert that we hardlinked the expected file, they cannot
    reliably thwart time-of-check-time-of-use (TOCTOU) style races. It
    is thus possible for an adversary to make us copy or hardlink
    unexpected files into the target directory.

    Ideally, we would address this by starting to use openat(3P),
    fstatat(3P) and friends. Due to platform compatibility with Windows
    we cannot easily do that though. Furthermore, the scope of these
    fixes would likely be quite broad and thus not fit for an embargoed
    security release.

  - Even if we handled TOCTOU-style races perfectly, hardlinking files
    owned by a different user into the target repository is not a good
    idea in general. It is possible for an adversary to rewrite those
    files to contain whatever data they want even after the clone has
    completed.

Address these issues by completely refusing local clones of a repository
that is not owned by the current user. This reuses our existing infra we
have in place via `ensure_valid_ownership()` and thus allows a user to
override the safety guard by adding the source repository path to the
"safe.directory" configuration.

This addresses CVE-2024-32020.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 02:17:40 +02:00
8c9c051bef setup.c: introduce die_upon_dubious_ownership()
Introduce a new function `die_upon_dubious_ownership()` that uses
`ensure_valid_ownership()` to verify whether a repositroy is safe for
use, and causes Git to die in case it is not.

This function will be used in a subsequent commit.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:01:26 +02:00
d1bb66a546 builtin/clone: abort when hardlinked source and target file differ
When performing local clones with hardlinks we refuse to copy source
files which are symlinks as a mitigation for CVE-2022-39253. This check
can be raced by an adversary though by changing the file to a symlink
after we have checked it.

Fix the issue by checking whether the hardlinked destination file
matches the source file and abort in case it doesn't.

This addresses CVE-2024-32021.

Reported-by: Apple Product Security <product-security@apple.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:01:25 +02:00
150e6b0aed builtin/clone: stop resolving symlinks when copying files
When a user performs a local clone without `--no-local`, then we end up
copying the source repository into the target repository directly. To
optimize this even further, we try to hardlink files into place instead
of copying data over, which helps both disk usage and speed.

There is an important edge case in this context though, namely when we
try to hardlink symlinks from the source repository into the target
repository. Depending on both platform and filesystem the resulting
behaviour here can be different:

  - On macOS and NetBSD, calling link(3P) with a symlink target creates
    a hardlink to the file pointed to by the symlink.

  - On Linux, calling link(3P) instead creates a hardlink to the symlink
    itself.

To unify this behaviour, 36596fd2df (clone: better handle symlinked
files at .git/objects/, 2019-07-10) introduced logic to resolve symlinks
before we try to link(3P) files. Consequently, the new behaviour was to
always create a hard link to the target of the symlink on all platforms.

Eventually though, we figured out that following symlinks like this can
cause havoc when performing a local clone of a malicious repository,
which resulted in CVE-2022-39253. This issue was fixed via 6f054f9fb3
(builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28),
by refusing symlinks in the source repository.

But even though we now shouldn't ever link symlinks anymore, the code
that resolves symlinks still exists. In the best case the code does not
end up doing anything because there are no symlinks anymore. In the
worst case though this can be abused by an adversary that rewrites the
source file after it has been checked not to be a symlink such that it
actually is a symlink when we call link(3P). Thus, it is still possible
to recreate CVE-2022-39253 due to this time-of-check-time-of-use bug.

Remove the call to `realpath()`. This doesn't yet address the actual
vulnerability, which will be handled in a subsequent commit.

Reported-by: Apple Product Security <product-security@apple.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:01:25 +02:00
9e06401098 Merge branch 'js/github-actions-update'
Update remaining GitHub Actions jobs to avoid warnings against
using deprecated version of Node.js.

* js/github-actions-update:
  ci(linux32): add a note about Actions that must not be updated
  ci: bump remaining outdated Actions versions

With this backport, `maint-2.39`'s CI builds are finally healthy again.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-17 00:01:11 +02:00
64f35baa34 Merge branch 'jc/maint-github-actions-update'
* jc/maint-github-actions-update:
  GitHub Actions: update to github-script@v7
  GitHub Actions: update to checkout@v4

Yet another thing to help `maint-2.39`'s CI builds to become healthy
again.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-17 00:01:00 +02:00
213958f248 ci(linux32): add a note about Actions that must not be updated
The Docker container used by the `linux32` job comes without Node.js,
and therefore the `actions/checkout` and `actions/upload-artifact`
Actions cannot be upgraded to the latest versions (because they use
Node.js).

One time too many, I accidentally tried to update them, where
`actions/checkout` at least fails immediately, but the
`actions/upload-artifact` step is only used when any test fails, and
therefore the CI run usually passes even though that Action was updated
to a version that is incompatible with the Docker container in which
this job runs.

So let's add a big fat warning, mainly for my own benefit, to avoid
running into the very same issue over and over again.

Backported-from: 20e0ff8835 (ci(linux32): add a note about Actions that must not be updated, 2024-02-11)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:58 +02:00
f6bed64ce2 GitHub Actions: update to github-script@v7
We seem to be getting "Node.js 16 actions are deprecated." warnings
for jobs that use github-script@v6.  Update to github-script@v7,
which is said to use Node.js 20.

Backported-from: c4ddbe043e (GitHub Actions: update to github-script@v7, 2024-02-02)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:57 +02:00
7e1bcc8d63 ci: bump remaining outdated Actions versions
After activating automatic Dependabot updates in the
git-for-windows/git repository, Dependabot noticed a couple
of yet-unaddressed updates.  They avoid "Node.js 16 Actions"
deprecation messages by bumping the following Actions'
versions:

- actions/upload-artifact from 3 to 4
- actions/download-artifact from 3 to 4
- actions/cache from 3 to 4

Backported-from: 820a340085 (ci: bump remaining outdated Actions versions, 2024-02-11)
Helped-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:57 +02:00
ce47f7c85f GitHub Actions: update to checkout@v4
We seem to be getting "Node.js 16 actions are deprecated." warnings
for jobs that use checkout@v3.  Except for the i686 containers job
that is kept at checkout@v1 [*], update to checkout@v4, which is
said to use Node.js 20.

[*] 6cf4d908 (ci(main): upgrade actions/checkout to v3, 2022-12-05)
    refers to https://github.com/actions/runner/issues/2115 and
    explains why container jobs are kept at checkout@v1.  We may
    want to check the current status of the issue and move it to the
    same version as other jobs, but that is outside the scope of
    this step.

Backported-from: e94dec0c1d (GitHub Actions: update to checkout@v4, 2024-02-02)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:57 +02:00
6133e3a77e Merge branch 'quicker-asan-lsan'
This patch speeds up the `asan`/`lsan` jobs that are really slow enough
already.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:56 +02:00
ea094eec54 Merge branch 'jk/test-lsan-denoise-output'
Tests with LSan from time to time seem to emit harmless message
that makes our tests unnecessarily flakey; we work it around by
filtering the uninteresting output.

* jk/test-lsan-denoise-output:
  test-lib: ignore uninteresting LSan output

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:54 +02:00
b12dcab61d Merge branch 'js/ci-use-macos-13'
Replace macos-12 used at GitHub CI with macos-13.

* js/ci-use-macos-13:
  ci: upgrade to using macos-13

This is another backport to `maint-2.39` to allow less CI jobs to break.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:59:03 +02:00
c7db432de6 Merge branch 'backport/jk/libcurl-8.7-regression-workaround' into maint-2.39
Fix was added to work around a regression in libcURL 8.7.0 (which has
already been fixed in their tip of the tree).

* jk/libcurl-8.7-regression-workaround:
  remote-curl: add Transfer-Encoding header only for older curl
  INSTALL: bump libcurl version to 7.21.3
  http: reset POSTFIELDSIZE when clearing curl handle

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:53 +02:00
e1813a335c Merge branch 'jk/redact-h2h3-headers-fix' into maint-2.42
HTTP Header redaction code has been adjusted for a newer version of
cURL library that shows its traces differently from earlier
versions.

* jk/redact-h2h3-headers-fix:
  http: update curl http/2 info matching for curl 8.3.0
  http: factor out matching of curl http/2 trace lines

This backport to `maint-2.39` is needed to bring the following test
cases back to a working state in conjunction with recent libcurl
versions:

- t5559.17 GIT_TRACE_CURL redacts auth details
- t5559.18 GIT_CURL_VERBOSE redacts auth details
- t5559.38 cookies are redacted by default

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:48 +02:00
ef0fc42829 Merge branch 'jk/httpd-test-updates'
Test update.

* jk/httpd-test-updates:
  t/lib-httpd: increase ssl key size to 2048 bits
  t/lib-httpd: drop SSLMutex config
  t/lib-httpd: bump required apache version to 2.4
  t/lib-httpd: bump required apache version to 2.2

This is a backport onto the `maint-2.39` branch, to improve the CI
health of that branch.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:40 +02:00
e3cbeb9673 Merge branch 'jk/http-test-fixes'
Various fix-ups on HTTP tests.

* jk/http-test-fixes:
  t5559: make SSL/TLS the default
  t5559: fix test failures with LIB_HTTPD_SSL
  t/lib-httpd: enable HTTP/2 "h2" protocol, not just h2c
  t/lib-httpd: respect $HTTPD_PROTO in expect_askpass()
  t5551: drop curl trace lines without headers
  t5551: handle v2 protocol in cookie test
  t5551: simplify expected cookie file
  t5551: handle v2 protocol in upload-pack service test
  t5551: handle v2 protocol when checking curl trace
  t5551: stop forcing clone to run with v0 protocol
  t5551: handle HTTP/2 when checking curl trace
  t5551: lower-case headers in expected curl trace
  t5551: drop redundant grep for Accept-Language
  t5541: simplify and move "no empty path components" test
  t5541: stop marking "used receive-pack service" test as v0 only
  t5541: run "used receive-pack service" test earlier

This is a backport onto the `maint-2.39` branch, starting to take care
of making that branch's CI builds healthy again.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:06 +02:00
cf5dcd817a ci(linux-asan/linux-ubsan): let's save some time
Every once in a while, the `git-p4` tests flake for reasons outside of
our control. It typically fails with "Connection refused" e.g. here:
https://github.com/git/git/actions/runs/5969707156/job/16196057724

	[...]
	+ git p4 clone --dest=/home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git //depot
	  Initialized empty Git repository in /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git/.git/
	  Perforce client error:
		Connect to server failed; check $P4PORT.
		TCP connect to localhost:9807 failed.
		connect: 127.0.0.1:9807: Connection refused
	  failure accessing depot: could not run p4
	  Importing from //depot into /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git
	 [...]

This happens in other jobs, too, but in the `linux-asan`/`linux-ubsan`
jobs it hurts the most because those jobs often take an _awfully_ long
time to run, therefore re-running a failed `linux-asan`/`linux-ubsan`
jobs is _very_ costly.

The purpose of the `linux-asan`/`linux-ubsan` jobs is to exercise the C
code of Git, anyway, and any part of Git's source code that the `git-p4`
tests run and that would benefit from the attention of ASAN/UBSAN are
run better in other tests anyway, as debugging C code run via Python
scripts can get a bit hairy.

In fact, it is not even just `git-p4` that is the problem (even if it
flakes often enough to be problematic in the CI builds), but really the
part about Python scripts. So let's just skip any Python parts of the
tests from being run in that job.

For good measure, also skip the Subversion tests because debugging C
code run via Perl scripts is as much fun as debugging C code run via
Python scripts. And it will reduce the time this very expensive job
takes, which is a big benefit.

Backported to `maint-2.39` as another step to get that branch's CI
builds back to a healthy state.

Backported-from: 6ba913629f (ci(linux-asan-ubsan): let's save some time, 2023-08-29)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-16 23:58:04 +02:00
c67cf4c434 test-lib: ignore uninteresting LSan output
When I run the tests in leak-checking mode the same way our CI job does,
like:

  make SANITIZE=leak \
       GIT_TEST_PASSING_SANITIZE_LEAK=true \
       GIT_TEST_SANITIZE_LEAK_LOG=true \
       test

then LSan can racily produce useless entries in the log files that look
like this:

  ==git==3034393==Unable to get registers from thread 3034307.

I think they're mostly harmless based on the source here:

  7e0a52e8e9/compiler-rt/lib/lsan/lsan_common.cpp (L414)

which reads:

    PtraceRegistersStatus have_registers =
        suspended_threads.GetRegistersAndSP(i, &registers, &sp);
    if (have_registers != REGISTERS_AVAILABLE) {
      Report("Unable to get registers from thread %llu.\n", os_id);
      // If unable to get SP, consider the entire stack to be reachable unless
      // GetRegistersAndSP failed with ESRCH.
      if (have_registers == REGISTERS_UNAVAILABLE_FATAL)
        continue;
      sp = stack_begin;
    }

The program itself still runs fine and LSan doesn't cause us to abort.
But test-lib.sh looks for any non-empty LSan logs and marks the test as
a failure anyway, under the assumption that we simply missed the failing
exit code somehow.

I don't think I've ever seen this happen in the CI job, but running
locally using clang-14 on an 8-core machine, I can't seem to make it
through a full run of the test suite without having at least one
failure. And it's a different one every time (though they do seem to
often be related to packing tests, which makes sense, since that is one
of our biggest users of threaded code).

We can hack around this by only counting LSan log files that contain a
line that doesn't match our known-uninteresting pattern.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-16 23:58:04 +02:00
3167b60e5b ci: upgrade to using macos-13
In April, GitHub announced that the `macos-13` pool is available:
https://github.blog/changelog/2023-04-24-github-actions-macos-13-is-now-available/.
It is only a matter of time until the `macos-12` pool is going away,
therefore we should switch now, without pressure of a looming deadline.

Since the `macos-13` runners no longer include Python2, we also drop
specifically testing with Python2 and switch uniformly to Python3, see
https://github.com/actions/runner-images/blob/HEAD/images/macos/macos-13-Readme.md
for details about the software available on the `macos-13` pool's
runners.

Also, on macOS 13, Homebrew seems to install a `gcc@9` package that no
longer comes with a regular `unistd.h` (there seems only to be a
`ssp/unistd.h`), and hence builds would fail with:

    In file included from base85.c:1:
    git-compat-util.h:223:10: fatal error: unistd.h: No such file or directory
      223 | #include <unistd.h>
          |          ^~~~~~~~~~
    compilation terminated.

The reason why we install GCC v9.x explicitly is historical, and back in
the days it was because it was the _newest_ version available via
Homebrew: 176441bfb5 (ci: build Git with GCC 9 in the 'osx-gcc' build
job, 2019-11-27).

To reinstate the spirit of that commit _and_ to fix that build failure,
let's switch to the now-newest GCC version: v13.x.

Backported-from: 682a868f67 (ci: upgrade to using macos-13, 2023-11-03)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-16 23:58:03 +02:00
bddc176e79 Merge branch 'jh/fsmonitor-darwin-modernize'
Stop using deprecated macOS API in fsmonitor.

* jh/fsmonitor-darwin-modernize:
  fsmonitor: eliminate call to deprecated FSEventStream function

This backport to `maint-2.39` is needed to be able to build on
`macos-13`, which we need to update to as we restore the CI health of
that branch.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:55:55 +02:00
21306a098c The twentieth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 14:50:31 -07:00
93e3f9df7a Merge branch 'pw/t3428-cleanup'
Test cleanup.

* pw/t3428-cleanup:
  t3428: restore coverage for "apply" backend
  t3428: use test_commit_message
  t3428: modernize test setup
2024-04-16 14:50:31 -07:00
51c15ac1b6 Merge branch 'ba/osxkeychain-updates'
Update osxkeychain backend with features required for the recent
credential subsystem.

* ba/osxkeychain-updates:
  osxkeychain: store new attributes
  osxkeychain: erase matching passwords only
  osxkeychain: erase all matching credentials
  osxkeychain: replace deprecated SecKeychain API
2024-04-16 14:50:30 -07:00
82a31ec324 Merge branch 'jt/reftable-geometric-compaction'
The strategy to compact multiple tables of reftables after many
operations accumulate many entries has been improved to avoid
accumulating too many tables uncollected.

* jt/reftable-geometric-compaction:
  reftable/stack: use geometric table compaction
  reftable/stack: add env to disable autocompaction
  reftable/stack: expose option to disable auto-compaction
2024-04-16 14:50:30 -07:00
2b49e41155 Merge branch 'tb/make-indent-conditional-with-non-spaces'
Adjust to an upcoming changes to GNU make that breaks our Makefiles.

* tb/make-indent-conditional-with-non-spaces:
  Makefile(s): do not enforce "all indents must be done with tab"
  Makefile(s): avoid recipe prefix in conditional statements
2024-04-16 14:50:29 -07:00
a7589384d5 Merge branch 'rs/usage-fallback-to-show-message-format'
vreportf(), which is usede by error() and friends, has been taught
to give the error message printf-format string when its vsnprintf()
call fails, instead of showing nothing useful to identify the
nature of the error.

* rs/usage-fallback-to-show-message-format:
  usage: report vsnprintf(3) failure
2024-04-16 14:50:29 -07:00
107313eb11 Merge branch 'rs/date-mode-pass-by-value'
The codepaths that reach date_mode_from_type() have been updated to
pass "struct date_mode" by value to make them thread safe.

* rs/date-mode-pass-by-value:
  date: make DATE_MODE thread-safe
2024-04-16 14:50:29 -07:00
2d642afb0a Merge branch 'sj/userdiff-c-sharp'
The userdiff patterns for C# has been updated.

Acked-by: Johannes Sixt <j6t@kdbg.org>
cf. <c2154457-3f2f-496e-9b8b-c8ea7257027b@kdbg.org>

* sj/userdiff-c-sharp:
  userdiff: better method/property matching for C#
2024-04-16 14:50:28 -07:00
625ef1c6f1 Merge branch 'tb/t7700-fixup'
Test fix.

* tb/t7700-fixup:
  t/t7700-repack.sh: fix test breakages with `GIT_TEST_MULTI_PACK_INDEX=1 `
2024-04-16 14:50:28 -07:00
92e8388bd3 Merge branch 'jc/local-extern-shell-rules'
Document and apply workaround for a buggy version of dash that
mishandles "local var=val" construct.

* jc/local-extern-shell-rules:
  t1016: local VAR="VAL" fix
  t0610: local VAR="VAL" fix
  t: teach lint that RHS of 'local VAR=VAL' needs to be quoted
  t: local VAR="VAL" (quote ${magic-reference})
  t: local VAR="VAL" (quote command substitution)
  t: local VAR="VAL" (quote positional parameters)
  CodingGuidelines: quote assigned value in 'local var=$val'
  CodingGuidelines: describe "export VAR=VAL" rule
2024-04-16 14:50:27 -07:00
20fee9af9e apply: avoid using fixed-size buffer in write_out_one_reject()
On some systems PATH_MAX is not a hard limit.  Support longer paths by
building them on the heap instead of using static buffers.

Take care to work around (arguably buggy) implementations of free(3)
that change errno by calling it only after using the errno value.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 13:38:35 -07:00
167395bb47 rerere: fix crashes due to unmatched opening conflict markers
When rerere handles a conflict with an unmatched opening conflict marker
in a file with other conflicts, it will fail create a preimage and also
fail allocate the status member of struct rerere_dir. Currently the
status member is allocated after the error handling. This will lead to a
SEGFAULT when the status member is accessed during cleanup of the failed
parse.

Additionally, in subsequent executions of rerere, after removing the
MERGE_RR.lock manually, rerere crashes for a similar reason. MERGE_RR
points to a conflict id that has no preimage, therefore the status
member is not allocated and a SEGFAULT happens when trying to check if a
preimage exists.

Solve this by making sure the status field is allocated correctly and add
tests to prevent the bug from reoccurring.

This does not fix the root cause, failing to parse stray conflict
markers, but I don't think we can do much better than recognizing it,
printing an error, and moving on gracefully.

Signed-off-by: Marcel Röthke <marcel@roethke.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 08:42:36 -07:00
548fe35913 The ninteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 14:11:44 -07:00
cb25f97eab Merge branch 'jc/t2104-style-fixes'
Test style fixes.

* jc/t2104-style-fixes:
  t2104: style fixes
2024-04-15 14:11:44 -07:00
b415f15b49 Merge branch 'jc/unleak-core-excludesfile'
The variable that holds the value read from the core.excludefile
configuration variable used to leak, which has been corrected.

* jc/unleak-core-excludesfile:
  config: do not leak excludes_file
2024-04-15 14:11:44 -07:00
eba498a774 Merge branch 'jk/libcurl-8.7-regression-workaround'
Fix was added to work around a regression in libcURL 8.7.0 (which has
already been fixed in their tip of the tree).

* jk/libcurl-8.7-regression-workaround:
  remote-curl: add Transfer-Encoding header only for older curl
  INSTALL: bump libcurl version to 7.21.3
  http: reset POSTFIELDSIZE when clearing curl handle
2024-04-15 14:11:44 -07:00
372aabe912 Merge branch 'ps/t0610-umask-fix'
The "shared repository" test in the t0610 reftable test failed
under restrictive umask setting (e.g. 007), which has been
corrected.

* ps/t0610-umask-fix:
  t0610: execute git-pack-refs(1) with specified umask
  t0610: make `--shared=` tests reusable
2024-04-15 14:11:43 -07:00
d75ec4c627 Merge branch 'gt/add-u-commit-i-pathspec-check'
"git add -u <pathspec>" and "git commit [-i] <pathspec>" did not
diagnose a pathspec element that did not match any files in certain
situations, unlike "git add <pathspec>" did.

* gt/add-u-commit-i-pathspec-check:
  builtin/add: error out when passing untracked path with -u
  builtin/commit: error out when passing untracked path with -i
  revision: optionally record matches with pathspec elements
2024-04-15 14:11:43 -07:00
6c142bc846 Merge branch 'ds/fetch-config-parse-microfix'
A config parser callback function fell through instead of returning
after recognising and processing a variable, wasting cycles, which
has been corrected.

* ds/fetch-config-parse-microfix:
  fetch: return when parsing submodule.recurse
2024-04-15 14:11:43 -07:00
ce729ea9ba Merge branch 'rs/apply-reject-fd-leakfix'
A file descriptor leak in an error codepath, used when "git apply
--reject" fails to create the *.rej file, has been corrected.

* rs/apply-reject-fd-leakfix:
  apply: don't leak fd on fdopen() error
2024-04-15 14:11:43 -07:00
c7a9ec4728 Merge branch 'rs/apply-lift-path-length-limit'
"git apply" has been updated to lift the hardcoded pathname length
limit, which in turn allowed a mksnpath() function that is no
longer used.

* rs/apply-lift-path-length-limit:
  path: remove mksnpath()
  apply: avoid fixed-size buffer in create_one_file()
2024-04-15 14:11:42 -07:00
509cc1d413 Merge branch 'ma/win32-unix-domain-socket'
Windows binary used to decide the use of unix-domain socket at
build time, but it learned to make the decision at runtime instead.

* ma/win32-unix-domain-socket:
  Win32: detect unix socket support at runtime
2024-04-15 14:11:42 -07:00
21b5821acd imap-send: increase command size limit
nfvasprintf() has a 8KB limit, but it's not relevant, as its result is
combined with other strings and added to a 1KB buffer by its caller.
That 1KB limit is not mentioned in RFC 9051, which specifies IMAP.

While 1KB is plenty for user names, passwords and mailbox names,
there's no point in limiting our commands like that.  Call xstrvfmt()
instead of open-coding it and use strbuf to format the command to
send, as we need its length.  Fail hard if it exceeds INT_MAX, because
socket_write() can't take more than that.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:34:17 -07:00
8198993c81 bisect: report the found commit with "show"
When "git bisect" finds the first bad commit and shows it to the user,
it calls "git diff-tree" to do so, whose output is meant to be stable
and deliberately ignores end-user customizations.

As the output is supposed to be consumed by humans, replace this with
a call to "git show". This command honors configuration options (such
as "log.date" and "log.mailmap") and other UI improvements (renames
are detected).

Pass some hard-coded options to "git show" to make the output similar
to the one we are replacing, such as showing a patch summary only.

Reported-by: Michael Osipov <michael.osipov@innomotics.com>
Signed-off-By: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:29:09 -07:00
f412d72c19 Documentation: fix linkgit reference
In git-replay documentation, linkgit to git-rev-parse is missing the
man section, which breaks its rendering.

Add section number as done in other references to this command.

Signed-off-by: Yehezkel Bernat <YehezkelShB@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:02:43 -07:00
44bdba2fa6 git-compat-util: fix NO_OPENSSL on current macOS
b195aa00c1 (git-compat-util: suppress unavoidable Apple-specific
deprecation warnings, 2014-12-16) started to define
__AVAILABILITY_MACROS_USES_AVAILABILITY in git-compat-util.h.  On
current versions it is already defined (e.g. on macOS 14.4.1).  Undefine
it before redefining it to avoid a compilation error.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:01:31 -07:00
795006fff4 pack-bitmap: gracefully handle missing BTMP chunks
In 0fea6b73f1 (Merge branch 'tb/multi-pack-verbatim-reuse', 2024-01-12)
we have introduced multi-pack verbatim reuse of objects. This series has
introduced a new BTMP chunk, which encodes information about bitmapped
objects in the multi-pack index. Starting with dab60934e3 (pack-bitmap:
pass `bitmapped_pack` struct to pack-reuse functions, 2023-12-14) we use
this information to figure out objects which we can reuse from each of
the packfiles.

One thing that we glossed over though is backwards compatibility with
repositories that do not yet have BTMP chunks in their multi-pack index.
In that case, `nth_bitmapped_pack()` would return an error, which causes
us to emit a warning followed by another error message. These warnings
are visible to users that fetch from a repository:

```
$ git fetch
...
remote: error: MIDX does not contain the BTMP chunk
remote: warning: unable to load pack: 'pack-f6bb7bd71d345ea9fe604b60cab9ba9ece54ffbe.idx', disabling pack-reuse
remote: Enumerating objects: 40, done.
remote: Counting objects: 100% (40/40), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 40 (delta 5), reused 0 (delta 0), pack-reused 0 (from 0)
...
```

While the fetch succeeds the user is left wondering what they did wrong.
Furthermore, as visible both from the warning and from the reuse stats,
pack-reuse is completely disabled in such repositories.

What is quite interesting is that this issue can even be triggered in
case `pack.allowPackReuse=single` is set, which is the default value.
One could have expected that in this case we fall back to the old logic,
which is to use the preferred packfile without consulting BTMP chunks at
all. But either we fail with the above error in case they are missing,
or we use the first pack in the multi-pack-index. The former case
disables pack-reuse altogether, whereas the latter case may result in
reusing objects from a suboptimal packfile.

Fix this issue by partially reverting the logic back to what we had
before this patch series landed. Namely, in the case where we have no
BTMP chunks or when `pack.allowPackReuse=single` are set, we use the
preferred pack instead of consulting the BTMP chunks.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:42:00 -07:00
9da5c992dd reftable/block: avoid copying block iterators on seek
When seeking a reftable record in a block we need to position the
iterator _before_ the sought-after record so that the next call to
`block_iter_next()` would yield that record. To achieve this, the loop
that performs the linear seeks to restore the previous position once it
has found the record.

This is done by advancing two `block_iter`s: one to check whether the
next record is our sought-after record, and one that we update after
every iteration. This of course involves quite a lot of copying and also
leads to needless memory allocations.

Refactor the code to get rid of the `next` iterator and the copying this
involves. Instead, we can restore the previous offset such that the call
to `next` will return the correct record.

Next to being simpler conceptually this also leads to a nice speedup.
The following benchmark parser 10k refs out of 100k existing refs via
`git-rev-list --no-walk`:

  Benchmark 1: rev-list: print many refs (HEAD~)
    Time (mean ± σ):     170.2 ms ±   1.7 ms    [User: 86.1 ms, System: 83.6 ms]
    Range (min … max):   166.4 ms … 180.3 ms    500 runs

  Benchmark 2: rev-list: print many refs (HEAD~)
    Time (mean ± σ):     161.6 ms ±   1.6 ms    [User: 78.1 ms, System: 83.0 ms]
    Range (min … max):   158.4 ms … 172.3 ms    500 runs

  Summary
    rev-list: print many refs (HEAD) ran
      1.05 ± 0.01 times faster than rev-list: print many refs (HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:37:59 -07:00
ce1f213cc9 reftable/block: reuse zstream state on inflation
When calling `inflateInit()` and `inflate()`, the zlib library will
allocate several data structures for the underlying `zstream` to keep
track of various information. Thus, when inflating repeatedly, it is
possible to optimize memory allocation patterns by reusing the `zstream`
and then calling `inflateReset()` on it to prepare it for the next chunk
of data to inflate.

This is exactly what the reftable code is doing: when iterating through
reflogs we need to potentially inflate many log blocks, but we discard
the `zstream` every single time. Instead, as we reuse the `block_reader`
for each of the blocks anyway, we can initialize the `zstream` once and
then reuse it for subsequent inflations.

Refactor the code to do so, which leads to a significant reduction in
the number of allocations. The following measurements were done when
iterating through 1 million reflog entries. Before:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 23,028 allocs, 22,906 frees, 162,813,552 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 302 allocs, 180 frees, 88,352 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
15a60b747e reftable/block: open-code call to uncompress2()
The reftable format stores log blocks in a compressed format. Thus,
whenever we want to read such a block we first need to decompress it.
This is done by calling the convenience function `uncompress2()` of the
zlib library, which is a simple wrapper that manages the lifecycle of
the `zstream` structure for us.

While nice for one-off inflation of data, when iterating through reflogs
we will likely end up inflating many such log blocks. This requires us
to reallocate the state of the `zstream` every single time, which adds
up over time. It would thus be great to reuse the `zstream` instead of
discarding it after every inflation.

Open-code the call to `uncompress2()` such that we can start reusing the
`zstream` in the subsequent commit. Note that our open-coded variant is
different from `uncompress2()` in two ways:

  - We do not loop around `inflate()` until we have processed all input.
    As our input is limited by the maximum block size, which is 16MB, we
    should not hit limits of `inflate()`.

  - We use `Z_FINISH` instead of `Z_NO_FLUSH`. Quoting the `inflate()`
    documentation: "inflate() should normally be called until it returns
    Z_STREAM_END or an error. However if all decompression is to be
    performed in a single step (a single call of inflate), the parameter
    flush should be set to Z_FINISH."

    Furthermore, "Z_FINISH also informs inflate to not maintain a
    sliding window if the stream completes, which reduces inflate's
    memory footprint."

Other than that this commit is expected to be functionally equivalent
and does not yet reuse the `zstream`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
dd347bbce6 reftable/block: reuse uncompressed blocks
The reftable backend stores reflog entries in a compressed format and
thus needs to uncompress blocks before one can read records from it.
For each reflog block we thus have to allocate an array that we can
decompress the block contents into. This block is being discarded
whenever the table iterator moves to the next block. Consequently, we
reallocate a new array on every block, which is quite wasteful.

Refactor the code to reuse the uncompressed block data when moving the
block reader to a new block. This significantly reduces the number of
allocations when iterating through many compressed blocks. The following
measurements are done with `git reflog list` when listing 100k reflogs.
Before:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 45,755 allocs, 45,633 frees, 254,779,456 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 23,028 allocs, 22,906 frees, 162,813,547 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
b00bcb7c49 reftable/reader: iterate to next block in place
The table iterator has to iterate towards the next block once it has
yielded all records of the current block. This is done by creating a new
table iterator, initializing it to the next block, releasing the old
iterator and then copying over the data.

Refactor the code to instead advance the table iterator in place. This
is simpler and unlocks some optimizations in subsequent patches. Also,
it allows us to avoid some allocations.

The following measurements show a single matching ref out of 1 million
refs. Before this change:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 315 allocs, 190 frees, 107,027 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
bcdc586db0 reftable/block: move ownership of block reader into struct table_iter
The table iterator allows the caller to iterate through all records in a
reftable table. To do so it iterates through all blocks of the desired
type one by one, where for each block it creates a new block iterator
and yields all its entries.

One of the things that is somewhat confusing in this context is who owns
the block reader that is being used to read the blocks and pass them to
the block iterator. Intuitively, as the table iterator is responsible
for iterating through the blocks, one would assume that this iterator is
also responsible for managing the lifecycle of the reader. And while it
somewhat is, the block reader is ultimately stored inside of the block
iterator.

Refactor the code such that the block reader is instead fully managed by
the table iterator. Instead of passing the reader to the block iterator,
we now only end up passing the block data to it. Despite clearing up the
lifecycle of the reader, it will also allow for better reuse of the
reader in subsequent patches.

The following benchmark prints a single matching ref out of 1 million
refs. Before:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 6,607 allocs, 6,482 frees, 509,635 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated

Note that while there are more allocation and free calls now, the
overall number of bytes allocated is significantly lower. The number of
allocations will be reduced significantly by the next patch though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
b371221a60 reftable/block: introduce block_reader_release()
Introduce a new function `block_reader_release()` that releases
resources acquired by the block reader. This function will be extended
in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
aac8c03cc4 reftable/block: better grouping of functions
Function definitions and declaration of `struct block_reader` and
`struct block_iter` are somewhat mixed up, making it hard to see which
functions belong together. Rearrange them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
42c7bdc36d reftable/block: merge block_iter_seek() and block_reader_seek()
The function `block_iter_seek()` is merely a simple wrapper around
`block_reader_seek()`. Merge those two functions into a new function
`block_iter_seek_key()` that more clearly says what it is actually
doing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
3122d44025 reftable/block: rename block_reader_start()
The function `block_reader_start()` does not really apply to the block
reader, but to the block iterator. It's name is thus somewhat confusing.
Rename it to `block_iter_seek_start()` to clarify.

We will rename `block_reader_seek()` in similar spirit in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
48e1ca27b1 launch_editor: waiting message on error
When advice.waitingForEditor configuration is not set to false, we show
a hint telling that we are waiting for user's editor to close the file
when we launch an editor and wait for it to return control back to us.
We give the message on an incomplete line, expecting that we can go back
to the beginning of the line and clear the message when the editor returns.

However, it is possible that the editor exits with an error status, in
which case we show an error message and then return to our caller.  In
such a case, the error message is given where the terminal cursor
happens to be, which is most likely after the "we are waiting for your
editor" message on the same line.

Clear the line before showing the error.

While we're here, make the error message follow our CodingGuideLines.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:13:32 -07:00
ab4ad1fa8a fast-import: make comments more precise
The former is somewhat imprecise. The latter became out of sync with the
behavior in e814c39c2f (fast-import: refactor parsing of spaces,
2014-06-18).

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:18 -07:00
be4d6a371e fast-import: forbid escaped NUL in paths
NUL cannot appear in paths. Even disregarding filesystem path
limitations, the tree object format delimits with NUL, so such a path
cannot be encoded by Git.

When a quoted path is unquoted, it could possibly contain NUL from
"\000". Forbid it so it isn't truncated.

fast-import still has other issues with NUL, but those will be addressed
later.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:18 -07:00
a923a04b80 fast-import: document C-style escapes for paths
Simply saying “C-style” string quoting is imprecise, as only a subset of
C escapes are supported. Document the exact escapes.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:18 -07:00
22915955ca fast-import: improve documentation for path quoting
It describes what characters cannot be in an unquoted path, but not
their semantics. Reframe it as a definition of unquoted paths. From the
perspective of the parser, whether it starts with `"` is what defines
whether it will parse it as quoted or unquoted.

The restrictions on characters in unquoted paths (with starting-", LF,
and spaces) are explained in the quoted paragraph. Move it to the
unquoted paragraph and reword.

The restriction that the source paths of filecopy and filerename cannot
contain SP is only stated in their respective sections. Restate it in
the <path> section.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:18 -07:00
212ab23e98 fast-import: remove dead strbuf
The strbuf in `note_change_n` is to copy the remainder of `p` before
potentially invalidating it when reading the next line. However, `p` is
not used after that point. It has been unused since the function was
created in a8dd2e7d2b (fast-import: Add support for importing commit
notes, 2009-10-09) and looks to be a fossil from adapting
`file_change_m`. Remove it.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:17 -07:00
b5062f752e fast-import: allow unquoted empty path for root
Ever since filerename was added in f39a946a1f (Support wholesale
directory renames in fast-import, 2007-07-09) and filecopy in b6f3481bb4
(Teach fast-import to recursively copy files/directories, 2007-07-15),
both have produced an error when the destination path is empty. Later,
when support for targeting the root directory with an empty string was
added in 2794ad5244 (fast-import: Allow filemodify to set the root,
2010-10-10), this had the effect of allowing the quoted empty string
(`""`), but forbidding its unquoted variant (``). This seems to have
been intended as simple data validation for parsing two paths, rather
than a syntax restriction, because it was not extended to the other
operations.

All other occurrences of paths (in filemodify, filedelete, the source of
filecopy and filerename, and ls) allow both.

For most of this feature's lifetime, the documentation has not
prescribed the use of quoted empty strings. In e5959106d6
(Documentation/fast-import: put explanation of M 040000 <dataref> "" in
context, 2011-01-15), its documentation was changed from “`<path>` may
also be an empty string (`""`) to specify the root of the tree” to “The
root of the tree can be represented by an empty string as `<path>`”.

Thus, we should assume that some front-ends have depended on this
behavior.

Remove this restriction for the destination paths of filecopy and
filerename and change tests targeting the root to test `""` and ``.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:17 -07:00
5733f894d7 fast-import: directly use strbufs for paths
Previously, one case would not write the path to the strbuf: when the
path is unquoted and at the end of the string. It was essentially
copy-on-write. However, with the logic simplification of the previous
commit, this case was eliminated and the strbuf is always populated.

Directly use the strbufs now instead of an alias.

Since this already changes all the lines that use the strbufs, rename
them from `uq` to be more descriptive. That they are unquoted is not
their most important property, so name them after what they carry.

Additionally, `file_change_m` no longer needs to copy the path before
reading inline data.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:17 -07:00
0df86b6689 fast-import: tighten path unquoting
Path parsing in fast-import is inconsistent and many unquoting errors
are suppressed or not checked.

<path> appears in the grammar in these places:

    filemodify ::= 'M' SP <mode> (<dataref> | 'inline') SP <path> LF
    filedelete ::= 'D' SP <path> LF
    filecopy   ::= 'C' SP <path> SP <path> LF
    filerename ::= 'R' SP <path> SP <path> LF
    ls         ::= 'ls' SP <dataref> SP <path> LF
    ls-commit  ::= 'ls' SP <path> LF

and fast-import.c parses them in five different ways:

1. For filemodify and filedelete:
   Try to unquote <path>. If it unquotes without errors, use the
   unquoted version; otherwise, treat it as literal bytes to the end of
   the line (including any number of SP).
2. For filecopy (source) and filerename (source):
   Try to unquote <path>. If it unquotes without errors, use the
   unquoted version; otherwise, treat it as literal bytes up to, but not
   including, the next SP.
3. For filecopy (dest) and filerename (dest):
   Like 1., but an unquoted empty string is forbidden.
4. For ls:
   If <path> starts with `"`, unquote it and report parse errors;
   otherwise, treat it as literal bytes to the end of the line
   (including any number of SP).
5. For ls-commit:
   Unquote <path> and report parse errors.
   (It must start with `"` to disambiguate from ls.)

In the first three, any errors from trying to unquote a string are
suppressed, so a quoted string that contains invalid escapes would be
interpreted as literal bytes. For example, `"\xff"` would fail to
unquote (because hex escapes are not supported), and it would instead be
interpreted as the byte sequence '"', '\\', 'x', 'f', 'f', '"', which is
certainly not intended. Some front-ends erroneously use their language's
standard quoting routine instead of matching Git's, which could silently
introduce escapes that would be incorrectly parsed due to this and lead
to data corruption.

The documentation states “To use a source path that contains SP the path
must be quoted.”, so it is expected that some implementations depend on
spaces being allowed in paths in the final position. Thus we have two
documented ways to parse paths, so simplify the implementation to that.

Now we have:

1. `parse_path_eol` for filemodify, filedelete, filecopy (dest),
   filerename (dest), ls, and ls-commit:

   If <path> starts with `"`, unquote it and report parse errors;
   otherwise, treat it as literal bytes to the end of the line
   (including any number of SP).

2. `parse_path_space` for filecopy (source) and filerename (source):

   If <path> starts with `"`, unquote it and report parse errors;
   otherwise, treat it as literal bytes up to, but not including, the
   next SP. It must be followed by SP.

There remain two special cases: The dest <path> in filecopy and rename
cannot be an unquoted empty string (this will be addressed subsequently)
and <path> in ls-commit must be quoted to disambiguate it from ls.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:17 -07:00
8f7582d995 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 11:31:39 -07:00
d8360a86ed Merge branch 'tb/midx-write'
Code clean-up by splitting code responsible for writing midx files
into its own file.

* tb/midx-write:
  midx-write.c: use `--stdin-packs` when repacking
  midx-write.c: check count of packs to repack after grouping
  midx-write.c: factor out common want_included_pack() routine
  midx-write: move writing-related functions from midx.c
2024-04-12 11:31:39 -07:00
28dc93bab0 Merge branch 'rs/t-prio-queue-cleanup'
t-prio-queue test has been cleaned up by using C99 compound
literals; this is meant to also serve as a weather-balloon to smoke
out folks with compilers who have trouble compiling code that uses
the feature.

* rs/t-prio-queue-cleanup:
  t-prio-queue: simplify using compound literals
2024-04-12 11:31:39 -07:00
7fbe3ead19 Merge branch 'ps/reftable-binsearch-updates'
Reftable code clean-up and some bugfixes.

* ps/reftable-binsearch-updates:
  reftable/block: avoid decoding keys when searching restart points
  reftable/record: extract function to decode key lengths
  reftable/block: fix error handling when searching restart points
  reftable/block: refactor binary search over restart points
  reftable/refname: refactor binary search over refnames
  reftable/basics: improve `binsearch()` test
  reftable/basics: fix return type of `binsearch()` to be `size_t`
2024-04-12 11:31:39 -07:00
847af43a3a Merge branch 'jc/checkout-detach-wo-tracking-report'
"git checkout/switch --detach foo", after switching to the detached
HEAD state, gave the tracking information for the 'foo' branch,
which was pointless.

Tested-by: M Hickford <mirth.hickford@gmail.com>
cf. <CAGJzqsmE9FDEBn=u3ge4LA3ha4fDbm4OWiuUbMaztwjELBd7ug@mail.gmail.com>

* jc/checkout-detach-wo-tracking-report:
  checkout: omit "tracking" information on a detached HEAD
2024-04-12 11:31:39 -07:00
d8800f630a Merge branch 'rs/imap-send-use-xsnprintf'
Code clean-up and duplicate reduction.

* rs/imap-send-use-xsnprintf:
  imap-send: use xsnprintf to format command
2024-04-12 11:31:38 -07:00
d842e22ebb Merge branch 'js/merge-tree-3-trees'
Match the option argument type in the help text to the correct type
updated by a recent series.

* js/merge-tree-3-trees:
  merge-tree: fix argument type of the `--merge-base` option
2024-04-12 11:31:38 -07:00
0c6ee971fb merge-tree: fix argument type of the --merge-base option
In 5f43cf5b2e (merge-tree: accept 3 trees as arguments, 2024-01-28), I
taught `git merge-tree` to perform three-way merges on trees. This
commit even changed the manual page to state that the `--merge-base`
option takes a tree-ish rather than requiring a commit.

But I forgot to adjust the in-program help text. This patch fixes that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 09:10:43 -07:00
5da40be8d7 Documentation: fix typos describing date format
This commit corrects a typographical error found in both
date-formats.txt and git-fast-import.txt documentation, where the term
`email format` was mistakenly used instead of `date format`.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 09:03:03 -07:00
70b81fbf3c t0612: add tests to exercise Git/JGit reftable compatibility
While the reftable format is a recent introduction in Git, JGit already
knows to read and write reftables since 2017. Given the complexity of
the format there is a very real risk of incompatibilities between those
two implementations, which is something that we really want to avoid.

Add some basic tests that verify that reftables written by Git and JGit
can be read by the respective other implementation. For now this test
suite is rather small, only covering basic functionality. But it serves
as a good starting point and can be extended over time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:51 -07:00
db1d63bf57 t0610: fix non-portable variable assignment
Older versions of the Dash shell fail to parse `local var=val`
assignments in some cases when `val` is unquoted. Such failures can be
observed e.g. with Ubuntu 20.04 and older, which has a Dash version that
still has this bug.

Such an assignment has been introduced in t0610. The issue wasn't
detected for a while because this test used to only run when the
GIT_TEST_DEFAULT_REF_FORMAT environment variable was set to "reftable".
We have dropped that requirement now though, meaning that it runs
unconditionally, including on jobs which use such older versions of
Ubuntu.

We have worked around such issues in the past, e.g. in ebee5580ca
(parallel-checkout: avoid dash local bug in tests, 2021-06-06), by
quoting the `val` side. Apply the same fix to t0610.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:51 -07:00
ca13c3e94a t06xx: always execute backend-specific tests
The tests in t06xx exercise specific ref formats. Next to probing some
basic functionality, these tests also exercise other low-level details
specific to the format. Those tests are only executed though in case
`GIT_TEST_DEFAULT_REF_FORMAT` is set to the ref format of the respective
backend-under-test.

Ideally, we would run the full test matrix for ref formats such that our
complete test suite is executed with every supported format on every
supported platform. This is quite an expensive undertaking though, and
thus we only execute e.g. the "reftable" tests on macOS and Linux. As a
result, we basically have no test coverage for the "reftable" format at
all on other platforms like Windows.

Adapt these tests so that they override `GIT_TEST_DEFAULT_REF_FORMAT`,
which means that they'll always execute. This increases test coverage on
platforms that don't run the full test matrix, which at least gives us
some basic test coverage on those platforms for the "reftable" format.

This of course comes at the cost of running those tests multiple times
on platforms where we do run the full test matrix. But arguably, this is
a good thing because it will also cause us to e.g. run those tests with
the address sanitizer and other non-standard parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:51 -07:00
04ba2c7eb3 ci: install JGit dependency
We have some tests in t5310 that use JGit to verify that bitmaps can be
read both by Git and by JGit. We do not execute these tests in our CI
jobs though because we don't make JGit available there. Consequently,
the tests basically bitrot because almost nobody is ever going to have
JGit in their path.

Install JGit to plug this test gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
ca44ef3165 ci: make Perforce binaries executable for all users
The Perforce binaries are only made executable for the current user. On
GitLab CI though we execute tests as a different user than "root", and
thus these binaries may not be executable by that test user at all. This
has gone unnoticed so far because those binaries are optional -- in case
they don't exist we simply skip over tests requiring them.

Fix the setup so that we set the executable bits for all users.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
9cdeb34b96 ci: merge scripts which install dependencies
We have two different scripts which install dependencies, one for
dockerized jobs and one for non-dockerized ones. Naturally, these
scripts have quite some duplication. Furthermore, either of these
scripts is missing some test dependencies that the respective other
script has, thus reducing test coverage.

Merge those two scripts such that there is a single source of truth for
test dependencies, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
2c5c7639e5 ci: fix setup of custom path for GitLab CI
Part of "install-dependencies.sh" is to install some binaries required
for tests into a custom directory that gets added to the PATH. This
directory is located at "$HOME/path" and thus depends on the current
user that the script executes as.

This creates problems for GitLab CI, which installs dependencies as the
root user, but runs tests as a separate, unprivileged user. As their
respective home directories are different, we will end up using two
different custom path directories. Consequently, the unprivileged user
will not be able to find the binaries that were set up as root user.

Fix this issue by allowing CI to override the custom path, which allows
GitLab to set up a constant value that isn't derived from "$HOME".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
d1ef3d3b1d ci: merge custom PATH directories
We're downloading various executables required by our tests. Each of
these executables goes into its own directory, which is then appended to
the PATH variable. Consequently, whenever we add a new dependency and
thus a new directory, we would have to adapt to this change in several
places.

Refactor this to instead put all binaries into a single directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
40c60f4c12 ci: convert "install-dependencies.sh" to use "/bin/sh"
We're about to merge the "install-docker-dependencies.sh" script into
"install-dependencies.sh". This will also move our Alpine-based jobs
over to use the latter script. This script uses the Bash shell though,
which is not available by default on Alpine Linux.

Refactor "install-dependencies.sh" to use "/bin/sh" instead of Bash.
This requires us to get rid of the pushd/popd invocations, which are
replaced by some more elaborate commands that download or extract
executables right to where they are needed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
21bcb4a602 ci: drop duplicate package installation for "linux-gcc-default"
The "linux-gcc-default" job installs common Ubuntu packages. This is
already done in the distro-specific switch, so we basically duplicate
the effort here.

Drop the duplicate package installations and inline the variable that
contains those common packages.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
11d3f1aa5f ci: skip sudo when we are already root
Our "install-dependencies.sh" script is executed by non-dockerized jobs
to install dependencies. These jobs don't run with "root" permissions,
but with a separate user. Consequently, we need to use sudo(8) there to
elevate permissions when installing packages.

We're about to merge "install-docker-dependencies.sh" into that script
though, and our Docker containers do run as "root". Using sudo(8) is
thus unnecessary there, even though it would be harmless. On some images
like Alpine Linux though there is no sudo(8) available by default, which
would consequently break the build.

Adapt the script to make "sudo" a no-op when running as "root" user.
This allows us to easily reuse the script for our dockerized jobs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
ab2b3aadf3 ci: expose distro name in dockerized GitHub jobs
Expose a distro name in dockerized jobs. This will be used in a
subsequent commit where we merge the installation scripts for dockerized
and non-dockerized jobs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
2d65e5b6a6 ci: rename "runs_on_pool" to "distro"
The "runs_on_pool" environment variable is used by our CI scripts to
distinguish the different kinds of operating systems. It is quite
specific to GitHub Actions though and not really a descriptive name.

Rename the variable to "distro" to clarify its intent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:49 -07:00
6741e917de repository: avoid leaking fsmonitor data
The `fsmonitor` repo-setting data is allocated lazily. It needs to be
released in `repo_clear()` along the rest of the allocated data in
`repository`.

This is needed because the next commit will merge v2.39.4, which will
add a `git checkout --recurse-modules` call (which would leak memory
without this here fix) to an otherwise leak-free test script.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-12 11:09:50 +02:00
84a7c33a4b typo: replace 'commitish' with 'committish'
Across only three files, comments and a single function name used
'commitish' rather than 'commit-ish' or 'committish' as the spelling.
The git glossary accepts a hyphen or a double-t, but not a single-t.
Despite the typo in a translation file, none of the typos appear in
user-visible locations.

Signed-off-by: Pi Fisher <Pi.L.D.Fisher@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-11 15:14:56 -07:00
5d312ec8a4 remote-curl: add Transfer-Encoding header only for older curl
As of curl 7.66.0, we don't need to manually specify a "chunked"
Transfer-Encoding header. Instead, modern curl deduces the need for it
in a POST that has a POSTFIELDSIZE of -1 and uses READFUNCTION rather
than POSTFIELDS.

That version is recent enough that we can't just drop the header; we
need to do so conditionally. Since it's only a single line, it seems
like the simplest thing would just be to keep setting it unconditionally
(after all, the #ifdefs are much longer than the actual code). But
there's another wrinkle: HTTP/2.

Curl may choose to use HTTP/2 under the hood if the server supports it.
And in that protocol, we do not use the chunked encoding for streaming
at all. Most versions of curl handle this just fine by recognizing and
removing the header. But there's a regression in curl 8.7.0 and 8.7.1
where it doesn't, and large requests over HTTP/2 are broken (which t5559
notices). That regression has since been fixed upstream, but not yet
released.

Make the setting of this header conditional, which will let Git work
even with those buggy curl versions. And as a bonus, it serves as a
reminder that we can eventually clean up the code as we bump the
supported curl versions.

This is a backport of 92a209bf24 (remote-curl: add Transfer-Encoding
header only for older curl, 2024-04-05) into the `maint-2.39` branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-10 19:24:48 +02:00
ea61df9d1e INSTALL: bump libcurl version to 7.21.3
Our documentation claims we support curl versions back to 7.19.5. But we
can no longer compile with that version since adding an unconditional
use of CURLOPT_RESOLVE in 511cfd3bff (http: add custom hostname to IP
address resolutions, 2022-05-16). That feature wasn't added to libcurl
until 7.21.3.

We could add #ifdefs to make this work back to 7.19.5. But given that
nobody noticed the compilation failure in the intervening two years, it
makes more sense to bump the version in the documentation to 7.21.3
(which is itself over 13 years old).

We could perhaps go forward even more (which would let us drop some
cruft from git-curl-compat.h), but this should be an obviously safe
jump, and we can move forward later.

Note that user-visible syntax for CURLOPT_RESOLVE has grown new features
in subsequent curl versions. Our documentation mentions "+" and "-"
entries, which require more recent versions than 7.21.3. We could
perhaps clarify that in our docs, but it's probably not worth cluttering
them with restrictions of ancient curl versions.

This is a backport of c28ee09503 (INSTALL: bump libcurl version to
7.21.3, 2024-04-02) into the `maint-2.39` branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-10 19:24:48 +02:00
580097bf95 http: reset POSTFIELDSIZE when clearing curl handle
In get_active_slot(), we return a CURL handle that may have been used
before (reusing them is good because it lets curl reuse the same
connection across many requests). We set a few curl options back to
defaults that may have been modified by previous requests.

We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
defaults to "-1"). This usually doesn't matter because most POSTs will
set both fields together anyway. But there is one exception: when
handling a large request in remote-curl's post_rpc(), we don't set
_either_, and instead set a READFUNCTION to stream data into libcurl.

This can interact weirdly with a stale POSTFIELDSIZE setting, because
curl will assume it should read only some set number of bytes from our
READFUNCTION. However, it has worked in practice because we also
manually set a "Transfer-Encoding: chunked" header, which libcurl uses
as a clue to set the POSTFIELDSIZE to -1 itself.

So everything works, but we're better off resetting the size manually
for a few reasons:

  - there was a regression in curl 8.7.0 where the chunked header
    detection didn't kick in, causing any large HTTP requests made by
    Git to fail. This has since been fixed (but not yet released). In
    the issue, curl folks recommended setting it explicitly to -1:

      https://github.com/curl/curl/issues/13229#issuecomment-2029826058

    and it indeed works around the regression. So even though it won't
    be strictly necessary after the fix there, this will help folks who
    end up using the affected libcurl versions.

  - it's consistent with what a new curl handle would look like. Since
    get_active_slot() may or may not return a used handle, this reduces
    the possibility of heisenbugs that only appear with certain request
    patterns.

Note that the recommendation in the curl issue is to actually drop the
manual Transfer-Encoding header. Modern libcurl will add the header
itself when streaming from a READFUNCTION. However, that code wasn't
added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
support back to 7.19.5, so those older versions still need the manual
header.

This is a backport of 3242311742 (http: reset POSTFIELDSIZE when
clearing curl handle, 2024-04-02) into the `maint-2.39` branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-10 19:24:48 +02:00
436d4e5b14 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-10 10:00:09 -07:00
f43863e686 Merge branch 'jc/t2104-style-update'
Coding style fixes.

* jc/t2104-style-update:
  t2104: style fixes
2024-04-10 10:00:09 -07:00
280b74ce18 Merge branch 'kn/clarify-update-ref-doc'
Doc update, as a preparation to enhance "git update-ref --stdin".

* kn/clarify-update-ref-doc:
  githooks: use {old,new}-oid instead of {old,new}-value
  update-ref: use {old,new}-oid instead of {old,new}value
2024-04-10 10:00:08 -07:00
a4a1453ad1 Merge branch 'vs/complete-with-set-u-fix'
Another "set -u" fix for the bash prompt (in contrib/) script.

* vs/complete-with-set-u-fix:
  completion: protect prompt against unset SHOWUPSTREAM in nounset mode
  completion: fix prompt with unset SHOWCONFLICTSTATE in nounset mode
2024-04-10 10:00:08 -07:00
aaf524cfb0 Merge branch 'rs/mem-pool-size-t-safety'
size_t arithmetic safety.

* rs/mem-pool-size-t-safety:
  mem-pool: use st_add() in mem_pool_strvfmt()
2024-04-10 10:00:08 -07:00
dc89c59951 Merge branch 'ds/typofix-core-config-doc'
Typofix.

* ds/typofix-core-config-doc:
  config: fix some small capitalization issues, as spotted
2024-04-10 10:00:08 -07:00
c02dc38570 send-email: move newline characters out of a few translatable strings
Move the already existing newline characters out of a few translatable
strings, to help a bit with the translation efforts.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-10 09:11:10 -07:00
03e84cca5d t9604: Fix test for musl libc and new Debian
CST6CDT and the like are POSIX timezone, with no rule for transition.
And POSIX doesn't enforce how to interpret the rule if it's omitted.
Some libc (e.g. glibc) resorted back to IANA (formerly Olson) db rules
for those timezones.  Some libc (e.g. FreeBSD) uses a fixed rule.
Other libc (e.g. musl) interpret that as no transition at all [1].

In addition, distributions (notoriously Debian-derived, which uses IANA
db for CST6CDT and the like) started to split "legacy" timezones
like CST6CDT, EST5EDT into `tzdata-legacy', which will not be installed
by default [2].

In those cases, t9604 will run into failure.

Let's switch to POSIX timezone with rules to change timezone.

1: http://mm.icann.org/pipermail/tz/2024-March/058751.html
2: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1043250

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-10 09:10:31 -07:00
8d320cec60 t2104: style fixes
We use tabs to indent, not two or four spaces.

These days, even the test fixture preparation should be done inside
test_expect_success block.

Address these two style violations in this test.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 22:27:49 -07:00
b4454d5a7b t3428: restore coverage for "apply" backend
This test file assumes the "apply" backend is the default which is not
the case since 2ac0d6273f (rebase: change the default backend from "am"
to "merge", 2020-02-15). Make sure the "apply" backend is tested by
specifying it explicitly.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 16:03:19 -07:00
1ad81756b4 t3428: use test_commit_message
Using a helper function makes the tests shorter and avoids running "git
cat-file" upstream of a pipe.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 16:03:19 -07:00
aac1c6e8f5 t3428: modernize test setup
Perform the setup in a dedicated test so the later tests can be run
independently. Also avoid running git upstream of a pipe and take
advantage of test_commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 16:03:19 -07:00
91ec36f2cc The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 14:31:45 -07:00
8f31543f3d Merge branch 'rj/use-adv-if-enabled'
Use advice_if_enabled() API to rewrite a simple pattern to
call advise() after checking advice_enabled().

* rj/use-adv-if-enabled:
  add: use advise_if_enabled for ADVICE_ADD_EMBEDDED_REPO
  add: use advise_if_enabled for ADVICE_ADD_EMPTY_PATHSPEC
  add: use advise_if_enabled for ADVICE_ADD_IGNORED_FILE
2024-04-09 14:31:45 -07:00
eacfd581d2 Merge branch 'ps/pack-refs-auto'
"git pack-refs" learned the "--auto" option, which is a useful
addition to be triggered from "git gc --auto".

Acked-by: Karthik Nayak <karthik.188@gmail.com>
cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com>

* ps/pack-refs-auto:
  builtin/gc: pack refs when using `git maintenance run --auto`
  builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs
  t6500: extract objects with "17" prefix
  builtin/gc: move `struct maintenance_run_opts`
  builtin/pack-refs: introduce new "--auto" flag
  builtin/pack-refs: release allocated memory
  refs/reftable: expose auto compaction via new flag
  refs: remove `PACK_REFS_ALL` flag
  refs: move `struct pack_refs_opts` to where it's used
  t/helper: drop pack-refs wrapper
  refs/reftable: print errors on compaction failure
  reftable/stack: gracefully handle failed auto-compaction due to locks
  reftable/stack: use error codes when locking fails during compaction
  reftable/error: discern locked/outdated errors
  reftable/stack: fix error handling in `reftable_stack_init_addition()`
2024-04-09 14:31:45 -07:00
a6abddab1e Merge branch 'es/test-cron-safety'
The test script had an incomplete and ineffective attempt to avoid
clobbering the testing user's real crontab (and its equivalents),
which has been completed.

* es/test-cron-safety:
  test-lib: fix non-functioning GIT_TEST_MAINT_SCHEDULER fallback
2024-04-09 14:31:45 -07:00
989bf45394 Merge branch 'rj/add-p-explicit-reshow'
"git add -p" and other "interactive hunk selection" UI has learned to
skip showing the hunk immediately after it has already been shown, and
an additional action to explicitly ask to reshow the current hunk.

* rj/add-p-explicit-reshow:
  add-patch: do not print hunks repeatedly
  add-patch: introduce 'p' in interactive-patch
2024-04-09 14:31:44 -07:00
4b4081034b Merge branch 'mg/editorconfig-makefile'
The .editorconfig file has been taught that a Makefile uses HT
indentation.

* mg/editorconfig-makefile:
  editorconfig: add Makefiles to "text files"
2024-04-09 14:31:44 -07:00
58dd7e4b11 Merge branch 'ja/doc-markup-updates'
Documentation rules has been explicitly described how to mark-up
literal parts and a few manual pages have been updated as examples.

* ja/doc-markup-updates:
  doc: git-clone: do not autoreference the manpage in itself
  doc: git-clone: apply new documentation formatting guidelines
  doc: git-init: apply new documentation formatting guidelines
  doc: allow literal and emphasis format in doc vs help tests
  doc: rework CodingGuidelines with new formatting rules
2024-04-09 14:31:44 -07:00
4697c8a445 Merge branch 'dg/myfirstobjectwalk-updates'
Update a more recent tutorial doc.

* dg/myfirstobjectwalk-updates:
  MyFirstObjectWalk: add stderr to pipe processing
  MyFirstObjectWalk: fix description for counting omitted objects
  MyFirstObjectWalk: fix filtered object walk
  MyFirstObjectWalk: fix misspelled "builtins/"
  MyFirstObjectWalk: use additional arg in config_fn_t
2024-04-09 14:31:44 -07:00
39b2c6f77e Merge branch 'jc/advice-sans-trailing-whitespace'
The "hint:" messages given by the advice mechanism, when given a
message with a blank line, left a line with trailing whitespace,
which has been cleansed.

* jc/advice-sans-trailing-whitespace:
  advice: omit trailing whitespace
2024-04-09 14:31:43 -07:00
8289a36f87 Merge branch 'jc/apply-parse-diff-git-header-names-fix'
"git apply" failed to extract the filename the patch applied to,
when the change was about an empty file created in or deleted from
a directory whose name ends with a SP, which has been corrected.

* jc/apply-parse-diff-git-header-names-fix:
  t4126: fix "funny directory name" test on Windows (again)
  t4126: make sure a directory with SP at the end is usable
  apply: parse names out of "diff --git" more carefully
2024-04-09 14:31:43 -07:00
69d87802da t0610: execute git-pack-refs(1) with specified umask
The tests for git-pack-refs(1) with the `core.sharedRepository` config
execute git-pack-refs(1) outside of the shell that has the expected
umask set. This is wrong because we want to test the behaviour of that
command with different umasks. The issue went unnoticed because most
distributions have a default umask of 0022, and we only ever test with
`--shared=true`, which re-adds the group write bit.

Fix the issue by moving git-pack-refs(1) into the umask'd shell and add
a bunch of test cases that exercise behaviour more thoroughly.

Note that we drop the check for whether `core.sharedRepository` was set
to the correct value to make the test setup a bit easier. We should be
able to rely on git-init(1) doing its thing correctly. Furthermore, to
help readability, we convert tests that pass `--shared=true` to instead
pass the equivalent `--shared=group`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 14:14:00 -07:00
2f960dd5fe t0610: make --shared= tests reusable
We have two kinds of `--shared=` tests, one for git-init(1) and one for
git-pack-refs(1). Merge them into a reusable function such that we can
easily add additional testcases with different umasks and flags for the
`--shared=` switch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 14:14:00 -07:00
fa74f32291 reftable/block: reuse compressed array
Similar to the preceding commit, let's reuse the `compressed` array that
we use to store compressed data in. This results in a small reduction in
memory allocations when writing many refs.

Before:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,618,257 allocs, 22,618,106 frees, 1,236,351,528 bytes allocated

So while the reduction in allocations isn't really all that big, it's a
low hanging fruit and thus there isn't much of a reason not to pick it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:42 -07:00
a155ab2bf4 reftable/block: reuse zstream when writing log blocks
While most reftable blocks are written to disk as-is, blocks for log
records are compressed with zlib. To compress them we use `compress2()`,
which is a simple wrapper around the more complex `zstream` interface
that would require multiple function invocations.

One downside of this interface is that `compress2()` will reallocate
internal state of the `zstream` interface on every single invocation.
Consequently, as we call `compress2()` for every single log block which
we are about to write, this can lead to quite some memory allocation
churn.

Refactor the code so that the block writer reuses a `zstream`. This
significantly reduces the number of bytes allocated when writing many
refs in a single transaction, as demonstrated by the following benchmark
that writes 100k refs in a single transaction.

Before:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,631,887 allocs, 22,631,736 frees, 1,854,670,793 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:42 -07:00
8aaeffe3b5 reftable/writer: reset last_key instead of releasing it
The reftable writer tracks the last key that it has written so that it
can properly compute the compressed prefix for the next record it is
about to write. This last key must be reset whenever we move on to write
the next block, which is done in `writer_reinit_block_writer()`. We do
this by calling `strbuf_release()` though, which needlessly deallocates
the underlying buffer.

Convert the code to use `strbuf_reset()` instead, which saves one
allocation per block we're about to write. This requires us to also
amend `reftable_writer_free()` to release the buffer's memory now as we
previously seemingly relied on `writer_reinit_block_writer()` to release
the memory for us. Releasing memory here is the right thing to do
anyway.

While at it, convert a callsite where we truncate the buffer by setting
its length to zero to instead use `strbuf_reset()`, too.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
60dd319519 reftable/writer: unify releasing memory
There are two code paths which release memory of the reftable writer:

  - `reftable_writer_close()` releases internal state after it has
    written data.

  - `reftable_writer_free()` releases the block that was written to and
    the writer itself.

Both code paths free different parts of the writer, and consequently the
caller must make sure to call both. And while callers mostly do this
already, this falls apart when a write failure causes the caller to skip
calling `reftable_write_close()`.

Introduce a new function `reftable_writer_release()` that releases all
internal state and call it from both paths. Like this it is fine for the
caller to not call `reftable_writer_close()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
7e892fec47 reftable/writer: refactorings for writer_flush_nonempty_block()
Large parts of the reftable library do not conform to Git's typical code
style. Refactor `writer_flush_nonempty_block()` such that it conforms
better to it and add some documentation that explains some of its more
intricate behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
d0dd119f72 reftable/writer: refactorings for writer_add_record()
Large parts of the reftable library do not conform to Git's typical code
style. Refactor `writer_add_record()` such that it conforms better to it
and add some documentation that explains some of its more intricate
behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
44afd85fbd refs/reftable: don't recompute committer ident
In order to write reflog entries we need to compute the committer's
identity as it gets encoded in the log record itself. The reftable
backend does this via `git_committer_info()` and `split_ident_line()` in
`fill_reftable_log_record()`, which use the Git config as well as
environment variables to figure out the identity.

While most callers would only call `fill_reftable_log_record()` once or
twice, `write_transaction_table()` will call it as many times as there
are queued ref updates. This can be quite a waste of effort when writing
many refs with reflog entries in a single transaction.

Refactor the code to pre-compute the committer information. This results
in a small speedup when writing 100000 refs in a single transaction:

  Benchmark 1: update-ref: create many refs (HEAD~)
    Time (mean ± σ):      2.895 s ±  0.020 s    [User: 1.516 s, System: 1.374 s]
    Range (min … max):    2.868 s …  2.983 s    100 runs

  Benchmark 2: update-ref: create many refs (HEAD)
    Time (mean ± σ):      2.845 s ±  0.017 s    [User: 1.461 s, System: 1.379 s]
    Range (min … max):    2.803 s …  2.913 s    100 runs

  Summary
    update-ref: create many refs (HEAD) ran
      1.02 ± 0.01 times faster than update-ref: create many refs (HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
485c63cf5c reftable: remove name checks
In the preceding commit we have disabled name checks in the "reftable"
backend. These checks were responsible for verifying multiple things
when writing records to the reftable stack:

  - Detecting file/directory conflicts. Starting with the preceding
    commits this is now handled by the reftable backend itself via
    `refs_verify_refname_available()`.

  - Validating refnames. This is handled by `check_refname_format()` in
    the generic ref transacton layer.

The code in the reftable library is thus not used anymore and likely to
bitrot over time. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
4af31dc84a refs/reftable: skip duplicate name checks
All the callback functions which write refs in the reftable backend
perform D/F conflict checks via `refs_verify_refname_available()`. But
in reality we perform these D/F conflict checks a second time in the
reftable library via `stack_check_addition()`.

Interestingly, the code in the reftable library is inferior compared to
the generic function:

  - It is slower than `refs_verify_refname_available()`, even though
    this can probably be optimized.

  - It does not provide a proper error message to the caller, and thus
    all the user would see is a generic "file/directory conflict"
    message.

Disable the D/F conflict checks in the reftable library by setting the
`skip_name_check` write option. This results in a non-negligible speedup
when writing many refs. The following benchmark writes 100k refs in a
single transaction:

  Benchmark 1: update-ref: create many refs (HEAD~)
    Time (mean ± σ):      3.241 s ±  0.040 s    [User: 1.854 s, System: 1.381 s]
    Range (min … max):    3.185 s …  3.454 s    100 runs

  Benchmark 2: update-ref: create many refs (HEAD)
    Time (mean ± σ):      2.878 s ±  0.024 s    [User: 1.506 s, System: 1.367 s]
    Range (min … max):    2.838 s …  2.960 s    100 runs

  Summary
    update-ref: create many refs (HEAD~) ran
      1.13 ± 0.02 times faster than update-ref: create many refs (HEAD)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 16:59:02 -07:00
455d61b6d2 refs/reftable: perform explicit D/F check when writing symrefs
We already perform explicit D/F checks in all reftable callbacks which
write refs, except when writing symrefs. For one this leads to an error
message which isn't perfectly actionable because we only tell the user
that there was a D/F conflict, but not which refs conflicted with each
other. But second, once all ref updating callbacks explicitly check for
D/F conflicts, we can disable the D/F checks in the reftable library
itself and thus avoid some duplicated efforts.

Refactor the code that writes symref tables to explicitly call into
`refs_verify_refname_available()` when writing symrefs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 16:59:01 -07:00
f57cc987a9 refs/reftable: fix D/F conflict error message on ref copy
The `write_copy_table()` function is shared between the reftable
implementations for renaming and copying refs. The only difference
between those two cases is that the rename will also delete the old
reference, whereas copying won't.

This has resulted in a bug though where we don't properly verify refname
availability. When calling `refs_verify_refname_available()`, we always
add the old ref name to the list of refs to be skipped when computing
availability, which indicates that the name would be available even if
it already exists at the current point in time. This is only the right
thing to do for renames though, not for copies.

The consequence of this bug is quite harmless because the reftable
backend has its own checks for D/F conflicts further down in the call
stack, and thus we refuse the update regardless of the bug. But all the
user gets in this case is an uninformative message that copying the ref
has failed, without any further details.

Fix the bug and only add the old name to the skip-list in case we rename
the ref. Consequently, this error case will now be handled by
`refs_verify_refname_available()`, which knows to provide a proper error
message.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 16:59:01 -07:00
227b8fd902 Makefile(s): do not enforce "all indents must be done with tab"
Our top-level Makefile follows our generic whitespace rule
established by the top-level .gitattributes file that does not
enforce indent-with-non-tab rule by default, but git-gui is set up
to enforce indent-with-non-tab by default.  With the upcoming change
to GNU make, we no longer can reject (and worse, "fix") a patch that
adds whitespace indented lines to the Makefile, so loosen the rule
there for git-gui/Makefile, too.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 16:36:05 -07:00
728b9ac0c3 Makefile(s): avoid recipe prefix in conditional statements
In GNU Make commit 07fcee35 ([SV 64815] Recipe lines cannot contain
conditional statements, 2023-05-22) and following, conditional
statements may no longer be preceded by a tab character (which Make
refers to as the recipe prefix).

There are a handful of spots in our various Makefile(s) which will break
in a future release of Make containing 07fcee35. For instance, trying to
compile the pre-image of this patch with the tip of make.git results in
the following:

    $ make -v | head -1 && make
    GNU Make 4.4.90
    config.mak.uname:842: *** missing 'endif'.  Stop.

The kernel addressed this issue in 82175d1f9430 (kbuild: Replace tabs
with spaces when followed by conditionals, 2024-01-28). Address the
issues in Git's tree by applying the same strategy.

When a conditional word (ifeq, ifneq, ifdef, etc.) is preceded by one or
more tab characters, replace each tab character with 8 space characters
with the following:

    find . -type f -not -path './.git/*' -name Makefile -or -name '*.mak' |
      xargs perl -i -pe '
        s/(\t+)(ifn?eq|ifn?def|else|endif)/" " x (length($1) * 8) . $2/ge unless /\\$/
      '

The "unless /\\$/" removes any false-positives (like "\telse \"
appearing within a shell script as part of a recipe).

After doing so, Git compiles on newer versions of Make:

    $ make -v | head -1 && make
    GNU Make 4.4.90
    GIT_VERSION = 2.44.0.414.gfac1dc44ca9
    [...]

    $ echo $?
    0

Reported-by: Dario Gjorgjevski <dario.gjorgjevski@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 14:42:32 -07:00
0e0fefb29f config: do not leak excludes_file
The excludes_file variable is marked "const char *", but all the
assignments to it are made with a piece of memory allocated just
for it, and the variable is responsible for owning it.

When "core.excludesfile" is read, the code just lost the previous
value, leaking memory.  Plug it.

The real problem is that the variable is mistyped; our convention
is to never make a variable that owns the piece of memory pointed
by it as "const".  Fixing that would reduce the chance of this kind
of bug happening, and also would make it unnecessary to cast the
constness away while free()ing it, but that would be a much larger
follow-up effort.

Reported-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 13:20:24 -07:00
a949ebd342 reftable/stack: use geometric table compaction
To reduce the number of on-disk reftables, compaction is performed.
Contiguous tables with the same binary log value of size are grouped
into segments. The segment that has both the lowest binary log value and
contains more than one table is set as the starting point when
identifying the compaction segment.

Since segments containing a single table are not initially considered
for compaction, if the table appended to the list does not match the
previous table log value, no compaction occurs for the new table. It is
therefore possible for unbounded growth of the table list. This can be
demonstrated by repeating the following sequence:

git branch -f foo
git branch -d foo

Each operation results in a new table being written with no compaction
occurring until a separate operation produces a table matching the
previous table log value.

Instead, to avoid unbounded growth of the table list, the compaction
strategy is updated to ensure tables follow a geometric sequence after
each operation by individually evaluating each table in reverse index
order. This strategy results in a much simpler and more robust algorithm
compared to the previous one while also maintaining a minimal ordered
set of tables on-disk.

When creating 10 thousand references, the new strategy has no
performance impact:

Benchmark 1: update-ref: create refs sequentially (revision = HEAD~)
  Time (mean ± σ):     26.516 s ±  0.047 s    [User: 17.864 s, System: 8.491 s]
  Range (min … max):   26.447 s … 26.569 s    10 runs

Benchmark 2: update-ref: create refs sequentially (revision = HEAD)
  Time (mean ± σ):     26.417 s ±  0.028 s    [User: 17.738 s, System: 8.500 s]
  Range (min … max):   26.366 s … 26.444 s    10 runs

Summary
  update-ref: create refs sequentially (revision = HEAD) ran
    1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~)

Some tests in `t0610-reftable-basics.sh` assert the on-disk state of
tables and are therefore updated to specify the correct new table count.
Since compaction is more aggressive in ensuring tables maintain a
geometric sequence, the expected table count is reduced in these tests.
In `reftable/stack_test.c` tests related to `sizes_to_segments()` are
removed because the function is no longer needed. Also, the
`test_suggest_compaction_segment()` test is updated to better showcase
and reflect the new geometric compaction behavior.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 12:11:10 -07:00
7c8eb5928f reftable/stack: add env to disable autocompaction
In future tests it will be neccesary to create repositories with a set
number of tables. To make this easier, introduce the
`GIT_TEST_REFTABLE_AUTOCOMPACTION` environment variable that, when set
to false, disables autocompaction of reftables.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 12:11:10 -07:00
bc91330cec reftable/stack: expose option to disable auto-compaction
The reftable stack already has a variable to configure whether or not to
run auto-compaction, but it is inaccessible to users of the library.
There exist use cases where a caller may want to have more control over
auto-compaction.

Move the `disable_auto_compact` option into `reftable_write_options` to
allow external callers to disable auto-compaction. This will be used in
a subsequent commit.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 12:11:10 -07:00
836b221391 t1016: local VAR="VAL" fix
The series was based on maint and fixes all the tests that exist
there, but we have acquired a few more.

I suspect that the values assigned in many of these places are $IFS
safe, and this is primarily to squelch the linter than adding a
necessary workaround for buggy dash.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:50:11 -07:00
26ba7477d9 t0610: local VAR="VAL" fix
The series was based on maint and fixes all the tests that exist
there, but we have acquired a few more.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:50:11 -07:00
8bfe486191 t: teach lint that RHS of 'local VAR=VAL' needs to be quoted
Teach t/check-non-portable-shell.pl that right hand side of the
assignment done with "local VAR=VAL" need to be quoted.  We
deliberately target only VAL that begins with $ so that we can catch

 - $variable_reference and positional parameter reference like $4
 - $(command substitution)
 - ${variable_reference-with_magic}

while excluding

 - $'\n' that is a bash-ism freely usable in t990[23]
 - $(( arithmetic )) whose result should be $IFS safe.
 - $? that also is $IFS safe

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:50:11 -07:00
e97f4a6d94 t: local VAR="VAL" (quote ${magic-reference})
Future-proof test scripts that do

	local VAR=VAL

without quoting VAL (which is OK in POSIX but broken in some shells)
that is ${magic-"reference to a parameter"}.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:50:11 -07:00
7f9f230b7f t: local VAR="VAL" (quote command substitution)
Future-proof test scripts that do

	local VAR=VAL

without quoting VAL (which is OK in POSIX but broken in some shells)
that is a $(command substitution).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:50:10 -07:00
341aad8d41 t: local VAR="VAL" (quote positional parameters)
Future-proof test scripts that do

	local VAR=VAL

without quoting VAL (which is OK in POSIX but broken in some shells)
that is a positional parameter, e.g. $4.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:50:10 -07:00
be34b51049 CodingGuidelines: quote assigned value in 'local var=$val'
Dash bug https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097
lets the shell erroneously perform field splitting on the expansion
of a command substitution during declaration of a local or an extern
variable.

The explanation was stolen from ebee5580 (parallel-checkout: avoid
dash local bug in tests, 2021-06-06).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:50:05 -07:00
7e3a9c23d6 CodingGuidelines: describe "export VAR=VAL" rule
https://lore.kernel.org/git/201307081121.22769.tboegi@web.de/
resulted in 9968ffff (test-lint: detect 'export FOO=bar',
2013-07-08) to add a rule to t/check-non-portable-shell.pl script to
reject

	export VAR=VAL

and suggest us to instead write it as two statements, i.e.,

	VAR=VAL
	export VAR

This however was not spelled out in the CodingGuidelines document.

We may want to re-evaluate the rule since it is from ages ago, but
for now, let's make the written rule and what the automation
enforces consistent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 22:48:12 -07:00
ec0e3075d2 userdiff: better method/property matching for C#
- Support multi-line methods by not requiring closing parenthesis.
- Support multiple generics (comma was missing before).
- Add missing `foreach`, `lock` and  `fixed` keywords to skip over.
- Remove `instanceof` keyword, which isn't C#.
- Also detect non-method keywords not positioned at the start of a line.
- Added tests; none existed before.

The overall strategy is to focus more on what isn't expected for
method/property definitions, instead of what is, but is fully optional.

Signed-off-by: Steven Jeuris <steven.jeuris@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 15:21:43 -07:00
9720d23e8c date: make DATE_MODE thread-safe
date_mode_from_type() modifies a static variable and returns a pointer
to it.  This is not thread-safe.  Most callers of date_mode_from_type()
use it via the macro DATE_MODE and pass its result on to functions like
show_date(), which take a const pointer and don't modify the struct.

Avoid the static storage by putting the variable on the stack and
returning the whole struct date_mode.  Change functions that take a
constant pointer to expect the whole struct instead.

Reduce the cost of passing struct date_mode around on 64-bit systems
by reordering its members to close the hole between the 32-bit wide
.type and the 64-bit aligned .strftime_fmt as well as the alignment
hole at the end.  sizeof reports 24 before and 16 with this change
on x64.  Keep .type at the top to still allow initialization without
designator -- though that's only done in a single location, in
builtin/blame.c.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 15:21:14 -07:00
c63adab961 usage: report vsnprintf(3) failure
vreportf(), which is used e.g. by die() and warning() by default, calls
vsnprintf(3) to format the message to report.  If that call fails, it
only prints the prefix, e.g. "fatal: " or "warning: ".  This at least
informs users that they were supposed to get a message and reveals its
severity, but leaves them wondering what it may have been about.

Here's an example where vreportf() tries to print a message with a 2GB
string, which is too much for vsnprintf(3):

  $ perl -le 'print "create refs/heads/", "a"x2**31' | git update-ref --stdin
  fatal:

At least report the formatting error along with the offending message
(unformatted) to indicate why that message is empty.  Use fprintf(3)
instead of error() to get the message out directly and avoid recursing
back into vreportf().

With this patch we get:

  $ perl -le 'print "create refs/heads/", "a"x2**31' | git update-ref --stdin
  error: unable to format message: invalid ref format: %s
  fatal:

... which allows users to at least get an idea of what went wrong.

Suggested-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 15:16:27 -07:00
92a209bf24 remote-curl: add Transfer-Encoding header only for older curl
As of curl 7.66.0, we don't need to manually specify a "chunked"
Transfer-Encoding header. Instead, modern curl deduces the need for it
in a POST that has a POSTFIELDSIZE of -1 and uses READFUNCTION rather
than POSTFIELDS.

That version is recent enough that we can't just drop the header; we
need to do so conditionally. Since it's only a single line, it seems
like the simplest thing would just be to keep setting it unconditionally
(after all, the #ifdefs are much longer than the actual code). But
there's another wrinkle: HTTP/2.

Curl may choose to use HTTP/2 under the hood if the server supports it.
And in that protocol, we do not use the chunked encoding for streaming
at all. Most versions of curl handle this just fine by recognizing and
removing the header. But there's a regression in curl 8.7.0 and 8.7.1
where it doesn't, and large requests over HTTP/2 are broken (which t5559
notices). That regression has since been fixed upstream, but not yet
released.

Make the setting of this header conditional, which will let Git work
even with those buggy curl versions. And as a bonus, it serves as a
reminder that we can eventually clean up the code as we bump the
supported curl versions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 14:45:19 -07:00
19981daefd The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 10:49:49 -07:00
dce1e0b6da Merge branch 'jk/core-comment-string'
core.commentChar used to be limited to a single byte, but has been
updated to allow an arbitrary multi-byte sequence.

* jk/core-comment-string:
  config: add core.commentString
  config: allow multi-byte core.commentChar
  environment: drop comment_line_char compatibility macro
  wt-status: drop custom comment-char stringification
  sequencer: handle multi-byte comment characters when writing todo list
  find multi-byte comment chars in unterminated buffers
  find multi-byte comment chars in NUL-terminated strings
  prefer comment_line_str to comment_line_char for printing
  strbuf: accept a comment string for strbuf_add_commented_lines()
  strbuf: accept a comment string for strbuf_commented_addf()
  strbuf: accept a comment string for strbuf_stripspace()
  environment: store comment_line_char as a string
  strbuf: avoid shadowing global comment_line_char name
  commit: refactor base-case of adjust_comment_line_char()
  strbuf: avoid static variables in strbuf_add_commented_lines()
  strbuf: simplify comment-handling in add_lines() helper
  config: forbid newline as core.commentChar
2024-04-05 10:49:49 -07:00
3256584c36 Merge branch 'rs/config-comment'
"git config" learned "--comment=<message>" option to leave a
comment immediately after the "variable = value" on the same line
in the configuration file.

* rs/config-comment:
  config: allow tweaking whitespace between value and comment
  config: fix --comment formatting
  config: add --comment option to add a comment
2024-04-05 10:49:49 -07:00
7424fb7797 Merge branch 'ps/pack-refs-auto' into jt/reftable-geometric-compaction
* ps/pack-refs-auto:
  builtin/gc: pack refs when using `git maintenance run --auto`
  builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs
  t6500: extract objects with "17" prefix
  builtin/gc: move `struct maintenance_run_opts`
  builtin/pack-refs: introduce new "--auto" flag
  builtin/pack-refs: release allocated memory
  refs/reftable: expose auto compaction via new flag
  refs: remove `PACK_REFS_ALL` flag
  refs: move `struct pack_refs_opts` to where it's used
  t/helper: drop pack-refs wrapper
  refs/reftable: print errors on compaction failure
  reftable/stack: gracefully handle failed auto-compaction due to locks
  reftable/stack: use error codes when locking fails during compaction
  reftable/error: discern locked/outdated errors
  reftable/stack: fix error handling in `reftable_stack_init_addition()`
2024-04-05 10:34:23 -07:00
2b1f456adf apply: don't leak fd on fdopen() error
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 10:09:26 -07:00
a816ccd642 fetch: return when parsing submodule.recurse
When parsing config keys, the normal pattern is to return 0 after
completing the logic for a specific config key, since no other key will
match. One instance, for "submodule.recurse", was missing this case in
builtin/fetch.c.

This is a very minor change, and will have minimal impact to
performance. This particular block was edited recently in 56e8bb4fb4
(fetch: use `fetch_config` to store "fetch.recurseSubmodules" value,
2023-05-17), which led to some hesitation that perhaps this omission was
on purpose.

However, no later cases within git_fetch_config() will match the key if
equal to "submodule.recurse" and neither will any key matches within the
catch-all git_default_config().

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 09:55:21 -07:00
708f7e0590 path: remove mksnpath()
Remove the function mksnpath(), which has become unused.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 09:49:38 -07:00
9126cb3186 apply: avoid fixed-size buffer in create_one_file()
PATH_MAX is not always a hard limit and 'path' in create_one_file()
could be longer -- it's taken from the patch file and allocated
dynamically.  Allocate the name of the temporary file on the heap as
well instead of using a fixed-size buffer to avoid that arbitrary limit.

Resist the temptation of using the more convenient mkpath() to avoid
introducing a dependency on a static variable deep inside the apply
machinery.

Take care to work around (arguably buggy) implementations of free(3)
that modify errno, by calling it only after using the errno value.

Suggested-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 09:49:36 -07:00
7de13cfef3 builtin/add: error out when passing untracked path with -u
When passing untracked path with -u option, it silently succeeds.
There is no error message and the exit code is zero. This is
inconsistent with other instances of git commands where the expected
argument is a known path. In those other instances, we error out when
the path is not known.

Fix this by passing a character array to add_files_to_cache() to
collect the pathspec matching information and report the error if a
pathspec does not match any cache entry. Also add a testcase to cover
this scenario.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 14:55:22 -07:00
ac5946e624 builtin/commit: error out when passing untracked path with -i
When we provide a pathspec which does not match any tracked path
alongside --include, we do not error like without --include. If there
is something staged, it will commit the staged changes and ignore the
pathspec which does not match any tracked path. And if nothing is
staged, it will print the status. Exit code is 0 in both cases (unlike
without --include). This is also described in the TODO comment before
the relevant testcase.

Fix this by passing a character array to add_files_to_cache() to
collect the pathspec matching information and error out if the given
path is untracked. Also, amend the testcase to check for the error
message and remove the TODO comment.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 14:55:22 -07:00
86829f3f3e revision: optionally record matches with pathspec elements
Unlike "git add" and other end-user facing commands, where it is
diagnosed as an error to give a pathspec with an element that does
not match any path, the diff machinery does not care if some
elements of the pathspec do not match.  Given that the diff
machinery is heavily used in pathspec-limited "git log" machinery,
and it is common for a path to come and go while traversing the
project history, this is usually a good thing.

However, in some cases we would want to know if all the pathspec
elements matched.  For example, "git add -u <pathspec>" internally
uses the machinery used by "git diff-files" to decide contents from
what paths to add to the index, and as an end-user facing command,
"git add -u" would want to report an unmatched pathspec element.

Add a new .ps_matched member next to the .prune_data member in
"struct rev_info" so that we can optionally keep track of the use of
.prune_data pathspec elements that can be inspected by the caller.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 14:55:21 -07:00
2406bf5fc5 Win32: detect unix socket support at runtime
Windows 10 build 17063 introduced support for unix sockets to Windows.
bb390b1 (git-compat-util: include declaration for unix sockets in
windows, 2021-09-14) introduced a way to build git with unix socket
support on Windows, but you still had to decide at build time which
Windows version the compiled executable was supposed to run on.

We can detect at runtime wether the operating system supports unix
sockets and act accordingly for all supported Windows versions.

This fixes https://github.com/git-for-windows/git/issues/3892

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 14:54:28 -07:00
7774cfed62 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 10:56:20 -07:00
17381ab62a Merge branch 'bl/cherry-pick-empty'
Allow git-cherry-pick(1) to automatically drop redundant commits via
a new `--empty` option, similar to the `--empty` options for
git-rebase(1) and git-am(1). Includes a soft deprecation of
`--keep-redundant-commits` as well as some related docs changes and
sequencer code cleanup.

* bl/cherry-pick-empty:
  cherry-pick: add `--empty` for more robust redundant commit handling
  cherry-pick: enforce `--keep-redundant-commits` incompatibility
  sequencer: do not require `allow_empty` for redundant commit options
  sequencer: handle unborn branch with `--allow-empty`
  rebase: update `--empty=ask` to `--empty=stop`
  docs: clean up `--empty` formatting in git-rebase(1) and git-am(1)
  docs: address inaccurate `--empty` default with `--exec`
2024-04-03 10:56:20 -07:00
d988e80bd3 Merge branch 'bl/pretty-shorthand-config-fix'
The "--pretty=<shortHand>" option of the commands in the "git log"
family, defined as "[pretty] shortHand = <expansion>" should have
been looked up case insensitively, but was not, which has been
corrected.

* bl/pretty-shorthand-config-fix:
  pretty: find pretty formats case-insensitively
  pretty: update tests to use `test_config`
2024-04-03 10:56:20 -07:00
4cc302e886 Merge branch 'rs/strbuf-expand-bad-format'
Code clean-up.

* rs/strbuf-expand-bad-format:
  cat-file: use strbuf_expand_bad_format()
  factor out strbuf_expand_bad_format()
2024-04-03 10:56:20 -07:00
f046355ec3 Merge branch 'rs/midx-use-strvec-pushf'
Code clean-up.

* rs/midx-use-strvec-pushf:
  midx: use strvec_pushf() for pack-objects base name
2024-04-03 10:56:20 -07:00
188e94250a Merge branch 'pb/test-scripts-are-build-targets'
The t/README file now gives a hint on running individual tests in
the "t/" directory with "make t<num>-*.sh t<num>-*.sh".

* pb/test-scripts-are-build-targets:
  t/README: mention test files are make targets
2024-04-03 10:56:19 -07:00
e4193dcf12 Merge branch 'ds/grep-doc-updates'
Documentation updates.

* ds/grep-doc-updates:
  grep docs: describe --no-index further and improve formatting a bit
  grep docs: describe --recurse-submodules further and improve formatting a bit
2024-04-03 10:56:19 -07:00
e76218cad3 Merge branch 'az/grep-group-error-message-update'
Error message clarification.

* az/grep-group-error-message-update:
  grep: improve errors for unmatched ( and )
2024-04-03 10:56:19 -07:00
eda72ddc18 Merge branch 'jc/release-notes-entry-experiment'
Introduce an experimental protocol for contributors to propose the
topic description to be used in the "What's cooking" report, the
merge commit message for the topic, and in the release notes and
document it in the SubmittingPatches document.

* jc/release-notes-entry-experiment:
  SubmittingPatches: release-notes entry experiment
2024-04-03 10:56:19 -07:00
e139bb1006 Merge branch 'jk/remote-helper-object-format-option-fix'
The implementation and documentation of "object-format" option
exchange between the Git itself and its remote helpers did not
quite match, which has been corrected.

* jk/remote-helper-object-format-option-fix:
  transport-helper: send "true" value for object-format option
  transport-helper: drop "object-format <algo>" option
  transport-helper: use write helpers more consistently
2024-04-03 10:56:18 -07:00
b494b1ce39 t/t7700-repack.sh: fix test breakages with GIT_TEST_MULTI_PACK_INDEX=1
There are a handful of related test breakages which are found when
running t/t7700-repack.sh with GIT_TEST_MULTI_PACK_INDEX set to "1" in
your environment.

Both test failures are the result of something like:

    git repack --write-midx --write-bitmap-index [...] &&

    test_path_is_file $midx &&
    test_path_is_file $midx-$(midx_checksum $objdir).bitmap

, where we repack instructing Git to write a new MIDX and corresponding
MIDX bitamp.

The error occurs when GIT_TEST_MULTI_PACK_INDEX=1 is found in the
enviornment. This causes Git to write out a second MIDX (after
processing the builtin's `--write-midx` argument) which is identical to
the first, but does not request a bitmap (since we did not set the
GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP variable in the environment).

Since c528e17966 (pack-bitmap: write multi-pack bitmaps, 2021-08-31),
the MIDX machinery will drop an existing MIDX bitmap when rewriting an
identical MIDX which does not itself request a corresponding bitmap,
which is similar to the way repack itself behaves in the pack-bitmap
case.

Correct these issues (which date back to [1] and [2], respectively) by
explicitly setting GIT_TEST_MULTI_PACK_INDEX to zero before running each
command.

In the future, we should consider removing GIT_TEST_MULTI_PACK_INDEX,
and in general clean up unused GIT_TEST_-variables. But that is a larger
effort, and this ensures that we can cleanly run:

    $ GIT_TEST_MULTI_PACK_INDEX=1 make test

in the meantime.

[1]: 324efc90d1 (builtin/repack.c: pass `--refs-snapshot` when writing
  bitmaps, 2021-10-01)

[2]: 197443e80a (repack: don't remove .keep packs with
  `--pack-kept-objects`, 2022-10-17).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 10:45:36 -07:00
d51d8cc368 reftable/block: avoid decoding keys when searching restart points
When searching over restart points in a block we decode the key of each
of the records, which results in a memory allocation. This is quite
pointless though given that records it restart points will never use
prefix compression and thus store their keys verbatim in the block.

Refactor the code so that we can avoid decoding the keys, which saves us
some allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
cd75790707 reftable/record: extract function to decode key lengths
We're about to refactor the binary search over restart points so that it
does not need to fully decode the record keys anymore. To do so we will
need to decode the record key lengths, which is non-trivial logic.

Extract the logic to decode these lengths from `refatble_decode_key()`
so that we can reuse it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
f9e88544f5 reftable/block: fix error handling when searching restart points
When doing the binary search over restart points in a block we need to
decode the record keys. This decoding step can result in an error when
the block is corrupted, which we indicate to the caller of the binary
search by setting `args.error = 1`. But the only caller that exists
mishandles this because it in fact performs the error check before
calling `binsearch()`.

Fix this bug by checking for errors at the right point in time.
Furthermore, refactor `binsearch()` so that it aborts the search in case
the callback function returns a negative value so that we don't
needlessly continue to search the block.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
77307a61d6 reftable/block: refactor binary search over restart points
When seeking a record in our block reader we perform a binary search
over the block's restart points so that we don't have to do a linear
scan over the whole block. The logic to do so is quite intricate though,
which makes it hard to understand.

Improve documentation and rename some of the functions and variables so
that the code becomes easier to understand overall. This refactoring
should not result in any change in behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
21767925b0 reftable/refname: refactor binary search over refnames
It is comparatively hard to understand how exactly the binary search
over refnames works given that the function and variable names are not
exactly easy to grasp. Rename them to make this more obvious. This
should not result in any change in behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
e8b808258e reftable/basics: improve binsearch() test
The `binsearch()` test is somewhat weird in that it doesn't explicitly
spell out its expectations. Instead it does so in a rather ad-hoc way
with some hard-to-understand computations.

Refactor the test to spell out the needle as well as expected index for
all testcases. This refactoring highlights that the `binsearch_func()`
is written somewhat weirdly to find the first integer smaller than the
needle, not smaller or equal to it. Adjust the function accordingly.

While at it, rename the callback function to better convey its meaning.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:49 -07:00
3e7b36d129 reftable/basics: fix return type of binsearch() to be size_t
The `binsearch()` function can be used to find the first element for
which a callback functions returns a truish value. But while the array
size is of type `size_t`, the function in fact returns an `int` that is
supposed to index into that array.

Fix the function signature to return a `size_t`. This conversion does
not change any semantics given that the function would only ever return
a value in the range `[0, sz]` anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:49 -07:00
543b2a1083 t-prio-queue: simplify using compound literals
Test names like "basic" are mentioned seven times in the code (ignoring
case): Twice when defining the input and result macros, thrice when
defining the test function, and twice again when calling it.  Reduce
that to a single time by using compound literals to pass the input and
result arrays via TEST_INPUT to test_prio_queue().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-02 13:41:32 -07:00
c28ee09503 INSTALL: bump libcurl version to 7.21.3
Our documentation claims we support curl versions back to 7.19.5. But we
can no longer compile with that version since adding an unconditional
use of CURLOPT_RESOLVE in 511cfd3bff (http: add custom hostname to IP
address resolutions, 2022-05-16). That feature wasn't added to libcurl
until 7.21.3.

We could add #ifdefs to make this work back to 7.19.5. But given that
nobody noticed the compilation failure in the intervening two years, it
makes more sense to bump the version in the documentation to 7.21.3
(which is itself over 13 years old).

We could perhaps go forward even more (which would let us drop some
cruft from git-curl-compat.h), but this should be an obviously safe
jump, and we can move forward later.

Note that user-visible syntax for CURLOPT_RESOLVE has grown new features
in subsequent curl versions. Our documentation mentions "+" and "-"
entries, which require more recent versions than 7.21.3. We could
perhaps clarify that in our docs, but it's probably not worth cluttering
them with restrictions of ancient curl versions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-02 13:27:20 -07:00
3242311742 http: reset POSTFIELDSIZE when clearing curl handle
In get_active_slot(), we return a CURL handle that may have been used
before (reusing them is good because it lets curl reuse the same
connection across many requests). We set a few curl options back to
defaults that may have been modified by previous requests.

We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
defaults to "-1"). This usually doesn't matter because most POSTs will
set both fields together anyway. But there is one exception: when
handling a large request in remote-curl's post_rpc(), we don't set
_either_, and instead set a READFUNCTION to stream data into libcurl.

This can interact weirdly with a stale POSTFIELDSIZE setting, because
curl will assume it should read only some set number of bytes from our
READFUNCTION. However, it has worked in practice because we also
manually set a "Transfer-Encoding: chunked" header, which libcurl uses
as a clue to set the POSTFIELDSIZE to -1 itself.

So everything works, but we're better off resetting the size manually
for a few reasons:

  - there was a regression in curl 8.7.0 where the chunked header
    detection didn't kick in, causing any large HTTP requests made by
    Git to fail. This has since been fixed (but not yet released). In
    the issue, curl folks recommended setting it explicitly to -1:

      https://github.com/curl/curl/issues/13229#issuecomment-2029826058

    and it indeed works around the regression. So even though it won't
    be strictly necessary after the fix there, this will help folks who
    end up using the affected libcurl versions.

  - it's consistent with what a new curl handle would look like. Since
    get_active_slot() may or may not return a used handle, this reduces
    the possibility of heisenbugs that only appear with certain request
    patterns.

Note that the recommendation in the curl issue is to actually drop the
manual Transfer-Encoding header. Modern libcurl will add the header
itself when streaming from a READFUNCTION. However, that code wasn't
added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
support back to 7.19.5, so those older versions still need the manual
header.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-02 13:27:18 -07:00
40c45f809f t2104: style fixes
We use tabs to indent, not two or four spaces.

These days, even the test fixture preparation should be done inside
test_expect_success block.

Address these two style violations in this test.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-02 11:46:47 -07:00
39bb692152 imap-send: use xsnprintf to format command
nfsnprintf() wraps vsnprintf(3) and reports attempts to use too small a
buffer using BUG(), just like xsnprintf().

It has an extra check that makes sure the buffer size (converted to int)
is positive.  vsnprintf(3) is supposed to handle a buffer size of zero
or bigger than INT_MAX just fine, so this extra comparison doesn't make
us any safer.  If a platform has a broken implementation, we'd need to
work around it in our compat code.

Call xsnprintf() instead to reduce code duplication and make the caller
slightly more readable by using this more common helper.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-02 10:29:34 -07:00
5b1967a33c githooks: use {old,new}-oid instead of {old,new}-value
Similar to the previous commit, rename {old,new}-value in the 'githooks'
documentation to {old,new}-oid. This improves clarity and also ensures
consistency within the document.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-02 10:20:28 -07:00
67e943c308 update-ref: use {old,new}-oid instead of {old,new}value
The `git-update-ref` command is used to modify references. The usage of
{old,new}value in the documentation refers to the OIDs. This is fine
since the command only works with regular references which hold OIDs.
But if the command is updated to support symrefs, we'd also be dealing
with {old,new}-refs.

To improve clarity around what exactly {old,new}value mean, let's rename
it to {old,new}-oid.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-02 10:20:28 -07:00
d5b35bba86 osxkeychain: store new attributes
d208bfdfef (credential: new attribute password_expiry_utc, 2023-02-18)
and a5c76569e7 (credential: new attribute oauth_refresh_token,
2023-04-21) introduced new credential attributes but support was missing
from git-credential-osxkeychain.

Support these attributes by appending the data to the password in the
keychain, separated by line breaks. Line breaks cannot appear in a git
credential password so it is an appropriate separator.

Fixes the remaining test failures with osxkeychain:

    18 - helper (osxkeychain) gets password_expiry_utc
    19 - helper (osxkeychain) overwrites when password_expiry_utc
    changes
    21 - helper (osxkeychain) gets oauth_refresh_token

Signed-off-by: Bo Anderson <mail@boanderson.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 15:38:20 -07:00
e3cef40db8 osxkeychain: erase matching passwords only
Other credential helpers support deleting credentials that match a
specified password. See 7144dee3ec (credential/libsecret: erase matching
creds only, 2023-07-26) and cb626f8e5c (credential/wincred: erase
matching creds only, 2023-07-26).

Support this in osxkeychain too by extracting, decrypting and comparing
the stored password before deleting.

Fixes the following test failure with osxkeychain:

    11 - helper (osxkeychain) does not erase a password distinct from
    input

Signed-off-by: Bo Anderson <mail@boanderson.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 15:38:20 -07:00
9032bcad82 osxkeychain: erase all matching credentials
Other credential managers erased all matching credentials, as indicated
by a test case that osxkeychain failed:

    15 - helper (osxkeychain) erases all matching credentials

Signed-off-by: Bo Anderson <mail@boanderson.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 15:38:20 -07:00
9abe31f5f1 osxkeychain: replace deprecated SecKeychain API
The SecKeychain API was deprecated in macOS 10.10, nearly 10 years ago.
The replacement SecItem API however is available as far back as macOS
10.6.

While supporting older macOS was perhaps prevously a concern,
git-credential-osxkeychain already requires a minimum of macOS 10.7
since 5747c8072b (contrib/credential: avoid fixed-size buffer in
osxkeychain, 2023-05-01) so using the newer API should not regress the
range of macOS versions supported.

Adapting to use the newer SecItem API also happens to fix two test
failures in osxkeychain:

    8 - helper (osxkeychain) overwrites on store
    9 - helper (osxkeychain) can forget host

The new API is compatible with credentials saved with the older API.

Signed-off-by: Bo Anderson <mail@boanderson.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 15:38:19 -07:00
b7d6f23a17 midx-write.c: use --stdin-packs when repacking
When constructing a new pack `git multi-pack-index repack` provides a
list of objects which is the union of objects in all MIDX'd packs which
were "included" in the repack.

Though correct, this typically yields a poorly structured pack, since
providing the objects list over stdin does not give pack-objects a
chance to discover the namehash values for each object, leading to
sub-optimal delta selection.

We can use `--stdin-packs` instead, which has a couple of benefits:

  - it does a supplemental walk over objects in the supplied list of
    packs to discover their namehash, leading to higher-quality delta
    selection

  - it requires us to list far less data over stdin; instead of listing
    each object in the resulting pack, we need only list the
    constituent packs from which those objects were selected in the MIDX

Of course, this comes at a slight cost: though we save time on listing
packs versus objects over stdin[^1] (around ~650 milliseconds), we add a
non-trivial amount of time walking over the given objects in order to
find better deltas.

In general, this is likely to more closely match the user's expectations
(i.e. that packs generated via `git multi-pack-index repack` are written
with high-quality deltas). But if not, we can always introduce a new
option in pack-objects to disable the supplemental object walk, which
would yield a pure CPU-time savings, at the cost of the on-disk size of
the resulting pack.

[^1]: In a patched version of Git that doesn't perform the supplemental
  object walk in `pack-objects --stdin-packs`, we save around ~650ms
  (from 5.968 to 5.325 seconds) when running `git multi-pack-index
  repack --batch-size=0` on git.git with all objects packed, and all
  packs in a MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 14:18:17 -07:00
440e470edb midx-write.c: check count of packs to repack after grouping
In both fill_included_packs_all() and fill_included_packs_batch(), we
accumulate a list of packs whose contents we want to repack together,
and then use that information to feed a list of objects as input to
pack-objects.

In both cases, the `fill_included_packs_` functions keep track of how
many packs they want to repack together, and only execute pack-objects
if there are at least two packs that need repacking.

Having both of these functions keep track of this information themselves
is not strictly necessary, since they also log which packs to repack via
the `include_pack` array, so we can simply count the non-zero entries in
that array after either function is done executing, reducing the overall
amount of code necessary.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 14:18:17 -07:00
e94be606f3 midx-write.c: factor out common want_included_pack() routine
When performing a 'git multi-pack-index repack', the MIDX machinery
tries to aggregate MIDX'd packs together either to (a) fill the given
`--batch-size` argument, or (b) combine all packs together.

In either case (using the `midx-write.c::fill_included_packs_batch()` or
`midx-write.c::fill_included_packs_all()` function, respectively), we
evaluate whether or not we want to repack each MIDX'd pack, according to
whether or it is loadable, kept, cruft, or non-empty.

Between the two `fill_included_packs_` callers, they both care about the
same conditions, except for `fill_included_packs_batch()` which also
cares that the pack is non-empty.

We could extract two functions (say, `want_included_pack()` and a
`_nonempty()` variant), but this is not necessary. For the case in
`fill_included_packs_all()` which does not check the pack size, we add
all of the pack's objects assuming that the pack meets all other
criteria. But if the pack is empty in the first place, we add all of its
zero objects, so whether or not we "accept" or "reject" it in the first
place is irrelevant.

This change improves the readability in both `fill_included_packs_`
functions.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 14:18:17 -07:00
748b88a021 midx-write: move writing-related functions from midx.c
Introduce a new midx-write.c source file, which holds all of the
functionality from the MIDX sub-system related to writing new MIDX files.

Similar to the relationship between "pack-bitmap.c" and
"pack-bitmap-write.c", this source file will hold code that is specific
to writing MIDX files as opposed to reading them (the latter will remain
in midx.c).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 14:18:16 -07:00
34f00e8643 Merge branch 'rs/midx-use-strvec-pushf' into tb/midx-write
* rs/midx-use-strvec-pushf:
  midx: use strvec_pushf() for pack-objects base name
2024-04-01 14:18:05 -07:00
c2cbfbd2e2 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 13:21:36 -07:00
cebe702a2a Merge branch 'ps/clone-with-includeif-onbranch'
An additional test to demonstrate that clone would not choke on a
global configuration file that uses includeIf.onbranch:*.path.

* ps/clone-with-includeif-onbranch:
  t5601: exercise clones with "includeIf.*.onbranch"
2024-04-01 13:21:36 -07:00
f949703f4b Merge branch 'jk/rebase-apply-leakfix'
Leakfix.

* jk/rebase-apply-leakfix:
  rebase: use child_process_clear() to clean
2024-04-01 13:21:35 -07:00
f0c570e20b Merge branch 'ps/t7800-variable-interpolation-fix'
Fix the way recently added tests interpolate variables defined
outside them, and document the best practice to help future
developers.

* ps/t7800-variable-interpolation-fix:
  t/README: document how to loop around test cases
  t7800: use single quotes for test bodies
  t7800: improve test descriptions with empty arguments
2024-04-01 13:21:35 -07:00
6938b355c0 Merge branch 'ps/reftable-unit-test-nfs-workaround'
A unit test for reftable code tried to enumerate all files in a
directory after reftable operations and expected to see nothing but
the files it wanted to leave there, but was fooled by .nfs* cruft
files left, which has been corrected.

* ps/reftable-unit-test-nfs-workaround:
  reftable: fix tests being broken by NFS' delete-after-close semantics
2024-04-01 13:21:35 -07:00
50b52cafae Merge branch 'jk/doc-remote-helpers-markup-fix'
Documentation mark-up fix.

* jk/doc-remote-helpers-markup-fix:
  doc/gitremote-helpers: fix more missing single-quotes
2024-04-01 13:21:34 -07:00
ac16f55697 Merge branch 'pb/advice-merge-conflict'
Hints that suggest what to do after resolving conflicts can now be
squelched by disabling advice.mergeConflict.

Acked-by: Phillip Wood <phillip.wood123@gmail.com>
cf. <e040c631-42d9-4501-a7b8-046f8dac6309@gmail.com>

* pb/advice-merge-conflict:
  builtin/am: allow disabling conflict advice
  sequencer: allow disabling conflict advice
2024-04-01 13:21:34 -07:00
521df686e5 Merge branch 'ds/config-internal-whitespace-fix'
"git config" corrupted literal HT characters written in the
configuration file as part of a value, which has been corrected.

* ds/config-internal-whitespace-fix:
  config.txt: describe handling of whitespace further
  t1300: add more tests for whitespace and inline comments
  config: really keep value-internal whitespace verbatim
  config: minor addition of whitespace
2024-04-01 13:21:34 -07:00
a031815a7d Merge branch 'jk/pretty-subject-cleanup'
Code clean-up in the "git log" machinery that implements custom log
message formatting.

* jk/pretty-subject-cleanup:
  format-patch: fix leak of empty header string
  format-patch: simplify after-subject MIME header handling
  format-patch: return an allocated string from log_write_email_headers()
  log: do not set up extra_headers for non-email formats
  pretty: drop print_email_subject flag
  pretty: split oneline and email subject printing
  shortlog: stop setting pp.print_email_subject
2024-04-01 13:21:34 -07:00
ccdc7d98bb Merge branch 'pw/checkout-conflict-errorfix'
"git checkout --conflict=bad" reported a bad conflictStyle as if it
were given to a configuration variable; it has been corrected to
report that the command line option is bad.

* pw/checkout-conflict-errorfix:
  checkout: fix interaction between --conflict and --merge
  checkout: cleanup --conflict=<style> parsing
  merge options: add a conflict style member
  merge-ll: introduce LL_MERGE_OPTIONS_INIT
  xdiff-interface: refactor parsing of merge.conflictstyle
2024-04-01 13:21:33 -07:00
d7805bc743 completion: protect prompt against unset SHOWUPSTREAM in nounset mode
As it stands, the only call site of `__git_ps1_show_upstream` checks
that the `GIT_PS1_SHOWUPSTREAM` variable is set, so this is effectively
a no-op. However, that might change, and chances of noticing the
unprotected use might not be that high when it does.

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 12:38:23 -07:00
758b4e1373 completion: fix prompt with unset SHOWCONFLICTSTATE in nounset mode
`GIT_PS1_SHOWCONFLICTSTATE` is a user variable that might not be set,
causing errors when the shell is in `nounset` mode.

Take into account on access by falling back to an empty string.

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01 08:31:54 -07:00
8b68b48d5c config: fix some small capitalization issues, as spotted
Fix some small capitalization issues, as spotted while going through the
documentation.  In general, a semicolon doesn't start a new sentence, and
"this" has no meaning of a proper noun in this context.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-31 16:01:52 -07:00
ffeaf2f76a mem-pool: use st_add() in mem_pool_strvfmt()
If len is INT_MAX in mem_pool_strvfmt(), then len + 1 overflows.
Casting it to size_t would prevent that.  Use st_add() to go a step
further and make the addition *obviously* safe.  The compiler can
optimize the check away on platforms where SIZE_MAX > INT_MAX, i.e.
basically everywhere.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-31 16:00:36 -07:00
73cb87773b test-lib: fix non-functioning GIT_TEST_MAINT_SCHEDULER fallback
When environment variable GIT_TEST_MAINT_SCHEDULER is set, `git
maintenance` invokes the command specified as the variable's value
rather than invoking the actual underlying platform-specific scheduler
management command. By setting GIT_TEST_MAINT_SCHEDULER to some suitable
value, test authors can therefore validate behavior of "destructive"
`git maintenance` commands without having to worry about clobbering the
user's own local scheduler configuration.

In order to protect an absent-minded test author from forgetting to set
GIT_TEST_MAINT_SCHEDULER in the local test script (and thus clobbering
his or her own scheduler configuration), t/test-lib.sh assigns an
"immediately error-out" value to GIT_TEST_MAINT_SCHEDULER by default
which should ensure that the problem will be caught and reported before
any damage can be done to the configuration of the person running the
tests.

Unfortunately, however, t/test-lib.sh neglects to export
GIT_TEST_MAINT_SCHEDULER, which renders the default "error-out"
assignment worthless. Fix this by exporting the variable as
originally intended.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-of-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-31 15:09:44 -07:00
6412d01527 add: use advise_if_enabled for ADVICE_ADD_EMBEDDED_REPO
By following a similar reasoning as in previous commits, there are no
reason why we should not use the advise_if_enabled() API to display the
ADVICE_ADD_EMBEDDED_REPO advice.

This advice was introduced in 532139940c (add: warn when adding an
embedded repository, 2017-06-14).  Some tests were included in the
commit, but none is testing this advice.  Which, note, we only want to
display once per run.

So, use the advise_if_enabled() machinery to show the
ADVICE_ADD_EMBEDDED_REPO advice and include a test to notice any
possible breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-30 17:55:01 -07:00
1028db00f7 add: use advise_if_enabled for ADVICE_ADD_EMPTY_PATHSPEC
Since 93b0d86aaf (git-add: error out when given no arguments.,
2006-12-20) we display a message when no arguments are given to "git
add".

Part of that message was converted to advice in bf66db37f1 (add: use
advise function to display hints, 2020-01-07).

Following the same line of reasoning as in the previous commit, it is
sensible to use advise_if_enabled() here.

Therefore, use advise_if_enabled() in builtin/add.c to show the
ADVICE_ADD_EMPTY_PATHSPEC advice, and don't bother checking there the
visibility of the advice or displaying the instruction on how to disable
it.

Also add a test for these messages, in order to detect a possible
change in them.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-30 17:55:01 -07:00
9da49befd0 add: use advise_if_enabled for ADVICE_ADD_IGNORED_FILE
Since b3b18d1621 (advice: revamp advise API, 2020-03-02), we can use
advise_if_enabled() to display an advice.  This API encapsulates three
actions:
	1.- checking the visibility of the advice

	2.- displaying the advice when appropriate

	3.- displaying instructions on how to disable the advice, when
	    appropriate

The code we have in builtin/add.c to display the ADVICE_ADD_IGNORED_FILE
advice, is doing these three things.  However, the instructions
displayed on how to disable the hint are not shown in the normalized way
that advise_if_enabled() introduced.  This may cause distraction.

There is no reason not to use the new API here.  On the contrary, by
using it we gain simplicity in the code and avoid possible distractions.

For these reasons, use the newer advise_if_enabled() machinery to show
the ADVICE_ADD_IGNORED_FILE advice, and don't bother checking the
visibility or displaying the instruction on how to disable the advice.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-30 17:55:00 -07:00
b9f2e1a684 checkout: omit "tracking" information on a detached HEAD
By definition, a detached HEAD state is tentative and there is no
configured "upstream" that it always wants to integrate with.  But
if you detach from a branch that is behind its upstream, e.g.,

    $ git checkout -t -b main origin/main
    $ git checkout main
    $ git reset --hard HEAD^
    $ git checkout --detach main

you'd see "you are behind your upstream origin/main".  This does not
happen when you replace the last step in the above with any of these

    $ git checkout HEAD^0
    $ git checkout --detach HEAD
    $ git checkout --detach origin/main

Before 32669671 (checkout: introduce --detach synonym for "git
checkout foo^{commit}", 2011-02-08) introduced the "--detach"
option, the rule to decide if we show the tracking information
used to be:

    If --quiet is not given, and if the given branch name is a real
    local branch (i.e. the one we can compute the file path under
    .git/, like 'refs/heads/master' or "HEAD" which stand for the
    name of the current branch", then give the tracking information.

to exclude things like "git checkout master^0" (which was the
official way to detach HEAD at the commit before that commit) and
"git checkout origin/master^0" from showing tracking information,
but still do show the tracking information for the current branch
for "git checkout HEAD".  The introduction of an explicit option
"--detach" broke this subtley.  The new rule should have been

    If --quiet is given, do not bother with tracking info.
    If --detach is given, do not bother with tracking info.

    Otherwise, if we know that the branch name given is a real local
    branch, or if we were given "HEAD" and "HEAD" is not detached,
    then attempt to show the tracking info.

but it allowed "git checkout --detach master" to also show the
tracking info by mistake.  Let's tighten the rule to fix this.

Reported-by: mirth hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-30 17:53:25 -07:00
2d8cf94b28 advice: omit trailing whitespace
Git tools all consistently encourage users to avoid whitespaces at
the end of line by giving them features like "git diff --check" and
"git am --whitespace=fix".  Make sure that the advice messages we
give users avoid trailing whitespaces.  We shouldn't be wasting
vertical screen real estate by adding blank lines in advice messages
that are supposed to be concise hints, but as long as we write such
blank line in our "hints", we should do it right.

A test that expects the current behaviour of leaving trailing
whitespaces has been adjusted.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 16:18:48 -07:00
ebb55042a4 doc: git-clone: do not autoreference the manpage in itself
Auto-reference in man pages is a confusion factor.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 10:57:41 -07:00
76880f0510 doc: git-clone: apply new documentation formatting guidelines
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 10:57:40 -07:00
5cf7dfe93e doc: git-init: apply new documentation formatting guidelines
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 10:57:40 -07:00
71d9f5a19f doc: allow literal and emphasis format in doc vs help tests
As the new formatting of literal and placeholders is introduced,
the synopsis in the man pages can now hold additional markup with
respect to the command help.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 10:57:40 -07:00
c42ea60495 doc: rework CodingGuidelines with new formatting rules
Literal and placeholder formatting is more heavily enforced, with some
asciidoc magic. Basically, the markup is preserved everywhere.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 10:57:40 -07:00
776ffd1a30 t4126: fix "funny directory name" test on Windows (again)
Even though "git update-index --cacheinfo" ought to be filesystem
agnostic,

    $ git update-index --add --cacheinfo "100644,$empty_blob,funny /empty"

fails only on Windows, and this unfortunately makes the approach of
the previous step unworkable.

Resurrect the earlier approach to give up on running the test on
known-bad platforms.  Instead of computing a custom prerequisite,
just use !MINGW we have used elsewhere.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 10:22:34 -07:00
bab1f1c394 add-patch: do not print hunks repeatedly
The interactive-patch is a sequential process where, on each step, we
print one hunk from a patch and then ask the user how to proceed.

There is a possibility of repeating a step, for example if the user
enters a non-applicable option, i.e: "s"

    $ git add -p
    diff --git a/add-patch.c b/add-patch.c
    index 52be1ddb15..8fb75e82e2 100644
    --- a/add-patch.c
    +++ b/add-patch.c
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]? s
    Sorry, cannot split this hunk
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?

... or an invalid option, i.e: "U"

    $ git add -p
    diff --git a/add-patch.c b/add-patch.c
    index 52be1ddb15..8fb75e82e2 100644
    --- a/add-patch.c
    +++ b/add-patch.c
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]? U
    y - stage this hunk
    n - do not stage this hunk
    q - quit; do not stage this hunk or any of the remaining ones
    a - stage this hunk and all later hunks in the file
    d - do not stage this hunk or any of the later hunks in the file
    j - leave this hunk undecided, see next undecided hunk
    J - leave this hunk undecided, see next hunk
    g - select a hunk to go to
    / - search for a hunk matching the given regex
    e - manually edit the current hunk
    p - print again the current hunk
    ? - print help
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?

Printing the chunk again followed by the question can be confusing as
the user has to pay special attention to notice that the same chunk is
being reconsidered.

It can also be problematic if the chunk is longer than one screen height
because the result of the previous iteration is lost off the screen (the
help guide in the previous example).

To avoid such problems, stop printing the chunk if the iteration does
not advance to a different chunk.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-29 10:12:39 -07:00
66c14ab592 add-patch: introduce 'p' in interactive-patch
Shortly we're going make interactive-patch stop printing automatically
the hunk under certain circumstances.

Let's introduce a new option to allow the user to explicitly request
the printing.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-28 22:40:08 -07:00
012c8b307d t4126: make sure a directory with SP at the end is usable
As afb31ad9 (t1010: fix unnoticed failure on Windows, 2021-12-11)
said:

    On Microsoft Windows, a directory name should never end with a period.
    Quoting from Microsoft documentation[1]:

	Do not end a file or directory name with a space or a period.
	Although the underlying file system may support such names, the
	Windows shell and user interface does not.

    [1]: https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

and the condition addressed by this change is exactly that.  If the
platform is unable to properly create these sample patches about a
file that lives in a directory whose name ends with a SP, there is
no point testing how "git apply" behaves there on the filesystem.

Even though the ultimate purpose of "git apply" is to apply a patch
and to update the filesystem entities, this particular test is
mainly about parsing a patch on a funny pathname correctly, and even
on a system that is incapable of checking out the resulting state
correctly on its filesystem, at least the parsing can and should work
fine.  Rewrite the test to work inside the index without touching the
filesystem.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-28 14:14:48 -07:00
d6fd04375f The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-28 14:13:51 -07:00
20d1adb6fc Merge branch 'jk/drop-hg-to-git'
Remove an ancient and not well maintained Hg-to-git migration
script from contrib/.

Acked-by: Stelian Pop <stelian@popies.net>
cf. <37e4cd61-b370-437e-bd42-f98f47d3ad32@popies.net>

* jk/drop-hg-to-git:
  contrib: drop hg-to-git script
2024-03-28 14:13:51 -07:00
8e2422320c Merge branch 'rs/t-prio-queue-fixes'
Test clean-up.

* rs/t-prio-queue-fixes:
  t-prio-queue: check result array bounds
  t-prio-queue: shorten array index message
2024-03-28 14:13:51 -07:00
b31d466365 Merge branch 'bt/fuzz-config-parse'
A new fuzz target that exercises config parsing code has been
added.

* bt/fuzz-config-parse:
  fuzz: add fuzzer for config parsing
2024-03-28 14:13:51 -07:00
bf0a352069 Merge branch 'jc/show-untracked-false'
The status.showUntrackedFiles configuration variable had a name
that tempts users to set a Boolean value expressed in our usual
"false", "off", and "0", but it only took "no".  This has been
corrected so "true" and its synonyms are taken as "normal", while
"false" and its synonyms are taken as "no".

* jc/show-untracked-false:
  status: allow --untracked=false and friends
  status: unify parsing of --untracked= and status.showUntrackedFiles
2024-03-28 14:13:50 -07:00
396430b5a7 Merge branch 'ph/diff-src-dst-prefix-config'
"git diff" and friends learned two extra configuration variables,
diff.srcPrefix and diff.dstPrefix.

* ph/diff-src-dst-prefix-config:
  diff.*Prefix: use camelCase in the doc and test titles
  diff: add diff.srcPrefix and diff.dstPrefix configuration variables
2024-03-28 14:13:50 -07:00
1002f28a52 Merge branch 'eb/hash-transition'
Work to support a repository that work with both SHA-1 and SHA-256
hash algorithms has started.

* eb/hash-transition: (30 commits)
  t1016-compatObjectFormat: add tests to verify the conversion between objects
  t1006: test oid compatibility with cat-file
  t1006: rename sha1 to oid
  test-lib: compute the compatibility hash so tests may use it
  builtin/ls-tree: let the oid determine the output algorithm
  object-file: handle compat objects in check_object_signature
  tree-walk: init_tree_desc take an oid to get the hash algorithm
  builtin/cat-file: let the oid determine the output algorithm
  rev-parse: add an --output-object-format parameter
  repository: implement extensions.compatObjectFormat
  object-file: update object_info_extended to reencode objects
  object-file-convert: convert commits that embed signed tags
  object-file-convert: convert commit objects when writing
  object-file-convert: don't leak when converting tag objects
  object-file-convert: convert tag objects when writing
  object-file-convert: add a function to convert trees between algorithms
  object: factor out parse_mode out of fast-import and tree-walk into in object.h
  cache: add a function to read an OID of a specific algorithm
  tag: sign both hashes
  commit: export add_header_signature to support handling signatures on tags
  ...
2024-03-28 14:13:50 -07:00
95ab557b4b MyFirstObjectWalk: add stderr to pipe processing
In the last chapter of this document, pipes are used in commands to
filter out the first/last trace messages.  But according to git(1),
trace messages are sent to stderr if GIT_TRACE is set to '1', so those
commands do not produce the described results.

Fix this by redirecting stderr to stdout prior to the pipe operator
to additionally connect stderr to stdin of the latter command.

Further, while reviewing the above fix, Kyle Lippincott noticed
a second issue with the second of the examples: a missing slash in the
executable path "./bin-wrappers git".

Add the missing slash.

Helped-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-27 09:24:35 -07:00
7250cdb695 MyFirstObjectWalk: fix description for counting omitted objects
Before the changes to count omitted objects, the function
traverse_commit_list() was used and its call cannot be changed to pass
a pointer to an oidset to record omitted objects.

Fix the text to clarify that we now use another traversal function to
be able to pass the pointer to the introduced oidset.

Helped-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-27 09:24:35 -07:00
af3888890e MyFirstObjectWalk: fix filtered object walk
Commit f0d2f84919 (MyFirstObjectWalk: update recommended usage,
2022-03-09) changed a call of parse_list_objects_filter() in a way
that probably never worked: parse_list_objects_filter() always needed
a pointer as its first argument.

Fix this by removing the CALLOC_ARRAY and passing the address of
rev->filter to parse_list_objects_filter() in accordance to
such a call in revisions.c, for example.

Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-27 09:24:34 -07:00
34e0b72b19 MyFirstObjectWalk: fix misspelled "builtins/"
pack-objects.c resides in builtin/ (not builtins/).

Fix the misspelled directory name.

Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-27 09:24:34 -07:00
d08a189ce2 MyFirstObjectWalk: use additional arg in config_fn_t
Commit a4e7e317f8 (config: add ctx arg to config_fn_t, 2023-06-28)
added a fourth argument to config_fn_t but did not change relevant
function calls in Documentation/MyFirstObjectWalk.txt.

Fix those calls and the example git_walken_config() to use
that additional argument.

Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-27 09:24:34 -07:00
9ccf3e9b22 config: add core.commentString
The core.commentChar code recently learned to accept more than a
single ASCII character. But using it is annoying with multiple versions
of Git, since older ones will reject it outright:

    $ git.v2.44.0 -c core.commentchar=foo stripspace -s
    error: core.commentChar should only be one ASCII character
    fatal: unable to parse 'core.commentchar' from command-line config

Let's add an alias core.commentString. That's arguably a better name
anyway, since we now can handle strings, and it makes it possible to
have a config that works reasonably with both old and new versions of
Git (see the example in the documentation).

This is strictly an alias, so there's not much point in adding duplicate
tests; I added a single one to t0030 that exercises the alias code.

Note also that the error messages for invalid values will now show the
variable the config parser handed us, and thus will be normalized to
lowercase (rather than camelcase). A few tests in t0030 are adjusted to
match.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-27 08:48:54 -07:00
d255105c99 SubmittingPatches: release-notes entry experiment
The "What's cooking" report lists the topics in flight, with a short
paragraph descibing what they are about.

Once written, the description is automatically picked up from the
"What's cooking" report and used in the commit log message of the
merge commit when the topic is merged into integration branches.
These commit log messges of the merge commits are then propagated to
the release notes.

It has been the maintainer's task to prepare these entries in the
"What's cooking" report.  Even though the original author of a topic
may be in the best position to write the initial description of a
topic, we so far lacked a formal channel for the author to suggest
what description to use.  The usual procedure has been for the
author to see the topic described in "What's cooking" report, and
then either complain about inaccurate explanation and/or offer a
rewrite.

Let's try an experiment to optionally let the author propose the one
paragraph description when the topic is submitted.  Pick the cover
letter as the logical place to do so, and describe an experimental
workflow in the SubmittingPatches document.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-26 09:37:15 -07:00
ec79d763de cherry-pick: add --empty for more robust redundant commit handling
As with git-rebase(1) and git-am(1), git-cherry-pick(1) can result in a
commit being made redundant if the content from the picked commit is
already present in the target history. However, git-cherry-pick(1) does
not have the same options available that git-rebase(1) and git-am(1) have.

There are three things that can be done with these redundant commits:
drop them, keep them, or have the cherry-pick stop and wait for the user
to take an action. git-rebase(1) has the `--empty` option added in commit
e98c4269c8 (rebase (interactive-backend): fix handling of commits that
become empty, 2020-02-15), which handles all three of these scenarios.
Similarly, git-am(1) got its own `--empty` in 7c096b8d61 (am: support
--empty=<option> to handle empty patches, 2021-12-09).

git-cherry-pick(1), on the other hand, only supports two of the three
possiblities: Keep the redundant commits via `--keep-redundant-commits`,
or have the cherry-pick fail by not specifying that option. There is no
way to automatically drop redundant commits.

In order to bring git-cherry-pick(1) more in-line with git-rebase(1) and
git-am(1), this commit adds an `--empty` option to git-cherry-pick(1). It
has the same three options (keep, drop, and stop), and largely behaves
the same. The notable difference is that for git-cherry-pick(1), the
default will be `stop`, which maintains the current behavior when the
option is not specified.

Like the existing `--keep-redundant-commits`, `--empty=keep` will imply
`--allow-empty`.

The `--keep-redundant-commits` option will be documented as a deprecated
synonym of `--empty=keep`, and will be supported for backwards
compatibility for the time being.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:45:41 -07:00
bd2f9fd025 cherry-pick: enforce --keep-redundant-commits incompatibility
When `--keep-redundant-commits` was added in  b27cfb0d8d
(git-cherry-pick: Add keep-redundant-commits option, 2012-04-20), it was
not marked as incompatible with the various operations needed to
continue or exit a cherry-pick (`--continue`, `--skip`, `--abort`, and
`--quit`).

Enforce this incompatibility via `verify_opt_compatible` like we do for
the other various options.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:45:41 -07:00
661b671aec sequencer: do not require allow_empty for redundant commit options
A consumer of the sequencer that wishes to take advantage of either the
`keep_redundant_commits` or `drop_redundant_commits` feature must also
specify `allow_empty`. However, these refer to two distinct types of
empty commits:

- `allow_empty` refers specifically to commits which start empty
- `keep_redundant_commits` refers specifically to commits that do not
  start empty, but become empty due to the content already existing in
  the target history

Conceptually, there is no reason that the behavior for handling one of
these should be entangled with the other. It is particularly unintuitive
to require `allow_empty` in order for `drop_redundant_commits` to have
an effect: in order to prevent redundant commits automatically,
initially-empty commits would need to be kept automatically as well.

Instead, rewrite the `allow_empty()` logic to remove the over-arching
requirement that `allow_empty` be specified in order to reach any of the
keep/drop behaviors. Only if the commit was originally empty will
`allow_empty` have an effect.

Note that no behavioral changes should result from this commit -- it
merely sets the stage for future commits. In one such future commit, an
`--empty` option will be added to git-cherry-pick(1), meaning that
`drop_redundant_commits` will be used by that command.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:45:40 -07:00
1b90588d62 sequencer: handle unborn branch with --allow-empty
When using git-cherry-pick(1) with `--allow-empty` while on an unborn
branch, an error is thrown. This is inconsistent with the same
cherry-pick when `--allow-empty` is not specified.

Detect unborn branches in `is_index_unchanged`. When on an unborn
branch, use the `empty_tree` as the tree to compare against.

Add a new test to cover this scenario. While modelled off of the
existing 'cherry-pick on unborn branch' test, some improvements can be
made:

- Use `git switch --orphan unborn` instead of `git checkout --orphan
  unborn` to avoid the need for a separate `rm -rf *` call
- Avoid using `--quiet` in the `git diff` call to make debugging easier
  in the event of a failure. Use simply `--exit-code` instead.

Make these improvements to the existing test as well as the new test.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:45:40 -07:00
c282eba2d5 rebase: update --empty=ask to --empty=stop
When git-am(1) got its own `--empty` option in 7c096b8d61 (am: support
--empty=<option> to handle empty patches, 2021-12-09), `stop` was used
instead of `ask`. `stop` is a more accurate term for describing what
really happens, and consistency is good.

Update git-rebase(1) to also use `stop`, while keeping `ask` as a
deprecated synonym. Update the tests to primarily use `stop`, but also
ensure that `ask` is still allowed.

In a future commit, we'll be adding a new `--empty` option for
git-cherry-pick(1) as well, making the consistency even more relevant.

Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:45:40 -07:00
64a443efe4 docs: clean up --empty formatting in git-rebase(1) and git-am(1)
Both of these pages document very similar `--empty` options, but with
different styles. The exact behavior of these `--empty` options differs
somewhat, but consistent styling in the docs is still beneficial. This
commit aims to make them more consistent.

Break the possible values for `--empty` into separate sections for
readability. Alphabetical order is chosen for consistency.

In a future commit, we'll be documenting a new `--empty` option for
git-cherry-pick(1), making the consistency even more relevant.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:45:40 -07:00
0af38890ad docs: address inaccurate --empty default with --exec
The documentation for git-rebase(1) indicates that using the `--exec`
option will use `--empty=drop`. This is inaccurate: when `--interactive`
is not explicitly provided, `--exec` results in `--empty=keep`
behaviors.

Correctly indicate the behavior of `--exec` using `--empty=keep` when
`--interactive` is not specified.

Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:45:40 -07:00
c75fd8d815 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 16:16:35 -07:00
03658df781 Merge branch 'bl/doc-key-val-sep-fix'
The documentation for "%(trailers[:options])" placeholder in the
"--pretty" option of commands in the "git log" family has been
updated.

* bl/doc-key-val-sep-fix:
  docs: adjust trailer `separator` and `key_value_separator` language
  docs: correct trailer `key_value_separator` description
2024-03-25 16:16:35 -07:00
b58cc6aa5d Merge branch 'bl/doc-config-fixes'
A few typoes in "git config --help" have been corrected.

* bl/doc-config-fixes:
  docs: fix typo in git-config `--default`
  docs: clarify file options in git-config `--edit`
2024-03-25 16:16:35 -07:00
0cb25d1744 Merge branch 'ja/doc-formatting-fix'
Documentation mark-up fix.

* ja/doc-formatting-fix:
  doc: fix some placeholders formating
  doc: format alternatives in synopsis
2024-03-25 16:16:34 -07:00
a7f0fcb335 Merge branch 'bb/sh-scripts-cleanup'
Shell scripts clean-up.

* bb/sh-scripts-cleanup: (22 commits)
  git-quiltimport: avoid an unnecessary subshell
  contrib/coverage-diff: avoid redundant pipelines
  t/t9*: merge "grep | sed" pipelines
  t/t8*: merge "grep | sed" pipelines
  t/t5*: merge a "grep | sed" pipeline
  t/t4*: merge a "grep | sed" pipeline
  t/t3*: merge a "grep | awk" pipeline
  t/t1*: merge a "grep | sed" pipeline
  t/t9*: avoid redundant uses of cat
  t/t8*: avoid redundant use of cat
  t/t7*: avoid redundant use of cat
  t/t6*: avoid redundant uses of cat
  t/t5*: avoid redundant uses of cat
  t/t4*: avoid redundant uses of cat
  t/t3*: avoid redundant uses of cat
  t/t1*: avoid redundant uses of cat
  t/t0*: avoid redundant uses of cat
  t/perf: avoid redundant use of cat
  t/annotate-tests.sh: avoid redundant use of cat
  t/lib-cvs.sh: avoid redundant use of cat
  ...
2024-03-25 16:16:34 -07:00
46d8bf30e4 Merge branch 'jc/index-pack-fsck-levels'
Test fix.

* jc/index-pack-fsck-levels:
  t5300: fix test_with_bad_commit()
2024-03-25 16:16:34 -07:00
d921c365ee Merge branch 'js/bugreport-no-suffix-fix'
"git bugreport --no-suffix" was not supported and instead
segfaulted, which has been corrected.

* js/bugreport-no-suffix-fix:
  bugreport.c: fix a crash in `git bugreport` with `--no-suffix` option
2024-03-25 16:16:34 -07:00
199074f893 Merge branch 'rj/restore-plug-leaks'
Leaks from "git restore" have been plugged.

* rj/restore-plug-leaks:
  checkout: plug some leaks in git-restore
2024-03-25 16:16:33 -07:00
6e9ef296e2 grep docs: describe --no-index further and improve formatting a bit
Improve the description of --no-index, to make it more clear to the users
what this option actually does under the hood, and what's its purpose.
Describe the dependency between --no-index and either of the --cached and
--untracked options, which cannot be used together.

As part of that, shuffle a couple of the options, to make the documentation
flow a bit better, because it makes more sense to describe first the options
that have something in common, and to after that describe an option that does
something differently.  In more detail, --cached and --untracked both leave
git-grep(1) in the usual state, in which it treats the directory as a local
git repository, unlike --no-index that makes git-grep(1) treat the directory
not as a git repository.

While there, improve the descriptions of grep worker threads a bit, to give
them better context.  Adjust the language a bit, to avoid addressing the
reader directly, which is in general preferred in technical documentation,
because it eliminates the possible element of persuading the user to do
something.  In other words, we should be telling the user what our software
can do, instead of telling the user what to do.

Also perform some minor formatting improvements, to make it clear it's the
git commands, command parameters, and configuration option names.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 14:00:03 -07:00
4a9357a1ba grep docs: describe --recurse-submodules further and improve formatting a bit
Clarify that --recurse-submodules cannot be used together with --untracked,
and improve the formatting in a couple of places, to make it visually clear
that those are the commands or the names of configuration options.

While there, change a couple of "<tree>" placeholders to "_<tree>_", to help
with an ongoing translation improvement effort. [1]

[1] https://lore.kernel.org/git/CAPig+cQc8W4JOpB+TMP=czketU1U7wcY_x9bsP5T=3-XjGLhRQ@mail.gmail.com/

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 14:00:03 -07:00
f999d5188b pretty: find pretty formats case-insensitively
User-defined pretty formats are stored in config, which is meant to use
case-insensitive matching for names as noted in config.txt's 'Syntax'
section:

    All the other lines [...] are recognized as setting variables, in
    the form 'name = value' [...]. The variable names are
    case-insensitive, [...].

When a user specifies one of their format aliases with an uppercase in
it, however, it is not found.

    $ git config pretty.testAlias %h
    $ git config --list | grep pretty
    pretty.testalias=%h
    $ git log --format=testAlias -1
    fatal: invalid --pretty format: testAlias
    $ git log --format=testalias -1
    3c2a3fdc38

This is true whether the name in the config file uses any uppercase
characters or not.

Use case-insensitive comparisons when identifying format aliases.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 12:19:48 -07:00
2cd134f2c5 pretty: update tests to use test_config
These tests use raw `git config` calls, which is an older style that can
cause config to bleed between tests if not manually unset. `test_config`
ensures that config is unset at the end of each test automatically.

`test_config` is chosen over `git -c` since `test_config` still ends up
calling `git config` which seems slightly more realistic to how pretty
formats would be defined normally.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 12:19:45 -07:00
4d45e79e11 midx: use strvec_pushf() for pack-objects base name
Build the pack base name argument directly using strvec_pushf() instead
of with an intermediate strbuf.  This is shorter, simpler and avoids the
need for explicit cleanup.

Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 12:03:27 -07:00
8d383806fc t/README: mention test files are make targets
Since 23fc63bf8f (make tests ignorable with "make -i", 2005-11-08), each
test file defines a target in the test Makefile, such that one can
invoke:

	make *checkout*

to run all tests with 'checkout' in their filename. This is useful to
run a subset of tests when you have a good idea of what part of the code
is touched by the changes your are testing.

Document that in t/README to help new (or more seasoned) contributors
that might not be aware.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 11:59:48 -07:00
7c43bdf07b cat-file: use strbuf_expand_bad_format()
Report unknown format elements and missing closing parentheses with
consistent and translated messages by calling strbuf_expand_bad_format()
at the very end of the combined if/else chain of expand_format() and
expand_atom().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 11:59:26 -07:00
e36091aa1d factor out strbuf_expand_bad_format()
Extract a function for reporting placeholders that are not enclosed in a
parenthesis or are unknown.  This reduces the number of strings to
translate and improves consistency across commands.  Call it at the end
of the if/else chain, after exhausting all accepted possibilities.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 11:59:24 -07:00
0d527842b7 grep: improve errors for unmatched ( and )
Imagine you want to grep for (. Easy:

  $ git grep '('
  fatal: unmatched parenthesis

uhoh. This is plainly wrong. Unless you know specifically that

 (a) git grep has expression groups and '(' ... ')' are used for them.
 (b) you can use -e '(' to explicitly say '(' is what you are looking
     for, not the beginning of a group.

Similarly,

  $ git grep ')'
  fatal: incomplete pattern expression: )

is somehow worse. ")" is a complete regular expression pattern.
Of course, the error wants to say "group" here.
In this case it is also not "incomplete", it is unmatched.

Make them say

  $ ./git grep '('
  fatal: unmatched ( for expression group
  $ ./git grep ')'
  fatal: incomplete pattern expression group: )

which are clearer in indicating that it is not the expression that
is wrong (since no pattern had been parsed at all), but rather that
it is been misconstrued as a grouping operator.

Link: https://bugs.debian.org/1051205
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 11:40:53 -07:00
9f6714ab3e builtin/gc: pack refs when using git maintenance run --auto
When running `git maintenance run --auto`, then the various subtasks
will only run as needed. Thus, we for example end up only packing loose
objects if we hit a certain threshold.

Interestingly enough, the "pack-refs" task is actually _never_ executed
when the auto-flag is set because it does not have a condition at all.
As 41abfe15d9 (maintenance: add pack-refs task, 2021-02-09) mentions:

    The 'auto_condition' function pointer is left NULL for now. We could
    extend this in the future to have a condition check if pack-refs
    should be run during 'git maintenance run --auto'.

It is not quite clear from that quote whether it is actually intended
that the task doesn't run at all in this mode. Also, no test was added
to verify this behaviour. Ultimately though, it feels quite surprising
that `git maintenance run --auto --task=pack-refs` would quietly never
do anything at all.

In any case, now that we do have the logic in place to let ref backends
decide whether or not to repack refs, it does make sense to wire it up
accordingly. With the "reftable" backend we will thus now perform
auto-compaction, which optimizes the refdb as needed.

But for the "files" backend we now unconditionally pack refs as it does
not yet know to handle the "auto" flag. Arguably, this can be seen as a
bug fix given that previously the task never did anything at all.
Eventually though we should amend the "files" backend to use some
heuristics for auto compaction, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
bfc2f9eb8e builtin/gc: forward git-gc(1)'s --auto flag when packing refs
Forward the `--auto` flag to git-pack-refs(1) when it has been invoked
with this flag itself. This does not change anything for the "files"
backend, which will continue to eagerly pack refs. But it does ensure
that the "reftable" backend only compacts refs as required.

This change does not impact git-maintenance(1) because this command will
in fact never run the pack-refs task when run with `--auto`. This issue
will be addressed in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
77257e3c7e t6500: extract objects with "17" prefix
The ".git/obects/17/" shard is somewhat special because it is used by
git-gc(1) to estimate how many objects there are by extrapolating the
number of objects in that shard, only. In t6500 we thus have a hard
coded set of data that, when written to the object database, result in
blobs starting with that prefix.

We are about to need such "17"-prefixed objects in another test suite.
Extract them into "t/oid-info/hash-info" so that they can be reused by
other tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
0e05d53992 builtin/gc: move struct maintenance_run_opts
We're about to start using `struct maintenance_run_opts` in
`maintenance_task_pack_refs()`. Move its definition up to prepare for
this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
6dcffc68f4 builtin/pack-refs: introduce new "--auto" flag
Calling git-pack-refs(1) will unconditionally cause it to pack all
requested refs regardless of the current state of the ref database. For
example:

  - With the "files" backend we will end up rewriting the complete
    "packed-refs" file even if only a single ref would require
    compaction.

  - With the "reftable" backend we will end up always compacting all
    tables into a single table.

This behaviour can be completely unnecessary depending on the backend
and is thus wasteful.

With the introduction of the `PACK_REFS_AUTO` flag in the preceding
commit we can improve this and let the backends decide for themselves
whether to pack refs in the first place. Expose this functionality via a
new "--auto" flag in git-pack-refs(1), which mirrors the same flag in
both git-gc(1) and git-maintenance(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
a75dc71f37 builtin/pack-refs: release allocated memory
Some of the command line options in `cmd_pack_refs()` require us to
allocate memory. This memory is never released and thus leaking, but we
paper over this leak by declaring the respective variables as `static`
function-level variables, which is somewhat awkward.

Refactor the code to release the allocated memory and drop the `static`
declaration. While at it, remove the useless `flags` variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
f89356db4a refs/reftable: expose auto compaction via new flag
Under normal circumstances, the "reftable" backend will automatically
perform compaction after appending to the stack. It is thus not
necessary and may even be considered wasteful to run git-pack-refs(1) in
"reftable"-backed repositories as it will cause the backend to compact
all tables into a single one. We do exactly that though when running
`git maintenance run --auto` or `git gc --auto`, which gets spawned by
Git after running some specific commands.

The `--auto` mode is typically only executing optimizations as needed.
To do so, we already use several heuristics for the various different
data structures in Git to determine whether to optimize them or not.
We do not use any heuristics for refs though and instead always optimize
them.

Introduce a new `PACK_REFS_AUTO` flag that can be passed to the backend.
When not handled by the backend we will continue to behave the exact
same as we do right now, that is we optimize refs unconditionally. This
is done for the "files" backend for now to retain current behaviour,
even though we may eventually also want to introduce heuristics here.
For the "reftable" backend though we already do have auto-compaction, so
we can easily reuse that logic to implement the new auto-packing flag.

Note that under normal circumstances, this should always end up being a
no-op. After all, we already invoke the code for every single addition
to the stack. But there are special cases where it can still be helpful
to execute the auto-compaction code explicitly:

  - Concurrent writers may cause compaction to not run due to locks.

  - Callers may decide to disable compaction altogether and then pack
    refs at a later point due to various reasons.

  - Other implementations of the reftable format may do compaction
    differently or even not at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
35aeabd6c2 refs: remove PACK_REFS_ALL flag
The intent of the `PACK_REFS_ALL` flag is to ask the backend to compact
all refs instead of only a subset of them. Thus, this flag gets passed
down to `refs_pack_refs()` via `struct pack_refs_opts::flags`.

But starting with 4fe42f326e (pack-refs: teach pack-refs --include
option, 2023-05-12), the flag's semantics have changed. Instead of being
handled by the respective backends, this flag is now getting handled by
the callers of `refs_pack_refs()` which will add a single glob ("*") to
the list of refs-to-be-packed. Thus, the flag serves no purpose to the
ref backends anymore.

Remove the flag and replace it with a local variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
0f65c7a676 refs: move struct pack_refs_opts to where it's used
The declaration of `struct pack_refs_opts` is in a seemingly random
place. Move it so that it's located right next to its flags and
functions that use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
ed12124d4a t/helper: drop pack-refs wrapper
The test helper provides a "ref-store <store> pack-refs" wrapper that
more or less directly invokes `refs_pack_refs()`. This helper is only
used in a single test with the "PACK_REFS_PRUNE" and "PACK_REFS_ALL"
flags. Both of these flags can directly be accessed via git-pack-refs(1)
though via the `--all` and `--prune` flags, which makes the helper
superfluous.

Refactor the test to use git-pack-refs(1) instead of the test helper.
Drop the now-unused test helper command.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
4ccf7060d8 refs/reftable: print errors on compaction failure
When git-pack-refs(1) fails in the reftable backend we end up printing
no error message at all, leaving the caller puzzled as to why compaction
has failed. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
a2f711ade0 reftable/stack: gracefully handle failed auto-compaction due to locks
Whenever we commit a new table to the reftable stack we will end up
invoking auto-compaction of the stack to keep the total number of tables
at bay. This auto-compaction may fail though in case at least one of the
tables which we are about to compact is locked. This is indicated by the
compaction function returning `REFTABLE_LOCK_ERROR`. We do not handle
this case though, and thus bubble that return value up the calling
chain, which will ultimately cause a failure.

Fix this bug by ignoring `REFTABLE_LOCK_ERROR`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
33358350eb reftable/stack: use error codes when locking fails during compaction
Compaction of a reftable stack may fail gracefully when there is a
concurrent process that writes to the reftable stack and which has thus
locked either the "tables.list" file or one of the tables. This is
expected and can be handled gracefully by some of the callers which
invoke compaction. Thus, to indicate this situation to our callers, we
return a positive return code from `stack_compact_range()` and bubble it
up to the caller.

This kind of error handling is somewhat awkward though as many callers
in the call chain never even think of handling positive return values.
Thus, the result is either that such errors are swallowed by accident,
or that we abort operations with an unhelpful error message.

Make the code more robust by always using negative error codes when
compaction fails, with `REFTABLE_LOCK_ERROR` for the described benign
error case.

Note that only a single callsite knew to handle positive error codes
gracefully in the first place. Subsequent commits will touch up some of
the other sites to handle those errors better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:51:11 -07:00
af18098c9d reftable/error: discern locked/outdated errors
We currently throw two different errors into a similar-but-different
error code:

  - Errors when trying to lock the reftable stack.

  - Errors when trying to write to the reftable stack which has been
    modified concurrently.

This results in unclear error handling and user-visible error messages.

Create a new `REFTABLE_OUTDATED_ERROR` so that those error conditions
can be clearly told apart from each other. Adjust users of the old
`REFTABLE_LOCK_ERROR` to use the new error code as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:51:11 -07:00
630942a873 reftable/stack: fix error handling in reftable_stack_init_addition()
In `reftable_stack_init_addition()` we call `stack_uptodate()` after
having created the lockfile to check whether the stack was modified
concurrently, which is indicated by a positive return code from the
latter function. If so, we return a `REFTABLE_LOCK_ERROR` to the caller
and abort the addition.

The error handling has an off-by-one though because we check whether the
error code is `> 1` instead of `> 0`. Thus, instead of returning the
locking error, we would return a positive value. One of the callers of
`reftable_stack_init_addition()` works around this bug by repeating the
error code check without the off-by-one. But other callers are subtly
broken by this bug.

Fix this by checking for `err > 0` instead. This has the consequence
that `reftable_stack_init_addition()` won't ever return a positive error
code anymore, but will instead return `REFTABLE_LOCK_ERROR` now. Thus,
we can drop the check for a positive error code in `stack_try_add()`
now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:51:11 -07:00
b45602e392 editorconfig: add Makefiles to "text files"
The Makefile and makefile fragments use the same indent style than the
rest of the code (with some inconsistencies).

Add them to the relevant .editorconfig section to make life easier for
editors and reviewers.

Signed-off-by: Max Gautier <mg@max.gautier.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-23 11:42:31 -07:00
647e870a08 rebase: use child_process_clear() to clean
In the run_am() function, we set up a child_process struct to run
"git-am", allocating memory for its args and env strvecs. These are
normally cleaned up when we call run_command(). But if we encounter
certain errors, we exit the function early and try to clean up ourselves
by clearing the am.args field. This leaks the "env" strvec.

We should use child_process_clear() instead, which covers both. And more
importantly, it future proofs us against the struct ever growing more
allocated fields.

These are unlikely errors to happen in practice, so they don't actually
trigger the leak sanitizer in the tests. But we can add a new test which
does exercise one of the paths (and fails SANITIZE=leak without this
patch).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-22 10:21:35 -07:00
1c10b8e5b0 format-patch: fix leak of empty header string
The log_write_email_headers() function recently learned to return the
"extra_headers_p" variable to the caller as an allocated string. We
start by copying rev_info.extra_headers into a strbuf, and then detach
the strbuf at the end of the function. If there are no extra headers, we
leave the strbuf empty. Likewise, if there are no headers to return, we
pass back NULL.

This misses a corner case which can cause a leak. The "do we have any
headers to copy" check is done by looking for a NULL opt->extra_headers.
But the "do we have a non-empty string to return" check is done by
checking the length of the strbuf. That means if opt->extra_headers is
the empty string, we'll "copy" it into the strbuf, triggering an
allocation, but then leak the buffer when we return NULL from the
function.

We can solve this in one of two ways:

  1. Rather than checking headers->len at the end, we could check
     headers->alloc to see if we allocated anything. That retains the
     original behavior before the recent change, where an empty
     extra_headers string is "passed through" to the caller. In practice
     this doesn't matter, though (the code which eventually looks at the
     result treats NULL or the empty string the same).

  2. Only bother copying a non-empty string into the strbuf. This has
     the added bonus of avoiding a pointless allocation.

     Arguably strbuf_addstr() could do this optimization itself, though
     it may be slightly dangerous to do so (some existing callers may
     not get a fresh allocation when they expect to). In theory callers
     are all supposed to use strbuf_detach() in such a case, but there's
     no guarantee that this is the case.

This patch uses option 2. Without it, building with SANITIZE=leak shows
many errors in t4021 and elsewhere.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-22 09:50:53 -07:00
7c4449eb31 t/README: document how to loop around test cases
In some cases it makes sense to loop around test cases so that we can
execute the same test with slightly different arguments. There are some
gotchas around quoting here though that are easy to miss and that may
lead to easy-to-miss errors and portability issues.

Document the proper way to do this in "t/README".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-22 07:36:35 -07:00
c559677c1f t7800: use single quotes for test bodies
In eb84c8b6ce (git-difftool--helper: honor `--trust-exit-code` with
`--dir-diff`, 2024-02-20) we have started to loop around some of the
tests in t7800 so that they are reexecuted with slightly different
arguments. As part of that refactoring the quoting of test bodies was
changed from single quotes (') to double quotes (") so that the value of
the loop variable is accessible to the body.

As the test body is later on passed to eval this change was not required
though. Let's revert it back to use single quotes as usual in our tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-22 07:36:34 -07:00
ac45f68866 t7800: improve test descriptions with empty arguments
Some of the tests in t7800 are executed repeatedly in a loop with
different arguments. To distinguish these tests, the value of that
variable is rendered into the test title. But given that one of the
values is the empty string, it results in a somewhat awkward test name:

    difftool  ignores exit code

Improve this by printing "without options" in case the value is empty.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-22 07:36:34 -07:00
e6895c3f97 config.txt: describe handling of whitespace further
Make it more clear what the whitespace characters are in the context of git
configuration files, and significantly improve the description of the leading
and trailing whitespace handling, especially how it works out together with
the presence of inline comments.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21 15:57:10 -07:00
d71bc1b4a3 t1300: add more tests for whitespace and inline comments
Add a handful of additional tests, to improve the coverage of the handling
of configuration file entries whose values contain internal whitespace,
leading and/or trailing whitespace, which may or may not be enclosed within
quotation marks, or which contain an additional inline comment.

At the same time, rework one already existing whitespace-related test a bit,
to ensure its consistency with the newly added tests.  This change introduced
no functional changes to the already existing test.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21 15:57:10 -07:00
f0b8944430 config: really keep value-internal whitespace verbatim
Fix a bug in function parse_value() that prevented whitespace characters
(i.e. spaces and horizontal tabs) found inside configuration option values
from being parsed and returned in their original form.  The bug caused any
number of consecutive whitespace characters to be wrongly "squashed" into
the same number of space characters.

This bug was introduced back in July 2009, in commit ebdaae372b ("config:
Keep inner whitespace verbatim").

Further investigation showed that setting a configuration value, by invoking
git-config(1), converts value-internal horizontal tabs into "\t" escape
sequences, which the buggy value-parsing logic in function parse_value()
didn't "squash" into spaces.  That's why the test included in the ebdaae37
commit passed, which presumably made the bug remain undetected for this long.
On the other hand, value-internal literal horizontal tab characters, found in
a configuration file edited by hand, do get "squashed" by the value-parsing
logic, so the right choice was to fix this bug by making the value-internal
whitespace characters preserved verbatim.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21 15:57:09 -07:00
0d49b1e5a8 config: minor addition of whitespace
In general, binary operators should be enclosed in a pair of leading and
trailing space (SP) characters.  Thus, clean up one spotted expression that
for some reason had a "bunched up" operator.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21 15:57:09 -07:00
11c821f2f2 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21 14:55:14 -07:00
1f49f7506f Merge branch 'bb/iso-strict-utc'
The output format for dates "iso-strict" has been tweaked to show
a time in the Zulu timezone with "Z" suffix, instead of "+00:00".

* bb/iso-strict-utc:
  date: make "iso-strict" conforming for the UTC timezone
2024-03-21 14:55:14 -07:00
e577feced0 Merge branch 'bb/t0006-negative-tz-offset'
More tests on showing time with negative TZ offset.

* bb/t0006-negative-tz-offset:
  t0006: add more tests with a negative TZ offset
2024-03-21 14:55:14 -07:00
6e701146b7 Merge branch 'jw/doc-show-untracked-files-fix'
The status.showUntrackedFiles configuration variable was
incorrectly documented to accept "false", which has been corrected.

* jw/doc-show-untracked-files-fix:
  doc: status.showUntrackedFiles does not take "false"
2024-03-21 14:55:14 -07:00
509a047355 Merge branch 'dg/user-manual-hash-example'
User manual (the original one) update.

* dg/user-manual-hash-example:
  Documentation/user-manual.txt: example for generating object hashes
2024-03-21 14:55:14 -07:00
81ba11b7c4 Merge branch 'ja/doc-markup-fixes'
Mark-ups used in the documentation has been improved for
consistency.

* ja/doc-markup-fixes:
  doc: git-clone: format placeholders
  doc: git-clone: format verbatim words
  doc: git-init: rework config item init.templateDir
  doc: git-init: rework definition lists
  doc: git-init: format placeholders
  doc: git-init: format verbatim parts
2024-03-21 14:55:13 -07:00
b0b43e3b1a Merge branch 'pb/ci-win-artifact-names-fix'
CI update.

* pb/ci-win-artifact-names-fix:
  ci(github): make Windows test artifacts name unique
2024-03-21 14:55:13 -07:00
e8c1cda9a9 Merge branch 'ps/reftable-reflog-iteration-perf'
The code to iterate over reflogs in the reftable has been optimized
to reduce memory allocation and deallocation.

Reviewed-by: Josh Steadmon <steadmon@google.com>
cf. <Ze9eX-aaWoVaqsPP@google.com>

* ps/reftable-reflog-iteration-perf:
  refs/reftable: track last log record name via strbuf
  reftable/record: use scratch buffer when decoding records
  reftable/record: reuse message when decoding log records
  reftable/record: reuse refnames when decoding log records
  reftable/record: avoid copying author info
  reftable/record: convert old and new object IDs to arrays
  refs/reftable: reload correct stack when creating reflog iter
2024-03-21 14:55:13 -07:00
dc97afdcb9 Merge branch 'jc/safe-implicit-bare'
Users with safe.bareRepository=explicit can still work from within
$GIT_DIR of a seconary worktree (which resides at .git/worktrees/$name/)
of the primary worktree without explicitly specifying the $GIT_DIR
environment variable or the --git-dir=<path> option.

* jc/safe-implicit-bare:
  setup: notice more types of implicit bare repositories
2024-03-21 14:55:13 -07:00
8be51c1f36 Merge branch 'fs/find-end-of-log-message-fix'
The code to find the effective end of log message can fall into an
endless loop, which has been corrected.

* fs/find-end-of-log-message-fix:
  wt-status: don't find scissors line beyond buf len
2024-03-21 14:55:12 -07:00
3eba921f81 Merge branch 'ps/reftable-block-search-fix'
The reftable code has its own custom binary search function whose
comparison callback has an unusual interface, which caused the
binary search to degenerate into a linear search, which has been
corrected.

* ps/reftable-block-search-fix:
  reftable/block: fix binary search over restart counter
  reftable/record: fix memory leak when decoding object records
2024-03-21 14:55:12 -07:00
330ed38a2d Merge branch 'ps/reftable-stack-tempfile'
The code in reftable backend that creates new table files works
better with the tempfile framework to avoid leaving cruft after a
failure.

* ps/reftable-stack-tempfile:
  reftable/stack: register compacted tables as tempfiles
  reftable/stack: register lockfiles during compaction
  reftable/stack: register new tables as tempfiles
  lockfile: report when rollback fails
2024-03-21 14:55:12 -07:00
7a01b44463 Merge branch 'rs/opt-parse-long-fixups'
The parse-options code that deals with abbreviated long option
names have been cleaned up.

Reviewed-by: Josh Steadmon <steadmon@google.com>
cf. <ZfDM5Or3EKw7Q9SA@google.com>

* rs/opt-parse-long-fixups:
  parse-options: rearrange long_name matching code
  parse-options: normalize arg and long_name before comparison
  parse-options: detect ambiguous self-negation
  parse-options: factor out register_abbrev() and struct parsed_option
  parse-options: set arg of abbreviated option lazily
  parse-options: recognize abbreviated negated option with arg
2024-03-21 14:55:12 -07:00
0068aa7946 reftable: fix tests being broken by NFS' delete-after-close semantics
It was reported that the reftable unit tests in t0032 fail with the
following assertion when running on top of NFS:

    running test_reftable_stack_compaction_concurrent_clean
    reftable/stack_test.c: 1063: failed assertion count_dir_entries(dir) == 2
    Aborted

Setting a breakpoint immediately before the assertion in fact shows the
following list of files:

    ./stack_test-1027.QJBpnd
    ./stack_test-1027.QJBpnd/0x000000000001-0x000000000003-dad7ac80.ref
    ./stack_test-1027.QJBpnd/.nfs000000000001729f00001e11
    ./stack_test-1027.QJBpnd/tables.list

Note the weird ".nfs*" file? This file is maintained by NFS clients in
order to emulate delete-after-last-close semantics that we rely on in
the reftable code [1]. Instead of unlinking the file right away and
keeping it open in the client, the NFS client will rename it to ".nfs*"
and then delete that temporary file when the last reference to it gets
dropped. Quoting the NFS FAQ:

    > D2. What is a "silly rename"? Why do these .nfsXXXXX files keep
    > showing up?
    >
    > A. Unix applications often open a scratch file and then unlink it.
    > They do this so that the file is not visible in the file system name
    > space to any other applications, and so that the system will
    > automatically clean up (delete) the file when the application exits.
    > This is known as "delete on last close", and is a tradition among
    > Unix applications.
    >
    > Because of the design of the NFS protocol, there is no way for a
    > file to be deleted from the name space but still remain in use by an
    > application. Thus NFS clients have to emulate this using what
    > already exists in the protocol. If an open file is unlinked, an NFS
    > client renames it to a special name that looks like ".nfsXXXXX".
    > This "hides" the file while it remains in use. This is known as a
    > "silly rename." Note that NFS servers have nothing to do with this
    > behavior.

This of course throws off the assertion that we got exactly two files in
that directory.

The test in question triggers this behaviour by holding two open file
descriptors to the "tables.list" file. One of the references is because
we are about to append to the stack, whereas the other reference is
because we want to compact it. As the compaction has just finished we
already rewrote "tables.list" to point to the new contents, but the
other file descriptor pointing to the old version is still open. Thus we
trigger the delete-after-last-close emulation.

Furthermore, it was reported that this behaviour only triggers with
4f36b8597c (reftable/stack: fix race in up-to-date check, 2024-01-18).
This is expected as well because it is the first point in time where we
actually keep the "tables.list" file descriptor open for the stat cache.

Fix this bug by skipping over any files that start with a leading dot
when counting files. While we could explicitly check for a prefix of
".nfs", other network file systems like SMB for example do the same
trickery but with a ".smb" prefix. In any case though, this loosening of
the assertion should be fine given that the reftable library would never
write files with leading dots by itself.

[1]: https://nfs.sourceforge.net/#faq_d2

Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21 10:32:21 -07:00
ba155b5cb7 contrib: drop hg-to-git script
The hg-to-git script is full of command injection vulnerabilities
against malicious branch and tag names. It's also old and largely
unmaintained; the last commit was over 4 years ago, and the last code
change before that was from 2013. Users are better off with a modern
remote-helper tool like cinnabar or remote-hg.

So rather than spending time to fix it, let's just get rid of it.

Reported-by: Matthew Rollings <admin@stealthcopter.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-20 10:23:45 -07:00
b5b7b17b2e transport-helper: send "true" value for object-format option
The documentation in gitremote-helpers.txt claims that after a helper
has advertised the "object-format" capability, Git may then send "option
object-format true" to indicate that it would like to hear which object
format the helper is using when it returns refs.

However, the code implementing this has always written just "option
object-format", without the extra "true" value. Nobody noticed in
practice or in the tests because the only two helpers we ship are:

  - remote-curl, which quietly converts missing values into "true". This
    goes all the way back to ef08ef9ea0 (remote-helpers: Support custom
    transport options, 2009-10-30), despite the fact that I don't think
    any other option has ever made use of it.

  - remote-testgit in t5801 does insist on having a "true" value. But
    since it sends the ":object-format" response regardless of whether
    it thinks the caller asked for it (technically breaking protocol),
    everything just works, albeit with an extra shell error:

      .../git/t/t5801/git-remote-testgit: 150: test: =: unexpected operator

    printed to stderr, which you can see running t5801 with --verbose.
    (The problem is that $val is the empty string, and since we don't
    double-quote it in "test $val = true", we invoke "test = true"
    instead).

When the documentation and code do not match, it is often good to fix
the documentation rather than break compatibility. And in this case, we
have had the mis-match since 8b85ee4f47 (transport-helper: implement
object-format extensions, 2020-05-25). However, the sha256 feature was
listed as experimental until 8e42eb0e9a (doc: sha256 is no longer
experimental, 2023-07-31).

It's possible there are some third party helpers that tried to follow
the documentation, and are broken. Changing the code will fix them. It's
also possible that there are ones that follow the code and will be
broken if we change it. I suspect neither is the case given that no
helper authors have brought this up as an issue (I only noticed it
because I was running t5801 in verbose mode for other reasons and
wondered about the weird shell error). That, coupled with the relative
new-ness of sha256, makes me think nobody has really worked on helpers
for it yet, which gives us an opportunity to correct the code before too
much time passes.

And doing so has some value: it brings "object-format" in line with the
syntax of other options, making the protocol more consistent. It also
lets us use set_helper_option(), which has better error reporting.

Note that we don't really need to allow any other values like "false"
here. The point is for Git to tell the helper that it understands
":object-format" lines coming back as part of the ref listing. There's
no point in future versions saying "no, I don't understand that".

To make sure everything works as expected, we can improve the
remote-testgit helper from t5801 to send the ":object-format" line only
if the other side correctly asked for it (which modern Git will always
do). With that test change and without the matching code fix here, t5801
will fail when run with GIT_TEST_DEFAULT_HASH=sha256.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-20 10:01:30 -07:00
d6f6b433a8 transport-helper: drop "object-format <algo>" option
The documentation in gitremote-helpers.txt claims that helpers should
accept an object-format option from Git whose value is either:

  1. "true", in which case the helper is merely told that Git
     understands the special ":object-format" response, and will send it

  2. an algorithm name that the helper should use

However, Git has never sent the second form, and it's not clear if it
would ever be useful.

When interacting with a remote Git repository, we generally discover
what _their_ object format is, and then decide what to do with a
mismatch (where that is currently just "bail out", but could eventually
be on-the-fly conversion and interop). And that is true for native
protocols, but also for transport helpers like remote-curl that talk to
remote Git repositories.  There we send back an ":object-format" line
telling Git what remote-curl detected on the other side.

And this is true even for pushes (since we get it via receive-pack's
advertisement). And it is even true for dumb-http, as we guess at the
algorithm based on the hash size, due to ac093d0790 (remote-curl: detect
algorithm for dumb HTTP by size, 2020-06-19).

The one case where it _isn't_ true is dumb-http talking to an empty
repository. There we have no clue what the remote hash is, so
remote-curl just sends back its default. If we kept the "object-format
<algo>" form then in theory Git could say "object-format sha256" to
change that default. But it doesn't really accomplish anything. We still
may or may not be mis-matched with the other side. For a fetch that's
OK, since it's by definition a noop. For a push into an empty
repository, it might matter (though the dumb http-push DAV code seems
happy to clobber a remote sha256 info/refs and corrupt the repository).
If we want to pursue making this work, I think we'd be better off
improving detection of the object format of empty repositories over
dumb-http (e.g., an "info/object-format" file).

But what about helpers that _aren't_ talking to another Git repo?
Consider something like git-cinnabar, which is converting on the fly
to/from hg. Most of the heavy lifting is done by fast-import/export, but
some oids may still pass between Git and the helper. Could
"object-format <algo>" be useful to tell the helper what oids we expect
to see?

Possibly, but in practice this isn't necessary. Git-cinnabar for example
already peeks at the local-repo .git/config to check its object-format
(and currently just bails if it is sha256).

So I think the "object-format" extension really is only useful for the
helper telling Git what object-format it found, and not the other way
around.

Note that this patch can't break any remote helpers; we're not changing
the code on the Git side at all, but just bringing the documentation in
line with what Git has always done. It does remove the receiving support
in remote-curl.c, but that code was never actually triggered.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-20 10:01:27 -07:00
cf7335f5b6 transport-helper: use write helpers more consistently
The transport-helper code provides some functions for writing to the
helper process, but there are a few spots that don't use them. We should
do so consistently because:

  1. They detect errors on write (though in practice this means the
     helper process went away, and we'd see the problem as soon as we
     try to read the response).

  2. They dump the written bytes to the GIT_TRANSPORT_HELPER_DEBUG
     stream. It's doubly confusing to miss some writes but not others,
     as you see a partial conversation.

The "list" ones go all the way back to the beginning of the transport
helper code; they were just missed when most writes were converted in
bf3c523c3f (Add remote helper debug mode, 2009-12-09). The nearby
"object-format" write presumably just cargo-culted them, as it's only a
few lines away.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-20 10:00:55 -07:00
9dc75d81b8 doc/gitremote-helpers: fix more missing single-quotes
There are a few cases left in gitremote-helpers.txt that are missing a
closing quote, so you end up with:

  'option deepen-since <timestamp>

with a stray opening quote instead of rendering correctly in italics.
These should have been part of 51d41dc243 (doc/gitremote-helpers: fix
missing single-quote, 2024-03-07), but apparently my eyesight is not
what it once was. Hopefully this is now all of them.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-20 09:53:09 -07:00
838ba014ce format-patch: simplify after-subject MIME header handling
In log_write_email_headers(), we append our MIME headers to the set of
extra headers by creating a new strbuf, adding the existing headers, and
then adding our new ones.  We had to do it this way when our output
buffer might point to the constant opt->extra_headers variable.

But since the previous commit, we always make a local copy of that
variable. Let's turn that into a strbuf, which lets the MIME code simply
append to it. That simplifies the function and avoids a pointless extra
copy of the headers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-19 17:54:16 -07:00
305a68143c format-patch: return an allocated string from log_write_email_headers()
When pretty-printing a commit in the email format, we have to fill in
the "after subject" field of the pretty_print_context with any extra
headers the user provided (e.g., from "--to" or "--cc" options) plus any
special MIME headers.

We return an out-pointer that sometimes points to a newly heap-allocated
string and sometimes not. To avoid leaking, we store the allocated
version in a buffer with static lifetime, which is ugly. Worse, as we
extend the header feature, we'll end up having to repeat this ugly
pattern.

Instead, let's have our out-pointer pass ownership back to the caller,
and duplicate the string when necessary. This does mean one extra
allocation per commit when you use extra headers, but in the context of
format-patch which is showing diffs, I don't think that's even
measurable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-19 17:54:16 -07:00
82363d9670 log: do not set up extra_headers for non-email formats
The commit pretty-printer code has an "after_subject" parameter which it
uses to insert extra headers into the email format. In show_log() we set
this by calling log_write_email_headers() if we are using an email
format, but otherwise default the variable to the rev_info.extra_headers
variable.

Since the pretty-printer code will ignore after_subject unless we are
using an email format, this default is pointless. We can just set
after_subject directly, eliminating an extra variable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-19 17:54:16 -07:00
d5a90d6319 pretty: drop print_email_subject flag
With one exception, the print_email_subject flag is set if and only if
the commit format is email based:

  - in make_cover_letter() we set it along with CMIT_FMT_EMAIL
    explicitly

  - in show_log(), we set it if cmit_fmt_is_mail() is true. That covers
    format-patch as well as "git log --format=email" (or mboxrd).

The one exception is "rev-list --format=email", which somewhat
nonsensically prints the author and date as email headers, but no
subject, like:

  $ git rev-list --format=email HEAD
  commit 64fc4c2cdd4db2645eaabb47aa4bac820b03cdba
  From: Jeff King <peff@peff.net>
  Date: Tue, 19 Mar 2024 19:39:26 -0400

  this is the subject

  this is the body

It's doubtful that this is a useful format at all (the "commit" lines
replace the "From" lines that would make it work as an actual mbox).
But I think that printing the subject as a header (like this patch does)
is the least surprising thing to do.

So let's drop this field, making the code a little simpler and easier to
reason about. Note that we do need to set the "rev" field of the
pretty_print_context in rev-list, since that is used to check for
subject_prefix, etc. It's not possible to set those fields via rev-list,
so we'll always just print "Subject: ". But unless we pass in our
rev_info, fmt_output_email_subject() would segfault trying to figure it
out.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-19 17:54:15 -07:00
69aff6200c pretty: split oneline and email subject printing
The pp_title_line() function is used for two formats: the oneline format
and the subject line of the email format. But most of the logic in the
function does not make any sense for oneline; it is about special
formatting of email headers.

Lumping the two formats together made sense long ago in 4234a76167
(Extend --pretty=oneline to cover the first paragraph, 2007-06-11), when
there was a lot of manual logic to paste lines together. But later,
88c44735ab (pretty: factor out format_subject(), 2008-12-27) pulled that
logic into its own function.

We can implement the oneline format by just calling that one function.
This makes the intention of the code much more clear, as we know we only
need to worry about those extra email options when dealing with actual
email.

While the intent here is cleanup, it is possible to trigger these cases
in practice by running format-patch with an explicit --oneline option.
But if you did, the results are basically nonsense. For example, with
the preserve_subject flag:

  $ printf "%s\n" one two three | git commit --allow-empty -F -
  $ git format-patch -1 --stdout -k | grep ^Subject
  Subject: =?UTF-8?q?one=0Atwo=0Athree?=
  $ git format-patch -1 --stdout -k --oneline --no-signature
  2af7fbe one
  two
  three

Or with extra headers:

  $ git format-patch -1 --stdout --cc=me --oneline --no-signature
  2af7fbe one two three
  Cc: me

So I'd actually consider this to be an improvement, though you are
probably crazy to use other formats with format-patch in the first place
(arguably it should forbid non-email formats entirely, but that's a
bigger change).

As a bonus, it eliminates some pointless extra allocations for the
oneline output. The email code, since it has to deal with wrapping,
formats into an extra auxiliary buffer. The speedup is tiny, though like
"rev-list --no-abbrev --format=oneline" seems to improve by a consistent
1-2% for me.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-19 17:54:15 -07:00
c7f6a534f0 shortlog: stop setting pp.print_email_subject
When shortlog processes a commit using its internal traversal, it may
pretty-print the subject line for the summary view. When we do so, we
set the "print_email_subject" flag in the pretty-print context. But this
flag does nothing! Since we are using CMIT_FMT_USERFORMAT, we skip most
of the usual formatting code entirely.

This flag is there due to commit 6d167fd7cc (pretty: use
fmt_output_email_subject(), 2017-03-01). But that just switched us away
from setting an empty "subject" header field, which was similarly
useless. That was added by dd2e794a21 (Refactor pretty_print_commit
arguments into a struct, 2009-10-19). Before using the struct, we had to
pass _something_ as the argument, so we passed the empty string (a NULL
would have worked equally well).

So this setting has never done anything, and we can drop the line. That
shortens the code, but more importantly, makes it easier to reason about
and refactor the other users of this flag.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-19 17:54:15 -07:00
5ea0176003 apply: parse names out of "diff --git" more carefully
"git apply" uses the pathname parsed out of the "diff --git" header
to decide which path is being patched, but this is used only when
there is no other names available in the patch.  When there is any
content change (like we can see in this patch, that modifies the
contents of "apply.c") or rename (which comes with "rename from" and
"rename to" extended diff headers), the names are available without
having to parse this header.

When we do need to parse this header, a special care needs to be
taken, as the name of a directory or a file can have a SP in it so
it is not like "find a space, and take everything before the space
and that is the preimage filename, everything after the space is the
postimage filename".  We have a loop that stops at every SP on the
"diff --git a/dir/file b/dir/foo" line and see if that SP is the
right place that separates such a pair of names.

Unfortunately, this loop can terminate prematurely when a crafted
directory name ended with a SP.  The next pathname component after
that SP (i.e. the beginning of the possible postimage filename) will
be a slash, and instead of rejecting that position as the valid
separation point between pre- and post-image filenames and keep
looping, we stopped processing right there.

The fix is simple.  Instead of stopping and giving up, keep going on
when we see such a condition.

Reported-by: Han Young <hanyang.tony@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-19 15:58:15 -07:00
667b545c62 Merge branch 'ps/reftable-stack-tempfile' into ps/pack-refs-auto
* ps/reftable-stack-tempfile:
  reftable/stack: register compacted tables as tempfiles
  reftable/stack: register lockfiles during compaction
  reftable/stack: register new tables as tempfiles
  lockfile: report when rollback fails
2024-03-18 13:24:32 -07:00
3bd955d269 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 13:04:25 -07:00
d2e4e26d13 Merge branch 'jk/doc-remote-helpers-markup-fix'
Doc mark-up fix.

* jk/doc-remote-helpers-markup-fix:
  doc/gitremote-helpers: fix missing single-quote
2024-03-18 13:04:25 -07:00
7f1e92643d Merge branch 'jh/trace2-missing-def-param-fix'
Some trace2 events that lacked def_param have learned to show it,
enriching the output.

Reviewed-by: Josh Steadmon <steadmon@google.com>
cf. <ZejkVOVQBZhLVfHW@google.com>

* jh/trace2-missing-def-param-fix:
  trace2: emit 'def_param' set with 'cmd_name' event
  trace2: avoid emitting 'def_param' set more than once
  t0211: demonstrate missing 'def_param' events for certain commands
2024-03-18 13:04:25 -07:00
184969ce1d Merge branch 'pw/rebase-i-ignore-cherry-pick-help-environment'
Code simplification by getting rid of code that sets an environment
variable that is no longer used.

* pw/rebase-i-ignore-cherry-pick-help-environment:
  rebase -i: stop setting GIT_CHERRY_PICK_HELP
2024-03-18 13:04:25 -07:00
bff85a338c docs: adjust trailer separator and key_value_separator language
The language describing the trailer separator and key-value separator
default value is overly complicated.

Indicate the default with simpler "Defaults to ..." language.

Suggested-by: Linus Arver <linusa@google.com>
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Acked-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 09:36:00 -07:00
cb85ed1eb4 docs: correct trailer key_value_separator description
The description for `key_value_separator` incorrectly states that this
separator is inserted between trailer lines, which appears likely to
have been incorrectly copied from `separator` when this option was
added.

Update the description to correctly indicate that it is a separator that
appears between the key and the value of each trailer.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Acked-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 09:35:49 -07:00
37ce97353c builtin/am: allow disabling conflict advice
When 'git am' or 'git rebase --apply' encounter a conflict, they show a
message instructing the user how to continue the operation. This message
can't be disabled.

Use ADVICE_MERGE_CONFLICT introduced in the previous commit to allow
disabling it. Update the tests accordingly, as the advice output is now
on stderr instead of stdout. In t4150, redirect stdout to 'out' and
stderr to 'err', since this is less confusing. In t4254, as we are
testing a specific failure mode of 'git am', simply disable the advice.
Note that we are not testing that this advice is shown in 'git rebase'
for the apply backend since 2ac0d6273f (rebase: change the default
backend from "am" to "merge", 2020-02-15).

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 09:28:42 -07:00
ec0300914b sequencer: allow disabling conflict advice
Allow disabling the advice shown when a squencer operation results in a
merge conflict through a new config 'advice.mergeConflict', which is
named generically such that it can be used by other commands eventually.

Remove that final '\n' in the first hunk in sequencer.c to avoid an
otherwise empty 'hint: ' line before the line 'hint: Disable this
message with "git config advice.mergeConflict false"' which is
automatically added by 'advise_if_enabled'.

Note that we use 'advise_if_enabled' for each message in the second hunk
in sequencer.c, instead of using 'if (show_hints &&
advice_enabled(...)', because the former instructs the user how to
disable the advice, which is more user-friendly.

Update the tests accordingly. Note that the body of the second test in
t3507-cherry-pick-conflict.sh is enclosed in double quotes, so we must
escape them in the added line. Note that t5520-pull.sh, which checks
that we display the advice for 'git rebase' (via 'git pull --rebase')
does not have to be updated because it only greps for a specific line in
the advice message.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 09:28:40 -07:00
30ff05094c t-prio-queue: check result array bounds
Avoid reading past the end of the "result" array, which could otherwise
happen if the prio-queue were to yield more items than were put into it
due to an implementation bug, or if the array has not enough entries due
to a test bug.

Also check at the end whether all "result" entries were consumed, which
would not be the case if the prio-queue forgot some entries or the test
definition contained too many.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 09:25:54 -07:00
e6f9cb76ea t-prio-queue: shorten array index message
If we get an unexpected result, the prio-queue unit test reports it like
this:

 # check "result[j++] == show(get)" failed at t/unit-tests/t-prio-queue.c:43
 #    left: 5
 #   right: 1
 # failed at result[] index 0

That last line repeats "failed" and "result" from the first line.
Shorten it to resemble a similar one in t-ctype and also remove the
incrementation from the first line to avoid possible distractions from
the message of which comparison went wrong where:

 # check "result[j] == show(get)" failed at t/unit-tests/t-prio-queue.c:43
 #    left: 5
 #   right: 1
 #       j: 0

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 09:24:50 -07:00
178401dc25 diff.*Prefix: use camelCase in the doc and test titles
We added documentation for diff.srcPrefix and diff.dstPrefix with
their names properly camelCased, but the diff.noPrefix is listed
there in all lowercase.  Also these configuration variables, both
existing ones and the {src,dst}Prefix we recently added, were
spelled in all lowercase in the tests in t4013.

Now we are done with the main change, clean these up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18 08:47:18 -07:00
c2a7536354 git-quiltimport: avoid an unnecessary subshell
Use braces for the compound command.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:57 -07:00
f70bc702e5 contrib/coverage-diff: avoid redundant pipelines
Merge multiple sed and "grep | awk" invocations, finally use "sort -u"
instead of "sort | uniq".

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:57 -07:00
babf0b89b3 t/t9*: merge "grep | sed" pipelines
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:57 -07:00
c7e7f68aad t/t8*: merge "grep | sed" pipelines
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:57 -07:00
37ea7c4875 t/t5*: merge a "grep | sed" pipeline
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:57 -07:00
f7caf1479e t/t4*: merge a "grep | sed" pipeline
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:57 -07:00
67dd07e8af t/t3*: merge a "grep | awk" pipeline
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
8a3c5ccc4d t/t1*: merge a "grep | sed" pipeline
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
af7dd8bd73 t/t9*: avoid redundant uses of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
a28a5ea909 t/t8*: avoid redundant use of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
6178c08ec7 t/t7*: avoid redundant use of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
edfa63e7f4 t/t6*: avoid redundant uses of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
f636d25dc4 t/t5*: avoid redundant uses of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
237ce762ef t/t4*: avoid redundant uses of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
2b5a303ad8 t/t3*: avoid redundant uses of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
2ed139ccc9 t/t1*: avoid redundant uses of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
74615c2a74 t/t0*: avoid redundant uses of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
108e18acc3 t/perf: avoid redundant use of cat
Take care to redirect stdin, otherwise the output of wc would also contain
the file name.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
47c0f24539 t/annotate-tests.sh: avoid redundant use of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
99eb825c09 t/lib-cvs.sh: avoid redundant use of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:56 -07:00
2fbd3ac8e6 contrib/subtree/t: avoid redundant use of cat
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:55 -07:00
938e891a9a doc: avoid redundant use of cat
The update-hook-example.txt script uses this anti-pattern twice. Call grep
with the input file name directy. While at it, merge the two consecutive
grep calls.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 11:08:55 -07:00
fe86a3474a Merge branch 'la/format-trailer-info' into la/hide-trailer-info
* la/format-trailer-info:
  trailer: finish formatting unification
  trailer: begin formatting unification
  format_trailer_info(): append newline for non-trailer lines
  format_trailer_info(): drop redundant unfold_value()
  format_trailer_info(): use trailer_item objects
2024-03-16 10:07:39 -07:00
67471bc704 doc: fix some placeholders formating
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 10:04:53 -07:00
0620ae0f5b doc: format alternatives in synopsis
This is a list of various fixes on malformed alternative in commands
and option syntax.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 10:04:45 -07:00
86f9ce7dd6 docs: fix typo in git-config --default
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 09:52:11 -07:00
7823a51203 docs: clarify file options in git-config --edit
The description for the `-e`/`--edit` option references scopes
inconsistently: system and global are referenced by their option name
(`--system`/`--global`), but repository (`--local` is not. Additionally,
neither `--worktree` nor `--file` are referenced at all, despite also
being a valid options.

Update the description to mention all four available scopes as well as
`--file`, referencing each consistently by their option name.

Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 09:52:10 -07:00
b3b57c69da bugreport.c: fix a crash in git bugreport with --no-suffix option
`git bugreport` does not complain when `--no-suffix` is given, but
it leads to a segmentation fault as the it is not prepared to see a
NULL assigned to the option_suffix variable.

Signed-off-by: Jiamu Sun <barroit@linux.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16 09:31:42 -07:00
31399a6b61 config: allow tweaking whitespace between value and comment
Extending the previous step, this allows the whitespace placed after
the value before the "# comment message" to be tweaked by tweaking
the preprocessing rule to:

 * If the given comment string begins with one or more whitespace
   characters followed by '#', it is passed intact.

 * If the given comment string begins with '#', a Space is
   prepended.

 * Otherwise, " # " (Space, '#', Space) is prefixed.

 * A string with LF in it cannot be used as a comment string.

Unlike the previous step, which unconditionally added a space after
the value before writing the "# comment string", because the above
preprocessing already gives a whitespace before the '#', the
resulting string is written immediately after copying the value.

And the sanity checking rule becomes

 * comment string after the above massaging that comes into
   git_config_set_multivar_in_file_gently() must

   - begin with zero or more whitespace characters followed by '#'.
   - not have a LF in it.

I personally think this is over-engineered, but since I thought
things through anyway, here it is in the patch form.  The logic to
tweak end-user supplied comment string is encapsulated in a new
helper function, git_config_prepare_comment_string(), so if new
front-end callers would want to use the same massaging rules, it is
easily reused.

Unfortunately I do not think of a way to tweak the preprocessing
rules further to optionally allow having no blank after the value,
i.e. to produce

	[section]
		variable = value#comment

(which is a valid way to say section.variable=value, by the way)
without sacrificing the ergonomics for the more usual case, so this
time I really stop here.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 16:07:39 -07:00
fbad334db9 config: fix --comment formatting
When git adds comments itself (like "rebase -i" todo list and
"commit -e" log message editor), it always gives a comment
introducer "#" followed by a Space before the message, except for
the recently introduced "git config --comment", where the users are
forced to say " this is my comment" if they want to add their
comment in this usual format; otherwise their comment string will
end up without a space after the "#".

Make it more ergonomic, while keeping it possible to also use this
unusual style, by massaging the comment string at the UI layer with
a set of simple rules:

 * If the given comment string begins with '#', it is passed intact.
 * Otherwise, "# " is prefixed.
 * A string with LF in it cannot be used as a comment string.

Right now there is only one "front-end" that accepts end-user
comment string and calls the underlying machinery to add or modify
configuration file with comments, but to make sure that the future
callers perform similar massaging as they see fit, add a sanity
check logic in git_config_set_multivar_in_file_gently(), which is
the single choke point in the codepaths that consumes the comment
string.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 16:07:37 -07:00
2953d95d40 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 16:06:00 -07:00
84ead08cc7 Merge branch 'hd/config-mak-os390'
Platform specific tweaks for OS/390 has been added to
config.mak.uname.

* hd/config-mak-os390:
  build: support z/OS (OS/390).
2024-03-15 16:06:00 -07:00
1c61dfa543 Merge branch 'vm/t7301-use-test-path-helpers'
GSoC practice to replace "test -f" with "test_path_is_file".

* vm/t7301-use-test-path-helpers:
  t7301: use test_path_is_(missing|file)
2024-03-15 16:06:00 -07:00
d4636aea6f Merge branch 'jc/xwrite-cleanup'
Uses of xwrite() helper have been audited and updated for better
error checking and simpler code.

* jc/xwrite-cleanup:
  repack: check error writing to pack-objects subprocess
  sideband: avoid short write(2)
  unpack: replace xwrite() loop with write_in_full()
2024-03-15 16:06:00 -07:00
06ac518981 Merge branch 'ag/t0010-modernize'
GSoC practice to modernize a test script.

* ag/t0010-modernize:
  tests: modernize the test script t0010-racy-git.sh
2024-03-15 16:06:00 -07:00
8e663afb95 Merge branch 'as/option-names-in-messages'
Error message updates.

* as/option-names-in-messages:
  revision.c: trivial fix to message
  builtin/clone.c: trivial fix of message
  builtin/remote.c: trivial fix of error message
  transport-helper.c: trivial fix of error message
2024-03-15 16:05:59 -07:00
b09a8839a4 Merge branch 'kh/branch-ref-syntax-advice'
When git refuses to create a branch because the proposed branch
name is not a valid refname, an advice message is given to refer
the user to exact naming rules.

* kh/branch-ref-syntax-advice:
  branch: advise about ref syntax rules
  advice: use double quotes for regular quoting
  advice: use backticks for verbatim
  advice: make all entries stylistically consistent
  t3200: improve test style
2024-03-15 16:05:59 -07:00
42d5c03394 config: add --comment option to add a comment
Introduce the ability to append comments to modifications
made using git-config. Example usage:

  git config --comment "changed via script" \
    --add safe.directory /home/alice/repo.git

based on the proposed patch, the output produced is:

  [safe]
    directory = /home/alice/repo.git #changed via script

Users need to be able to distinguish between config entries made
using automation and entries made by a human. Automation can add
comments containing a URL pointing to explanations for the change
made, avoiding questions from users as to why their config file
was changed by a third party.

The implementation ensures that a # character is unconditionally
prepended to the provided comment string, and that the comment
text is appended as a suffix to the changed key-value-pair in the
same line of text. Multi-line comments (i.e. comments containing
linefeed) are rejected as errors, causing Git to exit without
making changes.

Comments are aimed at humans who inspect or change their Git
config using a pager or editor. Comments are not meant to be
read or displayed by git-config at a later time.

Signed-off-by: Ralph Seichter <github@seichter.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 12:25:35 -07:00
fe2033b84f fuzz: add fuzzer for config parsing
Add a new fuzz target that exercises the parsing of git configs.
The existing git_config_from_mem function is a perfect entry point
for fuzzing as it exercises the same code paths as the rest of the
config parsing functions and offers an easily fuzzable interface.

Config parsing is a useful thing to fuzz because it operates on user
controlled data and is a central component of many git operations.

Signed-off-by: Brian C Tracy <brian.tracy33@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:47:05 -07:00
3452d17324 trailer: finish formatting unification
Rename format_trailer_info() to format_trailers(). Finally, both
interpret-trailers and format_trailers_from_commit() can call
"format_trailers()"!

Update the comment in <trailer.h> to remove the (now obsolete) caveats
about format_trailers_from_commit(). Those caveats come from
a388b10fc1 (pretty: move trailer formatting to trailer.c, 2017-08-15)
where it says:

    pretty: move trailer formatting to trailer.c

    The next commit will add many features to the %(trailer)
    placeholder in pretty.c. We'll need to access some internal
    functions of trailer.c for that, so our options are either:

      1. expose those functions publicly

    or

      2. make an entry point into trailer.c to do the formatting

    Doing (2) ends up exposing less surface area, though do note
    that caveats in the docstring of the new function.

which suggests format_trailers_from_commit() started out from pretty.c
and did not have access to all of the trailer implementation internals,
and was never intended to replace (unify) the formatting machinery in
trailer.c. The refactors leading up to this commit (as well as
additional refactors that will follow) expose additional functions
publicly, and is therefore choosing option (1) as described in
a388b10fc1.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:10:25 -07:00
676c1db76e trailer: begin formatting unification
Now that the preparatory refactors are over, we can replace the call to
format_trailers() in interpret-trailers with format_trailer_info(). This
unifies the trailer formatting machinery

In order to avoid breakages in t7502 and t7513, we have to steal the
features present in format_trailers(). Namely, we have to teach
format_trailer_info() as follows:

  (1) make it aware of opts->trim_empty, and

  (2) make it avoid hardcoding ": " as the separator and space (which
  can result in double-printing these characters).

For (2), make it only print the separator and space if we cannot find
any recognized separator somewhere in the key (yes, keys may have a
trailing separator in it --- we will eventually fix this design but not
now). Do so by copying the code out of print_tok_val(), and deleting the
same function.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:10:25 -07:00
9f0c9702de format_trailer_info(): append newline for non-trailer lines
This wraps up the preparatory refactors to unify the trailer formatters.

Two patches ago we made format_trailer_info() use trailer_item objects
instead of the "trailers" string array. The strings in the array
include trailing newlines, because the string array is split up with

    trailer_lines = strbuf_split_buf(str + trailer_block_start,
                                     end_of_log_message - trailer_block_start,
                                     '\n',
                                     0);

in trailer_info_get() and strbuf_split_buf() includes the terminator (in
this case the newline character '\n') for each split-up substring.

And before we made the transition to use trailer_item objects for it,
format_trailer_info() called parse_trailer() (which trims newlines) for
trailer lines but did _not_ call parse_trailer() for non-trailer lines.
So for trailer lines it had to add back the trimmed newline like this

    if (!opts->separator)
        strbuf_addch(out, '\n');

But for non-trailer lines it didn't have to add back the newline because
it could just reuse same string in the "trailers" string array (which
again, already included the trailing newline).

Now that format_trailer_info() uses trailer_item objects for all cases,
it can't rely on "trailers" string array anymore.  And so it must be
taught to add a newline back when printing non-trailer lines, just like
it already does for trailer lines. Do so now.

The test suite can pass again without the need to hide failures
with *_failure, so flip the affected test cases back to *_success. Now,
format_trailer_info() is in better shape to supersede format_trailers(),
which we'll do in the next commit.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:10:25 -07:00
41ea0a9002 format_trailer_info(): drop redundant unfold_value()
This is another preparatory refactor to unify the trailer formatters.

In the last patch we made format_trailer_info() use trailer_item objects
instead of the "trailers" string array. This means that the call to
unfold_value() here is redundant because the trailer_item objects are
already unfolded in parse_trailers() which is a dependency of our
caller, format_trailers_from_commit().

Remove the redundant call.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:10:24 -07:00
65b4ad82b8 format_trailer_info(): use trailer_item objects
This is another preparatory refactor to unify the trailer formatters.

Make format_trailer_info() operate on trailer_item objects, not the raw
string array.

We will continue to make improvements, culminating in the renaming of
format_trailer_info() to format_trailers(), at which point the
unification of these formatters will be complete.

In order to avoid breaking t4205 and t6300, flip *_success to *_failure
in the affected test cases. Add a temporary
"test_trailer_option_expect_failure" wrapper which we will use along
with "test_expect_failure" in the next commit to avoid breaking tests.
When the dust settles with the refactors a few more commits later, we
will drop the use of *_failure to make the tests truly pass again.

When the preparatory refactors are complete,
we'll be able to drop the use of *_failure that we introduce here.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:10:24 -07:00
ad538c61da t5300: fix test_with_bad_commit()
0f8edf7317 (index-pack: --fsck-objects to take an optional argument for
fsck msgs, 2024-02-01) added a test function test_with_bad_commit() that
contained two bugs. test_expect_fail was used instead of test_must_fail,
and a && was not included at the end of the line.

Fix these two issues in the test.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:08:30 -07:00
7fdc265633 diff: add diff.srcPrefix and diff.dstPrefix configuration variables
Allow the default prefixes "a/" and "b/" to be tweaked by the
diff.srcPrefix and diff.dstPrefix configuration variables.

Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15 10:04:45 -07:00
4f9b731bde The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14 14:05:25 -07:00
c5a7ee124d Merge branch 'rj/complete-worktree-paths-fix'
The logic to complete the command line arguments to "git worktree"
subcommand (in contrib/) has been updated to correctly honor things
like "git -C dir" etc.

* rj/complete-worktree-paths-fix:
  completion: fix __git_complete_worktree_paths
2024-03-14 14:05:25 -07:00
43100746e6 Merge branch 'rj/complete-reflog'
The command line completion script (in contrib/) learned to
complete "git reflog" better.

* rj/complete-reflog:
  completion: reflog subcommands and options
  completion: factor out __git_resolve_builtins
  completion: introduce __git_find_subcommand
  completion: reflog show <log-options>
  completion: reflog with implicit "show"
2024-03-14 14:05:24 -07:00
edae49e3c0 Merge branch 'jc/test-i18ngrep'
With release 2.44 we got rid of all uses of test_i18ngrep and there
is no in-flight topic that adds a new use of it.  Make a call to
test_i18ngrep a hard failure, so that we can remove it at the end
of this release cycle.

* jc/test-i18ngrep:
  test_i18ngrep: hard deprecate and forbid its use
2024-03-14 14:05:24 -07:00
272fd9125a Merge branch 'gt/core-bare-in-templates'
Code simplification.

* gt/core-bare-in-templates:
  setup: remove unnecessary variable
2024-03-14 14:05:24 -07:00
4fecb94887 Merge branch 'la/trailer-api'
Trailer API updates.

Acked-by: Christian Couder <christian.couder@gmail.com>
cf. <CAP8UFD1Zd+9q0z1JmfOf60S2vn5-sD3SafDvAJUzRFwHJKcb8A@mail.gmail.com>

* la/trailer-api:
  format_trailers_from_commit(): indirectly call trailer_info_get()
  format_trailer_info(): move "fast path" to caller
  format_trailers(): use strbuf instead of FILE
  trailer_info_get(): reorder parameters
  trailer: move interpret_trailers() to interpret-trailers.c
  trailer: reorder format_trailers_from_commit() parameters
  trailer: rename functions to use 'trailer'
  shortlog: add test for de-duplicating folded trailers
  trailer: free trailer_info _after_ all related usage
2024-03-14 14:05:24 -07:00
26ab20ccb2 Merge branch 'kh/doc-commentchar-is-a-byte'
The "core.commentChar" configuration variable only allows an ASCII
character, which was not clearly documented, which has been
corrected.

* kh/doc-commentchar-is-a-byte:
  config: document `core.commentChar` as ASCII-only
2024-03-14 14:05:24 -07:00
720c1129c4 Merge branch 'jh/fsmonitor-icase-corner-case-fix'
FSMonitor client code was confused when FSEvents were given in a
different case on a case-insensitive filesystem, which has been
corrected.

Acked-by: Patrick Steinhardt <ps@pks.im>
cf. <ZehofMaSZyUq8S1N@tanuki>

* jh/fsmonitor-icase-corner-case-fix:
  fsmonitor: support case-insensitive events
  fsmonitor: refactor bit invalidation in refresh callback
  fsmonitor: trace the new invalidated cache-entry count
  fsmonitor: return invalidated cache-entry count on non-directory event
  fsmonitor: remove custom loop from non-directory path handler
  fsmonitor: return invalidated cache-entry count on directory event
  fsmonitor: move untracked-cache invalidation into helper functions
  fsmonitor: refactor untracked-cache invalidation
  dir: create untracked_cache_invalidate_trimmed_path()
  fsmonitor: refactor refresh callback for non-directory events
  fsmonitor: clarify handling of directory events in callback helper
  fsmonitor: refactor refresh callback on directory events
  t7527: add case-insensitve test for FSMonitor
  name-hash: add index_dir_find()
2024-03-14 14:05:23 -07:00
448a74e151 Merge branch 'ps/reftable-iteration-perf-part2'
The code to iterate over refs with the reftable backend has seen
some optimization.

* ps/reftable-iteration-perf-part2:
  refs/reftable: precompute prefix length
  reftable: allow inlining of a few functions
  reftable/record: decode keys in place
  reftable/record: reuse refname when copying
  reftable/record: reuse refname when decoding
  reftable/merged: avoid duplicate pqueue emptiness check
  reftable/merged: circumvent pqueue with single subiter
  reftable/merged: handle subiter cleanup on close only
  reftable/merged: remove unnecessary null check for subiters
  reftable/merged: make subiters own their records
  reftable/merged: advance subiter on subsequent iteration
  reftable/merged: make `merged_iter` structure private
  reftable/pq: use `size_t` to track iterator index
2024-03-14 14:05:23 -07:00
066124da88 Merge branch 'so/clean-dry-run-without-force'
The implementation in "git clean" that makes "-n" and "-i" ignore
clean.requireForce has been simplified, together with the
documentation.

* so/clean-dry-run-without-force:
  clean: further clean-up of implementation around "--force"
  clean: improve -n and -f implementation and documentation
2024-03-14 14:05:23 -07:00
2f64da0790 checkout: plug some leaks in git-restore
In git-restore we need to free the pathspec and pathspec_from_file
values from the struct checkout_opts.

A simple fix could be to free them in cmd_restore, after the call to
checkout_main returns, like we are doing [1][2] in the sibling function
cmd_checkout.

However, we can do even better.

We have git-switch and git-restore, both of them spin-offs[3][4] of
git-checkout.  All three are implemented as thin wrappers around
checkout_main.  Considering this, it makes a lot of sense to do the
cleanup closer to checkout_main.

Move the cleanups, including the new_branch_info variable, to
checkout_main.

As a consequence, mark: t2070, t2071, t2072 and t6418 as leak-free.

 [1] 9081a421a6 (checkout: fix "branch info" memory leaks, 2021-11-16)

 [2] 7ce4088ab7 (parse-options: consistently allocate memory in
     fix_filename(), 2023-03-04)

 [3] d787d311db (checkout: split part of it to new command 'switch',
     2019-03-29)

 [4] 46e91b663b (checkout: split part of it to new command 'restore',
     2019-04-25)

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14 11:58:04 -07:00
5a99c1ac1a checkout: fix interaction between --conflict and --merge
When using "git checkout" to recreate merge conflicts or merge
uncommitted changes when switching branch "--conflict" sensibly implies
"--merge". Unfortunately the way this is implemented means that "git
checkout --conflict=diff3 --no-merge" implies "--merge" violating the
usual last-one-wins rule. Fix this by only overriding the value of
opts->merge if "--conflicts" comes after "--no-merge" or "-[-no]-merge"
is not given on the command line.

The behavior of "git checkout --merge --no-conflict" is unchanged and
will still merge on the basis that the "-[-no]-conflict" options are
primarily intended to affect the conflict style and so "--no-conflict"
should cancel a previous "--conflict" but not override "--merge".

Of the four new tests the second one tests the behavior change
introduced by this commit, the other three check that this commit does
not regress the existing behavior.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14 10:08:53 -07:00
dbeaf8e8c0 checkout: cleanup --conflict=<style> parsing
Passing an invalid conflict style name such as "--conflict=bad" gives
the error message

    error: unknown style 'bad' given for 'merge.conflictstyle'

which is unfortunate as it talks about a config setting rather than
the option given on the command line. This happens because the
implementation calls git_xmerge_config() to set the conflict style
using the value given on the command line. Use the newly added
parse_conflict_style_name() instead and pass the value down the call
chain to override the config setting. This also means we can avoid
setting up a struct config_context required for calling
git_xmerge_config().

The option is now parsed in a callback to avoid having to store the
option name. This is a change in behavior as now

    git checkout --conflict=bad --conflict=diff3

will error out when parsing "--conflict=bad" whereas before this change
it would succeed because it would only try to parse the value of the
last "--conflict" option given on the command line.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14 10:08:53 -07:00
135cc712c3 merge options: add a conflict style member
Add a conflict_style member to `struct merge_options` and `struct
ll_merge_options` to allow callers to override the default conflict
style. This will be used in the next commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14 10:08:52 -07:00
412aff7b33 merge-ll: introduce LL_MERGE_OPTIONS_INIT
Introduce a macro to initialize `struct ll_merge_options` in preparation
for the next commit that will add a new member that needs to be
initialized to a non-zero value.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14 10:08:52 -07:00
7457014be5 xdiff-interface: refactor parsing of merge.conflictstyle
Factor out the code that parses of conflict style name so it can be
reused in a later commit that wants to parse the name given on the
command line.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14 10:08:52 -07:00
e4e9d5fa97 t0006: add more tests with a negative TZ offset
This test doesn't systematically check a negative timezone offset. Add a
test for each format that outputs the offset to improve our test
coverage.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
2024-03-14 09:54:31 -07:00
69e2bee1a3 date: make "iso-strict" conforming for the UTC timezone
ISO 8601-1:2020-12 specifies that a zero timezone offset must be denoted
with a "Z" suffix instead of the numeric "+00:00". Add the correponding
special case to show_date() and a new test.

Changing an established output format which might be depended on by
scripts is always problematic, but here we choose to adhere more closely
to the published standard.

Reported-by: Michael Osipov <michael.osipov@innomotics.com>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-13 16:06:52 -07:00
f66e1a071b status: allow --untracked=false and friends
It is natural to expect that the "--untracked" option and the
status.showuntrackedFiles configuration variable to take a Boolean
value ("do you want me to show untracked files?"), but the current
code takes nothing but "no" as "no, please do not show any".

Allow the usual Boolean values to be given, and treat 'true' as
"normal", and 'false' as "no".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-13 10:43:32 -07:00
63acdc4827 status: unify parsing of --untracked= and status.showUntrackedFiles
There are two code paths that take a string and parse it to enum
untracked_status_type.  Introduce a helper function and use it.

As these two places handle an error differently, add an additional
invalid value to the enum, and have the caller of the helper handle
the error condition, instead of dying or emitting error message from
the helper.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-13 10:43:32 -07:00
71ccda7e6c doc: status.showUntrackedFiles does not take "false"
The `status.showUntrackedFiles` config option only accepts the
values "no", "normal" or "all", but not as this part of the man page
suggested "false".  While we are at it, camel-case the name of the
variable.

Signed-off-by: Jonas Wunderlich <git@03j.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-13 09:14:46 -07:00
0eab85b90f t5601: exercise clones with "includeIf.*.onbranch"
It was reported that git-clone(1) started to fail in Git v2.44 when
cloning via HTTPS when the config contains an "includeIf.*.onbranch"
condition:

    $ git clone https://example.com/repo.git
    Cloning into 'repo'...
    BUG: refs.c:2083: reference backend is unknown
    error: git-remote-https died of signal 6

This regression was bisected to 0fcc285c5e (refs: refactor logic to look
up storage backends, 2023-12-29). This commit tightens the logic to look
up ref backends such that we now die when the backend has not yet been
detected by reading the gitconfig.

Now on its own, this commit wouldn't have caused the failure. But in
18c9cb7524 (builtin/clone: create the refdb with the correct object
format, 2023-12-12) we have also changed how git-clone(1) initializes
the refdb such that it happens after the remote helper is spawned, which
is required so that we can first learn about the object format used by
the remote repository before initializing the refdb. Starting with this
change, the remote helper will be unable to detect the repository right
from the start and thus have an unconfigured ref backend. Consequently,
when we try to resolve the "includeIf.*.onbranch" condition, we will now
fail to look up the refdb and die.

This regression has already been fixed via 199f44cb2e (builtin/clone:
allow remote helpers to detect repo, 2024-02-27), where we now
pre-initialize a partial refdb so that the remote helper can detect the
repository right from the start. But it's clear that we're lacking test
coverage of this functionality.

Add a test to avoid regressing in the future. Note that this test stops
short of defining the desired behaviour for the "onbranch" condition
during a clone. It's not quite clear how exactly it should behave, so
this is a leftover bit for the future.

Reported-by: Angelo Dureghello <angelo@kernel-space.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:34:00 -07:00
28636d797f Documentation/user-manual.txt: example for generating object hashes
Add a simple example on how object hashes can be generated manually.

Further, because the document suggests to have a look at the initial
commit, clarify that some details changed since that time.

Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:32:11 -07:00
8b311478ad config: allow multi-byte core.commentChar
Now that all of the code handles multi-byte comment characters, it's
safe to allow users to set them.

There is one special case I kept: we still will not allow an empty
string for the commentChar. While it might make sense in some contexts
(e.g., output where you don't want any comment prefix), there are plenty
where it will behave badly (e.g., all of our starts_with() checks will
indicate that every line is a comment!). It might be reasonable to
assign some meaningful semantics, but it would probably involve checking
how each site behaves. In the interim let's forbid it and we can loosen
things later.

Likewise, the "commentChar cannot be a newline" rule is now extended to
"it cannot contain a newline" (for the same reason: it can confuse our
parsing loops).

Since comment_line_str is used in many parts of the code, it's hard to
cover all possibilities with tests. We can convert the existing
double-semicolon prefix test to show that "git status" works. And we'll
give it a more challenging case in t7507, where we confirm that
git-commit strips out the commit template along with any --verbose text
when reading the edited commit message back in. That covers the basics,
though it's possible there could be issues in more exotic spots (e.g.,
the sequencer todo list uses its own code).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:11 -07:00
103d563f37 environment: drop comment_line_char compatibility macro
There is no longer any code which references the single-byte
comment_line_char. Let's drop it, clearing the way for true multi-byte
entries in comment_line_str.

It's possible there are topics in flight that have added new references
to comment_line_char. But we would prefer to fail compilation (and then
fix it) upon merging with this, rather than have them quietly ignore the
bytes after the first.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
78275b08e3 wt-status: drop custom comment-char stringification
In wt_longstatus_print_tracking() we may conditionally show a comment
prefix based on the wt_status->display_comment_prefix flag. We handle
that by creating a local "comment_line_string" that is either the empty
string or the comment character followed by a space.

For a single-byte comment, the maximum length of this string is 2 (plus
a NUL byte). But to handle multi-byte comment characters, it can be
arbitrarily large. One way to handle this is to just call
xstrfmt("%s ", comment_line_str), and then free it when we're done.

But we can simplify things further by just conditionally switching
between our prefix string and an empty string when formatting. We
couldn't just do that with the previous code, because the comment
character was a single byte. There's no way to have a "%c" format switch
between some character and "no character at all". Whereas with "%s" you
can switch between some string and the empty string. So now that we have
a comment string and not a comment char, we can just use it directly
when formatting. Do note that we have to also conditionally add the
trailing space at the same time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
7eb35e07c6 sequencer: handle multi-byte comment characters when writing todo list
We already match multi-byte comment characters in parse_insn_line(),
thanks to the previous commit, yielding a TODO_COMMENT entry. But in
todo_list_to_strbuf(), we may call command_to_char() to convert that
back into something we can output.

We can't just return comment_line_char anymore, since it may require
multiple bytes. Instead, we'll return "0" for this case, which is the
same thing we'd return for a command which does not have a single-letter
abbreviation (e.g., "revert" or "noop"). There is only a single caller
of command_to_char(), and upon seeing "0" it falls back to outputting
the full name via command_to_string(). So we can handle TODO_COMMENT
there, returning the full string.

Note that there are many other callers of command_to_string(), which
will now behave differently if they pass TODO_COMMENT. But we would not
expect that to happen; prior to this commit, the function just calls
die() in this case. And looking at those callers, that makes sense;
e.g., do_pick_commit() will only be called when servicing a pick
command, and should never be called for a comment in the first place.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
2ec225d397 find multi-byte comment chars in unterminated buffers
As with the previous patch, we need to swap out single-byte matching for
something like starts_with() to match all bytes of a multi-byte comment
character. But for cases where the buffer is not NUL-terminated (and we
instead have an explicit size or end pointer), it's not safe to use
starts_with(), as it might walk off the end of the buffer.

Let's introduce a new starts_with_mem() that does the same thing but
also accepts the length of the "haystack" str and makes sure not to walk
past it.

Note that in most cases the existing code did not need a length check at
all, since it was written in a way that knew we had at least one byte
available (and that was all we checked). So I had to read each one to
find the appropriate bounds. The one exception is sequencer.c's
add_commented_lines(), where we can actually get rid of the length
check. Just like starts_with(), our starts_with_mem() handles an empty
haystack variable by not matching (assuming a non-empty prefix).

A few notes on the implementation of starts_with_mem():

  - it would be equally correct to take an "end" pointer (and indeed,
    many of the callers have this and have to subtract to come up with
    the length). I think taking a ptr/size combo is a more usual
    interface for our codebase, though, and has the added benefit that
    the function signature makes it harder to mix up the three
    parameters.

  - we could obviously build starts_with() on top of this by passing
    strlen(str) as the length. But it's possible that starts_with() is a
    relatively hot code path, and it should not pay that penalty (it can
    generally return an answer proportional to the size of the prefix,
    not the whole string).

  - it naively feels like xstrncmpz() should be able to do the same
    thing, but that's not quite true. If you pass the length of the
    haystack buffer, then strncmp() finds that a shorter prefix string
    is "less than" than the haystack, even if the haystack starts with
    the prefix. If you pass the length of the prefix, then you risk
    reading past the end of the haystack if it is shorter than the
    prefix. So I think we really do need a new function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
600559b716 find multi-byte comment chars in NUL-terminated strings
Several parts of the code need to identify lines that begin with the
comment character, and do so with a simple byte equality check. As part
of the transition to handling multi-byte characters, we need to match
all of the bytes. For cases where we are looking in a NUL-terminated
string, we can just use starts_with(), which checks all of the
characters in comment_line_str.

Note that we can drop the "line.len" check in wt-status.c's
read_rebase_todolist(). The starts_with() function handles the case of
an empty haystack buffer (it will always return false for a non-empty
prefix).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
f99e1d94f5 prefer comment_line_str to comment_line_char for printing
As part of our transition to multi-byte comment characters, we should
use the string variable rather than the historical character variable.
All of the sites adjusted here are just swapping out "%c" for "%s" in
format strings, or strbuf_addch() for strbuf_addstr(). The type system
and printf-attribute give the compiler enough information to make sure
our formats and variable changes all match (especially important for
cases where the format string is defined far away from its use, like
prepare_to_commit() in commit.c).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
a1bb146aaf strbuf: accept a comment string for strbuf_add_commented_lines()
As part of our transition to multi-byte comment characters, let's take a
NUL-terminated string pointer for strbuf_add_commented_lines() rather
than a single character.

All of the callers have to be adjusted; most can just pass
comment_line_str rather than comment_line_char.

And now our "cheat" in strbuf_commented_addf() can go away, as we can
take the full string from it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
3a35d96284 strbuf: accept a comment string for strbuf_commented_addf()
As part of our transition to multi-byte comment characters, let's take a
NUL-terminated string pointer for strbuf_commented_addf() rather than a
single character.

All of the callers have to be adjusted, but they can just pass
comment_line_str rather than comment_line_char.

Note that we rely on strbuf_add_commented_lines() under the hood, so
we'll cheat a bit to squeeze our string into a single character (for now
the two are equivalent, and we'll address this TODO in the next patch).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
2982b65690 strbuf: accept a comment string for strbuf_stripspace()
As part of our transition to multi-byte comment characters, let's take a
NUL-terminated string pointer for strbuf_stripspace(), rather than a
single character. We can continue to support its feature of ignoring
comments by accepting a NULL pointer (as opposed to the current behavior
of a NUL byte).

All of the callers have to be adjusted, but they can all just pass
comment_line_str (or NULL).

Inside the function we detect comments by comparing the first byte of a
line to the comment character. We'll adjust that to use starts_with(),
which will match multiple bytes (though for now, of course, we still
only allow a single byte, so it's academic).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
72a7d5d97f environment: store comment_line_char as a string
We'd like to eventually support multi-byte comment prefixes, but the
comment_line_char variable is referenced in many spots, making the
transition difficult.

Let's start by storing the character in a NUL-terminated string. That
will let us switch code over incrementally to the string format, and we
can easily support the existing code with a macro wrapper (since we'll
continue to allow only a single-byte prefix, this will behave
identically).

Once all references to the "char" variable have been converted, we can
drop it and enable longer strings.

We'll still have to touch all of the spots that create or set the
variable in this patch, but there are only a few (reading the config,
and the "auto" character selector).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
2786d058b6 strbuf: avoid shadowing global comment_line_char name
Several comment-related strbuf functions take a comment_line_char
parameter. There's also a global comment_line_char variable, which is
closely related (most callers pass it in as this parameter). Let's avoid
shadowing the global name. This makes it more obvious that we're not
using the global value, and it will be especially helpful as we refactor
the global in future patches (in particular, any macro trickery wouldn't
work because the preprocessor doesn't respect scope).

We'll use "comment_prefix". That should be descriptive enough, and as a
bonus is more neutral with respect to the "char" type (since we'll
eventually swap it out for a string).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
1751e581a3 commit: refactor base-case of adjust_comment_line_char()
When core.commentChar is set to "auto", we check a set of candidate
characters against the proposed buffer to see which if any can be used
without ambiguity. But before we do that, we optimize for the common
case that the default "#" is fine by just seeing if it is present in the
buffer at all.

The way we do this is a bit subtle, though: we assign the candidate
character to comment_line_char preemptively, then check if it works, and
return if it does. The subtle part is that sometimes setting
comment_line_char is important (after we return, the important outcome
is the fact that we have set the variable) and sometimes it is useless
(if our optimization fails, we go on to do the more careful checks and
eventually assign something else instead).

To make it more clear what is happening (and to make further refactoring
of comment_line_char easier), let's check our candidate character
directly, and then assign as part of returning if it worked out.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:09 -07:00
3b45450db6 strbuf: avoid static variables in strbuf_add_commented_lines()
In strbuf_add_commented_lines(), we have to convert the single-byte
comment_line_char into a string to pass to add_lines(). We cache the
created string using a static-local variable. But this makes the
function non-reentrant, and it's doubtful that this provides any real
performance benefit given that we know the string always contains a
single character.

So let's just create it from scratch each time, and to give the compiler
the maximal opportunity to make it fast we'll ditch the over-complicated
xsnprintf() and just assign directly into the array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:09 -07:00
db7f93093f strbuf: simplify comment-handling in add_lines() helper
In strbuf_add_commented_lines(), we prepare two strings with potential
prefixes: one with just the comment char, and one with an additional
space. In the add_lines() helper, we use the one without the extra space
for blank lines or lines starting with a tab.

While passing in two separate prefixes to the helper is very flexible,
it's more flexibility than we actually use (or are likely to use, since
the rules inside add_lines() only make sense if "prefix2" is a variant
of "prefix1" without the extra space). And setting up the two strings
makes refactoring in strbuf_add_commented_lines() awkward.

Instead, let's pass in a single string, and just let add_lines() add the
extra space to the result as appropriate.

We do still need to pass in a flag to trigger this behavior. The helper
is shared by strbuf_add_lines(), which passes in a NULL "prefix2" to
inhibit this extra handling.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:09 -07:00
727565ef15 config: forbid newline as core.commentChar
Since we usually look for a comment char while parsing line-oriented
files, setting core.commentChar to a single newline can confuse our code
quite a bit. For example, using it with "git commit" causes us to fail
to recognize any of the template as comments, including it in the config
message. Which kind of makes sense, since the template content is on its
own line (so no line can "start" with a newline). In other spots I would
not be surprised if you can create more mischief (e.g., violating loop
assumptions) but I didn't dig into it.

Since comment characters are a local preference, to some degree this is
a case of "if it hurts, don't do it". But given that this would be a
silly and pointless thing to do, and that it makes it harder to reason
about code parsing comment lines, let's just forbid it.

There are other cases that are perhaps questionable (e.g., setting the
comment char to a single space), but they seem to behave reasonably (at
least a simple "git commit" will correctly identify and strip the
template lines). So I haven't worried about going on a hunt for every
stupid thing a user might do to themselves, and just focused on the most
confusing case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:09 -07:00
945115026a The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 14:12:31 -07:00
0aa44f0a3c Merge branch 'sj/t9117-path-is-file'
GSoC practice to replace "test -f" with "test_path_is_file".

* sj/t9117-path-is-file:
  t9117: prefer test_path_* helper functions
2024-03-11 14:12:31 -07:00
5b6262b193 Merge branch 'kh/doc-dashed-commands-have-not-worked-for-a-long-time'
Doc update.

* kh/doc-dashed-commands-have-not-worked-for-a-long-time:
  gitcli: drop mention of “non-dashed form”
2024-03-11 14:12:31 -07:00
572bf49341 Merge branch 'rs/t-ctype-simplify'
Code simplification to one unit-test program.

* rs/t-ctype-simplify:
  t-ctype: avoid duplicating class names
  t-ctype: align output of i
  t-ctype: simplify EOF check
  t-ctype: allow NUL anywhere in the specification string
2024-03-11 14:12:31 -07:00
ef7e896eca Merge branch 'es/config-doc-sort-sections'
Doc updates.

* es/config-doc-sort-sections:
  docs: sort configuration variable groupings alphabetically
2024-03-11 14:12:30 -07:00
7745f92507 Merge branch 'js/merge-base-with-missing-commit'
Make sure failure return from merge_bases_many() is properly caught.

* js/merge-base-with-missing-commit:
  merge-ort/merge-recursive: do report errors in `merge_submodule()`
  merge-recursive: prepare for `merge_submodule()` to report errors
  commit-reach(repo_get_merge_bases_many_dirty): pass on errors
  commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
  commit-reach(get_octopus_merge_bases): pass on "missing commits" errors
  commit-reach(repo_get_merge_bases): pass on "missing commits" errors
  commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
  commit-reach(merge_bases_many): pass on "missing commits" errors
  commit-reach(paint_down_to_common): start reporting errors
  commit-reach(paint_down_to_common): prepare for handling shallow commits
  commit-reach(repo_in_merge_bases_many): report missing commits
  commit-reach(repo_in_merge_bases_many): optionally expect missing commits
  commit-reach(paint_down_to_common): plug two memory leaks
2024-03-11 14:12:30 -07:00
30b7c4bdca setup: notice more types of implicit bare repositories
Setting the safe.bareRepository configuration variable to explicit
stops git from using a bare repository, unless the repository is
explicitly specified, either by the "--git-dir=<path>" command line
option, or by exporting $GIT_DIR environment variable.  This may be
a reasonable measure to safeguard users from accidentally straying
into a bare repository in unexpected places, but often gets in the
way of users who need valid accesses to the repository.

Earlier, 45bb9162 (setup: allow cwd=.git w/ bareRepository=explicit,
2024-01-20) loosened the rule such that being inside the ".git"
directory of a non-bare repository does not really count as
accessing a "bare" repository.  The reason why such a loosening is
needed is because often hooks and third-party tools run from within
$GIT_DIR while working with a non-bare repository.

More importantly, the reason why this is safe is because a directory
whose contents look like that of a "bare" repository cannot be a
bare repository that came embedded within a checkout of a malicious
project, as long as its directory name is ".git", because ".git" is
not a name allowed for a directory in payload.

There are at least two other cases where tools have to work in a
bare-repository looking directory that is not an embedded bare
repository, and accesses to them are still not allowed by the recent
change.

 - A secondary worktree (whose name is $name) has its $GIT_DIR
   inside "worktrees/$name/" subdirectory of the $GIT_DIR of the
   primary worktree of the same repository.

 - A submodule worktree (whose name is $name) has its $GIT_DIR
   inside "modules/$name/" subdirectory of the $GIT_DIR of its
   superproject.

As long as the primary worktree or the superproject in these cases
are not bare, the pathname of these "looks like bare but not really"
directories will have "/.git/worktrees/" and "/.git/modules/" as a
substring in its leading part, and we can take advantage of the same
security guarantee allow git to work from these places.

Extend the earlier "in a directory called '.git' we are OK" logic
used for the primary worktree to also cover the secondary worktree's
and non-embedded submodule's $GIT_DIR, by moving the logic to a
helper function "is_implicit_bare_repo()".  We deliberately exclude
secondary worktrees and submodules of a bare repository, as these
are exactly what safe.bareRepository=explicit setting is designed to
forbid accesses to without an explicit GIT_DIR/--git-dir=<path>

Helped-by: Kyle Lippincott <spectral@google.com>
Helped-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 13:51:36 -07:00
e1aaf309db ci(github): make Windows test artifacts name unique
If several jobs in the windows-test or vs-test matrices fail, the
upload-artifact action in each job tries to upload the test directories
of the failed tests as "failed-tests-windows.zip", which fails for all
jobs except the one which finishes first with the following error:

    Error: Failed to CreateArtifact: Received non-retryable error:
    Failed request: (409) Conflict: an artifact with this name
    already exists on the workflow run

Make the artifacts name unique by using the 'matrix.nr' token, and
disambiguate the vs-test artifacts from the windows-test ones.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 10:13:03 -07:00
45d5ed3e50 doc: git-clone: format placeholders
With the new formatting rules, we use _<placeholders>_.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 09:58:11 -07:00
d3717e1e9c doc: git-clone: format verbatim words
We also apply the formatting to urls.txt which is included.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 09:58:11 -07:00
e7b3a7683c doc: git-init: rework config item init.templateDir
When included into a the manpage of git-init, the param section must
not refer to the manpage.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 09:58:11 -07:00
aa804b7a4c doc: git-init: rework definition lists
In all cases of option description, each option is in its own
term. Use the same format here.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 09:58:11 -07:00
cb8ae0442a doc: git-init: format placeholders
With the new doc format conventions, we use _<placeholders>_.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 09:58:11 -07:00
2261d81490 doc: git-init: format verbatim parts
Verbatim parts are all formatted as `fixed font`.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11 09:58:11 -07:00
25fd20eb44 merge-ort/merge-recursive: do report errors in merge_submodule()
In 24876ebf68 (commit-reach(repo_in_merge_bases_many): report missing
commits, 2024-02-28), I taught `merge_submodule()` to handle errors
reported by `repo_in_merge_bases_many()`.

However, those errors were not passed through to the callers. That was
unintentional, and this commit remedies that.

Note that `find_first_merges()` can now also return -1 (because it
passes through that return value from `repo_in_merge_bases()`), and this
commit also adds the forgotten handling for that scenario.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-09 09:57:16 -08:00
81a34cbb2e merge-recursive: prepare for merge_submodule() to report errors
The `merge_submodule()` function returns an integer that indicates
whether the merge was clean (returning 1) or unclean (returning 0).

Like the version in `merge-ort.c`, the version in `merge-recursive.c`
does not report any errors (such as repository corruption) by returning
-1 as of time of writing, even if the callers in `merge-ort.c` are
prepared for exactly such errors.

However, we want to teach (both variants of) the `merge_submodule()`
function that trick: to report errors by returning -1. Therefore,
prepare the caller in `merge-recursive.c` to handle that scenario.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-09 09:57:05 -08:00
e09f1254c5 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 15:59:42 -08:00
ce65a188b1 Merge branch 'ps/remote-helper-repo-initialization-fix'
A custom remote helper no longer cannot access the newly created
repository during "git clone", which is a regression in Git 2.44.
This has been corrected.

* ps/remote-helper-repo-initialization-fix:
  builtin/clone: allow remote helpers to detect repo
2024-03-07 15:59:42 -08:00
a82fa7bce8 Merge branch 'jk/upload-pack-v2-capability-cleanup'
The upload-pack program, when talking over v2, accepted the
packfile-uris protocol extension from the client, even if it did
not advertise the capability, which has been corrected.

* jk/upload-pack-v2-capability-cleanup:
  upload-pack: only accept packfile-uris if we advertised it
  upload-pack: use existing config mechanism for advertisement
  upload-pack: centralize setup of sideband-all config
  upload-pack: use repository struct to get config
2024-03-07 15:59:42 -08:00
56d6084560 Merge branch 'jk/upload-pack-bounded-resources'
Various parts of upload-pack has been updated to bound the resource
consumption relative to the size of the repository to protect from
abusive clients.

* jk/upload-pack-bounded-resources:
  upload-pack: free tree buffers after parsing
  upload-pack: use PARSE_OBJECT_SKIP_HASH_CHECK in more places
  upload-pack: always turn off save_commit_buffer
  upload-pack: disallow object-info capability by default
  upload-pack: accept only a single packfile-uri line
  upload-pack: use a strmap for want-ref lines
  upload-pack: use oidset for deepen_not list
  upload-pack: switch deepen-not list to an oid_array
  upload-pack: drop separate v2 "haves" array
2024-03-07 15:59:42 -08:00
963a277a52 Merge branch 'ps/reftable-repo-init-fix'
Clear the fallout from a fix for 2.44 regression.

* ps/reftable-repo-init-fix:
  t0610: remove unused variable assignment
  refs/reftable: don't fail empty transactions in repo without HEAD
2024-03-07 15:59:42 -08:00
6a887bdd92 Merge branch 'ml/log-merge-with-cherry-pick-and-other-pseudo-heads'
"git log --merge" learned to pay attention to CHERRY_PICK_HEAD and
other kinds of *_HEAD pseudorefs.

* ml/log-merge-with-cherry-pick-and-other-pseudo-heads:
  revision: implement `git log --merge` also for rebase/cherry-pick/revert
  revision: ensure MERGE_HEAD is a ref in prepare_show_merge
2024-03-07 15:59:41 -08:00
f46a3f143e Merge branch 'eg/add-uflags'
Code clean-up practice.

* eg/add-uflags:
  add: use unsigned type for collection of bits
2024-03-07 15:59:41 -08:00
798ddfc17f Merge branch 'jt/commit-redundant-scissors-fix'
"git commit -v --cleanup=scissors" used to add the scissors line
twice in the log message buffer, which has been corrected.

* jt/commit-redundant-scissors-fix:
  commit: unify logic to avoid multiple scissors lines when merging
  commit: avoid redundant scissor line with --cleanup=scissors -v
2024-03-07 15:59:41 -08:00
ae46d5fb98 Merge branch 'js/merge-tree-3-trees'
"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.

* js/merge-tree-3-trees:
  fill_tree_descriptor(): mark error message for translation
  cache-tree: avoid an unnecessary check
  Always check `parse_tree*()`'s return value
  t4301: verify that merge-tree fails on missing blob objects
  merge-ort: do check `parse_tree()`'s return value
  merge-tree: fail with a non-zero exit code on missing tree objects
  merge-tree: accept 3 trees as arguments
2024-03-07 15:59:41 -08:00
76d1cd8e5e Merge branch 'cc/rev-list-allow-missing-tips'
"git rev-list --missing=print" has learned to optionally take
"--allow-missing-tips", which allows the objects at the starting
points to be missing.

* cc/rev-list-allow-missing-tips:
  revision: fix --missing=[print|allow*] for annotated tags
  rev-list: allow missing tips with --missing=[print|allow*]
  t6022: fix 'test' style and 'even though' typo
  oidset: refactor oidset_insert_from_set()
  revision: clarify a 'return NULL' in get_reference()
2024-03-07 15:59:40 -08:00
2c206fc82a Merge branch 'jc/no-lazy-fetch'
"git --no-lazy-fetch cmd" allows to run "cmd" while disabling lazy
fetching of objects from the promisor remote, which may be handy
for debugging.

* jc/no-lazy-fetch:
  git: extend --no-lazy-fetch to work across subprocesses
  git: document GIT_NO_REPLACE_OBJECTS environment variable
  git: --no-lazy-fetch option
2024-03-07 15:59:40 -08:00
fffd981ec2 reftable/block: fix binary search over restart counter
Records store their keys prefix-compressed. As many records will share a
common prefix (e.g. "refs/heads/"), this can end up saving quite a bit
of disk space. The downside of this is that it is not possible to just
seek into the middle of a block and consume the corresponding record
because it may depend on prefixes read from preceding records.

To help with this usecase, the reftable format writes every n'th record
without using prefix compression, which is called a "restart". The list
of restarts is stored at the end of each block so that a reader can
figure out entry points at which to read a full record without having to
read all preceding records.

This allows us to do a binary search over the records in a block when
searching for a particular key by iterating through the restarts until
we have found the section in which our record must be located. From
thereon we perform a linear search to locate the desired record.

This mechanism is broken though. In `block_reader_seek()` we call
`binsearch()` over the count of restarts in the current block. The
function we pass to compare records with each other computes the key at
the current index and then compares it to our search key by calling
`strbuf_cmp()`, returning its result directly. But `binsearch()` expects
us to return a truish value that indicates whether the current index is
smaller than the searched-for key. And unless our key exactly matches
the value at the restart counter we always end up returning a truish
value.

The consequence is that `binsearch()` essentially always returns 0,
indicacting to us that we must start searching right at the beginning of
the block. This works by chance because we now always do a linear scan
from the start of the block, and thus we would still end up finding the
desired record. But needless to say, this makes the optimization quite
useless.

Fix this bug by returning whether the current key is smaller than the
searched key. As the current behaviour was correct it is not possible to
write a test. Furthermore it is also not really possible to demonstrate
in a benchmark that this fix speeds up seeking records.

This may cause the reader to question whether this binary search makes
sense in the first place if it doesn't even help with performance. But
it would end up helping if we were to read a reftable with a much larger
block size. Blocks can be up to 16MB in size, in which case it will
become much more important to avoid the linear scan. We are not yet
ready to read or write such larger blocks though, so we have to live
without a benchmark demonstrating this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 13:59:36 -08:00
1a03591812 reftable/record: fix memory leak when decoding object records
When decoding records it is customary to reuse a `struct
reftable_ref_record` across calls. Thus, it may happen that the record
already holds some allocated memory. When decoding ref and log records
we handle this by releasing or reallocating held memory. But we fail to
do this for object records, which causes us to leak memory.

Fix this memory leak by releasing object records before we decode into
them. We may eventually want to reuse memory instead to avoid needless
reallocations. But for now, let's just plug the leak and be done.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 13:59:19 -08:00
2541cba2d6 wt-status: don't find scissors line beyond buf len
If

  (a) There is a "---" divider in a commit message,

  (b) At some point beyond that divider, there is a cut-line (that is,
      "# ------------------------ >8 ------------------------") in the
      commit message,

  (c) the user does not explicitly set the "no-divider" option,

then "git interpret-trailers" will hang indefinitively.

This is because when (a) is true, find_end_of_log_message() will invoke
ignored_log_message_bytes() with a len that is intended to make it
ignore the part of the commit message beyond the divider. However,
ignored_log_message_bytes() calls wt_status_locate_end(), and that
function ignores the length restriction when it tries to locate the cut
line. If it manages to find one, the returned cutoff value is greater
than len. At this point, ignored_log_message_bytes() goes into an
infinite loop, because it won't advance the string parsing beyond len,
but the exit condition expects to reach cutoff.

Make wt_status_locate_end() honor the length parameter passed in, to
fix this issue.

In general, if wt_status_locate_end() is given a piece of the memory
that lacks NUL at all, strstr() may continue across page boundaries
and run into an unmapped page.  For our current callers, this is not
a problem, as all of them except one uses a memory owned by a strbuf
(which guarantees an implicit NUL-termination after its payload),
and the one exception in trailer.c:find_end_of_log_message() uses
strlen() to compute the length before calling this function.

Signed-off-by: Florian Schmidt <flosch@nutanix.com>
Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>
[jc: tweaked the commit log message and the implementation a bit]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 13:22:39 -08:00
60c4c42515 reftable/stack: register compacted tables as tempfiles
We do not register tables resulting from stack compaction with the
tempfile API. Those tables will thus not be deleted in case Git gets
killed.

Refactor the code to register compacted tables as tempfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 12:34:14 -08:00
3a60f6a2c4 reftable/stack: register lockfiles during compaction
We do not register any of the locks we acquire when compacting the
reftable stack via our lockfiles interfaces. These locks will thus not
be released when Git gets killed.

Refactor the code to register locks as lockfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 12:34:13 -08:00
1920d17a99 reftable/stack: register new tables as tempfiles
We do not register new tables which we're about to add to the stack with
the tempfile API. Those tables will thus not be deleted in case Git gets
killed.

Refactor the code to register tables as tempfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 12:34:13 -08:00
4ae540d421 lockfile: report when rollback fails
We do not report to the caller when rolling back a lockfile fails, which
will be needed by the reftable compaction logic in a subsequent commit.
It also cannot really report on all errors because the function calls
`delete_tempfile()`, which doesn't return an error either.

Refactor the code so that both `delete_tempfile()` and
`rollback_lock_file()` return an error code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 12:34:13 -08:00
51d41dc243 doc/gitremote-helpers: fix missing single-quote
The formatting around "option push-option" was missing its closing
quote, leading to the output having a stray opening quote, rather than
rendering the item in italics (as we do for all of the other options in
the list).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 12:30:48 -08:00
6111252cbf trace2: emit 'def_param' set with 'cmd_name' event
Some commands do not cause a set of 'def_param' events to be emitted.
This includes "git-remote-https", "git-http-fetch", and various
"query" commands, like "git --man-path".

Since all of these commands do emit a 'cmd_name' event, add code to
the "trace2_cmd_name()" function to generate the set of 'def_param'
events.

Remove explicit calls to "trace2_cmd_list_config()" and
"trace2_cmd_list_env_vars()" in git.c since they are no longer needed.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 10:24:34 -08:00
520cf66814 trace2: avoid emitting 'def_param' set more than once
During nested alias expansion it is possible for
"trace2_cmd_list_config()" and "trace2_cmd_list_env_vars()"
to be called more than once.  This causes a full set of
'def_param' events to be emitted each time.  Let's avoid
that.

Add code to those two functions to only emit them once.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 10:24:34 -08:00
0c1c3c861e t0211: demonstrate missing 'def_param' events for certain commands
Some Git commands fail to emit 'def_param' events for interesting
config and environment variable settings.

Add unit tests to demonstrate this.

Most commands are considered "builtin" and are based upon git.c.
These typically do emit 'def_param' events.  Exceptions are some of
the "query" commands, the "run-dashed" mechanism, and alias handling.

Commands built from remote-curl.c (instead of git.c), such as
"git-remote-https", do not emit 'def_param' events.

Likewise, "git-http-fetch" is built http-fetch.c and does not emit
them.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07 10:24:34 -08:00
9a90118d78 t7301: use test_path_is_(missing|file)
Replace "test -f" and friends to use the test_path_is_file helper
function and friends from test-lib-functions.sh. These functions
perform identical operations while enhancing debugging capabilities
in case of test failures.

The original used 'test ! -f' to check if the file has been
correctly cleaned, so 'test ! -e' would have been a better choice.
Replace them with 'test_path_is_missing'.

Signed-off-by: Vincenzo Mezzela <vincenzo.mezzela@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06 15:32:12 -08:00
29c139ce78 fsmonitor: support case-insensitive events
Teach fsmonitor_refresh_callback() to handle case-insensitive
lookups if case-sensitive lookups fail on case-insensitive systems.
This can cause 'git status' to report stale status for files if there
are case issues/errors in the worktree.

The FSMonitor daemon sends FSEvents using the observed spelling
of each pathname.  On case-insensitive file systems this may be
different than the expected case spelling.

The existing code uses index_name_pos() to find the cache-entry for
the pathname in the FSEvent and clear the CE_FSMONITOR_VALID bit so
that the worktree scan/index refresh will revisit and revalidate the
path.

On a case-insensitive file system, the exact match lookup may fail
to find the associated cache-entry. This causes status to think that
the cached CE flags are correct and skip over the file.

Update event handling to optionally use the name-hash and dir-name-hash
if necessary.

Also update t7527 to convert the "test_expect_failure" to "_success"
now that we have fixed the bug.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06 09:10:06 -08:00
b0dba507fe fsmonitor: refactor bit invalidation in refresh callback
Refactor code in the fsmonitor_refresh_callback() call chain dealing
with invalidating the CE_FSMONITOR_VALID bit and add a trace message.

During the refresh, we clear the CE_FSMONITOR_VALID bit in response to
data from the FSMonitor daemon (so that a later phase will lstat() and
verify the true state of the file).

Create a new function to clear the bit and add some unique tracing for
it to help debug edge cases.

This is similar to the existing `mark_fsmonitor_invalid()` function,
but it also does untracked-cache invalidation and we've already
handled that in the refresh-callback handlers, so but we don't need
to repeat that.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06 09:10:06 -08:00
84d441f2f0 fsmonitor: trace the new invalidated cache-entry count
Consolidate the directory/non-directory calls to the refresh handler
code.  Log the resulting count of invalidated cache-entries.

The nr_in_cone value will be used in a later commit to decide if
we also need to try to do case-insensitive lookups.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06 09:10:06 -08:00
9e34e56280 fsmonitor: return invalidated cache-entry count on non-directory event
Teach the refresh callback helper function for unqualified FSEvents
(pathnames without a trailing slash) to return the number of
cache-entries that were invalided in response to the event.

This will be used in a later commit to help determine if the observed
pathname was (possibly) case-incorrect when (on a case-insensitive
file system).

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06 09:10:00 -08:00
e0795e2c79 t0610: remove unused variable assignment
In b0f6b6b523 (refs/reftable: don't fail empty transactions in repo
without HEAD, 2024-02-27), we have added a new test to t0610. This test
contains a useless assignment to a variable that is never actually used.
Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06 08:40:40 -08:00
d254e65092 build: support z/OS (OS/390).
Introduced z/OS (OS/390) as a platform in config.mak.uname

Signed-off-by: Haritha D <harithamma.d@ibm.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06 08:10:58 -08:00
1605035217 tests: modernize the test script t0010-racy-git.sh
Modernize the formatting of the test script to align with current
standards and improve its overall readability.

Signed-off-by: Aryan Gupta <garyan447@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 14:52:57 -08:00
781fb7b4c2 revision.c: trivial fix to message
ancestry-path is an option, not a command - mark it as such.
This brings it in sync with the rest of usages in the file

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 14:11:56 -08:00
6567eed94f builtin/clone.c: trivial fix of message
bare in that context is an option, not purely an adjective
Mark it properly

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 14:11:56 -08:00
fe7b5150cb builtin/remote.c: trivial fix of error message
Mark --mirror as option rather than command

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 14:11:56 -08:00
3a12749b50 transport-helper.c: trivial fix of error message
Mark --force as option rather than variable names

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 14:11:56 -08:00
8fbd903e58 branch: advise about ref syntax rules
git-branch(1) will error out if you give it a bad ref name. But the user
might not understand why or what part of the name is illegal.

The user might know that there are some limitations based on the *loose
ref* format (filenames), but there are also further rules for
easier integration with shell-based tools, pathname expansion, and
playing well with reference name expressions.

The man page for git-check-ref-format(1) contains these rules. Let’s
advise about it since that is not a command that you just happen
upon. Also make this advise configurable since you might not want to be
reminded every time you make a little typo.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 13:04:26 -08:00
15cb03728f advice: use double quotes for regular quoting
Use double quotes like we use for “die” in this document.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 13:04:26 -08:00
3ccc4782ce advice: use backticks for verbatim
Use backticks for inline-verbatim rather than single quotes. Also quote
the unquoted ref globs.

Also replace “the add command” with “`git add`”.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 13:04:26 -08:00
95c987e6fa advice: make all entries stylistically consistent
In general, rewrite entries to the following form:

1. Clause or sentence describing when the advice is shown
2. Optional “to <verb>” clause which says what the advice is
   about (e.g. for resetNoRefresh: tell the user that they can use
   `--no-refresh`)

Concretely:

1. Use “shown” instead of “advice shown”
   • “advice” is implied and a bit repetitive
2. Use “when” instead of “if”
3. Lead with “Shown when” and end the entry with the effect it has,
   where applicable
4. Use “the user” instead of “a user” or “you”
5. implicitIdentity: rewrite description in order to lead with *when*
   the advice is shown (see point (3))
6. Prefer the present tense (with the exception of pushNonFFMatching)
7. waitingForEditor: give example of relevance in this new context
8. pushUpdateRejected: exception to the above principles

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 13:04:25 -08:00
8c5001c68e t3200: improve test style
Some tests use a preliminary heredoc for `expect` or have setup and
teardown commands before and after, respectively. It is however
preferred to keep all the logic in the test itself. Let’s move these
into the tests.

Also:

• Remove a now-irrelevant comment about test placement and switch back
  to `main` post-test
• Prefer indented literal heredocs (`-\EOF`) except for a block which
  says that this is intentional

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 13:04:25 -08:00
fb7c556f58 config: document core.commentChar as ASCII-only
d3b3419f8f (config: tell the user that we expect an ASCII character,
2023-03-27) updated an error message to make clear that this option
specifically wants an ASCII character but neglected to consider the
config documentation.

Reported-by: Manlio Perillo <manlio.perillo@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:51:30 -08:00
43072b4ca1 The fourth batch
Also update the DEF_VER in GIT-VERSION-GEN, which I forgot to do
earlier (it should have been done when we started the new cycle).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:44:44 -08:00
53ac1f106f Merge branch 'ak/rebase-autosquash'
Typofix.

* ak/rebase-autosquash:
  rebase: fix typo in autosquash documentation
2024-03-05 09:44:44 -08:00
d037212d97 Merge branch 'kn/for-all-refs'
"git for-each-ref" learned "--include-root-refs" option to show
even the stuff outside the 'refs/' hierarchy.

* kn/for-all-refs:
  for-each-ref: add new option to include root refs
  ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR'
  refs: introduce `refs_for_each_include_root_refs()`
  refs: extract out `loose_fill_ref_dir_regular_file()`
  refs: introduce `is_pseudoref()` and `is_headref()`
2024-03-05 09:44:44 -08:00
661f379791 Merge branch 'pb/ort-make-submodule-conflict-message-an-advice'
When a merge conflicted at a submodule, merge-ort backend used to
unconditionally give a lengthy message to suggest how to resolve
it.  Now the message can be squelched as an advice message.

* pb/ort-make-submodule-conflict-message-an-advice:
  merge-ort: turn submodule conflict suggestions into an advice
2024-03-05 09:44:43 -08:00
53929db7c4 Merge branch 'jc/doc-compat-util'
Clarify wording in the CodingGuidelines that requires <git-compat-util.h>
to be the first header file.

* jc/doc-compat-util:
  doc: clarify the wording on <git-compat-util.h> requirement
2024-03-05 09:44:43 -08:00
e58a4de3bb Merge branch 'sg/upload-pack-error-message-fix'
An error message from "git upload-pack", which responds to "git
fetch" requests, had a trialing NUL in it, which has been
corrected.

* sg/upload-pack-error-message-fix:
  upload-pack: don't send null character in abort message to the client
2024-03-05 09:44:43 -08:00
d31a515e9c Merge branch 'rs/submodule-prefix-simplify'
Code simplification.

* rs/submodule-prefix-simplify:
  submodule: use strvec_pushf() for --submodule-prefix
2024-03-05 09:44:43 -08:00
b5111647cb Merge branch 'rs/name-rev-with-mempool'
Many small allocations "git name-rev" makes have been updated to
allocate from a mem-pool.

* rs/name-rev-with-mempool:
  name-rev: use mem_pool_strfmt()
  mem-pool: add mem_pool_strfmt()
2024-03-05 09:44:43 -08:00
6f74483667 Merge branch 'rs/fetch-simplify-with-starts-with'
Code simplification.

* rs/fetch-simplify-with-starts-with:
  fetch: convert strncmp() with strlen() to starts_with()
2024-03-05 09:44:42 -08:00
74522bbd98 Merge branch 'jk/reflog-special-cases-fix'
The logic to access reflog entries by date and number had ugly
corner cases at the boundaries, which have been cleaned up.

* jk/reflog-special-cases-fix:
  read_ref_at(): special-case ref@{0} for an empty reflog
  get_oid_basic(): special-case ref@{n} for oldest reflog entry
  Revert "refs: allow @{n} to work with n-sized reflog"
2024-03-05 09:44:42 -08:00
542d093b1d Merge branch 'jc/no-include-of-compat-util-from-headers'
Header file clean-up.

* jc/no-include-of-compat-util-from-headers:
  compat: drop inclusion of <git-compat-util.h>
2024-03-05 09:44:42 -08:00
d619abf7fa Merge branch 'js/remove-cruft-files'
Remove an empty file that shouldn't have been added in the first
place.

* js/remove-cruft-files:
  neue: remove a bogus empty file
2024-03-05 09:44:42 -08:00
6249de53a3 Merge branch 'jk/textconv-cache-outside-repo-fix'
The code incorrectly attempted to use textconv cache when asked,
even when we are not running in a repository, which has been
corrected.

* jk/textconv-cache-outside-repo-fix:
  userdiff: skip textconv caching when not in a repository
2024-03-05 09:44:42 -08:00
fcacc2b161 refs/reftable: track last log record name via strbuf
The reflog iterator enumerates all reflogs known to a ref backend. In
the "reftable" backend there is no way to list all existing reflogs
directly. Instead, we have to iterate through all reflog entries and
discard all those redundant entries for which we have already returned a
reflog entry.

This logic is implemented by tracking the last reflog name that we have
emitted to the iterator's user. If the next log record has the same name
we simply skip it until we find another record with a different refname.

This last reflog name is stored in a simple C string, which requires us
to free and reallocate it whenever we need to update the reflog name.
Convert it to use a `struct strbuf` instead, which reduces the number of
allocations. Before:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 1,068,485 allocs, 1,068,363 frees, 281,122,886 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 68,485 allocs, 68,363 frees, 256,234,072 bytes allocated

Note that even after this change we still allocate quite a lot of data,
even though the number of allocations does not scale with the number of
log records anymore. This remainder comes mostly from decompressing the
log blocks, where we decompress each block into newly allocated memory.
This will be addressed at a later point in time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:10:07 -08:00
7b8abc4d8c reftable/record: use scratch buffer when decoding records
When decoding log records we need a temporary buffer to decode the
reflog entry's name, mail address and message. As this buffer is local
to the function we thus have to reallocate it for every single log
record which we're about to decode, which is inefficient.

Refactor the code such that callers need to pass in a scratch buffer,
which allows us to reuse it for multiple decodes. This reduces the
number of allocations when iterating through reflogs. Before:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 2,068,487 allocs, 2,068,365 frees, 305,122,946 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 1,068,485 allocs, 1,068,363 frees, 281,122,886 bytes allocated

Note that this commit also drop some redundant calls to `strbuf_reset()`
right before calling `decode_string()`. The latter already knows to
reset the buffer, so there is no need for these.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:10:06 -08:00
e0bd13beea reftable/record: reuse message when decoding log records
Same as the preceding commit we can allocate log messages as needed when
decoding log records, thus further reducing the number of allocations.
Before:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 3,068,488 allocs, 3,068,366 frees, 307,122,961 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 2,068,487 allocs, 2,068,365 frees, 305,122,946 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:10:06 -08:00
193fcb3ff8 reftable/record: reuse refnames when decoding log records
When decoding a log record we always reallocate their refname arrays.
This results in quite a lot of needless allocation churn.

Refactor the code to grow the array as required only. Like this, we
should usually only end up reallocating the array a small handful of
times when iterating over many refs. Before:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 4,068,487 allocs, 4,068,365 frees, 332,011,793 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 3,068,488 allocs, 3,068,366 frees, 307,122,961 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:10:06 -08:00
01639ec148 reftable/record: avoid copying author info
Each reflog entry contains information regarding the authorship of who
has made the change. This authorship information is not the same as that
of any of the commits that the reflog entry references, but instead
corresponds to the local user that has executed the command. Thus, it is
almost always the case that all reflog entries have the same author.

We can make use of this fact when decoding reftable records: instead of
freeing and then reallocating the authorship information of log records,
we can special-case when the next record during an iteration has the
exact same authorship as the preceding record. If so, then there is no
need to reallocate the respective fields.

This change results in two allocations less per log record that we're
iterating over in the most common case. Before:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 4,068,487 allocs, 4,068,365 frees, 332,011,793 bytes allocated

An alternative would be to store the capacity of both name and email and
then use `REFTABLE_ALLOC_GROW()` to conditionally reallocate the array.
But reftable records are copied around quite a lot, and thus we need to
be a bit mindful of the overall record size. Furthermore, a memory
comparison should also be more efficient than having to copy over memory
even if we wouldn't have to allocate a new array every time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:10:06 -08:00
87ff723018 reftable/record: convert old and new object IDs to arrays
In 7af607c58d (reftable/record: store "val1" hashes as static arrays,
2024-01-03) and b31e3cc620 (reftable/record: store "val2" hashes as
static arrays, 2024-01-03) we have converted ref records to store their
object IDs in a static array. Convert log records to do the same so that
their old and new object IDs are arrays, too.

This change results in two allocations less per log record that we're
iterating over. Before:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 8,068,495 allocs, 8,068,373 frees, 401,011,862 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 13,473 bytes in 122 blocks
      total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:10:06 -08:00
eea0d11d6d refs/reftable: reload correct stack when creating reflog iter
When creating a new reflog iterator, we first have to reload the stack
that the iterator is being created. This is done so that any concurrent
writes to the stack are reflected. But `reflog_iterator_for_stack()`
always reloads the main stack, which is wrong.

Fix this and reload the correct stack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05 09:10:06 -08:00
2efe7958d6 Merge branch 'ps/reftable-iteration-perf-part2' into ps/reftable-reflog-iteration-perf
* ps/reftable-iteration-perf-part2:
  refs/reftable: precompute prefix length
  reftable: allow inlining of a few functions
  reftable/record: decode keys in place
  reftable/record: reuse refname when copying
  reftable/record: reuse refname when decoding
  reftable/merged: avoid duplicate pqueue emptiness check
  reftable/merged: circumvent pqueue with single subiter
  reftable/merged: handle subiter cleanup on close only
  reftable/merged: remove unnecessary null check for subiters
  reftable/merged: make subiters own their records
  reftable/merged: advance subiter on subsequent iteration
  reftable/merged: make `merged_iter` structure private
  reftable/pq: use `size_t` to track iterator index
2024-03-05 09:09:46 -08:00
105ec9ae8d clean: further clean-up of implementation around "--force"
We clarified how "clean.requireForce" interacts with the "--dry-run"
option in the previous commit, both in the implementation and in the
documentation.  Even when "git clean" (without other options) is
required to be used with "--force" (i.e. either clean.requireForce
is unset, or explicitly set to true) to protect end-users from
casual invocation of the command by mistake, "--dry-run" does not
require "--force" to be used, because it is already its own
protection mechanism by being a no-op to the working tree files.

The previous commit, however, missed another clean-up opportunity
around the same area.  Just like in the "--dry-run" mode, the
command in the "--interactive" mode does not require "--force",
either.  This is because by going interactive and giving the end
user one more chance to confirm, the mode itself is serving as its
own protection mechanism.

Let's take things one step further, and unify the code that defines
interaction between "--force" and these two other options.  Just
like we added explanation for the reason why "--dry-run" does not
honor "clean.requireForce", give an explanation for the reason why
"--interactive" makes "clean.requireForce" to be ignored.

Finally, add some tests to show the interaction between "--force"
and "--interactive".  We already have tests that show interaction
between "--force" and "--dry-run", but didn't test "--interactive".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 14:05:13 -08:00
43f70eaea0 refs/reftable: precompute prefix length
We're recomputing the prefix length on every iteration of the ref
iterator. Precompute it for another speedup when iterating over 1
million refs:

    Benchmark 1: show-ref: single matching ref (revision = HEAD~)
      Time (mean ± σ):     100.3 ms ±   3.7 ms    [User: 97.3 ms, System: 2.8 ms]
      Range (min … max):    97.5 ms … 139.7 ms    1000 runs

    Benchmark 2: show-ref: single matching ref (revision = HEAD)
      Time (mean ± σ):      95.8 ms ±   3.4 ms    [User: 92.9 ms, System: 2.8 ms]
      Range (min … max):    93.0 ms … 121.9 ms    1000 runs

    Summary
      show-ref: single matching ref (revision = HEAD) ran
        1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:58 -08:00
f1bf54aee3 reftable: allow inlining of a few functions
We have a few functions which are basically just accessors to
structures. As those functions are executed inside the hot loop when
iterating through many refs, the fact that they cannot be inlined is
costing us some performance.

Move the function definitions into their respective headers so that they
can be inlined. This results in a performance improvement when iterating
over 1 million refs:

    Benchmark 1: show-ref: single matching ref (revision = HEAD~)
      Time (mean ± σ):     105.9 ms ±   3.6 ms    [User: 103.0 ms, System: 2.8 ms]
      Range (min … max):   103.1 ms … 133.4 ms    1000 runs

    Benchmark 2: show-ref: single matching ref (revision = HEAD)
      Time (mean ± σ):     100.7 ms ±   3.4 ms    [User: 97.8 ms, System: 2.8 ms]
      Range (min … max):    97.8 ms … 124.0 ms    1000 runs

    Summary
      show-ref: single matching ref (revision = HEAD) ran
        1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:49 -08:00
daf4f43d0d reftable/record: decode keys in place
When reading a record from a block, we need to decode the record's key.
As reftable keys are prefix-compressed, meaning they reuse a prefix from
the preceding record's key, this is a bit more involved than just having
to copy the relevant bytes: we need to figure out the prefix and suffix
lengths, copy the prefix from the preceding record and finally copy the
suffix from the current record.

This is done by passing three buffers to `reftable_decode_key()`: one
buffer that holds the result, one buffer that holds the last key, and
one buffer that points to the current record. The final key is then
assembled by calling `strbuf_add()` twice to copy over the prefix and
suffix.

Performing two memory copies is inefficient though. And we can indeed do
better by decoding keys in place. Instead of providing two buffers, the
caller may only call a single buffer that is already pre-populated with
the last key. Like this, we only have to call `strbuf_setlen()` to trim
the record to its prefix and then `strbuf_add()` to add the suffix.

This refactoring leads to a noticeable performance bump when iterating
over 1 million refs:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     112.2 ms ±   3.9 ms    [User: 109.3 ms, System: 2.8 ms]
    Range (min … max):   109.2 ms … 149.6 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     106.0 ms ±   3.5 ms    [User: 103.2 ms, System: 2.7 ms]
    Range (min … max):   103.2 ms … 133.7 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.06 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:49 -08:00
6620f9134c reftable/record: reuse refname when copying
Do the same optimization as in the preceding commit, but this time for
`reftable_record_copy()`. While not as noticeable, it still results in a
small speedup when iterating over 1 million refs:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     114.0 ms ±   3.8 ms    [User: 111.1 ms, System: 2.7 ms]
    Range (min … max):   110.9 ms … 144.3 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     112.5 ms ±   3.7 ms    [User: 109.5 ms, System: 2.8 ms]
    Range (min … max):   109.2 ms … 140.7 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.01 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:49 -08:00
71d9a2e991 reftable/record: reuse refname when decoding
When decoding a reftable record we will first release the user-provided
record and then decode the new record into it. This is quite inefficient
as we basically need to reallocate at least the refname every time.

Refactor the function to start tracking the refname capacity. Like this,
we can stow away the refname, release, restore and then grow the refname
to the required number of bytes via `REFTABLE_ALLOC_GROW()`.

This refactoring is safe to do because all functions that assigning to
the refname will first call `reftable_ref_record_release()`, which will
zero out the complete record after releasing memory.

This change results in a nice speedup when iterating over 1 million
refs:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)

    Time (mean ± σ):     124.0 ms ±   3.9 ms    [User: 121.1 ms, System: 2.7 ms]
    Range (min … max):   120.4 ms … 152.7 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     114.4 ms ±   3.7 ms    [User: 111.5 ms, System: 2.7 ms]
    Range (min … max):   111.0 ms … 152.1 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.08 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Furthermore, with this change we now perform a mostly constant number of
allocations when iterating. Before this change:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 1,006,620 allocs, 1,006,495 frees, 25,398,363 bytes allocated

After this change:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 6,623 allocs, 6,498 frees, 509,592 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:40 -08:00
080f8c4565 reftable/merged: avoid duplicate pqueue emptiness check
When calling `merged_iter_next_void()` we first check whether the iter
has been exhausted already. We already perform this check two levels
down the stack in `merged_iter_next_entry()` though, which makes this
check redundant.

Now if this check was there to accelerate the common case it might have
made sense to keep it. But the iterator being exhausted is rather the
uncommon case because you can expect most reftable stacks to contain
more than two refs.

Simplify the code by removing the check. As `merged_iter_next_void()` is
basically empty except for calling `merged_iter_next()` now, merge these
two functions. This also results in a tiny speedup when iterating over
many refs:

    Benchmark 1: show-ref: single matching ref (revision = HEAD~)
      Time (mean ± σ):     125.6 ms ±   3.8 ms    [User: 122.7 ms, System: 2.8 ms]
      Range (min … max):   122.4 ms … 153.4 ms    1000 runs

    Benchmark 2: show-ref: single matching ref (revision = HEAD)
      Time (mean ± σ):     124.0 ms ±   3.9 ms    [User: 121.1 ms, System: 2.8 ms]
      Range (min … max):   120.1 ms … 156.4 ms    1000 runs

    Summary
      show-ref: single matching ref (revision = HEAD) ran
        1.01 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:40 -08:00
f8c1a8e2e1 reftable/merged: circumvent pqueue with single subiter
The merged iterator uses a priority queue to order records so that we
can yielid them in the expected order. This priority queue of course
comes with some overhead as we need to add, compare and remove entries
in that priority queue.

In the general case, that overhead cannot really be avoided. But when we
have a single subiter left then there is no need to use the priority
queue anymore because the order is exactly the same as what that subiter
would return.

While having a single subiter may sound like an edge case, it happens
more frequently than one might think. In the most common scenario, you
can expect a repository to have a single large table that contains most
of the records and then a set of smaller tables which contain later
additions to the reftable stack. In this case it is quite likely that we
exhaust subiters of those smaller stacks before exhausting the large
table.

Special-case this and return records directly from the remaining
subiter. This results in a sizeable speedup when iterating over 1m refs
in a repository with a single table:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     135.4 ms ±   4.4 ms    [User: 132.5 ms, System: 2.8 ms]
    Range (min … max):   131.0 ms … 166.3 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     126.3 ms ±   3.9 ms    [User: 123.3 ms, System: 2.8 ms]
    Range (min … max):   122.7 ms … 157.0 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.07 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:40 -08:00
3b6dd6ad1d reftable/merged: handle subiter cleanup on close only
When advancing one of the subiters fails we immediately release
resources associated with that subiter. This is not necessary though as
we will release these resources when closing the merged iterator anyway.

Drop the logic and only release resources when the merged iterator is
done. This is a mere cleanup that should help reduce the cognitive load
when reading through the code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:39 -08:00
2d71a1d4a2 reftable/merged: remove unnecessary null check for subiters
Whenever we advance a subiter we first call `iterator_is_null()`. This
is not needed though because we only ever advance subiters which have
entries in the priority queue, and we do not end entries to the priority
queue when the subiter has been exhausted.

Drop the check as well as the now-unused function. This results in a
surprisingly big speedup:

    Benchmark 1: show-ref: single matching ref (revision = HEAD~)
      Time (mean ± σ):     138.1 ms ±   4.4 ms    [User: 135.1 ms, System: 2.8 ms]
      Range (min … max):   133.4 ms … 167.3 ms    1000 runs

    Benchmark 2: show-ref: single matching ref (revision = HEAD)
      Time (mean ± σ):     134.4 ms ±   4.2 ms    [User: 131.5 ms, System: 2.8 ms]
      Range (min … max):   130.0 ms … 164.0 ms    1000 runs

    Summary
      show-ref: single matching ref (revision = HEAD) ran
        1.03 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:39 -08:00
bb2d6be4c1 reftable/merged: make subiters own their records
For each subiterator, the merged table needs to track their current
record. This record is owned by the priority queue though instead of by
the merged iterator. This is not optimal performance-wise.

For one, we need to move around records whenever we add or remove a
record from the priority queue. Thus, the bigger the entries the more
bytes we need to copy around. And compared to pointers, a reftable
record is rather on the bigger side. The other issue is that this makes
it harder to reuse the records.

Refactor the code so that the merged iterator tracks ownership of the
records per-subiter. Instead of having records in the priority queue, we
can now use mere pointers to the per-subiter records. This also allows
us to swap records between the caller and the per-subiter record instead
of doing an actual copy via `reftable_record_copy_from()`, which removes
the need to release the caller-provided record.

This results in a noticeable speedup when iterating through many refs.
The following benchmark iterates through 1 million refs:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     145.5 ms ±   4.5 ms    [User: 142.5 ms, System: 2.8 ms]
    Range (min … max):   141.3 ms … 177.0 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     139.0 ms ±   4.7 ms    [User: 136.1 ms, System: 2.8 ms]
    Range (min … max):   134.2 ms … 182.2 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

This refactoring also allows a subsequent refactoring where we start
reusing memory allocated by the reftable records because we do not need
to release the caller-provided record anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:39 -08:00
aad8ad6fe1 reftable/merged: advance subiter on subsequent iteration
When advancing the merged iterator, we pop the topmost entry from its
priority queue and then advance the sub-iterator that the entry belongs
to, adding the result as a new entry. This is quite sensible in the case
where the merged iterator is used to actually iterate through records.
But the merged iterator is also used when we look up a single record,
only, so advancing the sub-iterator is wasted effort because we would
never even look at the result.

Instead of immediately advancing the sub-iterator, we can also defer
this to the next iteration of the merged iterator by storing the
intent-to-advance. This results in a small speedup when reading many
records. The following benchmark creates 10000 refs, which will also end
up with many ref lookups:

    Benchmark 1: update-ref: create many refs (revision = HEAD~)
      Time (mean ± σ):     337.2 ms ±   7.3 ms    [User: 200.1 ms, System: 136.9 ms]
      Range (min … max):   329.3 ms … 373.2 ms    100 runs

    Benchmark 2: update-ref: create many refs (revision = HEAD)
      Time (mean ± σ):     332.5 ms ±   5.9 ms    [User: 197.2 ms, System: 135.1 ms]
      Range (min … max):   327.6 ms … 359.8 ms    100 runs

    Summary
      update-ref: create many refs (revision = HEAD) ran
        1.01 ± 0.03 times faster than update-ref: create many refs (revision = HEAD~)

While this speedup alone isn't really worth it, this refactoring will
also allow two additional optimizations in subsequent patches. First, it
will allow us to special-case when there is only a single sub-iter left
to circumvent the priority queue altogether. And second, it makes it
easier to avoid copying records to the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:30 -08:00
48929d2e47 reftable/merged: make merged_iter structure private
The `merged_iter` structure is not used anywhere outside of "merged.c",
but is declared in its header. Move it into the code file so that it is
clear that its implementation details are never exposed to anything.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:30 -08:00
5c11529c66 reftable/pq: use size_t to track iterator index
The reftable priority queue is used by the merged iterator to yield
records from its sub-iterators in the expected order. Each entry has a
record corresponding to such a sub-iterator as well as an index that
indicates which sub-iterator the record belongs to. But while the
sub-iterators are tracked with a `size_t`, we store the index as an
`int` in the entry.

Fix this and use `size_t` consistently.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:19:30 -08:00
8145a8fd02 setup: remove unnecessary variable
The TODO comment suggested to heed core.bare from template config file
if no command line override given. And the prev_bare_repository
variable seems to have been placed for this sole purpose as it is not
used anywhere else.

However, it was clarified by Junio [1] that such values (including
core.bare) are ignored intentionally and does not make sense to
propagate them from template config to repository config. Also, the
directories for the worktree and repository are already created, and
therefore the bare/non-bare decision has already been made, by the
point we reach the codepath where the TODO comment is placed.
Therefore, prev_bare_repository does not have a usecase with/without
supporting core.bare from template. And the removal of
prev_bare_repository is safe as proved by the later part of the
comment:

    "Unfortunately, the line above is equivalent to
        is_bare_repository_cfg = !work_tree;
    which ignores the config entirely even if no `--[no-]bare`
    command line option was present.

    To see why, note that before this function, there was this call:
        prev_bare_repository = is_bare_repository()
    expanding the right hand side:
        = is_bare_repository_cfg && !get_git_work_tree()
        = is_bare_repository_cfg && !work_tree
    note that the last simplification above is valid because nothing
    calls repo_init() or set_git_work_tree() between any of the
    relevant calls in the code, and thus the !get_git_work_tree()
    calls will return the same result each time.  So, what we are
    interested in computing is the right hand side of the line of
    code just above this comment:
        prev_bare_repository || !work_tree
        = is_bare_repository_cfg && !work_tree || !work_tree
        = !work_tree
    because "A && !B || !B == !B" for all boolean values of A & B."

Therefore, remove the TODO comment and remove prev_bare_repository
variable. Also, update relevant testcases and remove one redundant
testcase.

[1]: https://lore.kernel.org/git/xmqqjzonpy9l.fsf@gitster.g/

Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 10:18:31 -08:00
0332e813d6 t9117: prefer test_path_* helper functions
test -(e|d) does not provide a nice error message when we hit test
failures, so use test_path_exists, test_path_is_dir instead.

Signed-off-by: shejialuo <shejialuo@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-04 09:50:21 -08:00
1284f9cc11 completion: reflog subcommands and options
Make generic the completion for reflog subcommands and its options.

Note that we still need to special case the options for "show".

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 14:21:39 -08:00
476a236e72 completion: factor out __git_resolve_builtins
We're going to use the result of "git xxx --git-completion-helper" not
only for feeding COMPREPLY.

Therefore, factor out the execution and the caching of its results in
__gitcomp_builtin, to a new function __git_resolve_builtins.

While we're here, move an important comment we have in the function to
its header, so it gains visibility.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 14:21:39 -08:00
3fec482b5f completion: introduce __git_find_subcommand
Let's have a function to get the current subcommand when completing
commands that follow the syntax:

    git <command> <subcommand>

As a convenience, let's allow an optional "default subcommand" to be
returned if none is found.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 14:21:38 -08:00
c689c38bc2 completion: reflog show <log-options>
Let's add completion for <log-options> in "reflog show" so that the user
can easily discover uses like:

   $ git reflog --since=1.day.ago

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 14:21:38 -08:00
85452a1d4b completion: reflog with implicit "show"
When no subcommand is specified to "reflog", we assume "show" [1]:

    $ git reflog -h
    usage: git reflog [show] [<log-options>] [<ref>]
    ...

This implicit "show" is not being completed correctly:

    $ git checkout -b default
    $ git reflog def<TAB><TAB>
    ... no completion options ...

The expected result is:

    $ git reflog default

This happens because we're completing references after seeing a valid
subcommand in the command line.  This prevents the implicit "show" from
working properly, but also introduces a new problem: it keeps offering
subcommand options when the subcommand is implicit:

    $ git checkout -b explore
    $ git reflog default ex<TAB>
    ...
    $ git reflog default expire

The expected result is:

    $ git reflog default explore

To fix this, complete references even if no subcommand is present, or in
other words when the subcommand is implicit "show".

Also, only include completion options for subcommands when completing
the right position in the command line.

  1. cf39f54efc (git reflog show, 2007-02-08)

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 14:21:38 -08:00
12a4883feb clean: improve -n and -f implementation and documentation
What -n actually does in addition to its documented behavior is
ignoring of configuration variable clean.requireForce, that makes
sense provided -n prevents files removal anyway.

So, first, document this in the manual, and then modify implementation
to make this more explicit in the code.

Improved implementation also stops to share single internal variable
'force' between command-line -f option and configuration variable
clean.requireForce, resulting in more clear logic.

Two error messages with slightly different text depending on if
clean.requireForce was explicitly set or not, are merged into a single
one.

The resulting error message now does not mention -n as well, as it
neither matches intended clean.requireForce usage nor reflects
clarified implementation.

Documentation of clean.requireForce is changed accordingly.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:50:04 -08:00
28a92478b8 parse-options: rearrange long_name matching code
Move the code for handling a full match of long_name first and get rid
of negations.  Reduce the indent of the code for matching abbreviations
and remove unnecessary curly braces.  Combine the checks for whether
negation is allowed and whether arg is "n", "no" or "no-" because they
belong together and avoid a continue statement.  The result is shorter,
more readable code.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:49:22 -08:00
b1ce2b62fa parse-options: normalize arg and long_name before comparison
Strip "no-" from arg and long_name before comparing them.  This way we
no longer have to repeat the comparison with an offset of 3 for negated
arguments.

Note that we must not modify the "flags" value, which tracks whether arg
is negated, inside the loop.  When registering "--n", "--no" or "--no-"
as abbreviation for any negative option, we used to OR it with OPT_UNSET
and end the loop.  We can simply hard-code OPT_UNSET and leave flags
unchanged instead.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:49:22 -08:00
0d8a3097c7 parse-options: detect ambiguous self-negation
Git currently does not detect the ambiguity of an option that starts
with "no" like --notes and its negated form if given just --n or --no.
All Git commands with such options have other negatable options, and
we detect the ambiguity with them, so that's currently only a potential
problem for scripts that use git rev-parse --parseopt.

Let's fix it nevertheless, as there's no need for that confusion.  To
detect the ambiguity we have to loosen the check in register_abbrev(),
as an option is considered an alias of itself.  Add non-matching
negation flags as a criterion to recognize an option being ambiguous
with its negated form.

And we need to keep going after finding a non-negated option as an
abbreviated candidate and perform the negation checks in the same
loop.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:49:21 -08:00
cb46c3faf8 parse-options: factor out register_abbrev() and struct parsed_option
Add a function, register_abbrev(), for storing the necessary details for
remembering an abbreviated and thus potentially ambiguous option.  Call
it instead of sharing the code using goto, to make the control flow more
explicit.

Conveniently collect these details in the new struct parsed_option to
reduce the number of necessary function arguments.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:49:21 -08:00
597f9d037d parse-options: set arg of abbreviated option lazily
Postpone setting the opt pointer until we're about to call get_value(),
which uses it.  There's no point in setting it eagerly for every
abbreviated candidate option, which may turn out to be ambiguous.
Removing this assignment from the loop doesn't noticeably improve the
performance, but allows further simplification.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:49:21 -08:00
289cb15541 parse-options: recognize abbreviated negated option with arg
Giving an argument to an option that doesn't take one causes Git to
report that error specifically:

   $ git rm --dry-run=bogus
   error: option `dry-run' takes no value

The same is true when the option is negated or abbreviated:

   $ git rm --no-dry-run=bogus
   error: option `no-dry-run' takes no value

   $ git rm --dry=bogus
   error: option `dry-run' takes no value

Not so when doing both, though:

   $ git rm --no-dry=bogus
   error: unknown option `no-dry=bogus'
   usage: git rm [-f | --force] [-n] [-r] [--cached] [--ignore-unmatch]

(Rest of the usage message omitted.)

Improve consistency and usefulness of the error message by recognizing
abbreviated negated options even if they have a (most likely bogus)
argument.  With this patch we get:

   $ git rm --no-dry=bogus
   error: option `no-dry-run' takes no value

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:49:21 -08:00
6cf06e9c6e t-ctype: avoid duplicating class names
TEST_CTYPE_FUNC defines a function for testing a character classifier,
TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice.

Avoid the need to define a class-specific function by letting
TEST_CHAR_CLASS do all the work.  This is done by using the internal
functions test__run_begin() and test__run_end(), but they do exist to be
used in test macros after all.

Alternatively we could unroll the loop to provide a very long expression
that tests all 256 characters and EOF and hand that to TEST, but that
seems awkward and hard to read.

No change of behavior or output intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:47:33 -08:00
7a8d6c0a10 t-ctype: align output of i
The unit test reports misclassified characters like this:

   # check "isdigit(i) == !!memchr("123456789", i, len)" failed at t/unit-tests/t-ctype.c:36
   #    left: 1
   #   right: 0
   #        i: 0x30

Reduce the indent of i to put its colon directly below the ones in the
preceding lines for consistency.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:47:33 -08:00
752cb6ef81 t-ctype: simplify EOF check
EOF is not a member of any character class.  If a classifier function
returns a non-zero result for it, presumably by mistake, then the unit
test check reports:

   # check "!iseof(EOF)" failed at t/unit-tests/t-ctype.c:53
   #       i: 0xffffffff (EOF)

The numeric value of EOF is not particularly interesting in this
context.  Stop printing the second line.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:47:33 -08:00
980013e90d t-ctype: allow NUL anywhere in the specification string
Replace the custom function is_in() for looking up a character in the
specification string with memchr(3) and sizeof.  This is shorter,
simpler and allows NUL anywhere in the string, which may come in handy
if we ever want to support more character classes that contain it.

Getting the string size using sizeof only works in a macro and with a
string constant.  Use ARRAY_SIZE and compile-time checks to make sure we
are not passed a string pointer.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-03 09:47:33 -08:00
4c9355ff48 repack: check error writing to pack-objects subprocess
When "git repack" repacks promisor objects, it starts a pack-objects
subprocess and uses xwrite() to send object names over the pipe to
it, but without any error checking.  An I/O error or short write
(even though a short write is unlikely for such a small amount of
data) can result in a packfile that lacks certain objects we wanted
to put in there, leading to a silent repository corruption.

Use write_in_full(), instead of xwrite(), to mitigate short write
risks, check errors from it, and abort if we see a failure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-02 11:12:16 -08:00
36ffba1c7b sideband: avoid short write(2)
The sideband demultiplexor writes the data it receives on sideband
with xwrite().  We can lose data if the underlying write(2) results
in a short write.

If they are limited to unimportant bytes like eye-candy progress
meter, it may be OK to lose them, but lets be careful and ensure
that we use write_in_full() instead.  Note that the original does
not check for errors, and this rewrite does not check for one.  At
least not yet.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-02 11:12:16 -08:00
fa6c383309 unpack: replace xwrite() loop with write_in_full()
We have two packfile stream consumers, index-pack and
unpack-objects, that allow excess payload after the packfile stream
data. Their code to relay excess data hasn't changed significantly
since their original implementation that appeared in 67e5a5ec
(git-unpack-objects: re-write to read from stdin, 2005-06-28) and
9bee2478 (mimic unpack-objects when --stdin is used with index-pack,
2006-10-25).

These code blocks contain hand-rolled loops using xwrite(), written
before our write_in_full() helper existed. This helper now provides
the same functionality.

Replace these loops with write_in_full() for shorter, clearer
code. Update related variables accordingly.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-02 11:12:16 -08:00
381a83dfa3 test_i18ngrep: hard deprecate and forbid its use
Since v2.44.0-rc0~109 (Merge branch 'sp/test-i18ngrep', 2023-12-27)
none of the tests we have, either in 'master' or in flight and
collected in 'seen', use test_i18ngrep.

Perhaps it is good time to update test_i18ngrep to BUG to avoid
people adding new calls to it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-02 10:21:10 -08:00
b387623c12 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 14:38:56 -08:00
421d5a7574 Merge branch 'tb/multi-pack-verbatim-reuse' into HEAD
Docfix.

* tb/multi-pack-verbatim-reuse:
  Documentation/config/pack.txt: fix broken AsciiDoc mark-up
2024-03-01 14:38:56 -08:00
2b5738c867 Merge branch 'hs/rebase-not-in-progress' into HEAD
Error message update.

* hs/rebase-not-in-progress:
  rebase: make warning less passive aggressive
2024-03-01 14:38:56 -08:00
8e69efba8f Merge branch 'jw/remote-doc-typofix' into HEAD
Docfix.

* jw/remote-doc-typofix:
  git-remote.txt: fix typo
2024-03-01 14:38:56 -08:00
fd6e3cdaea Merge branch 'jc/doc-add-placeholder-fix' into HEAD
Practice the new mark-up rule for <placeholders> with "git add"
documentation page.

* jc/doc-add-placeholder-fix:
  doc: apply the new placeholder rules to git-add documentation
2024-03-01 14:38:55 -08:00
9ce1ca3045 Merge branch 'ja/doc-placeholders-markup-rules' into HEAD
The way placeholders are to be marked-up in documentation have been
specified; use "_<placeholder>_" to typeset the word inside a pair
of <angle-brakets> emphasized.

* ja/doc-placeholders-markup-rules:
  doc: clarify the format of placeholders
2024-03-01 14:38:55 -08:00
510a27e9e4 Merge branch 'ps/reflog-list' into HEAD
"git reflog" learned a "list" subcommand that enumerates known reflogs.

* ps/reflog-list:
  builtin/reflog: introduce subcommand to list reflogs
  refs: stop resolving ref corresponding to reflogs
  refs: drop unused params from the reflog iterator callback
  refs: always treat iterators as ordered
  refs/files: sort merged worktree and common reflogs
  refs/files: sort reflogs returned by the reflog iterator
  dir-iterator: support iteration in sorted order
  dir-iterator: pass name to `prepare_next_entry_data()` directly
2024-03-01 14:38:55 -08:00
221c3daef4 Merge branch 'ds/doc-send-email-capitalization' into HEAD
Doc update.

* ds/doc-send-email-capitalization:
  documentation: send-email: use camel case consistently
2024-03-01 14:38:54 -08:00
af88fbd949 Merge branch 'ja/docfixes' into HEAD
Doc update.

* ja/docfixes:
  doc: end sentences with full-stop
  doc: close unclosed angle-bracket of a placeholder in git-clone doc
  doc: git-rev-parse: enforce command-line description syntax
2024-03-01 14:38:54 -08:00
90c0c15e56 Merge branch 'cp/t9146-use-test-path-helpers' into HEAD
Test script clean-up.

* cp/t9146-use-test-path-helpers:
  t9146: replace test -d/-e/-f with appropriate test_path_is_* function
2024-03-01 14:38:54 -08:00
a87469cc99 Merge branch 'ps/difftool-dir-diff-exit-code' into HEAD
"git difftool --dir-diff" learned to honor the "--trust-exit-code"
option; it used to always exit with 0 and signalled success.

* ps/difftool-dir-diff-exit-code:
  git-difftool--helper: honor `--trust-exit-code` with `--dir-diff`
2024-03-01 14:38:54 -08:00
7a96b75e05 gitcli: drop mention of “non-dashed form”
Git builtins used to be called like e.g. `git-commit`, not `git
commit` (*dashed form* and *non-dashed form*, respectively). The dashed
form was deprecated in version 1.5.4 (2006). Now only a few commands
have an alternative dashed form when `SKIP_DASHED_BUILT_INS` is
active.[1]

The mention here is from 2f7ee089df (parse-options: Add a gitcli(5) man
page., 2007-12-13), back when the deprecation was relatively
recent. These days though it seems like an irrelevant point to make to
budding CLI scripters—you don’t have to warn against a style that
probably doesn’t even work on their git(1) installation.

† 1: 179227d6e2 (Optionally skip linking/copying the built-ins,
    2020-09-21)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:45:01 -08:00
35ca4411a0 format_trailers_from_commit(): indirectly call trailer_info_get()
This is another preparatory refactor to unify the trailer formatters.

For background, note that the "trailers" string array is the
`char **trailers` member in `struct trailer_info` and that the
trailer_item objects are the elements of the `struct list_head *head`
linked list.

Currently trailer_info_get() only populates `char **trailers`. And
parse_trailers() first calls trailer_info_get() so that it can use the
`char **trailers` to populate a list of `struct trailer_item` objects

Instead of calling trailer_info_get() directly from
format_trailers_from_commit(), make it call parse_trailers() instead
because parse_trailers() already calls trailer_info_get().

This change is a NOP because format_trailer_info() (which
format_trailers_from_commit() wraps around) only looks at the "trailers"
string array, not the trailer_item objects which parse_trailers()
populates. For now we do need to create a dummy

    LIST_HEAD(trailer_objects);

because parse_trailers() expects it in its signature.

In a future patch, we'll change format_trailer_info() to use the parsed
trailer_item objects (trailer_objects) instead of the `char **trailers`
array.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
2c948a78fd format_trailer_info(): move "fast path" to caller
This is another preparatory refactor to unify the trailer formatters.

This allows us to drop the "msg" parameter from format_trailer_info(),
so that it take 3 parameters, similar to format_trailers() which also
takes 3 parameters:

    void format_trailers(const struct process_trailer_options *opts,
                         struct list_head *trailers,
                         struct strbuf *out)

The short-term goal is to make format_trailer_info() be smart enough to
deprecate format_trailers(). And then ultimately we will rename
format_trailer_info() to format_trailers().

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
bf35e0a018 format_trailers(): use strbuf instead of FILE
This is another preparatory refactor to unify the trailer formatters.

Make format_trailers() also write to a strbuf, to align with
format_trailers_from_commit() which also does the same. Doing this makes
format_trailers() behave similar to format_trailer_info() (which will
soon help us replace one with the other).

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
9aa1b2bc89 trailer_info_get(): reorder parameters
This is another preparatory refactor to unify the trailer formatters.

Take

    const struct process_trailer_options *opts

as the first parameter, because these options are required for
parsing trailers (e.g., whether to treat "---" as the end of the log
message). And take

    struct trailer_info *info

last, because it's an "out parameter" (something that the caller wants
to use as the output of this function).

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
ae0ec2e0e0 trailer: move interpret_trailers() to interpret-trailers.c
The interpret-trailers.c builtin is the only place we need to call
interpret_trailers(), so move its definition there (together with a few
helper functions called only by it) and remove its external declaration
from <trailer.h>.

Several helper functions that are called by interpret_trailers() remain
in trailer.c because other callers in the same file still call them.
Declare them in <trailer.h> so that interpret_trailers() (now in
builtin/interpret-trailers.c) can continue calling them as a trailer API
user.

This enriches <trailer.h> with a more granular API, which can then be
unit-tested in the future (because interpret_trailers() by itself does
too many things to be able to be easily unit-tested).

Take this opportunity to demote some file-handling functions out of the
trailer API implementation, as these have nothing to do with trailers.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
0383dc5629 trailer: reorder format_trailers_from_commit() parameters
Currently there are two functions for formatting trailers in
<trailer.h>:

    void format_trailers(const struct process_trailer_options *,
                         struct list_head *trailers, FILE *outfile);

    void format_trailers_from_commit(struct strbuf *out, const char *msg,
                                     const struct process_trailer_options *opts);

and although they are similar enough (even taking the same
process_trailer_options struct pointer) they are used quite differently.
One might intuitively think that format_trailers_from_commit() builds on
top of format_trailers(), but this is not the case. Instead
format_trailers_from_commit() calls format_trailer_info() and
format_trailers() is never called in that codepath.

This is a preparatory refactor to help us deprecate format_trailers() in
favor of format_trailer_info() (at which point we can rename the latter
to the former). When the deprecation is complete, both
format_trailers_from_commit(), and the interpret-trailers builtin will
be able to call into the same helper function (instead of
format_trailers() and format_trailer_info(), respectively). Unifying the
formatters is desirable because it simplifies the API.

Reorder parameters for format_trailers_from_commit() to prefer

    const struct process_trailer_options *opts

as the first parameter, because these options are intimately tied to
formatting trailers. And take

    struct strbuf *out

last, because it's an "out parameter" (something that the caller wants
to use as the output of this function).

Similarly, reorder parameters for format_trailer_info(), because later
on we will unify the two together.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
7b1c6aa541 trailer: rename functions to use 'trailer'
Rename process_trailers() to interpret_trailers(), because it matches
the name for the builtin command of the same name
(git-interpret-trailers), which is the sole user of process_trailers().

In a following commit, we will move "interpret_trailers" from trailer.c
to builtin/interpret-trailers.c. That move will necessitate the growth
of the trailer.h API, forcing us to expose some additional functions in
trailer.h.

Rename relevant functions so that they include the term "trailer" in
their name, so that clients of the API will be able to easily identify
them by their "trailer" moniker, just like all the other functions
already exposed by trailer.h.

Rename `struct list_head *head` to `struct list_head *trailers` because
"head" conveys no additional information beyond the "list_head" type.

Reorder parameters for format_trailers_from_commit() to prefer

    const struct process_trailer_options *opts

as the first parameter, because these options are intimately tied to
formatting trailers. Parameters like `FILE *outfile` should be last
because they are a kind of 'out' parameter, so put such parameters at
the end. This will be the pattern going forward in this series.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
a082e28938 shortlog: add test for de-duplicating folded trailers
The shortlog builtin was taught to use the trailer iterator interface in
47beb37bc6 (shortlog: match commit trailers with --group, 2020-09-27).
The iterator always unfolds values and this has always been the case
since the time the iterator was first introduced in f0939a0eb1 (trailer:
add interface for iterating over commit trailers, 2020-09-27). Add a
comment line to remind readers of this behavior.

The fact that the iterator always unfolds values is important
(at least for shortlog) because unfolding allows it to recognize both
folded and unfolded versions of the same trailer for de-duplication.

Capture the existing behavior in a new test case to guard against
regressions in this area. This test case is based off of the existing
"shortlog de-duplicates trailers in a single commit" just above it. Now
if we were to remove the call to

    unfold_value(&iter->val);

inside the iterator, this new test case will break.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
0f3a461d4e trailer: free trailer_info _after_ all related usage
In de7c27a186 (trailer: use offsets for trailer_start/trailer_end,
2023-10-20), we started using trailer block offsets in trailer_info. In
particular, we dropped the use of a separate stack variable "size_t
trailer_end", in favor of accessing the new "trailer_block_end" member
of trailer_info (as "info.trailer_block_end").

At that time, we forgot to also move the

   trailer_info_release(&info);

line to be _after_ this new use of the trailer_info struct. Move it now.

Note that even without this patch, we didn't have leaks or any other
problems because trailer_info_release() only frees memory allocated on
the heap. The "trailer_block_end" member was allocated on the stack back
then (as it is now) so it was still safe to use for all this time.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01 10:35:42 -08:00
5f78d52dce docs: sort configuration variable groupings alphabetically
By and large, variable groupings in Documentation/config.txt are sorted
alphabetically, though a few are not. Those outliers make it more
difficult to find a specific grouping when quickly running an eye over
the list to locate a variable of interest. Address this shortcoming by
sorting the groupings alphabetically.

NOTE: This change only sorts the top-level groupings (i.e. "core.*"
comes after "completion.*"); it does not touch the ordering of variables
within each group since variables within individual groups might
intentionally be ordered in some other fashion (such as
most-common-first or most-important-first).

Reported-by: Bruno Haible <bruno@clisp.org>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 11:53:29 -08:00
3223204456 add: use unsigned type for collection of bits
The 'refresh' function in 'builtin/add.c' declares 'flags' as
signed, and passes it as an argument to the 'refresh_index'
function, which though expects an unsigned value.

Since in this case 'flags' represents a bag of bits, whose MSB is
not used in special ways, change the type of 'flags' to unsigned.

Signed-off-by: Eugenio Gigante <giganteeugenio2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 11:52:42 -08:00
a922bfa3b5 upload-pack: only accept packfile-uris if we advertised it
Clients are only supposed to request particular capabilities or features
if the server advertised them. For the "packfile-uris" feature, we only
advertise it if uploadpack.blobpacfileuri is set, but we always accept a
request from the client regardless.

In practice this doesn't really hurt anything, as we'd pass the client's
protocol list on to pack-objects, which ends up ignoring it. But we
should try to follow the protocol spec, and tightening this up may catch
buggy or misbehaving clients more easily.

Thanks to recent refactoring, we can hoist the config check from
upload_pack_advertise() into upload_pack_config(). Note the subtle
handling of a value-less bool (which does not count for triggering an
advertisement).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:10:42 -08:00
caaf1a2942 commit-reach(repo_get_merge_bases_many_dirty): pass on errors
(Actually, this commit is only about passing on "missing commits"
errors, but adding that to the commit's title would have made it too
long.)

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many_dirty()` function is
aware of that, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
5317380521 commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many()` function is aware of
that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next stop: `repo_get_merge_bases_dirty()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
f87056ce40 commit-reach(get_octopus_merge_bases): pass on "missing commits" errors
The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `get_merge_bases()` macro) is aware of that, too.

Naturally, the callers need to be adjusted now, too.

Next step: adjust `repo_get_merge_bases_many()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
76e2a09999 commit-reach(repo_get_merge_bases): pass on "missing commits" errors
The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `repo_get_merge_bases()` macro) is aware of that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next step: adjust the callers of `get_octopus_merge_bases()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
8226e157a9 commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
The `merge_bases_many()` function was just taught to indicate
parsing errors, and now the `get_merge_bases_many_0()` function is aware
of that, too.

Next step: adjust the callers of `get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
fb02c523a3 commit-reach(merge_bases_many): pass on "missing commits" errors
The `paint_down_to_common()` function was just taught to indicate
parsing errors, and now the `merge_bases_many()` function is aware of
that, too.

One tricky aspect is that `merge_bases_many()` parses commits of its
own, but wants to gracefully handle the scenario where NULL is passed as
a merge head, returning the empty list of merge bases. The way this was
handled involved calling `repo_parse_commit(NULL)` and relying on it to
return an error. This has to be done differently now so that we can
handle missing commits correctly by producing a fatal error.

Next step: adjust the caller of `merge_bases_many()`:
`get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
896a0e11f3 commit-reach(paint_down_to_common): start reporting errors
If a commit cannot be parsed, it is currently ignored when looking for
merge bases. That's undesirable as the operation can pretend success in
a corrupt repository, even though the command should fail with an error
message.

Let's start at the bottom of the stack by teaching the
`paint_down_to_common()` function to return an `int`: if negative, it
indicates fatal error, if 0 success.

This requires a couple of callers to be adjusted accordingly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
2d2da172f3 commit-reach(paint_down_to_common): prepare for handling shallow commits
When `git fetch --update-shallow` needs to test for commit ancestry, it
can naturally run into a missing object (e.g. if it is a parent of a
shallow commit). For the purpose of `--update-shallow`, this needs to be
treated as if the child commit did not even have that parent, i.e. the
commit history needs to be clamped.

For all other scenarios, clamping the commit history is actually a bug,
as it would hide repository corruption (for an analysis regarding
shallow and partial clones, see the analysis further down).

Add a flag to optionally ask the function to ignore missing commits, as
`--update-shallow` needs it to, while detecting missing objects as a
repository corruption error by default.

This flag is needed, and cannot be replaced by `is_repository_shallow()`
to indicate that situation, because that function would return 0 in the
`--update-shallow` scenario: There is not actually a `shallow` file in
that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
with receive.updateshallow on") and t5538.4 ("add new shallow root with
receive.updateshallow on").

Note: shallow commits' parents are set to `NULL` internally already,
therefore there is no need to special-case shallow repositories here, as
the merge-base logic will not try to access parent commits of shallow
commits.

Likewise, partial clones aren't an issue either: If a commit is missing
during the revision walk in the merge-base logic, it is fetched via
`promisor_remote_get_direct()`. And not only the single missing commit
object: Due to the way the "promised" objects are fetched (in
`fetch_objects()` in `promisor-remote.c`, using `fetch
--filter=blob:none`), there is no actual way to fetch a single commit
object, as the remote side will pass that commit OID to `pack-objects
--revs [...]` which in turn passes it to `rev-list` which interprets
this as a commit _range_ instead of a single object. Therefore, in
partial clones (unless they are shallow in addition), all commits
reachable from a commit that is in the local object database are also
present in that local database.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:05:45 -08:00
9a7b22959a upload-pack: use existing config mechanism for advertisement
When serving a v2 capabilities request, we call upload_pack_advertise()
to tell us the set of features we can advertise to the client. That
involves looking at various config options, all of which need to be kept
in sync with the rules we use in upload_pack_config to set flags like
allow_filter, allow_sideband_all, and so on. If these two pieces of code
get out of sync then we may refuse to respect a capability we
advertised, or vice versa accept one that we should not.

Instead, let's call the same config helper that we'll use for processing
the actual client request, and then just pick the values out of the
resulting struct. This is only a little bit shorter than the current
code, but we don't repeat any policy logic (e.g., we don't have to worry
about the magic sideband-all environment variable here anymore).

And this reveals a gap in the existing code: there is no struct flag for
the packfile-uris capability (we accept it even if it is not advertised,
which we should not). We'll leave the advertisement code for now and
deal with it in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 15:30:41 -08:00
37aa89b068 upload-pack: centralize setup of sideband-all config
We read uploadpack.allowsidebandall to set a matching flag in our
upload_pack_data struct. But for our tests, we also respect
GIT_TEST_SIDEBAND_ALL from the environment, and anybody looking at the
flag in the struct needs to remember to check both. There's only one
such piece of code now, but we're about to add another.

So let's have the config step actually fold the environment value into
the struct, letting the rest of the code use the flag in the obvious
way.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 15:30:41 -08:00
922fdefb84 upload-pack: use repository struct to get config
Our upload_pack_v2() function gets a repository struct, but we ignore it
totally.  In practice this doesn't cause any problems, as it will never
differ from the_repository. But in the spirit of taking a small step
towards getting rid of the_repository, let's at least starting using it
to grab config. There are probably other spots that could benefit, but
it's a start.

Note that we don't need to pass the repo for protected_config(); the
whole point there is that we are not looking at repo config, so there is
no repo-specific version of the function.

For the v0 version of the protocol, we're not passed a repository
struct, so we'll continue to use the_repository there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:48:35 -08:00
6cd05e768b upload-pack: free tree buffers after parsing
When a client sends us a "want" or "have" line, we call parse_object()
to get an object struct. If the object is a tree, then the parsed state
means that tree->buffer points to the uncompressed contents of the tree.
But we don't really care about it. We only really need to parse commits
and tags; for trees and blobs, the important output is just a "struct
object" with the correct type.

But much worse, we do not ever free that tree buffer. It's not leaked in
the traditional sense, in that we still have a pointer to it from the
global object hash. But if the client requests many trees, we'll hold
all of their contents in memory at the same time.

Nobody really noticed because it's rare for clients to directly request
a tree. It might happen for a lightweight tag pointing straight at a
tree, or it might happen for a "tree:depth" partial clone filling in
missing trees.

But it's also possible for a malicious client to request a lot of trees,
causing upload-pack's memory to balloon. For example, without this
patch, requesting every tree in git.git like:

  pktline() {
    local msg="$*"
    printf "%04x%s\n" $((1+4+${#msg})) "$msg"
  }

  want_trees() {
    pktline command=fetch
    printf 0001
    git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype)' |
      while read oid type; do
        test "$type" = "tree" || continue
        pktline want $oid
      done
      pktline done
      printf 0000
  }

  want_trees | GIT_PROTOCOL=version=2 valgrind --tool=massif ./git upload-pack . >/dev/null

shows a peak heap usage of ~3.7GB. Which is just about the sum of the
sizes of all of the uncompressed trees. For linux.git, it's closer to
17GB.

So the obvious thing to do is to call free_tree_buffer() after we
realize that we've parsed a tree. We know that upload-pack won't need it
later. But let's push the logic into parse_object_with_flags(), telling
it to discard the tree buffer immediately. There are two reasons for
this. One, all of the relevant call-sites already call the with_options
variant to pass the SKIP_HASH flag. So it actually ends up as less code
than manually free-ing in each spot. And two, it enables an extra
optimization that I'll discuss below.

I've touched all of the sites that currently use SKIP_HASH in
upload-pack. That drops the peak heap of the upload-pack invocation
above from 3.7GB to ~24MB.

I've also modified the caller in get_reference(); a partial clone
benefits from its use in pack-objects for the reasons given in
0bc2557951 (upload-pack: skip parse-object re-hashing of "want" objects,
2022-09-06), where we were measuring blob requests. But note that the
results of get_reference() are used for traversing, as well; so we
really would _eventually_ use the tree contents. That makes this at
first glance a space/time tradeoff: we won't hold all of the trees in
memory at once, but we'll have to reload them each when it comes time to
traverse.

And here's where our extra optimization comes in. If the caller is not
going to immediately look at the tree contents, and it doesn't care
about checking the hash, then parse_object() can simply skip loading the
tree entirely, just like we do for blobs! And now it's not a space/time
tradeoff in get_reference() anymore. It's just a lazy-load: we're
delaying reading the tree contents until it's time to actually traverse
them one by one.

And of course for upload-pack, this optimization means we never load the
trees at all, saving lots of CPU time. Timing the "every tree from
git.git" request above shows upload-pack dropping from 32 seconds of CPU
to 19 (the remainder is mostly due to pack-objects actually sending the
pack; timing just the upload-pack portion shows we go from 13s to
~0.28s).

These are all highly gamed numbers, of course. For real-world
partial-clone requests we're saving only a small bit of time in
practice. But it does help harden upload-pack against malicious
denial-of-service attacks.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
a6ca601cdf upload-pack: use PARSE_OBJECT_SKIP_HASH_CHECK in more places
In commit 0bc2557951 (upload-pack: skip parse-object re-hashing of
"want" objects, 2022-09-06), we optimized the parse_object() calls for
v2 "want" lines from the client so that they avoided parsing blobs, and
so that they used the commit-graph rather than parsing commit objects
from scratch.

We should extend that to two other spots:

  1. We parse "have" objects in the got_oid() function. These won't
     generally be non-commits (unlike "want" lines from a partial
     clone). But we still benefit from the use of the commit-graph.

  2. For v0, the "want" lines are parsed in receive_needs(). These are
     also less likely to be non-commits because by default they have to
     be ref tips. There are config options you might set to allow
     non-tip objects, but you'd mostly do so to support partial clones,
     and clients recent enough to support partial clone will generally
     speak v2 anyway.

So I don't expect this change to improve performance much for day-to-day
operations. But both are possible denial-of-service vectors, where an
attacker can waste our time by sending over a large number of objects to
parse (of course we may waste even more time serving a pack to them, but
we try as much as possible to optimize that in pack-objects; we should
do what we can here in upload-pack, too).

With this patch, running p5600 with GIT_TEST_PROTOCOL_VERSION=0 shows
similar results to what we saw in 0bc2557951 (which ran with the v2
protocol by default). Here are the numbers for linux.git:

  Test                          HEAD^                 HEAD
  -----------------------------------------------------------------------------
  5600.3: checkout of result    50.91(87.95+2.93)     41.75(79.00+3.18) -18.0%

Or for a more extreme (and malicious) case, we can claim to "have" every
blob in git.git over the v0 protocol:

  $ {
      echo "0032want $(git rev-parse HEAD)"
      printf 0000
      git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype)' |
      perl -alne 'print "0032have $F[0]" if $F[1] eq "blob"'
    } >input

  $ time ./git.old upload-pack . <input >/dev/null
  real	0m52.951s
  user	0m51.633s
  sys	0m1.304s

  $ time ./git.new upload-pack . <input >/dev/null
  real	0m0.261s
  user	0m0.156s
  sys	0m0.105s

(Note that these don't actually compute a pack because of the hacky
protocol usage, so those numbers are representing the raw blob-parsing
effort done by upload-pack).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
5f64279443 upload-pack: always turn off save_commit_buffer
When the client sends us "want $oid" lines, we call parse_object($oid)
to get an object struct. It's important to parse the commits because we
need to traverse them in the negotiation phase. But of course we don't
need to hold on to the commit messages for each one.

We've turned off the save_commit_buffer flag in get_common_commits() for
a long time, since f0243f26f6 (git-upload-pack: More efficient usage of
the has_sha1 array, 2005-10-28). That helps with the commits we see
while actually traversing. But:

  1. That function is only used by the v0 protocol. I think the v2
     protocol's code path leaves the flag on (and thus pays the extra
     memory penalty), though I didn't measure it specifically.

  2. If the client sends us a bunch of "want" lines, that happens before
     the negotiation phase. So we'll hold on to all of those commit
     messages. Generally the number of "want" lines scales with the
     refs, not with the number of objects in the repo. But a malicious
     client could send a lot in order to waste memory.

As an example of (2), if I generate a request to fetch all commits in
git.git like this:

  pktline() {
    local msg="$*"
    printf "%04x%s\n" $((1+4+${#msg})) "$msg"
  }

  want_commits() {
    pktline command=fetch
    printf 0001
    git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype)' |
      while read oid type; do
        test "$type" = "commit" || continue
        pktline want $oid
      done
      pktline done
      printf 0000
  }

  want_commits | GIT_PROTOCOL=version=2 valgrind --tool=massif git-upload-pack . >/dev/null

before this patch upload-pack peaks at ~125MB, and after at ~35MB. The
difference is not coincidentally about the same as the sum of all commit
object sizes as computed by:

  git cat-file --batch-all-objects --batch-check='%(objecttype) %(objectsize)' |
  perl -alne '$v += $F[1] if $F[0] eq "commit"; END { print $v }'

In a larger repository like linux.git, that number is ~1GB.

In a repository with a full commit-graph file this will have no impact
(and the commit graph would save us from parsing at all, so is a much
better solution!). But it's easy to do, might help a little in
real-world cases (where even if you have a commit graph it might not be
fully up to date), and helps a lot for a worst-case malicious request.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
8c735b11de upload-pack: disallow object-info capability by default
We added an "object-info" capability to the v2 upload-pack protocol in
a2ba162cda (object-info: support for retrieving object info,
2021-04-20). In the almost 3 years since, we have not added any
client-side support, and it does not appear to exist in other
implementations either (JGit understands the verb on the server side,
but not on the client side).

Since this largely unused code is accessible over the network by
default, it increases the attack surface of upload-pack. I don't know of
any particularly severe problem, but one issue is that because of the
request/response nature of the v2 protocol, it will happily read an
unbounded number of packets, adding each one to a string list (without
regard to whether they are objects we know about, duplicates, etc).

This may be something we want to improve in the long run, but in the
short term it makes sense to disable the feature entirely. We'll add a
config option as an escape hatch for anybody who wants to develop the
feature further.

A more gentle option would be to add the config option to let people
disable it manually, but leave it enabled by default. But given that
there's no client side support, that seems like the wrong balance with
security.

Disabling by default will slow adoption a bit once client-side support
does become available (there were some patches[1] in 2022, but nothing
got merged and there's been nothing since). But clients have to deal
with older servers that do not understand the option anyway (and the
capability system handles that), so it will just be a matter of servers
flipping their config at that point (and hopefully once any unbounded
allocations have been addressed).

[jk: this is a patch that GitHub has been running for several years, but
     rebased forward and with a new commit message for upstream]

[1] https://lore.kernel.org/git/20220208231911.725273-1-calvinwan@google.com/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
179776f9e6 upload-pack: accept only a single packfile-uri line
When we see a packfile-uri line from the client, we use
string_list_split() to split it on commas and store the result in a
string_list.  A single packfile-uri line is therefore limited to storing
~64kb, the size of a pkt-line.

But we'll happily accept multiple such lines, and each line appends to
the string list, growing without bound.

In theory this could be useful, making:

  0017packfile-uris http
  0018packfile-uris https

equivalent to:

  001dpackfile-uris http,https

But the protocol documentation doesn't indicate that this should work
(and indeed, refers to this in the singular as "the following argument
can be included in the client's request"). And the client-side
implementation in fetch-pack has always sent a single line (JGit appears
to understand the line on the server side but has no client-side
implementation, and libgit2 understands neither).

If we were worried about compatibility, we could instead just put a
limit on the maximum number of values we'd accept. The current client
implementation limits itself to only two values: "http" and "https", so
something like "256" would be more than enough. But accepting only a
single line seems more in line with the protocol documentation, and
matches other parts of the protocol (e.g., we will not accept a second
"filter" line).

We'll also make this more explicit in the protocol documentation; as
above, I think this was always the intent, but there's no harm in making
it clear.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
b065063c57 upload-pack: use a strmap for want-ref lines
When the "ref-in-want" capability is advertised (which it is not by
default), then upload-pack processes a "want-ref" line from the client
by checking that the name is a valid ref and recording it in a
string-list.

In theory this list should grow no larger than the number of refs in the
server-side repository. But since we don't do any de-duplication, a
client which sends "want-ref refs/heads/foo" over and over will cause
the array to grow without bound.

We can fix this by switching to strmap, which efficiently detects
duplicates. There are two client-visible changes here:

  1. The "wanted-refs" response will now be in an apparently-random
     order (based on iterating the hashmap) rather than the order given
     by the client. The protocol documentation is quiet on ordering
     here. The current fetch-pack implementation is happy with any
     order, as it looks up each returned ref using a binary search in
     its local sorted list. JGit seems to implement want-ref on the
     server side, but has no client-side support. libgit2 doesn't
     support either side.

     It would obviously be possible to record the original order or to
     use the strmap as an auxiliary data structure. But if the client
     doesn't care, we may as well do the simplest thing.

  2. We'll now reject duplicates explicitly as a protocol error. The
     client should never send them (and our current implementation, even
     when asked to "git fetch master:one master:two" will de-dup on the
     client side).

     If we wanted to be more forgiving, we could perhaps just throw away
     the duplicates. But then our "wanted-refs" response back to the
     client would omit the duplicates, and it's hard to say what a
     client that accidentally sent a duplicate would do with that. So I
     think we're better off to complain loudly before anybody
     accidentally writes such a client.

Let's also add a note to the protocol documentation clarifying that
duplicates are forbidden. As discussed above, this was already the
intent, but it's not very explicit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
388b96df31 upload-pack: use oidset for deepen_not list
We record the oid of every deepen-not line the client sends to us. For a
well-behaved client, the resulting array should be bounded by the number
of unique refs we have. But because there's no de-duplication, a
malicious client can cause the array to grow unbounded by just sending
the same "refs/heads/foo" over and over (assuming such a ref exists).

Since the deepen-not list is just being fed to a "rev-list --not"
traversal, the order of items doesn't matter. So we can replace the
oid_array with an oidset which notices and skips duplicates.

That bounds the memory in malicious cases to be linear in the number of
unique refs. And even in non-malicious cases, there may be a slight
improvement in memory usage if multiple refs point to the same oid
(though in practice this list is probably pretty tiny anyway, as it
comes from the user specifying "--shallow-exclude" on the client fetch).

Note that in the trace2 output we'll now output the number of
de-duplicated objects, rather than the total number of "deepen-not"
lines we received. This is arguably a more useful value for tracing /
debugging anyway.

Reported-by: Benjamin Flesch <benjaminflesch@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
720ba25d99 upload-pack: switch deepen-not list to an oid_array
When we see a "deepen-not" line from the client, we verify that the
given name can be resolved as a ref, and then add it to a string list to
be passed later to an internal "rev-list --not" traversal. We record the
actual refname in the string list (so the traversal resolves it again
later), but we'd be better off recording the resolved oid:

  1. There's a tiny bit of wasted work in resolving it twice.

  2. There's a small race condition with simultaneous updates; the later
     traversal may resolve to a different value (or not at all). This
     shouldn't cause any bad behavior (we do not care about the value
     in this first resolution, so whatever value rev-list gets is OK)
     but it could mean a confusing error message (if upload-pack fails
     to resolve the ref it produces a useful message, but a failing
     traversal later results in just "revision walk setup failed").

  3. It makes it simpler to de-duplicate the results. We don't de-dup at
     all right now, but we will in the next patch.

>From the client's perspective the behavior should be the same.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
fae9627470 upload-pack: drop separate v2 "haves" array
When upload-pack sees a "have" line in the v0 protocol, it immediately
calls got_oid() with its argument and potentially produces an ACK
response. In the v2 protocol, we simply record the argument in an
oid_array, and only later process all of the "have" objects by calling
the equivalent of got_oid() on the contents of the array.

This makes some sense, as v2 is a pure request/response protocol, as
opposed to v0's asynchronous negotiation phase. But there's a downside:
a client can send us an infinite number of garbage "have" lines, which
we'll happily slurp into the array, consuming memory. Whereas in v0,
they are limited by the number of objects in the repository (because
got_oid() only records objects we have ourselves, and we avoid
duplicates by setting a flag on the object struct).

We can make v2 behave more like v0 by also calling got_oid() directly
when v2 parses a "have" line. Calling it early like this is OK because
got_oid() itself does not interact with the client; it only confirms
that we have the object and sets a few flags. Note that unlike v0, v2
does not ever (before or after this patch) check the return code of
got_oid(), which lets the caller know whether we have the object. But
again, that makes sense; v0 is using it to asynchronously tell the
client to stop sending. In v2's synchronous protocol, we just discard
those entries (and decide how to ACK at the end of each round).

There is one slight tweak we need, though. In v2's state machine, we
reach the SEND_ACKS state if the other side sent us any "have" lines,
whether they were useful or not. Right now we do that by checking
whether the "have" array had any entries, but if we record only the
useful ones, that doesn't work. Instead, we can add a simple boolean
that tells us whether we saw any have line (even if it was useless).

This lets us drop the "haves" array entirely, as we're now placing
objects directly into the "have_obj" object array (which is where
got_oid() put them in the long run anyway). And as a bonus, we can drop
the secondary "common" array used in process_haves_and_send_acks(). It
was essentially a copy of "haves" minus the objects we do not have. But
now that we are using "have_obj" directly, we know everything in it is
useful. So in addition to protecting ourselves against malicious input,
we should slightly lower our memory usage for normal inputs.

Note that there is one user-visible effect. The trace2 output records
the number of "haves". Previously this was the total number of "have"
lines we saw, but now is the number of useful ones. We could retain the
original meaning by keeping a separate counter, but it doesn't seem
worth the effort; this trace info is for debugging and metrics, and
arguably the count of common oids is at least as useful as the total
count.

Reported-by: Benjamin Flesch <benjaminflesch@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
f3fc5d9c91 revision: implement git log --merge also for rebase/cherry-pick/revert
'git log' learned in ae3e5e1ef2 (git log -p --merge [[--] paths...],
2006-07-03) to show commits touching conflicted files in the range
HEAD...MERGE_HEAD, an addition documented in d249b45547 (Document
rev-list's option --merge, 2006-08-04).

It can be useful to look at the commit history to understand what lead
to merge conflicts also for other mergy operations besides merges, like
cherry-pick, revert and rebase.

For rebases and cherry-picks, an interesting range to look at is
HEAD...{REBASE_HEAD,CHERRY_PICK_HEAD}, since even if all the commits
included in that range are not directly part of the 3-way merge,
conflicts encountered during these operations can indeed be caused by
changes introduced in preceding commits on both sides of the history.

For revert, as we are (most likely) reversing changes from a previous
commit, an appropriate range is REVERT_HEAD..HEAD, which is equivalent
to REVERT_HEAD...HEAD and to HEAD...REVERT_HEAD, if we keep HEAD and its
parents on the left side of the range.

As such, adjust the code in prepare_show_merge so it constructs the
range HEAD...$OTHER for OTHER={MERGE_HEAD, CHERRY_PICK_HEAD, REVERT_HEAD
or REBASE_HEAD}. Note that we try these pseudorefs in order, so keep
REBASE_HEAD last since the three other operations can be performed
during a rebase. Note also that in the uncommon case where $OTHER and
HEAD do not share a common ancestor, this will show the complete
histories of both sides since their root commits, which is the same
behaviour as currently happens in that case for HEAD and MERGE_HEAD.

Adjust the documentation of this option accordingly.

Co-authored-by: Johannes Sixt <j6t@kdbg.org>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
[jc: tweaked in j6t's precedence fix that tries REBASE_HEAD last]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 10:04:39 -08:00
f476143ee6 revision: ensure MERGE_HEAD is a ref in prepare_show_merge
This is done to

 (1) ensure MERGE_HEAD is a ref,
 (2) obtain the oid without any prefixing by refs.c:repo_dwim_ref()
 (3) error out when MERGE_HEAD is a symref.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 10:02:46 -08:00
24876ebf68 commit-reach(repo_in_merge_bases_many): report missing commits
Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.

Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).

Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.

Note: As of this patch, errors are returned only if any of the specified
merge heads is missing. Over the course of the next patches, missing
commits will also be reported by the `paint_down_to_common()` function,
which is called by `repo_in_merge_bases_many()`, and those errors will
be properly propagated back to the caller at that stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:47:03 -08:00
207c40e1e4 commit-reach(repo_in_merge_bases_many): optionally expect missing commits
Currently this function treats unrelated commit histories the same way
as commit histories with missing commit objects.

Typically, missing commit objects constitute a corrupt repository,
though, and should be reported as such. The next commits will make it
so, but there is one exception: In `git fetch --update-shallow` we
_expect_ commit objects to be missing, and we do want to treat the
now-incomplete commit histories as unrelated.

To allow for that, let's introduce an additional parameter that is
passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
it in the two callers in `shallow.c`.

This commit changes behavior slightly: unless called from the
`shallow.c` functions that set the `ignore_missing_commits` bit, any
non-existing tip commit that is passed to `repo_in_merge_bases_many()`
will now result in an error.

Note: When encountering missing commits while traversing the commit
history in search for merge bases, with this commit there won't be a
change in behavior just yet, their children will still be interpreted as
root commits. This bug will get fixed by follow-up commits.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:47:03 -08:00
e67431d496 commit-reach(paint_down_to_common): plug two memory leaks
When a commit is missing, we return early (currently pretending that no
merge basis could be found in that case). At that stage, it is possible
that a merge base could have been found already, and added to the
`result`, which is now leaked.

The priority queue has a similar issue: There might still be a commit in
that queue.

Let's release both, to address the potential memory leaks.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:47:03 -08:00
a4324babe6 revision: fix --missing=[print|allow*] for annotated tags
In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with missing commits, not just blobs/trees.

Unfortunately, such a command was still failing with a "fatal: bad
object <oid>" if it was passed a missing commit, blob or tree as an
argument (before the rev walking even begins). This was fixed in a
recent commit.

That fix still doesn't work when an argument passed to the command is
an annotated tag pointing to a missing commit though. In that case
`git rev-list --missing=...` still errors out with a "fatal: bad
object <oid>" error where <oid> is the object ID of the missing
commit.

Let's fix this issue, and also, while at it, let's add tests not just
for annotated tags but also for regular tags and branches.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:28:18 -08:00
0f9d4d28b7 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 16:04:33 -08:00
ebd46baf99 Merge branch 'jb/doc-interactive-singlekey-do-not-need-perl'
Doc clean-up.

* jb/doc-interactive-singlekey-do-not-need-perl:
  doc: remove outdated information about interactive.singleKey
2024-02-27 16:04:33 -08:00
a56bb9f66a Merge branch 'jk/t0303-clean'
Test clean-up.

* jk/t0303-clean:
  t0303: check that helper_test_clean removes all credentials
2024-02-27 16:04:33 -08:00
70dadd510b Merge branch 'mh/libsecret-empty-password-fix'
Credential helper based on libsecret (in contrib/) has been updated
to handle an empty password correctly.

* mh/libsecret-empty-password-fix:
  libsecret: retrieve empty password
2024-02-27 16:04:32 -08:00
f71ed54f4d Merge branch 'bb/completion-no-grep-into-awk'
Some parts of command line completion script (in contrib/) have
been micro-optimized.

* bb/completion-no-grep-into-awk:
  completion: use awk for filtering the config entries
2024-02-27 16:04:32 -08:00
66b1160141 Merge branch 'km/mergetool-vimdiff-layout-fallback'
Variants of vimdiff learned to honor mergetool.<variant>.layout settings.

* km/mergetool-vimdiff-layout-fallback:
  mergetools: vimdiff: use correct tool's name when reading mergetool config
2024-02-27 16:04:32 -08:00
03f9f1a3a2 Merge branch 'ba/credential-test-clean-fix'
Test clean-up.

* ba/credential-test-clean-fix:
  t/lib-credential: clean additional credential
2024-02-27 16:04:32 -08:00
98793866b9 Merge branch 'rj/tag-column-fix'
"git tag --column" failed to check the exit status of its "git
column" invocation, which has been corrected.

* rj/tag-column-fix:
  tag: error when git-column fails
2024-02-27 16:04:32 -08:00
45072eefef Merge branch 'jc/am-whitespace-doc'
"git am --help" now tells readers what actions are available in
"git am --whitespace=<action>", in addition to saying that the
option is passed through to the underlying "git apply".

* jc/am-whitespace-doc:
  doc: add shortcut to "am --whitespace=<action>"
2024-02-27 16:04:31 -08:00
b0f6b6b523 refs/reftable: don't fail empty transactions in repo without HEAD
Under normal circumstances, it shouldn't ever happen that a repository
has no HEAD reference. In fact, git-update-ref(1) would fail any request
to delete the HEAD reference, and a newly initialized repository always
pre-creates it, too.

We have however changed git-clone(1) to partially initialize the
refdb just up to the point where remote helpers can find the
repository. With that change, we are going to run into a situation
where repositories have no refs at all.

Now there is a very particular edge case in this situation: when
preparing an empty ref transacton, we end up returning whatever value
`read_ref_without_reload()` returned to the caller. Under normal
conditions this would be fine: "HEAD" should usually exist, and thus the
function would return `0`. But if "HEAD" doesn't exist, the function
returns a positive value which we end up returning to the caller.

Fix this bug by resetting the return code to `0` and add a test.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 13:53:39 -08:00
b6818ff3b1 Merge branch 'ps/remote-helper-repo-initialization-fix' into ps/reftable-repo-init-fix
* ps/remote-helper-repo-initialization-fix:
  builtin/clone: allow remote helpers to detect repo
2024-02-27 13:53:22 -08:00
3574816d98 completion: fix __git_complete_worktree_paths
Use __git to invoke "worktree list" in __git_complete_worktree_paths, to
respect any "-C" and "--git-dir" options present on the command line.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 13:37:24 -08:00
199f44cb2e builtin/clone: allow remote helpers to detect repo
In 18c9cb7524 (builtin/clone: create the refdb with the correct object
format, 2023-12-12), we have changed git-clone(1) so that it delays
creation of the refdb until after it has learned about the remote's
object format. This change was required for the reftable backend, which
encodes the object format into the tables. So if we pre-initialized the
refdb with the default object format, but the remote uses a different
object format than that, then the resulting tables would have encoded
the wrong object format.

This change unfortunately breaks remote helpers which try to access the
repository that is about to be created. Because the refdb has not yet
been initialized at the point where we spawn the remote helper, we also
don't yet have "HEAD" or "refs/". Consequently, any Git commands ran by
the remote helper which try to access the repository would fail because
it cannot be discovered.

This is essentially a chicken-and-egg problem: we cannot initialize the
refdb because we don't know about the object format. But we cannot learn
about the object format because the remote helper may be unable to
access the partially-initialized repository.

Ideally, we would address this issue via capabilities. But the remote
helper protocol is not structured in a way that guarantees that the
capability announcement happens before the remote helper tries to access
the repository.

Instead, fix this issue by partially initializing the refdb up to the
point where it becomes discoverable by Git commands.

Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 12:58:57 -08:00
72a8d3f027 rebase -i: stop setting GIT_CHERRY_PICK_HELP
Setting this environment variable causes the sequencer to display a
custom message when it stops for the user to resolve conflicts and
remove CHERRY_PICK_HEAD. Setting it in "git rebase" is a vestige of
the scripted implementation, now that it is a builtin command we do
not need to communicate with the sequencer machinery via environment
variables.

Move the conflicts advice to use when rebasing into
sequencer.c so we do not need to pass it via the environment.

Note that we retain the changes in e4301f73ff (sequencer: unset
GIT_CHERRY_PICK_HELP for 'exec' commands, 2024-02-02) just in case
GIT_CHERRY_PICK_HELP is set in the environment when "git rebase" is
run.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 10:33:36 -08:00
e6d5479e7a git: extend --no-lazy-fetch to work across subprocesses
Modeling after how the `--no-replace-objects` option is made usable
across subprocess spawning (e.g., cURL based remote helpers are
spawned as a separate process while running "git fetch"), allow the
`--no-lazy-fetch` option to be passed across process boundaries.

Do not model how the value of GIT_NO_REPLACE_OBJECTS environment
variable is ignored, though.  Just use the usual git_env_bool() to
allow "export GIT_NO_LAZY_FETCH=0" and "unset GIT_NO_LAZY_FETCH"
to be equivalents.

Also do not model how the request is not propagated to subprocesses
we spawn (e.g. "git clone --local" that spawns a new process to work
in the origin repository, while the original one working in the
newly created one) by the "--no-replace-objects" option, as this "do
not lazily fetch from the promisor" is more about a per-request
debugging aid, not "this repository's promisor should not be relied
upon" property specific to a repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 09:53:14 -08:00
e90cc075cc commit: unify logic to avoid multiple scissors lines when merging
prepare_to_commit has some logic to figure out whether merge already
added a scissors line, and therefore it shouldn't add another. Now that
wt_status_add_cut_line has built-in state for whether it has
already added a previous line, just set that state instead, and then
remove that condition from subsequent calls to wt_status_add_cut_line.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 09:40:47 -08:00
688a0a751e commit: avoid redundant scissor line with --cleanup=scissors -v
`git commit --cleanup=scissors -v` prints two scissors lines:
one at the start of the comment lines, and the other right before the
diff. This is redundant, and pushes the diff further down in the user's
editor than it needs to be.

Make wt_status_add_cut_line() remember if it has added a cut line before,
and avoid adding a redundant one.

Add a test for this.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 09:40:46 -08:00
4e89f0e07c doc: clarify the wording on <git-compat-util.h> requirement
The reason why we require the <git-compat-util.h> file to be the
first header file to be included is because it insulates other
header files and source files from platform differences, like which
system header files must be included in what order, and what C
preprocessor feature macros must be defined to trigger certain
features we want out of the system.

We tried to clarify the rule in the coding guidelines document, but
the wording was a bit fuzzy that can lead to misinterpretations like
you can include <xdiff/xinclude.h> only to avoid having to include
<git-compat-util.h> even if you have nothing to do with the xdiff
implementation, for example.  "You do not have to include more than
one of these" was also misleading and would have been puzzling if
you _needed_ to depend on more than one of these approved headers
(answer: you are allowed to include them all if you need the
declarations in them for reasons other than that you want to avoid
including compat-util yourself).

Instead of using the phrase "approved headers", enumerate them as
exceptions, each labeled with its intended audiences, to avoid such
misinterpretations.  The structure also makes it easier to add new
exceptions, so add the description of "t/unit-tests/test-lib.h"
being an exception only for the unit tests implementation as an
example.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Kyle Lippincott <spectral@google.com>
Acked-by: Elijah Newren <newren@gmail.com>
2024-02-27 08:53:32 -08:00
40b8076462 rebase: fix typo in autosquash documentation
This is a minor follow-up to cb00f524df (rebase: rewrite
--(no-)autosquash documentation, 2023-11-14) to fix a typo introduced in
that commit.

Signed-off-by: Richard Macklin <code@rmacklin.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 08:50:49 -08:00
b3806f7633 git: document GIT_NO_REPLACE_OBJECTS environment variable
This variable is used as the primary way to disable the object
replacement mechanism, with the "--no-replace-objects" command line
option as an end-user visible way to set it, but has not been
documented.

The original reason why it was left undocumented might be because it
was meant as an internal implementation detail, but the thing is,
that our tests use the environment variable directly without the
command line option, and there certainly are folks who learned its
use from there, making it impossible to deprecate or change its
behaviour by now.

Add documentation and note that for this variable, unlike many
boolean-looking environment variables, only the presence matters,
not what value it is set to.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 22:08:49 -08:00
a2082dbdd3 Start the 2.45 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 18:10:25 -08:00
7ece6ad823 Merge branch 'ps/ref-tests-update-even-more'
More tests that are marked as "ref-files only" have been updated to
improve test coverage of reftable backend.

* ps/ref-tests-update-even-more:
  t7003: ensure filter-branch prunes reflogs with the reftable backend
  t2011: exercise D/F conflicts with HEAD with the reftable backend
  t1405: remove unneeded cleanup step
  t1404: make D/F conflict tests compatible with reftable backend
  t1400: exercise reflog with gaps with reftable backend
  t0410: convert tests to use DEFAULT_REPO_FORMAT prereq
  t: move tests exercising the "files" backend
2024-02-26 18:10:25 -08:00
65462776c2 Merge branch 'gt/at-is-synonym-for-head-in-add-patch'
Teach "git checkout -p" and friends that "@" is a synonym for
"HEAD".

* gt/at-is-synonym-for-head-in-add-patch:
  add -p tests: remove PERL prerequisites
  add-patch: classify '@' as a synonym for 'HEAD'
2024-02-26 18:10:25 -08:00
cf258a9e4e Merge branch 'kh/column-reject-negative-padding'
"git column" has been taught to reject negative padding value, as
it would lead to nonsense behaviour including division by zero.

* kh/column-reject-negative-padding:
  column: guard against negative padding
  column: disallow negative padding
2024-02-26 18:10:25 -08:00
225f892685 Merge branch 'jc/t9210-lazy-fix'
Adjust use of "rev-list --missing" in an existing tests so that it
does not depend on a buggy failure mode.

* jc/t9210-lazy-fix:
  t9210: do not rely on lazy fetching to fail
2024-02-26 18:10:24 -08:00
9f67cbd0a7 Merge branch 'ps/reftable-iteration-perf'
The code to iterate over refs with the reftable backend has seen
some optimization.

* ps/reftable-iteration-perf:
  reftable/reader: add comments to `table_iter_next()`
  reftable/record: don't try to reallocate ref record name
  reftable/block: swap buffers instead of copying
  reftable/pq: allocation-less comparison of entry keys
  reftable/merged: skip comparison for records of the same subiter
  reftable/merged: allocation-less dropping of shadowed records
  reftable/record: introduce function to compare records by key
2024-02-26 18:10:24 -08:00
274400998b Merge branch 'rs/use-xstrncmpz'
Code clean-up.

* rs/use-xstrncmpz:
  use xstrncmpz()
2024-02-26 18:10:24 -08:00
cf47fb7ec7 Merge branch 'cp/apply-core-filemode'
"git apply" on a filesystem without filemode support have learned
to take a hint from what is in the index for the path, even when
not working with the "--index" or "--cached" option, when checking
the executable bit match what is required by the preimage in the
patch.

* cp/apply-core-filemode:
  apply: code simplification
  apply: correctly reverse patch's pre- and post-image mode bits
  apply: ignore working tree filemode when !core.filemode
2024-02-26 18:10:24 -08:00
b4385bf016 Merge branch 'ps/reftable-backend'
Integrate the reftable code into the refs framework as a backend.

* ps/reftable-backend:
  refs/reftable: fix leak when copying reflog fails
  ci: add jobs to test with the reftable backend
  refs: introduce reftable backend
2024-02-26 18:10:23 -08:00
558d146d13 fsmonitor: remove custom loop from non-directory path handler
Refactor the code that handles refresh events for pathnames that do
not contain a trailing slash.  Instead of using a custom loop to try
to scan the index and detect if the FSEvent named a file or might be a
directory prefix, use the recently created helper function to do that.

Also update the comments to describe what and why we are doing this.

On platforms that DO NOT annotate FS events with a trailing
slash, if we fail to find an exact match for the pathname
in the index, we do not know if the pathname represents a
directory or simply an untracked file.  Pretend that the pathname
is a directory and try again before assuming it is an untracked
file.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:03 -08:00
a52482036c fsmonitor: return invalidated cache-entry count on directory event
Teach the refresh callback helper function for directory FSEvents to
return the number of cache-entries that were invalidated in response
to a directory event.

This will be used in a later commit to help determine if the observed
pathname in the FSEvent was a (possibly) case-incorrect directory
prefix (on a case-insensitive filesystem) of one or more actual
cache-entries.

If there exists at least one case-insensitive prefix match, then we
can assume that the directory is a (case-incorrect) prefix of at least
one tracked item rather than a completely unknown/untracked file or
directory.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:03 -08:00
7c97174dcd fsmonitor: move untracked-cache invalidation into helper functions
Move the call to invalidate the untracked-cache for the FSEvent
pathname into the two helper functions.

In a later commit in this series, we will call these helpers
from other contexts and it safer to include the UC invalidation
in the helpers than to remember to also add it to each helper
call-site.

This has the side-effect of invalidating the UC *before* we
invalidate the ce_flags in the cache-entry.  These activities
are independent and do not affect each other.  Also, by doing
the UC work first, we can avoid worrying about "early returns"
or the need for the usual "goto the end" in each of the
handler functions.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
48f4cd7155 fsmonitor: refactor untracked-cache invalidation
Update fsmonitor_refresh_callback() to use the new
untracked_cache_invalidate_trimmed_path() to invalidate
the cache using the observed pathname without needing to
modify the caller's buffer.

Previously, we modified the caller's buffer when the observed pathname
contained a trailing slash (and did not restore it).  This wasn't a
problem for the single use-case caller, but felt dirty nontheless.  In
a later commit we will want to invalidate case-corrected versions of
the pathname (using possibly borrowed pathnames from the name-hash or
dir-name-hash) and we may not want to keep the tradition of altering
the passed-in pathname.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
3e4ffda639 dir: create untracked_cache_invalidate_trimmed_path()
Create a wrapper function for untracked_cache_invalidate_path()
that silently trims a trailing slash, if present, before calling
the wrapped function.

The untracked cache expects to be called with a pathname that
does not contain a trailing slash.  This can make it inconvenient
for callers that have a directory path.  Lets hide this complexity.

This will be used by a later commit in the FSMonitor code which
may receive directory pathnames from an FSEvent.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
8687c2b067 fsmonitor: refactor refresh callback for non-directory events
Move the code that handles unqualified FSEvents (without a trailing
slash) into a helper function.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
7a15a62aeb fsmonitor: clarify handling of directory events in callback helper
Improve documentation of the refresh callback helper function
used for directory FSEvents.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
e5da3ddbe9 fsmonitor: refactor refresh callback on directory events
Move the code to handle directory FSEvents (containing pathnames with
a trailing slash) into a helper function.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:01 -08:00
32ca706fad t7527: add case-insensitve test for FSMonitor
The FSMonitor client code trusts the spelling of the pathnames in the
FSEvents received from the FSMonitor daemon.  On case-insensitive file
systems, these OBSERVED pathnames may be spelled differently than the
EXPECTED pathnames listed in the .git/index.  This causes a miss when
using `index_name_pos()` which expects the given case to be correct.

When this happens, the FSMonitor client code does not update the state
of the CE_FSMONITOR_VALID bit when refreshing the index (and before
starting to scan the worktree).

This results in modified files NOT being reported by `git status` when
there is a discrepancy in the case-spelling of a tracked file's
pathname.

This commit contains a (rather contrived) test case to demonstrate
this.  A later commit in this series will update the FSMonitor client
code to recognize these discrepancies and update the CE_ bit accordingly.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:01 -08:00
b316552339 name-hash: add index_dir_find()
index_dir_exists() returns a boolean to indicate if there is a
case-insensitive match in the directory name-hash, but does not
provide the caller with the exact spelling of that match.

Create index_dir_find() to do the case-insensitive search *and*
optionally return the spelling of the matched directory prefix in a
provided strbuf.

To avoid code duplication, convert index_dir_exists() to be a trivial
wrapper around the new index_dir_find().

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:01 -08:00
4f66942215 neue: remove a bogus empty file
This file has been added as part of 2232a88ab6 (attr: add builtin
objectmode values support, 2023-11-16) and most likely serves no
relevant purpose.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:11:07 -08:00
b9e55be740 merge-ort: turn submodule conflict suggestions into an advice
Add a new advice type 'submoduleMergeConflict' for the error message
shown when a non-trivial submodule conflict is encountered, which
was added in 4057523a40 (submodule merge: update conflict error
message, 2022-08-04). That commit mentions making this message an
advice as possible future work.  The message can now be disabled
with the advice mechanism.

Update the tests as the expected message now appears on stderr instead
of stdout.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:07:01 -08:00
5edd126720 read_ref_at(): special-case ref@{0} for an empty reflog
The previous commit special-cased get_oid_basic()'s handling of ref@{n}
for a reflog with n entries. But its special case doesn't work for
ref@{0} in an empty reflog, because read_ref_at() dies when it notices
the empty reflog!

We can make this work by special-casing this in read_ref_at(). It's
somewhat gross, for two reasons:

  1. We have no reflog entry to describe in the "msg" out-parameter. So
     we have to leave it uninitialized or make something up.

  2. Likewise, we have no oid to put in the "oid" out-parameter. Leaving
     it untouched is actually the best thing here, as all of the callers
     will have initialized it with the current ref value via
     repo_dwim_log(). This is rather subtle, but it is how things worked
     in 6436a20284 (refs: allow @{n} to work with n-sized reflog,
     2021-01-07) before we reverted it.

The key difference from 6436a20284 here is that we'll return "1" to
indicate that we _didn't_ find the requested reflog entry. Coupled with
the special-casing in get_oid_basic() in the previous commit, that's
enough to make looking up ref@{0} work, and we can flip 6436a20284's
test back to expect_success.

It also means that the call in show-branch which segfaulted with
6436a20284 (and which is now tested in t3202) remains OK. The caller
notices that we could not find any reflog entry, and so it breaks out of
its loop, showing nothing. This is different from the current behavior
of producing an error, but it's just as reasonable (and is exactly what
we'd do if you asked it to walk starting at ref@{1} but there was only 1
entry).

Thus nobody should actually look at the reflog entry info we return. But
we'll still put in some fake values just to be on the safe side, since
this is such a subtle and confusing interface. Likewise, we'll document
what's going on in a comment above the function declaration. If this
were a function with a lot of callers, the footgun would probably not be
worth it. But it has only ever had two callers in its 18-year existence,
and it seems unlikely to grow more. So let's hold our noses and let
users enjoy the convenience of a simulated ref@{0}.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:05:35 -08:00
755e7465c9 get_oid_basic(): special-case ref@{n} for oldest reflog entry
The goal of 6436a20284 (refs: allow @{n} to work with n-sized reflog,
2021-01-07) was that if we have "n" entries in a reflog, we should still
be able to resolve ref@{n} by looking at the "old" value of the oldest
entry.

Commit 6436a20284 tried to put the logic into read_ref_at() by shifting
its idea of "n" by one. But we reverted that in the previous commit,
since it led to bugs in other callers which cared about the details of
the reflog entry we found. Instead, let's put the special case into the
caller that resolves @{n}, as it cares only about the oid.

read_ref_at() is even kind enough to return the "old" value from the
final reflog; it just returns "1" to signal to us that we ran off the
end of the reflog. But we can notice in the caller that we read just
enough records for that "old" value to be the one we're looking for, and
use it.

Note that read_ref_at() could notice this case, too, and just return 0.
But we don't want to do that, because the caller must be made aware that
we only found the oid, not an actual reflog entry (and the call sites in
show-branch do care about this).

There is one complication, though. When read_ref_at() hits a truncated
reflog, it will return the "old" value of the oldest entry only if it is
not the null oid. Otherwise, it actually returns the "new" value from
that entry! This bit of fudging is due to d1a4489a56 (avoid null SHA1 in
oldest reflog, 2008-07-08), where asking for "ref@{20.years.ago}" for a
ref created recently will produce the initial value as a convenience
(even though technically it did not exist 20 years ago).

But this convenience is only useful for time-based cutoffs. For
count-based cutoffs, get_oid_basic() has always simply complained about
going too far back:

  $ git rev-parse HEAD@{20}
  fatal: log for 'HEAD' only has 16 entries

and we should continue to do so, rather than returning a nonsense value
(there's even a test in t1508 already which covers this). So let's have
the d1a4489a56 code kick in only when doing timestamp-based cutoffs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:05:32 -08:00
aa72e73a2e Revert "refs: allow @{n} to work with n-sized reflog"
This reverts commit 6436a20284.

The idea of that commit is that if read_ref_at() is counting back to the
Nth reflog but the reflog is short by one entry (e.g., because it was
pruned), we can find the oid of the missing entry by looking at the
"before" oid value of the entry that comes after it (whereas before, we
looked at the "after" value of each entry and complained that we
couldn't find the one from before the truncation).

This works fine for resolving the oid of ref@{n}, as it is used by
get_oid_basic(), which does not look at any other aspect of the reflog
we found (e.g., its timestamp or message). But there's another caller of
read_ref_at(): in show-branch we use it to walk over the reflog, and we
do care about the reflog entry. And so that commit broke "show-branch
--reflog"; it shows the reflog message for ref@{0} as ref@{1}, ref@{1}
as ref@{2}, and so on.

For example, in the new test in t3202 we produce:

  ! [branch@{0}] (0 seconds ago) commit: three
   ! [branch@{1}] (0 seconds ago) commit: three
    ! [branch@{2}] (60 seconds ago) commit: two
     ! [branch@{3}] (2 minutes ago) reset: moving to HEAD^

instead of the correct:

  ! [branch@{0}] (0 seconds ago) commit: three
   ! [branch@{1}] (60 seconds ago) commit: two
    ! [branch@{2}] (2 minutes ago) reset: moving to HEAD^
     ! [branch@{3}] (2 minutes ago) commit: one

But there's another bug, too: because it is looking at the "old" value
of the reflog after the one we're interested in, it has to special-case
ref@{0} (since there isn't anything after it). That's why it doesn't
show the offset bug in the output above. But this special-case code
fails to handle the situation where the reflog is empty or missing; it
returns success even though the reflog message out-parameter has been
left uninitialized. You can't trigger this through get_oid_basic(), but
"show-branch --reflog" will pretty reliably segfault as it tries to
access the garbage pointer.

Fixing the segfault would be pretty easy. But the off-by-one problem is
inherent in this approach. So let's start by reverting the commit to
give us a clean slate to work with.

This isn't a pure revert; all of the code changes are reverted, but for
the tests:

  1. We'll flip the cases in t1508 to expect_failure; making these work
     was the goal of 6436a2028, and we'll want to use them for our
     replacement approach.

  2. There's a test in t3202 for "show-branch --reflog", but it expects
     the broken output! It was added by f2463490c4 (show-branch: show
     reflog message, 2021-12-02) which was fixing another bug, and I
     think the author simply didn't notice that the second line showed
     the wrong reflog.

     Rather than fixing that test, let's replace it with one that is
     more thorough (while still covering the reflog message fix from
     that commit). We'll use a longer reflog, which lets us see more
     entries (thus making the "off by one" pattern much more clear). And
     we'll use a more recent timestamp for "now" so that our relative
     dates have more resolution. That lets us see that the reflog dates
     are correct (whereas when you are 4 years away, two entries that
     are 60 seconds apart will have the same "4 years ago" relative
     date). Because we're adjusting the repository state, I've moved
     this new test to the end of the script, leaving the other tests
     undisturbed.

     We'll also add a new test which covers the missing reflog case;
     previously it segfaulted, but now it reports the empty reflog).

Reported-by: Yasushi SHOJI <yasushi.shoji@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:05:28 -08:00
3f4c7a0805 upload-pack: don't send null character in abort message to the client
Since 583b7ea31b (upload-pack/fetch-pack: support side-band
communication, 2006-06-21) the abort message sent by upload-pack in
case of possible repository corruption ends with a null character.
This can be seen in several test cases in 't5530-upload-pack-error.sh'
where 'grep <pattern> output.err' often reports "Binary file
output.err matches" because of that null character.

The reason for this is that the abort message is defined as a string
literal, and we pass its size to the send function as
sizeof(abort_msg), which also counts the terminating null character.

Use strlen() instead to avoid sending that terminating null character.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 09:49:08 -08:00
9a97b43e03 submodule: use strvec_pushf() for --submodule-prefix
Add the option --submodule-prefix and its argument directly using
strvec_pushf() instead of via a detour through a strbuf.  This is
shorter, easier to read and doesn't require any explicit cleanup
afterwards.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 09:45:57 -08:00
affe355fe7 userdiff: skip textconv caching when not in a repository
The textconv caching system uses git-notes to store its cache entries.
But if you're using "diff --no-index" outside of a repository, then
obviously that isn't going to work.

Since caching is just an optimization, it's OK for us to skip it.
However, the current behavior is much worse: we call notes_cache_init()
which tries to look up the ref, and the low-level ref code hits a BUG(),
killing the program. Instead, we should notice before setting up the
cache that it there's no repository, and just silently skip it.

Reported-by: Paweł Dominiak <dominiak.pawel@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 09:40:55 -08:00
f39addd0d9 name-rev: use mem_pool_strfmt()
1c56fc2084 (name-rev: pre-size buffer in get_parent_name(), 2020-02-04)
got a big performance boost in an unusual repository by calculating the
name length in advance.  This is a bit awkward, as it references the
name components twice.

Use a memory pool to store the strings for the struct rev_name member
tip_name.  Using mem_pool_strfmt() allows efficient allocation without
explicit size calculation.  This simplifies the formatting part of the
code without giving up performance:

Benchmark 1: ./git_2.44.0 -C ../chromium/src name-rev --all
  Time (mean ± σ):      1.231 s ±  0.013 s    [User: 1.082 s, System: 0.136 s]
  Range (min … max):    1.214 s …  1.252 s    10 runs

Benchmark 2: ./git -C ../chromium/src name-rev --all
  Time (mean ± σ):      1.220 s ±  0.020 s    [User: 1.083 s, System: 0.130 s]
  Range (min … max):    1.197 s …  1.254 s    10 runs

Don't bother discarding the memory pool just before exiting.  The effort
for that would be very low, but actually measurable in the above
example, with no benefit to users.  At least UNLEAK it to calm down leak
checkers.  This addresses the leaks that 45a14f578e (Revert "name-rev:
release unused name strings", 2022-04-22) brought back.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 09:35:40 -08:00
8d25663d70 mem-pool: add mem_pool_strfmt()
Add a function for building a string, printf style, using a memory pool.
It uses the free space in the current block in the first attempt.  If
that suffices then the result can already be used without copying or
reformatting.

For strings that are significantly shorter on average than the block
size (ca. 1 MiB by default) this is the case most of the time, leading
to a better perfomance than a solution that doesn't access mem-pool
internals.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 09:35:40 -08:00
87bd7fbb9c fetch: convert strncmp() with strlen() to starts_with()
Using strncmp() and strlen() to check whether a string starts with
another one requires repeating the prefix candidate.  Use starts_with()
instead, which reduces repetition and is more readable.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 08:58:45 -08:00
2ca6c07db2 compat: drop inclusion of <git-compat-util.h>
These two header files are included from ordinary source files that
already include <git-compat-util.h> as the first header file as they
should.  There is no need to include the compat-util in these
headers.

"make hdr-check" is not affected, as it is designed to assume that
what <git-compat-util.h> offers is available to everybody without
being included.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-24 14:37:41 -08:00
97d1f233c6 Documentation/config/pack.txt: fix broken AsciiDoc mark-up
In af626ac0e0 (pack-bitmap: enable reuse from all bitmapped packs,
2023-12-14), the documentation for `pack.allowPackReuse` was amended to
include its effect when set to "multi".

This split the documentation into two paragraphs, but did not de-dent
the second paragraph on the right-hand side of a line-continuation
marker. This causes the rendered documentation to appear oddly, where
the second paragraph is treated as a <pre> block when rendered as HTML.

Fix this by correctly removing the indentation on the second paragraph.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 13:47:16 -08:00
33d15b5435 for-each-ref: add new option to include root refs
The git-for-each-ref(1) command doesn't provide a way to print root refs
i.e pseudorefs and HEAD with the regular "refs/" prefixed refs.

This commit adds a new option "--include-root-refs" to
git-for-each-ref(1). When used this would also print pseudorefs and HEAD
for the current worktree.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:36:28 -08:00
810f7a1aac ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR'
The flag 'FILTER_REFS_ALL' is a bit ambiguous, where ALL doesn't specify
if it means to contain refs from all worktrees or whether all types of
refs (regular, HEAD & pseudorefs) or all of the above.

Since here it is actually referring to all refs with the "refs/" prefix,
let's rename it to 'FILTER_REFS_REGULAR' to indicate that this is
specifically for regular refs.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:36:27 -08:00
d0f00c1ac1 refs: introduce refs_for_each_include_root_refs()
Introduce a new ref iteration flag `DO_FOR_EACH_INCLUDE_ROOT_REFS`,
which will be used to iterate over regular refs plus pseudorefs and
HEAD.

Refs which fall outside the `refs/` and aren't either pseudorefs or HEAD
are more of a grey area. This is because we don't block the users from
creating such refs but they are not officially supported.

Introduce `refs_for_each_include_root_refs()` which calls
`do_for_each_ref()` with this newly introduced flag.

In `refs/files-backend.c`, introduce a new function
`add_pseudoref_and_head_entries()` to add pseudorefs and HEAD to the
`ref_dir`. We then finally call `add_pseudoref_and_head_entries()`
whenever the `DO_FOR_EACH_INCLUDE_ROOT_REFS` flag is set. Any new ref
backend will also have to implement similar changes on its end.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:36:27 -08:00
f768296cf1 refs: extract out loose_fill_ref_dir_regular_file()
Extract out the code for adding a single file to the loose ref dir as
`loose_fill_ref_dir_regular_file()` from `loose_fill_ref_dir()` in
`refs/files-backend.c`.

This allows us to use this function independently in the following
commits where we add code to also add pseudorefs to the ref dir.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:36:27 -08:00
1eba2240f8 refs: introduce is_pseudoref() and is_headref()
Introduce two new functions `is_pseudoref()` and `is_headref()`. This
provides the necessary functionality for us to add pseudorefs and HEAD
to the loose ref cache in the files backend, allowing us to build
tooling to print these refs.

The `is_pseudoref()` function internally calls `is_pseudoref_syntax()`
but adds onto it by also checking to ensure that the pseudoref either
ends with a "_HEAD" suffix or matches a list of exceptions. After which
we also parse the contents of the pseudoref to ensure that it conforms
to the ref format.

We cannot directly add the new syntax checks to `is_pseudoref_syntax()`
because the function is also used by `is_current_worktree_ref()` and
making it stricter to match only known pseudorefs might have unintended
consequences due to files like 'BISECT_START' which isn't a pseudoref
but sometimes contains object ID.

Keeping this in mind, we leave `is_pseudoref_syntax()` as is and create
`is_pseudoref()` which is stricter. Ideally we'd want to move the new
syntax checks to `is_pseudoref_syntax()` but a prerequisite for this
would be to actually remove the exception list by converting those
pseudorefs to also contain a '_HEAD' suffix and perhaps move bisect
related files like 'BISECT_START' to a new directory similar to the
'rebase-merge' directory.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:36:27 -08:00
342990c7aa fill_tree_descriptor(): mark error message for translation
There is an error message in that function to report a missing tree; In
contrast to three other, similar error messages, it is not marked for
translation yet.

Mark it for translation, and while at it, make the error message
consistent with the others by enclosing the SHA in parentheses.

This requires a change to t6030 which expects the previous format of the
commit message. Theoretically, this could present problems with existing
scripts that use `git bisect` and parse its output (because Git does not
provide other means for callers to discern between error conditions).
However, this is unlikely to matter in practice because the most common
course of action to deal with fatal corruptions is to report the error
message to the user and exit, rather than trying to do something with
the reported SHA of the missing tree.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:19:40 -08:00
5aca024a74 cache-tree: avoid an unnecessary check
The first thing the `parse_tree()` function does is to return early if
the tree has already been parsed. Therefore we do not need to guard the
`parse_tree()` call behind a check of that flag.

As of time of writing, there are no other instances of this in Git's
code bases: whenever the `parsed` flag guards a `parse_tree()` call, it
guards more than just that call.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:19:40 -08:00
aa9f618909 Always check parse_tree*()'s return value
Otherwise we may easily run into serious crashes: For example, if we run
`init_tree_desc()` directly after a failed `parse_tree()`, we are
accessing uninitialized data or trying to dereference `NULL`.

Note that the `parse_tree()` function already takes care of showing an
error message. The `parse_tree_indirectly()` and
`repo_get_commit_tree()` functions do not, therefore those latter call
sites need to show a useful error message while the former do not.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:19:40 -08:00
98c6d16d67 t4301: verify that merge-tree fails on missing blob objects
We just fixed a problem where `merge-tree` would not fail on missing
tree objects. Let's ensure that that problem does not occur with blob
objects (and won't, in the future, either).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:19:39 -08:00
f30e6c32d8 merge-ort: do check parse_tree()'s return value
The previous commit fixed a bug where a missing tree was reported, but
not treated as an error.

This patch addresses the same issue for the remaining two callers of
`parse_tree()`.

This change is not accompanied by a regression test because the code in
question is only reached at the `checkout` stage, i.e. after the merge
has happened (and therefore the tree objects could only be missing if
the disk had gone bad in that short time window, or something similarly
tricky to recreate in the test suite).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:19:39 -08:00
d4bf19308b merge-tree: fail with a non-zero exit code on missing tree objects
When `git merge-tree` encounters a missing tree object, it should error
out and not continue quietly as if nothing had happened.

However, as of time of writing, `git merge-tree` _does_ continue, and
then offers the empty tree as result.

Let's fix this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:19:39 -08:00
3c2a3fdc38 Git 2.44
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-22 16:14:53 -08:00
0d464a4e6a Git 2.43.3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-22 16:13:38 -08:00
5dc7366297 Merge branch 'la/trailer-cleanups' into maint-2.43
* la/trailer-cleanups:
  trailer: fix comment/cut-line regression with opts->no_divider
2024-02-22 16:09:45 -08:00
41bff66e35 doc: apply the new placeholder rules to git-add documentation
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 14:03:57 -08:00
0824639ddf doc: clarify the format of placeholders
Add the new format rule when using placeholders in the description of
commands and options.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 14:01:46 -08:00
6835f0efe9 git-remote.txt: fix typo
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 10:02:55 -08:00
d699d15c32 builtin/reflog: introduce subcommand to list reflogs
While the git-reflog(1) command has subcommands to show reflog entries
or check for reflog existence, it does not have any subcommands that
would allow the user to enumerate all existing reflogs. This makes it
quite hard to discover which reflogs a repository has. While this can
be worked around with the "files" backend by enumerating files in the
".git/logs" directory, users of the "reftable" backend don't enjoy such
a luxury.

Introduce a new subcommand `git reflog list` that lists all reflogs the
repository knows of to fill this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:07 -08:00
59c50a96c5 refs: stop resolving ref corresponding to reflogs
The reflog iterator tries to resolve the corresponding ref for every
reflog that it is about to yield. Historically, this was done due to
multiple reasons:

  - It ensures that the refname is safe because we end up calling
    `check_refname_format()`. Also, non-conformant refnames are skipped
    altogether.

  - The iterator used to yield the resolved object ID as well as its
    flags to the callback. This info was never used though, and the
    corresponding parameters were dropped in the preceding commit.

  - When a ref is corrupt then the reflog is not emitted at all.

We're about to introduce a new `git reflog list` subcommand that will
print all reflogs that the refdb knows about. Skipping over reflogs
whose refs are corrupted would be quite counterproductive in this case
as the user would have no way to learn about reflogs which may still
exist in their repository to help and rescue such a corrupted ref. Thus,
the only remaining reason for why we'd want to resolve the ref is to
verify its refname.

Refactor the code to call `check_refname_format()` directly instead of
trying to resolve the ref. This is significantly more efficient given
that we don't have to hit the object database anymore to list reflogs.
And second, it ensures that we end up showing reflogs of broken refs,
which will help to make the reflog more useful.

Note that this really only impacts the case where the corresponding ref
is corrupt. Reflogs for nonexistent refs would have been returned to the
caller beforehand already as we did not pass `RESOLVE_REF_READING` to
the function, and thus `refs_resolve_ref_unsafe()` would have returned
successfully in that case.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:06 -08:00
31f898397b refs: drop unused params from the reflog iterator callback
The ref and reflog iterators share much of the same underlying code to
iterate over the corresponding entries. This results in some weird code
because the reflog iterator also exposes an object ID as well as a flag
to the callback function. Neither of these fields do refer to the reflog
though -- they refer to the corresponding ref with the same name. This
is quite misleading. In practice at least the object ID cannot really be
implemented in any other way as a reflog does not have a specific object
ID in the first place. This is further stressed by the fact that none of
the callbacks except for our test helper make use of these fields.

Split up the infrastucture so that ref and reflog iterators use separate
callback signatures. This allows us to drop the nonsensical fields from
the reflog iterator.

Note that internally, the backends still use the same shared infra to
iterate over both types. As the backends should never end up being
called directly anyway, this is not much of a problem and thus kept
as-is for simplicity's sake.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:06 -08:00
5e01d83841 refs: always treat iterators as ordered
In the preceding commit we have converted the reflog iterator of the
"files" backend to be ordered, which was the only remaining ref iterator
that wasn't ordered. Refactor the ref iterator infrastructure so that we
always assume iterators to be ordered, thus simplifying the code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:06 -08:00
6f22780017 refs/files: sort merged worktree and common reflogs
When iterating through reflogs in a worktree we create a merged iterator
that merges reflogs from both refdbs. The resulting refs are ordered so
that instead we first return all worktree reflogs before we return all
common refs.

This is the only remaining case where a ref iterator returns entries in
a non-lexicographic order. The result would look something like the
following (listed with a command we introduce in a subsequent commit):

```
$ git reflog list
HEAD
refs/worktree/per-worktree
refs/heads/main
refs/heads/wt
```

So we first print the per-worktree reflogs in lexicographic order, then
the common reflogs in lexicographic order. This is confusing and not
consistent with how we print per-worktree refs, which are exclusively
sorted lexicographically.

Sort reflogs lexicographically in the same way as we sort normal refs.
As this is already implemented properly by the "reftable" backend via a
separate selection function, we simply pull out that logic and reuse it
for the "files" backend. As logs are properly sorted now, mark the
merged reflog iterator as sorted.

Tests will be added in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:06 -08:00
e69e8ffef7 refs/files: sort reflogs returned by the reflog iterator
We use a directory iterator to return reflogs via the reflog iterator.
This iterator returns entries in the same order as readdir(3P) would and
will thus yield reflogs with no discernible order.

Set the new `DIR_ITERATOR_SORTED` flag that was introduced in the
preceding commit so that the order is deterministic. While the effect of
this can only been observed in a test tool, a subsequent commit will
start to expose this functionality to users via a new `git reflog list`
subcommand.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:05 -08:00
de34f2651e dir-iterator: support iteration in sorted order
The `struct dir_iterator` is a helper that allows us to iterate through
directory entries. This iterator returns entries in the exact same order
as readdir(3P) does -- or in other words, it guarantees no specific
order at all.

This is about to become problematic as we are introducing a new reflog
subcommand to list reflogs. As the "files" backend uses the directory
iterator to enumerate reflogs, returning reflog names and exposing them
to the user would inherit the indeterministic ordering. Naturally, it
would make for a terrible user interface to show a list with no
discernible order.

While this could be handled at a higher level by the new subcommand
itself by collecting and ordering the reflogs, this would be inefficient
because we would first have to collect all reflogs before we can sort
them, which would introduce additional latency when there are many
reflogs.

Instead, introduce a new option into the directory iterator that asks
for its entries to be yielded in lexicographical order. If set, the
iterator will read all directory entries greedily and sort them before
we start to iterate over them.

While this will of course also incur overhead as we cannot yield the
directory entries immediately, it should at least be more efficient than
having to sort the complete list of reflogs as we only need to sort one
directory at a time.

This functionality will be used in a follow-up commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:05 -08:00
0218de2bdb dir-iterator: pass name to prepare_next_entry_data() directly
When adding the next directory entry for `struct dir_iterator` we pass
the complete `struct dirent *` to `prepare_next_entry_data()` even
though we only need the entry's name.

Refactor the code to pass in the name, only. This prepares for a
subsequent commit where we introduce the ability to iterate through
dir entries in an ordered manner.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:05 -08:00
244001aa20 rebase: make warning less passive aggressive
When you run `git rebase --continue` when no rebase is in progress, git
outputs `fatal: No rebase in progress?` which is not a question but a
statement. Make it appear as a statement, and use lowercase to align
with error message style.

Signed-off-by: Harmen Stoppels <me@harmenstoppels.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:52:34 -08:00
abab32a613 doc: end sentences with full-stop
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-20 15:03:13 -08:00
2e48553fda doc: close unclosed angle-bracket of a placeholder in git-clone doc
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-20 15:02:27 -08:00
de2852ab6f doc: git-rev-parse: enforce command-line description syntax
git-rev-parse(1) manpage is completely off with respect to the
command-line description syntax with badly formatted placeholders and
malformed alternatives.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-20 14:41:37 -08:00
82d75402d5 documentation: send-email: use camel case consistently
Correct a few random "sendemail.*" configuration parameter names in the
documentation that, for some unknown reason and contrary to the expected,
didn't use camel case format.

The majority of the corrections are straightforward, by using camel case
to denote boundaries of the individual words that, stringed together, make
up configuration parameter names.  A couple of abbreviations found in some
of the corrected configuration parameter names present some exceptions,
which are described in detail below.

First, there's "SSL" as the abbreviation for "Secure Sockets Layer". [1]
As such, it's written using all uppercase letters, which is pretty much the
general rule for making abbreviations, although with certain exceptions.

Second, there's "Cc" as the abbreviation for "carbon copy", which is another
exception.  As the acronym for "carbon copy", "cc" (mind the all lowercase
letters) stems from the rather old times when, literally, carbon copies were
made. [2]  Therefore, using "CC" (mind the all uppercase letters) or "cc"
(mind the all lowercase letters) would be technically correct in the email
domain, as the abbreviation or as mentioned in RFC2076, [3] respectively, but
the age of email has established "Cc" (mind the mixed uppercase and lowercase
letters) as some kind of de facto standard. [1][4][5]  Moreover, some of the
git utilities, primarily git-send-email(1), already refer to making email
carbon copies as specifying "Cc:" email headers.  As a result, "Cc" becomes
one of the exceptions to the general rule for making abbreviations.

[1] https://en.wikipedia.org/wiki/Transport_Layer_Security
[2] https://en.wikipedia.org/wiki/Carbon_copy
[3] https://datatracker.ietf.org/doc/html/rfc2076
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=212059
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=50826

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-20 14:37:44 -08:00
eb84c8b6ce git-difftool--helper: honor --trust-exit-code with --dir-diff
The `--trust-exit-code` option for git-diff-tool(1) was introduced via
2b52123fcf (difftool: add support for --trust-exit-code, 2014-10-26).
When set, it makes us return the exit code of the invoked diff tool when
diffing multiple files. This patch didn't change the code path where
`--dir-diff` was passed because we already returned the exit code of the
diff tool unconditionally in that case.

This was changed a month later via c41d3fedd8 (difftool--helper: add
explicit exit statement, 2014-11-20), where an explicit `exit 0` was
added to the end of git-difftool--helper.sh. While the stated intent of
that commit was merely a cleanup, it had the consequence that we now
to ignore the exit code of the diff tool when `--dir-diff` was set. This
change in behaviour is thus very likely an unintended side effect of
this patch.

Now there are two ways to fix this:

  - We can either restore the original behaviour, which unconditionally
    returned the exit code of the diffing tool when `--dir-diff` is
    passed.

  - Or we can make the `--dir-diff` case respect the `--trust-exit-code`
    flag.

The fact that we have been ignoring exit codes for 7 years by now makes
me rather lean towards the latter option. Furthermore, respecting the
flag in one case but not the other would needlessly make the user
interface more complex.

Fix the bug so that we also honor `--trust-exit-code` for dir diffs and
adjust the documentation accordingly.

Reported-by: Jean-Rémy Falleri <jr.falleri@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-20 09:30:32 -08:00
f41f85c9ec Git 2.44-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-19 21:01:01 -08:00
58aa645fc0 Merge branch 'la/trailer-cleanups'
Fix to an already-graduated topic.

* la/trailer-cleanups:
  trailer: fix comment/cut-line regression with opts->no_divider
2024-02-19 20:58:06 -08:00
bc47139f4f trailer: fix comment/cut-line regression with opts->no_divider
Commit 97e9d0b78a (trailer: find the end of the log message, 2023-10-20)
combined two code paths for finding the end of the log message. For the
"no_divider" case, we used to use find_trailer_end(), and that has now
been rolled into find_end_of_log_message(). But there's a regression;
that function returns early when no_divider is set, returning the whole
string.

That's not how find_trailer_end() behaved. Although it did skip the
"---" processing (which is what "no_divider" is meant to do), we should
still respect ignored_log_message_bytes(), which covers things like
comments, "commit -v" cut lines, and so on.

The bug is actually in the interpret-trailers command, but the obvious
way to experience it is by running "commit -v" with a "--trailer"
option. The new trailer will be added at the end of the verbose diff,
rather than before it (and consequently will be ignored entirely, since
everything after the diff's intro scissors line is thrown away).

I've added two tests here: one for interpret-trailers directly, which
shows the bug via the parsing routines, and one for "commit -v".

The fix itself is pretty simple: instead of returning early, no_divider
just skips the "---" handling but still calls ignored_log_message_bytes().

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-19 19:06:18 -08:00
64562d784d doc: remove outdated information about interactive.singleKey
The Perl implementation of add --interactive was removed in commit [1].

Additionally, the interactive.singleKey setting is no longer silently
ignored. The internal implementation of ReadKey [2] displays a warning
if the platform is unsupported.

[1] 20b813d7d (add: remove "add.interactive.useBuiltin" & Perl "git add--interactive", 2023-02-06)
[2] a5e46e6b0 (terminal: add a new function to read a single keystroke, 2020-01-14)

Signed-off-by: Julio Bacellari <julio.bacel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-19 15:12:41 -08:00
e41d68b75c t0303: check that helper_test_clean removes all credentials
Our lib-credential.sh library comes with a "clean" function that removes
all of the credentials used in its tests (to avoid leaving cruft in
system credential storage). But it's easy to add a test that uses a new
credential but forget to add it to the clean function.  E.g., the case
fixed by 83e6eb7d7a (t/lib-credential: clean additional credential,
2024-02-15).

We should be able to catch this automatically, but it's a little tricky.

We can't just compare the contents of the helper's storage before and
after the test run, because there isn't a way to ask a helper to dump
all of its storage. And in most cases we don't have direct access to the
underlying storage (since the whole point of the helper is to abstract
that away). We can work around that by using our own "store" helper,
since we can directly inspect its state by looking at its on-disk file.

But there's a catch: the "store" helper doesn't support features like
caching or expiration, so using it naively fails tests (and skipping
those tests would give us incomplete coverage). Implementing all of
those features would be non-trivial. But we can hack around that by
overriding the "check" function used by the tests to turn most requests
into noop success (except for "approve" requests, which actually store
things).

And then at the end we can check that running the "clean" function takes
us back to an empty state.

Note that because we've skipped any tests that erase credentials
(because of our noop check function), the state we see at cleanup time
may be larger than it would be normally. That's OK. The point of the
clean function is to clean up any cruft we _might_ have left in place,
so we're just being doubly thorough.

The way this is bolted onto t0303 feels a little messy. But it's really
the best place to do it, because then we know that it is running the
exact sequence of tests that we'd use for testing a real external
helper. In a normal run of "make test" it currently does nothing (the
idea is that you run it manually after pointing it at some helper
program). But now with this patch, "make test" will sanity-check the
script itself.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-19 15:01:41 -08:00
30b1e8b920 Merge branch 'ba/credential-test-clean-fix' into jk/t0303-clean
* ba/credential-test-clean-fix:
  t/lib-credential: clean additional credential
2024-02-19 15:01:32 -08:00
8f1f2023b7 libsecret: retrieve empty password
Since 0ce02e2f (credential/libsecret: store new attributes, 2023-06-16)
a test that stores empty username and password fails when
t0303-credential-external.sh is run with
GIT_TEST_CREDENTIAL_HELPER=libsecret.

Retrieve empty password carefully. This fixes test:

    ok 14 - helper (libsecret) can store empty username

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-19 14:36:35 -08:00
f7cdeafdd0 Merge branch 'ps/reftable-backend' into ps/reflog-list
* ps/reftable-backend:
  refs/reftable: fix leak when copying reflog fails
  ci: add jobs to test with the reftable backend
  refs: introduce reftable backend
2024-02-19 10:50:07 -08:00
b21d164275 mergetools: vimdiff: use correct tool's name when reading mergetool config
The /mergetools/vimdiff script, which handles both vimdiff, nvimdiff
and gvimdiff mergetools (the latter 2 simply source the vimdiff script), has a
function merge_cmd() which read the layout variable from git config, and it
would always read the value of mergetool.**vimdiff**.layout, instead of the
mergetool being currently used (vimdiff or nvimdiff or gvimdiff).

It looks like in 7b5cf8be18 (vimdiff: add tool documentation, 2022-03-30),
we explained the current behavior in Documentation/config/mergetool.txt:

```
mergetool.vimdiff.layout::
	The vimdiff backend uses this variable to control how its split
	windows look like. Applies even if you are using Neovim (`nvim`) or
	gVim (`gvim`) as the merge tool. See BACKEND SPECIFIC HINTS section
```

which makes sense why it's explained this way - the vimdiff backend is used by
gvim and nvim. But the mergetool's configuration should be separate for each tool,
and indeed that's confirmed in same commit at Documentation/mergetools/vimdiff.txt:

```
Variants

Instead of `--tool=vimdiff`, you can also use one of these other variants:
  * `--tool=gvimdiff`, to open gVim instead of Vim.
  * `--tool=nvimdiff`, to open Neovim instead of Vim.

When using these variants, in order to specify a custom layout you will have to
set configuration variables `mergetool.gvimdiff.layout` and
`mergetool.nvimdiff.layout` instead of `mergetool.vimdiff.layout`
```

So it looks like we just forgot to update the 1 part of the vimdiff script
that read the config variable. Cheers.

Though, for backward compatibility, I've kept the mergetool.vimdiff
fallback, so that people who unknowingly relied on it, won't have their
setup broken now.

Signed-off-by: Kipras Melnikovas <kipras@kipras.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-19 08:45:14 -08:00
96c8a0712e Merge tag 'l10n-2.44.0-rnd3' of https://github.com/git-l10n/git-po
l10n-2.44.0-rnd3

* tag 'l10n-2.44.0-rnd3' of https://github.com/git-l10n/git-po:
  l10n: zh_TW: Git 2.44
  l10n: zh_CN: for git 2.44 rounds
  l10n: Update German translation
  l10n: tr: Update Turkish translations for 2.44
  l10n: fr.po: v2.44.0 round 3
  l10n: bg.po: Updated Bulgarian translation (5610t)
  l10n: sv.po: Update Swedish translation
  l10n: Update Catalan translation
  l10n: po-id for 2.44 (round 1)
  l10n: ci: disable cache for setup-go to suppress warnings
  l10n: ci: remove unused param for add-pr-comment@v2
  l10n: uk: v2.44 update (round 3)
  l10n: uk: v2.44 update (round 2)
  l10n: uk: v2.44 localization update
  l10n: bump Actions versions in l10n.yml
2024-02-19 08:35:40 -08:00
5fdd5b989c l10n: zh_TW: Git 2.44
Co-Authored-By: lumynou5 <lumynou5.tw@gmail.com>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2024-02-18 21:03:43 +08:00
63e81f22a6 Merge branch 'master' of github.com:ralfth/git
* 'master' of github.com:ralfth/git:
  l10n: Update German translation
2024-02-18 20:33:01 +08:00
9c4289b3db Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.44 (round 1)
2024-02-18 20:31:55 +08:00
3a00233815 Merge branch '2.44-uk-update' of github.com:arkid15r/git-ukrainian-l10n
* '2.44-uk-update' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: uk: v2.44 update (round 3)
  l10n: uk: v2.44 update (round 2)
  l10n: uk: v2.44 localization update
2024-02-18 20:30:05 +08:00
ce2f6a001f Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5610t)
2024-02-18 20:28:57 +08:00
499f952da0 Merge branch 'tr-l10n' of github.com:bitigchi/git-po
* 'tr-l10n' of github.com:bitigchi/git-po:
  l10n: tr: Update Turkish translations for 2.44
2024-02-18 20:27:47 +08:00
45ebe3fcf6 Merge branch 'fr_2.44.0' of github.com:jnavila/git
* 'fr_2.44.0' of github.com:jnavila/git:
  l10n: fr.po: v2.44.0 round 3
2024-02-18 20:26:45 +08:00
61ad0f6484 Merge branch 'catalan-l10n' of github.com:Softcatala/git-po
* 'catalan-l10n' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2024-02-18 20:25:32 +08:00
362f27f8a8 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation
2024-02-18 20:24:48 +08:00
3c58354a53 l10n: zh_CN: for git 2.44 rounds
In addition to the localized translation in 2.44, for zh_CN, we have
uniformly modified the translation of the word "commit-graph" to make it
more consistent with language usage habits.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
2024-02-18 11:48:52 +08:00
d44a018852 RelNotes: minor typo fixes in 2.44.0 draft
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-17 10:11:55 -08:00
37c2ad6535 l10n: Update German translation
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2024-02-17 18:14:54 +01:00
3c2e3d42d1 completion: use awk for filtering the config entries
Commits 1e0ee4087e (completion: add and use
__git_compute_first_level_config_vars_for_section, 2024-02-10) and
6e32f718ff (completion: add and use
__git_compute_second_level_config_vars_for_section, 2024-02-10)
introduced new helpers for config completion.

Both helpers use a pipeline of grep and awk to filter the list of config
entries. awk is perfectly capable of filtering, so let's eliminate the
grep process and move the filtering into the awk script.

The "-E" grep option (extended syntax) was not necessary, as $section is
a single word.

While at it, wrap the over-long lines to make them more readable.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-16 12:14:11 -08:00
b927408183 l10n: tr: Update Turkish translations for 2.44
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2024-02-16 22:06:18 +03:00
2675562081 l10n: fr.po: v2.44.0 round 3
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2024-02-16 19:20:07 +01:00
330e4198b8 l10n: bg.po: Updated Bulgarian translation (5610t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2024-02-16 09:39:04 +01:00
20657a8b43 l10n: sv.po: Update Swedish translation
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2024-02-16 07:59:21 +01:00
6f5e31bec7 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2024-02-16 07:18:20 +01:00
c293cf8c47 l10n: po-id for 2.44 (round 1)
Update following components:

  * builtin/replay.c
  * command-list.h
  * commit-graph.c
  * pack-bitmap.c
  * sequencer.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2024-02-16 11:01:11 +07:00
1bb7fcbffc l10n: ci: disable cache for setup-go to suppress warnings
After we upgraded actions/setup-go to v5, the following warning message
was reported every time we ran the CI.

    Restore cache failed: Dependencies file is not found ...

Disable cache to suppress warning messages as described in the solution
below.

    https://github.com/actions/setup-go/issues/427

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2024-02-16 11:51:19 +08:00
4d733f09f0 l10n: ci: remove unused param for add-pr-comment@v2
When we upgraded GitHub Actions "mshick/add-pr-comment" to v2, the
following warning message was reported every time we ran the CI.

    Unexpected input(s) 'repo-token-user-login', valid inputs ...

Removed the obsolete parameter "repo-token-user-login" to suppress
warning messages.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2024-02-16 11:40:58 +08:00
a2e183e065 l10n: uk: v2.44 update (round 3)
Signed-off-by: Arkadii Yakovets <ark@cho.red>
2024-02-15 18:05:05 -08:00
6ad5961c91 l10n: uk: v2.44 update (round 2)
Signed-off-by: Arkadii Yakovets <ark@cho.red>
2024-02-15 18:02:14 -08:00
ed8e89ec8c l10n: uk: v2.44 localization update
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2024-02-15 18:02:13 -08:00
c68ee9b9cc Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git:
  diff: mark param1 and param2 as placeholders
2024-02-16 09:39:06 +08:00
3e0d3cd5c7 Merge branch 'jx/dirstat-parseopt-help'
The mark-up of diff options has been updated to help translators.

* jx/dirstat-parseopt-help:
  diff: mark param1 and param2 as placeholders
2024-02-15 15:14:48 -08:00
83e6eb7d7a t/lib-credential: clean additional credential
71201ab0e5 (t/lib-credential.sh: ensure credential helpers handle long
headers, 2023-05-01) added a test which stores credentials with the host
victim.example.com but this was never cleaned up, leaving residual data
in the credential store after running the tests.

Add a cleanup call for this credential to resolve this issue.

Signed-off-by: Bo Anderson <mail@boanderson.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 14:16:38 -08:00
5918f30b65 t7003: ensure filter-branch prunes reflogs with the reftable backend
In t7003 we conditionally check whether the reflog for branches pruned
by git-filter-branch(1) get deleted based on whether or not we use the
"files" backend. Same as with the preceding commit, this condition was
added because in its initial iteration the "reftable" backend did not
delete reflogs when their corresponding ref was deleted. Since then, the
backend has been aligned to behave the same as the "files" backend
though, which makes this check unnecessary.

Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 10:12:31 -08:00
f85a032c67 t2011: exercise D/F conflicts with HEAD with the reftable backend
Some of the tests in t2011 exercise whether it is possible to move away
from a symbolic HEAD ref whose target ref has a directory-file conflict
with another, preexisting ref. These tests don't use git-symbolic-ref(1)
but manually write HEAD. This is supposedly done to avoid using logic
that we're about to exercise, but it makes it impossible to verify
whether the logic also works for ref backends other than "files".

Refactor the code to use git-symbolic-ref(1) instead so that the tests
work with the "reftable" backend, as well. We already have lots of tests
in t1404 that ensure that both git-update-ref(1) and git-symbolic-ref(1)
work in such a scenario, so it should be safe to rely on it here.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 10:12:30 -08:00
c4e3af6d97 t1405: remove unneeded cleanup step
In 5e00514745 (t1405: explictly delete reflogs for reftable, 2022-01-31)
we have added a test that explicitly deletes the reflog when not using
the "files" backend. This was required because back then, the "reftable"
backend didn't yet delete reflogs when deleting their corresponding
branches, and thus subsequent tests would fail because some unexpected
reflogs still exist.

The "reftable" backend was eventually changed though so that it behaves
the same as the "files" backend and deletes reflogs when deleting refs.
This was done to make the "reftable" backend behave like the "files"
backend as closely as possible so that it can act as a drop-in
replacement.

The cleanup-style test is thus not required anymore. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 10:12:30 -08:00
3f87bb2c2b t1404: make D/F conflict tests compatible with reftable backend
Some of the tests in t1404 exercise whether Git correctly aborts
transactions when there is a directory/file conflict with ref names.
While these tests are all marked to require the "files" backend, they do
in fact apply to the "reftable" backend as well.

This may not make much sense on the surface: D/F conflicts only exist
because the "files" backend uses the filesystem to store loose refs, and
thus the restriction theoretically shouldn't apply to the "reftable"
backend. But for now, the "reftable" backend artificially restricts the
creation of such conflicting refs so that it is a drop-in replacement
for the "files" backend. This also ensures that the "reftable" backend
can easily be used on the server side without causing issues for clients
which only know to use the "files" backend.

The only difference between the "files" and "reftable" backends is a
slightly different error message. Adapt the tests to accomodate for this
difference and remove the REFFILES prerequisite so that we start testing
with both backends.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 10:12:30 -08:00
129a169874 t1400: exercise reflog with gaps with reftable backend
In t1400, we have a test that exercises whether we print a warning
message as expected when the reflog contains entries which have a gap
between the old entry's new object ID and the new entry's old object ID.
While the logic should apply to all ref backends, the test setup writes
into `.git/logs` directly and is thus "files"-backend specific.

Refactor the test to instead use `git reflog delete` to create the gap
and drop the REFFILES prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 10:12:30 -08:00
e98839843b t0410: convert tests to use DEFAULT_REPO_FORMAT prereq
In t0410 we have two tests which exercise how partial clones behave in
the context of a repository with extensions. These tests are marked to
require a repository using SHA1 and the "files" backend because we
explicitly set the repository format version to 0, and setting up either
the "objectFormat" or "refStorage" extensions requires a repository
format version of 1.

We have recently introduced a new DEFAULT_REPO_FORMAT prerequisite.
Despite capturing the intent more directly, it also has the added
benefit that it can easily be extended in the future in case we add new
repository extensions. Adapt the tests to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 10:12:30 -08:00
161d981641 t: move tests exercising the "files" backend
We still have a bunch of tests scattered across our test suites that
exercise on-disk files of the "files" backend directly:

  - t1301 exercises permissions of reflog files when the config
    "core.sharedRepository" is set.

  - t1400 exercises whether empty directories in the ref store are
    handled correctly.

  - t3200 exercises what happens when there are symlinks in the ref
    store.

  - t3400 also exercises what happens when ".git/logs" is a symlink.

All of these are inherently low-level tests specific to the "files"
backend. Move them into "t0600-reffiles-backend.sh" to reflect this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15 10:12:30 -08:00
f98643fcb2 Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (51 commits)
  Hopefully the last batch of fixes before 2.44 final
  Git 2.43.2
  A few more fixes before -rc1
  write-or-die: fix the polarity of GIT_FLUSH environment variable
  A few more topics before -rc1
  completion: add and use __git_compute_second_level_config_vars_for_section
  completion: add and use __git_compute_first_level_config_vars_for_section
  completion: complete 'submodule.*' config variables
  completion: add space after config variable names also in Bash 3
  receive-pack: use find_commit_header() in check_nonce()
  ci(linux32): add a note about Actions that must not be updated
  ci: bump remaining outdated Actions versions
  unit-tests: do show relative file paths on non-Windows, too
  receive-pack: use find_commit_header() in check_cert_push_options()
  prune: mark rebase autostash and orig-head as reachable
  sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
  ref-filter.c: sort formatted dates by byte value
  ssh signing: signal an error with a negative return value
  bisect: document command line arguments for "bisect start"
  bisect: document "terms" subcommand more fully
  ...
2024-02-15 09:48:25 +08:00
4fc51f00ef Hopefully the last batch of fixes before 2.44 final
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 15:36:06 -08:00
89400c3615 Merge branch 'pb/complete-config'
The command line completion script (in contrib/) learned to
complete configuration variable names better.

* pb/complete-config:
  completion: add and use __git_compute_second_level_config_vars_for_section
  completion: add and use __git_compute_first_level_config_vars_for_section
  completion: complete 'submodule.*' config variables
  completion: add space after config variable names also in Bash 3
2024-02-14 15:36:06 -08:00
c59ba68ea7 Merge branch 'js/check-null-from-read-object-file'
The code paths that call repo_read_object_file() have been
tightened to react to errors.

* js/check-null-from-read-object-file:
  Always check the return value of `repo_read_object_file()`
2024-02-14 15:36:06 -08:00
e864023188 Merge branch 'rs/receive-pack-remove-find-header'
Code simplification.

* rs/receive-pack-remove-find-header:
  receive-pack: use find_commit_header() in check_nonce()
  receive-pack: use find_commit_header() in check_cert_push_options()
2024-02-14 15:36:05 -08:00
c036a145c3 Merge branch 'vn/rebase-with-cherry-pick-authorship'
"git cherry-pick" invoked during "git rebase -i" session lost
the authorship information, which has been corrected.

* vn/rebase-with-cherry-pick-authorship:
  sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
2024-02-14 15:36:05 -08:00
09e0aa64b3 Merge branch 'pw/gc-during-rebase'
The sequencer machinery does not use the ref API and instead
records names of certain objects it needs for its correct operation
in temporary files, which makes these objects susceptible to loss
by garbage collection.  These temporary files have been added as
starting points for reachability analysis to fix this.

* pw/gc-during-rebase:
  prune: mark rebase autostash and orig-head as reachable
2024-02-14 15:36:05 -08:00
c431a235e2 t9146: replace test -d/-e/-f with appropriate test_path_is_* function
The helper functions test_path_is_* provide better debugging
information than test -d/-e/-f.

Replace "if ! test -d then <error message>" and "test -d" with
"test_path_is_dir" at places where we check for existent directories.

Replace "test -f" with "test_path_is_file" at places where we check
for existent files.

Replace "test ! -e" and "if test -d then <error message>" with
"test_path_is_missing" where we check for non-existent directories.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 11:06:28 -08:00
a171dac734 doc: add shortcut to "am --whitespace=<action>"
We refer readers of "git am --help" to "git apply --help" for many
options that are passed through, and most of them are simple
booleans, but --whitespace takes from a set of actions whose names
may slip users' minds.  Give a list of them in "git am --help" to
reduce one level of redirection only to find out what they are.

In the helper function to parse the available options, there was a
helpful comment reminding the developer to update list of <action>s
in the completion script. Mention the two documentation pages there
as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 11:00:04 -08:00
92e66478fc tag: error when git-column fails
If the user asks for the list of tags to be displayed in columns
("--columns"), a child git-column process is used to format the output
as expected.

In a rare situation where we encounter a problem spawning that child
process, we will work erroneously.

Make noticeable we're having a problem executing git-column, so the user
can act accordingly.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 10:16:40 -08:00
7b644c8c5a rev-list: allow missing tips with --missing=[print|allow*]
In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with with missing commits, not just blobs/trees.

Unfortunately, such a command would still fail with a "fatal: bad
object <oid>" if it is passed a missing commit, blob or tree as an
argument (before the rev walking even begins).

When such a command is used to find the dependencies of some objects,
for example the dependencies of quarantined objects (see the
"QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
documentation), it would be better if the command would instead
consider such missing objects, especially commits, in the same way as
other missing objects.

If, for example `--missing=print` is used, it would be nice for some
use cases if the missing tips passed as arguments were reported in
the same way as other missing objects instead of the command just
failing.

We could introduce a new option to make it work like this, but most
users are likely to prefer the command to have this behavior as the
default one. Introducing a new option would require another dumb loop
to look for that option early, which isn't nice.

Also we made `git rev-list` work with missing commits very recently
and the command is most often passed commits as arguments. So let's
consider this as a bug fix related to these recent changes.

While at it let's add a NEEDSWORK comment to say that we should get
rid of the existing ugly dumb loops that parse the
`--exclude-promisor-objects` and `--missing=...` options early.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 09:39:14 -08:00
686101ffc9 t6022: fix 'test' style and 'even though' typo
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 09:39:14 -08:00
eaf07b7d15 oidset: refactor oidset_insert_from_set()
In a following commit, we will need to add all the oids from a set into
another set. In "list-objects-filter.c", there is already a static
function called add_all() to do that.

Let's rename this function oidset_insert_from_set() and move it into
oidset.{c,h} to make it generally available.

While at it, let's remove a useless `!= NULL`.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 09:39:14 -08:00
3ff56af99b revision: clarify a 'return NULL' in get_reference()
When we know a pointer variable is NULL, it's clearer to
explicitly return NULL than to return that variable.

In get_reference(), when 'object' is NULL, we already return NULL
when 'revs->exclude_promisor_objects && is_promisor_object(oid)' is
true, but we return 'object' when 'revs->ignore_missing' is true.

Let's make the code clearer and more uniform by also explicitly
returning NULL when 'revs->ignore_missing' is true.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 09:38:22 -08:00
5e7013aa14 diff: mark param1 and param2 as placeholders
Some l10n translators translated the parameters "files", "param1" and
"param2" in the following message:

    "synonym for --dirstat=files,param1,param2..."

Translating "param1" and "param2" is OK, but changing the parameter
"files" is wrong. The parameters that are not meant to be used verbatim
should be marked as placeholders, but the verbatim parameter not marked
as a placeholder should be left as is.

This change is a complement for commit 51e846e673 (doc: enforce
placeholders in documentation, 2023-12-25).

With the help of Jean-Noël,some parameter combinations in one
placeholder (e.g. "<param1,param2>...") are splited into seperate
placeholders.

Helped-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14 09:29:10 -08:00
edae91a4cf Git 2.44-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 15:12:53 -08:00
efb050becb Git 2.43.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 14:44:51 -08:00
dc55772259 Merge branch 'pb/template-for-single-commit-pr' into maint-2.43
Doc update.

* pb/template-for-single-commit-pr:
  .github/PULL_REQUEST_TEMPLATE.md: add a note about single-commit PRs
2024-02-13 14:44:51 -08:00
1e73351fef Merge branch 'jc/bisect-doc' into maint-2.43
Doc update.

* jc/bisect-doc:
  bisect: document command line arguments for "bisect start"
  bisect: document "terms" subcommand more fully
2024-02-13 14:44:51 -08:00
8d792dcd5a Merge branch 'js/win32-retry-pipe-write-on-enospc' into maint-2.43
Update to the code that writes to pipes on Windows.

* js/win32-retry-pipe-write-on-enospc:
  win32: special-case `ENOSPC` when writing to a pipe
2024-02-13 14:44:51 -08:00
08b7e46bb1 Merge branch 'tb/pack-bitmap-drop-unused-struct-member' into maint-2.43
Code clean-up.

* tb/pack-bitmap-drop-unused-struct-member:
  pack-bitmap: drop unused `reuse_objects`
2024-02-13 14:44:51 -08:00
8f1c4a7db1 Merge branch 'jt/p4-spell-re-with-raw-string' into maint-2.43
"git p4" update to squelch warnings from Python.

* jt/p4-spell-re-with-raw-string:
  git-p4: use raw string literals for regular expressions
2024-02-13 14:44:50 -08:00
b6fdf929ee Merge branch 'jc/coc-whitespace-fix' into maint-2.43
Docfix.

* jc/coc-whitespace-fix:
  CoC: whitespace fix
2024-02-13 14:44:50 -08:00
81e3eea77c Merge branch 'sd/negotiate-trace-fix' into maint-2.43
Tracing fix.

* sd/negotiate-trace-fix:
  push: region_leave trace for negotiate_using_fetch
2024-02-13 14:44:50 -08:00
05a961754e Merge branch 'jc/majordomo-to-subspace' into maint-2.43
Doc update.

* jc/majordomo-to-subspace:
  Docs: majordomo@vger.kernel.org has been decomissioned
2024-02-13 14:44:50 -08:00
f2e7998613 Merge branch 'nb/rebase-x-shell-docfix' into maint-2.43
Doc update.

* nb/rebase-x-shell-docfix:
  rebase: fix documentation about used shell in -x
2024-02-13 14:44:49 -08:00
2928250218 Merge branch 'la/strvec-comment-fix' into maint-2.43
Comment fix.

* la/strvec-comment-fix:
  strvec: use correct member name in comments
2024-02-13 14:44:49 -08:00
5193aee2a3 Merge branch 'ne/doc-filter-blob-limit-fix' into maint-2.43
Docfix.

* ne/doc-filter-blob-limit-fix:
  rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
2024-02-13 14:44:49 -08:00
7687ca5a90 Merge branch 'cp/git-flush-is-an-env-bool' into maint-2.43
Recent conversion to allow more than 0/1 in GIT_FLUSH broke the
mechanism by flipping what yes/no means by mistake, which has been
corrected.

* cp/git-flush-is-an-env-bool:
  write-or-die: fix the polarity of GIT_FLUSH environment variable
2024-02-13 14:44:49 -08:00
bd10c45672 Merge branch 'ps/report-failure-from-git-stash' into maint-2.43
"git stash" sometimes was silent even when it failed due to
unwritable index file, which has been corrected.

* ps/report-failure-from-git-stash:
  builtin/stash: report failure to write to index
2024-02-13 14:44:49 -08:00
07fa383615 Merge branch 'jc/sign-buffer-failure-propagation-fix' into maint-2.43
A failed "git tag -s" did not necessarily result in an error
depending on the crypto backend, which has been corrected.

* jc/sign-buffer-failure-propagation-fix:
  ssh signing: signal an error with a negative return value
  tag: fix sign_buffer() call to create a signed tag
2024-02-13 14:44:48 -08:00
a1cd814f1f Merge branch 'jc/comment-style-fixes' into maint-2.43
Rewrite //-comments to /* comments */ in files whose comments
prevalently use the latter.

* jc/comment-style-fixes:
  reftable/pq_test: comment style fix
  merge-ort.c: comment style fix
  builtin/worktree: comment style fixes
2024-02-13 14:44:48 -08:00
5071cb78a3 Merge branch 'jk/diff-external-with-no-index' into maint-2.43
"git diff --no-index file1 file2" segfaulted while invoking the
external diff driver, which has been corrected.

* jk/diff-external-with-no-index:
  diff: handle NULL meta-info when spawning external diff
2024-02-13 14:44:48 -08:00
d982de5d32 Merge branch 'rs/parse-options-with-keep-unknown-abbrev-fix' into maint-2.43
"git diff --no-rename A B" did not disable rename detection but did
not trigger an error from the command line parser.

* rs/parse-options-with-keep-unknown-abbrev-fix:
  parse-options: simplify positivation handling
  parse-options: fully disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
2024-02-13 14:44:48 -08:00
904ca69428 Merge branch 'en/diffcore-delta-final-line-fix' into maint-2.43
Rename detection logic ignored the final line of a file if it is an
incomplete line.

* en/diffcore-delta-final-line-fix:
  diffcore-delta: avoid ignoring final 'line' of file
2024-02-13 14:44:48 -08:00
908fde12b0 Merge branch 'tc/show-ref-exists-fix' into maint-2.43
Update to a new feature recently added, "git show-ref --exists".

* tc/show-ref-exists-fix:
  builtin/show-ref: treat directory as non-existing in --exists
2024-02-13 14:44:47 -08:00
4cde9f0726 A few more fixes before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 14:31:12 -08:00
4abab9e51a Merge branch 'cp/git-flush-is-an-env-bool'
Recent conversion to allow more than 0/1 in GIT_FLUSH broke the
mechanism by flipping what yes/no means by mistake, which has been
corrected.

* cp/git-flush-is-an-env-bool:
  write-or-die: fix the polarity of GIT_FLUSH environment variable
2024-02-13 14:31:12 -08:00
9115864cb5 Merge branch 'jc/unit-tests-make-relative-fix'
The mechanism to report the filename in the source code, used by
the unit-test machinery, assumed that the compiler expanded __FILE__
to the path to the source given to the $(CC), but some compilers
give full path, breaking the output.  This has been corrected.

* jc/unit-tests-make-relative-fix:
  unit-tests: do show relative file paths on non-Windows, too
2024-02-13 14:31:11 -08:00
c2914d4677 Merge branch 'js/github-actions-update'
Update remaining GitHub Actions jobs to avoid warnings against
using deprecated version of Node.js.

* js/github-actions-update:
  ci(linux32): add a note about Actions that must not be updated
  ci: bump remaining outdated Actions versions
2024-02-13 14:31:11 -08:00
133a7b08dc Merge branch 'jc/github-actions-update'
Squelch node.js 16 deprecation warnings from GitHub Actions CI
by updating actions/github-script and actions/checkout that use
node.js 20.

* jc/github-actions-update:
  GitHub Actions: update to github-script@v7
  GitHub Actions: update to checkout@v4
2024-02-13 14:31:11 -08:00
7abc1869e5 add -p tests: remove PERL prerequisites
The Perl version of the add -i/-p commands has been removed since
20b813d (add: remove "add.interactive.useBuiltin" & Perl "git
add--interactive", 2023-02-07)

Therefore, Perl prerequisite in the test scripts which use the patch
mode functionality is not neccessary.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 14:12:53 -08:00
5a8ed3fe45 add-patch: classify '@' as a synonym for 'HEAD'
Currently, (restore, checkout, reset) commands correctly take '@' as a
synonym for 'HEAD'. However, in patch mode different prompts/messages
are given on command line due to patch mode machinery not considering
'@' to be a synonym for 'HEAD' due to literal string comparison with
the word 'HEAD', and therefore assigning patch_mode_($command)_nothead
and triggering reverse mode (-R in diff-index). The NEEDSWORK comment
suggested comparing commit objects to get around this. However, doing
so would also take a non-checked out branch pointing to the same commit
as HEAD, as HEAD. This would cause confusion to the user.

Therefore, after parsing '@', replace it with 'HEAD' as reasonably
early as possible. This also solves another problem of disparity
between 'git checkout HEAD' and 'git checkout @' (latter detaches at
the HEAD commit and the former does not).

Trade-offs:
- Some of the errors would show the revision argument as 'HEAD' when
  given '@'. This should be fine, as most users who probably use '@'
  would be aware that it is a shortcut for 'HEAD' and most probably
  used to use 'HEAD'. There is also relevant documentation in
  'gitrevisions' manpage about '@' being the shortcut for 'HEAD'. Also,
  the simplicity of the solution far outweighs this cost.

- Consider '@' as a shortcut for 'HEAD' even if 'refs/heads/@' exists
  at a different commit. Naming a branch '@' is an obvious foot-gun and
  many existing commands already take '@' for 'HEAD' even if
  'refs/heads/@' exists at a different commit or does not exist at all
  (e.g. 'git log @', 'git push origin @' etc.). Therefore this is an
  existing assumption and should not be a problem.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 14:12:51 -08:00
c784b0a5b9 git: --no-lazy-fetch option
Sometimes, especially during tests of low level machinery, it is
handy to have a way to disable lazy fetching of objects.  This
allows us to say, for example, "git cat-file -e <object-name>", to
see if the object is locally available.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 12:53:15 -08:00
b40ba17e44 write-or-die: fix the polarity of GIT_FLUSH environment variable
When GIT_FLUSH is set to 1, true, on, yes, then we should disable
skip_stdout_flush, but the conversion somehow did the opposite.

With the understanding of the original motivation behind "skip" in
06f59e9f (Don't fflush(stdout) when it's not helpful, 2007-06-29),
we can sympathize with the current naming (we wanted to avoid
useless flushing of stdout by default, with an escape hatch to
always flush), but it is still not a good excuse.

Retire the "skip_stdout_flush" variable and replace it with "flush_stdout"
that tells if we do or do not want to run fflush().

Reported-by: Xiaoguang WANG <wxiaoguang@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 11:57:28 -08:00
76fb807faa column: guard against negative padding
Make sure that client code can’t pass in a negative padding by accident.

Suggested-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 10:18:57 -08:00
f2d31c69ce column: disallow negative padding
A negative padding does not make sense and can cause errors in the
memory allocator since it’s interpreted as an unsigned integer.

Reported-by: Tiago Pascoal <tiago@pascoal.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13 10:18:50 -08:00
2996f11c1d Sync with 'maint' 2024-02-12 13:17:06 -08:00
ad1a669545 A few more topics before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 13:16:12 -08:00
3b89ff16aa Merge branch 'tb/multi-pack-reuse-experiment'
Setting `feature.experimental` opts the user into multi-pack reuse
experiment

* tb/multi-pack-reuse-experiment:
  pack-objects: enable multi-pack reuse via `feature.experimental`
  t5332-multi-pack-reuse.sh: extract pack-objects helper functions
2024-02-12 13:16:11 -08:00
d4833b22ab Merge branch 'vd/for-each-ref-sort-with-formatted-timestamp'
"git branch" and friends learned to use the formatted text as
sorting key, not the underlying timestamp value, when the --sort
option is used with author or committer timestamp with a format
specifier (e.g., "--sort=creatordate:format:%H:%M:%S").

* vd/for-each-ref-sort-with-formatted-timestamp:
  ref-filter.c: sort formatted dates by byte value
2024-02-12 13:16:11 -08:00
b3370dd51e Merge branch 'pw/show-ref-pseudorefs'
"git show-ref --verify" did not show things like "CHERRY_PICK_HEAD",
which has been corrected.

* pw/show-ref-pseudorefs:
  t1400: use show-ref to check pseudorefs
  show-ref --verify: accept pseudorefs
2024-02-12 13:16:11 -08:00
70550a2242 Merge branch 'ps/report-failure-from-git-stash'
"git stash" sometimes was silent even when it failed due to
unwritable index file, which has been corrected.

* ps/report-failure-from-git-stash:
  builtin/stash: report failure to write to index
2024-02-12 13:16:11 -08:00
32c5ab6ee4 Merge branch 'pb/template-for-single-commit-pr'
Doc update.

* pb/template-for-single-commit-pr:
  .github/PULL_REQUEST_TEMPLATE.md: add a note about single-commit PRs
2024-02-12 13:16:11 -08:00
05c5a6db80 Merge branch 'jc/sign-buffer-failure-propagation-fix'
A failed "git tag -s" did not necessarily result in an error
depending on the crypto backend, which has been corrected.

* jc/sign-buffer-failure-propagation-fix:
  ssh signing: signal an error with a negative return value
  tag: fix sign_buffer() call to create a signed tag
2024-02-12 13:16:11 -08:00
13fdf82e09 Merge branch 'jc/bisect-doc'
Doc update.

* jc/bisect-doc:
  bisect: document command line arguments for "bisect start"
  bisect: document "terms" subcommand more fully
2024-02-12 13:16:10 -08:00
46761378c3 Merge branch 'bk/complete-bisect'
Command line completion support (in contrib/) has been
updated for "git bisect".

* bk/complete-bisect:
  completion: bisect: recognize but do not complete view subcommand
  completion: bisect: complete log opts for visualize subcommand
  completion: new function __git_complete_log_opts
  completion: bisect: complete missing --first-parent and - -no-checkout options
  completion: bisect: complete custom terms and related options
  completion: bisect: complete bad, new, old, and help subcommands
  completion: tests: always use 'master' for default initial branch name
2024-02-12 13:16:10 -08:00
f424d7c33d Merge branch 'ps/reftable-styles'
Code clean-up in various reftable code paths.

* ps/reftable-styles:
  reftable/record: improve semantics when initializing records
  reftable/merged: refactor initialization of iterators
  reftable/merged: refactor seeking of records
  reftable/stack: use `size_t` to track stack length
  reftable/stack: use `size_t` to track stack slices during compaction
  reftable/stack: index segments with `size_t`
  reftable/stack: fix parameter validation when compacting range
  reftable: introduce macros to allocate arrays
  reftable: introduce macros to grow arrays
2024-02-12 13:16:10 -08:00
cf4a3bd8f1 Merge branch 'ps/reftable-multi-level-indices-fix'
Write multi-level indices for reftable has been corrected.

* ps/reftable-multi-level-indices-fix:
  reftable: document reading and writing indices
  reftable/writer: fix writing multi-level indices
  reftable/writer: simplify writing index records
  reftable/writer: use correct type to iterate through index entries
  reftable/reader: be more careful about errors in indexed seeks
2024-02-12 13:16:10 -08:00
c684b582bc Merge branch 'ps/reftable-backend' into kn/for-all-refs
* ps/reftable-backend:
  refs/reftable: fix leak when copying reflog fails
  ci: add jobs to test with the reftable backend
  refs: introduce reftable backend
2024-02-12 10:09:19 -08:00
7adf215fed Merge branch 'pb/imap-send-wo-curl-build-fix' into maint-2.43
* pb/imap-send-wo-curl-build-fix:
  imap-send: add missing "strbuf.h" include under NO_CURL
2024-02-12 09:57:59 -08:00
6e32f718ff completion: add and use __git_compute_second_level_config_vars_for_section
In a previous commit we removed some hardcoded config variable names from
function __git_complete_config_variable_name in the completion script by
introducing a new function,
__git_compute_first_level_config_vars_for_section.

The remaining hardcoded config variables are "second level"
configuration variables, meaning 'branch.<name>.upstream',
'remote.<name>.url', etc. where <name> is a user-defined name.

Making use of the new existing --config flag to 'git help', add a new
function, __git_compute_second_level_config_vars_for_section. This
function takes as argument a config section name and computes the
corresponding second-level config variables, i.e. those that contain a
'<' which indicates the start of a placeholder. Note that as in
__git_compute_first_level_config_vars_for_section added previsouly, we
use indirect expansion instead of associative arrays to stay compatible
with Bash 3 on which macOS is stuck for licensing reasons.

As explained in the previous commit, we use the existing pattern in the
completion script of using global variables to cache the list of
variables for each section.

Use this new function and the variables it defines in
__git_complete_config_variable_name to remove hardcoded config
variables, and add a test to verify the new function.  Use a single
'case' for all sections with second-level variables names, since the
code for each of them is now exactly the same.

Adjust the name of a test added in a previous commit to reflect that it
now tests the added function.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:43:42 -08:00
1e0ee4087e completion: add and use __git_compute_first_level_config_vars_for_section
The function __git_complete_config_variable_name in the Bash completion
script hardcodes several config variable names. These variables are
those in config sections where user-defined names can appear, such as
"branch.<name>". These sections are treated first by the case statement,
and the two last "catch all" cases are used for other sections, making
use of the __git_compute_config_vars and __git_compute_config_sections
function, which omit listing any variables containing wildcards or
placeholders. Having hardcoded config variables introduces the risk of
the completion code becoming out of sync with the actual config
variables accepted by Git.

To avoid these hardcoded config variables, introduce a new function,
__git_compute_first_level_config_vars_for_section, making use of the
existing __git_config_vars variable. This function takes as argument a
config section name and computes the matching "first level" config
variables for that section, i.e. those _not_ containing any placeholder,
like 'branch.autoSetupMerge, 'remote.pushDefault', etc.  Use this
function and the variables it defines in the 'branch.*', 'remote.*' and
'submodule.*' switches of the case statement instead of hardcoding the
corresponding config variables.  Note that we use indirect expansion to
create a variable for each section, instead of using a single
associative array indexed by section names, because associative arrays
are not supported in Bash 3, on which macOS is stuck for licensing
reasons.

Use the existing pattern in the completion script of using global
variables to cache the list of config variables for each section. The
rationale for such caching is explained in eaa4e6ee2a (Speed up bash
completion loading, 2009-11-17), and the current approach to using and
defining them via 'test -n' is explained in cf0ff02a38 (completion: work
around zsh option propagation bug, 2012-02-02).

Adjust the name of one of the tests added in the previous commit,
reflecting that it now also tests the new function.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:43:42 -08:00
b1d0cc68d1 completion: complete 'submodule.*' config variables
In the Bash completion script, function
__git_complete_config_variable_name completes config variables and has
special logic to deal with config variables involving user-defined
names, like branch.<name>.* and remote.<name>.*.

This special logic is missing for submodule-related config variables.
Add the appropriate branches to the case statement, making use of the
in-tree '.gitmodules' to list relevant submodules.

Add corresponding tests in t9902-completion.sh, making sure we complete
both first level submodule config variables as well as second level
variables involving submodule names.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:43:42 -08:00
30bd55f901 completion: add space after config variable names also in Bash 3
In be6444d1ca (completion: bash: add correct suffix in variables,
2021-08-16), __git_complete_config_variable_name was changed to use
"${sfx- }" instead of "$sfx" as the fourth argument of _gitcomp_nl and
_gitcomp_nl_append, such that this argument evaluates to a space if sfx
is unset. This was to ensure that e.g.

	git config branch.autoSetupMe[TAB]

correctly completes to 'branch.autoSetupMerge ' with the trailing space.
This commits notes that the fix only works in Bash 4 because in Bash 3
the 'local sfx' construct at the beginning of
__git_complete_config_variable_name creates an empty string.

Make the fix also work for Bash 3 by using the "unset or null' parameter
expansion syntax ("${sfx:- }"), such that the parameter is also expanded
to a space if it is set but null, as is the behaviour of 'local sfx' in
Bash 3.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:43:41 -08:00
f0e578c69c use xstrncmpz()
Add and apply a semantic patch for calling xstrncmpz() to compare a
NUL-terminated string with a buffer of a known length instead of using
strncmp() and checking the terminating NUL explicitly.  This simplifies
callers by reducing code duplication.

I had to adjust remote.c manually because Coccinelle inexplicably
changed the indent of the else branches.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:32:41 -08:00
020456cb74 receive-pack: use find_commit_header() in check_nonce()
Use the public function find_commit_header() and remove find_header(),
as it becomes unused.  This is safe and appropriate because we pass the
NUL-terminated payload buffer to check_nonce() instead of its start and
length.  The underlying strbuf push_cert cannot contain NULs, as it is
built using strbuf_addstr(), only.

We no longer need to call strlen(), as find_commit_header() returns the
length of nonce already.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:22:20 -08:00
c68ca7abd3 reftable/reader: add comments to table_iter_next()
While working on the optimizations in the preceding patches I stumbled
upon `table_iter_next()` multiple times. It is quite easy to miss the
fact that we don't call `table_iter_next_in_block()` twice, but that the
second call is in fact `table_iter_next_block()`.

Add comments to explain what exactly is going on here to make things
more obvious. While at it, touch up the code to conform to our code
style better.

Note that one of the refactorings merges two conditional blocks into
one. Before, we had the following code:

```
err = table_iter_next_block(&next, ti);
if (err != 0) {
	ti->is_finished = 1;
}
table_iter_block_done(ti);
if (err != 0) {
	return err;
}
```

As `table_iter_block_done()` does not care about `is_finished`, the
conditional blocks can be merged into one block:

```
err = table_iter_next_block(&next, ti);
table_iter_block_done(ti);
if (err != 0) {
	ti->is_finished = 1;
	return err;
}
```

This is both easier to reason about and more performant because we have
one branch less.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:19:27 -08:00
a418a7abef reftable/record: don't try to reallocate ref record name
When decoding reftable ref records we first release the pointer to the
record passed to us and then use realloc(3P) to allocate the refname
array. This is a bit misleading though as we know at that point that the
refname will always be `NULL`, so we would always end up allocating a
new char array anyway.

Refactor the code to use `REFTABLE_ALLOC_ARRAY()` instead. As the
following benchmark demonstrates this is a tiny bit more efficient. But
the bigger selling point really is the gained clarity.

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     150.1 ms ±   4.1 ms    [User: 146.6 ms, System: 3.3 ms]
    Range (min … max):   144.5 ms … 180.5 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     148.9 ms ±   4.5 ms    [User: 145.2 ms, System: 3.4 ms]
    Range (min … max):   143.0 ms … 185.4 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.01 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)

Ideally, we should try and reuse the memory of the old record instead of
first freeing and then immediately reallocating it. This requires some
more surgery though and is thus left for a future iteration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:18:05 -08:00
92fa3253c8 reftable/block: swap buffers instead of copying
When iterating towards the next record in a reftable block we need to
keep track of the key that the last record had. This is required because
reftable records use prefix compression, where subsequent records may
reuse parts of their preceding record's key.

This key is stored in the `block_iter::last_key`, which we update after
every call to `block_iter_next()`: we simply reset the buffer and then
add the current key to it.

This is a bit inefficient though because it requires us to copy over the
key on every iteration, which adds up when iterating over many records.
Instead, we can make use of the fact that the `block_iter::key` buffer
is basically only a scratch buffer. So instead of copying over contents,
we can just swap both buffers.

The following benchmark prints a single ref matching a specific pattern
out of 1 million refs via git-show-ref(1):

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     155.7 ms ±   5.0 ms    [User: 152.1 ms, System: 3.4 ms]
    Range (min … max):   150.8 ms … 185.7 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     150.8 ms ±   4.2 ms    [User: 147.1 ms, System: 3.5 ms]
    Range (min … max):   145.1 ms … 180.7 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.03 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:18:04 -08:00
dbe4e8b3fd reftable/pq: allocation-less comparison of entry keys
The priority queue is used by the merged iterator to iterate over
reftable records from multiple tables in the correct order. The queue
ends up having one record for each table that is being iterated over,
with the record that is supposed to be shown next at the top. For
example, the key of a ref record is equal to its name so that we end up
sorting the priority queue lexicographically by ref name.

To figure out the order we need to compare the reftable record keys with
each other. This comparison is done by formatting them into a `struct
strbuf` and then doing `strbuf_strcmp()` on the result. We then discard
the buffers immediately after the comparison.

This ends up being very expensive. Because the priority queue usually
contains as many records as we have tables, we call the comparison
function `O(log($tablecount))` many times for every record we insert.
Furthermore, when iterating over many refs, we will insert at least one
record for every ref we are iterating over. So ultimately, this ends up
being called `O($refcount * log($tablecount))` many times.

Refactor the code to use the new `refatble_record_cmp()` function that
has been implemented in a preceding commit. This function does not need
to allocate memory and is thus significantly more efficient.

The following benchmark prints a single ref matching a specific pattern
out of 1 million refs via git-show-ref(1), where the reftable stack
consists of three tables:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     224.4 ms ±   6.5 ms    [User: 220.6 ms, System: 3.6 ms]
    Range (min … max):   216.5 ms … 261.1 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     172.9 ms ±   4.4 ms    [User: 169.2 ms, System: 3.6 ms]
    Range (min … max):   166.5 ms … 204.6 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.30 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:18:04 -08:00
5730a9dccf reftable/merged: skip comparison for records of the same subiter
When retrieving the next entry of a merged iterator we need to drop all
records of other sub-iterators that would be shadowed by the record that
we are about to return. We do this by comparing record keys, dropping
all keys that are smaller or equal to the key of the record we are about
to return.

There is an edge case here where we can skip that comparison: when the
record in the priority queue comes from the same subiterator as the
record we are about to return then we know that its key must be larger
than the key of the record we are about to return. This property is
guaranteed by the sub-iterators, and if it didn't hold then the whole
merged iterator would return records in the wrong order, too.

While this may seem like a very specific edge case it's in fact quite
likely to happen. For most repositories out there you can assume that we
will end up with one large table and several smaller ones on top of it.
Thus, it is very likely that the next entry will sort towards the top of
the priority queue.

Special case this and break out of the loop in that case. The following
benchmark uses git-show-ref(1) to print a single ref matching a pattern
out of 1 million refs:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     162.6 ms ±   4.5 ms    [User: 159.0 ms, System: 3.5 ms]
    Range (min … max):   156.6 ms … 188.5 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     156.8 ms ±   4.7 ms    [User: 153.0 ms, System: 3.6 ms]
    Range (min … max):   151.4 ms … 188.4 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.04 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:18:04 -08:00
a96e9a20f3 reftable/merged: allocation-less dropping of shadowed records
The purpose of the merged reftable iterator is to iterate through all
entries of a set of tables in the correct order. This is implemented by
using a sub-iterator for each table, where the next entry of each of
these iterators gets put into a priority queue. For each iteration, we
do roughly the following steps:

  1. Retrieve the top record of the priority queue. This is the entry we
     want to return to the caller.

  2. Retrieve the next record of the sub-iterator that this record came
     from. If any, add it to the priority queue at the correct position.
     The position is determined by comparing the record keys, which e.g.
     corresponds to the refname for ref records.

  3. Keep removing the top record of the priority queue until we hit the
     first entry whose key is larger than the returned record's key.
     This is required to drop "shadowed" records.

The last step will lead to at least one comparison to the next entry,
but may lead to many comparisons in case the reftable stack consists of
many tables with shadowed records. It is thus part of the hot code path
when iterating through records.

The code to compare the entries with each other is quite inefficient
though. Instead of comparing record keys with each other directly, we
first format them into `struct strbuf`s and only then compare them with
each other. While we already optimized this code path to reuse buffers
in 829231dc20 (reftable/merged: reuse buffer to compute record keys,
2023-12-11), the cost to format the keys into the buffers still adds up
quite significantly.

Refactor the code to use `reftable_record_cmp()` instead, which has been
introduced in the preceding commit. This function compares records with
each other directly without requiring any memory allocations or copying
and is thus way more efficient.

The following benchmark uses git-show-ref(1) to print a single ref
matching a pattern out of 1 million refs. This is the most direct way to
exercise ref iteration speed as we remove all overhead of having to show
the refs, too.

    Benchmark 1: show-ref: single matching ref (revision = HEAD~)
      Time (mean ± σ):     180.7 ms ±   4.7 ms    [User: 177.1 ms, System: 3.4 ms]
      Range (min … max):   174.9 ms … 211.7 ms    1000 runs

    Benchmark 2: show-ref: single matching ref (revision = HEAD)
      Time (mean ± σ):     162.1 ms ±   4.4 ms    [User: 158.5 ms, System: 3.4 ms]
      Range (min … max):   155.4 ms … 189.3 ms    1000 runs

    Summary
      show-ref: single matching ref (revision = HEAD) ran
        1.11 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:18:04 -08:00
adb5d2cbe9 reftable/record: introduce function to compare records by key
In some places we need to sort reftable records by their keys to
determine their ordering. This is done by first formatting the keys into
a `struct strbuf` and then using `strbuf_cmp()` to compare them. This
logic is needlessly roundabout and can end up costing quite a bit of CPU
cycles, both due to the allocation and formatting logic.

Introduce a new `reftable_record_cmp()` function that knows how to
compare two records with each other without requiring allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:18:04 -08:00
20e0ff8835 ci(linux32): add a note about Actions that must not be updated
The Docker container used by the `linux32` job comes without Node.js,
and therefore the `actions/checkout` and `actions/upload-artifact`
Actions cannot be upgraded to the latest versions (because they use
Node.js).

One time too many, I accidentally tried to update them, where
`actions/checkout` at least fails immediately, but the
`actions/upload-artifact` step is only used when any test fails, and
therefore the CI run usually passes even though that Action was updated
to a version that is incompatible with the Docker container in which
this job runs.

So let's add a big fat warning, mainly for my own benefit, to avoid
running into the very same issue over and over again.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 08:48:22 -08:00
820a340085 ci: bump remaining outdated Actions versions
After activating automatic Dependabot updates in the
git-for-windows/git repository, Dependabot noticed a couple
of yet-unaddressed updates.  They avoid "Node.js 16 Actions"
deprecation messages by bumping the following Actions'
versions:

- actions/upload-artifact from 3 to 4
- actions/download-artifact from 3 to 4
- actions/cache from 3 to 4

Helped-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 08:47:38 -08:00
f66286364f unit-tests: do show relative file paths on non-Windows, too
There are compilers other than Visual C that want to show absolute
paths.  Generalize the helper introduced by a2c5e294 (unit-tests: do
show relative file paths, 2023-09-25) so that it can also work with
a path that uses slash as the directory separator, and becomes
almost no-op once one-time preparation finds out that we are using a
compiler that already gives relative paths.  Incidentally, this also
should do the right thing on Windows with a compiler that shows
relative paths but with backslash as the directory separator (if
such a thing exists and is used to build git).

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 08:44:22 -08:00
6032aee65e l10n: bump Actions versions in l10n.yml
This avoids the "Node.js 16 Actions are deprecated" warnings.

Original-commits-by: dependabot[bot] <support@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-02-11 12:47:51 +01:00
f51d790b67 receive-pack: use find_commit_header() in check_cert_push_options()
Use the public function find_commit_header() instead of find_header() to
simplify the code.  This is possible and safe because we're operating on
a strbuf, which is always NUL-terminated, so there is no risk of running
over the end of the buffer.  It cannot contain NUL within the buffer, as
it is built using strbuf_addstr(), only.

The string comparison becomes more complicated because we need to check
for NUL explicitly after comparing the length-limited option, but on the
flip side we don't need to clean up allocations or track the remaining
buffer length.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-09 14:03:15 -08:00
bc7f5db896 prune: mark rebase autostash and orig-head as reachable
Rebase records the oid of HEAD before rebasing and the commit created by
"--autostash" in files in the rebase state directory. This means that
the autostash commit is never reachable from any ref or reflog and when
rebasing a detached HEAD the original HEAD can become unreachable if the
user expires HEAD's the reflog while the rebase is running. Fix this by
reading the relevant files when marking reachable commits.

Note that it is possible for the commit recorded in
.git/rebase-merge/amend to be unreachable but pruning that object does
not affect the operation of "git rebase --continue" as we're only
interested in the object id, not in the object itself.

Reported-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-09 10:04:59 -08:00
c875e0b8e0 Git 2.44-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-08 16:35:07 -08:00
e0b521cb5a Sync with Git 2.43.1 2024-02-08 16:30:54 -08:00
3526e67d91 Git 2.43.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-08 16:22:12 -08:00
963eda258a Merge branch 'ib/rebase-reschedule-doc' into maint-2.43
Doc update.

* ib/rebase-reschedule-doc:
  rebase: clarify --reschedule-failed-exec default
2024-02-08 16:22:12 -08:00
a064af6ef4 Merge branch 'jk/index-pack-lsan-false-positive-fix' into maint-2.43
Fix false positive reported by leak sanitizer.

* jk/index-pack-lsan-false-positive-fix:
  index-pack: spawn threads atomically
2024-02-08 16:22:12 -08:00
16a830f6c2 Merge branch 'cp/sideband-array-index-comment-fix' into maint-2.43
In-code comment fix.

* cp/sideband-array-index-comment-fix:
  sideband.c: remove redundant 'NEEDSWORK' tag
2024-02-08 16:22:12 -08:00
d690c8e142 Merge branch 'ms/rebase-insnformat-doc-fix' into maint-2.43
Docfix.

* ms/rebase-insnformat-doc-fix:
  Documentation: fix statement about rebase.instructionFormat
2024-02-08 16:22:12 -08:00
5b49c1af03 Merge branch 'jx/sideband-chomp-newline-fix' into maint-2.43
Sideband demultiplexer fixes.

* jx/sideband-chomp-newline-fix:
  pkt-line: do not chomp newlines for sideband messages
  pkt-line: memorize sideband fragment in reader
  test-pkt-line: add option parser for unpack-sideband
2024-02-08 16:22:11 -08:00
fb3ead665b Merge branch 'jk/t1006-cat-file-objectsize-disk' into maint-2.43
Test update.

* jk/t1006-cat-file-objectsize-disk:
  t1006: prefer shell loop to awk for packed object sizes
  t1006: add tests for %(objectsize:disk)
2024-02-08 16:22:11 -08:00
5a322a2d3d Merge branch 'js/contributor-docs-updates' into maint-2.43
Doc update.

* js/contributor-docs-updates:
  SubmittingPatches: hyphenate non-ASCII
  SubmittingPatches: clarify GitHub artifact format
  SubmittingPatches: clarify GitHub visual
  SubmittingPatches: provide tag naming advice
  SubmittingPatches: update extra tags list
  SubmittingPatches: discourage new trailers
  SubmittingPatches: drop ref to "What's in git.git"
  CodingGuidelines: write punctuation marks
  CodingGuidelines: move period inside parentheses
2024-02-08 16:22:11 -08:00
232953b904 Merge branch 'rs/fast-import-simplify-mempool-allocation' into maint-2.43
Code simplification.

* rs/fast-import-simplify-mempool-allocation:
  fast-import: use mem_pool_calloc()
2024-02-08 16:22:11 -08:00
0f7a10a3aa Merge branch 'en/header-cleanup' into maint-2.43
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-02-08 16:22:10 -08:00
3aea0dad70 Merge branch 'ml/doc-merge-updates' into maint-2.43
Doc update.

* ml/doc-merge-updates:
  Documentation/git-merge.txt: use backticks for command wrapping
  Documentation/git-merge.txt: fix reference to synopsis
2024-02-08 16:22:10 -08:00
974c9369aa Merge branch 'jc/orphan-unborn' into maint-2.43
Doc updates to clarify what an "unborn branch" means.

* jc/orphan-unborn:
  orphan/unborn: fix use of 'orphan' in end-user facing messages
  orphan/unborn: add to the glossary and use them consistently
2024-02-08 16:22:10 -08:00
541d0d75e7 Merge branch 'la/trailer-cleanups' into maint-2.43
Code clean-up.

* la/trailer-cleanups:
  trailer: use offsets for trailer_start/trailer_end
  trailer: find the end of the log message
  commit: ignore_non_trailer computes number of bytes to ignore
2024-02-08 16:22:09 -08:00
edf4c0d42b Merge branch 'jc/retire-cas-opt-name-constant' into maint-2.43
Code clean-up.

* jc/retire-cas-opt-name-constant:
  remote.h: retire CAS_OPT_NAME
2024-02-08 16:22:09 -08:00
2873a9686c Merge branch 'rs/rebase-use-strvec-pushf' into maint-2.43
Code clean-up.

* rs/rebase-use-strvec-pushf:
  rebase: use strvec_pushf() for format-patch revisions
2024-02-08 16:22:09 -08:00
f5fa75af53 Merge branch 'rs/t6300-compressed-size-fix' into maint-2.43
Test fix.

* rs/t6300-compressed-size-fix:
  t6300: avoid hard-coding object sizes
2024-02-08 16:22:09 -08:00
bb58c037ee Merge branch 'sp/test-i18ngrep' into maint-2.43
Error message fix in the test framework.

* sp/test-i18ngrep:
  test-lib-functions.sh: fix test_grep fail message wording
2024-02-08 16:22:08 -08:00
a1121a79d9 Merge branch 'jc/doc-misspelt-refs-fix' into maint-2.43
Doc update.

* jc/doc-misspelt-refs-fix:
  doc: format.notes specify a ref under refs/notes/ hierarchy
2024-02-08 16:22:08 -08:00
79f79e58a4 Merge branch 'jc/doc-most-refs-are-not-that-special' into maint-2.43
Doc updates.

* jc/doc-most-refs-are-not-that-special:
  docs: MERGE_AUTOSTASH is not that special
  docs: AUTO_MERGE is not that special
  refs.h: HEAD is not that special
  git-bisect.txt: BISECT_HEAD is not that special
  git.txt: HEAD is not that special
2024-02-08 16:22:08 -08:00
7b95c64408 Merge branch 'es/add-doc-list-short-form-of-all-in-synopsis' into maint-2.43
Doc update.

* es/add-doc-list-short-form-of-all-in-synopsis:
  git-add.txt: add missing short option -A to synopsis
2024-02-08 16:22:08 -08:00
b1184c3c69 Merge branch 'ps/chainlint-self-check-update' into maint-2.43
Test framework update.

* ps/chainlint-self-check-update:
  tests: adjust whitespace in chainlint expectations
2024-02-08 16:22:07 -08:00
546f8d2dcd Merge branch 'ps/reftable-fixes' into maint-2.43
Bunch of small fix-ups to the reftable code.

* ps/reftable-fixes:
  reftable/block: reuse buffer to compute record keys
  reftable/block: introduce macro to initialize `struct block_iter`
  reftable/merged: reuse buffer to compute record keys
  reftable/stack: fix use of unseeded randomness
  reftable/stack: fix stale lock when dying
  reftable/stack: reuse buffers when reloading stack
  reftable/stack: perform auto-compaction with transactional interface
  reftable/stack: verify that `reftable_stack_add()` uses auto-compaction
  reftable: handle interrupted writes
  reftable: handle interrupted reads
  reftable: wrap EXPECT macros in do/while
2024-02-08 16:22:07 -08:00
b471ea3a0d Merge branch 'jk/config-cleanup' into maint-2.43
Code clean-up around use of configuration variables.

* jk/config-cleanup:
  sequencer: simplify away extra git_config_string() call
  gpg-interface: drop pointless config_error_nonbool() checks
  push: drop confusing configset/callback redundancy
  config: use git_config_string() for core.checkRoundTripEncoding
  diff: give more detailed messages for bogus diff.* config
  config: use config_error_nonbool() instead of custom messages
  imap-send: don't use git_die_config() inside callback
  git_xmerge_config(): prefer error() to die()
  config: reject bogus values for core.checkstat
2024-02-08 16:22:07 -08:00
6479e121c2 Merge branch 'rs/incompatible-options-messages' into maint-2.43
Clean-up code that handles combinations of incompatible options.

* rs/incompatible-options-messages:
  worktree: simplify incompatibility message for --orphan and commit-ish
  worktree: standardize incompatibility messages
  clean: factorize incompatibility message
  revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
  revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
  repack: use die_for_incompatible_opt3() for -A/-k/--cruft
  push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror
2024-02-08 16:22:06 -08:00
1dbc46997a Merge branch 'mk/doc-gitfile-more' into maint-2.43
Doc update.

* mk/doc-gitfile-more:
  doc: make the gitfile syntax easier to discover
2024-02-08 16:22:06 -08:00
67bb8ff5da Merge branch 'ps/ref-tests-update-more' into maint-2.43
Tests update.

* ps/ref-tests-update-more:
  t6301: write invalid object ID via `test-tool ref-store`
  t5551: stop writing packed-refs directly
  t5401: speed up creation of many branches
  t4013: simplify magic parsing and drop "failure"
  t3310: stop checking for reference existence via `test -f`
  t1417: make `reflog --updateref` tests backend agnostic
  t1410: use test-tool to create empty reflog
  t1401: stop treating FETCH_HEAD as real reference
  t1400: split up generic reflog tests from the reffile-specific ones
  t0410: mark tests to require the reffiles backend
2024-02-08 16:22:06 -08:00
a7ea468346 Merge branch 'rs/column-leakfix' into maint-2.43
Leakfix.

* rs/column-leakfix:
  column: release strbuf and string_list after use
2024-02-08 16:22:06 -08:00
25e2039cf6 Merge branch 'rs/i18n-cannot-be-used-together' into maint-2.43
Clean-up code that handles combinations of incompatible options.

* rs/i18n-cannot-be-used-together:
  i18n: factorize even more 'incompatible options' messages
2024-02-08 16:22:05 -08:00
173d7746f6 Merge branch 'jb/reflog-expire-delete-dry-run-options' into maint-2.43
Command line parsing fix for "git reflog".

* jb/reflog-expire-delete-dry-run-options:
  builtin/reflog.c: fix dry-run option short name
2024-02-08 16:22:05 -08:00
bcab40f14f Merge branch 'js/packfile-h-typofix' into maint-2.43
Typofix.

* js/packfile-h-typofix:
  packfile.c: fix a typo in `each_file_in_pack_dir_fn()`'s declaration
2024-02-08 16:22:05 -08:00
1685e9ffe6 Merge branch 'jk/commit-graph-slab-clear-fix' into maint-2.43
Clearing in-core repository (happens during e.g., "git fetch
--recurse-submodules" with commit graph enabled) made in-core
commit object in an inconsistent state by discarding the necessary
data from commit-graph too early, which has been corrected.

* jk/commit-graph-slab-clear-fix:
  commit-graph: retain commit slab when closing NULL commit_graph
2024-02-08 16:22:05 -08:00
3c2ee131f8 Merge branch 'cp/git-flush-is-an-env-bool' into maint-2.43
Unlike other environment variables that took the usual
true/false/yes/no as well as 0/1, GIT_FLUSH only understood 0/1,
which has been corrected.

* cp/git-flush-is-an-env-bool:
  write-or-die: make GIT_FLUSH a Boolean environment variable
2024-02-08 16:22:04 -08:00
8566311a03 Merge branch 'jc/sparse-checkout-set-default-fix' into maint-2.43
"git sparse-checkout set" added default patterns even when the
patterns are being fed from the standard input, which has been
corrected.

* jc/sparse-checkout-set-default-fix:
  sparse-checkout: use default patterns for 'set' only !stdin
2024-02-08 16:22:04 -08:00
878f8c42dc Merge branch 'jc/archive-list-with-extra-args' into maint-2.43
"git archive --list extra garbage" silently ignored excess command
line parameters, which has been corrected.

* jc/archive-list-with-extra-args:
  archive: "--list" does not take further options
2024-02-08 16:22:04 -08:00
a593e2fbce Merge branch 'rj/status-bisect-while-rebase' into maint-2.43
"git status" is taught to show both the branch being bisected and
being rebased when both are in effect at the same time.
cf. <xmqqil76kyov.fsf@gitster.g>

* rj/status-bisect-while-rebase:
  status: fix branch shown when not only bisecting
2024-02-08 16:22:04 -08:00
8f7cc565e0 Merge branch 'sh/completion-with-reftable' into maint-2.43
Command line completion script (in contrib/) learned to work better
with the reftable backend.

* sh/completion-with-reftable:
  completion: support pseudoref existence checks for reftables
  completion: refactor existence checks for pseudorefs
2024-02-08 16:22:04 -08:00
ce54593289 Merge branch 'jx/fetch-atomic-error-message-fix' into maint-2.43
"git fetch --atomic" issued an unnecessary empty error message,
which has been corrected.
cf. <ZX__e7VjyLXIl-uV@tanuki>

* jx/fetch-atomic-error-message-fix:
  fetch: no redundant error message for atomic fetch
  t5574: test porcelain output of atomic fetch
2024-02-08 16:22:03 -08:00
0e92593acf Merge branch 'jk/mailinfo-iterative-unquote-comment' into maint-2.43
The code to parse the From e-mail header has been updated to avoid
recursion.

* jk/mailinfo-iterative-unquote-comment:
  mailinfo: avoid recursion when unquoting From headers
  t5100: make rfc822 comment test more careful
  mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()
2024-02-08 16:22:03 -08:00
952916f9e0 Merge branch 'rs/show-ref-incompatible-options' into maint-2.43
Code clean-up for sanity checking of command line options for "git
show-ref".

* rs/show-ref-incompatible-options:
  show-ref: use die_for_incompatible_opt3()
2024-02-08 16:22:03 -08:00
28b47452b3 Merge branch 'jk/implicit-true' into maint-2.43
Some codepaths did not correctly parse configuration variables
specified with valueless "true", which has been corrected.

* jk/implicit-true:
  fsck: handle NULL value when parsing message config
  trailer: handle NULL value when parsing trailer-specific config
  submodule: handle NULL value when parsing submodule.*.branch
  help: handle NULL value for alias.* config
  trace2: handle NULL values in tr2_sysenv config callback
  setup: handle NULL value when parsing extensions
  config: handle NULL value when parsing non-bools
2024-02-08 16:22:03 -08:00
5baedc68b0 Merge branch 'jk/bisect-reset-fix' into maint-2.43
"git bisect reset" has been taught to clean up state files and refs
even when BISECT_START file is gone.

* jk/bisect-reset-fix:
  bisect: always clean on reset
2024-02-08 16:22:03 -08:00
19fa15fb2d Merge branch 'jk/end-of-options' into maint-2.43
"git $cmd --end-of-options --rev -- --path" for some $cmd failed
to interpret "--rev" as a rev, and "--path" as a path.  This was
fixed for many programs like "reset" and "checkout".

* jk/end-of-options:
  parse-options: decouple "--end-of-options" and "--"
2024-02-08 16:22:02 -08:00
4b50f86141 Merge branch 'jc/revision-parse-int' into maint-2.43
The command line parser for the "log" family of commands was too
loose when parsing certain numbers, e.g., silently ignoring the
extra 'q' in "git log -n 1q" without complaining, which has been
tightened up.

* jc/revision-parse-int:
  revision: parse integer arguments to --max-count, --skip, etc., more carefully
2024-02-08 16:22:02 -08:00
7c05241877 Merge branch 'jp/use-diff-index-in-pre-commit-sample' into maint-2.43
The sample pre-commit hook that tries to catch introduction of new
paths that use potentially non-portable characters did not notice
an existing path getting renamed to such a problematic path, when
rename detection was enabled.

* jp/use-diff-index-in-pre-commit-sample:
  hooks--pre-commit: detect non-ASCII when renaming
2024-02-08 16:22:02 -08:00
13031f6689 Merge branch 'jh/trace2-redact-auth' into maint-2.43
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.

* jh/trace2-redact-auth:
  t0212: test URL redacting in EVENT format
  t0211: test URL redacting in PERF format
  trace2: redact passwords from https:// URLs by default
  trace2: fix signature of trace2_def_param() macro
2024-02-08 16:22:01 -08:00
efbae0583b Merge branch 'js/update-urls-in-doc-and-comment' into maint-2.43
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.

* js/update-urls-in-doc-and-comment:
  doc: refer to internet archive
  doc: update links for andre-simon.de
  doc: switch links to https
  doc: update links to current pages
2024-02-08 16:22:01 -08:00
50b8f513a2 Merge branch 'ps/commit-graph-less-paranoid' into maint-2.43
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications.  The default has been
flipped to disable this pessimization.

* ps/commit-graph-less-paranoid:
  commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
2024-02-08 16:22:01 -08:00
f8e2ad965a Merge branch 'tz/send-email-negatable-options' into maint-2.43
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email".  Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.

* tz/send-email-negatable-options:
  send-email: avoid duplicate specification warnings
  perl: bump the required Perl version to 5.8.1 from 5.8.0
2024-02-08 16:22:01 -08:00
c8bcf66bf7 Merge branch 'js/ci-discard-prove-state' into maint-2.43
The way CI testing used "prove" could lead to running the test
suite twice needlessly, which has been corrected.

* js/ci-discard-prove-state:
  ci: avoid running the test suite _twice_
  ci: add support for GitLab CI
  ci: install test dependencies for linux-musl
  ci: squelch warnings when testing with unusable Git repo
  ci: unify setup of some environment variables
  ci: split out logic to set up failed test artifacts
  ci: group installation of Docker dependencies
  ci: make grouping setup more generic
  ci: reorder definitions for grouping functions
2024-02-08 16:22:00 -08:00
75389e275c t9210: do not rely on lazy fetching to fail
With "rev-list --missing=print $start", where "$start" is a 40-hex
object name, the object may or may not be lazily fetched from the
promisor.  Make sure it fails by forcing dereference of "$start"
at that point.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-08 15:18:58 -08:00
5216f8f5c4 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-08 13:20:41 -08:00
294dd2057c Merge branch 'jh/sparse-index-expand-to-path-fix'
A caller called index_file_exists() that takes a string expressed
as <ptr, length> with a wrong length, which has been corrected.

* jh/sparse-index-expand-to-path-fix:
  sparse-index: pass string length to index_file_exists()
2024-02-08 13:20:34 -08:00
33d03f61b9 Merge branch 'pb/imap-send-wo-curl-build-fix'
Build fix.

* pb/imap-send-wo-curl-build-fix:
  imap-send: add missing "strbuf.h" include under NO_CURL
2024-02-08 13:20:34 -08:00
2a10505a77 Merge branch 'ja/doc-placeholders-fix'
Docfix.

* ja/doc-placeholders-fix:
  doc: enforce placeholders in documentation
  doc: enforce dashes in placeholders
2024-02-08 13:20:34 -08:00
bec9160394 Merge branch 'mh/credential-oauth-refresh-token-with-wincred'
The wincred credential backend has been taught to support oauth refresh
token the same way as credential-cache and credential-libsecret backends.

* mh/credential-oauth-refresh-token-with-wincred:
  credential/wincred: store oauth_refresh_token
2024-02-08 13:20:34 -08:00
6dbc1eb664 Merge branch 'jk/unit-tests-buildfix'
Build dependency around unit tests has been fixed.

* jk/unit-tests-buildfix:
  t/Makefile: say the default target upfront
  t/Makefile: get UNIT_TESTS list from C sources
  Makefile: remove UNIT_TEST_BIN directory with "make clean"
  Makefile: use mkdir_p_parent_template for UNIT_TEST_BIN
2024-02-08 13:20:33 -08:00
2c90347a94 Merge branch 'jc/index-pack-fsck-levels'
The "--fsck-objects" option of "git index-pack" now can take the
optional parameter to tweak severity of different fsck errors.

* jc/index-pack-fsck-levels:
  index-pack: --fsck-objects to take an optional argument for fsck msgs
  index-pack: test and document --strict=<msg-id>=<severity>...
2024-02-08 13:20:33 -08:00
107023e1c9 Merge branch 'cp/unit-test-prio-queue'
The priority queue test has been migrated to the unit testing
framework.

* cp/unit-test-prio-queue:
  tests: move t0009-prio-queue.sh to the new unit testing framework
2024-02-08 13:20:33 -08:00
e4301f73ff sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
Running "git cherry-pick" as an x-command in the rebase plan loses
the original authorship information.

To fix this, unset GIT_CHERRY_PICK_HELP for 'exec' commands.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-08 09:17:55 -08:00
46176d77c9 ref-filter.c: sort formatted dates by byte value
Update the ref sorting functions of 'ref-filter.c' so that when date fields
are specified with a format string (such as in 'git for-each-ref
--sort=creatordate:<something>'), they are sorted by their formatted string
value rather than by the underlying numeric timestamp. Currently, date
fields are always sorted by timestamp, regardless of whether formatting
information is included in the '--sort' key.

Leaving the default (unformatted) date sorting unchanged, sorting by the
formatted date string adds some flexibility to 'for-each-ref' by allowing
for behavior like "sort by year, then by refname within each year" or "sort
by time of day". Because the inclusion of a format string previously had no
effect on sort behavior, this change likely will not affect existing usage
of 'for-each-ref' or other ref listing commands.

Additionally, update documentation & tests to document the new sorting
mechanism.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 21:33:37 -08:00
6931049c32 ssh signing: signal an error with a negative return value
The other backend for the sign_buffer() function followed our usual
"an error is signalled with a negative return" convention, but the
SSH signer did not.  Even though we already fixed the caller that
assumed only a negative return value is an error, tighten the callee
to signal an error with a negative return as well.  This way, the
callees will be strict on what they produce, while the callers will
be lenient in what they accept.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 21:31:42 -08:00
8a0bebdeae refs/reftable: fix leak when copying reflog fails
When copying a ref with the reftable backend we also copy the
corresponding log records. When seeking the first log record that we're
about to copy fails though we directly return from `write_copy_table()`
without doing any cleanup, leaking several allocated data structures.

Fix this by exiting via our common cleanup logic instead.

Reported-by: Jeff King <peff@peff.net> via Coverity
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 21:30:43 -08:00
841dbd40a3 bisect: document command line arguments for "bisect start"
The syntax commonly used for alternatives is --opt-(a|b), not
--opt-{a,b}.

List bad/new and good/old consistently in this order, to be
consistent with the description for "git bisect terms".  Clarify
<term> to either <term-old> or <term-new> to make them consistent
with the description of "git bisect (good|bad)" subcommands.

Suggested-by: Matthieu Moy <git@matthieu-moy.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 13:46:01 -08:00
47ac5f6e1a bisect: document "terms" subcommand more fully
The documentation for "git bisect terms", although it did not hide
any information, was a bit incomplete and forced readers to fill in
the blanks to get the complete picture.

Acked-by: Matthieu Moy <git@matthieu-moy.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 13:46:01 -08:00
abfbff61ef tag: fix sign_buffer() call to create a signed tag
The command "git tag -s" internally calls sign_buffer() to make a
cryptographic signature using the chosen backend like GPG and SSH.
The internal helper functions used by "git tag" implementation seem
to use a "negative return values are errors, zero or positive return
values are not" convention, and there are places (e.g., verify_tag()
that calls gpg_verify_tag()) that these internal helper functions
translate return values that signal errors to conform to this
convention, but do_sign() that calls sign_buffer() forgets to do so.

Fix it, so that a failed call to sign_buffer() that can return the
exit status from pipe_command() will not be overlooked.

Reported-by: Sergey Kosukhin <skosukhin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 10:47:25 -08:00
1af410d455 t1400: use show-ref to check pseudorefs
Now that "git show-ref --verify" accepts pseudorefs use that in
preference to "git rev-parse" when checking pseudorefs as we do when
checking branches etc.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 09:14:48 -08:00
1dbe401563 show-ref --verify: accept pseudorefs
"git show-ref --verify" is useful for scripts that want to look up a
fully qualified refname without falling back to the DWIM rules used by
"git rev-parse" rules when the ref does not exist. Currently it will
only accept "HEAD" or a refname beginning with "refs/". Running

    git show-ref --verify CHERRY_PICK_HEAD

will always result in

    fatal: 'CHERRY_PICK_HEAD' - not a valid ref

even when CHERRY_PICK_HEAD exists. By calling refname_is_safe() instead
of comparing the refname to "HEAD" we can accept all one-level refs that
contain only uppercase ascii letters and underscores.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 09:12:47 -08:00
c0350cb964 ci: add jobs to test with the reftable backend
Add CI jobs for both GitHub Workflows and GitLab CI to run Git with the
new reftable backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 08:28:37 -08:00
57db2a094d refs: introduce reftable backend
Due to scalability issues, Shawn Pearce has originally proposed a new
"reftable" format more than six years ago [1]. Initially, this new
format was implemented in JGit with promising results. Around two years
ago, we have then added the "reftable" library to the Git codebase via
a4bbd13be3 (Merge branch 'hn/reftable', 2021-12-15). With this we have
landed all the low-level code to read and write reftables. Notably
missing though was the integration of this low-level code into the Git
code base in the form of a new ref backend that ties all of this
together.

This gap is now finally closed by introducing a new "reftable" backend
into the Git codebase. This new backend promises to bring some notable
improvements to Git repositories:

  - It becomes possible to do truly atomic writes where either all refs
    are committed to disk or none are. This was not possible with the
    "files" backend because ref updates were split across multiple loose
    files.

  - The disk space required to store many refs is reduced, both compared
    to loose refs and packed-refs. This is enabled both by the reftable
    format being a binary format, which is more compact, and by prefix
    compression.

  - We can ignore filesystem-specific behaviour as ref names are not
    encoded via paths anymore. This means there is no need to handle
    case sensitivity on Windows systems or Unicode precomposition on
    macOS.

  - There is no need to rewrite the complete refdb anymore every time a
    ref is being deleted like it was the case for packed-refs. This
    means that ref deletions are now constant time instead of scaling
    linearly with the number of refs.

  - We can ignore file/directory conflicts so that it becomes possible
    to store both "refs/heads/foo" and "refs/heads/foo/bar".

  - Due to this property we can retain reflogs for deleted refs. We have
    previously been deleting reflogs together with their refs to avoid
    file/directory conflicts, which is not necessary anymore.

  - We can properly enumerate all refs. With the "files" backend it is
    not easily possible to distinguish between refs and non-refs because
    they may live side by side in the gitdir.

Not all of these improvements are realized with the current "reftable"
backend implementation. At this point, the new backend is supposed to be
a drop-in replacement for the "files" backend that is used by basically
all Git repositories nowadays. It strives for 1:1 compatibility, which
means that a user can expect the same behaviour regardless of whether
they use the "reftable" backend or the "files" backend for most of the
part.

Most notably, this means we artificially limit the capabilities of the
"reftable" backend to match the limits of the "files" backend. It is not
possible to create refs that would end up with file/directory conflicts,
we do not retain reflogs, we perform stricter-than-necessary checks.
This is done intentionally due to two main reasons:

  - It makes it significantly easier to land the "reftable" backend as
    tests behave the same. It would be tough to argue for each and every
    single test that doesn't pass with the "reftable" backend.

  - It ensures compatibility between repositories that use the "files"
    backend and repositories that use the "reftable" backend. Like this,
    hosters can migrate their repositories to use the "reftable" backend
    without causing issues for clients that use the "files" backend in
    their clones.

It is expected that these artificial limitations may eventually go away
in the long term.

Performance-wise things very much depend on the actual workload. The
following benchmarks compare the "files" and "reftable" backends in the
current version:

  - Creating N refs in separate transactions shows that the "files"
    backend is ~50% faster. This is not surprising given that creating a
    ref only requires us to create a single loose ref. The "reftable"
    backend will also perform auto compaction on updates. In real-world
    workloads we would likely also want to perform pack loose refs,
    which would likely change the picture.

        Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 1)
          Time (mean ± σ):       2.1 ms ±   0.3 ms    [User: 0.6 ms, System: 1.7 ms]
          Range (min … max):     1.8 ms …   4.3 ms    133 runs

        Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 1)
          Time (mean ± σ):       2.7 ms ±   0.1 ms    [User: 0.6 ms, System: 2.2 ms]
          Range (min … max):     2.4 ms …   2.9 ms    132 runs

        Benchmark 3: update-ref: create refs sequentially (refformat = files, refcount = 1000)
          Time (mean ± σ):      1.975 s ±  0.006 s    [User: 0.437 s, System: 1.535 s]
          Range (min … max):    1.969 s …  1.980 s    3 runs

        Benchmark 4: update-ref: create refs sequentially (refformat = reftable, refcount = 1000)
          Time (mean ± σ):      2.611 s ±  0.013 s    [User: 0.782 s, System: 1.825 s]
          Range (min … max):    2.597 s …  2.622 s    3 runs

        Benchmark 5: update-ref: create refs sequentially (refformat = files, refcount = 100000)
          Time (mean ± σ):     198.442 s ±  0.241 s    [User: 43.051 s, System: 155.250 s]
          Range (min … max):   198.189 s … 198.670 s    3 runs

        Benchmark 6: update-ref: create refs sequentially (refformat = reftable, refcount = 100000)
          Time (mean ± σ):     294.509 s ±  4.269 s    [User: 104.046 s, System: 190.326 s]
          Range (min … max):   290.223 s … 298.761 s    3 runs

  - Creating N refs in a single transaction shows that the "files"
    backend is significantly slower once we start to write many refs.
    The "reftable" backend only needs to update two files, whereas the
    "files" backend needs to write one file per ref.

        Benchmark 1: update-ref: create many refs (refformat = files, refcount = 1)
          Time (mean ± σ):       1.9 ms ±   0.1 ms    [User: 0.4 ms, System: 1.4 ms]
          Range (min … max):     1.8 ms …   2.6 ms    151 runs

        Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 1)
          Time (mean ± σ):       2.5 ms ±   0.1 ms    [User: 0.7 ms, System: 1.7 ms]
          Range (min … max):     2.4 ms …   3.4 ms    148 runs

        Benchmark 3: update-ref: create many refs (refformat = files, refcount = 1000)
          Time (mean ± σ):     152.5 ms ±   5.2 ms    [User: 19.1 ms, System: 133.1 ms]
          Range (min … max):   148.5 ms … 167.8 ms    15 runs

        Benchmark 4: update-ref: create many refs (refformat = reftable, refcount = 1000)
          Time (mean ± σ):      58.0 ms ±   2.5 ms    [User: 28.4 ms, System: 29.4 ms]
          Range (min … max):    56.3 ms …  72.9 ms    40 runs

        Benchmark 5: update-ref: create many refs (refformat = files, refcount = 1000000)
          Time (mean ± σ):     152.752 s ±  0.710 s    [User: 20.315 s, System: 131.310 s]
          Range (min … max):   152.165 s … 153.542 s    3 runs

        Benchmark 6: update-ref: create many refs (refformat = reftable, refcount = 1000000)
          Time (mean ± σ):     51.912 s ±  0.127 s    [User: 26.483 s, System: 25.424 s]
          Range (min … max):   51.769 s … 52.012 s    3 runs

  - Deleting a ref in a fully-packed repository shows that the "files"
    backend scales with the number of refs. The "reftable" backend has
    constant-time deletions.

        Benchmark 1: update-ref: delete ref (refformat = files, refcount = 1)
          Time (mean ± σ):       1.7 ms ±   0.1 ms    [User: 0.4 ms, System: 1.2 ms]
          Range (min … max):     1.6 ms …   2.1 ms    316 runs

        Benchmark 2: update-ref: delete ref (refformat = reftable, refcount = 1)
          Time (mean ± σ):       1.8 ms ±   0.1 ms    [User: 0.4 ms, System: 1.3 ms]
          Range (min … max):     1.7 ms …   2.1 ms    294 runs

        Benchmark 3: update-ref: delete ref (refformat = files, refcount = 1000)
          Time (mean ± σ):       2.0 ms ±   0.1 ms    [User: 0.5 ms, System: 1.4 ms]
          Range (min … max):     1.9 ms …   2.5 ms    287 runs

        Benchmark 4: update-ref: delete ref (refformat = reftable, refcount = 1000)
          Time (mean ± σ):       1.9 ms ±   0.1 ms    [User: 0.5 ms, System: 1.3 ms]
          Range (min … max):     1.8 ms …   2.1 ms    217 runs

        Benchmark 5: update-ref: delete ref (refformat = files, refcount = 1000000)
          Time (mean ± σ):     229.8 ms ±   7.9 ms    [User: 182.6 ms, System: 46.8 ms]
          Range (min … max):   224.6 ms … 245.2 ms    6 runs

        Benchmark 6: update-ref: delete ref (refformat = reftable, refcount = 1000000)
          Time (mean ± σ):       2.0 ms ±   0.0 ms    [User: 0.6 ms, System: 1.3 ms]
          Range (min … max):     2.0 ms …   2.1 ms    3 runs

  - Listing all refs shows no significant advantage for either of the
    backends. The "files" backend is a bit faster, but not by a
    significant margin. When repositories are not packed the "reftable"
    backend outperforms the "files" backend because the "reftable"
    backend performs auto-compaction.

        Benchmark 1: show-ref: print all refs (refformat = files, refcount = 1, packed = true)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.5 ms …   2.0 ms    1729 runs

        Benchmark 2: show-ref: print all refs (refformat = reftable, refcount = 1, packed = true)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.5 ms …   1.8 ms    1816 runs

        Benchmark 3: show-ref: print all refs (refformat = files, refcount = 1000, packed = true)
          Time (mean ± σ):       4.3 ms ±   0.1 ms    [User: 0.9 ms, System: 3.3 ms]
          Range (min … max):     4.1 ms …   4.6 ms    645 runs

        Benchmark 4: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = true)
          Time (mean ± σ):       4.5 ms ±   0.2 ms    [User: 1.0 ms, System: 3.3 ms]
          Range (min … max):     4.2 ms …   5.9 ms    643 runs

        Benchmark 5: show-ref: print all refs (refformat = files, refcount = 1000000, packed = true)
          Time (mean ± σ):      2.537 s ±  0.034 s    [User: 0.488 s, System: 2.048 s]
          Range (min … max):    2.511 s …  2.627 s    10 runs

        Benchmark 6: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = true)
          Time (mean ± σ):      2.712 s ±  0.017 s    [User: 0.653 s, System: 2.059 s]
          Range (min … max):    2.692 s …  2.752 s    10 runs

        Benchmark 7: show-ref: print all refs (refformat = files, refcount = 1, packed = false)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.5 ms …   1.9 ms    1834 runs

        Benchmark 8: show-ref: print all refs (refformat = reftable, refcount = 1, packed = false)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.4 ms …   2.0 ms    1840 runs

        Benchmark 9: show-ref: print all refs (refformat = files, refcount = 1000, packed = false)
          Time (mean ± σ):      13.8 ms ±   0.2 ms    [User: 2.8 ms, System: 10.8 ms]
          Range (min … max):    13.3 ms …  14.5 ms    208 runs

        Benchmark 10: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = false)
          Time (mean ± σ):       4.5 ms ±   0.2 ms    [User: 1.2 ms, System: 3.3 ms]
          Range (min … max):     4.3 ms …   6.2 ms    624 runs

        Benchmark 11: show-ref: print all refs (refformat = files, refcount = 1000000, packed = false)
          Time (mean ± σ):     12.127 s ±  0.129 s    [User: 2.675 s, System: 9.451 s]
          Range (min … max):   11.965 s … 12.370 s    10 runs

        Benchmark 12: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = false)
          Time (mean ± σ):      2.799 s ±  0.022 s    [User: 0.735 s, System: 2.063 s]
          Range (min … max):    2.769 s …  2.836 s    10 runs

  - Printing a single ref shows no real difference between the "files"
    and "reftable" backends.

        Benchmark 1: show-ref: print single ref (refformat = files, refcount = 1)
          Time (mean ± σ):       1.5 ms ±   0.1 ms    [User: 0.4 ms, System: 1.0 ms]
          Range (min … max):     1.4 ms …   1.8 ms    1779 runs

        Benchmark 2: show-ref: print single ref (refformat = reftable, refcount = 1)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.4 ms …   2.5 ms    1753 runs

        Benchmark 3: show-ref: print single ref (refformat = files, refcount = 1000)
          Time (mean ± σ):       1.5 ms ±   0.1 ms    [User: 0.3 ms, System: 1.1 ms]
          Range (min … max):     1.4 ms …   1.9 ms    1840 runs

        Benchmark 4: show-ref: print single ref (refformat = reftable, refcount = 1000)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.5 ms …   2.0 ms    1831 runs

        Benchmark 5: show-ref: print single ref (refformat = files, refcount = 1000000)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.5 ms …   2.1 ms    1848 runs

        Benchmark 6: show-ref: print single ref (refformat = reftable, refcount = 1000000)
          Time (mean ± σ):       1.6 ms ±   0.1 ms    [User: 0.4 ms, System: 1.1 ms]
          Range (min … max):     1.5 ms …   2.1 ms    1762 runs

So overall, performance depends on the usecases. Except for many
sequential writes the "reftable" backend is roughly on par or
significantly faster than the "files" backend though. Given that the
"files" backend has received 18 years of optimizations by now this can
be seen as a win. Furthermore, we can expect that the "reftable" backend
will grow faster over time when attention turns more towards
optimizations.

The complete test suite passes, except for those tests explicitly marked
to require the REFFILES prerequisite. Some tests in t0610 are marked as
failing because they depend on still-in-flight bug fixes. Tests can be
run with the new backend by setting the GIT_TEST_DEFAULT_REF_FORMAT
environment variable to "reftable".

There is a single known conceptual incompatibility with the dumb HTTP
transport. As "info/refs" SHOULD NOT contain the HEAD reference, and
because the "HEAD" file is not valid anymore, it is impossible for the
remote client to figure out the default branch without changing the
protocol. This shortcoming needs to be handled in a subsequent patch
series.

As the reftable library has already been introduced a while ago, this
commit message will not go into the details of how exactly the on-disk
format works. Please refer to our preexisting technical documentation at
Documentation/technical/reftable for this.

[1]: https://public-inbox.org/git/CAJo=hJtyof=HRy=2sLP0ng0uZ4=S-DpZ5dR1aF+VHVETKG20OQ@mail.gmail.com/

Original-idea-by: Shawn Pearce <spearce@spearce.org>
Based-on-patch-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 08:28:37 -08:00
d8e08f0717 completion: bisect: recognize but do not complete view subcommand
The "view" alias for the visualize subcommand is neither completed nor
recognized.  It's undesirable to complete it because it's first letters
are the same as for visualize, making completion less rather than more
efficient without adding much in the way of interface discovery.
However, it needs to be recognized in order to enable log option
completion for it.

Recognize but do not complete the view command by creating and using
separate lists of completable_subcommands and all_subcommands.  Add
tests.

Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 15:11:46 -08:00
d115b87787 completion: bisect: complete log opts for visualize subcommand
Arguments passed to the "visualize" subcommand of git-bisect(1) get
forwarded to git-log(1). It thus supports the same options as git-log(1)
would, but our Bash completion script does not know to handle this.

Make completion of porcelain git-log options and option arguments to the
visualize subcommand work by calling __git_complete_log_opts when the
start of an option to the subcommand is seen (visualize doesn't support
any options besides the git-log options).  Add test.

Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 15:11:46 -08:00
a9e5b7a76d completion: new function __git_complete_log_opts
The options accepted by git-log are also accepted by at least one other
command (git-bisect).  Factor the common option completion code into a
new function and use it from _git_log.  The new function leaves
COMPREPLY empty if no option candidates are found, so that callers can
safely check it to determine if completion for other arguments should be
attempted.

Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 15:11:46 -08:00
41928aeb45 completion: bisect: complete missing --first-parent and - -no-checkout options
The --first-parent and --no-checkout options to the start subcommand of
git-bisect(1) are not completed.

Enable completion of the --first-parent and --no-checkout options to the
start subcommand.  Add test.

Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 15:11:46 -08:00
af8910a2d4 completion: bisect: complete custom terms and related options
git bisect supports the use of custom terms via the --term-(new|bad) and
--term-(old|good) options, but the completion code doesn't know about
these options or the new subcommands they define.

Add support for these options and the custom subcommands by checking for
BISECT_TERMS and adding them to the list of subcommands.  Add tests.

Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 15:11:46 -08:00
e1f74dd58b completion: bisect: complete bad, new, old, and help subcommands
The bad, new, old and help subcommands to git-bisect(1) are not
completed.

Add the bad, new, old, and help subcommands to the appropriate lists
such that the commands and their possible ref arguments are completed.
Add tests.

Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 15:11:46 -08:00
db489ea4f3 completion: tests: always use 'master' for default initial branch name
The default initial branch name can normally be configured using the
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME environment variable.  However,
when testing e.g. <rev> completion it's convenient to know the
exact initial branch name that will be used.

To achieve that without too much trouble it is considered sufficient
to force the default initial branch name to 'master' for all of
t9902-completion.sh.

Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 15:11:45 -08:00
235986be82 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 14:31:50 -08:00
c0515b3155 Merge branch 'cb/use-freebsd-13-2-at-cirrus-ci'
Cirrus CI jobs started breaking because we specified version of
FreeBSD that is no longer available, which has been corrected.

* cb/use-freebsd-13-2-at-cirrus-ci:
  ci: update FreeBSD cirrus job
2024-02-06 14:31:22 -08:00
07bbe4caab Merge branch 'jc/make-libpath-template'
The Makefile often had to say "-L$(path) -R$(path)" that repeats
the path to the same library directory for link time and runtime.
A Makefile template is used to reduce such repetition.

* jc/make-libpath-template:
  Makefile: simplify output of the libpath_template
  Makefile: reduce repetitive library paths
2024-02-06 14:31:22 -08:00
097c28db78 Merge branch 'rj/test-with-leak-check'
More tests that are supposed to pass leak sanitizer are marked as such.

* rj/test-with-leak-check:
  t0080: mark as leak-free
  test-lib: check for TEST_PASSES_SANITIZE_LEAK
  t6113: mark as leak-free
  t5332: mark as leak-free
2024-02-06 14:31:22 -08:00
c5887af55d Merge branch 'jc/t0091-with-unknown-git'
The test did not work when Git was built from a repository without
tags.

* jc/t0091-with-unknown-git:
  t0091: allow test in a repository without tags
2024-02-06 14:31:21 -08:00
1f9d2745fa Merge branch 'js/win32-retry-pipe-write-on-enospc'
Update to the code that writes to pipes on Windows.

* js/win32-retry-pipe-write-on-enospc:
  win32: special-case `ENOSPC` when writing to a pipe
2024-02-06 14:31:21 -08:00
46b5d75c08 Merge branch 'ps/tests-with-ref-files-backend'
Prepare existing tests on refs to work better with non-default
backends.

* ps/tests-with-ref-files-backend:
  t: mark tests regarding git-pack-refs(1) to be backend specific
  t5526: break test submodule differently
  t1419: mark test suite as files-backend specific
  t1302: make tests more robust with new extensions
  t1301: mark test for `core.sharedRepository` as reffiles specific
  t1300: make tests more robust with non-default ref backends
2024-02-06 14:31:21 -08:00
184c3b4c73 Merge branch 'jc/comment-style-fixes'
Rewrite //-comments to /* comments */ in files whose comments
prevalently use the latter.

* jc/comment-style-fixes:
  reftable/pq_test: comment style fix
  merge-ort.c: comment style fix
  builtin/worktree: comment style fixes
2024-02-06 14:31:21 -08:00
92e69dfb66 Merge branch 'jk/diff-external-with-no-index'
"git diff --no-index file1 file2" segfaulted while invoking the
external diff driver, which has been corrected.

* jk/diff-external-with-no-index:
  diff: handle NULL meta-info when spawning external diff
2024-02-06 14:31:21 -08:00
76bb1896de Merge branch 'kh/maintenance-use-xdg-when-it-should'
Comment fix.

* kh/maintenance-use-xdg-when-it-should:
  config: add back code comment
2024-02-06 14:31:20 -08:00
00e0bc3bd7 Merge branch 'tb/pack-bitmap-drop-unused-struct-member'
Code clean-up.

* tb/pack-bitmap-drop-unused-struct-member:
  pack-bitmap: drop unused `reuse_objects`
2024-02-06 14:31:20 -08:00
e87557faa1 Merge branch 'jt/p4-spell-re-with-raw-string'
"git p4" update to squelch warnings from Python.

* jt/p4-spell-re-with-raw-string:
  git-p4: use raw string literals for regular expressions
2024-02-06 14:31:20 -08:00
0f4e178a4f Merge branch 'ps/reftable-compacted-tables-permission-fix'
Reftable bugfix.

* ps/reftable-compacted-tables-permission-fix:
  reftable/stack: adjust permissions of compacted tables
2024-02-06 14:31:20 -08:00
b6fdf9aafa Merge branch 'jc/reftable-core-fsync'
The write codepath for the reftable data learned to honor
core.fsync configuration.

* jc/reftable-core-fsync:
  reftable/stack: fsync "tables.list" during compaction
  reftable: honor core.fsync
2024-02-06 14:31:20 -08:00
78307f1a89 .github/PULL_REQUEST_TEMPLATE.md: add a note about single-commit PRs
Contributors using Gitgitgadget continue to send single-commit PRs with
their commit message text duplicated below the three-dash line,
increasing the signal-to-noise ratio for reviewers.

This is because Gitgitgadget copies the pull request description as an
in-patch commentary, for single-commit PRs, and _GitHub_ defaults to
prefilling the pull request description with the commit message, for
single-commit PRs (followed by the content of the pull request
template).

Add a note in the pull request template mentioning that for
single-commit PRs, the PR description should thus be kept empty, in the
hope that contributors read it and act on it.

This partly addresses:
https://github.com/gitgitgadget/gitgitgadget/issues/340

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:22:55 -08:00
3ddef475d0 reftable/record: improve semantics when initializing records
According to our usual coding style, the `reftable_new_record()`
function would indicate that it is allocating a new record. This is not
the case though as the function merely initializes records without
allocating any memory.

Replace `reftable_new_record()` with a new `reftable_record_init()`
function that takes a record pointer as input and initializes it
accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:09 -08:00
62d3c8e8c8 reftable/merged: refactor initialization of iterators
Refactor the initialization of the merged iterator to fit our code style
better. This refactoring prepares the code for a refactoring of how
records are being initialized.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:09 -08:00
59f302ca5a reftable/merged: refactor seeking of records
The code to seek reftable records in the merged table code is quite hard
to read and does not conform to our coding style in multiple ways:

  - We have multiple exit paths where we release resources even though
    that is not really necessary.

  - We use a scoped error variable `e` which is hard to reason about.
    This variable is not required at all.

  - We allocate memory in the variable declarations, which is easy to
    miss.

Refactor the function so that it becomes more maintainable in the
future.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:08 -08:00
81879123c3 reftable/stack: use size_t to track stack length
While the stack length is already stored as `size_t`, we frequently use
`int`s to refer to those stacks throughout the reftable library. Convert
those cases to use `size_t` instead to make things consistent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:08 -08:00
47616c4399 reftable/stack: use size_t to track stack slices during compaction
We use `int`s to track reftable slices when compacting the reftable
stack, which is considered to be a code smell in the Git project.
Convert the code to use `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:08 -08:00
6d5e80fba2 reftable/stack: index segments with size_t
We use `int`s to index into arrays of segments and track the length of
them, which is considered to be a code smell in the Git project. Convert
the code to use `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:08 -08:00
ca63af0a24 reftable/stack: fix parameter validation when compacting range
The `stack_compact_range()` function receives a "first" and "last" index
that indicates which tables of the reftable stack should be compacted.
Naturally, "first" must be smaller than "last" in order to identify a
proper range of tables to compress, which we indeed also assert in the
function. But the validations happens after we have already allocated
arrays with a size of `last - first + 1`, leading to an underflow and
thus an invalid allocation size.

Fix this by reordering the array allocations to happen after we have
validated parameters. While at it, convert the array allocations to use
the newly introduced macros.

Note that the relevant variables pointing into arrays should also be
converted to use `size_t` instead of `int`. This is left for a later
commit in this series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:08 -08:00
b4ff12c8ee reftable: introduce macros to allocate arrays
Similar to the preceding commit, let's carry over macros to allocate
arrays with `REFTABLE_ALLOC_ARRAY()` and `REFTABLE_CALLOC_ARRAY()`. This
requires us to change the signature of `reftable_calloc()`, which only
takes a single argument right now and thus puts the burden on the caller
to calculate the final array's size. This is a net improvement though as
it means that we can now provide proper overflow checks when multiplying
the array size with the member size.

Convert callsites of `reftable_calloc()` to the new signature and start
using the new macros where possible.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:08 -08:00
f6b58c1be4 reftable: introduce macros to grow arrays
Throughout the reftable library we have many cases where we need to grow
arrays. In order to avoid too many reallocations, we roughly double the
capacity of the array on each iteration. The resulting code pattern is
duplicated across many sites.

We have similar patterns in our main codebase, which is why we have
eventually introduced an `ALLOC_GROW()` macro to abstract it away and
avoid some code duplication. We cannot easily reuse this macro here
though because `ALLOC_GROW()` uses `REALLOC_ARRAY()`, which in turn will
call realloc(3P) to grow the array. The reftable code is structured as a
library though (even if the boundaries are fuzzy), and one property this
brings with it is that it is possible to plug in your own allocators. So
instead of using realloc(3P), we need to use `reftable_realloc()` that
knows to use the user-provided implementation.

So let's introduce two new macros `REFTABLE_REALLOC_ARRAY()` and
`REFTABLE_ALLOC_GROW()` that mirror what we do in our main codebase,
with two modifications:

  - They use `reftable_realloc()`, as explained above.

  - They use a different growth factor of `2 * cap + 1` instead of `(cap
    + 16) * 3 / 2`.

The second change is because we know a bit more about the allocation
patterns in the reftable library. In most cases, we end up only having a
handful of items in the array and don't end up growing them. The initial
capacity that our normal growth factor uses (which is 24) would thus end
up over-allocating in a lot of code paths. This effect is measurable:

  - Before change:

      HEAP SUMMARY:
          in use at exit: 671,983 bytes in 152 blocks
        total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,402 bytes allocated

  - After change with a growth factor of `(2 * alloc + 1)`:

      HEAP SUMMARY:
          in use at exit: 671,983 bytes in 152 blocks
        total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,410 bytes allocated

  - After change with a growth factor of `(alloc + 16)* 2 / 3`:

      HEAP SUMMARY:
          in use at exit: 671,983 bytes in 152 blocks
        total heap usage: 3,833,673 allocs, 3,833,521 frees, 4,728,251,742 bytes allocated

While the total heap usage is roughly the same, we do end up allocating
significantly more bytes with our usual growth factor (in fact, roughly
21 times as many).

Convert the reftable library to use these new macros.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:10:08 -08:00
d2058cb2f0 builtin/stash: report failure to write to index
The git-stash(1) command needs to write to the index for many of its
operations. When the index is locked by a concurrent writer it will thus
fail to operate, which is expected. What is not expected though is that
we do not print any error message at all in this case. The user can thus
easily miss the fact that the command didn't do what they expected it to
do and would be left wondering why that is.

Fix this bug and report failures to write to the index. Add tests for
the subcommands which hit the respective code paths.

While at it, unify error messages when writing to the index fails. The
chosen error message is already used in "builtin/stash.c".

Reported-by: moti sd <motisd8@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:08:38 -08:00
568459bf5e Always check the return value of repo_read_object_file()
There are a couple of places in Git's source code where the return value
is not checked. As a consequence, they are susceptible to segmentation
faults.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 10:42:28 -08:00
23c1e71369 pack-objects: enable multi-pack reuse via feature.experimental
Now that multi-pack reuse is supported, enable it via the
feature.experimental configuration in addition to the classic
`pack.allowPackReuse`.

This will allow more users to experiment with the new behavior who might
not otherwise be aware of the existing `pack.allowPackReuse`
configuration option.

The enum with values NO_PACK_REUSE, SINGLE_PACK_REUSE, and
MULTI_PACK_REUSE is defined statically in builtin/pack-objects.c's
compilation unit. We could hoist that enum into a scope visible from the
repository_settings struct, and then use that enum value in
pack-objects. Instead, define a single int that indicates what
pack-objects's default value should be to avoid additional unnecessary
code movement.

Though `feature.experimental` implies `pack.allowPackReuse=multi`, this
can still be overridden by explicitly setting the latter configuration
to either "single" or "false". Tests covering all of these cases are
showin t5332.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-05 15:27:01 -08:00
7c01878eeb t5332-multi-pack-reuse.sh: extract pack-objects helper functions
Most of the tests in t5332 perform some setup before repeating a common
refrain that looks like:

    : >trace2.txt &&
    GIT_TRACE2_EVENT="$PWD/trace2.txt" \
      git pack-objects --stdout --revs --all >/dev/null &&

    test_pack_reused $objects_nr <trace2.txt &&
    test_packs_reused $packs_nr <trace2.txt

The next commit will add more tests which repeat the above refrain.
Avoid duplicating this invocation even further and prepare for the
following commit by wrapping the above in a helper function called
`test_pack_objects_reused_all()`.

Introduce another similar function `test_pack_objects_reused`, which
expects to read a list of revisions over stdin for tests which need more
fine-grained control of the contents of the pack they generate.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-05 15:27:00 -08:00
483b759b47 Merge branch 'jk/unit-tests-buildfix' into js/unit-test-suite-runner
* jk/unit-tests-buildfix:
  t/Makefile: say the default target upfront
  t/Makefile: get UNIT_TESTS list from C sources
  Makefile: remove UNIT_TEST_BIN directory with "make clean"
  Makefile: use mkdir_p_parent_template for UNIT_TEST_BIN
2024-02-03 12:33:00 -08:00
4904a4d08c t/Makefile: say the default target upfront
Similar to how 2731d048 (Makefile: say the default target upfront.,
2005-12-01) added the default target to the very beginning of the
main Makefile to prevent a random rule that happens to be defined
first in an included makefile fragments from becoming the default
target, protect this Makefile the same way.

This started to matter as we started to include config.mak.uname
and that included makefile fragment does more than defining Make
macros, unfortunately.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-02 18:41:35 -08:00
dcce2bda21 Merge branch 'jc/maint-github-actions-update' into jc/github-actions-update
This contains an evil merge to tell the fuzz-smoke-test job to
also use checkout@v4; the job has been added since the master
track diverged from the maintenance track.

* jc/maint-github-actions-update:
  GitHub Actions: update to github-script@v7
  GitHub Actions: update to checkout@v4
2024-02-02 13:03:30 -08:00
c4ddbe043e GitHub Actions: update to github-script@v7
We seem to be getting "Node.js 16 actions are deprecated." warnings
for jobs that use github-script@v6.  Update to github-script@v7,
which is said to use Node.js 20.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-02 13:00:46 -08:00
e94dec0c1d GitHub Actions: update to checkout@v4
We seem to be getting "Node.js 16 actions are deprecated." warnings
for jobs that use checkout@v3.  Except for the i686 containers job
that is kept at checkout@v1 [*], update to checkout@v4, which is
said to use Node.js 20.

[*] 6cf4d908 (ci(main): upgrade actions/checkout to v3, 2022-12-05)
    refers to https://github.com/actions/runner/issues/2115 and
    explains why container jobs are kept at checkout@v1.  We may
    want to check the current status of the issue and move it to the
    same version as other jobs, but that is outside the scope of
    this step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-02 13:00:34 -08:00
2a540e432f The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-02 11:31:51 -08:00
bcf524023e Merge branch 'zf/subtree-split-fix'
"git subtree" (in contrib/) update.

* zf/subtree-split-fix:
  subtree: fix split processing with multiple subtrees present
2024-02-02 11:31:51 -08:00
bbc8c05670 Merge branch 'jc/ls-files-doc-update'
The documentation for the --exclude-per-directory option marked it
as deprecated, which confused readers into thinking there may be a
plan to remove it in the future, which was not our intention.

* jc/ls-files-doc-update:
  ls-files: avoid the verb "deprecate" for individual options
2024-02-02 11:31:51 -08:00
cbcf61990f Merge branch 'jk/fetch-auto-tag-following-fix'
Fetching via protocol v0 over Smart HTTP transport sometimes failed
to correctly auto-follow tags.

* jk/fetch-auto-tag-following-fix:
  transport-helper: re-examine object dir after fetching
2024-02-02 11:31:51 -08:00
082f7b0f79 Merge branch 'jc/coc-whitespace-fix'
Docfix.

* jc/coc-whitespace-fix:
  CoC: whitespace fix
2024-02-02 11:31:51 -08:00
9e189a03da Merge branch 'ad/custom-merge-placeholder-for-symbolic-pathnames'
The labels on conflict markers for the common ancestor, our version,
and the other version are available to custom 3-way merge driver
via %S, %X, and %Y placeholders.

* ad/custom-merge-placeholder-for-symbolic-pathnames:
  merge-ll: expose revision names to custom drivers
2024-02-02 11:31:50 -08:00
35d94b55f7 Merge branch 'jc/reffiles-tests'
Tests on ref API are moved around to prepare for reftable.

* jc/reffiles-tests:
  t5312: move reffiles specific tests to t0601
  t4202: move reffiles specific tests to t0600
  t3903: make drop stash test ref backend agnostic
  t1503: move reffiles specific tests to t0600
  t1415: move reffiles specific tests to t0601
  t1410: move reffiles specific tests to t0600
  t1406: move reffiles specific tests to t0600
  t1405: move reffiles specific tests to t0601
  t1404: move reffiles specific tests to t0600
  t1414: convert test to use Git commands instead of writing refs manually
  remove REFFILES prerequisite for some tests in t1405 and t2017
  t3210: move to t0601
2024-02-02 11:31:50 -08:00
3c0b8444a7 Merge branch 'pb/complete-log-more'
The completion script (in contrib/) learned more options that can
be used with "git log".

* pb/complete-log-more:
  completion: complete missing 'git log' options
  completion: complete --encoding
  completion: complete --patch-with-raw
  completion: complete missing rev-list options
2024-02-02 11:31:50 -08:00
156e28b36d sparse-index: pass string length to index_file_exists()
The call to index_file_exists() in the loop in expand_to_path() passes
the wrong string length.  Let's fix that.

The loop in expand_to_path() searches the name-hash for each
sub-directory prefix in the provided pathname. That is, by searching
for "dir1/" then "dir1/dir2/" then "dir1/dir2/dir3/" and so on until
it finds a cache-entry representing a sparse directory.

The code creates "strbuf path_mutable" to contain the working pathname
and modifies the buffer in-place by temporarily replacing the character
following each successive "/" with NUL for the duration of the call to
index_file_exists().

It does not update the strbuf.len during this substitution.

Pass the patched length of the prefix path instead.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-02 10:25:39 -08:00
30bced3a67 imap-send: add missing "strbuf.h" include under NO_CURL
Building with NO_CURL is currently broken since imap-send.c uses things
defined in "strbuf.h" wihtout including it.

The inclusion of that header was removed in eea0e59ffb (treewide: remove
unnecessary includes in source files, 2023-12-23), which failed to
notice that "strbuf.h" was transitively included in imap-send.c via
"http.h", but only if USE_CURL_FOR_IMAP_SEND is defined. Add back the
missing include. Note that it was explicitely added in 3307f7dde2
(imap-send: include strbuf.h, 2023-05-17) after a similar breakage in
ba3d1c73da (treewide: remove unnecessary cache.h includes, 2023-02-24) -
see the thread starting at [1].

It can be verified by inspection that this is the only case where a
header we include is dependent on a Makefile knob in the files modified
in eea0e59ffb.

[1] https://lore.kernel.org/git/20230517070632.71884-1-list@eworm.de/

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 18:20:41 -08:00
4950acae7d reftable: document reading and writing indices
The way the index gets written and read is not trivial at all and
requires the reader to piece together a bunch of parts to figure out how
it works. Add some documentation to hopefully make this easier to
understand for the next reader.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 11:11:33 -08:00
e7485601ca reftable/writer: fix writing multi-level indices
When finishing a section we will potentially write an index that makes
it more efficient to look up relevant blocks. The index records written
will encode, for each block of the indexed section, what the offset of
that block is as well as the last key of that block. Thus, the reader
would iterate through the index records to find the first key larger or
equal to the wanted key and then use the encoded offset to look up the
desired block.

When there are a lot of blocks to index though we may end up writing
multiple index blocks, too. To not require a linear search across all
index blocks we instead end up writing a multi-level index. Instead of
referring to the block we are after, an index record may point to
another index block. The reader will then access the highest-level index
and follow down the chain of index blocks until it hits the sought-after
block.

It has been observed though that it is impossible to seek ref records of
the last ref block when using a multi-level index. While the multi-level
index exists and looks fine for most of the part, the highest-level
index was missing an index record pointing to the last block of the next
index. Thus, every additional level made more refs become unseekable at
the end of the ref section.

The root cause is that we are not flushing the last block of the current
level once done writing the level. Consequently, it wasn't recorded in
the blocks that need to be indexed by the next-higher level and thus we
forgot about it.

Fix this bug by flushing blocks after we have written all index records.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 11:11:32 -08:00
b66e006ff5 reftable/writer: simplify writing index records
When finishing the current section some index records might be written
for the section to the table. The logic that adds these records to the
writer duplicates what we already have in `writer_add_record()`, making
this more complicated than it really has to be.

Simplify the code by using `writer_add_record()` instead. While at it,
drop the unneeded braces around a loop to make the code conform to our
code style better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 11:11:32 -08:00
9ebb2d7b08 reftable/writer: use correct type to iterate through index entries
The reftable writer is tracking the number of blocks it has to index via
the `index_len` variable. But while this variable is of type `size_t`,
some sites use an `int` to loop through the index entries.

Convert the code to consistently use `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 11:11:32 -08:00
d55fc5128b reftable/reader: be more careful about errors in indexed seeks
When doing an indexed seek we first need to do a linear seek in order to
find the index block for our wanted key. We do not check the returned
error of the linear seek though. This is likely not an issue because the
next call to `table_iter_next()` would return error, too. But it very
much is a code smell when an error variable is being assigned to without
actually checking it.

Safeguard the code by checking for errors.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 11:11:32 -08:00
0f8edf7317 index-pack: --fsck-objects to take an optional argument for fsck msgs
git-index-pack has a --strict option that can take an optional argument
to provide a list of fsck issues to change their severity.
--fsck-objects does not have such a utility, which would be useful if
one would like to be more lenient or strict on data integrity in a
repository.

Like --strict, allow --fsck-objects to also take a list of fsck msgs to
change the severity.

Remove the "For internal use only" note for --fsck-objects, and document
the option. This won't often be used by the normal end user, but it
turns out it is useful for Git forges like GitLab.

Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 11:09:53 -08:00
2811019f47 index-pack: test and document --strict=<msg-id>=<severity>...
5d477a334a (fsck (receive-pack): allow demoting errors to warnings,
2015-06-22) allowed a list of fsck msg to downgrade to be passed to
--strict. However this is a hidden argument that was not documented nor
tested. Though it is true that most users would not call this option
directly, (nor use index-pack for that matter) it is still useful to
document and test this feature.

Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 11:09:53 -08:00
4ee286e8e6 Makefile: simplify output of the libpath_template
If a platform lacks the support to specify the dynamic library path,
there is no suitable value to give to the CC_LD_DYNPATH variable.
Allow them to be set to an empty string to signal that they do not
need to add the usual -Wl,-rpath, or -R or whatever option followed
by a directory name.  This way,

    $(call libpath_template,$(SOMELIBDIR))

would expand to just a single mention of that directory, i.e.

    -L$(SOMELIBDIR)

when CC_LD_DYNPATH is set to an empty string (or a "-L", which
would have repeated the same "-L$(SOMELIBDIR)" twice without any
ill effect).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-31 14:43:00 -08:00
81fffb66d3 ci: update FreeBSD cirrus job
FreeBSD 12 is EOL and no longer available, causing errors in this job.

Upgrade to 13.2, which is the next oldest release with support and that
should keep it for at least another 4 months.

This will be upgraded again once 13.3 is released to avoid further
surprises.

The original report [*] of this problem mentions an error message
"Not enough compute credits to prioritize tasks!".  It seems to be
just a reminder that the credit allocate for the Free Tier by Cirrus
is all used up and which might result in additional delays getting a
result.

[*] https://lore.kernel.org/git/d2d7da84-e2a3-a7b2-3f95-c8d53ad4dd5f@gmx.de/

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-31 14:41:56 -08:00
799d449105 t/Makefile: get UNIT_TESTS list from C sources
We decide on the set of unit tests to run by asking make to expand the
wildcard "t/unit-tests/bin/*". One unfortunate outcome of this is that
we'll run anything in that directory, even if it is leftover cruft from
a previous build. This isn't _quite_ as bad as it sounds, since in
theory the unit tests executables are self-contained (so if they passed
before, they'll pass even though they now have nothing to do with the
checked out version of Git). But at the very least it's wasteful, and if
they _do_ fail it can be quite confusing to understand why they are
being run at all.

This wildcarding presumably came from our handling of the regular
shell-script tests, which use $(wildcard t[0-9][0-9][0-9][0-9]-*.sh).
But the difference there is that those are actual tracked files. So if
you checkout a different commit, they'll go away. Whereas the contents
of unit-tests/bin are ignored (so not only do they stick around, but you
are not even warned of the stale files via "git status").

This patch fixes the situation by looking for the actual unit-test
source files and then massaging those names into the final executable
names. This has two additional benefits:

  1. It will notice if we failed to build one or more unit-tests for
     some reason (whereas the current code just runs whatever made it to
     the bin/ directory).

  2. The wildcard should avoid other build cruft, like the pdb files we
     worked around in 0df903d402 (unit-tests: do not mistake `.pdb`
     files for being executable, 2023-09-25).

Our new wildcard does make an assumption that unit tests are built from
C sources. It would be a bit cleaner if we consulted UNIT_TEST_PROGRAMS
from the top-level Makefile. But doing so is tricky unless we reorganize
that Makefile to split the source file lists into include-able subfiles.
That might be worth doing in general, but in the meantime, the
assumptions made by the wildcard here seems reasonable.

Note that we do need to include config.mak.uname either way, though, as
we need the value of $(X) to compute the correct executable names (which
would be true even if we had access to the top-level's UNIT_TEST_PROGRAMS
variable).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-31 14:41:12 -08:00
354dbf7d64 Makefile: reduce repetitive library paths
When we take a library package we depend on (e.g., LIBPCRE) from a
directory other than the default location of the system, we add the
same directory twice on the linker command like, like so:

  EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)

Introduce a template "libpath_template" that takes the path to the
directory, which can be used like so:

  EXTLIBS += $(call libpath_template,$(LIBPCREDIR)/$(lib))

and expand it into the "-L$(DIR) $(CC_LD_DYNPATH)$(DIR)" form.
Hopefully we can reduce the chance of typoes this way.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-31 10:01:56 -08:00
77d1ae4793 Merge branch 'jc/reftable-core-fsync' into ps/reftable-multi-level-indices-fix
* jc/reftable-core-fsync:
  reftable/stack: fsync "tables.list" during compaction
  reftable: honor core.fsync
2024-01-30 14:11:44 -08:00
19ed0dff8f win32: special-case ENOSPC when writing to a pipe
Since c6d3cce6f3 (pipe_command(): handle ENOSPC when writing to a
pipe, 2022-08-17), one `write()` call that results in an `errno` value
`ENOSPC` (which typically indicates out of disk space, which makes
little sense in the context of a pipe) is treated the same as `EAGAIN`.

However, contrary to expectations, as diagnosed in
https://github.com/python/cpython/issues/101881#issuecomment-1428667015,
when writing to a non-blocking pipe on Windows, an `errno` value of
`ENOSPC` means something else: the write _fails_. Completely. Because
more data was provided than the internal pipe buffer can handle.
Somewhat surprising, considering that `write()` is allowed to write less
than the specified amount, e.g. by writing only as much as fits in that
buffer. But it doesn't, it writes no byte at all in that instance.

Let's handle this by manually detecting when an `ENOSPC` indicates that
a pipe's buffer is smaller than what needs to be written, and re-try
using the pipe's buffer size as `size` parameter.

It would be plausible to try writing the entire buffer in a loop,
feeding pipe buffer-sized chunks, but experiments show that trying to
write more than one buffer-sized chunk right after that will immediately
fail because the buffer is unlikely to be drained as fast as `write()`
could write again. And the whole point of a non-blocking pipe is to be
non-blocking.

Which means that the logic that determines the pipe's buffer size
unfortunately has to be run potentially many times when writing large
amounts of data to a non-blocking pipe, as there is no elegant way to
cache that information between `write()` calls. It's the best we can do,
though, so it has to be good enough.

This fix is required to let t3701.60 (handle very large filtered diff)
pass with the MSYS2 runtime provided by the MSYS2 project: Without this
patch, the failed write would result in an infinite loop. This patch is
not required with Git for Windows' variant of the MSYS2 runtime only
because Git for Windows added an ugly work-around specifically to avoid
a hang in that test case.

The diff is slightly chatty because it extends an already-existing
conditional that special-cases a _different_ `errno` value for pipes,
and because this patch needs to account for the fact that
`_get_osfhandle()` potentially overwrites `errno`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-30 13:59:16 -08:00
bc7ee2e5e1 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-30 13:34:13 -08:00
65973e4e99 Merge branch 'sd/negotiate-trace-fix'
Tracing fix.

* sd/negotiate-trace-fix:
  push: region_leave trace for negotiate_using_fetch
2024-01-30 13:34:13 -08:00
a8bf3c0cac Merge branch 'kl/allow-working-in-dot-git-in-non-bare-repository'
The "disable repository discovery of a bare repository" check,
triggered by setting safe.bareRepository configuration variable to
'explicit', has been loosened to exclude the ".git/" directory inside
a non-bare repository from the check.  So you can do "cd .git &&
git cmd" to run a Git command that works on a bare repository without
explicitly specifying $GIT_DIR now.

* kl/allow-working-in-dot-git-in-non-bare-repository:
  setup: allow cwd=.git w/ bareRepository=explicit
2024-01-30 13:34:12 -08:00
fa50e7a8a0 Merge branch 'jx/remote-archive-over-smart-http'
"git archive --remote=<remote>" learned to talk over the smart
http (aka stateless) transport.

* jx/remote-archive-over-smart-http:
  transport-helper: call do_take_over() in process_connect
  transport-helper: call do_take_over() in connect_helper
  http-backend: new rpc-service for git-upload-archive
  transport-helper: protocol v2 supports upload-archive
  remote-curl: supports git-upload-archive service
  transport-helper: no connection restriction in connect_helper
2024-01-30 13:34:12 -08:00
e14c0ab176 Merge branch 'rj/advice-disable-how-to-disable'
All conditional "advice" messages show how to turn them off, which
becomes repetitive.  Setting advice.* configuration explicitly on
now omits the instruction part.

* rj/advice-disable-how-to-disable:
  advice: allow disabling the automatic hint in advise_if_enabled()
2024-01-30 13:34:12 -08:00
2e77b83993 Merge branch 'rs/parse-options-with-keep-unknown-abbrev-fix'
"git diff --no-rename A B" did not disable rename detection but did
not trigger an error from the command line parser.

* rs/parse-options-with-keep-unknown-abbrev-fix:
  parse-options: simplify positivation handling
  parse-options: fully disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
2024-01-30 13:34:12 -08:00
262fa1e968 Merge branch 'pb/ci-github-skip-logs-for-broken-tests'
GitHub CI update.

* pb/ci-github-skip-logs-for-broken-tests:
  ci(github): also skip logs of broken test cases
2024-01-30 13:34:11 -08:00
3cb4384683 t0091: allow test in a repository without tags
The beginning of the [System Info] section, which should match the
"git version --build-options" output, may not identify our version
as "git version 2.whatever".  When built in a repository cloned
without tags, for example, "git version unknown.g00000000" can be a
legit version string.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-30 13:32:41 -08:00
7fa52fdad5 reftable/stack: fsync "tables.list" during compaction
In 1df18a1c9a (reftable: honor core.fsync, 2024-01-23), we have added
code to fsync both newly written reftables as well as "tables.list" to
disk. But there are two code paths where "tables.list" is being written:

  - When appending a new table due to a normal ref update.

  - When compacting a range of tables during compaction.

We have only addressed the former code path, but do not yet sync the new
"tables.list" file in the latter. Fix this omission.

Note that we are not yet adding any tests. These tests will be added
once the "reftable" backend has been upstreamed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-30 11:56:15 -08:00
5d5ca1b362 Makefile: remove UNIT_TEST_BIN directory with "make clean"
We remove $(UNIT_TEST_PROGS), but that leaves the automatically
generated "bin" dir they reside in. And once we start cleaning that,
there is no point in removing the individual programs, as they'll by
wiped out by the recurse "rm".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 21:57:22 -08:00
318ecda5aa Makefile: use mkdir_p_parent_template for UNIT_TEST_BIN
We build the UNIT_TEST_BIN directory (t/unit-tests/bin) on the fly with
"mkdir -p". And so the recipe for UNIT_TEST_PROGS, which put their
output in that directory, depend on UNIT_TEST_BIN to make sure it's
there.

But using a normal dependency leads to weird outcomes, because the
timestamp of the directory is important. For example, try this:

  $ make
  [...builds everything...]

  [now re-build one unit test]
  $ touch t/unit-tests/t-ctype.c
  $ make
      SUBDIR templates
      CC t/unit-tests/t-ctype.o
      LINK t/unit-tests/bin/t-ctype

So far so good. Now running make again should build nothing. But it
doesn't!

  $ make
      SUBDIR templates
      LINK t/unit-tests/bin/t-basic
      LINK t/unit-tests/bin/t-mem-pool
      LINK t/unit-tests/bin/t-strbuf

Er, what? Let's rebuild again:

  $ make
      SUBDIR templates
      LINK t/unit-tests/bin/t-ctype

Weird. And now we ping-pong back and forth forever:

  $ make
      SUBDIR templates
      LINK t/unit-tests/bin/t-basic
      LINK t/unit-tests/bin/t-mem-pool
      LINK t/unit-tests/bin/t-strbuf
  $ make
      SUBDIR templates
      LINK t/unit-tests/bin/t-ctype

What happens is that writing t/unit-tests/bin/t-ctype updates the mtime
of the directory t/unit-tests/bin. And then on the next invocation of
make, all of those other tests are now older and so get rebuilt. And
back and forth forever.

We can fix this by making the directory as part of the build recipe for
the programs, using the template from 0b6d0bc924 (Makefiles: add and use
wildcard "mkdir -p" template, 2022-03-03).

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 21:57:22 -08:00
c5b454771e The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 16:03:01 -08:00
474df39f58 Merge branch 'al/t2400-depipe'
Coding style fix.

* al/t2400-depipe:
  t2400: avoid losing exit status to pipes
2024-01-29 16:03:01 -08:00
40225ba8b4 Merge branch 'gt/t0024-style-fixes'
Coding style fix.

* gt/t0024-style-fixes:
  t0024: style fix
  t0024: avoid losing exit status to pipes
2024-01-29 16:03:00 -08:00
d3bf8d32d3 Merge branch 'en/diffcore-delta-final-line-fix'
Rename detection logic ignored the final line of a file if it is an
incomplete line.

* en/diffcore-delta-final-line-fix:
  diffcore-delta: avoid ignoring final 'line' of file
2024-01-29 16:03:00 -08:00
8282f95928 Merge branch 'ps/not-so-many-refs-are-special'
Define "special ref" as a very narrow set that consists of
FETCH_HEAD and MERGE_HEAD, and clarify everything else that used to
be classified as such are actually just pseudorefs.

* ps/not-so-many-refs-are-special:
  Documentation: add "special refs" to the glossary
  refs: redefine special refs
  refs: convert MERGE_AUTOSTASH to become a normal pseudo-ref
  sequencer: introduce functions to handle autostashes via refs
  refs: convert AUTO_MERGE to become a normal pseudo-ref
  sequencer: delete REBASE_HEAD in correct repo when picking commits
  sequencer: clean up pseudo refs with REF_NO_DEREF
2024-01-29 16:03:00 -08:00
9869e02a64 Merge branch 'js/oss-fuzz-build-in-ci'
oss-fuzz tests are built and run in CI.

* js/oss-fuzz-build-in-ci:
  ci: build and run minimal fuzzers in GitHub CI
  fuzz: fix fuzz test build rules
2024-01-29 16:03:00 -08:00
68812df310 Merge branch 'jc/majordomo-to-subspace'
Doc update.

* jc/majordomo-to-subspace:
  Docs: majordomo@vger.kernel.org has been decomissioned
2024-01-29 16:03:00 -08:00
a0003a5490 Merge branch 'nb/rebase-x-shell-docfix'
Doc update.

* nb/rebase-x-shell-docfix:
  rebase: fix documentation about used shell in -x
2024-01-29 16:02:59 -08:00
cf58f5920d Merge branch 'tc/show-ref-exists-fix'
Update to a new feature recently added, "git show-ref --exists".

* tc/show-ref-exists-fix:
  builtin/show-ref: treat directory as non-existing in --exists
2024-01-29 16:02:59 -08:00
4d5a46ecb1 Merge branch 'ps/reftable-optimize-io'
Low-level I/O optimization for reftable.

* ps/reftable-optimize-io:
  reftable/stack: fix race in up-to-date check
  reftable/stack: unconditionally reload stack after commit
  reftable/blocksource: use mmap to read tables
  reftable/blocksource: refactor code to match our coding style
  reftable/stack: use stat info to avoid re-reading stack list
  reftable/stack: refactor reloading to use file descriptor
  reftable/stack: refactor stack reloading to have common exit path
2024-01-29 16:02:59 -08:00
03f72a4ed8 t0080: mark as leak-free
This test is leak-free since it was added in e137fe3b29 (unit tests: add
TAP unit test framework, 2023-11-09)

Let's mark it as leak-free to make sure it stays that way (and to reduce
noise when looking for other leak-free scripts after we fix some leaks).

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 16:02:47 -08:00
92e209be78 test-lib: check for TEST_PASSES_SANITIZE_LEAK
TEST_PASSES_SANITIZE_LEAK must be set before sourcing test-lib.sh, as we
say in t/README:

   GIT_TEST_PASSING_SANITIZE_LEAK=true skips those tests that haven't
   declared themselves as leak-free by setting
   "TEST_PASSES_SANITIZE_LEAK=true" before sourcing "test-lib.sh". This
   test mode is used by the "linux-leaks" CI target.

   GIT_TEST_PASSING_SANITIZE_LEAK=check checks that our
   "TEST_PASSES_SANITIZE_LEAK=true" markings are current. Rather than
   skipping those tests that haven't set "TEST_PASSES_SANITIZE_LEAK=true"
   before sourcing "test-lib.sh" this mode runs them with
   "--invert-exit-code". This is used to check that there's a one-to-one
   mapping between "TEST_PASSES_SANITIZE_LEAK=true" and those tests that
   pass under "SANITIZE=leak". This is especially useful when testing a
   series that fixes various memory leaks with "git rebase -x".

In a recent commit we fixed a test where it was set after sourcing
test-lib.sh, leading to confusing results.

To prevent future oversights, let's add a simple check to ensure the
value for TEST_PASSES_SANITIZE_LEAK remains unchanged at test_done().

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:21:53 -08:00
90a694a27e t6113: mark as leak-free
This test does not leak since a96015a517 (pack-bitmap: plug leak in
find_objects(), 2023-12-14) when the annotation
TEST_PASSES_SANITIZE_LEAK=true was also added.

Unfortunately it was added after test-lib.sh is sourced, which makes
GIT_TEST_PASSING_SANITIZE_LEAK=check error:

   $ make SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check test T=t6113-rev-list-bitmap-filters.sh
   ...
   make[2]: Entering directory '/tmp/git/git/t'
   *** t6113-rev-list-bitmap-filters.sh ***
   in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
   ok 1 - set up bitmapped repo
   ok 2 - filters fallback to non-bitmap traversal
   ok 3 - blob:none filter
   ok 4 - blob:none filter with specified blob
   ok 5 - blob:limit filter
   ok 6 - blob:limit filter with specified blob
   ok 7 - tree:0 filter
   ok 8 - tree:0 filter with specified blob, tree
   ok 9 - tree:1 filter
   ok 10 - object:type filter
   ok 11 - object:type filter with --filter-provided-objects
   ok 12 - combine filter
   ok 13 - combine filter with --filter-provided-objects
   ok 14 - bitmap traversal with --unpacked
   # passed all 14 test(s)
   1..14
   # faking up non-zero exit with --invert-exit-code
   make[2]: *** [Makefile:68: t6113-rev-list-bitmap-filters.sh] Error 1
   make[2]: Leaving directory '/tmp/git/git/t'
   make[1]: *** [Makefile:55: test] Error 2
   make[1]: Leaving directory '/tmp/git/git/t'
   make: *** [Makefile:3212: test] Error 2

Let's move the annotation before sourcing test-lib.sh, to make
GIT_TEST_PASSING_SANITIZE_LEAK=check happy.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:20:36 -08:00
86b8254144 t5332: mark as leak-free
This test is leak-free since it was added in af626ac0e0 (pack-bitmap:
enable reuse from all bitmapped packs, 2023-12-14).

Let's mark it as leak-free to make sure it stays that way (and to reduce
noise when looking for other leak-free scripts after we fix some leaks).

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:16:51 -08:00
de65079d7b reftable/pq_test: comment style fix
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:08:52 -08:00
9d2cdd8ae8 merge-ort.c: comment style fix
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:08:52 -08:00
777f783841 builtin/worktree: comment style fixes
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:08:52 -08:00
bbd6106967 t: mark tests regarding git-pack-refs(1) to be backend specific
Both t1409 and t3210 exercise parts of git-pack-refs(1). Given that we
must check the on-disk files to verify whether the backend has indeed
packed refs as expected those test suites are deeply tied to the actual
backend that is in use.

Mark the test suites to depend on the REFFILES backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 13:54:33 -08:00
7a746904a9 t5526: break test submodule differently
In 10f5c52656 (submodule: avoid auto-discovery in
prepare_submodule_repo_env(), 2016-09-01) we fixed a bug when doing a
recursive fetch with submodule in the case where the submodule is broken
due to whatever reason. The test to exercise that the fix works breaks
the submodule by deleting its `HEAD` reference, which will cause us to
not detect the directory as a Git repository.

While this is perfectly fine in theory, this way of breaking the repo
becomes problematic with the current efforts to introduce another refdb
backend into Git. The new reftable backend has a stub HEAD file that
always contains "ref: refs/heads/.invalid" so that tools continue to be
able to detect such a repository. But as the reftable backend will never
delete this file even when asked to delete `HEAD` the current way to
delete the `HEAD` reference will stop working.

Adapt the code to instead delete the objects database. Going back with
this new way to cause breakage confirms that it triggers the infinite
recursion just the same, and there are no equivalent ongoing efforts to
replace the object database with an alternate backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 13:54:33 -08:00
61e1c560bc t1419: mark test suite as files-backend specific
With 59c35fac54 (refs/packed-backend.c: implement jump lists to avoid
excluded pattern(s), 2023-07-10) we have implemented logic to handle
excluded refs more efficiently in the "packed" ref backend. This logic
allows us to skip emitting refs completely which we know to not be of
any interest to the caller, which can avoid quite some allocations and
object lookups.

This was wired up via a new `exclude_patterns` parameter passed to the
backend's ref iterator. The backend only needs to handle them on a best
effort basis though, and in fact we only handle it for the "packed-refs"
file, but not for loose references. Consequently, all callers must still
filter emitted refs with those exclude patterns.

The result is that handling exclude patterns is completely optional in
the ref backend, and any future backends may or may not implement it.
Let's thus mark the test for t1419 to depend on the REFFILES prereq.

An alternative would be to introduce a new prereq that tells us whether
the backend under test supports exclude patterns or not. But this does
feel a bit overblown:

  - It would either map to the REFFILES prereq, in which case it feels
    overengineered because the prereq is only ever relevant to t1419.

  - Otherwise, it could auto-detect whether the backend supports exclude
    patterns. But this could lead to silent failures in case the support
    for this feature breaks at any point in time.

It should thus be good enough to just use the REFFILES prereq for now.
If future backends ever grow support for exclude patterns we can easily
add their respective prereq as another condition for this test suite to
execute.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 13:54:33 -08:00
afb99327d0 t1302: make tests more robust with new extensions
In t1302 we exercise logic around "core.repositoryFormatVersion" and
extensions. These tests are not particularly robust against extensions
like the newly introduced "refStorage" extension as we tend to clobber
the repository's config file. We thus overwrite any extensions that were
set, which may render the repository inaccessible in case it has to be
accessed with a non-default ref storage.

Refactor the tests to be more robust:

  - Check the DEFAULT_REPO_FORMAT prereq to determine the expected
    repository format version. This helps to ensure that we only need to
    update the prereq in a central place when new extensions are added.
    Furthermore, this allows us to stop seeding the now-unneeded object
    ID cache that was only used to figure out the repository version.

  - Use a separate repository to rewrite ".git/config" to test
    combinations of the repository format version and extensions. This
    ensures that we don't break the main test repository. While we could
    rewrite these tests to not overwrite preexisting extensions, it
    feels cleaner like this so that we can test extensions standalone
    without interference from the environment.

  - Do not rewrite ".git/config" when exercising the "preciousObjects"
    extension.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 13:54:32 -08:00
1f83e752c5 t1301: mark test for core.sharedRepository as reffiles specific
In t1301 we verify whether reflog files written by the "files" ref
backend correctly honor permissions when "core.sharedRepository" is set.
The test logic is thus specific to the reffiles backend and will not
work with any other backends.

Mark the test accordingly with the REFFILES prereq.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 13:54:32 -08:00
91c5a5e000 t1300: make tests more robust with non-default ref backends
The t1300 test suite exercises the git-config(1) tool. To do so, the
test overwrites ".git/config" to contain custom contents in several
places with code like the following:

```
cat > .git/config <<\EOF
...
EOF
```

While this is easy enough to do, it may create problems when using a
non-default repository format because this causes us to overwrite the
repository format version as well as any potential extensions. With the
upcoming "reftable" ref backend the result is that Git would try to
access refs via the "files" backend even though the repository has been
initialized with the "reftable" backend, which will cause failures when
trying to access any refs.

Ideally, we would rewrite the whole test suite to not depend on state
written by previous tests, but that would result in a lot of changes in
this test suite. Instead, we only refactor tests which access the refdb
to be more robust by using their own separate repositories, which allows
us to be more careful and not discard required extensions.

Note that we also have to touch up how the CUSTOM_CONFIG_FILE gets
accessed. This environment variable contains the relative path to a
custom config file which we're setting up. But because we are now using
subrepositories, this relative path will not be found anymore because
our working directory changes. This issue is addressed by storing the
absolute path to the file in CUSTOM_CONFIG_FILE instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 13:54:32 -08:00
f061959efd credential/wincred: store oauth_refresh_token
a5c7656 (credential: new attribute oauth_refresh_token) introduced
a new confidential credential attribute and added support to
credential-cache. Later 0ce02e2f (credential/libsecret: store new
attributes, 2023-06-16) added support in credential-libsecret.

To add support in credential-wincred, we encode the new attribute in the
CredentialBlob, separated by newline:

    hunter2
    oauth_refresh_token=xyzzy

This is extensible and backwards compatible. The credential protocol
already assumes that attribute values do not contain newlines.

This fixes test "helper (wincred) gets oauth_refresh_token" when
t0303-credential-external.sh is run with
GIT_TEST_CREDENTIAL_HELPER=wincred. This test was added in a5c76569e7
(credential: new attribute oauth_refresh_token, 2023-04-21).

Alternatives considered: store oauth_refresh_token in a wincred
attribute. This would be insecure because wincred assumes attribute
values to be non-confidential.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 12:11:24 -08:00
85a9a63c92 diff: handle NULL meta-info when spawning external diff
Running this:

  $ touch foo bar
  $ chmod +x foo
  $ git -c diff.external=echo diff --ext-diff --no-index foo bar

results in a segfault. The issue is that run_diff_cmd() passes a NULL
"xfrm_msg" variable to run_external_diff(), which feeds it to
strvec_push(), causing the segfault. The bug dates back to 82fbf269b9
(run_external_diff: use an argv_array for the command line, 2014-04-19),
though it mostly only ever worked accidentally.  Before then, we just
stuck the NULL pointer into a "const char **" array, so our NULL ended
up acting as an extra end-of-argv sentinel (which was OK, because it was
the last thing in the array).

Curiously, though, this is only a problem with --no-index. We set up
xfrm_msg by calling fill_metainfo(). This result may be empty, or may
have text like "index 1234..5678\n", "rename from foo\nrename from
bar\n", etc. In run_external_diff(), we only look at xfrm_msg if the
"other" variable is not NULL. That variable is set when the paths of the
two sides of the diff pair aren't the same (in which case the
destination path becomes "other"). So normally it would kick in only for
a rename, in which case xfrm_msg should not be NULL (it would have the
rename information in it).

But with a "--no-index" of two blobs, we of course have two different
pathnames, and thus end up with a non-NULL "other" filename (which is
always just a repeat of the file2-name), but possibly a NULL xfrm_msg.

So how to fix it? I have a feeling that --no-index always passing
"other" to the external diff command is probably a bug. There was no
rename, and the name is always redundant with existing information we
pass (and this may even cause us to pass a useless "xfrm_msg" that
contains an "index 1234..5678" line). So one option would be to change
that behavior. We don't seem to have ever documented the "other" or
"xfrm_msg" parameters for external diffs.

But I'm not sure what fallout we might have from changing that behavior
now. So this patch takes the less-risky option, and simply teaches
run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes
it agnostic to whether "other" and "xfrm_msg" always come as a pair. It
fixes the segfault now, and if we want to change the --no-index "other"
behavior on top, it will handle that, too.

Reported-by: Wilfred Hughes <me@wilfred.me.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 10:37:44 -08:00
1cb3b92fc6 config: add back code comment
c15129b699 (config: factor out global config file retrieval, 2024-01-18)
was a refactor that moved some of the code in this function to
`config.c`. However, in the process I managed to drop this code comment
which explains `$HOME not set`.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 10:27:53 -08:00
36c9c44fa4 pack-bitmap: drop unused reuse_objects
This variable is no longer used for doing verbatim pack-reuse (or
anywhere within pack-bitmap.c) since d2ea031046 (pack-bitmap: don't rely
on bitmap_git->reuse_objects, 2019-12-18).

Remove it to avoid an unused struct member.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 09:26:17 -08:00
9023198280 git-p4: use raw string literals for regular expressions
Fixes several Python diagnostics about invalid escape sequences. The
diagnostics appear for me in Python 3.12, and may appear in earlier
versions. The fix is to use raw string literals so that backslashes are
not interpreted as introducing escape sequences. Raw string literals
are already in use in this file, so adding more does not impact
toolchain compatibility.

Signed-off-by: James Touton <bekenn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 09:25:16 -08:00
5f43cf5b2e merge-tree: accept 3 trees as arguments
When specifying a merge base explicitly, there is actually no good
reason why the inputs need to be commits: that's only needed if the
merge base has to be deduced from the commit graph.

This commit is best viewed with `--color-moved
--color-moved-ws=allow-indentation-change`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 09:20:49 -08:00
b50a608ba2 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-26 08:54:47 -08:00
76bd1294d8 Merge branch 'vd/fsck-submodule-url-test'
Tighten URL checks fsck makes in a URL recorded for submodules.

* vd/fsck-submodule-url-test:
  submodule-config.c: strengthen URL fsck check
  t7450: test submodule urls
  test-submodule: remove command line handling for check-name
  submodule-config.h: move check_submodule_url
2024-01-26 08:54:47 -08:00
12ee4ed506 Merge branch 'kh/maintenance-use-xdg-when-it-should'
When $HOME/.gitignore is missing but XDG config file available, we
should write into the latter, not former.  "git gc" and "git
maintenance" wrote into a wrong "global config" file, which have
been corrected.

* kh/maintenance-use-xdg-when-it-should:
  maintenance: use XDG config if it exists
  config: factor out global config file retrieval
  config: rename global config function
  config: format newlines
2024-01-26 08:54:47 -08:00
2be9ccf23e Merge branch 'gt/test-commit-o-i-options'
A few tests to "git commit -o <pathspec>" and "git commit -i
<pathspec>" has been added.

* gt/test-commit-o-i-options:
  t7501: add tests for --amend --signoff
  t7501: add tests for --include and --only
2024-01-26 08:54:47 -08:00
93bc02f8f9 Merge branch 'ps/gitlab-ci-macos'
CI for GitLab learned to drive macOS jobs.

* ps/gitlab-ci-macos:
  ci: add macOS jobs to GitLab CI
  ci: make p4 setup on macOS more robust
  ci: handle TEST_OUTPUT_DIRECTORY when printing test failures
  Makefile: detect new Homebrew location for ARM-based Macs
  t7527: decrease likelihood of racing with fsmonitor daemon
2024-01-26 08:54:47 -08:00
c7c0811fd0 Merge branch 'ps/completion-with-reftable-fix'
Completion update to prepare for reftable

* ps/completion-with-reftable-fix:
  completion: treat dangling symrefs as existing pseudorefs
  completion: silence pseudoref existence check
  completion: improve existence check for pseudo-refs
  t9902: verify that completion does not print anything
  completion: discover repo path in `__git_pseudoref_exists ()`
2024-01-26 08:54:46 -08:00
bb98703f60 Merge branch 'jt/tests-with-reftable'
Tweak a few tests not to manually modify the reference database
(hence easier to work with other backends like reftable).

* jt/tests-with-reftable:
  t5541: remove lockfile creation
  t1401: remove lockfile creation
2024-01-26 08:54:46 -08:00
bc554e6c3f Merge branch 'la/strvec-comment-fix'
Comment fix.

* la/strvec-comment-fix:
  strvec: use correct member name in comments
2024-01-26 08:54:46 -08:00
b700f119d1 Merge branch 'mj/gitweb-unreadable-config-error'
When given an existing but unreadable file as a configuration file,
gitweb behaved as if the file did not exist at all, but now it
errors out.  This is a change that may break backward compatibility.

* mj/gitweb-unreadable-config-error:
  gitweb: die when a configuration file cannot be read
2024-01-26 08:54:46 -08:00
dc8ce995a2 Merge branch 'ps/worktree-refdb-initialization'
Instead of manually creating refs/ hierarchy on disk upon a
creation of a secondary worktree, which is only usable via the
files backend, use the refs API to populate it.

* ps/worktree-refdb-initialization:
  builtin/worktree: create refdb via ref backend
  worktree: expose interface to look up worktree by name
  builtin/worktree: move setup of commondir file earlier
  refs/files: skip creation of "refs/{heads,tags}" for worktrees
  setup: move creation of "refs/" into the files backend
  refs: prepare `refs_init_db()` for initializing worktree refs
2024-01-26 08:54:46 -08:00
f95bafbaed Merge branch 'ps/commit-graph-write-leakfix'
Leakfix.

* ps/commit-graph-write-leakfix:
  commit-graph: fix memory leak when not writing graph
2024-01-26 08:54:45 -08:00
b982aa9a9f Merge branch 'al/unit-test-ctype'
Move test-ctype helper to the unit-test framework.

* al/unit-test-ctype:
  unit-tests: rewrite t/helper/test-ctype.c as a unit test
2024-01-26 08:54:45 -08:00
f3657b3526 Merge branch 'ne/doc-filter-blob-limit-fix'
Docfix.

* ne/doc-filter-blob-limit-fix:
  rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
2024-01-26 08:54:45 -08:00
bed1524e04 Merge branch 'rj/advice-delete-branch-not-fully-merged'
The error message given when "git branch -d branch" fails due to
commits unique to the branch has been split into an error and a new
conditional advice message.

* rj/advice-delete-branch-not-fully-merged:
  branch: make the advice to force-deleting a conditional one
  advice: fix an unexpected leading space
  advice: sort the advice related lists
2024-01-26 08:54:45 -08:00
951eafe36f Merge branch 'es/some-up-to-date-messages-must-stay'
Comment updates to help developers not to attempt to modify
messages from plumbing commands that must stay constant.

It might make sense to reassess the plumbing needs every few years,
but that should be done as a separate effort.

* es/some-up-to-date-messages-must-stay:
  messages: mark some strings with "up-to-date" not to touch
2024-01-26 08:54:45 -08:00
b3a79dd4e9 reftable/stack: adjust permissions of compacted tables
When creating a new compacted table from a range of preexisting ones we
don't set the default permissions on the resulting table when specified
by the user. This has the effect that the "core.sharedRepository" config
will not be honored correctly.

Fix this bug and add a test to catch this issue. Note that we only test
on non-Windows platforms because Windows does not use POSIX permissions
natively.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-26 08:21:44 -08:00
98ba49ccc2 subtree: fix split processing with multiple subtrees present
When there are multiple subtrees present in a repository and they are
all using 'git subtree split', the 'split' command can take a
significant (and constantly growing) amount of time to run even when
using the '--rejoin' flag. This is due to the fact that when processing
commits to determine the last known split to start from when looking
for changes, if there has been a split/merge done from another subtree
there will be 2 split commits, one mainline and one subtree, for the
second subtree that are part of the processing. The non-mainline
subtree split commit will cause the processing to always need to search
the entire history of the given subtree as part of its processing even
though those commits are totally irrelevant to the current subtree
split being run.

To see this in practice you can use the open source GitHub repo
'apollo-ios-dev' and do the following in order:

-Make a changes to a file in 'apollo-ios' and 'apollo-ios-codegen'
 directories
-Create a commit containing these changes
-Do a split on apollo-ios-codegen
   - Do a fetch on the subtree repo
      - git fetch git@github.com:apollographql/apollo-ios-codegen.git
   - git subtree split --prefix=apollo-ios-codegen --squash --rejoin
   - Depending on the current state of the 'apollo-ios-dev' repo
     you may see the issue at this point if the last split was on
     apollo-ios
-Do a split on apollo-ios
   - Do a fetch on the subtree repo
      - git fetch git@github.com:apollographql/apollo-ios.git
   - git subtree split --prefix=apollo-ios --squash --rejoin
-Make changes to a file in apollo-ios-codegen
-Create a commit containing the change(s)
-Do a split on apollo-ios-codegen
   - git subtree split --prefix=apollo-ios-codegen --squash --rejoin
-To see that the patch fixes the issue you can use the custom subtree
 script in the repo so following the same steps as above, except
 instead of using 'git subtree ...' for the commands use
 'git-subtree.sh ...' for the commands

You will see that the final split is looking for the last split
on apollo-ios-codegen to use as it's starting point to process
commits. Since there is a split commit from apollo-ios in between the
2 splits run on apollo-ios-codegen, the processing ends up traversing
the entire history of apollo-ios which increases the time it takes to
do a split based on how long of a history apollo-ios has, while none
of these commits are relevant to the split being done on
apollo-ios-codegen.

So this commit makes a change to the processing of commits for the
split command in order to ignore non-mainline commits from other
subtrees such as apollo-ios in the above breakdown by adding a new
function 'should_ignore_subtree_commit' which is called during
'process_split_commit'. This allows the split/rejoin processing to
still function as expected but removes all of the unnecessary
processing that takes place currently which greatly inflates the
processing time. In the above example, previously the final split
would take ~10-12 minutes, while after this fix it takes seconds.

Added a test to validate that the proposed fix
solves the issue.

The test accomplishes this by checking the output
of the split command to ensure the output from
the progress of 'process_split_commit' function
that represents the 'extracount' of commits
processed remains at 0, meaning none of the commits
from the second subtree were processed.

This was tested against the original functionality
to show the test failed, and then with this fix
to show the test passes.

This illustrated that when using multiple subtrees,
A and B, when doing a split on subtree B, the
processing does not traverse the entire history
of subtree A which is unnecessary and would cause
the 'extracount' of processed commits to climb
based on the number of commits in the history of
subtree A.

Signed-off-by: Zach FettersMoore <zach.fetters@apollographql.com>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-25 10:56:34 -08:00
0009542cab ls-files: avoid the verb "deprecate" for individual options
When e750951e (ls-files: guide folks to --exclude-standard over
other --exclude* options, 2023-01-13) updated the documentation to
give greater visibility to the `--exclude-standard` option, it marked
the `--exclude-per-directory` option as "deprecated".

While it is technically correct that being deprecated does not
necessarily mean it is planned to be removed later, it seems to
cause confusion to readers, especially when we merely mean

    The option Y can be used to achieve the same thing as the option
    X much simpler. To those of you who aren't familiar with either
    X or Y, we would recommend to use Y when appropriate.

This is especially true for `--exclude-standard` vs the combination
of more granular `--exclude-from` and `--exclude-per-directory`
options.  It is true that one common combination of the granular
options can be obtained by just giving the former, but that does not
necessarily mean a more granular control is not necessary.

State the reason why we recommend readers `--exclude-standard` in
the description of `--exclude-per-directory`, instead of saying that
the option is deprecated.  Also, spell out the recipe to emulate
what `--exclude-standard` does, so that the users can give it minute
tweaks (like "do the same as Git Porcelain, except I do not want to
read the global exclusion file from core.excludes").

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-25 10:55:53 -08:00
81effe9468 merge-ll: expose revision names to custom drivers
Custom merge drivers need access to the names of the revisions they
are working on, so that the merge conflict markers they introduce
can refer to those revisions. The placeholders '%S', '%X' and '%Y'
are introduced to this end.

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-24 13:15:06 -08:00
fba732c462 transport-helper: re-examine object dir after fetching
This patch fixes a bug where fetch over http (or any helper) using the
v0 protocol may sometimes fail to auto-follow tags. The bug comes from
61c7711cfe (sha1-file: use loose object cache for quick existence check,
2018-11-12). But to explain why (and why this is the right fix), let's
take a step back.

After fetching a pack, the object database has changed, but we may still
hold in-memory caches that are now out of date. Traditionally this was
just the packed_git list, but 61c7711cfe started using a loose-object
cache, as well.

Usually these caches are invalidated automatically. When an expected
object cannot be found, the low-level object lookup routines call
reprepare_packed_git(), which re-scans the set of packs (and thanks to
some preparatory patches ahead of 61c7711cfe, throws away the loose
object cache). But not all calls do this! In some cases we expect that
the object might not exist, and pass OBJECT_INFO_QUICK to tell the
low-level routines not to bother re-scanning. And the tag auto-following
code is one such caller, since we are asking about oids that the other
side has (but we might not have locally).

To deal with this, we explicitly call reprepare_packed_git() ourselves
after fetching a pack; this goes all the way back to 48ec3e5c07
(Incorporate fetched packs in future object traversal, 2008-06-15). But
that only helps if we call fetch_pack() in the main fetch process. When
we're using a transport helper, it happens in a separate sub-process,
and the parent process is left with old values. So this is only a
problem with protocols which require a separate helper process (like
http).

This patch fixes it by teaching the parent process in the transport
helper relationship to make that same reprepare call after the helper
finishes fetching.

You might be left with some lingering questions, like:

  1. Why only the v0 protocol, and not v2? It's because in v2 the child
     helper doesn't actually run fetch_pack(); it merely establishes a
     tunnel over which the main process can talk to the remote side (so
     the fetch_pack() and reprepare happen in the main process).

  2. Wouldn't we have the same bug even before the 61c7711cfe added
     the loose object cache? For example, when we store the fetch as a
     pack locally, wouldn't our packed_git list still be out of date?

     If we store a pack, everything works because other parts of the
     fetch process happen to trigger a call to reprepare_packed_git().
     In particular, before storing whatever ref was originally
     requested, we'll make sure we have the pointed-to object, and that
     call happens without the QUICK flag. So in that case we'll see that
     we don't know about it, reprepare, and then repeat our lookup. And
     now we _do_ know about the pack, and further calls with QUICK will
     find its contents.

     Whereas when we unpack the result into loose objects, we never get
     that same invalidation trigger. We didn't have packs before, and we
     don't after. But when we do the loose object lookup, we find the
     object. There's no way to realize that we didn't have the object
     before the pack, and that having it now means things have changed
     (in theory we could do a superfluous cache lookup to see that it
     was missing from the old cache; but depending on the tags the other
     side showed us, we might not even have filled in that part of the
     cache earlier).

  3. Why does the included test use "--depth 1"? This is important
     because without it, we happen to invalidate the cache as a side
     effect of other parts of the fetch process. What happens in a
     non-shallow fetch is something like this:

        1. we call find_non_local_tags() once before actually getting the
           pack, to see if there are any tags we can fill in from what we
           already have. This fills in the cache (which is obviously
           missing objects we're about to fetch).

        2. before fetching the actual pack, fetch_and_consume_refs()
           calls check_exist_and_connected(), to see if we even need to
           fetch a pack at all. This doesn't use QUICK (though arguably
           it could, as it's purely an optimization). And since it sees
           there are objects we are indeed missing, that triggers a
           reprepare_packed_git() call, which throws out the loose object
           cache.

        3. after fetching, now we call find_non_local_tags() again. And
           since step (2) invalidated our loose object cache, we find
           the new objects and create the tags.

     So everything works, but mostly due to luck. Whereas in a fetch
     with --depth, we skip step 2 entirely, and thus the out-of-date
     cache is still in place for step 3, giving us the wrong answer.

So the test works with a small "--depth 1" fetch, which makes sure that
we don't store the pack from the other side, and that we don't trigger
the accidental cache invalidation. And of course it forces the use of
v0 along with using the http protocol.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-24 11:22:25 -08:00
1df18a1c9a reftable: honor core.fsync
While the reffiles backend honors configured fsync settings, the
reftable backend does not. Address this by fsyncing reftable files using
the write-or-die api's fsync_component() in two places: when we
add additional entries into the table, and when we close the reftable
writer.

This commits adds a flush function pointer as a new member of
reftable_writer because we are not sure that the first argument to the
*write function pointer always contains a file descriptor. In the case of
strbuf_add_void, the first argument is a buffer. This way, we can pass
in a corresponding flush function that knows how to flush depending on
which writer is being used.

This patch does not contain tests as they will need to wait for another
patch to start to exercise the reftable backend. At that point, the
tests will be added to observe that fsyncs are happening when the
reftable is in use.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-23 13:45:27 -08:00
976d0251ce CoC: whitespace fix
Fix two lines with trailing whitespaces.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-23 10:40:10 -08:00
fa1033accc t5312: move reffiles specific tests to t0601
Move a few tests into t0601 since they specifically test the packed-refs
file and thus are specific to the reffiles backend.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:57:26 -08:00
1030d1407f t4202: move reffiles specific tests to t0600
Move two tests into t0600 since they write loose reflog refs manually
and thus are specific to the reffiles backend.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:57:26 -08:00
99a294bcdb t3903: make drop stash test ref backend agnostic
In this test, the calls to cut(1) are only used to verify that the
contents of the reflog entry look as expected. By replacing these with
git-reflog(1) calls, we can make this test ref-backend agnostic.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:57:25 -08:00
dfc9486cb7 t1503: move reffiles specific tests to t0600
Move this test to t0600 with other reffiles specific tests since it
checks for loose refs and is specific to the reffiles backend.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:57:25 -08:00
f0de108417 t1415: move reffiles specific tests to t0601
Move this test into t0601 with other reffiles pack-refs specific tests
since it checks for individual loose refs and thus is specific to the
reffiles backend.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:57:13 -08:00
c02ce75823 t1410: move reffiles specific tests to t0600
Move these tests to t0600 with other reffiles specific tests since they
do things like take a lock on an individual ref, and write directly into
the reflog refs.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:56:59 -08:00
e74d9f5716 t1406: move reffiles specific tests to t0600
Move this test to t0600 with the rest of the tests that are specific to
reffiles. This test reaches into reflog directories manually, and so are
specific to reffiles.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:56:59 -08:00
0453030709 t1405: move reffiles specific tests to t0601
Move this test to t0601 with other reffiles specific pack-refs tests
since it is reffiles specific in that it looks into the loose refs
directory for an assertion.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:56:59 -08:00
102d7154a0 t1404: move reffiles specific tests to t0600
These tests modify loose refs manually and are specific to the reffiles
backend. Move these to t0600 to be part of a test suite of reffiles
specific tests.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:56:57 -08:00
9901af48ea t1414: convert test to use Git commands instead of writing refs manually
This test can be re-written to use Git commands rather than writing a
manual ref in the reflog. This way this test no longer needs the
REFFILES prerequisite.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:55:49 -08:00
4e8df1a3c0 remove REFFILES prerequisite for some tests in t1405 and t2017
These tests are compatible with the reftable backend and thus do not
need the REFFILES prerequisite. Even though 53af25e4
(t1405: mark test that checks existence as REFFILES, 2022-01-31) and
53af25e4 (t1405: mark test that checks existence as REFFILES,
2022-01-31) marked these tests to require REFFILES, the reftable backend
in its current state does indeed work with these tests.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:55:49 -08:00
28d4e9f00a t3210: move to t0601
Move t3210 to t0601, since these tests are reffiles specific in that
they modify loose refs manually. This is part of the effort to
categorize these tests together based on the ref backend they test. When
we upstream the reftable backend, we can add more tests to t06xx. This
way, all tests that test specific ref backend behavior will be grouped
together.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:55:45 -08:00
0188b2c8e0 ci(github): also skip logs of broken test cases
When a test fails in the GitHub Actions CI pipeline, we mark it up using
special GitHub syntax so it stands out when looking at the run log. We
also mark up "fixed" test cases, and skip passing tests since we want to
concentrate on the failures.

The finalize_test_case_output function in
test-lib-github-workflow-markup.sh which performs this markup is however
missing a fourth case: "broken" tests, i.e. tests using
'test_expect_failure' to document a known bug. This leads to these
"broken" tests appearing along with any failed tests, potentially
confusing the reader who might not be aware that "broken" is the status
for 'test_expect_failure' tests that indeed failed, and wondering what
their commits "broke".

Also skip these "broken" tests so that only failures and fixed tests
stand out.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 15:16:11 -08:00
808b77e5d4 tests: move t0009-prio-queue.sh to the new unit testing framework
t/t0009-prio-queue.sh along with t/helper/test-prio-queue.c unit
tests Git's implementation of a priority queue. Migrate the
test over to the new unit testing framework to simplify debugging
and reduce test run-time. Refactor the required logic and add
a new test case in addition to porting over the original ones in
shell.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 10:55:01 -08:00
544ea7f375 completion: complete missing 'git log' options
Some options specific to 'git log' are missing from the Bash completion
script. Add them to _git_log.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 08:09:53 -08:00
6d1bfcdd2a completion: complete --encoding
The option --encoding is supported by 'git log' and 'git show', so add
it to __git_log_show_options.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 08:09:53 -08:00
2e419b0578 completion: complete --patch-with-raw
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 08:09:52 -08:00
706b3e7a09 completion: complete missing rev-list options
Some options listed in rev-list-options.txt, and thus accepted by 'git
log' and friends, are missing from the Bash completion script.

Add them to __git_log_common_options.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 08:09:52 -08:00
176cd68634 transport-helper: call do_take_over() in process_connect
The existing pattern among all callers of process_connect() seems to be

        if (process_connect(...)) {
                do_take_over();
                ... dispatch to the underlying method ...
        }
        ... otherwise implement the fallback ...

where the return value from process_connect() is the return value of the
call it makes to process_connect_service().

Move the call of do_take_over() inside process_connect(), so that
calling the process_connect() function is more concise and will not
miss do_take_over().

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:54:38 -08:00
35d26e79f8 transport-helper: call do_take_over() in connect_helper
After successfully connecting to the smart transport by calling
process_connect_service() in connect_helper(), run do_take_over() to
replace the old vtable with a new one which has methods ready for the
smart transport connection. This fixes the exit code of git-archive
in test case "archive remote http repository" of t5003.

The connect_helper() function is used as the connect method of the
vtable in "transport-helper.c", and it is called by transport_connect()
in "transport.c" to setup a connection. The only place that we call
transport_connect() so far is in "builtin/archive.c". Without running
do_take_over(), it may fail to call transport_disconnect() in
run_remote_archiver() of "builtin/archive.c". This is because for a
stateless connection and a service like "git-upload-archive", the
remote helper may receive a SIGPIPE signal and exit early. Call
do_take_over() to have a graceful disconnect method, so that we still
call transport_disconnect() even if the remote helper exits early.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:54:37 -08:00
24f275ab33 http-backend: new rpc-service for git-upload-archive
Add new rpc-service "upload-archive" in http-backend to add server side
support for remote archive over HTTP/HTTPS protocols.

Also add new test cases in t5003. In the test case "archive remote http
repository", git-archive exits with a non-0 exit code even though we
create the archive correctly. It will be fixed in a later commit.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:54:37 -08:00
5c85836896 transport-helper: protocol v2 supports upload-archive
We used to support only git-upload-pack service for protocol v2. In
order to support remote archive over HTTP/HTTPS protocols, add new
service support for git-upload-archive in protocol v2.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:54:37 -08:00
23b7d59a82 remote-curl: supports git-upload-archive service
Add new service (git-upload-archive) support in remote-curl, so we can
support remote archive over HTTP/HTTPS protocols. Differences between
git-upload-archive and other services:

 1. The git-archive program does not expect to see protocol version and
    capabilities when connecting to remote-helper, so do not send them
    in remote-curl for the git-upload-archive service.

 2. We need to detect protocol version by calling discover_refs().
    Fallback to use the git-upload-pack service (which, like
    git-upload-archive, is a read-only operation) to discover protocol
    version.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:54:37 -08:00
4a61faf75d transport-helper: no connection restriction in connect_helper
When commit b236752a (Support remote archive from all smart transports,
2009-12-09) added "remote archive" support for "smart transports", it
was for transport that supports the ".connect" method. The
"connect_helper()" function protected itself from getting called for a
transport without the method before calling process_connect_service(),
which only worked with the ".connect" method.

Later, commit edc9caf7 (transport-helper: introduce stateless-connect,
2018-03-15) added a way for a transport without the ".connect" method
to establish a "stateless" connection in protocol v2, where
process_connect_service() was taught to handle the ".stateless_connect"
method, making the old protection too strict. But commit edc9caf7 forgot
to adjust this protection accordingly. Even at the time of commit
b236752a, this protection seemed redundant, since
process_connect_service() would return 0 if the connection could not be
established, and connect_helper() would still die() early.

Remove the restriction in connect_helper() and give the function
process_connect_service() the opportunity to establish a connection
using ".connect" or ".stateless_connect" for protocol v2. So we can
connect with a stateless-rpc and do something useful. E.g., in a later
commit, implements remote archive for a repository over HTTP protocol.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:54:37 -08:00
457f96252f parse-options: simplify positivation handling
We accept the positive version of options whose long name starts with
"no-" and are defined without the flag PARSE_OPT_NONEG.  E.g. git clone
has an explicitly defined --no-checkout option and also implicitly
accepts --checkout to override it.

parse_long_opt() handles that by restarting the option matching with the
positive version when it finds that only the current option definition
starts with "no-", but not the user-supplied argument.  This code is
located almost at the end of the matching logic.

Avoid the need for a restart by moving the code up.  We don't have to
check the positive arg against the negative long_name at all -- the
"no-" prefix of the latter makes a match impossible.  Skip it and toggle
OPT_UNSET right away to simplify the control flow.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:17:12 -08:00
45bb916248 setup: allow cwd=.git w/ bareRepository=explicit
The safe.bareRepository setting can be set to 'explicit' to disallow
implicit uses of bare repositories, preventing an attack [1] where an
artificial and malicious bare repository is embedded in another git
repository. Unfortunately, some tooling uses myrepo/.git/ as the cwd
when executing commands, and this is blocked when
safe.bareRepository=explicit. Blocking is unnecessary, as git already
prevents nested .git directories.

Teach git to not reject uses of git inside of the .git directory: check
if cwd is .git (or a subdirectory of it) and allow it even if
safe.bareRepository=explicit.

[1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 14:11:49 -08:00
425ae8a3df t2400: avoid losing exit status to pipes
The exit code of the preceding command in a pipe is disregarded. So
if that preceding command is a Git command that fails, the test would
not fail. Instead, by saving the output of that Git command to a file,
and removing the pipe, we make sure the test will fail if that Git
command fails.

Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 10:27:18 -08:00
af3d2c160f Docs: majordomo@vger.kernel.org has been decomissioned
Update the instruction for subscribing to the Git mailing list
we have on a few documentation pages.

Reported-by: Kyle Lippincott <spectral@google.com>
Helped-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 10:09:07 -08:00
5825268db1 parse-options: fully disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
baa4adc66a (parse-options: disable option abbreviation with
PARSE_OPT_KEEP_UNKNOWN, 2019-01-27) turned off support for abbreviated
options when the flag PARSE_OPT_KEEP_UNKNOWN is given, as any shortened
option could also be an abbreviation for one of the unknown options.

The code for handling abbreviated options is guarded by an if, but it
can also be reached via goto.  baa4adc66a only blocked the first way.
Add the condition to the other ones as well.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 09:55:43 -08:00
5ba95e0880 t0024: style fix
t0024 has multiple command invocations on a single line, which
goes against the style described in CodingGuidelines, thus fix
that.

Also, use the -C flag to give the destination when using $TAR,
therefore, not requiring a subshell.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 09:20:41 -08:00
d262bfa302 t0024: avoid losing exit status to pipes
Replace pipe with redirection operator '>' to store the output
to a temporary file after 'git archive' command since the pipe
will swallow the command's exit code and a crash won't
necessarily be noticed.

Also fix an unwanted space after redirection '>' to match the
style described in CodingGuidelines.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 09:20:39 -08:00
e02ecfcc53 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 15:04:46 -08:00
ed87d37eaa Merge branch 'ps/p4-use-ref-api'
"git p4" update to prepare for reftable

* ps/p4-use-ref-api:
  git-p4: stop reaching into the refdb
2024-01-19 15:04:46 -08:00
1b09562693 Merge branch 'cp/t4129-pipefix'
Test update.

* cp/t4129-pipefix:
  t4129: prevent loss of exit code due to the use of pipes
2024-01-19 15:04:46 -08:00
b5fb623542 Merge branch 'sk/mingw-owner-check-error-message-improvement'
In addition to (rather cryptic) Security Identifiers, show username
and domain in the error message when we barf on mismatch between
the Git directory and the current user on Windows.

* sk/mingw-owner-check-error-message-improvement:
  mingw: give more details about unsafe directory's ownership
2024-01-19 15:04:46 -08:00
22695a38a4 Merge branch 'bk/bisect-doc-fix'
Synopsis fix.

* bk/bisect-doc-fix:
  doc: refer to pathspec instead of path
  doc: use singular form of repeatable path arg
2024-01-19 15:04:46 -08:00
f033388b0f Merge branch 'tb/fetch-all-configuration'
"git fetch" learned to pay attention to "fetch.all" configuration
variable, which pretends as if "--all" was passed from the command
line when no remote parameter was given.

* tb/fetch-all-configuration:
  fetch: add new config option fetch.all
2024-01-19 15:04:45 -08:00
5d1ee0749b Merge branch 'rj/clarify-branch-doc-m'
Doc update.

* rj/clarify-branch-doc-m:
  branch: clarify <oldbranch> term
2024-01-19 15:04:45 -08:00
95a9cfbb83 Merge branch 'ps/gitlab-ci-static-analysis'
GitLab CI update.

* ps/gitlab-ci-static-analysis:
  ci: add job performing static analysis on GitLab CI
2024-01-19 15:04:45 -08:00
9ea8145387 Merge branch 'ps/prompt-parse-HEAD-futureproof'
Futureproof command line prompt support (in contrib/).

* ps/prompt-parse-HEAD-futureproof:
  git-prompt: stop manually parsing HEAD with unknown ref formats
2024-01-19 15:04:45 -08:00
c4a9cf1df3 ci: build and run minimal fuzzers in GitHub CI
To prevent bitrot, we would like to regularly exercise the fuzz tests in
order to make sure they still link & run properly. We already compile
the fuzz test objects as part of the default `make` target, but we do
not link the executables due to the fuzz tests needing specific
compilers and compiler features. This has lead to frequent build
breakages for the fuzz tests.

To remedy this, we can add a CI step to actually link the fuzz
executables, and run them (with finite input rather than the default
infinite random input mode) to verify that they execute properly.

Since the main use of the fuzz tests is via OSS-Fuzz [1], and OSS-Fuzz
only runs tests on Linux [2], we only set up a CI test for the fuzzers
on Linux.

[1] https://github.com/google/oss-fuzz
[2] https://google.github.io/oss-fuzz/further-reading/fuzzer-environment/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 14:29:25 -08:00
8b9a42bf48 fuzz: fix fuzz test build rules
When we originally added the fuzz tests in 5e47215080 (fuzz: add basic
fuzz testing target., 2018-10-12), we went to some trouble to create a
Makefile rule that allowed linking the fuzz executables without pulling
in common-main.o. This was necessary to prevent the
fuzzing-engine-provided main() from clashing with Git's main().

However, since 19d75948ef (common-main.c: move non-trace2 exit()
behavior out of trace2.c, 2022-06-02), it has been necessary to link
common-main.o due to moving the common_exit() function to that file.
Ævar suggested a set of compiler flags to allow this in [1], but this
was never reflected in the Makefile.

Since we now must include common-main.o, there's no reason to pick and
choose a subset of object files to link, so simplify the Makefile rule
for the fuzzer executables to just use libgit.a. While we're at it,
include the necessary linker flag to allow multiple definitions
directly in the Makefile rule, rather than requiring it to be passed on
the command-line each time. This means the Makefile rule as written is
now more compiler-specific, but this was already the case for the
fuzzers themselves anyway.

[1] https://lore.kernel.org/git/220607.8635ggupws.gmgdl@evledraar.gmail.com/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 14:29:18 -08:00
8df4c5d205 Documentation: add "special refs" to the glossary
Add the "special refs" term to our glossary.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:42 -08:00
2cd33f4428 refs: redefine special refs
Now that our list of special refs really only contains refs which have
actually-special semantics, let's redefine what makes a special ref.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
3f921c7591 refs: convert MERGE_AUTOSTASH to become a normal pseudo-ref
Similar to the preceding conversion of the AUTO_MERGE pseudo-ref, let's
convert the MERGE_AUTOSTASH ref to become a normal pseudo-ref as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
35122daebc sequencer: introduce functions to handle autostashes via refs
We're about to convert the MERGE_AUTOSTASH ref to become non-special,
using the refs API instead of direct filesystem access to both read and
write the ref. The current interfaces to write autostashes is entirely
path-based though, so we need to extend them to also support writes via
the refs API instead.

Ideally, we would be able to fully replace the old set of path-based
interfaces. But the sequencer will continue to write state into
"rebase-merge/autostash". This path is not considered to be a ref at all
and will thus stay is-is for now, which requires us to keep both path-
and refs-based interfaces to handle autostashes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
fd7c6ffa9e refs: convert AUTO_MERGE to become a normal pseudo-ref
In 70c70de616 (refs: complete list of special refs, 2023-12-14) we have
inrtoduced a new `is_special_ref()` function that classifies some refs
as being special. The rule is that special refs are exclusively read and
written via the filesystem directly, whereas normal refs exclucsively go
via the refs API.

The intent of that commit was to record the status quo so that we know
to route reads of such special refs consistently. Eventually, the list
should be reduced to its bare minimum of refs which really are special,
namely FETCH_HEAD and MERGE_HEAD.

Follow up on this promise and convert the AUTO_MERGE ref to become a
normal pseudo-ref by using the refs API to both read and write it
instead of accessing the filesystem directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
bb02e95f3b sequencer: delete REBASE_HEAD in correct repo when picking commits
When picking commits, we delete some state before executing the next
sequencer action on interactive rebases. But while we use the correct
repository to calculate paths to state files that need deletion, we use
the repo-less `delete_ref()` function to delete REBASE_HEAD. Thus, if
the sequencer ran in a different repository than `the_repository`, we
would end up deleting the ref in the wrong repository.

Fix this by using `refs_delete_ref()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
821f6632b0 sequencer: clean up pseudo refs with REF_NO_DEREF
When cleaning up the state-tracking pseudorefs CHERRY_PICK_HEAD or
REVERT_HEAD we do not set REF_NO_DEREF. In the unlikely case where those
refs are a symref we would thus end up deleting the symref targets, and
not the symrefs themselves.

Harden the code to use REF_NO_DEREF to fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:40 -08:00
8430b438f6 submodule-config.c: strengthen URL fsck check
Update the validation of "curl URL" submodule URLs (i.e. those that specify
an "http[s]" or "ftp[s]" protocol) in 'check_submodule_url()' to catch more
invalid URLs. The existing validation using 'credential_from_url_gently()'
parses certain URLs incorrectly, leading to invalid submodule URLs passing
'git fsck' checks. Conversely, 'url_normalize()' - used to validate remote
URLs in 'remote_get()' - correctly identifies the invalid URLs missed by
'credential_from_url_gently()'.

To catch more invalid cases, replace 'credential_from_url_gently()' with
'url_normalize()' followed by a 'url_decode()' and a check for newlines
(mirroring 'check_url_component()' in the 'credential_from_url_gently()'
validation).

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 10:15:41 -08:00
7e2fc39d8c t7450: test submodule urls
Add tests to 't7450-bad-git-dotfiles.sh' to check the validity of different
submodule URLs. To verify this directly (without setting up test
repositories & submodules), add a 'check-url' subcommand to 'test-tool
submodule' that calls 'check_submodule_url' in the same way that
'check-name' calls 'check_submodule_name'.

Add two tests to separately address cases where the URL check correctly
filters out invalid URLs and cases where the check misses invalid URLs. Mark
the latter ("url check misses invalid cases") with 'test_expect_failure' to
indicate that this is currently broken, which will be fixed in the next step.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 10:15:34 -08:00
1c5bc6971e diffcore-delta: avoid ignoring final 'line' of file
hash_chars() would hash lines to integers, and store them in a spanhash,
but cut lines at 64 characters.  Thus, whenever it reached 64 characters
or a newline, it would create a new spanhash.  The problem is, the final
part of the file might not end 64 characters after the previous 'line'
and might not end with a newline.  This could, for example, cause an
85-byte file with 12 lines and only the first character in the file
differing to appear merely 23% similar rather than the expected 97%.
Ensure the last line is included, and add a testcase that would have
caught this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 19:10:11 -08:00
74e12192e6 maintenance: use XDG config if it exists
`git maintenance register` registers the repository in the user's global
config. `$XDG_CONFIG_HOME/git/config` is supposed to be used if
`~/.gitconfig` does not exist. However, this command creates a
`~/.gitconfig` file and writes to that one even though the XDG variant
exists.

This used to work correctly until 50a044f1e4 (gc: replace config
subprocesses with API calls, 2022-09-27), when the command started calling
the config API instead of git-config(1).

Also change `unregister` accordingly.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 12:17:42 -08:00
c15129b699 config: factor out global config file retrieval
Factor out code that retrieves the global config file so that we can use
it in `gc.c` as well.

Use the old name from the previous commit since this function acts
functionally the same as `git_system_config` but for “global”.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 12:17:41 -08:00
ecffa3ed51 config: rename global config function
Rename this function to a more descriptive name since we want to use the
existing name for a new function.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 12:17:41 -08:00
4ef97dc4cd config: format newlines
Remove unneeded newlines according to `clang-format`.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 12:17:41 -08:00
4f36b8597c reftable/stack: fix race in up-to-date check
In 6fdfaf15a0 (reftable/stack: use stat info to avoid re-reading stack
list, 2024-01-11) we have introduced a new mechanism to avoid re-reading
the table list in case stat(3P) figures out that the stack didn't change
since the last time we read it.

While this change significantly improved performance when writing many
refs, it can unfortunately lead to false negatives in very specific
scenarios. Given two processes A and B, there is a feasible sequence of
events that cause us to accidentally treat the table list as up-to-date
even though it changed:

  1. A reads the reftable stack and caches its stat info.

  2. B updates the stack, appending a new table to "tables.list". This
     will both use a new inode and result in a different file size, thus
     invalidating A's cache in theory.

  3. B decides to auto-compact the stack and merges two tables. The file
     size now matches what A has cached again. Furthermore, the
     filesystem may decide to recycle the inode number of the file we
     have replaced in (2) because it is not in use anymore.

  4. A reloads the reftable stack. Neither the inode number nor the
     file size changed. If the timestamps did not change either then we
     think the cached copy of our stack is up-to-date.

In fact, the commit introduced three related issues:

  - Non-POSIX compliant systems may not report proper `st_dev` and
    `st_ino` values in stat(3P), which made us rely solely on the
    file's potentially coarse-grained mtime and ctime.

  - `stat_validity_check()` and friends may end up not comparing
    `st_dev` and `st_ino` depending on the "core.checkstat" config,
    again reducing the signal to the mtime and ctime.

  - `st_ino` can be recycled, rendering the check moot even on
    POSIX-compliant systems.

Given that POSIX defines that "The st_ino and st_dev fields taken
together uniquely identify the file within the system", these issues led
to the most important signal to establish file identity to be ignored or
become useless in some cases.

Refactor the code to stop using `stat_validity_check()`. Instead, we
manually stat(3P) the file descriptors to make relevant information
available. On Windows and MSYS2 the result will have both `st_dev` and
`st_ino` set to 0, which allows us to address the first issue by not
using the stat-based cache in that case. It also allows us to make sure
that we always compare `st_dev` and `st_ino`, addressing the second
issue.

The third issue of inode recycling can be addressed by keeping the file
descriptor of "files.list" open during the lifetime of the reftable
stack. As the file will still exist on disk even though it has been
unlinked it is impossible for its inode to be recycled as long as the
file descriptor is still open.

This should address the race in a POSIX-compliant way. The only real
downside is that this mechanism cannot be used on non-POSIX-compliant
systems like Windows. But we at least have the second-level caching
mechanism in place that compares contents of "files.list" with the
currently loaded list of tables.

This new mechanism performs roughly the same as the previous one that
relied on `stat_validity_check()`:

  Benchmark 1: update-ref: create many refs (HEAD~)
    Time (mean ± σ):      4.754 s ±  0.026 s    [User: 2.204 s, System: 2.549 s]
    Range (min … max):    4.694 s …  4.802 s    20 runs

  Benchmark 2: update-ref: create many refs (HEAD)
    Time (mean ± σ):      4.721 s ±  0.020 s    [User: 2.194 s, System: 2.527 s]
    Range (min … max):    4.691 s …  4.753 s    20 runs

  Summary
    update-ref: create many refs (HEAD~) ran
      1.01 ± 0.01 times faster than update-ref: create many refs (HEAD)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 12:02:09 -08:00
456333eb4d reftable/stack: unconditionally reload stack after commit
After we have committed an addition to the reftable stack we call
`reftable_stack_reload()` to reload the stack and thus reflect the
changes that were just added. This function will only conditionally
reload the stack in case `stack_uptodate()` tells us that the stack
needs reloading. This check is wasteful though because we already know
that the stack needs reloading.

Call `reftable_stack_reload_maybe_reuse()` instead, which will
unconditionally reload the stack. This is merely a conceptual fix, the
code in question was not found to cause any problems in practice.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 12:02:09 -08:00
56090a35ab ci: add macOS jobs to GitLab CI
Add a job to GitLab CI which runs tests on macOS, which matches the
equivalent "osx-clang" job that we have for GitHub Workflows. One
significant difference though is that this new job runs on Apple M1
machines and thus uses the "arm64" architecture. As GCC does not yet
support this comparatively new architecture we cannot easily include an
equivalent for the "osx-gcc" job that exists in GitHub Workflows.

Note that one test marked as `test_must_fail` is surprisingly passing:

  t7815-grep-binary.sh                             (Wstat: 0 Tests: 22 Failed: 0)
    TODO passed:   12

This seems to boil down to an unexpected difference in how regcomp(3P)
works when matching NUL bytes. Cross-checking with the respective GitHub
job shows that this is not an issue unique to the GitLab CI job as it
passes in the same way there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 11:53:17 -08:00
c4b84b137a ci: make p4 setup on macOS more robust
When setting up Perforce on macOS we put both `p4` and `p4d` into
"$HOME/bin". On GitHub CI this directory is indeed contained in the PATH
environment variable and thus there is no need for additional setup than
to put the binaries there. But GitLab CI does not do this, and thus our
Perforce-based tests would be skipped there even though we download the
binaries.

Refactor the setup code to become more robust by downloading binaries
into a separate directory which we then manually append to our PATH.
This matches what we do on Linux-based jobs.

Note that it may seem like we already did append "$HOME/bin" to PATH
because we're actually removing the lines that adapt PATH. But we only
ever adapted the PATH variable in "ci/install-dependencies.sh", and
didn't adapt it when running "ci/run-build-and-test.sh". Consequently,
the required binaries wouldn't be found during the test run unless the
CI platform already had the "$HOME/bin" in PATH right from the start.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 11:53:17 -08:00
99c60edc5b ci: handle TEST_OUTPUT_DIRECTORY when printing test failures
The TEST_OUTPUT_DIRECTORY environment variable can be used to instruct
the test suite to write test data and test results into a different
location than into "t/". The "ci/print-test-failures.sh" script does not
know to handle this environment variable though, which means that it
will search for test results in the wrong location if it was set.

Update the script to handle TEST_OUTPUT_DIRECTORY so that we can start
to set it in our CI.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 11:53:17 -08:00
d52b426ad4 Makefile: detect new Homebrew location for ARM-based Macs
With the introduction of the ARM-based Macs the default location for
Homebrew has changed from "/usr/local" to "/opt/homebrew". We only
handle the former location though, which means that unless the user has
manually configured required search paths we won't be able to locate it.

Improve upon this by adding relevant paths to our CFLAGS and LDFLAGS as
well as detecting the location of msgfmt(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 11:53:17 -08:00
f591a9bfeb t7527: decrease likelihood of racing with fsmonitor daemon
In t7527, we test that the builtin fsmonitor daemon works well in
various edge cases. One of these tests is frequently failing because
events reported by the fsmonitor--daemon are missing an expected event.
This failure is essentially a race condition: we do not wait for the
daemon to flush out all events before we ask it to quit. Consequently,
it can happen that we miss some expected events.

In other testcases we counteract this race by sending a simple query to
the daemon. Quoting a comment:

  We run a simple query after modifying the filesystem just to introduce
  a bit of a delay so that the trace logging from the daemon has time to
  get flushed to disk.

Now this workaround is not a "proper" fix as we do not wait for all
events to have been synchronized in a deterministic way. But this fix
seems to be sufficient for all the other tests to pass, so it must not
be all that bad.

Convert the failing test to do the same. While the test was previously
failing in about 50% of the test runs, I couldn't reproduce the failure
after the change anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 11:53:17 -08:00
0aabeaa562 builtin/show-ref: treat directory as non-existing in --exists
9080a7f178 (builtin/show-ref: add new mode to check for reference
existence, 2023-10-31) added the option --exists to git-show-ref(1).

When you use this option against a ref that doesn't exist, but it is
a parent directory of an existing ref, you get the following error:

    $ git show-ref --exists refs/heads
    error: failed to look up reference: Is a directory

when the ref-files backend is in use.  To be more clear to user,
hide the error about having found a directory.  What matters to the
user is that the named ref does not exist.  Instead, print the same
error as when the ref was not found:

    error: reference does not exist

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 11:17:25 -08:00
6af2c4ad45 test-submodule: remove command line handling for check-name
The 'check-name' subcommand to 'test-tool submodule' is documented as being
able to take a command line argument '<name>'. However, this does not work -
and has never worked - because 'argc > 0' triggers the usage message in
'cmd__submodule_check_name()'. To simplify the helper and avoid future
confusion around proper use of the subcommand, remove any references to
command line arguments for 'check-name' in usage strings and handling in
'check_name()'.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 10:17:10 -08:00
13320ff610 submodule-config.h: move check_submodule_url
Move 'check_submodule_url' out of 'fsck.c' and into 'submodule-config.h' as
a public method, similar to 'check_submodule_name'. With the function now
accessible outside of 'fsck', it can be used in a later commit to extend
'test-tool submodule' to check the validity of submodule URLs as it does
with names in the 'check-name' subcommand.

Other than its location, no changes are made to 'check_submodule_url' in
this patch.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 10:12:48 -08:00
f10031fadd rebase: fix documentation about used shell in -x
The shell used when using the -x option is erroneously documented to be
the one pointed to by the $SHELL environmental variable. This was true
when rebase was implemented as a shell script but this is no longer
true.

Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-17 16:14:57 -08:00
cab11f4e41 t7501: add tests for --amend --signoff
Add tests for amending the commit to add Signed-off-by trailer. And
also to check if it does not add another trailer if one already exists.

Currently, there are tests for --signoff separately in t7501, however,
they are not tested with --amend.

Therefore, these tests belong with other similar tests of --amend in
t7501-commit-basic-functionality.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-17 13:31:16 -08:00
4e4f576b06 t7501: add tests for --include and --only
Add tests for --only (-o) and --include (-i). This include testing
with or without staged changes for both -i and -o. Also to test
for committing untracked files with -i, -o and without -i/-o.

Some tests already exist in t7501 for testing --only, however,
it is only tested in combination with --amend and --allow-empty
and on to-be-born branch. The addition of these tests check, when
the pathspec is provided without using -only, that only the files
matching the pathspec get committed. This behavior is same when
we provide --only and it is checked by the tests.
(as --only is the default mode of operation when pathspec is
provided.)

As for --include, there is no prior test for checking if --include
also commits staged changes, thus add test for that. Along with
the tests also document a potential bug, in which, when provided
with -i and a pathspec that does not match any tracked path,
commit does not fail if there are staged changes. And when there
are no staged changes commit fails. However, no error is returned
to stderr in either of the cases. This is described in the TODO
comment before the relevent testcase.

And also add a test for checking incompatibilty when using -o and
-i together.

Thus, these tests belong in t7501 with other similar existing tests,
as described in the case of --only.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-17 13:31:15 -08:00
7310fa4a75 Merge branch 'ps/gitlab-ci-static-analysis' into ps/gitlab-ci-macos
* ps/gitlab-ci-static-analysis:
  ci: add job performing static analysis on GitLab CI
2024-01-16 13:13:15 -08:00
d919965af1 advice: allow disabling the automatic hint in advise_if_enabled()
Using advise_if_enabled() to display an advice will automatically
include instructions on how to disable the advice, alongside the main
advice:

	hint: use --reapply-cherry-picks to include skipped commits
	hint: Disable this message with "git config advice.skippedCherryPicks false"

To do so, we provide a knob which can be used to disable the advice.

But also to tell us the opposite: to show the advice.

Let's not include the deactivation instructions for an advice if the
user explicitly sets its visibility.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 13:07:00 -08:00
8cf646faac Merge branch 'rj/advice-delete-branch-not-fully-merged' into rj/advice-disable-how-to-disable
* rj/advice-delete-branch-not-fully-merged:
  branch: make the advice to force-deleting a conditional one
  advice: fix an unexpected leading space
  advice: sort the advice related lists
2024-01-16 13:06:35 -08:00
186b115d30 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 10:11:58 -08:00
a57da6bfee Merge branch 'ib/rebase-reschedule-doc'
Doc update.

* ib/rebase-reschedule-doc:
  rebase: clarify --reschedule-failed-exec default
2024-01-16 10:11:58 -08:00
4cc0f8e8fa Merge branch 'jk/commit-graph-slab-clear-fix'
Clearing in-core repository (happens during e.g., "git fetch
--recurse-submodules" with commit graph enabled) made in-core
commit object in an inconsistent state by discarding the necessary
data from commit-graph too early, which has been corrected.

* jk/commit-graph-slab-clear-fix:
  commit-graph: retain commit slab when closing NULL commit_graph
2024-01-16 10:11:58 -08:00
b27f67aa93 Merge branch 'jk/index-pack-lsan-false-positive-fix'
Fix false positive reported by leak sanitizer.

* jk/index-pack-lsan-false-positive-fix:
  index-pack: spawn threads atomically
2024-01-16 10:11:58 -08:00
6484eb9a97 Merge branch 'cp/sideband-array-index-comment-fix'
In-code comment fix.

* cp/sideband-array-index-comment-fix:
  sideband.c: remove redundant 'NEEDSWORK' tag
2024-01-16 10:11:57 -08:00
32c6fc3e30 Merge branch 'ps/refstorage-extension'
Introduce a new extension "refstorage" so that we can mark a
repository that uses a non-default ref backend, like reftable.

* ps/refstorage-extension:
  t9500: write "extensions.refstorage" into config
  builtin/clone: introduce `--ref-format=` value flag
  builtin/init: introduce `--ref-format=` value flag
  builtin/rev-parse: introduce `--show-ref-format` flag
  t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
  setup: introduce GIT_DEFAULT_REF_FORMAT envvar
  setup: introduce "extensions.refStorage" extension
  setup: set repository's formats on init
  setup: start tracking ref storage format
  refs: refactor logic to look up storage backends
  worktree: skip reading HEAD when repairing worktrees
  t: introduce DEFAULT_REPO_FORMAT prereq
2024-01-16 10:11:57 -08:00
481d69dd63 Merge branch 'ps/reftable-fixes-and-optims'
More fixes and optimizations to the reftable backend.

* ps/reftable-fixes-and-optims:
  reftable/merged: transfer ownership of records when iterating
  reftable/merged: really reuse buffers to compute record keys
  reftable/record: store "val2" hashes as static arrays
  reftable/record: store "val1" hashes as static arrays
  reftable/record: constify some parts of the interface
  reftable/writer: fix index corruption when writing multiple indices
  reftable/stack: do not auto-compact twice in `reftable_stack_add()`
  reftable/stack: do not overwrite errors when compacting
2024-01-16 10:11:57 -08:00
020e0a087f completion: treat dangling symrefs as existing pseudorefs
The `__git_pseudoref_exists ()` helper function back to git-rev-parse(1)
in case the reftable backend is in use. This is not in the same spirit
as the simple existence check that the "files" backend does though,
because there we only check for the pseudo-ref to exist with `test -f`.
With git-rev-parse(1) we not only check for existence, but also verify
that the pseudo-ref resolves to an object, which may not be the case
when the pseudo-ref points to an unborn branch.

Fix this issue by using `git show-ref --exists` instead. Note that we do
not have to silence stdout anymore as git-show-ref(1) will not print
anything.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 09:18:21 -08:00
9a9c31135e completion: silence pseudoref existence check
In 44dbb3bf29 (completion: support pseudoref existence checks for
reftables, 2023-12-19), we have extended the Bash completion script to
support future ref backends better by using git-rev-parse(1) to check
for pseudo-ref existence. This conversion has introduced a bug, because
even though we pass `--quiet` to git-rev-parse(1) it would still output
the resolved object ID of the ref in question if it exists.

Fix this by redirecting its stdout to `/dev/null` and add a test that
catches this behaviour. Note that the test passes even without the fix
for the "files" backend because we parse pseudo refs via the filesystem
directly in that case. But the test will fail with the "reftable"
backend.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 09:18:21 -08:00
7b9cda2d3d completion: improve existence check for pseudo-refs
Improve the existence check along the following lines:

  - Stop stripping the "ref :" prefix and compare to the expected value
    directly. This allows us to drop a now-unused variable that was
    previously leaking into the user's shell.

  - Mark the "head" variable as local so that we don't leak its value
    into the user's shell.

  - Stop manually handling the `-C $__git_repo_path` option, which the
    `__git ()` wrapper aleady does for us.

  - In simlar spirit, stop redirecting stderr, which is also handled by
    the wrapper already.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 09:18:21 -08:00
6807d3942c t9902: verify that completion does not print anything
The Bash completion script must not print anything to either stdout or
stderr. Instead, it is only expected to populate certain variables.
Tighten our `test_completion ()` test helper to verify this requirement.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 09:18:20 -08:00
3bf5ccf429 completion: discover repo path in __git_pseudoref_exists ()
The helper function `__git_pseudoref_exists ()` expects that the repo
path has already been discovered by its callers, which makes for a
rather fragile calling convention. Refactor the function to discover the
repo path itself to make it more self-contained, which also removes the
need to discover the path in some of its callers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 09:18:20 -08:00
8f50984cf4 rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
'--filter=blob:limit=<n>' was introduced in 25ec7bcac0 (list-objects:
filter objects in traverse_commit_list, 2017-11-21) and later expanded
to bitmaps in 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
2020-02-14)

The logic that was introduced in these commits (and that still persists
to this day) omits blobs larger than _or equal_ to n bytes or units.

However, the documentation (Documentation/rev-list-options.txt) states:

>The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n
bytes or units. n may be zero.

Moreover, the t6113-rev-list-bitmap-filters.sh tests for exactly this
logic, so it seems it is the documentation that needs fixing, not the
code.

This changes the explanation to be similar to
Documentation/git-clone.txt, which is correct.

Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 08:53:13 -08:00
e875d4511c unit-tests: rewrite t/helper/test-ctype.c as a unit test
In the recent codebase update (8bf6fbd00d (Merge branch
'js/doc-unit-tests', 2023-12-09)), a new unit testing framework was
merged, providing a standardized approach for testing C code. Prior to
this update, some unit tests relied on the test helper mechanism,
lacking a dedicated unit testing framework. It's more natural to perform
these unit tests using the new unit test framework.

This commit migrates the unit tests for C character classification
functions (isdigit(), isspace(), etc) from the legacy approach
using the test-tool command `test-tool ctype` in t/helper/test-ctype.c
to the new unit testing framework (t/unit-tests/test-lib.h).

The migration involves refactoring the tests to utilize the testing
macros provided by the framework (TEST() and check_*()).

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Acked-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 07:37:47 -08:00
4efa9308ea commit-graph: fix memory leak when not writing graph
When `write_commit_graph()` bails out writing a split commit-graph early
then it may happen that we have already gathered the set of existing
commit-graph file names without yet determining the new merged set of
files. This can result in a memory leak though because we only clear the
preimage of files when we have collected the postimage.

Fix this issue by dropping the condition altogether so that we always
try to free both preimage and postimage filenames. As the context
structure is zero-initialized this simplification is safe to do.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-15 17:08:28 -08:00
d4dbce1db5 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-12 16:09:57 -08:00
b3049bbb97 Merge branch 'cp/git-flush-is-an-env-bool'
Unlike other environment variables that took the usual
true/false/yes/no as well as 0/1, GIT_FLUSH only understood 0/1,
which has been corrected.

* cp/git-flush-is-an-env-bool:
  write-or-die: make GIT_FLUSH a Boolean environment variable
2024-01-12 16:09:57 -08:00
566471105c Merge branch 'ms/rebase-insnformat-doc-fix'
Docfix.

* ms/rebase-insnformat-doc-fix:
  Documentation: fix statement about rebase.instructionFormat
2024-01-12 16:09:57 -08:00
15df15fe07 Merge branch 'jx/sideband-chomp-newline-fix'
Sideband demultiplexer fixes.

* jx/sideband-chomp-newline-fix:
  pkt-line: do not chomp newlines for sideband messages
  pkt-line: memorize sideband fragment in reader
  test-pkt-line: add option parser for unpack-sideband
2024-01-12 16:09:56 -08:00
0fea6b73f1 Merge branch 'tb/multi-pack-verbatim-reuse'
Streaming spans of packfile data used to be done only from a
single, primary, pack in a repository with multiple packfiles.  It
has been extended to allow reuse from other packfiles, too.

* tb/multi-pack-verbatim-reuse: (26 commits)
  t/perf: add performance tests for multi-pack reuse
  pack-bitmap: enable reuse from all bitmapped packs
  pack-objects: allow setting `pack.allowPackReuse` to "single"
  t/test-lib-functions.sh: implement `test_trace2_data` helper
  pack-objects: add tracing for various packfile metrics
  pack-bitmap: prepare to mark objects from multiple packs for reuse
  pack-revindex: implement `midx_pair_to_pack_pos()`
  pack-revindex: factor out `midx_key_to_pack_pos()` helper
  midx: implement `midx_preferred_pack()`
  git-compat-util.h: implement checked size_t to uint32_t conversion
  pack-objects: include number of packs reused in output
  pack-objects: prepare `write_reused_pack_verbatim()` for multi-pack reuse
  pack-objects: prepare `write_reused_pack()` for multi-pack reuse
  pack-objects: pass `bitmapped_pack`'s to pack-reuse functions
  pack-objects: keep track of `pack_start` for each reuse pack
  pack-objects: parameterize pack-reuse routines over a single pack
  pack-bitmap: return multiple packs via `reuse_partial_packfile_from_bitmap()`
  pack-bitmap: simplify `reuse_partial_packfile_from_bitmap()` signature
  ewah: implement `bitmap_is_empty()`
  pack-bitmap: pass `bitmapped_pack` struct to pack-reuse functions
  ...
2024-01-12 16:09:56 -08:00
0ebbaa07d0 Merge branch 'jk/t1006-cat-file-objectsize-disk'
Test update.

* jk/t1006-cat-file-objectsize-disk:
  t1006: prefer shell loop to awk for packed object sizes
  t1006: add tests for %(objectsize:disk)
2024-01-12 16:09:56 -08:00
3e8558438d Merge branch 'jw/builtin-objectmode-attr'
The builtin_objectmode attribute is populated for each path
without adding anything in .gitattributes files, which would be
useful in magic pathspec, e.g., ":(attr:builtin_objectmode=100755)"
to limit to executables.

* jw/builtin-objectmode-attr:
  attr: add builtin objectmode values support
2024-01-12 16:09:55 -08:00
99bb88a6f6 Merge branch 'js/contributor-docs-updates'
Doc update.

* js/contributor-docs-updates:
  SubmittingPatches: hyphenate non-ASCII
  SubmittingPatches: clarify GitHub artifact format
  SubmittingPatches: clarify GitHub visual
  SubmittingPatches: provide tag naming advice
  SubmittingPatches: update extra tags list
  SubmittingPatches: discourage new trailers
  SubmittingPatches: drop ref to "What's in git.git"
  CodingGuidelines: write punctuation marks
  CodingGuidelines: move period inside parentheses
2024-01-12 16:09:55 -08:00
f10b0989b8 strvec: use correct member name in comments
In d70a9eb611 (strvec: rename struct fields, 2020-07-28), we renamed the
"argv" member to "v". In the same patch we also did the following rename
in strvec.c:

    -void strvec_pushv(struct strvec *array, const char **argv)
    +void strvec_pushv(struct strvec *array, const char **items)

and it appears that this s/argv/items operation was erroneously applied
to strvec.h.

Rename "items" to "v".

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-12 13:38:07 -08:00
80bdaba894 messages: mark some strings with "up-to-date" not to touch
The treewide clean-up of "up-to-date" strings done in 7560f547
(treewide: correct several "up-to-date" to "up to date", 2017-08-23)
deliberately left some out, but unlike the lines that were changed
by the commit, the lines that were deliberately left untouched by
the commit is impossible to ask "git blame" to link back to the
commit that did not touch them.

Let's do the second best thing, leave a short comment near them
explaining why those strings should not be modified or localized.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
[es: make in-code comment more developer-friendly]
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-12 10:20:01 -08:00
acf8ea23af t5541: remove lockfile creation
To create error conditions, some tests set up reference locks by
directly creating its lockfile. While this works for the files reference
backend, this approach is incompatible with the reftable backend.
Refactor the test to create a d/f conflict via git-update-ref(1) instead
so that the test is reference backend agnostic.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-12 09:56:56 -08:00
a26f1fb62b t1401: remove lockfile creation
To create error conditions, some tests set up reference locks by
directly creating its lockfile. While this works for the files reference
backend, this approach is incompatible with the reftable backend.
Refactor the test to create a d/f conflict via git-symbolic-ref(1)
instead so that the test is reference backend agnostic.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-12 09:56:47 -08:00
bec9bb4b39 branch: make the advice to force-deleting a conditional one
The error message we show when the user tries to delete a not fully
merged branch describes the error and gives a hint to the user:

	error: the branch 'foo' is not fully merged.
	If you are sure you want to delete it, run 'git branch -D foo'.

Let's move the hint part so that it is displayed using the advice
machinery:

	error: the branch 'foo' is not fully merged
	hint: If you are sure you want to delete it, run 'git branch -D foo'
	hint: Disable this message with "git config advice.forceDeleteBranch false"

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 17:15:54 -08:00
eddd134ce3 advice: fix an unexpected leading space
This space was introduced, presumably unintentionally, in b3b18d1621
(advice: revamp advise API, 2020-03-02)

I notice this space due to confuse diff outputs while doing some
changes to enum advice_type.

As a reference, a recent change we have to that enum is:

    $ git show 35f0383

    ...
    diff --git a/advice.h b/advice.h
    index 0f584163f5..2affbe1426 100644
    --- a/advice.h
    +++ b/advice.h
    @@ -49,6 +49,7 @@ struct string_list;
	    ADVICE_UPDATE_SPARSE_PATH,
	    ADVICE_WAITING_FOR_EDITOR,
	    ADVICE_SKIPPED_CHERRY_PICKS,
    +       ADVICE_WORKTREE_ADD_ORPHAN,
     };

Note the hunk header, instead of a much more expected:

    @@ -49,6 +49,7 @@ enum advice_type

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 17:15:54 -08:00
3196029b5b advice: sort the advice related lists
Let's keep the advice related lists sorted to make them more
digestible.

A multi-line comment has also been changed; that produces the unexpected
'insertion != deletion' in this supposedly 'only sort lines' commit.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 17:15:54 -08:00
02b5c1a946 git-p4: stop reaching into the refdb
The git-p4 tool creates a bunch of temporary branches that share a
common prefix "refs/git-p4-tmp/". These branches get cleaned up via
git-update-ref(1) after the import has finished. Once done, we try to
manually remove the now supposedly-empty ".git/refs/git-p4-tmp/"
directory.

This last step can fail in case there still are any temporary branches
around that we failed to delete because `os.rmdir()` refuses to delete a
non-empty directory. It can thus be seen as kind of a sanity check to
verify that we really did delete all temporary branches. Another failure
mode though is when the directory didn't exist in the first place, which
can be the case when using an alternate ref backend like the upcoming
"reftable" backend.

Convert the code to instead use git-for-each-ref(1) to verify that there
are no more temporary branches around. This works alright with alternate
ref backends while retaining the sanity check that we really did prune
all temporary branches.

This is a modification in behaviour for the "files" backend because the
empty directory does not get deleted anymore. But arguably we should not
care about such implementation details of the ref backend anyway, and
this should not cause any user-visible change in behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 13:10:41 -08:00
718a93ecc0 reftable/blocksource: use mmap to read tables
The blocksource interface provides an interface to read blocks from a
reftable table. This interface is implemented using read(3P) calls on
the underlying file descriptor. While this works alright, this pattern
is very inefficient when repeatedly querying the reftable stack for one
or more refs. This inefficiency can mostly be attributed to the fact
that we often need to re-read the same blocks over and over again, and
every single time we need to call read(3P) again.

A natural fit in this context is to use mmap(3P) instead of read(3P),
which has a bunch of benefits:

  - We do not need to come up with a caching strategy for some of the
    blocks as this will be handled by the kernel already.

  - We can avoid the overhead of having to call into the read(3P)
    syscall repeatedly.

  - We do not need to allocate returned blocks repeatedly, but can
    instead hand out pointers into the mmapped region directly.

Using mmap comes with a significant drawback on Windows though, because
mmapped files cannot be deleted and neither is it possible to rename
files onto an mmapped file. But for one, the reftable library gracefully
handles the case where auto-compaction cannot delete a still-open stack
already and ignores any such errors. Also, `reftable_stack_clean()` will
prune stale tables which are not referenced by "tables.list" anymore so
that those files can eventually be pruned. And second, we never rewrite
already-written stacks, so it does not matter that we cannot rename a
file over an mmaped file, either.

Another unfortunate property of mmap is that it is not supported by all
systems. But given that the size of reftables should typically be rather
limited (megabytes at most in the vast majority of repositories), we can
use the fallback implementation provided by `git_mmap()` which reads the
whole file into memory instead. This is the same strategy that the
"packed" backend uses.

While this change doesn't significantly improve performance in the case
where we're seeking through stacks once (like e.g. git-for-each-ref(1)
would). But it does speed up usecases where there is lots of random
access to refs, e.g. when writing. The following benchmark demonstrates
these savings with git-update-ref(1) creating N refs in an otherwise
empty repository:

  Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.5 ms, System: 2.5 ms]
    Range (min … max):     4.8 ms …   7.1 ms    111 runs

  Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    Time (mean ± σ):      14.8 ms ±   0.5 ms    [User: 7.1 ms, System: 7.5 ms]
    Range (min … max):    14.1 ms …  18.7 ms    84 runs

  Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    Time (mean ± σ):     926.4 ms ±   5.6 ms    [User: 448.5 ms, System: 477.7 ms]
    Range (min … max):   920.0 ms … 936.1 ms    10 runs

  Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    Time (mean ± σ):       5.0 ms ±   0.2 ms    [User: 2.4 ms, System: 2.5 ms]
    Range (min … max):     4.7 ms …   5.4 ms    111 runs

  Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    Time (mean ± σ):      10.5 ms ±   0.2 ms    [User: 5.0 ms, System: 5.3 ms]
    Range (min … max):    10.0 ms …  10.9 ms    93 runs

  Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    Time (mean ± σ):     529.6 ms ±   9.1 ms    [User: 268.0 ms, System: 261.4 ms]
    Range (min … max):   522.4 ms … 547.1 ms    10 runs

  Summary
    update-ref: create many refs (refcount = 1, revision = HEAD) ran
      1.01 ± 0.06 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
      2.08 ± 0.07 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
      2.95 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    105.33 ± 3.76 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    184.24 ± 5.89 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)

Theoretically, we could also replicate the strategy of the "packed"
backend where small tables are read into memory instead of using mmap.
Benchmarks did not confirm that this has a performance benefit though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 12:10:59 -08:00
85e72be15d reftable/blocksource: refactor code to match our coding style
Refactor `reftable_block_source_from_file()` to match our coding style
better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 12:10:59 -08:00
6fdfaf15a0 reftable/stack: use stat info to avoid re-reading stack list
Whenever we call into the refs interfaces we potentially have to reload
refs in case they have been concurrently modified, either in-process or
externally. While this happens somewhat automatically for loose refs
because we simply try to re-read the files, the "packed" backend will
reload its snapshot of the packed-refs file in case its stat info has
changed since last reading it.

In the reftable backend we have a similar mechanism that is provided by
`reftable_stack_reload()`. This function will read the list of stacks
from "tables.list" and, if they have changed from the currently stored
list, reload the stacks. This is heavily inefficient though, as we have
to check whether the stack is up-to-date on basically every read and
thus keep on re-reading the file all the time even if it didn't change
at all.

We can do better and use the same stat(3P)-based mechanism that the
"packed" backend uses. Instead of reading the file, we will only open
the file descriptor, fstat(3P) it, and then compare the info against the
cached value from the last time we have updated the stack. This should
always work alright because "tables.list" is updated atomically via a
rename, so even if the ctime or mtime wasn't granular enough to identify
a change, at least the inode number or file size should have changed.

This change significantly speeds up operations where many refs are read,
like when using git-update-ref(1). The following benchmark creates N
refs in an otherwise-empty repository via `git update-ref --stdin`:

  Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.4 ms, System: 2.6 ms]
    Range (min … max):     4.8 ms …   7.2 ms    109 runs

  Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    Time (mean ± σ):      19.1 ms ±   0.9 ms    [User: 8.9 ms, System: 9.9 ms]
    Range (min … max):    18.4 ms …  26.7 ms    72 runs

  Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    Time (mean ± σ):      1.336 s ±  0.018 s    [User: 0.590 s, System: 0.724 s]
    Range (min … max):    1.314 s …  1.373 s    10 runs

  Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.4 ms, System: 2.6 ms]
    Range (min … max):     4.8 ms …   7.2 ms    109 runs

  Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    Time (mean ± σ):      14.8 ms ±   0.2 ms    [User: 7.1 ms, System: 7.5 ms]
    Range (min … max):    14.2 ms …  15.2 ms    82 runs

  Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    Time (mean ± σ):     927.6 ms ±   5.3 ms    [User: 437.8 ms, System: 489.5 ms]
    Range (min … max):   919.4 ms … 936.4 ms    10 runs

  Summary
    update-ref: create many refs (refcount = 1, revision = HEAD) ran
      1.00 ± 0.07 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
      2.89 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
      3.74 ± 0.25 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    181.26 ± 8.30 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    261.01 ± 12.35 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 12:10:59 -08:00
c5b5d5fbbc reftable/stack: refactor reloading to use file descriptor
We're about to introduce a stat(3P)-based caching mechanism to reload
the list of stacks only when it has changed. In order to avoid race
conditions this requires us to have a file descriptor available that we
can use to call fstat(3P) on.

Prepare for this by converting the code to use `fd_read_lines()` so that
we have the file descriptor readily available.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 12:10:59 -08:00
3c94bd8dfb reftable/stack: refactor stack reloading to have common exit path
The `reftable_stack_reload_maybe_reuse()` function is responsible for
reloading the reftable list from disk. The function is quite hard to
follow though because it has a bunch of different exit paths, many of
which have to free the same set of resources.

Refactor the function to have a common exit path. While at it, touch up
the style of this function a bit to match our usual coding style better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-11 12:10:59 -08:00
ac62a3649f gitweb: die when a configuration file cannot be read
Fix a possibility of a permission to access error go unnoticed.

Perl uses two different variables to manage errors from a "do
$filename" construct. One is $@, which is set in this case when do
is unable to compile the file. The other is $!, which is set in case
do cannot read the file.  The current code only checks "$@", which
means a configuration file passed to GitWeb that is not readable by
the server process does not cause it to "die".

Make sure we also check and act on "$!" to fix this.

Signed-off-by: Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-10 16:08:21 -08:00
9cce3be2df doc: refer to pathspec instead of path
Signed-off-by: Britton Leo Kerin <britton.kerin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-10 14:32:27 -08:00
3dbb69e1f2 doc: use singular form of repeatable path arg
This is more correct because the <path>... doc syntax already indicates
that the arg is "array-type".  It's how other tools do it.  Finally, the
later document text mentions 'path' arguments, while it doesn't mention
'paths'.

Signed-off-by: Britton Leo Kerin <britton.kergin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-10 14:32:27 -08:00
1260914190 t4129: prevent loss of exit code due to the use of pipes
Piping the output of git commands like git-ls-files to another
command (grep in this case) hides the exit code returned by
these commands. Prevent this by storing the output of git-ls-files
to a temporary file and then "grep-ping" from that file. Replace
grep with test_grep as the latter is more verbose when it fails.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-10 09:24:56 -08:00
f755e092e8 mingw: give more details about unsafe directory's ownership
Add domain/username in error message, if owner sid of repository and
user sid are not equal on windows systems.

Old error message:
'''
fatal: detected dubious ownership in repository at 'C:/Users/test/source/repos/git'
'C:/Users/test/source/repos/git' is owned by:
	'S-1-5-21-571067702-4104414259-3379520149-500'
but the current user is:
	'S-1-5-21-571067702-4104414259-3379520149-1001'
To add an exception for this directory, call:

	git config --global --add safe.directory C:/Users/test/source/repos/git
'''

New error message:
'''
fatal: detected dubious ownership in repository at 'C:/Users/test/source/repos/git'
'C:/Users/test/source/repos/git' is owned by:
        DESKTOP-L78JVA6/Administrator (S-1-5-21-571067702-4104414259-3379520149-500)
but the current user is:
        DESKTOP-L78JVA6/test (S-1-5-21-571067702-4104414259-3379520149-1001)
To add an exception for this directory, call:

        git config --global --add safe.directory C:/Users/test/source/repos/git
'''

Signed-off-by: Sören Krecker <soekkle@freenet.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-10 08:23:30 -08:00
a54a84b333 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 14:05:24 -08:00
bdfa7a2445 Merge branch 'rs/mem-pool-improvements'
MemPool allocator fixes.

* rs/mem-pool-improvements:
  mem-pool: simplify alignment calculation
  mem-pool: fix big allocations
2024-01-08 14:05:17 -08:00
8a48cd484f Merge branch 'rs/fast-import-simplify-mempool-allocation'
Code simplification.

* rs/fast-import-simplify-mempool-allocation:
  fast-import: use mem_pool_calloc()
2024-01-08 14:05:16 -08:00
d73db002b5 Merge branch 'en/sparse-checkout-eoo'
"git sparse-checkout (add|set) --[no-]cone --end-of-options" did
not handle "--end-of-options" correctly after a recent update.

* en/sparse-checkout-eoo:
  sparse-checkout: be consistent with end of options markers
2024-01-08 14:05:16 -08:00
863c596e68 Merge branch 'jc/sparse-checkout-set-default-fix'
"git sparse-checkout set" added default patterns even when the
patterns are being fed from the standard input, which has been
corrected.

* jc/sparse-checkout-set-default-fix:
  sparse-checkout: use default patterns for 'set' only !stdin
2024-01-08 14:05:16 -08:00
492ee03f60 Merge branch 'en/header-cleanup'
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-01-08 14:05:15 -08:00
9decd56cc9 Merge branch 'ml/doc-merge-updates'
Doc update.

* ml/doc-merge-updates:
  Documentation/git-merge.txt: use backticks for command wrapping
  Documentation/git-merge.txt: fix reference to synopsis
2024-01-08 14:05:15 -08:00
6bf317df4b Merge branch 'jc/archive-list-with-extra-args'
"git archive --list extra garbage" silently ignored excess command
line parameters, which has been corrected.

* jc/archive-list-with-extra-args:
  archive: "--list" does not take further options
2024-01-08 14:05:14 -08:00
39487a1510 fetch: add new config option fetch.all
Introduce a boolean configuration option fetch.all which allows to
fetch all available remotes by default. The config option can be
overridden by explicitly specifying a remote or by using --no-all.
The behavior for --all is unchanged and calling git-fetch with --all
and a remote will still result in an error.

Additionally, describe the configuration variable in the config
documentation and implement new tests to cover the expected behavior.
Also add --no-all to the command-line documentation of git-fetch.

Signed-off-by: Tamino Bauknecht <dev@tb6.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:36:23 -08:00
8f4c00de95 builtin/worktree: create refdb via ref backend
When creating a worktree we create the worktree's ref database manually
by first writing a "HEAD" file so that the directory is recognized as a
Git repository by other commands, and then running git-update-ref(1) or
git-symbolic-ref(1) to write the actual value. But while this is fine
for the files backend, this logic simply assumes too much about how the
ref backend works and will leave behind an invalid ref database once any
other ref backend lands.

Refactor the code to instead use `refs_init_db()` to initialize the ref
database so that git-worktree(1) itself does not need to know about how
to initialize it. This will allow future ref backends to customize how
the per-worktree ref database is set up. Furthermore, as we now already
have a worktree ref store around, we can also avoid spawning external
commands to write the HEAD reference and instead use the refs API to do
so.

Note that we do not have an equivalent to passing the `--quiet` flag to
git-symbolic-ref(1) as we did before. This flag does not have an effect
anyway though, as git-symbolic-ref(1) only honors it when reading a
symref, but never when writing one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
b8a846b2e0 worktree: expose interface to look up worktree by name
Our worktree interfaces do not provide a way to look up a worktree by
its name. Expose `get_linked_worktree()` to allow for this usecase. As
callers are responsible for freeing this worktree, introduce a new
function `free_worktree()` that does so.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
84f0ea956f builtin/worktree: move setup of commondir file earlier
Shuffle around how we create supporting worktree files so that we first
ensure that the worktree has all link files ("gitdir", "commondir")
before we try to initialize the ref database by writing "HEAD". This
will be required by a subsequent commit where we start to initialize the
ref database via `refs_init_db()`, which will require an initialized
`struct worktree *`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
2eb1d0c452 refs/files: skip creation of "refs/{heads,tags}" for worktrees
The files ref backend will create both "refs/heads" and "refs/tags" in
the Git directory. While this logic makes sense for normal repositories,
it does not for worktrees because those refs are "common" refs that
would always be contained in the main repository's ref database.

Introduce a new flag telling the backend that it is expected to create a
per-worktree ref database and skip creation of these dirs in the files
backend when the flag is set. No other backends (currently) need
worktree-specific logic, so this is the only required change to start
creating per-worktree ref databases via `refs_init_db()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
c358d165f2 setup: move creation of "refs/" into the files backend
When creating the ref database we unconditionally create the "refs/"
directory in "setup.c". This is a mandatory prerequisite for all Git
repositories regardless of the ref backend in use, because Git will be
unable to detect the directory as a repository if "refs/" doesn't exist.

We are about to add another new caller that will want to create a ref
database when creating worktrees. We would require the same logic to
create the "refs/" directory even though the caller really should not
care about such low-level details. Ideally, the ref database should be
fully initialized after calling `refs_init_db()`.

Move the code to create the directory into the files backend itself to
make it so. This means that future ref backends will also need to have
equivalent logic around to ensure that the directory exists, but it
seems a lot more sensible to have it this way round than to require
callers to create the directory themselves.

An alternative to this would be to create "refs/" in `refs_init_db()`
directly. This feels conceptually unclean though as the creation of the
refdb is now cluttered across different callsites. Furthermore, both the
"files" and the upcoming "reftable" backend write backend-specific data
into the "refs/" directory anyway, so splitting up this logic would only
make it harder to reason about.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
2e573d61ff refs: prepare refs_init_db() for initializing worktree refs
The purpose of `refs_init_db()` is to initialize the on-disk files of a
new ref database. The function is quite inflexible right now though, as
callers can neither specify the `struct ref_store` nor can they pass any
flags.

Refactor the interface to accept both of these. This will be required so
that we can start initializing per-worktree ref databases via the ref
backend instead of open-coding the initialization in "worktree.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
5bf20d6c77 Merge branch 'ps/refstorage-extension' into ps/worktree-refdb-initialization
* ps/refstorage-extension:
  t9500: write "extensions.refstorage" into config
  builtin/clone: introduce `--ref-format=` value flag
  builtin/init: introduce `--ref-format=` value flag
  builtin/rev-parse: introduce `--show-ref-format` flag
  t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
  setup: introduce GIT_DEFAULT_REF_FORMAT envvar
  setup: introduce "extensions.refStorage" extension
  setup: set repository's formats on init
  setup: start tracking ref storage format
  refs: refactor logic to look up storage backends
  worktree: skip reading HEAD when repairing worktrees
  t: introduce DEFAULT_REPO_FORMAT prereq
  builtin/clone: create the refdb with the correct object format
  builtin/clone: skip reading HEAD when retrieving remote
  builtin/clone: set up sparse checkout later
  builtin/clone: fix bundle URIs with mismatching object formats
  remote-curl: rediscover repository when fetching refs
  setup: allow skipping creation of the refdb
  setup: extract function to create the refdb
2024-01-08 12:58:54 -08:00
cd69c635a1 ci: add job performing static analysis on GitLab CI
Our GitHub Workflows definitions have a static analysis job that
runs the following tasks:

  - Coccinelle to check for suggested refactorings.

  - `make hdr-check` to check for missing includes or forward
    declarations in our header files.

  - `make check-pot` to check our translations for issues.

  - `./ci/check-directional-formatting.bash` to check whether our
    sources contain any Unicode directional formatting code points.

Add an equivalent job to our GitLab CI definitions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 11:23:03 -08:00
fc134b41ce git-prompt: stop manually parsing HEAD with unknown ref formats
We're manually parsing the HEAD reference in git-prompt to figure out
whether it is a symbolic or direct reference. This makes it intimately
tied to the on-disk format we use to store references and will stop
working once we gain additional reference backends in the Git project.

Ideally, we would refactor the code to exclusively use plumbing tools to
read refs such that we do not have to care about the on-disk format at
all. Unfortunately though, spawning processes can be quite expensive on
some systems like Windows. As the Git prompt logic may be executed quite
frequently we try very hard to spawn as few processes as possible. This
refactoring is thus out of question for now.

Instead, condition the logic on the repository's ref format: if the repo
uses the the "files" backend we can continue to use the old logic and
read the respective files from disk directly. If it's anything else,
then we use git-symbolic-ref(1) to read the value of HEAD.

This change makes the Git prompt compatible with the upcoming "reftable"
format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 11:21:45 -08:00
4081d45d7f Merge branch 'ps/refstorage-extension' into ps/prompt-parse-HEAD-futureproof
* ps/refstorage-extension:
  t9500: write "extensions.refstorage" into config
  builtin/clone: introduce `--ref-format=` value flag
  builtin/init: introduce `--ref-format=` value flag
  builtin/rev-parse: introduce `--show-ref-format` flag
  t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
  setup: introduce GIT_DEFAULT_REF_FORMAT envvar
  setup: introduce "extensions.refStorage" extension
  setup: set repository's formats on init
  setup: start tracking ref storage format
  refs: refactor logic to look up storage backends
  worktree: skip reading HEAD when repairing worktrees
  t: introduce DEFAULT_REPO_FORMAT prereq
2024-01-08 11:21:18 -08:00
5aea3955bc branch: clarify <oldbranch> term
Since 52d59cc645 (branch: add a --copy (-c) option to go with --move
(-m), 2017-06-18) <oldbranch> is used in more operations than just -m.

Let's also clarify what we do if <oldbranch> is omitted.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 10:06:05 -08:00
25aec06326 rebase: clarify --reschedule-failed-exec default
Documentation should mention the default behavior.

It is better to explain the persistent nature of the
--reschedule-failed-exec flag from the user standpoint, rather than from
the implementation standpoint.

Signed-off-by: Illia Bobyr <illia.bobyr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-05 09:41:25 -08:00
993d38a066 index-pack: spawn threads atomically
The t5309 script triggers a racy false positive with SANITIZE=leak on a
multi-core system. Running with "--stress --run=6" usually fails within
10 seconds or so for me, complaining with something like:

    + git index-pack --fix-thin --stdin
    fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)

    =================================================================
    ==3904583==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 32 byte(s) in 1 object(s) allocated from:
        #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
        #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
        #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
        #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
        #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
        #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
        #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

    SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
    Aborted

What happens is this:

  0. We construct a bogus pack with a duplicate object in it and trigger
     index-pack.

  1. We spawn a bunch of worker threads to resolve deltas (on my system
     it is 16 threads).

  2. One of the threads sees the duplicate object and bails by calling
     exit(), taking down all of the threads. This is expected and is the
     point of the test.

  3. At the time exit() is called, we may still be spawning threads from
     the main process via pthread_create(). LSan hooks thread creation
     to update its book-keeping; it has to know where each thread's
     stack is (so it can find entry points for reachable memory). So it
     calls pthread_getattr_np() to get information about the new thread.
     That may allocate memory that must be freed with a matching call to
     pthread_attr_destroy(). Probably LSan does that immediately, but
     if you're unlucky enough, the exit() will happen while it's between
     those two calls, and the allocated pthread_attr_t appears as a
     leak.

This isn't a real leak. It's not even in our code, but rather in the
LSan instrumentation code. So we could just ignore it. But the false
positive can cause people to waste time tracking it down.

It's possibly something that LSan could protect against (e.g., cover the
getattr/destroy pair with a mutex, and then in the final post-exit()
check for leaks try to take the same mutex). But I don't know enough
about LSan to say if that's a reasonable approach or not (or if my
analysis is even completely correct).

In the meantime, it's pretty easy to avoid the race by making creation
of the worker threads "atomic". That is, we'll spawn all of them before
letting any of them start to work. That's easy to do because we already
have a work_lock() mutex for handing out that work. If the main process
takes it, then all of the threads will immediately block until we've
finished spawning and released it.

This shouldn't make any practical difference for non-LSan runs. The
thread spawning is quick, and could happen before any worker thread gets
scheduled anyway.

Probably other spots that use threads are subject to the same issues.
But since we have to manually insert locking (and since this really is
kind of a hack), let's not bother with them unless somebody experiences
a similar racy false-positive in practice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-05 08:40:56 -08:00
d70f554cdf commit-graph: retain commit slab when closing NULL commit_graph
This fixes a regression introduced in ac6d45d11f (commit-graph: move
slab-clearing to close_commit_graph(), 2023-10-03), in which running:

  git -c fetch.writeCommitGraph=true fetch --recurse-submodules

multiple times in a freshly cloned repository causes a segfault. What
happens in the second (and subsequent) runs is this:

  1. We make a "struct commit" for any ref tips which we're storing
     (even if we already have them, they still go into FETCH_HEAD).

     Because the first run will have created a commit graph, we'll find
     those commits in the graph.

     The commit struct is therefore created with a NULL "maybe_tree"
     entry, because we can load its oid from the graph later. But to do
     that we need to remember that we got the commit from the graph,
     which is recorded in a global commit_graph_data_slab object.

  2. Because we're using --recurse-submodules, we'll try to fetch each
     of the possible submodules. That implies creating a separate
     "struct repository" in-process for each submodule, which will
     require a later call to repo_clear().

     The call to repo_clear() calls raw_object_store_clear(), which in
     turn calls close_object_store(), which in turn calls
     close_commit_graph(). And the latter frees the commit graph data
     slab.

  3. Later, when trying to write out a new commit graph, we'll ask for
     their tree oid via get_commit_tree_oid(), which will see that the
     object is parsed but with a NULL maybe_tree field. We'd then
     usually pull it from the graph file, but because the slab was
     cleared, we don't realize that we can do so! We end up returning
     NULL and segfaulting.

     (It seems questionable that we'd write a graph entry for such a
     commit anyway, since we know we already have one. I didn't
     double-check, but that may simply be another side effect of having
     cleared the slab).

The bug is in step (2) above. We should not be clearing the slab when
cleaning up the submodule repository structs. Prior to ac6d45d11f, we
did not do so because it was done inside a helper function that returned
early when it saw NULL. So the behavior change from that commit is that
we'll now _always_ clear the slab via repo_clear(), even if the
repository being closed did not have a commit graph (and thus would have
a NULL commit_graph struct).

The most immediate fix is to add in a NULL check in close_commit_graph(),
making it a true noop when passed in an object_store with a NULL
commit_graph (it's OK to just return early, since the rest of its code
is already a noop when passed NULL). That restores the pre-ac6d45d11f
behavior. And that's what this patch does, along with a test that
exercises it (we already have a test that uses submodules along with
fetch.writeCommitGraph, but the bug only triggers when there is a
subsequent fetch and when that fetch uses --recurse-submodules).

So that fixes the regression in the least-risky way possible.

I do think there's some fragility here that we might want to follow up
on. We have a global commit_graph_data_slab that contains graph
positions, and our global commit structs depend on the that slab
remaining valid. But close_commit_graph() is just about closing _one_
object store's graph. So it's dangerous to call that function and clear
the slab without also throwing away any "struct commit" we might have
parsed that depends on it.

Which at first glance seems like a bug we could already trigger. In the
situation described here, there is no commit graph in the submodule
repository, so our commit graph is NULL (in fact, in our test script
there is no submodule repo at all, so we immediately return from
repo_init() and call repo_clear() only to free up memory). But what
would happen if there was one? Wouldn't we see a non-NULL commit_graph
entry, and then clear the global slab anyway?

The answer is "no", but for very bizarre reasons. Remember that
repo_clear() calls raw_object_store_clear(), which then calls
close_object_store() and thus close_commit_graph(). But before it does
so, raw_object_store_clear() does something else: it frees the commit
graph and sets it to NULL! So by this code path we'll _never_ see a
non-NULL commit_graph struct, and thus never clear the slab.

So it happens to work out. But it still seems questionable to me that we
would clear a global slab (which might still be in use) when closing the
commit graph. This clearing comes from 957ba814bf (commit-graph: when
closing the graph, also release the slab, 2021-09-08), and was fixing a
case where we really did need it to be closed (and in that case we
presumably call close_object_store() more directly).

So I suspect there may still be a bug waiting to happen there, as any
object loaded before the call to close_object_store() may be stranded
with a bogus maybe_tree entry (and thus looking at it after the call
might cause an error). But I'm not sure how to trigger it, nor what the
fix should look like (you probably would need to "unparse" any objects
pulled from the graph). And so this patch punts on that for now in favor
of fixing the recent regression in the most direct way, which should not
have any other fallouts.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-05 08:35:26 -08:00
556e68032f write-or-die: make GIT_FLUSH a Boolean environment variable
Among Git's environment variables, the ones marked as "Boolean"
accept values in a way similar to Boolean configuration variables,
i.e. values like 'yes', 'on', 'true' and positive numbers are
taken as "on" and values like 'no', 'off', 'false' are taken as
"off".

GIT_FLUSH can be used to force Git to use non-buffered I/O when
writing to stdout. It can only accept two values, '1' which causes
Git to flush more often and '0' which makes all output buffered.
Make GIT_FLUSH accept more values besides '0' and '1' by turning it
into a Boolean environment variable, modifying the required logic.
Update the related documentation.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-04 10:32:21 -08:00
ee9895b0ff push: region_leave trace for negotiate_using_fetch
There were two region_enter events for negotiate_using_fetch instead of
one enter and one leave. This commit replaces the second region_enter
event with a region_leave.

Signed-off-by: Sam Delmerico <delmerico@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 15:32:44 -08:00
9cd30af991 Documentation: fix statement about rebase.instructionFormat
Since commit 62db5247 (rebase -i: generate the script via
rebase--helper, 2017-07-14), the short hash is given in
rebase-todo. Specifying rebase.instructionFormat does not alter this
behavior, contrary to what the documentation implies.

Signed-off-by: Maarten van der Schrieck <maarten@thingsconnected.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 11:21:15 -08:00
19b9496c1f reftable/merged: transfer ownership of records when iterating
When iterating over records with the merged iterator we put the records
into a priority queue before yielding them to the caller. This means
that we need to allocate the contents of these records before we can
pass them over to the caller.

The handover to the caller is quite inefficient though because we first
deallocate the record passed in by the caller and then copy over the new
record, which requires us to reallocate memory.

Refactor the code to instead transfer ownership of the new record to the
caller. So instead of reallocating all contents, we now release the old
record and then copy contents of the new record into place.

The following benchmark of `git show-ref --quiet` in a repository with
around 350k refs shows a clear improvement. Before:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 357,007 allocs, 356,814 frees, 24,193,602 bytes allocated

This shows that we now have roundabout a single allocation per record
that we're yielding from the iterator. Ideally, we'd also get rid of
this allocation so that the number of allocations doesn't scale with the
number of refs anymore. This would require some larger surgery though
because the memory is owned by the priority queue before transferring it
over to the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:21 -08:00
5473aca376 reftable/merged: really reuse buffers to compute record keys
In 829231dc20 (reftable/merged: reuse buffer to compute record keys,
2023-12-11), we have refactored the merged iterator to reuse a pair of
long-living strbufs by relying on the fact that `reftable_record_key()`
tries to reuse already allocated strbufs by calling `strbuf_reset()`,
which should give us significantly fewer reallocations compared to the
old code that used on-stack strbufs that are allocated for each and
every iteration. Unfortunately, we called `strbuf_release()` on these
long-living strbufs that we meant to reuse on each iteration, defeating
the optimization.

Fix this performance issue by not releasing those buffers on iteration
anymore, where we instead rely on `merged_iter_close()` to release the
buffers for us.

Using `git show-ref --quiet` in a repository with ~350k refs this leads
to a significant drop in allocations. Before:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:21 -08:00
b31e3cc620 reftable/record: store "val2" hashes as static arrays
Similar to the preceding commit, convert ref records of type "val2" to
store their object IDs in static arrays instead of allocating them for
every single record.

We're using the same benchmark as in the preceding commit, with `git
show-ref --quiet` in a repository with ~350k refs. This time around
though the effects aren't this huge. Before:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 1,419,040 allocs, 1,418,847 frees, 62,153,868 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated

This is because "val2"-type records are typically only stored for peeled
tags, and the number of annotated tags in the benchmark repository is
rather low. Still, it can be seen that this change leads to a reduction
of allocations overall, even if only a small one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:21 -08:00
7af607c58d reftable/record: store "val1" hashes as static arrays
When reading ref records of type "val1", we store its object ID in an
allocated array. This results in an additional allocation for every
single ref record we read, which is rather inefficient especially when
iterating over refs.

Refactor the code to instead use an embedded array of `GIT_MAX_RAWSZ`
bytes. While this means that `struct ref_record` is bigger now, we
typically do not store all refs in an array anyway and instead only
handle a limited number of records at the same point in time.

Using `git show-ref --quiet` in a repository with ~350k refs this leads
to a significant drop in allocations. Before:

    HEAP SUMMARY:
        in use at exit: 21,098 bytes in 192 blocks
      total heap usage: 2,116,683 allocs, 2,116,491 frees, 76,098,060 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,098 bytes in 192 blocks
      total heap usage: 1,419,031 allocs, 1,418,839 frees, 62,145,036 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:20 -08:00
88f59d9e31 reftable/record: constify some parts of the interface
We're about to convert reftable records to stop storing their object IDs
as allocated hashes. Prepare for this refactoring by constifying some
parts of the interface that will be impacted by this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:20 -08:00
ddac965965 reftable/writer: fix index corruption when writing multiple indices
Each reftable may contain multiple types of blocks for refs, objects and
reflog records, where each of these may have an index that makes it more
efficient to find the records. It was observed that the index for log
records can become corrupted under certain circumstances, where the
first entry of the index points into the object index instead of to the
log records.

As it turns out, this corruption can occur whenever we write a log index
as well as at least one additional index. Writing records and their index
is basically a two-step process:

  1. We write all blocks for the corresponding record. Each block that
     gets written is added to a list of blocks to index.

  2. Once all blocks were written we finish the section. If at least two
     blocks have been added to the list of blocks to index then we will
     now write the index for those blocks and flush it, as well.

When we have a very large number of blocks then we may decide to write a
multi-level index, which is why we also keep track of the list of the
index blocks in the same way as we previously kept track of the blocks
to index.

Now when we have finished writing all index blocks we clear the index
and flush the last block to disk. This is done in the wrong order though
because flushing the block to disk will re-add it to the list of blocks
to be indexed. The result is that the next section we are about to write
will have an entry in the list of blocks to index that points to the
last block of the preceding section's index, which will corrupt the log
index.

Fix this corruption by clearing the index after having written the last
block.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:20 -08:00
75d790608f reftable/stack: do not auto-compact twice in reftable_stack_add()
In 5c086453ff (reftable/stack: perform auto-compaction with
transactional interface, 2023-12-11), we fixed a bug where the
transactional interface to add changes to a reftable stack did not
perform auto-compaction by calling `reftable_stack_auto_compact()` in
`reftable_stack_addition_commit()`. While correct, this change may now
cause us to perform auto-compaction twice in the non-transactional
interface `reftable_stack_add()`:

  - It performs auto-compaction by itself.

  - It now transitively performs auto-compaction via the transactional
    interface.

Remove the first instance so that we only end up doing auto-compaction
once.

Reported-by: Han-Wen Nienhuys <hanwenn@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:20 -08:00
d26c21483d reftable/stack: do not overwrite errors when compacting
In order to compact multiple stacks we iterate through the merged ref
and log records. When there is any error either when reading the records
from the old merged table or when writing the records to the new table
then we break out of the respective loops. When breaking out of the loop
for the ref records though the error code will be overwritten, which may
cause us to inadvertently skip over bad ref records. In the worst case,
this can lead to a compacted stack that is missing records.

Fix the code by using `goto done` instead so that any potential error
codes are properly returned to the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:54:20 -08:00
54d8a2531b t1006: prefer shell loop to awk for packed object sizes
To compute the expected on-disk size of packed objects, we sort the
output of show-index by pack offset and then compute the difference
between adjacent entries using awk. This works but has a few readability
problems:

  1. Reading the index in pack order means don't find out the size of an
     oid's entry until we see the _next_ entry. So we have to save it to
     print later.

     We can instead iterate in reverse order, so we compute each oid's
     size as we see it.

  2. Since the awk invocation is inside a text_expect block, we can't
     easily use single-quotes to hold the script. So we use
     double-quotes, but then have to escape the dollar signs in the awk
     script.

     We can swap this out for a shell loop instead (which is made much
     easier by the first change).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:26:53 -08:00
a26002b628 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 13:51:30 -08:00
dbf668a1b7 Merge branch 'ps/pseudo-refs'
Assorted changes around pseudoref handling.

* ps/pseudo-refs:
  bisect: consistently write BISECT_EXPECTED_REV via the refdb
  refs: complete list of special refs
  refs: propagate errno when reading special refs fails
  wt-status: read HEAD and ORIG_HEAD via the refdb
2024-01-02 13:51:30 -08:00
601b1571e8 Merge branch 'jc/orphan-unborn'
Doc updates to clarify what an "unborn branch" means.

* jc/orphan-unborn:
  orphan/unborn: fix use of 'orphan' in end-user facing messages
  orphan/unborn: add to the glossary and use them consistently
2024-01-02 13:51:30 -08:00
cce4778520 Merge branch 'rj/status-bisect-while-rebase'
"git status" is taught to show both the branch being bisected and
being rebased when both are in effect at the same time.

* rj/status-bisect-while-rebase:
  status: fix branch shown when not only bisecting
2024-01-02 13:51:29 -08:00
59a29e1274 Merge branch 'la/trailer-cleanups'
Code clean-up.

* la/trailer-cleanups:
  trailer: use offsets for trailer_start/trailer_end
  trailer: find the end of the log message
  commit: ignore_non_trailer computes number of bytes to ignore
2024-01-02 13:51:29 -08:00
43ec879169 Merge branch 'jc/retire-cas-opt-name-constant'
Code clean-up.

* jc/retire-cas-opt-name-constant:
  remote.h: retire CAS_OPT_NAME
2024-01-02 13:51:29 -08:00
9cc710098b Merge branch 'rs/rebase-use-strvec-pushf'
Code clean-up.

* rs/rebase-use-strvec-pushf:
  rebase: use strvec_pushf() for format-patch revisions
2024-01-02 13:51:29 -08:00
72e6a61c40 Merge branch 'sh/completion-with-reftable'
Command line completion script (in contrib/) learned to work better
with the reftable backend.

* sh/completion-with-reftable:
  completion: support pseudoref existence checks for reftables
  completion: refactor existence checks for pseudorefs
2024-01-02 13:51:28 -08:00
1b2234079b t9500: write "extensions.refstorage" into config
In t9500 we're writing a custom configuration that sets up gitweb. This
requires us to manually ensure that the repository format is configured
as required, including both the repository format version and
extensions. With the introduction of the "extensions.refStorage"
extension we need to update the test to also write this new one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:49 -08:00
5ed860f51b builtin/clone: introduce --ref-format= value flag
Introduce a new `--ref-format` value flag for git-clone(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
48fa45f5fb builtin/init: introduce --ref-format= value flag
Introduce a new `--ref-format` value flag for git-init(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
3c4a5318af builtin/rev-parse: introduce --show-ref-format flag
Introduce a new `--show-ref-format` to git-rev-parse(1) that causes it
to print the ref format used by a repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
58aaf59133 t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
Introduce a new GIT_TEST_DEFAULT_REF_FORMAT environment variable that
lets developers run the test suite with a different default ref format
without impacting the ref format used by non-test Git invocations. This
is modeled after GIT_TEST_DEFAULT_OBJECT_FORMAT, which does the same
thing for the repository's object format.

Adapt the setup of the `REFFILES` test prerequisite to be conditionally
set based on the default ref format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
aa19619a98 setup: introduce GIT_DEFAULT_REF_FORMAT envvar
Introduce a new GIT_DEFAULT_REF_FORMAT environment variable that lets
users control the default ref format used by both git-init(1) and
git-clone(1). This is modeled after GIT_DEFAULT_OBJECT_FORMAT, which
does the same thing for the repository's object format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
d7497a42b0 setup: introduce "extensions.refStorage" extension
Introduce a new "extensions.refStorage" extension that allows us to
specify the ref storage format used by a repository. For now, the only
supported format is the "files" format, but this list will likely soon
be extended to also support the upcoming "reftable" format.

There have been discussions on the Git mailing list in the past around
how exactly this extension should look like. One alternative [1] that
was discussed was whether it would make sense to model the extension in
such a way that backends are arbitrarily stackable. This would allow for
a combined value of e.g. "loose,packed-refs" or "loose,reftable", which
indicates that new refs would be written via "loose" files backend and
compressed into "packed-refs" or "reftable" backends, respectively.

It is arguable though whether this flexibility and the complexity that
it brings with it is really required for now. It is not foreseeable that
there will be a proliferation of backends in the near-term future, and
the current set of existing formats and formats which are on the horizon
can easily be configured with the much simpler proposal where we have a
single value, only.

Furthermore, if we ever see that we indeed want to gain the ability to
arbitrarily stack the ref formats, then we can adapt the current
extension rather easily. Given that Git clients will refuse any unknown
value for the "extensions.refStorage" extension they would also know to
ignore a stacked "loose,packed-refs" in the future.

So let's stick with the easy proposal for the time being and wire up the
extension.

[1]: <pull.1408.git.1667846164.gitgitgadget@gmail.com>

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
58be32fff9 setup: set repository's formats on init
The proper hash algorithm and ref storage format that will be used for a
newly initialized repository will be figured out in `init_db()` via
`validate_hash_algorithm()` and `validate_ref_storage_format()`. Until
now though, we never set up the hash algorithm or ref storage format of
`the_repository` accordingly.

There are only two callsites of `init_db()`, one in git-init(1) and one
in git-clone(1). The former function doesn't care for the formats to be
set up properly because it never access the repository after calling the
function in the first place.

For git-clone(1) it's a different story though, as we call `init_db()`
before listing remote refs. While we do indeed have the wrong hash
function in `the_repository` when `init_db()` sets up a non-default
object format for the repository, it never mattered because we adjust
the hash after learning about the remote's hash function via the listed
refs.

So the current state is correct for the hash algo, but it's not for the
ref storage format because git-clone(1) wouldn't know to set it up
properly. But instead of adjusting only the `ref_storage_format`, set
both the hash algo and the ref storage format so that `the_repository`
is in the correct state when `init_db()` exits. This is fine as we will
adjust the hash later on anyway and makes it easier to reason about the
end state of `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
173761e21b setup: start tracking ref storage format
In order to discern which ref storage format a repository is supposed to
use we need to start setting up and/or discovering the format. This
needs to happen in two separate code paths.

  - The first path is when we create a repository via `init_db()`. When
    we are re-initializing a preexisting repository we need to retain
    the previously used ref storage format -- if the user asked for a
    different format then this indicates an error and we error out.
    Otherwise we either initialize the repository with the format asked
    for by the user or the default format, which currently is the
    "files" backend.

  - The second path is when discovering repositories, where we need to
    read the config of that repository. There is not yet any way to
    configure something other than the "files" backend, so we can just
    blindly set the ref storage format to this backend.

Wire up this logic so that we have the ref storage format always readily
available when needed. As there is only a single backend and because it
is not configurable we cannot yet verify that this tracking works as
expected via tests, but tests will be added in subsequent commits. To
countermand this ommission now though, raise a BUG() in case the ref
storage format is not set up properly in `ref_store_init()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
0fcc285c5e refs: refactor logic to look up storage backends
In order to look up ref storage backends, we're currently using a linked
list of backends, where each backend is expected to set up its `next`
pointer to the next ref storage backend. This is kind of a weird setup
as backends need to be aware of other backends without much of a reason.

Refactor the code so that the array of backends is centrally defined in
"refs.c", where each backend is now identified by an integer constant.
Expose functions to translate from those integer constants to the name
and vice versa, which will be required by subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
465a22b338 worktree: skip reading HEAD when repairing worktrees
When calling `git init --separate-git-dir=<new-path>` on a preexisting
repository, we move the Git directory of that repository to the new path
specified by the user. If there are worktrees present in the repository,
we need to repair the worktrees so that their gitlinks point to the new
location of the repository.

This repair logic will load repositories via `get_worktrees()`, which
will enumerate up and initialize all worktrees. Part of initialization
is logic that we resolve their respective worktree HEADs, even though
that information may not actually be needed in the end by all callers.

Although not a problem presently with the file-based reference backend,
it will become a problem with the upcoming reftable backend. In the
context of git-init(1) we do not have a fully-initialized repository set
up via `setup_git_directory()` or friends. Consequently, we do not know
about the repository format when `repair_worktrees()` is called, and
properly setting up all parts of the repositroy in `init_db()` before we
try to repair worktrees is not an easy task. With the introduction of
the reftable backend, we would ultimately try to look up the worktree
HEADs before we have figured out the reference format, which does not
work.

We do not require the worktree HEADs at all to repair worktrees. So
let's fix this issue by skipping over the step that reads them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
bb0372c979 t: introduce DEFAULT_REPO_FORMAT prereq
A limited number of tests require repositories to have the default
repository format or otherwise they would fail to run, e.g. because they
fail to detect the correct hash function. While the hash function is the
only extension right now that creates problems like this, we are about
to add a second extension for the ref format.

Introduce a new DEFAULT_REPO_FORMAT prereq that can easily be amended
whenever we add new format extensions. Next to making any such changes
easier on us, the prerequisite's name should also help to clarify the
intent better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
2232a88ab6 attr: add builtin objectmode values support
Gives all paths builtin objectmode values based on the paths' modes
(one of 100644, 100755, 120000, 040000, 160000). Users may use
this feature to filter by file types. For example a pathspec such as
':(attr:builtin_objectmode=160000)' could filter for submodules without
needing to have `builtin_objectmode=160000` to be set in .gitattributes
for every submodule path.

These values are also reflected in `git check-attr` results.
If the git_attr_direction is set to GIT_ATTR_INDEX or GIT_ATTR_CHECKIN
and a path is not found in the index, the value will be unspecified.

This patch also reserves the builtin_* attribute namespace for objectmode
and any future builtin attributes. Any user defined attributes using this
reserved namespace will result in a warning. This is a breaking change for
any existing builtin_* attributes.
Pathspecs with some builtin_* attribute name (excluding builtin_objectmode)
will behave like any attribute where there are no user specified values.

Signed-off-by: Joanna Wang <jojwang@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-28 13:21:52 -08:00
c61740d607 mem-pool: simplify alignment calculation
Use DIV_ROUND_UP in mem_pool_alloc() to round the allocation length to
the next multiple of GIT_MAX_ALIGNMENT instead of twiddling bits
explicitly.  This is shorter and clearer, to the point that we no longer
need the comment that explains what's being calculated.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-28 12:22:58 -08:00
6cbae64000 mem-pool: fix big allocations
Memory pool allocations that require a new block and would fill at
least half of it are handled specially.  Before 158dfeff3d (mem-pool:
add life cycle management functions, 2018-07-02) they used to be
allocated outside of the pool.  This patch made mem_pool_alloc() create
a bespoke block instead, to allow releasing it when the pool gets
discarded.

Unfortunately mem_pool_alloc() returns a pointer to the start of such a
bespoke block, i.e. to the struct mp_block at its top.  When the caller
writes to it, the management information gets corrupted.  This affects
mem_pool_discard() and -- if there are no other blocks in the pool --
also mem_pool_alloc().

Return the payload pointer of bespoke blocks, just like for smaller
allocations, to protect the management struct.

Also update next_free to mark the block as full.  This is only strictly
necessary for the first allocated block, because subsequent ones are
inserted after the current block and never considered for further
allocations, but it's easier to just do it in all cases.

Add a basic unit test to demonstrate the issue by using
mem_pool_calloc() with a tiny block size, which forces the creation of a
bespoke block.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-28 12:22:43 -08:00
03bcc93769 sideband.c: remove redundant 'NEEDSWORK' tag
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-28 07:59:12 -08:00
291873e5d6 SubmittingPatches: hyphenate non-ASCII
Git documentation does this with the exception of ancient release notes.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:24 -08:00
7818951623 SubmittingPatches: clarify GitHub artifact format
GitHub wraps artifacts generated by workflows in a .zip file.

Internally, workflows can package anything they like in them.

A recently generated failure artifact had the form:

windows-artifacts.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
 76001695  12-19-2023 01:35   artifacts.tar.gz
 11005650  12-19-2023 01:35   tracked.tar.gz
---------                     -------
 87007345                     2 files

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:24 -08:00
0771a3b55c SubmittingPatches: clarify GitHub visual
GitHub has two general forms for its states, sometimes they're a simple
colored object (e.g. green check or red x), and sometimes there's also a
colored container (e.g. green box or red circle) which contains that
object (e.g. check or x).

That's a lot of words to try to describe things, but in general, the key
for a failure is that it's recognized as an `x` and that it's associated
with the color red -- the color of course is problematic for people who
are red-green color-blind, but that's why they are paired with distinct
shapes.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:24 -08:00
08e2e6f8d2 SubmittingPatches: provide tag naming advice
Current statistics show a strong preference to only capitalize the first
letter in a hyphenated tag, but that some guidance would be helpful:

git log |
  perl -ne 'next unless /^\s+(?:Signed-[oO]ff|Acked)-[bB]y:/;
    s/^\s+//;s/:.*/:/;print'|
  sort|uniq -c|sort -n
   2 Signed-off-By:
   4 Signed-Off-by:
  22 Acked-By:
  47 Signed-Off-By:
2202 Acked-by:
95315 Signed-off-by:

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
c771ef6f77 SubmittingPatches: update extra tags list
Add items with at least 100 uses in the past three years:
- Co-authored-by
- Helped-by
- Mentored-by
- Suggested-by

git log --since=3.years|
  perl -ne 'next unless /^\s+[A-Z][a-z]+-\S+:/;s/^\s+//;s/:.*/:/;print'|
  sort|uniq -c|sort -n|grep '[0-9][0-9] '
  14 Based-on-patch-by:
  14 Original-patch-by:
  17 Tested-by:
 100 Suggested-by:
 121 Co-authored-by:
 163 Mentored-by:
 274 Reported-by:
 290 Acked-by:
 450 Helped-by:
 602 Reviewed-by:
14111 Signed-off-by:

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
ac9fff2bf1 SubmittingPatches: discourage new trailers
There seems to be consensus amongst the core Git community on a working
set of common trailers, and there are non-trivial costs to people
inventing new trailers (research to discover what they mean/how they
differ from existing trailers) such that inventing new ones is generally
unwarranted and not something to be recommended to new contributors.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
127106294a SubmittingPatches: drop ref to "What's in git.git"
"What's in git.git" was last seen in 2010:
  https://lore.kernel.org/git/?q=%22what%27s+in+git.git%22
  https://lore.kernel.org/git/7vaavikg72.fsf@alter.siamese.dyndns.org/

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
e6397c5cc8 CodingGuidelines: write punctuation marks
- Match style in Release Notes

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
2d194548cb CodingGuidelines: move period inside parentheses
The contents within parenthesis should be omittable without resulting
in broken text.

Eliding the parenthesis left a period to end a run without any content.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
e79552d197 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 14:52:28 -08:00
94e8e404a7 Merge branch 'ps/clone-into-reftable-repository'
"git clone" has been prepared to allow cloning a repository with
non-default hash function into a repository that uses the reftable
backend.

* ps/clone-into-reftable-repository:
  builtin/clone: create the refdb with the correct object format
  builtin/clone: skip reading HEAD when retrieving remote
  builtin/clone: set up sparse checkout later
  builtin/clone: fix bundle URIs with mismatching object formats
  remote-curl: rediscover repository when fetching refs
  setup: allow skipping creation of the refdb
  setup: extract function to create the refdb
2023-12-27 14:52:28 -08:00
6db745e1f5 Merge branch 'rs/t6300-compressed-size-fix'
Test fix.

* rs/t6300-compressed-size-fix:
  t6300: avoid hard-coding object sizes
2023-12-27 14:52:27 -08:00
deb67d12de Merge branch 'jx/fetch-atomic-error-message-fix'
"git fetch --atomic" issued an unnecessary empty error message,
which has been corrected.

* jx/fetch-atomic-error-message-fix:
  fetch: no redundant error message for atomic fetch
  t5574: test porcelain output of atomic fetch
2023-12-27 14:52:27 -08:00
a29e8b6059 Merge branch 'rs/c99-stdbool-test-balloon'
Test balloon to use C99 "bool" type from <stdbool.h>.

* rs/c99-stdbool-test-balloon:
  git-compat-util: convert skip_{prefix,suffix}{,_mem} to bool
2023-12-27 14:52:27 -08:00
aa6122ce52 Merge branch 'sp/test-i18ngrep'
Error message fix in the test framework.

* sp/test-i18ngrep:
  test-lib-functions.sh: fix test_grep fail message wording
2023-12-27 14:52:27 -08:00
c17ed4fe26 Merge branch 'jc/doc-misspelt-refs-fix'
Doc update.

* jc/doc-misspelt-refs-fix:
  doc: format.notes specify a ref under refs/notes/ hierarchy
2023-12-27 14:52:26 -08:00
f96fecc7c4 Merge branch 'jc/doc-most-refs-are-not-that-special'
Doc updates.

* jc/doc-most-refs-are-not-that-special:
  docs: MERGE_AUTOSTASH is not that special
  docs: AUTO_MERGE is not that special
  refs.h: HEAD is not that special
  git-bisect.txt: BISECT_HEAD is not that special
  git.txt: HEAD is not that special
2023-12-27 14:52:26 -08:00
b0d277d69f Merge branch 'es/add-doc-list-short-form-of-all-in-synopsis'
Doc update.

* es/add-doc-list-short-form-of-all-in-synopsis:
  git-add.txt: add missing short option -A to synopsis
2023-12-27 14:52:26 -08:00
9df9e3770a Merge branch 'jk/mailinfo-iterative-unquote-comment'
The code to parse the From e-mail header has been updated to avoid
recursion.

* jk/mailinfo-iterative-unquote-comment:
  mailinfo: avoid recursion when unquoting From headers
  t5100: make rfc822 comment test more careful
2023-12-27 14:52:26 -08:00
f6a129ceaf Merge branch 'ps/chainlint-self-check-update'
Test framework update.

* ps/chainlint-self-check-update:
  tests: adjust whitespace in chainlint expectations
2023-12-27 14:52:25 -08:00
73b1808fa3 Merge branch 'rs/show-ref-incompatible-options'
Code clean-up for sanity checking of command line options for "git
show-ref".

* rs/show-ref-incompatible-options:
  show-ref: use die_for_incompatible_opt3()
2023-12-27 14:52:25 -08:00
637e34a783 Merge branch 'ps/reftable-fixes'
Bunch of small fix-ups to the reftable code.

* ps/reftable-fixes:
  reftable/block: reuse buffer to compute record keys
  reftable/block: introduce macro to initialize `struct block_iter`
  reftable/merged: reuse buffer to compute record keys
  reftable/stack: fix use of unseeded randomness
  reftable/stack: fix stale lock when dying
  reftable/stack: reuse buffers when reloading stack
  reftable/stack: perform auto-compaction with transactional interface
  reftable/stack: verify that `reftable_stack_add()` uses auto-compaction
  reftable: handle interrupted writes
  reftable: handle interrupted reads
  reftable: wrap EXPECT macros in do/while
2023-12-27 14:52:25 -08:00
b7fbd2ab83 Merge branch 'jc/diff-cached-fsmonitor-fix'
The optimization based on fsmonitor in the "diff --cached"
codepath is resurrected with the "fake-lstat" introduced earlier.

* jc/diff-cached-fsmonitor-fix:
  diff-lib: fix check_removed() when fsmonitor is active
2023-12-27 14:52:25 -08:00
01f86ebb95 Merge branch 'jc/fake-lstat'
A new helper to let us pretend that we called lstat() when we know
our cache_entry is up-to-date via fsmonitor.

* jc/fake-lstat:
  cache: add fake_lstat()
2023-12-27 14:52:24 -08:00
db2cf6f3bb Merge branch 'jk/mailinfo-oob-read-fix'
OOB read fix.

* jk/mailinfo-oob-read-fix:
  mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()
2023-12-27 14:52:24 -08:00
f09e74175d Merge branch 'jc/checkout-B-branch-in-use'
"git checkout -B <branch> [<start-point>]" allowed a branch that is
in use in another worktree to be updated and checked out, which
might be a bit unexpected.  The rule has been tightened, which is a
breaking change.  "--ignore-other-worktrees" option is required to
unbreak you, if you are used to the current behaviour that "-B"
overrides the safety.

* jc/checkout-B-branch-in-use:
  checkout: forbid "-B <branch>" from touching a branch used elsewhere
  checkout: refactor die_if_checked_out() caller
2023-12-27 14:52:24 -08:00
45b625142d apply: code simplification
Rewrite a bit hard-to-read ternary ?: expression into a cascade of
if/else.

Given that read-cache.c:add_index_entry() makes sure that the
.ce_mode member is filled with a reasonable value before placing a
cache entry in the index, if we see (ce_mode == 0), there is
something seriously wrong going on.  Catch such a bug and abort,
instead of silently ignoring such an entry and silently skipping
the check.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 21:20:32 -08:00
01aff0ae85 apply: correctly reverse patch's pre- and post-image mode bits
When parsing the patch header, unless it is a patch that changes
file modes, we only read the mode bits into the .old_mode member of
the patch structure and leave .new_mode member as initialized, i.e.,
to 0.  Later when we need the original mode bits, we consult .old_mode.

However, reverse_patches() that is used to swap the names and modes
of the preimage and postimage files is not aware of this convention,
leading the .old_mode to be 0 while the mode we read from the patch
is left in .new_mode.

Only swap .old_mode and .new_mode when .new_mode is not 0 (i.e. we
saw a patch that modifies the filemode and know what the new mode
is).  When .new_mode is set to 0, it means the preimage and the
postimage files have the same mode (which is in the .old_mode member)
and when applying such a patch in reverse, the value in .old_mode is
what we expect the (reverse-) preimage file to have.

Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 21:20:32 -08:00
0482c32c33 apply: ignore working tree filemode when !core.filemode
When applying a patch that adds an executable file, git apply
ignores the core.fileMode setting (core.fileMode in git config
specifies whether the executable bit on files in the working tree
should be honored or not) resulting in warnings like:

warning: script.sh has type 100644, expected 100755

even when core.fileMode is set to false, which is undesired. This
is extra true for systems like Windows.

Fix this by inferring the correct file mode from either the existing
index entry, and when it is unavailable, assuming that the file mode
was OK by pretending it had the mode that the preimage wants to see,
when core.filemode is set to false. Add a test case that verifies
the change and prevents future regression.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 21:20:32 -08:00
53ded839ae sparse-checkout: use default patterns for 'set' only !stdin
"git sparse-checkout set ---no-cone" uses default patterns when none
is given from the command line, but it should do so ONLY when
--stdin is not being used.  Right now, add_patterns_from_input()
called when reading from the standard input is sloppy and does not
check if there are extra command line parameters that the command
will silently ignore, but that will change soon and not setting this
unnecessary and unused default patterns start to matter when it gets
fixed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:15:58 -08:00
f8ab66f9f3 sparse-checkout: be consistent with end of options markers
93851746 (parse-options: decouple "--end-of-options" and "--",
2023-12-06) updated the world order to make callers of parse-options
that set PARSE_OPT_KEEP_UNKNOWN_OPT responsible for deciding what to
do with "--end-of-options" they may see after parse_options() returns.

This made a previous bug in sparse-checkout more visible; namely,
that

  git sparse-checkout [add|set] --[no-]cone --end-of-options ...

would simply treat "--end-of-options" as one of the paths to include in
the sparse-checkout.  But this was already problematic before; namely,

  git sparse-checkout [add|set| --[no-]cone --sikp-checks ...

would not give an error on the mis-typed "--skip-checks" but instead
simply treat "--sikp-checks" as a path or pattern to include in the
sparse-checkout, which is highly unfriendly.

This behavior began when the command was converted to parse-options in
7bffca95ea (sparse-checkout: add '--stdin' option to set subcommand,
2019-11-21).  Back then it was just called KEEP_UNKNOWN. Later it was
renamed to KEEP_UNKNOWN_OPT in 99d86d60e5 (parse-options:
PARSE_OPT_KEEP_UNKNOWN only applies to --options, 2022-08-19) to clarify
that it was only about dashed options; we always keep non-option
arguments.  Looking at that original patch, both Peff and I think that
the author was simply confused about the mis-named option, and really
just wanted to keep the non-option arguments.  We never should have used
the flag all along (and the other cases were cargo-culted within the
file).

Remove the erroneous PARSE_OPT_KEEP_UNKNOWN_OPT flag now to fix this
bug.  Note that this does mean that anyone who might have been using

  git sparse-checkout [add|set] [--[no-]cone] --foo --bar

to request paths or patterns '--foo' and '--bar' will now have to use

  git sparse-checkout [add|set] [--[no-]cone] -- --foo --bar

That makes sparse-checkout more consistent with other git commands,
provides users much friendlier error messages and behavior, and is
consistent with the all-caps warning in git-sparse-checkout.txt that
this command "is experimental...its behavior...will likely change".  :-)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:15:25 -08:00
d57c671a51 treewide: remove unnecessary includes in source files
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:33 -08:00
ec2101abf3 treewide: add direct includes currently only pulled in transitively
The next commit will remove a bunch of unnecessary includes, but to do
so, we need some of the lower level direct includes that files rely on
to be explicitly specified.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
0a4d5b9772 trace2/tr2_tls.h: remove unnecessary include
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
e9bb166491 submodule-config.h: remove unnecessary include
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
545f7b50e8 pkt-line.h: remove unnecessary include
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
a28fe2d901 line-log.h: remove unnecessary include
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
f25e65e0fe http.h: remove unnecessary include
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
31d20faa90 fsmonitor--daemon.h: remove unnecessary includes
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
bd6cc1d9ec blame.h: remove unnecessary includes
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
c2c4138c07 archive.h: remove unnecessary include
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
147438e8a0 treewide: remove unnecessary includes from header files
There are three kinds of unnecessary includes:
  * includes which aren't directly needed, but which include some other
    forgotten include
  * includes which could be replaced by a simple forward declaration of
    some structs
  * includes which aren't needed at all

Remove the third kind of include.  Subsequent commits (and a subsequent
series) will work on removing some of the other kinds of includes.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
51e846e673 doc: enforce placeholders in documentation
Any string that is not meant to be used verbatim in the documentation
should be marked as a placeholder.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 11:06:57 -08:00
2162f9f6f8 doc: enforce dashes in placeholders
The CodingGuidelines documents stipulates that multi-word placeholders
are to be separated by dashes, not underscores nor spaces.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 11:06:55 -08:00
5b7eec4bc5 fast-import: use mem_pool_calloc()
Use mem_pool_calloc() to get a zeroed buffer instead of zeroing it
ourselves.  This makes the code clearer and less repetitive.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 11:06:23 -08:00
f546151228 t1006: add tests for %(objectsize:disk)
Back when we added this placeholder in a4ac106178 (cat-file: add
%(objectsize:disk) format atom, 2013-07-10), there were no tests,
claiming "[...]the exact numbers returned are volatile and subject to
zlib and packing decisions".

But we can use a little shell hackery to get the expected numbers
ourselves. To a certain degree this is just re-implementing what Git is
doing under the hood, but it is still worth doing. It makes sure we
exercise the %(objectsize:disk) code at all, and having the two
implementations agree gives us more confidence.

Note that our shell code assumes that no object appears twice (either in
two packs, or as both loose and packed), as then the results really are
undefined. That's OK for our purposes, and the test will notice if that
assumption is violated (the shell version would produce duplicate lines
that Git's output does not have).

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-21 10:37:46 -08:00
d6b6cd1393 archive: "--list" does not take further options
"git archive --list blah" should notice an extra command line
parameter that goes unused.  Make it so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-21 10:33:09 -08:00
2e13ed4671 Merge branch 'jk/end-of-options' into jc/sparse-checkout-set-add-end-of-options
* jk/end-of-options:
  parse-options: decouple "--end-of-options" and "--"
2023-12-20 21:49:33 -08:00
63956c553d Documentation/git-merge.txt: use backticks for command wrapping
As René found in the guidance from CodingGuidelines:

   Literal examples (e.g. use of command-line options, command names,
   branch names, URLs, pathnames (files and directories), configuration
   and environment variables) must be typeset in monospace (i.e. wrapped
   with backticks)

So all instances of single and double quotes for wraping said examples
were replaced with simple backticks.

Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-20 13:40:01 -08:00
dc18ead555 Documentation/git-merge.txt: fix reference to synopsis
437591a9d7 combined the synopsis of "The second syntax" (meaning `git
merge --abort`) and "The third syntax" (for `git merge --continue`) into
this single line:

       git merge (--continue | --abort | --quit)

but it was still referred to when describing the preconditions that have
to be fulfilled to run the respective actions. In other words:
References by number are no longer valid after a merge of some of the
synopses.

Also the previous version of the documentation did not acknowledge that
`--no-commit` would result in the precondition being fulfilled (thanks
to Elijah Newren and Junio C Hamano for pointing that out).

This change also groups `--abort` and `--continue` together when
explaining the prerequisites in order to avoid duplication.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-20 13:39:56 -08:00
de7c27a186 trailer: use offsets for trailer_start/trailer_end
Previously these fields in the trailer_info struct were of type "const
char *" and pointed to positions in the input string directly (to the
start and end positions of the trailer block).

Use offsets to make the intended usage less ambiguous. We only need to
reference the input string in format_trailer_info(), so update that
function to take a pointer to the input.

While we're at it, rename trailer_start to trailer_block_start to be
more explicit about these offsets (that they are for the entire trailer
block including other trailers). Ditto for trailer_end.

Reported-by: Glen Choo <glencbz@gmail.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-20 11:55:04 -08:00
97e9d0b78a trailer: find the end of the log message
Previously, trailer_info_get() computed the trailer block end position
by

(1) checking for the opts->no_divider flag and optionally calling
    find_patch_start() to find the "patch start" location (patch_start), and
(2) calling find_trailer_end() to find the end of the trailer block
    using patch_start as a guide, saving the return value into
    "trailer_end".

The logic in (1) was awkward because the variable "patch_start" is
misleading if there is no patch in the input. The logic in (2) was
misleading because it could be the case that no trailers are in the
input (yet we are setting a "trailer_end" variable before even searching
for trailers, which happens later in find_trailer_start()). The name
"find_trailer_end" was misleading because that function did not look for
any trailer block itself --- instead it just computed the end position
of the log message in the input where the end of the trailer block (if
it exists) would be (because trailer blocks must always come after the
end of the log message).

Combine the logic in (1) and (2) together into find_patch_start() by
renaming it to find_end_of_log_message(). The end of the log message is
the starting point which find_trailer_start() needs to start searching
backward to parse individual trailers (if any).

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-20 11:55:04 -08:00
7a75e131d6 Merge branch 'ps/clone-into-reftable-repository' into ps/refstorage-extension
* ps/clone-into-reftable-repository:
  builtin/clone: create the refdb with the correct object format
  builtin/clone: skip reading HEAD when retrieving remote
  builtin/clone: set up sparse checkout later
  builtin/clone: fix bundle URIs with mismatching object formats
  remote-curl: rediscover repository when fetching refs
  setup: allow skipping creation of the refdb
  setup: extract function to create the refdb
2023-12-20 10:19:58 -08:00
055bb6e996 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-20 10:15:09 -08:00
66e959f431 Merge branch 'jk/config-cleanup'
Code clean-up around use of configuration variables.

* jk/config-cleanup:
  sequencer: simplify away extra git_config_string() call
  gpg-interface: drop pointless config_error_nonbool() checks
  push: drop confusing configset/callback redundancy
  config: use git_config_string() for core.checkRoundTripEncoding
  diff: give more detailed messages for bogus diff.* config
  config: use config_error_nonbool() instead of custom messages
  imap-send: don't use git_die_config() inside callback
  git_xmerge_config(): prefer error() to die()
  config: reject bogus values for core.checkstat
2023-12-20 10:14:55 -08:00
2b9cbc6d01 Merge branch 'jk/implicit-true'
Some codepaths did not correctly parse configuration variables
specified with valueless "true", which has been corrected.

* jk/implicit-true:
  fsck: handle NULL value when parsing message config
  trailer: handle NULL value when parsing trailer-specific config
  submodule: handle NULL value when parsing submodule.*.branch
  help: handle NULL value for alias.* config
  trace2: handle NULL values in tr2_sysenv config callback
  setup: handle NULL value when parsing extensions
  config: handle NULL value when parsing non-bools
2023-12-20 10:14:54 -08:00
67dfb897b3 Merge branch 'jk/bisect-reset-fix'
"git bisect reset" has been taught to clean up state files and refs
even when BISECT_START file is gone.

* jk/bisect-reset-fix:
  bisect: always clean on reset
2023-12-20 10:14:54 -08:00
9eec6a1c5f Merge branch 'jk/end-of-options'
"git $cmd --end-of-options --rev -- --path" for some $cmd failed
to interpret "--rev" as a rev, and "--path" as a path.  This was
fixed for many programs like "reset" and "checkout".

* jk/end-of-options:
  parse-options: decouple "--end-of-options" and "--"
2023-12-20 10:14:54 -08:00
3c8f932d35 Merge branch 'rs/incompatible-options-messages'
Clean-up code that handles combinations of incompatible options.

* rs/incompatible-options-messages:
  worktree: simplify incompatibility message for --orphan and commit-ish
  worktree: standardize incompatibility messages
  clean: factorize incompatibility message
  revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
  revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
  repack: use die_for_incompatible_opt3() for -A/-k/--cruft
  push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror
2023-12-20 10:14:53 -08:00
c4bf868bee Merge branch 'jc/revision-parse-int'
The command line parser for the "log" family of commands was too
loose when parsing certain numbers, e.g., silently ignoring the
extra 'q' in "git log -n 1q" without complaining, which has been
tightened up.

* jc/revision-parse-int:
  revision: parse integer arguments to --max-count, --skip, etc., more carefully
2023-12-20 10:14:53 -08:00
2d09302a01 Merge branch 'mk/doc-gitfile-more'
Doc update.

* mk/doc-gitfile-more:
  doc: make the gitfile syntax easier to discover
2023-12-20 10:14:53 -08:00
a21a929643 Merge branch 'ps/ref-tests-update-more'
Tests update.

* ps/ref-tests-update-more:
  t6301: write invalid object ID via `test-tool ref-store`
  t5551: stop writing packed-refs directly
  t5401: speed up creation of many branches
  t4013: simplify magic parsing and drop "failure"
  t3310: stop checking for reference existence via `test -f`
  t1417: make `reflog --updateref` tests backend agnostic
  t1410: use test-tool to create empty reflog
  t1401: stop treating FETCH_HEAD as real reference
  t1400: split up generic reflog tests from the reffile-specific ones
  t0410: mark tests to require the reffiles backend
2023-12-20 10:14:53 -08:00
145336ec1d Merge branch 'jp/use-diff-index-in-pre-commit-sample'
The sample pre-commit hook that tries to catch introduction of new
paths that use potentially non-portable characters did not notice
an existing path getting renamed to such a problematic path, when
rename detection was enabled.

* jp/use-diff-index-in-pre-commit-sample:
  hooks--pre-commit: detect non-ASCII when renaming
2023-12-20 10:14:52 -08:00
425e7f0532 Merge branch 'en/complete-sparse-checkout'
Command line completion (in contrib/) learned to complete path
arguments to the "add/set" subcommands of "git sparse-checkout"
better.

* en/complete-sparse-checkout:
  completion: avoid user confusion in non-cone mode
  completion: avoid misleading completions in cone mode
  completion: fix logic for determining whether cone mode is active
  completion: squelch stray errors in sparse-checkout completion
2023-12-20 10:14:52 -08:00
45184afb4d rebase: use strvec_pushf() for format-patch revisions
In run_am(), a strbuf is used to create a revision argument that is then
added to the argument list for git format-patch using strvec_push().
Use strvec_pushf() to add it directly instead, simplifying the code
and plugging a small leak on the error code path.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-20 09:26:58 -08:00
5809004f26 Merge branch 'ps/reftable-fixes' into ps/reftable-fixes-and-optims
* ps/reftable-fixes:
  reftable/block: reuse buffer to compute record keys
  reftable/block: introduce macro to initialize `struct block_iter`
  reftable/merged: reuse buffer to compute record keys
  reftable/stack: fix use of unseeded randomness
  reftable/stack: fix stale lock when dying
  reftable/stack: reuse buffers when reloading stack
  reftable/stack: perform auto-compaction with transactional interface
  reftable/stack: verify that `reftable_stack_add()` uses auto-compaction
  reftable: handle interrupted writes
  reftable: handle interrupted reads
  reftable: wrap EXPECT macros in do/while
2023-12-20 08:21:50 -08:00
44dbb3bf29 completion: support pseudoref existence checks for reftables
In contrib/completion/git-completion.bash, there are a bunch of
instances where we read pseudorefs, such as HEAD, MERGE_HEAD,
REVERT_HEAD, and others via the filesystem. However, the upcoming
reftable refs backend won't use '.git/HEAD' at all but instead will
write an invalid refname as placeholder for backwards compatibility,
which will break the git-completion script.

Update the '__git_pseudoref_exists' function to:

1. Recognize the placeholder '.git/HEAD' written by the reftable
   backend (its content is specified in the reftable specs).
2. If reftable is in use, use 'git rev-parse' to determine whether the
    given ref exists.
3. Otherwise, continue to use 'test -f' to check for the ref's filename.

Signed-off-by: Stan Hu <stanhu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-19 15:11:58 -08:00
666270a2df completion: refactor existence checks for pseudorefs
In preparation for the reftable backend, this commit introduces a
'__git_pseudoref_exists' function that continues to use 'test -f' to
determine whether a given pseudoref exists in the local filesystem.

Signed-off-by: Stan Hu <stanhu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-19 15:11:56 -08:00
a762af3dfd remote.h: retire CAS_OPT_NAME
When the "--force-with-lease" option was introduced in 28f5d176
(remote.c: add command line option parser for "--force-with-lease",
2013-07-08), the design discussion revolved around the concept of
"compare-and-swap", and it can still be seen in the name used for
variables and helper functions.  The end-user facing option name
ended up to be a bit different, so during the development iteration
of the feature, we used this C preprocessor macro to make it easier
to rename it later.

All of that happened more than 10 years ago, and the flexibility
afforded by the CAS_OPT_NAME macro outlived its usefulness.  Inline
the constant string for the option name, like all other option names
in the code.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-19 11:27:04 -08:00
624eb90fa8 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 14:10:13 -08:00
c1aefb6d04 Merge branch 'jh/trace2-redact-auth'
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.

* jh/trace2-redact-auth:
  t0212: test URL redacting in EVENT format
  t0211: test URL redacting in PERF format
  trace2: redact passwords from https:// URLs by default
  trace2: fix signature of trace2_def_param() macro
2023-12-18 14:10:13 -08:00
78956864b0 Merge branch 'ad/merge-file-diff-algo'
"git merge-file" learned to take the "--diff-algorithm" option to
use algorithm different from the default "myers" diff.

* ad/merge-file-diff-algo:
  merge-file: add --diff-algorithm option
2023-12-18 14:10:13 -08:00
cacf27bf82 Merge branch 'rs/column-leakfix'
Leakfix.

* rs/column-leakfix:
  column: release strbuf and string_list after use
2023-12-18 14:10:13 -08:00
b3e223ddda Merge branch 'rs/i18n-cannot-be-used-together'
Clean-up code that handles combinations of incompatible options.

* rs/i18n-cannot-be-used-together:
  i18n: factorize even more 'incompatible options' messages
2023-12-18 14:10:12 -08:00
468d49634f Merge branch 'jb/reflog-expire-delete-dry-run-options'
Command line parsing fix for "git reflog".

* jb/reflog-expire-delete-dry-run-options:
  builtin/reflog.c: fix dry-run option short name
2023-12-18 14:10:12 -08:00
ec5ab1482d Merge branch 'js/update-urls-in-doc-and-comment'
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.

* js/update-urls-in-doc-and-comment:
  doc: refer to internet archive
  doc: update links for andre-simon.de
  doc: switch links to https
  doc: update links to current pages
2023-12-18 14:10:12 -08:00
66685e8555 Merge branch 'ps/commit-graph-less-paranoid'
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications.  The default has been
flipped to disable this pessimization.

* ps/commit-graph-less-paranoid:
  commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
2023-12-18 14:10:11 -08:00
02230b74e8 Merge branch 'cc/git-replay'
Introduce "git replay", a tool meant on the server side without
working tree to recreate a history.

* cc/git-replay:
  replay: stop assuming replayed branches do not diverge
  replay: add --contained to rebase contained branches
  replay: add --advance or 'cherry-pick' mode
  replay: use standard revision ranges
  replay: make it a minimal server side command
  replay: remove HEAD related sanity check
  replay: remove progress and info output
  replay: add an important FIXME comment about gpg signing
  replay: change rev walking options
  replay: introduce pick_regular_commit()
  replay: die() instead of failing assert()
  replay: start using parse_options API
  replay: introduce new builtin
  t6429: remove switching aspects of fast-rebase
2023-12-18 14:10:11 -08:00
3335365270 Merge branch 'ac/fuzz-show-date'
Subject approxidate() and show_date() machinery to OSS-Fuzz.

* ac/fuzz-show-date:
  fuzz: add new oss-fuzz fuzzer for date.c / date.h
2023-12-18 14:10:11 -08:00
71c746632a Merge branch 'ps/ref-deletion-updates'
Simplify API implementation to delete references by eliminating
duplication.

* ps/ref-deletion-updates:
  refs: remove `delete_refs` callback from backends
  refs: deduplicate code to delete references
  refs/files: use transactions to delete references
  t5510: ensure that the packed-refs file needs locking
2023-12-18 14:10:11 -08:00
f1c537705b Merge branch 'js/packfile-h-typofix'
Typofix.

* js/packfile-h-typofix:
  packfile.c: fix a typo in `each_file_in_pack_dir_fn()`'s declaration
2023-12-18 14:10:10 -08:00
7033d5479b pkt-line: do not chomp newlines for sideband messages
When calling "packet_read_with_status()" to parse pkt-line encoded
packets, we can turn on the flag "PACKET_READ_CHOMP_NEWLINE" to chomp
newline character for each packet for better line matching. But when
receiving data and progress information using sideband, we should turn
off the flag "PACKET_READ_CHOMP_NEWLINE" to prevent mangling newline
characters from data and progress information.

When both the server and the client support "sideband-all" capability,
we have a dilemma that newline characters in negotiation packets should
be removed, but the newline characters in the progress information
should be left intact.

Add new flag "PACKET_READ_USE_SIDEBAND" for "packet_read_with_status()"
to prevent mangling newline characters in sideband messages.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 13:24:38 -08:00
64220dc5f7 pkt-line: memorize sideband fragment in reader
When we turn on the "use_sideband" field of the packet_reader,
"packet_reader_read()" will call the function "demultiplex_sideband()"
to parse and consume sideband messages. Sideband fragment which does not
end with "\r" or "\n" will be saved in the sixth parameter "scratch"
and it can be reused and be concatenated when parsing another sideband
message.

In "packet_reader_read()" function, the local variable "scratch" can
only be reused by subsequent sideband messages. But if there is a
payload message between two sideband fragments, the first fragment
which is saved in the local variable "scratch" will be lost.

To solve this problem, we can add a new field "scratch" in
packet_reader to memorize the sideband fragment across different calls
of "packet_reader_read()".

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 13:24:37 -08:00
eaa82f8e98 test-pkt-line: add option parser for unpack-sideband
We can use the test helper program "test-tool pkt-line" to test pkt-line
related functions. E.g.:

 * Use "test-tool pkt-line send-split-sideband" to generate sideband
   messages.

 * Pipe these generated sideband messages to command "test-tool pkt-line
   unpack-sideband" to test packet_reader_read() function.

In order to make a complete test of the packet_reader_read() function,
add option parser for command "test-tool pkt-line unpack-sideband".

 * To remove newlines in sideband messages, we can use:

        $ test-tool pkt-line unpack-sideband --chomp-newline

 * To preserve newlines in sideband messages, we can use:

        $ test-tool pkt-line unpack-sideband --no-chomp-newline

 * To parse sideband messages using "demultiplex_sideband()" inside the
   function "packet_reader_read()", we can use:

        $ test-tool pkt-line unpack-sideband --reader-use-sideband

We also add new example sideband packets in send_split_sideband() and
add several new test cases in t0070. Among these test cases, we pipe
output of the "send-split-sideband" subcommand to the "unpack-sideband"
subcommand. We found two issues:

 1. The two splitted sideband messages "Hello," and " world!\n" should
    be concatenated together. But when we turn on use_sideband field of
    reader to parse sideband messages, the first part of the splitted
    message ("Hello,") is lost.

 2. The newline characters in sideband 2 (progress info) and sideband 3
    (error message) should be preserved, but they are both trimmed.

Will fix the above two issues in subsequent commits.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 13:24:37 -08:00
6d6f1cd7ee doc: format.notes specify a ref under refs/notes/ hierarchy
There is no 'ref/notes/' hierarchy.  '[format] notes = foo' uses notes
that are found in 'refs/notes/foo'.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 11:30:46 -08:00
37e8d795be test-lib-functions.sh: fix test_grep fail message wording
In the recent commit 2e87fca189 (test framework: further deprecate
test_i18ngrep, 2023-10-31), the test_i18ngrep function was
deprecated, and all the callers were updated to call the test_grep
function instead.  But test_grep inherited an error message that
still refers to test_i18ngrep by mistake.  Correct it so that a
broken call to the test_grep will identify itself as such.

Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 10:44:41 -08:00
8277dbe987 git-compat-util: convert skip_{prefix,suffix}{,_mem} to bool
Use the data type bool and its values true and false to document the
binary return value of skip_prefix() and friends more explicitly.

This first use of stdbool.h, introduced with C99, is meant to check
whether there are platforms that claim support for C99, as tested by
7bc341e21b (git-compat-util: add a test balloon for C99 support,
2021-12-01), but still lack that header for some reason.

A fallback based on a wider type, e.g. int, would have to deal with
comparisons somehow to emulate that any non-zero value is true:

   bool b1 = 1;
   bool b2 = 2;
   if (b1 == b2) puts("This is true.");

   int i1 = 1;
   int i2 = 2;
   if (i1 == i2) puts("Not printed.");
   #define BOOLEQ(a, b) (!(a) == !(b))
   if (BOOLEQ(i1, i2)) puts("This is true.");

So we'd be better off using bool everywhere without a fallback, if
possible.  That's why this patch doesn't include any.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 09:08:24 -08:00
18ce48918c fetch: no redundant error message for atomic fetch
If an error occurs during an atomic fetch, a redundant error message
will appear at the end of do_fetch(). It was introduced in b3a804663c
(fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17).

Because a failure message is displayed before setting retcode in the
function do_fetch(), calling error() on the err message at the end of
this function may result in redundant or empty error message to be
displayed.

We can remove the redundant error() function, because we know that
the function ref_transaction_abort() never fails. While we can find a
common pattern for calling ref_transaction_abort() by running command
"git grep -A1 ref_transaction_abort", e.g.:

    if (ref_transaction_abort(transaction, &error))
        error("abort: %s", error.buf);

Following this pattern, we can tolerate the return value of the function
ref_transaction_abort() being changed in the future. We also delay the
output of the err message to the end of do_fetch() to reduce redundant
code. With these changes, the test case "fetch porcelain output
(atomic)" in t5574 will also be fixed.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 08:30:33 -08:00
97d82b2963 t5574: test porcelain output of atomic fetch
The test case "fetch porcelain output" checks output of the fetch
command. The error output must be empty with the follow assertion:

    test_must_be_empty stderr

But this assertion fails if using atomic fetch. Refactor this test case
to use different fetch options by splitting it into three test cases.

  1. "setup for fetch porcelain output".

  2. "fetch porcelain output": for non-atomic fetch.

  3. "fetch porcelain output (atomic)": for atomic fetch.

Add new command "test_commit ..." in the first test case, so that if we
run these test cases individually (--run=4-6), "git rev-parse HEAD~"
command will work properly. Run the above test cases, we can find that
one test case has a known breakage, as shown below:

    ok 4 - setup for fetch porcelain output
    ok 5 - fetch porcelain output  # TODO known breakage vanished
    not ok 6 - fetch porcelain output (atomic) # TODO known breakage

The failed test case has an error message with only the error prompt but
no message body, as follows:

    'stderr' is not empty, it contains:
    error:

In a later commit, we will fix this issue.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18 08:30:32 -08:00
bc62d27d5c docs: MERGE_AUTOSTASH is not that special
A handful of manual pages called MERGE_AUTOSTASH a "special ref",
but there is nothing special about it.  It merely is yet another
pseudoref.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 14:08:28 -08:00
dada38646a docs: AUTO_MERGE is not that special
A handful of manual pages called AUTO_MERGE a "special ref", but
there is nothing special about it.  It merely is yet another
pseudoref.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 14:08:28 -08:00
7122f4f747 refs.h: HEAD is not that special
In-code comment explains pseudorefs but used a wrong nomenclature
"special ref".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 14:08:28 -08:00
2047b2c28c git-bisect.txt: BISECT_HEAD is not that special
The description of "git bisect --no-checkout" called BISECT_HEAD a
"special ref", but there is nothing special about it.  It merely is
yet another pseudoref.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 14:08:28 -08:00
d9a4bb3385 git.txt: HEAD is not that special
The introductory text in "git help git" that describes HEAD called
it "a special ref".  It is special compared to the more regular refs
like refs/heads/master and refs/tags/v1.0.0, but not that special,
unlike truly special ones like FETCH_HEAD.

Rewrite a few sentences to also introduce the distinction between a
regular ref that contain the object name and a symbolic ref that
contain the name of another ref.  Update the description of HEAD
that point at the current branch to use the more correct term, a
"symbolic ref".

This was found as part of auditing the documentation and in-code
comments for uses of "special ref" that refer merely a "pseudo ref".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 14:08:21 -08:00
68fcebfb1a git-add.txt: add missing short option -A to synopsis
With one exception, the synopsis for `git add` consistently lists the
short counterpart alongside the long-form of each option (for instance,
"[--edit | -e]"). The exception is that -A is not mentioned alongside
--all. Fix this inconsistency

Reported-by: Benjamin Lehmann <ben.lehmann@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 13:01:51 -08:00
647b5e0998 tests: adjust whitespace in chainlint expectations
The "check-chainlint" target runs automatically when running tests and
performs self-checks to verify that the chainlinter itself produces the
expected output. Originally, the chainlinter was implemented via sed,
but the infrastructure has been rewritten in fb41727b7e (t: retire
unused chainlint.sed, 2022-09-01) to use a Perl script instead.

The rewrite caused some slight whitespace changes in the output that are
ultimately not of much importance. In order to be able to assert that
the actual chainlinter errors match our expectations we thus have to
ignore whitespace characters when diffing them. As the `-w` flag is not
in POSIX we try to use `git diff -w --no-index` before we fall back to
`diff -w -u`.

To accomodate for cases where the host system has no Git installation we
use the locally-compiled version of Git. This can result in problems
though when the Git project's repository is using extensions that the
locally-compiled version of Git doesn't understand. It will refuse to
run and thus cause the checks to fail.

Instead of improving the detection logic, fix our ".expect" files so
that we do not need any post-processing at all anymore. This allows us
to drop the `-w` flag when diffing so that we can always use diff(1)
now.

Note that we keep some of the post-processing of `chainlint.pl` output
intact to strip leading line numbers generated by the script. Having
these would cause a rippling effect whenever we add a new test that
sorts into the middle of existing tests and would require us to
renumerate all subsequent lines, which seems rather pointless.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 08:36:14 -08:00
ba47d88795 t/perf: add performance tests for multi-pack reuse
To ensure that we don't regress either the size or runtime performance
of multi-pack reuse, add a performance test to measure both of these.

The test partitions the objects in GIT_TEST_PERF_LARGE_REPO into 1, 10,
and 100 packs, and then tries to perform a "clone" at each stage with
both single- and multi-pack reuse enabled.

Note that the `repack_into_n_chunks()` function in this new test script
differs from the existing `repack_into_n()`. The former partitions the
repository into N equal-sized chunks, while the latter produces N packs
of five commits each (plus their objects), and then another pack with
the remainder.

On git.git, I can produce the following results on my machine:

    Test                                                            this tree
    --------------------------------------------------------------------------------
    5332.3: clone for 1-pack scenario (single-pack reuse)           1.57(2.99+0.15)
    5332.4: clone size for 1-pack scenario (single-pack reuse)               231.8M
    5332.5: clone for 1-pack scenario (multi-pack reuse)            1.79(2.96+0.21)
    5332.6: clone size for 1-pack scenario (multi-pack reuse)                231.7M
    5332.9: clone for 10-pack scenario (single-pack reuse)          3.89(16.75+0.35)
    5332.10: clone size for 10-pack scenario (single-pack reuse)             209.9M
    5332.11: clone for 10-pack scenario (multi-pack reuse)          1.56(2.99+0.17)
    5332.12: clone size for 10-pack scenario (multi-pack reuse)              224.4M
    5332.15: clone for 100-pack scenario (single-pack reuse)        8.24(54.31+0.59)
    5332.16: clone size for 100-pack scenario (single-pack reuse)            278.3M
    5332.17: clone for 100-pack scenario (multi-pack reuse)         2.13(2.44+0.33)
    5332.18: clone size for 100-pack scenario (multi-pack reuse)             357.9M

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:09 -08:00
af626ac0e0 pack-bitmap: enable reuse from all bitmapped packs
Now that both the pack-bitmap and pack-objects code are prepared to
handle marking and using objects from multiple bitmapped packs for
verbatim reuse, allow marking objects from all bitmapped packs as
eligible for reuse.

Within the `reuse_partial_packfile_from_bitmap()` function, we no longer
only mark the pack whose first object is at bit position zero for reuse,
and instead mark any pack contained in the MIDX as a reuse candidate.

Provide a handful of test cases in a new script (t5332) exercising
interesting behavior for multi-pack reuse to ensure that we performed
all of the previous steps correctly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:09 -08:00
941074134c pack-objects: allow setting pack.allowPackReuse to "single"
In e704fc7978 (pack-objects: introduce pack.allowPackReuse, 2019-12-18),
the `pack.allowPackReuse` configuration option was introduced, allowing
users to disable the pack reuse mechanism.

To prepare for debugging multi-pack reuse, allow setting configuration
to "single" in addition to the usual bool-or-int values.

"single" implies the same behavior as "true", "1", "yes", and so on. But
it will complement a new "multi" value (to be introduced in a future
commit). When set to "single", we will only perform pack reuse on a
single pack, regardless of whether or not there are multiple MIDX'd
packs.

This requires no code changes (yet), since we only support single pack
reuse.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:09 -08:00
3bea0c0611 t/test-lib-functions.sh: implement test_trace2_data helper
Introduce a helper function which looks for a specific (category, key,
value) tuple in the output of a trace2 event stream.

We will use this function in a future patch to ensure that the expected
number of objects are reused from an expected number of packs.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:09 -08:00
54393e4e68 pack-objects: add tracing for various packfile metrics
As part of the multi-pack reuse effort, we will want to add some tests
that assert that we reused a certain number of objects from a certain
number of packs.

We could do this by grepping through the stderr output of
`pack-objects`, but doing so would be brittle in case the output format
changed.

Instead, let's use the trace2 mechanism to log various pieces of
information about the generated packfile, which we can then use to
compare against desired values.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:09 -08:00
519e17ff75 pack-bitmap: prepare to mark objects from multiple packs for reuse
Now that the pack-objects code is equipped to handle reusing objects
from multiple packs, prepare the pack-bitmap code to mark objects from
multiple packs as reuse candidates.

In order to prepare the pack-bitmap code for this change, remove the
same set of assumptions we unwound in previous commits from the helper
function `reuse_partial_packfile_from_bitmap_1()`, in preparation for it
to be called in a loop over the set of bitmapped packs in a following
commit.

Most importantly, we can no longer assume that the bit position
corresponding to the first object in a given reuse pack candidate is at
the beginning of the bitmap itself.

For the single pack that this assumption is still true for (in MIDX
bitmaps, this is the preferred pack, in single-pack bitmaps it is the
pack the bitmap is tied to), we can still use our whole-words
optimization.

But for all subsequent packs, we can not make use of this optimization,
since it assumes that all delta bases are being sent from the same pack,
which would break if we are sending OFS_DELTAs down to the client. To
understand why, consider two packs, P1 and P2 where:

  - P1 has object A which is a delta on base B
  - P2 has its own copy of B, in addition to other objects

Suppose that the MIDX which covers P1 and P2 selected its copy of A from
P1, but selected its copy of B from P2. Since A is a delta of B, but the
base was selected from a different pack, sending the bytes corresponding
to A as an OFS_DELTA verbatim from P1 would be incorrect, since we don't
guarantee that B is in the same place relative to A in the generated
pack as in P1.

For now, we detect and reject these cross-pack deltas by searching for
the (pack_id, offset) pair for the delta's base object (using the same
pack_id as the pack containing the delta'd object) in the MIDX. If we
find a match, that means that the MIDX did indeed pick the base object
from the same pack, and we are OK to reuse the delta.

If we don't find a match, however, that means that the base object was
selected from a different pack in the MIDX, and we can let the slower
path handle re-delta'ing our candidate object.

In the future, there are a couple of other things we could do, namely:

  - Turn any cross-pack deltas (which are stored as OFS_DELTAs) into
    REF_DELTAs. We already do this today when reusing an OFS_DELTA
    without `--delta-base-offset` enabled, so it's not a huge stretch to
    do the same for cross-pack deltas even when `--delta-base-offset` is
    enabled.

    This would work, but would obviously result in larger-than-necessary
    packs, as we in theory *could* represent these cross-pack deltas by
    patching an existing OFS_DELTA. But it's not clear how much that
    would matter in practice. I suspect it would have a lot to do with
    how you pack your repository in the first place.

  - Finally, we could patch OFS_DELTAs across packs in a similar fashion
    as we do today for OFS_DELTAs within a single pack on either side of
    a gap. This would result in the smallest packs of the three options
    here, but implementing this would be more involved.

    At minimum, you'd have to keep the reusable chunks list for all
    reused packs, not just the one we're currently processing. And you'd
    have to ensure that any bases which are a part of cross-pack deltas
    appear before the delta. I think this is possible to do, but would
    require assembling the reusable chunks list potentially in a
    different order than they appear in the source packs.

For now, let's pursue the simplest approach and reject any cross-pack
deltas.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:09 -08:00
dbd5c520d2 pack-revindex: implement midx_pair_to_pack_pos()
Now that we have extracted the `midx_key_to_pack_pos()` function, we can
implement the `midx_pair_to_pack_pos()` function which accepts (pack_id,
offset) tuples and returns an index into the psuedo-pack order.

This will be used in a following commit in order to figure out whether
or not the MIDX chose a given delta's base object from the same pack as
the delta resides in. It will do so by locating the base object's offset
in the pack, and then performing a binary search using the same pack ID
with the base object's offset.

If (and only if) it finds a match (at any position) we can guarantee
that the MIDX selected both halves of the delta/base pair from the same
pack.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
e1bfe30c4d pack-revindex: factor out midx_key_to_pack_pos() helper
The `midx_to_pack_pos()` function implements a binary search over
objects in the MIDX between lexical and pseudo-pack order. It does this
by taking in an index into the lexical order (i.e. the same argument
you'd use for `nth_midxed_object_id()` and similar) and spits out a
position in the pseudo-pack order.

This works for all callers, since they currently all are translating
from lexical order to pseudo-pack order. But future callers may want to
translate a known (offset, pack_id) tuple into an index into the
psuedo-pack order, without knowing where that (offset, pack_id) tuple
appears in lexical order.

Prepare for implementing a function that translates between a (offset,
pack_id) tuple into an index into the psuedo-pack order by extracting a
helper function which does just that, and then reimplementing
midx_to_pack_pos() in terms of it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
b1e3333068 midx: implement midx_preferred_pack()
When performing a binary search over the objects in a MIDX's bitmap
(i.e. in pseudo-pack order), the reader reconstructs the pseudo-pack
ordering using a combination of (a) the preferred pack, (b) the pack's
lexical position in the MIDX based on pack names, and (c) the object
offset within the pack.

In order to perform this binary search, the reader must know the
identity of the preferred pack. This could be stored in the MIDX, but
isn't for historical reasons, mostly because it can easily be inferred
at read-time by looking at the object in the first bit position and
finding out which pack it was selected from in the MIDX, like so:

    nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0));

In midx_to_pack_pos() which performs this binary search, we look up the
identity of the preferred pack before each search. This is relatively
quick, since it involves two table-driven lookups (one in the MIDX's
revindex for `pack_pos_to_midx()`, and another in the MIDX's object
table for `nth_midxed_pack_int_id()`).

But since the preferred pack does not change after the MIDX is written,
it is safe to cache this value on the MIDX itself.

Write a helper to do just that, and rewrite all of the existing
call-sites that care about the identity of the preferred pack in terms
of this new helper.

This will prepare us for a subsequent patch where we will need to binary
search through the MIDX's pseudo-pack order multiple times.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
ed9f41480a git-compat-util.h: implement checked size_t to uint32_t conversion
In a similar fashion as other checked cast functions in this header
(such as `cast_size_t_to_ulong()` and `cast_size_t_to_int()`), implement
a checked cast function for going from a size_t to a uint32_t value.

This function will be utilized in a future commit which needs to make
such a conversion.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
b96289a10b pack-objects: include number of packs reused in output
In addition to including the number of objects reused verbatim from a
reuse-pack, include the number of packs from which objects were reused.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
ca0fd69e37 pack-objects: prepare write_reused_pack_verbatim() for multi-pack reuse
The function `write_reused_pack_verbatim()` within
`builtin/pack-objects.c` is responsible for writing out a continuous
set of objects beginning at the start of the reuse packfile.

In the existing implementation, we did something like:

    while (pos < reuse_packfile_bitmap->word_alloc &&
           reuse_packfile_bitmap->words[pos] == (eword_t)~0)
      pos++;

    if (pos)
      /* write first `pos * BITS_IN_WORD` objects from pack */

as an optimization to record a single chunk for the longest continuous
prefix of objects wanted out of the reuse pack, instead of having a
chunk for each individual object. For more details, see bb514de356
(pack-objects: improve partial packfile reuse, 2019-12-18).

In order to retain this optimization in a multi-pack reuse world, we can
no longer assume that the first object in a pack is on a word boundary
in the bitmap storing the set of reusable objects.

Assuming that all objects from the beginning of the reuse packfile up to
the object corresponding to the first bit on a word boundary are part of
the result, consume whole words at a time until the last whole word
belonging to the reuse packfile. Copy those objects to the resulting
packfile, and track that we reused them by recording a single chunk.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
4805125710 pack-objects: prepare write_reused_pack() for multi-pack reuse
The function `write_reused_pack()` within `builtin/pack-objects.c` is
responsible for performing pack-reuse on a single pack, and has two main
functions:

  - it dispatches a call to `write_reused_pack_verbatim()` to see if we
    can reuse portions of the packfile in whole-word chunks

  - for any remaining objects (that is, any objects that appear after
    the first "gap" in the bitmap), call write_reused_pack_one() on that
    object to record it for reuse.

Prepare this function for multi-pack reuse by removing the assumption
that the bit position corresponding to the first object being reused
from a given pack must be at bit position zero.

The changes in this function are mostly straightforward. Initialize `i`
to the position of the first word to contain bits corresponding to that
reuse pack. In most situations, we throw the initialized value away,
since we end up replacing it with the return value from
write_reused_pack_verbatim(), moving us past the section of whole words
that we reused.

Likewise, modify the per-object loop to ignore any bits at the beginning
of the first word that do not belong to the pack currently being reused,
as well as skip to the "done" section once we have processed the last
bit corresponding to this pack.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
073b40eba0 pack-objects: pass bitmapped_pack's to pack-reuse functions
Further prepare pack-objects to perform verbatim pack-reuse over
multiple packfiles by converting functions that take in a pointer to a
`struct packed_git` to instead take in a pointer to a `struct
bitmapped_pack`.

The additional information found in the bitmapped_pack struct (such as
the bit position corresponding to the beginning of the pack) will be
necessary in order to perform verbatim pack-reuse.

Note that we don't use any of the extra pieces of information contained
in the bitmapped_pack struct, so this step is merely preparatory and
does not introduce any functional changes.

Note further that we do not change the argument type to
write_reused_pack_one(). That function is responsible for copying
sections of the packfile directly and optionally patching any OFS_DELTAs
to account for not reusing sections of the packfile in between a delta
and its base.

As such, that function is (and should remain) oblivious to multi-pack
reuse, and does not require any of the extra pieces of information
stored in the bitmapped_pack struct.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
d1d701eb9c pack-objects: keep track of pack_start for each reuse pack
When reusing objects from a pack, we keep track of a set of one or more
`reused_chunk`s, corresponding to sections of one or more object(s) from
a source pack that we are reusing. Each chunk contains two pieces of
information:

  - the offset of the first object in the source pack (relative to the
    beginning of the source pack)
  - the difference between that offset, and the corresponding offset in
    the pack we're generating

The purpose of keeping track of these is so that we can patch an
OFS_DELTAs that cross over a section of the reuse pack that we didn't
take.

For instance, consider a hypothetical pack as shown below:

                                                (chunk #2)
                                                __________...
                                               /
                                              /
      +--------+---------+-------------------+---------+
  ... | <base> | <other> |      (unused)     | <delta> | ...
      +--------+---------+-------------------+---------+
       \                /
        \______________/
           (chunk #1)

Suppose that we are sending objects "base", "other", and "delta", and
that the "delta" object is stored as an OFS_DELTA, and that its base is
"base". If we don't send any objects in the "(unused)" range, we can't
copy the delta'd object directly, since its delta offset includes a
range of the pack that we didn't copy, so we have to account for that
difference when patching and reassembling the delta.

In order to compute this value correctly, we need to know not only where
we are in the packfile we're assembling (with `hashfile_total(f)`) but
also the position of the first byte of the packfile that we are
currently reusing. Currently, this works just fine, since when reusing
only a single pack those two values are always identical (because
verbatim reuse is the first thing pack-objects does when enabled after
writing the pack header).

But when reusing multiple packs which have one or more gaps, we'll need
to account for these two values diverging.

Together, these two allow us to compute the reused chunk's offset
difference relative to the start of the reused pack, as desired.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
5e29c3f707 pack-objects: parameterize pack-reuse routines over a single pack
The routines pack-objects uses to perform verbatim pack-reuse are:

  - write_reused_pack_one()
  - write_reused_pack_verbatim()
  - write_reused_pack()

, all of which assume that there is exactly one packfile being reused:
the global constant `reuse_packfile`.

Prepare for reusing objects from multiple packs by making reuse packfile
a parameter of each of the above functions in preparation for calling
these functions in a loop with multiple packfiles.

Note that we still have the global "reuse_packfile", but pass it through
each of the above function's parameter lists, eliminating all but one
direct access (the top-level caller in `write_pack_file()`). Even after
this series, we will still have a global, but it will hold the array of
reusable packfiles, and we'll pass them one at a time to these functions
in a loop.

Note also that we will eventually need to pass a `bitmapped_pack`
instead of a `packed_git` in order to hold onto additional information
required for reuse (such as the bit position of the first object
belonging to that pack). But that change will be made in a future commit
so as to minimize the noise below as much as possible.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
83296d20e8 pack-bitmap: return multiple packs via reuse_partial_packfile_from_bitmap()
Further prepare for enabling verbatim pack-reuse over multiple packfiles
by changing the signature of reuse_partial_packfile_from_bitmap() to
populate an array of `struct bitmapped_pack *`'s instead of a pointer to
a single packfile.

Since the array we're filling out is sized dynamically[^1], add an
additional `size_t *` parameter which will hold the number of reusable
packs (equal to the number of elements in the array).

Note that since we still have not implemented true multi-pack reuse,
these changes aren't propagated out to the rest of the caller in
builtin/pack-objects.c.

In the interim state, we expect that the array has a single element, and
we use that element to fill out the static `reuse_packfile` variable
(which is a bog-standard `struct packed_git *`). Future commits will
continue to push this change further out through the pack-objects code.

[^1]: That is, even though we know the number of packs which are
  candidates for pack-reuse, we do not know how many of those
  candidates we can actually reuse.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
35e156b9de pack-bitmap: simplify reuse_partial_packfile_from_bitmap() signature
The signature of `reuse_partial_packfile_from_bitmap()` currently takes
in a bitmap, as well as three output parameters (filled through
pointers, and passed as arguments), and also returns an integer result.

The output parameters are filled out with: (a) the packfile used for
pack-reuse, (b) the number of objects from that pack that we can reuse,
and (c) a bitmap indicating which objects we can reuse. The return value
is either -1 (when there are no objects to reuse), or 0 (when there is
at least one object to reuse).

Some of these parameters are redundant. Notably, we can infer from the
bitmap how many objects are reused by calling bitmap_popcount(). And we
can similar compute the return value based on that number as well.

As such, clean up the signature of this function to drop the "*entries"
parameter, as well as the int return value, since the single caller of
this function can infer these values themself.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:08 -08:00
e5d48bf38b ewah: implement bitmap_is_empty()
In a future commit, we will want to check whether or not a bitmap has
any bits set in any of its words. The best way to do this (prior to the
existence of this patch) is to call `bitmap_popcount()` and check
whether the result is non-zero.

But this is semi-wasteful, since we do not need to know the exact number
of bits set, only whether or not there is at least one of them.

Implement a new helper function to check just that.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
dab60934e3 pack-bitmap: pass bitmapped_pack struct to pack-reuse functions
When trying to assemble a pack with bitmaps using `--use-bitmap-index`,
`pack-objects` asks the pack-bitmap machinery for a bitmap which
indicates the set of objects we can "reuse" verbatim from on-disk.

This set is roughly comprised of: a prefix of objects in the bitmapped
pack (or preferred pack, in the case of a multi-pack reachability
bitmap), plus any other objects not included in the prefix, excluding
any deltas whose base we are not sending in the resulting pack.

The pack-bitmap machinery is responsible for computing this bitmap, and
does so with the following functions:

  - reuse_partial_packfile_from_bitmap()
  - try_partial_reuse()

In the existing implementation, the first function is responsible for
(a) marking the prefix of objects in the reusable pack, and then (b)
calling try_partial_reuse() on any remaining objects to ensure that they
are also reusable (and removing them from the bitmapped set if they are
not).

Likewise, the `try_partial_reuse()` function is responsible for checking
whether an isolated object (that is, an object from the bitmapped
pack/preferred pack not contained in the prefix from earlier) may be
reused, i.e. that it isn't a delta of an object that we are not sending
in the resulting pack.

These functions are based on two core assumptions, which we will unwind
in this and the following commits:

  1. There is only a single pack from the bitmap which is eligible for
     verbatim pack-reuse. For single-pack bitmaps, this is trivially the
     bitmapped pack. For multi-pack bitmaps, this is (currently) the
     MIDX's preferred pack.

  2. The pack eligible for reuse has its first object in bit position 0,
     and all objects from that pack follow in pack-order from that first
     bit position.

In order to perform verbatim pack reuse over multiple packs, we must
unwind these two assumptions. Most notably, in order to reuse bits from
a given packfile, we need to know the first bit position occupied by
an object form that packfile. To propagate this information around, pass
a `struct bitmapped_pack *` anywhere we previously passed a `struct
packed_git *`, since the former contains the bitmap position we're
interested in (as well as a pointer to the latter).

As an additional step, factor out a sub-routine from the main
`reuse_partial_packfile_from_bitmap()` function, called
`reuse_partial_packfile_from_bitmap_1()`. This new function will be
responsible for figuring out which objects may be reused from a single
pack, and the existing function will dispatch multiple calls to its new
helper function for each reusable pack.

Consequently, `reuse_partial_packfile_from_bitmap()` will now maintain
an array of reusable packs instead of a single such pack. We currently
expect that array to have only a single element, so this awkward state
is short-lived. It will serve as useful scaffolding in subsequent
commits as we begin to work towards enabling multi-pack reuse.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
307d75bbe6 midx: implement midx_locate_pack()
The multi-pack index API exposes a `midx_contains_pack()` function that
takes in a string ending in either ".idx" or ".pack" and returns whether
or not the MIDX contains a given pack corresponding to that string.

There is no corresponding function to locate the position of a pack
within the MIDX's pack order (sorted lexically by pack filename).

We could add an optional out parameter to `midx_contains_pack()` that is
filled out with the pack's position when the parameter is non-NULL. To
minimize the amount of fallout from this change, instead introduce a new
function by renaming `midx_contains_pack()` to `midx_locate_pack()`,
adding that output parameter, and then reimplementing
`midx_contains_pack()` in terms of it.

Future patches will make use of this new function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
5f5ccd9595 midx: implement BTMP chunk
When a multi-pack bitmap is used to implement verbatim pack reuse (that
is, when verbatim chunks from an on-disk packfile are copied
directly[^1]), it does so by using its "preferred pack" as the source
for pack-reuse.

This allows repositories to pack the majority of their objects into a
single (often large) pack, and then use it as the single source for
verbatim pack reuse. This increases the amount of objects that are
reused verbatim (and consequently, decrease the amount of time it takes
to generate many packs). But this performance comes at a cost, which is
that the preferred packfile must pace its growth with that of the entire
repository in order to maintain the utility of verbatim pack reuse.

As repositories grow beyond what we can reasonably store in a single
packfile, the utility of verbatim pack reuse diminishes. Or, at the very
least, it becomes increasingly more expensive to maintain as the pack
grows larger and larger.

It would be beneficial to be able to perform this same optimization over
multiple packs, provided some modest constraints (most importantly, that
the set of packs eligible for verbatim reuse are disjoint with respect
to the subset of their objects being sent).

If we assume that the packs which we treat as candidates for verbatim
reuse are disjoint with respect to any of their objects we may output,
we need to make only modest modifications to the verbatim pack-reuse
code itself. Most notably, we need to remove the assumption that the
bits in the reachability bitmap corresponding to objects from the single
reuse pack begin at the first bit position.

Future patches will unwind these assumptions and reimplement their
existing functionality as special cases of the more general assumptions
(e.g. that reuse bits can start anywhere within the bitset, but happen
to start at 0 for all existing cases).

This patch does not yet relax any of those assumptions. Instead, it
implements a foundational data-structure, the "Bitampped Packs" (`BTMP`)
chunk of the multi-pack index. The `BTMP` chunk's contents are described
in detail here. Importantly, the `BTMP` chunk contains information to
map regions of a multi-pack index's reachability bitmap to the packs
whose objects they represent.

For now, this chunk is only written, not read (outside of the test-tool
used in this patch to test the new chunk's behavior). Future patches
will begin to make use of this new chunk.

[^1]: Modulo patching any `OFS_DELTA`'s that cross over a region of the
  pack that wasn't used verbatim.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
fba68184b8 midx: factor out fill_pack_info()
When selecting which packfiles will be written while generating a MIDX,
the MIDX internals fill out a 'struct pack_info' with various pieces of
book-keeping.

Instead of filling out each field of the `pack_info` structure
individually in each of the two spots that modify the array of such
structures (`ctx->info`), extract a common routine that does this for
us.

This reduces the code duplication by a modest amount. But more
importantly, it zero-initializes the structure before assigning values
into it. This hardens us for a future change which will add additional
fields to this structure which (until this patch) was not
zero-initialized.

As a result, any new fields added to the `pack_info` structure need only
be updated in a single location, instead of at each spot within midx.c.

There are no functional changes in this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
a96015a517 pack-bitmap: plug leak in find_objects()
The `find_objects()` function creates an object_list for any tips of the
reachability query which do not have corresponding bitmaps.

The object_list is not used outside of `find_objects()`, but we never
free it with `object_list_free()`, resulting in a leak. Let's plug that
leak by calling `object_list_free()`, which results in t6113 becoming
leak-free.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
6cdb67b97d pack-bitmap-write: deep-clear the bb_commit slab
The `bb_commit` commit slab is used by the pack-bitmap-write machinery
to track various pieces of bookkeeping used to generate reachability
bitmaps.

Even though we clear the slab when freeing the bitmap_builder struct
(with `bitmap_builder_clear()`), there are still pointers which point to
locations in memory that have not yet been freed, resulting in a leak.

Plug the leak by introducing a suitable `free_fn` for the `struct
bb_commit` type, and make sure it is called on each member of the slab
via the `deep_clear_bb_data()` function.

Note that it is possible for both of the arguments to `bitmap_free()` to
be NULL, but `bitmap_free()` is a noop for NULL arguments, so it is OK
to pass them unconditionally.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
66f0c71073 pack-objects: free packing_data in more places
The pack-objects internals use a packing_data struct to track what
objects are part of the pack(s) being formed.

Since these structures contain allocated fields, failing to
appropriately free() them results in a leak. Plug that leak by
introducing a clear_packing_data() function, and call it in the
appropriate spots.

This is a fairly straightforward leak to plug, since none of the callers
expect to read any values or have any references to parts of the address
space being freed.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:38:07 -08:00
dee182941f mailinfo: avoid recursion when unquoting From headers
Our unquote_comment() function is recursive; when it sees a comment
within a comment, like:

  (this is an (embedded) comment)

it recurses to handle the inner comment. This is fine for practical use,
but it does mean that you can easily run out of stack space with a
malicious header. For example:

  perl -e 'print "From: ", "(" x 2**18;' |
  git mailinfo /dev/null /dev/null

segfaults on my system. And since mailinfo is likely to be fed untrusted
input from the Internet (if not by human users, who might recognize a
garbage header, but certainly there are automated systems that apply
patches from a list) it may be possible for an attacker to trigger the
problem.

That said, I don't think there's an interesting security vulnerability
here. All an attacker can do is make it impossible to parse their email
and apply their patch, and there are lots of ways to generate bogus
emails. So it's more of an annoyance than anything.

But it's pretty easy to fix it. The recursion is not helping us preserve
any particular state from each level. The only flag in our parsing is
take_next_literally, and we can never recurse when it is set (since the
start of a new comment implies it was not backslash-escaped). So it is
really only useful for finding the end of the matched pair of
parentheses. We can do that easily with a simple depth counter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:33:52 -08:00
2d9396c2fe t5100: make rfc822 comment test more careful
When processing "From" headers in an email, mailinfo "unquotes" quoted
strings and rfc822 parenthesized comments. For quoted strings, we
actually remove the double-quotes, so:

  From: "A U Thor" <someone@example.com>

become:

  Author: A U Thor
  Email: someone@example.com

But for comments, we leave the outer parentheses in place, so:

  From: A U (this is a comment) Thor <someone@example.com>

becomes:

  Author: A U (this is a comment) Thor
  Email: someone@example.com

So what is the comment "unquoting" actually doing? In our code, being in
a comment section has exactly two effects:

  1. We'll unquote backslash-escaped characters inside a comment
     section.

  2. We _won't_ unquote double-quoted strings inside a comment section.

Our test for comments in t5100 checks this:

  From: "A U Thor" <somebody@example.com> (this is \(really\) a comment (honestly))

So it is covering (1), but not (2). Let's add in a quoted string to
cover this.

Moreover, because the comment appears at the end of the From header,
there's nothing to confirm that we correctly found the end of the
comment section (and not just the end-of-string). Let's instead move it
to the beginning of the header, which means we can confirm that the
existing quoted string is detected (which will only happen if we know
we've left the comment block).

As expected, the test continues to pass, but this will give us more
confidence as we refactor the code in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 14:33:50 -08:00
0a06892ddd bisect: consistently write BISECT_EXPECTED_REV via the refdb
We're inconsistently writing BISECT_EXPECTED_REV both via the filesystem
and via the refdb, which violates the newly established rules for how
special refs must be treated. This works alright in practice with the
reffiles reference backend, but will cause bugs once we gain additional
backends.

Fix this issue and consistently write BISECT_EXPECTED_REV via the refdb
so that it is no longer a special ref.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:27 -08:00
70c70de616 refs: complete list of special refs
We have some references that are more special than others. The reason
for them being special is that they either do not follow the usual
format of references, or that they are written to the filesystem
directly by the respective owning subsystem and thus circumvent the
reference backend.

This works perfectly fine right now because the reffiles backend will
know how to read those refs just fine. But with the prospect of gaining
a new reference backend implementation we need to be a lot more careful
here:

  - We need to make sure that we are consistent about how those refs are
    written. They must either always be written via the filesystem, or
    they must always be written via the reference backend. Any mixture
    will lead to inconsistent state.

  - We need to make sure that such special refs are always handled
    specially when reading them.

We're already mostly good with regard to the first item, except for
`BISECT_EXPECTED_REV` which will be addressed in a subsequent commit.
But the current list of special refs is missing some refs that really
should be treated specially. Right now, we only treat `FETCH_HEAD` and
`MERGE_HEAD` specially here.

Introduce a new function `is_special_ref()` that contains all current
instances of special refs to fix the reading path.

Note that this is only a temporary measure where we record and rectify
the current state. Ideally, the list of special refs should in the end
only contain `FETCH_HEAD` and `MERGE_HEAD` again because they both may
reference multiple objects and can contain annotations, so they indeed
are special.

Based-on-patch-by: Han-Wen Nienhuys <hanwenn@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:27 -08:00
668cdc043f refs: propagate errno when reading special refs fails
Some refs in Git are more special than others due to reasons explained
in the next commit. These refs are read via `refs_read_special_head()`,
but this function doesn't behave the same as when we try to read a
normal ref. Most importantly, we do not propagate `failure_errno` in the
case where the reference does not exist, which is behaviour that we rely
on in many parts of Git.

Fix this bug by propagating errno when `strbuf_read_file()` fails.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:26 -08:00
8f61321ccb wt-status: read HEAD and ORIG_HEAD via the refdb
We read both the HEAD and ORIG_HEAD references directly from the
filesystem in order to figure out whether we're currently splitting a
commit. If both of the following are true:

  - HEAD points to the same object as "rebase-merge/amend".

  - ORIG_HEAD points to the same object as "rebase-merge/orig-head".

Then we are currently splitting commits.

The current code only works by chance because we only have a single
reference backend implementation. Refactor it to instead read both refs
via the refdb layer so that we'll also be compatible with alternate
reference backends.

There are some subtleties involved here:

  - We pass `RESOLVE_REF_READING` so that a missing ref will cause
    `read_ref_full()` to return an error.

  - We pass `RESOLVE_REF_NO_RECURSE` so that we do not try to resolve
    symrefs. The old code didn't resolve symrefs either, and we only
    ever write object IDs into the refs in "rebase-merge/".

  - In the same spirit we verify that successfully-read refs are not
    symbolic refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:26 -08:00
b23285a921 checkout: forbid "-B <branch>" from touching a branch used elsewhere
"git checkout -B <branch> [<start-point>]", being a "forced" version
of "-b", switches to the <branch>, after optionally resetting its
tip to the <start-point>, even if the <branch> is in use in another
worktree, which is somewhat unexpected.

Protect the <branch> using the same logic that forbids "git checkout
<branch>" from touching a branch that is in use elsewhere.

This is a breaking change that may deserve backward compatibliity
warning in the Release Notes.  The "--ignore-other-worktrees" option
can be used as an escape hatch if the finger memory of existing
users depend on the current behaviour of "-B".

Reported-by: Willem Verstraeten <willem.verstraeten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-13 07:48:17 -08:00
fbc6526ea6 t6300: avoid hard-coding object sizes
f4ee22b526 (ref-filter: add tests for objectsize:disk, 2018-12-24)
hard-coded the expected object sizes.  Coincidentally the size of commit
and tag is the same with zlib at the default compression level.

1f5f8f3e85 (t6300: abstract away SHA-1-specific constants, 2020-02-22)
encoded the sizes as a single value, which coincidentally also works
with sha256.

Different compression libraries like zlib-ng may arrive at different
values.  Get them from the file system instead of hard-coding them to
make switching the compression library (or changing the compression
level) easier.

Reported-by: Ondrej Pohorelsky <opohorel@redhat.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 15:41:15 -08:00
d1bd3a8c34 mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()
When processing a header like a "From" line, mailinfo uses
unquote_quoted_pair() to handle double-quotes and rfc822 parenthesized
comments. It takes a NUL-terminated string on input, and loops over the
"in" pointer until it sees the NUL. When it finds the start of an
interesting block, it delegates to helper functions which also increment
"in", and return the updated pointer.

But there's a bug here: the helpers find the NUL with a post-increment
in the loop condition, like:

   while ((c = *in++) != 0)

So when they do see a NUL (rather than the correct termination of the
quote or comment section), they return "in" as one _past_ the NUL
terminator. And thus the outer loop in unquote_quoted_pair() does not
realize we hit the NUL, and keeps reading past the end of the buffer.

We should instead make sure to return "in" positioned at the NUL, so
that the caller knows to stop their loop, too. A hacky way to do this is
to return "in - 1" after leaving the inner loop. But a slightly cleaner
solution is to avoid incrementing "in" until we are sure it contained a
non-NUL byte (i.e., doing it inside the loop body).

The two tests here show off the problem. Since we check the output,
they'll _usually_ report a failure in a normal build, but it depends on
what garbage bytes are found after the heap buffer. Building with
SANITIZE=address reliably notices the problem. The outcome (both the
exit code and the exact bytes) are just what Git happens to produce for
these cases today, and shouldn't be taken as an endorsement. It might be
reasonable to abort on an unterminated string, for example. The priority
for this patch is fixing the out-of-bounds memory access.

Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 15:32:49 -08:00
18c9cb7524 builtin/clone: create the refdb with the correct object format
We're currently creating the reference database with a potentially
incorrect object format when the remote repository's object format is
different from the local default object format. This works just fine for
now because the files backend never records the object format anywhere.
But this logic will fail with any new reference backend that encodes
this information in some form either on-disk or in-memory.

The preceding commits have reshuffled code in git-clone(1) so that there
is no code path that will access the reference database before we have
detected the remote's object format. With these refactorings we can now
defer initialization of the reference database until after we have
learned the remote's object format and thus initialize it with the
correct format from the get-go.

These refactorings are required to make git-clone(1) work with the
upcoming reftable backend when cloning repositories with the SHA256
object format.

This change breaks a test in "t5550-http-fetch-dumb.sh" when cloning an
empty repository with `GIT_TEST_DEFAULT_HASH=sha256`. The test expects
the resulting hash format of the empty cloned repository to match the
default hash, but now we always end up with a sha1 repository. The
problem is that for dumb HTTP fetches, we have no easy way to figure out
the remote's hash function except for deriving it based on the hash
length of refs in `info/refs`. But as the remote repository is empty we
cannot rely on this detection mechanism.

Before the change in this commit we already initialized the repository
with the default hash function and then left it as-is. With this patch
we always use the hash function detected via the remote, where we fall
back to "sha1" in case we cannot detect it.

Neither the old nor the new behaviour are correct as we second-guess the
remote hash function in both cases. But given that this is a rather
unlikely edge case (we use the dumb HTTP protocol, the remote repository
uses SHA256 and the remote repository is empty), let's simply adapt the
test to assert the new behaviour. If we want to properly address this
edge case in the future we will have to extend the dumb HTTP protocol so
that we can properly detect the hash function for empty repositories.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
3c8f60c641 builtin/clone: skip reading HEAD when retrieving remote
After we have set up the remote configuration in git-clone(1) we'll call
`remote_get()` to read the remote from the on-disk configuration. But
next to reading the on-disk configuration, `remote_get()` will also
cause us to try and read the repository's HEAD reference so that we can
figure out the current branch. Besides being pointless in git-clone(1)
because we're operating in an empty repository anyway, this will also
break once we move creation of the reference database to a later point
in time.

Refactor the code to introduce a new `remote_get_early()` function that
will skip reading the HEAD reference to address this issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
360822a347 builtin/clone: set up sparse checkout later
When asked to do a sparse checkout, then git-clone(1) will spawn
`git sparse-checkout set` to set up the configuration accordingly. This
requires a proper Git repository or otherwise the command will fail. But
as we are about to move creation of the reference database to a later
point, this prerequisite will not hold anymore.

Move the logic to a later point in time where we know to have created
the reference database already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
9159029329 builtin/clone: fix bundle URIs with mismatching object formats
We create the reference database in git-clone(1) quite early before
connecting to the remote repository. Given that we do not yet know about
the object format that the remote repository uses at that point in time
the consequence is that the refdb may be initialized with the wrong
object format.

This is not a problem in the context of the files backend as we do not
encode the object format anywhere, and furthermore the only reference
that we write between initializing the refdb and learning about the
object format is the "HEAD" symref. It will become a problem though once
we land the reftable backend, which indeed does require to know about
the proper object format at the time of creation. We thus need to
rearrange the logic in git-clone(1) so that we only initialize the refdb
once we have learned about the actual object format.

As a first step, move listing of remote references to happen earlier,
which also allow us to set up the hash algorithm of the repository
earlier now. While we aim to execute this logic as late as possible
until after most of the setup has happened already, detection of the
object format and thus later the setup of the reference database must
happen before any other logic that may spawn Git commands or otherwise
these Git commands may not recognize the repository as such.

The first Git step where we expect the repository to be fully initalized
is when we fetch bundles via bundle URIs. Funny enough, the comments
there also state that "the_repository must match the cloned repo", which
is indeed not necessarily the case for the hash algorithm right now. So
in practice it is the right thing to detect the remote's object format
before downloading bundle URIs anyway, and not doing so causes clones
with bundle URIs to fail when the local default object format does not
match the remote repository's format.

Unfortunately though, this creates a new issue: downloading bundles may
take a long time, so if we list refs beforehand they might've grown
stale meanwhile. It is not clear how to solve this issue except for a
second reference listing though after we have downloaded the bundles,
which may be an expensive thing to do.

Arguably though, it's preferable to have a staleness issue compared to
being unable to clone a repository altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
bab2283ec6 remote-curl: rediscover repository when fetching refs
The reftable format encodes the hash function used by the repository
inside of its tables. The reftable backend thus needs to be initialized
with the correct hash function right from the start, or otherwise we may
end up writing tables with the wrong hash function. But git-clone(1)
initializes the reference database before learning about the hash
function used by the remote repository, which has never been a problem
with the reffiles backend.

To fix this, we'll have to change git-clone(1) to be more careful and
only create the reference backend once it learned about the remote hash
function. This creates a problem for git-remote-curl(1), which will then
be spawned at a time where the repository is not yet fully-initialized.
Consequentially, git-remote-curl(1) will fail to detect the repository,
which eventually causes it to error out once it is asked to fetch remote
objects.

We can address this issue by trying to re-discover the Git repository in
case none was detected at startup time. With this change, the clone will
look as following:

  1. git-clone(1) sets up the initial repository, excluding the
     reference database.

  2. git-clone(1) spawns git-remote-curl(1), which will be unable to
     detect the repository due to a missing "HEAD".

  3. git-clone(1) asks git-remote-curl(1) to list remote references.
     This works just fine as this step does not require a local
     repository

  4. git-clone(1) creates the reference database as it has now learned
     about the hash function.

  5. git-clone(1) asks git-remote-curl(1) to fetch the remote packfile.
     The latter notices that it doesn't have a repository available, but
     it now knows to try and re-discover it.

If the re-discovery succeeds in the last step we can continue with the
clone.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
56cd0334f7 setup: allow skipping creation of the refdb
Allow callers to skip creation of the reference database via a new flag
`INIT_DB_SKIP_REFDB`, which is required for git-clone(1) so that we can
create it at a later point once the object format has been discovered
from the remote repository.

Note that we also uplift the call to `create_reference_database()` into
`init_db()`, which makes it easier to handle the new flag for us. This
changes the order in which we do initialization so that we now set up
the Git configuration before we create the reference database. In
practice this move should not result in any change in behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
79543e760d setup: extract function to create the refdb
We're about to let callers skip creation of the reference database when
calling `init_db()`. Extract the logic into a standalone function so
that it becomes easier to do this refactoring.

While at it, expand the comment that explains why we always create the
"refs/" directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
c0cadb0576 reftable/block: reuse buffer to compute record keys
When iterating over entries in the block iterator we compute the key of
each of the entries and write it into a buffer. We do not reuse the
buffer though and thus re-allocate it on every iteration, which is
wasteful.

Refactor the code to reuse the buffer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:17 -08:00
a8305bc6d8 reftable/block: introduce macro to initialize struct block_iter
There are a bunch of locations where we initialize members of `struct
block_iter`, which makes it harder than necessary to expand this struct
to have additional members. Unify the logic via a new `BLOCK_ITER_INIT`
macro that initializes all members.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:17 -08:00
829231dc20 reftable/merged: reuse buffer to compute record keys
When iterating over entries in the merged iterator's queue, we compute
the key of each of the entries and write it into a buffer. We do not
reuse the buffer though and thus re-allocate it on every iteration,
which is wasteful given that we never transfer ownership of the
allocated bytes outside of the loop.

Refactor the code to reuse the buffer. This also fixes a potential
memory leak when `merged_iter_advance_subiter()` returns an error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
9abda98149 reftable/stack: fix use of unseeded randomness
When writing a new reftable stack, Git will first create the stack with
a random suffix so that concurrent updates will not try to write to the
same file. This random suffix is computed via a call to rand(3P). But we
never seed the function via srand(3P), which means that the suffix is in
fact always the same.

Fix this bug by using `git_rand()` instead, which does not need to be
initialized. While this function is likely going to be slower depending
on the platform, this slowness should not matter in practice as we only
use it when writing a new reftable stack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
3054fbd93e reftable/stack: fix stale lock when dying
When starting a transaction via `reftable_stack_init_addition()`, we
create a lockfile for the reftable stack itself which we'll write the
new list of tables to. But if we terminate abnormally e.g. via a call to
`die()`, then we do not remove the lockfile. Subsequent executions of
Git which try to modify references will thus fail with an out-of-date
error.

Fix this bug by registering the lock as a `struct tempfile`, which
ensures automatic cleanup for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
d779996a10 reftable/stack: reuse buffers when reloading stack
In `reftable_stack_reload_once()` we iterate over all the tables added
to the stack in order to figure out whether any of the tables needs to
be reloaded. We use a set of buffers in this context to compute the
paths of these tables, but discard those buffers on every iteration.
This is quite wasteful given that we do not need to transfer ownership
of the allocated buffer outside of the loop.

Refactor the code to instead reuse the buffers to reduce the number of
allocations we need to do. Note that we do not have to manually reset
the buffer because `stack_filename()` does this for us already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
5c086453ff reftable/stack: perform auto-compaction with transactional interface
Whenever updating references or reflog entries in the reftable stack, we
need to add a new table to the stack, thus growing the stack's length by
one. The stack can grow to become quite long rather quickly, leading to
performance issues when trying to read records. But besides performance
issues, this can also lead to exhaustion of file descriptors very
rapidly as every single table requires a separate descriptor when
opening the stack.

While git-pack-refs(1) fixes this issue for us by merging the tables, it
runs too irregularly to keep the length of the stack within reasonable
limits. This is why the reftable stack has an auto-compaction mechanism:
`reftable_stack_add()` will call `reftable_stack_auto_compact()` after
its added the new table, which will auto-compact the stack as required.

But while this logic works alright for `reftable_stack_add()`, we do not
do the same in `reftable_addition_commit()`, which is the transactional
equivalent to the former function that allows us to write multiple
updates to the stack atomically. Consequentially, we will easily run
into file descriptor exhaustion in code paths that use many separate
transactions like e.g. non-atomic fetches.

Fix this issue by calling `reftable_stack_auto_compact()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
15f98b602f reftable/stack: verify that reftable_stack_add() uses auto-compaction
While we have several tests that check whether we correctly perform
auto-compaction when manually calling `reftable_stack_auto_compact()`,
we don't have any tests that verify whether `reftable_stack_add()` does
call it automatically. Add one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
85a8c899ce reftable: handle interrupted writes
There are calls to write(3P) where we don't properly handle interrupts.
Convert them to use `write_in_full()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
917a2b3ce9 reftable: handle interrupted reads
There are calls to pread(3P) and read(3P) where we don't properly handle
interrupts. Convert them to use `pread_in_full()` and `read_in_full()`,
respectively.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:16 -08:00
e32b8ece64 reftable: wrap EXPECT macros in do/while
The `EXPECT` macros used by the reftable test framework are all using a
single `if` statement with the actual condition. This results in weird
syntax when using them in if/else statements like the following:

```
if (foo)
	EXPECT(foo == 2)
else
	EXPECT(bar == 2)
```

Note that there need not be a trailing semicolon. Furthermore, it is not
immediately obvious whether the else now belongs to the `if (foo)` or
whether it belongs to the expanded `if (foo == 2)` from the macro.

Fix this by wrapping the macros in a do/while loop.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:23:15 -08:00
7382497372 show-ref: use die_for_incompatible_opt3()
Use the standard message for reporting the use of multiple mutually
exclusive options by calling die_for_incompatible_opt3() instead of
rolling our own.  This has the benefits of showing only the actually
given options, reducing the number of strings to translate and making
the UI slightly more consistent.

Adjust the test to no longer insist on a specific order of the
reported options, as this implementation detail does not affect the
usefulness of the error message.

Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11 07:17:27 -08:00
1a87c842ec Start the 2.44 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 16:37:51 -08:00
1ef1cce9c2 Merge branch 'tz/send-email-negatable-options'
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email".  Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.

* tz/send-email-negatable-options:
  send-email: avoid duplicate specification warnings
  perl: bump the required Perl version to 5.8.1 from 5.8.0
2023-12-09 16:37:51 -08:00
f8f87e0827 Merge branch 'ak/rebase-autosquash'
"git rebase --autosquash" is now enabled for non-interactive rebase,
but it is still incompatible with the apply backend.

* ak/rebase-autosquash:
  rebase: rewrite --(no-)autosquash documentation
  rebase: support --autosquash without -i
  rebase: fully ignore rebase.autoSquash without -i
2023-12-09 16:37:50 -08:00
98d0a1f93e Merge branch 'vd/for-each-ref-unsorted-optimization'
"git for-each-ref --no-sort" still sorted the refs alphabetically
which paid non-trivial cost.  It has been redefined to show output
in an unspecified order, to allow certain optimizations to take
advantage of.

* vd/for-each-ref-unsorted-optimization:
  t/perf: add perf tests for for-each-ref
  ref-filter.c: use peeled tag for '*' format fields
  for-each-ref: clean up documentation of --format
  ref-filter.c: filter & format refs in the same callback
  ref-filter.c: refactor to create common helper functions
  ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
  ref-filter.h: add functions for filter/format & format-only
  ref-filter.h: move contains caches into filter
  ref-filter.h: add max_count and omit_empty to ref_format
  ref-filter.c: really don't sort when using --no-sort
2023-12-09 16:37:50 -08:00
e020e55a62 Merge branch 'ps/ban-a-or-o-operator-with-test'
Test and shell scripts clean-up.

* ps/ban-a-or-o-operator-with-test:
  Makefile: stop using `test -o` when unlinking duplicate executables
  contrib/subtree: convert subtree type check to use case statement
  contrib/subtree: stop using `-o` to test for number of args
  global: convert trivial usages of `test <expr> -a/-o <expr>`
2023-12-09 16:37:50 -08:00
4297485172 Merge branch 'ss/format-patch-use-encode-headers-for-cover-letter'
"git format-patch --encode-email-headers" ignored the option when
preparing the cover letter, which has been corrected.

* ss/format-patch-use-encode-headers-for-cover-letter:
  format-patch: fix ignored encode_email_headers for cover letter
2023-12-09 16:37:49 -08:00
340581bcf1 Merge branch 'ps/ref-tests-update'
Update ref-related tests.

* ps/ref-tests-update:
  t: mark several tests that assume the files backend with REFFILES
  t7900: assert the absence of refs via git-for-each-ref(1)
  t7300: assert exact states of repo
  t4207: delete replace references via git-update-ref(1)
  t1450: convert tests to remove worktrees via git-worktree(1)
  t: convert tests to not access reflog via the filesystem
  t: convert tests to not access symrefs via the filesystem
  t: convert tests to not write references via the filesystem
  t: allow skipping expected object ID in `ref-store update-ref`
2023-12-09 16:37:49 -08:00
d8b0ec44b1 Merge branch 'jw/git-add-attr-pathspec'
"git add" and "git stash" learned to support the ":(attr:...)"
magic pathspec.

* jw/git-add-attr-pathspec:
  attr: enable attr pathspec magic for git-add and git-stash
2023-12-09 16:37:49 -08:00
34401b7ddb Merge branch 'jk/chunk-bounds-more'
Code clean-up for jk/chunk-bounds topic.

* jk/chunk-bounds-more:
  commit-graph: mark chunk error messages for translation
  commit-graph: drop verify_commit_graph_lite()
  commit-graph: check order while reading fanout chunk
  commit-graph: use fanout value for graph size
  commit-graph: abort as soon as we see a bogus chunk
  commit-graph: clarify missing-chunk error messages
  commit-graph: drop redundant call to "lite" verification
  midx: check consistency of fanout table
  commit-graph: handle overflow in chunk_size checks
2023-12-09 16:37:48 -08:00
b4e6618fdf Merge branch 'js/ci-discard-prove-state'
The way CI testing used "prove" could lead to running the test
suite twice needlessly, which has been corrected.

* js/ci-discard-prove-state:
  ci: avoid running the test suite _twice_
2023-12-09 16:37:48 -08:00
14a4445d18 Merge branch 'ps/ci-gitlab'
Add support for GitLab CI.

* ps/ci-gitlab:
  ci: add support for GitLab CI
  ci: install test dependencies for linux-musl
  ci: squelch warnings when testing with unusable Git repo
  ci: unify setup of some environment variables
  ci: split out logic to set up failed test artifacts
  ci: group installation of Docker dependencies
  ci: make grouping setup more generic
  ci: reorder definitions for grouping functions
2023-12-09 16:37:48 -08:00
712177ed04 Merge branch 'js/doc-unit-tests-with-cmake'
Update the base topic to work with CMake builds.

* js/doc-unit-tests-with-cmake:
  cmake: handle also unit tests
  cmake: use test names instead of full paths
  cmake: fix typo in variable name
  artifacts-tar: when including `.dll` files, don't forget the unit-tests
  unit-tests: do show relative file paths
  unit-tests: do not mistake `.pdb` files for being executable
  cmake: also build unit tests
2023-12-09 16:37:47 -08:00
8bf6fbd00d Merge branch 'js/doc-unit-tests'
Process to add some form of low-level unit tests has started.

* js/doc-unit-tests:
  ci: run unit tests in CI
  unit tests: add TAP unit test framework
  unit tests: add a project plan document
2023-12-09 16:37:47 -08:00
0946acfb50 Merge branch 'ps/httpd-tests-on-nixos'
Portability tweak.

* ps/httpd-tests-on-nixos:
  t9164: fix inability to find basename(1) in Subversion hooks
  t/lib-httpd: stop using legacy crypt(3) for authentication
  t/lib-httpd: dynamically detect httpd and modules path
2023-12-09 16:37:47 -08:00
71a1e94821 revision: parse integer arguments to --max-count, --skip, etc., more carefully
The "rev-list" and other commands in the "log" family, being the
oldest part of the system, use their own custom argument parsers,
and integer values of some options are parsed with atoi(), which
allows a non-digit after the number (e.g., "1q") to be silently
ignored.  As a natural consequence, an argument that does not begin
with a digit (e.g., "q") silently becomes zero, too.

Switch to use strtol_i() and parse_timestamp() appropriately to
catch bogus input.

Note that one may naïvely expect that --max-count, --skip, etc., to
only take non-negative values, but we must allow them to also take
negative values, as an escape hatch to countermand a limit set by an
earlier option on the command line; the underlying variables are
initialized to (-1) and "--max-count=-1", for example, is a
legitimate way to reinitialize the limit.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:57:31 +09:00
ea8f9494ab sequencer: simplify away extra git_config_string() call
In our config callback, we call git_config_string() to copy the incoming
value string into a local string. But we don't modify or store that
string; we just look at it and then free it. We can make the code
simpler by just looking at the value passed into the callback.

Note that we do need to check for NULL, which is the one bit of logic
git_config_string() did for us. And I could even see an argument that we
are abstracting any error-checking of the value behind the
git_config_string() layer. But in practice no other callbacks behave
this way; it is standard to check for NULL and then just look at the
string directly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:23 +09:00
004c9432f7 gpg-interface: drop pointless config_error_nonbool() checks
Config callbacks which use git_config_string() or git_config_pathname()
have no need to check for a NULL value. This is handled automatically
by those helpers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:22 +09:00
37e8a341ea push: drop confusing configset/callback redundancy
We parse push config by calling git_config() with our git_push_config()
callback. But inside that callback, when we see "push.gpgsign", we
ignore the value passed into the callback and instead make a new call to
git_config_get_value().

This is unnecessary at best, and slightly wrong at worst (if there are
multiple instances, get_value() only returns one; both methods end up
with last-one-wins, but we'd fail to report errors if earlier
incarnations were bogus).

The call was added by 68c757f219 (push: add a config option push.gpgSign
for default signed pushes, 2015-08-19). That commit doesn't give any
reason to deviate from the usual strategy here; it was probably just
somebody unfamiliar with our config API and its conventions.

It also added identical code to builtin/send-pack.c, which also handles
push.gpgsign.

And then the same issue spread to its neighbor in b33a15b081 (push: add
recurseSubmodules config option, 2015-11-17), presumably via
cargo-culting.

This patch fixes all three to just directly use the value provided to
the callback. While I was adjusting the code to do so, I noticed that
push.gpgsign is overly careful about a NULL value. After
git_parse_maybe_bool() has returned anything besides 1, we know that the
value cannot be NULL (if it were, it would be an implicit "true", and
many callers of maybe_bool rely on that). Here that lets us shorten "if
(v && !strcasecmp(v, ...))" to just "if (!strcasecmp(v, ...))".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:22 +09:00
be6bc048d7 config: use git_config_string() for core.checkRoundTripEncoding
Since this code path was recently converted to check for a NULL value,
it now behaves exactly like git_config_string(). We can shorten the code
a bit by using that helper.

Note that git_config_string() takes a const pointer, but our storage
variable is non-const. We're better off making this "const", though,
since the default value points to a string literal (and thus it would be
an error if anybody tried to write to it).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:22 +09:00
0824879078 diff: give more detailed messages for bogus diff.* config
The config callbacks for a few diff.* variables simply return -1 when we
encounter an error. The message you get mentions the offending location,
like:

  fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7

but is vague about "bad" (as it must be, since the message comes from
the generic config code). Most callbacks add their own messages here, so
let's do the same. E.g.:

  error: unknown value for config 'diff.algorithm': foo
  fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7

I've written the string in a way that should be reusable for
translators, and matches another similar message in transport.c (there
doesn't yet seem to be a popular generic message to reuse here, so
hopefully this will get the ball rolling).

Note that in the case of diff.algorithm, our parse_algorithm_value()
helper does detect a NULL value string. But it's still worth detecting
it ourselves here, since we can give a more specific error message (and
which is the usual one for unexpected implicit-bool values).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:22 +09:00
92cecce0de config: use config_error_nonbool() instead of custom messages
A few config callbacks use their own custom messages to report an
unexpected implicit bool like:

  [merge "foo"]
  driver

These should just use config_error_nonbool(), so the user sees
consistent messages.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:22 +09:00
0dda4ce9f6 imap-send: don't use git_die_config() inside callback
The point of git_die_config() is to let configset users mention the
file/line info for invalid config, like:

  if (!git_config_get_int("foo.bar", &value)) {
	if (!is_ok(value))
		git_die_config("foo.bar");
  }

Using it from within a config callback is unnecessary, because we can
simply return an error, at which point the config machinery will mention
the file/line of the offending variable. Worse, using git_die_config()
can actually produce the wrong location when the key is found in
multiple spots. For instance, with config like:

  [imap]
  host
  host = foo

we'll report the line number of the "host = foo" line, but the problem
is on the implicit-bool "host" line.

We can fix it by just returning an error code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:21 +09:00
22e27413ee git_xmerge_config(): prefer error() to die()
When parsing merge config, a few code paths die on error. It's
preferable for us to call error() here, because the resulting error
message from the config parsing code contains much more detail.

For example, before:

  fatal: unknown style 'bogus' given for 'merge.conflictstyle'

and after:

  error: unknown style 'bogus' given for 'merge.conflictstyle'
  fatal: bad config variable 'merge.conflictstyle' in file '.git/config' at line 7

Since we're touching these lines, I also marked them for translation.
There's no reason they shouldn't behave like most other config-parsing
errors.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:21 +09:00
41f98fae02 config: reject bogus values for core.checkstat
If you feed nonsense config like:

  git -c core.checkstat=foobar status

we'll silently ignore the unknown value, rather than reporting an error.
This goes all the way back to c08e4d5b5c (Enable minimal stat checking,
2013-01-22).

Detecting and complaining now is technically a backwards-incompatible
change, but I don't think anybody has any reason to use an invalid value
here. There are no historical values we'd want to allow for backwards
compatibility or anything like that. We are better off loudly telling
the user that their config may not be doing what they expect.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:21 +09:00
d49cb162fa fsck: handle NULL value when parsing message config
When parsing fsck.*, receive.fsck.*, or fetch.fsck.*, we don't check for
an implicit bool. So any of:

  [fsck]
  badTree
  [receive "fsck"]
  badTree
  [fetch "fsck"]
  badTree

will cause us to segfault. We can fix it with config_error_nonbool() in
the usual way, but we have to make a few more changes to get good error
messages. The problem is that all three spots do:

  if (skip_prefix(var, "fsck.", &var))

to match and parse the actual message id. But that means that "var" now
just says "badTree" instead of "receive.fsck.badTree", making the
resulting message confusing. We can fix that by storing the parsed
message id in its own separate variable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:47 +09:00
1b274c9834 trailer: handle NULL value when parsing trailer-specific config
When parsing the "key", "command", and "cmd" trailer config, we just
make a copy of the value string.  If we see an implicit bool like:

  [trailer "foo"]
  key

we'll segfault trying to copy a NULL pointer. We can fix this with the
usual config_error_nonbool() check.

I split this out from the other vanilla cases, because at first glance
it looks like a better fix here would be to move the NULL check out of
the switch statement. But it would change the behavior of other keys
like trailer.*.ifExists, where an implicit bool is interpreted as
EXISTS_DEFAULT.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:47 +09:00
34b1a0d320 submodule: handle NULL value when parsing submodule.*.branch
We record the submodule branch config value as a string, so config that
uses an implicit bool like:

  [submodule "foo"]
  branch

will cause us to segfault. Note that unlike most other config-parsing
bugs of this class, this can be triggered by parsing a bogus .gitmodules
file (which we might do after cloning a malicious repository).

I don't think the security implications are important, though. It's
always a strict NULL dereference, not an out-of-bounds read or write. So
we should reliably kill the process. That may be annoying, but the
impact is limited to the attacker preventing the victim from
successfully using "git clone --recurse-submodules", etc, on the
malicious repo.

The "branch" entry is the only one with this problem; other strings like
"path" and "url" already check for NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:47 +09:00
89086c9466 help: handle NULL value for alias.* config
When showing all config with "git help --all", we print the list of
defined aliases. But our config callback to do so does not check for a
NULL value, meaning a config block like:

  [alias]
  foo

will cause us to segfault. We should detect and complain about this in
the usual way.

Since this command is purely informational (and we aren't trying to run
the alias), we could perhaps just generate a warning and continue. But
this sort of misconfiguration should be pretty rare, and the error
message we will produce points directly to the line of config that needs
to be fixed. So just generating the usual error should be OK.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:47 +09:00
24942ef316 trace2: handle NULL values in tr2_sysenv config callback
If you have config with an implicit bool like:

  [trace2]
  envvars

we'll segfault, as we unconditionally try to xstrdup() the value. We
should instead detect and complain, as a boolean value has no meaning
here. The same is true for every variable in tr2_sysenv_settings (and
this patch covers them all, as we check them in a loop).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:47 +09:00
a62712696e setup: handle NULL value when parsing extensions
The "partialclone" extension config records a string, and hence it is an
error to have an implicit bool like:

  [extensions]
  partialclone

in your config. We should recognize and reject this, rather than
segfaulting (which is the current behavior). Note that it's OK to use
config_error_nonbool() here, even though the return value is an enum. We
explicitly document EXTENSION_ERROR as -1 for compatibility with
error(), etc.

This is the only extension value that has this problem. Most of the
others are bools that interpret this value naturally. The exception is
extensions.objectformat, which does correctly check for NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:46 +09:00
ba176db511 config: handle NULL value when parsing non-bools
When the config parser sees an "implicit" bool like:

  [core]
  someVariable

it passes NULL to the config callback. Any callback code which expects a
string must check for NULL. This usually happens via helpers like
git_config_string(), etc, but some custom code forgets to do so and will
segfault.

These are all fairly vanilla cases where the solution is just the usual
pattern of:

  if (!value)
        return config_error_nonbool(var);

though note that in a few cases we have to split initializers like:

  int some_var = initializer();

into:

  int some_var;
  if (!value)
        return config_error_nonbool(var);
  some_var = initializer();

There are still some broken instances after this patch, which I'll
address on their own in individual patches after this one.

Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:39 +09:00
daaa03e54c bisect: always clean on reset
Usually "bisect reset" cleans up any refs/bisect/ refs, along with
meta-files like .git/BISECT_LOG. But it only does so after deciding that
a bisection is active, which it does by reading BISECT_START. This is
usually fine, but it's possible to get into a confusing state if the
BISECT_START file is gone, but other cruft is left (this might be due to
a bug, or a system crash, etc).

And since "bisect reset" refuses to do anything in this state, the user
has no easy way to clean up the leftover cruft. While another "bisect
start" would clear the state, in the interim it can be annoying, as
other tools (like our bash prompt code) think we are bisecting, and
for-each-ref output may be polluted with refs/bisect/ entries.

Further adding to the confusion is that running "bisect reset $some_ref"
skips the BISECT_START check. So it never realizes that there's no
bisection active and does the cleanup anyway!

So let's just make sure we always do the cleanup, whether we looked at
BISECT_START or not. If the user doesn't give us a commit to reset to,
we'll still say "We are not bisecting" and skip the call to "git
checkout".

Reported-by: Janik Haag <janik@aq0.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:21:31 +09:00
9385174627 parse-options: decouple "--end-of-options" and "--"
When we added generic end-of-options support in 51b4594b40
(parse-options: allow --end-of-options as a synonym for "--",
2019-08-06), we made them true synonyms. They both stop option parsing,
and they are both returned in the resulting argv if the KEEP_DASHDASH
flag is used.

The hope was that this would work for all callers:

  - most generic callers would not pass KEEP_DASHDASH, and so would just
    do the right thing (stop parsing there) without needing to know
    anything more.

  - callers with KEEP_DASHDASH were generally going to rely on
    setup_revisions(), which knew to handle --end-of-options specially

But that turned out miss quite a few cases that pass KEEP_DASHDASH but
do their own manual parsing. For example, "git reset", "git checkout",
and so on want pass KEEP_DASHDASH so they can support:

  git reset $revs -- $paths

but of course aren't going to actually do a traversal, so they don't
call setup_revisions(). And those cases currently get confused by
--end-of-options being left in place, like:

   $ git reset --end-of-options HEAD
   fatal: option '--end-of-options' must come before non-option arguments

We could teach each of these callers to handle the leftover option
explicitly. But let's try to be a bit more clever and see if we can
solve it centrally in parse-options.c.

The bogus assumption here is that KEEP_DASHDASH tells us the caller
wants to see --end-of-options in the result. But really, the callers
which need to know that --end-of-options was reached are those that may
potentially parse more options from argv. In other words, those that
pass the KEEP_UNKNOWN_OPT flag.

If such a caller is aware of --end-of-options (e.g., because they call
setup_revisions() with the result), then this will continue to do the
right thing, treating anything after --end-of-options as a non-option.

And if the caller is not aware of --end-of-options, they are better off
keeping it intact, because either:

  1. They are just passing the options along to somebody else anyway, in
     which case that somebody would need to know about the
     --end-of-options marker.

  2. They are going to parse the remainder themselves, at which point
     choking on --end-of-options is much better than having it silently
     removed. The point is to avoid option injection from untrusted
     command line arguments, and bailing is better than quietly treating
     the untrusted argument as an option.

This fixes bugs with --end-of-options across several commands, but I've
focused on two in particular here:

  - t7102 confirms that "git reset --end-of-options --foo" now works.
    This checks two things. One, that we no longer barf on
    "--end-of-options" itself (which previously we did, even if the rev
    was something vanilla like "HEAD" instead of "--foo"). And two, that
    we correctly treat "--foo" as a revision rather than an option.

    This fix applies to any other cases which pass KEEP_DASHDASH but not
    KEEP_UNKNOWN_OPT, like "git checkout", "git check-attr", "git grep",
    etc, which would previously choke on "--end-of-options".

  - t9350 shows the opposite case: fast-export passed KEEP_UNKNOWN_OPT
    but not KEEP_DASHDASH, but then passed the result on to
    setup_revisions(). So it never saw --end-of-options, and would
    erroneously parse "fast-export --end-of-options --foo" as having a
    "--foo" option. This is now fixed.

Note that this does shut the door for callers which want to know if we
hit end-of-options, but don't otherwise need to keep unknown opts. The
obvious thing here is feeding it to the DWIM verify_filename()
machinery. And indeed, this is a problem even for commands which do
understand --end-of-options already. For example, without this patch,
you get:

  $ git log --end-of-options --foo
  fatal: option '--foo' must come before non-option arguments

because we refuse to accept "--foo" as a filename (because it starts
with a dash) even though we could know that we saw end-of-options. The
verify_filename() function simply doesn't accept this extra information.

So that is the status quo, and this patch doubles down further on that.
Commands like "git reset" have the same problem, but they won't even
know that parse-options saw --end-of-options! So even if we fixed
verify_filename(), they wouldn't have anything to pass to it.

But in practice I don't think this is a big deal. If you are being
careful enough to use --end-of-options, then you should also be using
"--" to disambiguate and avoid the DWIM behavior in the first place. In
other words, doing:

  git log --end-of-options --this-is-a-rev -- --this-is-a-path

works correctly, and will continue to do so. And likewise, with this
patch now:

  git reset --end-of-options --this-is-a-rev -- --this-is-a-path

will work, as well.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:21:02 +09:00
792b86283b worktree: simplify incompatibility message for --orphan and commit-ish
Use a single translatable string to report that the worktree add option
--orphan is incompatible with a commit-ish instead of having the
commit-ish in a separate translatable string.  This reduces the number
of strings to translate and gives translators the full context.

A similar message is used in builtin/describe.c, but with the plural of
commit-ish, and here we need the singular form.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:41:03 +09:00
62bc6dd33c worktree: standardize incompatibility messages
Use the standard parameterized message for reporting incompatible
options for worktree add.  This reduces the number of strings to
translate and makes the UI slightly more consistent.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:41:03 +09:00
f5f9e972bd clean: factorize incompatibility message
Use the standard parameterized message for reporting incompatible
options to inform users that they can't use -x and -X together.  This
reduces the number of strings to translate and makes the UI slightly
more consistent.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:41:03 +09:00
81fb70f55e revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
Use the standard parameterized message for reporting incompatible
options to report options that are not accepted in combination with
--exclude-hidden.  This reduces the number of strings to translate and
makes the UI a bit more consistent.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:41:03 +09:00
fa518aef56 revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
The revision option --reverse is incompatible with --walk-reflogs and
--graph is incompatible with both --reverse and --walk-reflogs.  So they
are all incompatible with each other.

Use the function for checking three mutually incompatible options,
die_for_incompatible_opt3(), to perform this check in one place and
without repetition.  This is shorter and clearer.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:40:44 +09:00
1241800867 repack: use die_for_incompatible_opt3() for -A/-k/--cruft
The repack option --keep-unreachable is incompatible with -A, --cruft is
incompatible with -A and -k, and -k is short for --keep-unreachable.  So
they are all incompatible with each other.

Use the function for checking three mutually incompatible options,
die_for_incompatible_opt3(), to perform this check in one place and
without repetition.  This is shorter and clearer.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:39:12 +09:00
b3bf4701cf push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror
The push option --delete is incompatible with --all, --mirror, and
--tags; --tags is incompatible with --all and --mirror; --all is
incompatible with --mirror.  This means they are all incompatible with
each other.  And --branches is an alias for --all.

Use the function for checking four mutually incompatible options,
die_for_incompatible_opt4(), to perform this check in one place and
without repetition.  This is shorter and clearer.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 07:39:11 +09:00
a1fbe26a0c completion: avoid user confusion in non-cone mode
It is tempting to think of "files and directories" of the current
directory as valid inputs to the add and set subcommands of git
sparse-checkout.  However, in non-cone mode, they often aren't and using
them as potential completions leads to *many* forms of confusion:

Issue #1. It provides the *wrong* files and directories.

For
    git sparse-checkout add
we always want to add files and directories not currently in our sparse
checkout, which means we want file and directories not currently present
in the current working tree.  Providing the files and directories
currently present is thus always wrong.

For
    git sparse-checkout set
we have a similar problem except in the subset of cases where we are
trying to narrow our checkout to a strict subset of what we already
have.  That is not a very common scenario, especially since it often
does not even happen to be true for the first use of the command; for
years we required users to create a sparse-checkout via
    git sparse-checkout init
    git sparse-checkout set <args...>
(or use a clone option that did the init step for you at clone time).
The init command creates a minimal sparse-checkout with just the
top-level directory present, meaning the set command has to be used to
expand the checkout.  Thus, only in a special and perhaps unusual cases
would any of the suggestions from normal file and directory completion
be appropriate.

Issue #2: Suggesting patterns that lead to warnings is unfriendly.

If the user specifies any regular file and omits the leading '/', then
the sparse-checkout command will warn the user that their command is
problematic and suggest they use a leading slash instead.

Issue #3: Completion gets confused by leading '/', and provides wrong paths.

Users often want to anchor their patterns to the toplevel of the
repository, especially when listing individual files.  There are a
number of reasons for this, but notably even sparse-checkout encourages
them to do so (as noted above).  However, if users do so (via adding a
leading '/' to their pattern), then bash completion will interpret the
leading slash not as a request for a path at the toplevel of the
repository, but as a request for a path at the root of the filesytem.
That means at best that completion cannot help with such paths, and if
it does find any completions, they are almost guaranteed to be wrong.

Issue #4: Suggesting invalid patterns from subdirectories is unfriendly.

There is no per-directory equivalent to .gitignore with
sparse-checkouts.  There is only a single worktree-global
$GIT_DIR/info/sparse-checkout file.  As such, paths to files must be
specified relative to the toplevel of a repository.  Providing
suggestions of paths that are relative to the current working directory,
as bash completion defaults to, is wrong when the current working
directory is not the worktree toplevel directory.

Issue #5: Paths with special characters will be interpreted incorrectly

The entries in the sparse-checkout file are patterns, not paths.  While
most paths also qualify as patterns (though even in such cases it would
be better for users to not use them directly but prefix them with a
leading '/'), there are a variety of special characters that would need
special escaping beyond the normal shell escaping: '*', '?', '\', '[',
']', and any leading '#' or '!'.  If completion suggests any such paths,
users will likely expect them to be treated as an exact path rather than
as a pattern that might match some number of files other than 1.

However, despite the first four issues, we can note that _if_ users are
using tab completion, then they are probably trying to specify a path in
the index.  As such, we transform their argument into a top-level-rooted
pattern that matches such a file.  For example, if they type:
   git sparse-checkout add Make<TAB>
we could "complete" to
   git sparse-checkout add /Makefile
or, if they ran from the Documentation/technical/ subdirectory:
   git sparse-checkout add m<TAB>
we could "complete" it to:
   git sparse-checkout add /Documentation/technical/multi-pack-index.txt
Note in both cases I use "complete" in quotes, because we actually add
characters both before and after the argument in question, so we are
kind of abusing "bash completions" to be "bash completions AND
beginnings".

The fifth issue is a bit stickier, especially when you consider that we
not only need to deal with escaping issues because of special meanings
of patterns in sparse-checkout & gitignore files, but also that we need
to consider escaping issues due to ls-files needing to sometimes quote
or escape characters, and because the shell needs to escape some
characters.  The multiple interacting forms of escaping could get ugly;
this patch makes no attempt to do so and simply documents that we
decided to not deal with those corner cases for now but at least get the
common cases right.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 15:35:40 +09:00
7c3595b613 completion: avoid misleading completions in cone mode
The "set" and "add" subcommands of "sparse-checkout", when in cone mode,
should only complete on directories.  For bash_completion in general,
when no completions are returned for any subcommands, it will often fall
back to standard completion of files and directories as a substitute.
That is not helpful here.  Since we have already looked for all valid
completions, if none are found then falling back to standard bash file
and directory completion is at best actively misleading.  In fact, there
are three different ways it can be actively misleading.  Add a long
comment in the code about how that fallback behavior can deceive, and
disable the fallback by returning a fake result as the sole completion.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 15:25:14 +09:00
253eeaf7a2 completion: fix logic for determining whether cone mode is active
_git_sparse_checkout() was checking whether we were in cone mode by
checking whether either:

    A) core.sparseCheckoutCone was "true"
    B) "--cone" was specified on the command line

This code has 2 bugs I didn't catch in my review at the time

    1) core.sparseCheckout must be "true" for core.sparseCheckoutCone to
       be relevant (which matters since "git sparse-checkout disable"
       only unsets core.sparseCheckout, not core.sparseCheckoutCone)
    2) The presence of "--no-cone" should override any config setting

Further, I forgot to update this logic as part of 2d95707a02
("sparse-checkout: make --cone the default", 2022-04-22) for the new
default.

Update the code for the new default and make it be more careful in
determining whether to complete based on cone mode or non-cone mode.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 15:25:14 +09:00
6b7f56f7ef completion: squelch stray errors in sparse-checkout completion
If, in the root of a project, one types

    git sparse-checkout set --cone ../<TAB>

then an error message of the form

    fatal: ../: '../' is outside repository at '/home/newren/floss/git'

is written to stderr, which munges the users view of their own command.
Squelch such messages by using the __git() wrapper, designed for this
purpose; see commit e15098a314 (completion: consolidate silencing errors
from git commands, 2017-02-03) for more on the wrapper.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 15:25:14 +09:00
d9fd71fa2a hooks--pre-commit: detect non-ASCII when renaming
When diff.renames is turned on, the diff-filter will not return renamed
files (or copied ones with diff.renames=copy) and potential non-ASCII
characters would not be caught by this hook.

Use the plumbing command diff-index instead of the porcelain one to not
be affected by diff.rename.

Signed-off-by: Julian Prein <druckdev@protonmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:55:40 +09:00
866a1b9026 t6301: write invalid object ID via test-tool ref-store
One of the tests in t6301 verifies that the reference backend correctly
warns about the case where a reference points to a non-existent object.
This is done by writing the object ID into the loose reference directly,
which is quite intimate with how the files backend works.

Refactor the code to instead use `test-tool ref-store` to write the
reference, which is backend-agnostic.

There are two more tests in this file which write loose files directly,
as well. But both of them are indeed quite specific to the loose files
backend and cannot be easily ported to other backends. We thus mark them
as requiring the REFFILES prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:24 +09:00
2e4afdad66 t5551: stop writing packed-refs directly
We have multiple tests in t5551 that write thousands of tags. To do so
efficiently we generate the tags by writing the `packed-refs` file
directly, which of course assumes that the reference database is backed
by the files backend.

Refactor the code to instead use a single `git update-ref --stdin`
command to write the tags. While the on-disk end result is not the same
as we now have a bunch of loose refs instead of a single packed-refs
file, the distinction shouldn't really matter for any of the tests that
use this helper.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:24 +09:00
853bd0d267 t5401: speed up creation of many branches
One of the tests in t5401 creates a bunch of branches by calling
git-branch(1) for every one of them. This is quite inefficient and takes
a comparatively long time even on Unix systems where spawning processes
is comparatively fast. Refactor it to instead use git-update-ref(1),
which leads to an almost 10-fold speedup:

```
Benchmark 1: ./t5401-update-hooks.sh (rev = HEAD)
  Time (mean ± σ):     983.2 ms ±  97.6 ms    [User: 328.8 ms, System: 679.2 ms]
  Range (min … max):   882.9 ms … 1078.0 ms    3 runs

Benchmark 2: ./t5401-update-hooks.sh (rev = HEAD~)
  Time (mean ± σ):      9.312 s ±  0.398 s    [User: 2.766 s, System: 6.617 s]
  Range (min … max):    8.885 s …  9.674 s    3 runs

Summary
  ./t5401-update-hooks.sh (rev = HEAD) ran
    9.47 ± 1.02 times faster than ./t5401-update-hooks.sh (rev = HEAD~)
```

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:24 +09:00
4626269168 t4013: simplify magic parsing and drop "failure"
In t14013, we have various different tests that verify whether certain
diffs are generated as expected. As much of the logic is the same across
many of the tests we some common code in there that generates the actual
test cases for us.

As some diffs are more special than others depending on the command line
parameters passed to git-diff(1), these tests need to adapt behaviour to
the specific test case sometimes. This is done via colon-prefixed magic
commands, of which we currently know "failure" and "noellipses". The
logic to parse this magic is a bit convoluted though and hard to grasp,
also due to the rather unnecessary nesting.

Un-nest the cases so that it becomes a bit more straightfoward. The
logic is further simplified by removing support for the "failure" magic,
which is not actually used anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:24 +09:00
110feb893a t3310: stop checking for reference existence via test -f
One of the tests in t3310 exercises whether the special references
`NOTES_MERGE_PARTIAL` and `NOTES_MERGE_REF` exist as expected when the
notes subsystem runs into a merge conflict. This is done by checking
on-disk data structures directly though instead of asking the reference
backend.

Refactor the test to use git-rev-parse(1) instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:24 +09:00
7e1fcb81ee t1417: make reflog --updateref tests backend agnostic
The tests for `git reflog delete --updateref` are currently marked to
only run with the reffiles backend. There is no inherent reason that
this should be the case other than the fact that the setup messes with
the on-disk reflogs directly.

Refactor the test to stop doing so and drop the REFFILES prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:24 +09:00
88121d9371 t1410: use test-tool to create empty reflog
One of the tests in t1410 is marked to be specific to the files
reference backend, which is because we create a reflog manually by
creating the respective file. Refactor the test to instead use our
`test-tool ref-store` helper to create the reflog so that it works with
other reference backends, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:23 +09:00
b49831ca1c t1401: stop treating FETCH_HEAD as real reference
One of the tests in t1401 asserts that we can create a symref from a
symbolic reference to a top-level reference, which is done by linking
from `refs/heads/top-level` to `FETCH_HEAD`. But `FETCH_HEAD` is not a
proper reference and doesn't even follow the loose reference format, so
it is not a good candidate for the logic under test.

Refactor the test to use `ORIG_HEAD` instead of `FETCH_HEAD`. This also
works with other backends than the reffiles one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:23 +09:00
db7288b321 t1400: split up generic reflog tests from the reffile-specific ones
We have a bunch of tests in t1400 that check whether we correctly read
reflog entries. These tests create the reflog by manually writing to the
respective loose file, which makes them specific to the files backend.
But while some of them do indeed exercise very specific edge cases in
the reffiles backend, most of the tests exercise generic functionality
that should be common to all backends.

Unfortunately, we can't easily adapt all of the tests to work with all
backends. Instead, split out the reffile-specific tests from the ones
that should work with all backends and refactor the generic ones to not
write to the on-disk files directly anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:23 +09:00
54087dd32b t0410: mark tests to require the reffiles backend
Two of our tests in t0410 verify whether partial clones end up with the
correct repository format version and extensions. These checks require
the reffiles backend because every other backend would by necessity bump
the repository format version to be at least 1.

Mark the tests accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 11:50:23 +09:00
e4299d26d4 doc: make the gitfile syntax easier to discover
Signed-off-by: Marcel Krause <mk+copyleft@pimpmybyte.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03 10:54:51 +09:00
7854bf4960 i18n: factorize even more 'incompatible options' messages
Continue the work of 12909b6b8a (i18n: turn "options are incompatible"
into "cannot be used together", 2022-01-05) and a699367bb8 (i18n:
factorize more 'incompatible options' messages, 2022-01-31) to use the
same parameterized error message for reporting incompatible command line
options.  This reduces the number of strings to translate and makes the
UI slightly more consistent.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-27 10:01:45 +09:00
cd3c28c53a column: release strbuf and string_list after use
Releasing strbuf and string_list just before exiting is not strictly
necessary, but it gets rid of false positives reported by leak checkers,
which can then be more easily used to show that the column-printing
machinery behind print_columns() are free of leaks.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-27 09:59:56 +09:00
e928c11e29 replay: stop assuming replayed branches do not diverge
The replay command is able to replay multiple branches but when some of
them are based on other replayed branches, their commit should be
replayed onto already replayed commits.

For this purpose, let's store the replayed commit and its original
commit in a key value store, so that we can easily find and reuse a
replayed commit instead of the original one.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:50 +09:00
c4611130f4 replay: add --contained to rebase contained branches
Let's add a `--contained` option that can be used along with
`--onto` to rebase all the branches contained in the <revision-range>
argument.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:49 +09:00
22d99f012f replay: add --advance or 'cherry-pick' mode
There is already a 'rebase' mode with `--onto`. Let's add an 'advance' or
'cherry-pick' mode with `--advance`. This new mode will make the target
branch advance as we replay commits onto it.

The replayed commits should have a single tip, so that it's clear where
the target branch should be advanced. If they have more than one tip,
this new mode will error out.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:49 +09:00
3916ec307e replay: use standard revision ranges
Instead of the fixed "<oldbase> <branch>" arguments, the replay
command now accepts "<revision-range>..." arguments in a similar
way as many other Git commands. This makes its interface more
standard and more flexible.

This also enables many revision related options accepted and
eaten by setup_revisions(). If the replay command was a high level
one or had a high level mode, it would make sense to restrict some
of the possible options, like those generating non-contiguous
history, as they could be confusing for most users.

Also as the interface of the command is now mostly finalized,
we can add more documentation and more testcases to make sure
the command will continue to work as designed in the future.

We only document the rev-list related options among all the
revision related options that are now accepted, as the rev-list
related ones are probably the most useful for now.

Helped-by: Dragan Simic <dsimic@manjaro.org>
Helped-by: Linus Arver <linusa@google.com>
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:49 +09:00
81613be31e replay: make it a minimal server side command
We want this command to be a minimal command that just does server side
picking of commits, displaying the results on stdout for higher level
scripts to consume.

So let's simplify it:
  * remove the worktree and index reading/writing,
  * remove the ref (and reflog) updating,
  * remove the assumptions tying us to HEAD, since (a) this is not a
    rebase and (b) we want to be able to pick commits in a bare repo,
    i.e. to/from branches that are not checked out and not the main
    branch,
  * remove unneeded includes,
  * handle rebasing multiple branches by printing on stdout the update
    ref commands that should be performed.

The output can be piped into `git update-ref --stdin` for the ref
updates to happen.

In the future to make it easier for users to use this command
directly maybe an option can be added to automatically pipe its output
into `git update-ref`.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:49 +09:00
fda7dea7c9 replay: remove HEAD related sanity check
We want replay to be a command that can be used on the server side on
any branch, not just the current one, so we are going to stop updating
HEAD in a future commit.

A "sanity check" that makes sure we are replaying the current branch
doesn't make sense anymore. Let's remove it.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:49 +09:00
4a37727626 replay: remove progress and info output
The replay command will be changed in a follow up commit, so that it
will not update refs directly, but instead it will print on stdout a
list of commands that can be consumed by `git update-ref --stdin`.

We don't want this output to be polluted by its current low value
output, so let's just remove the latter.

In the future, when the command gets an option to update refs by
itself, it will make a lot of sense to display a progress meter, but
we are not there yet.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:49 +09:00
38283bced8 replay: add an important FIXME comment about gpg signing
We want to be able to handle signed commits in some way in the future,
but we are not ready to do it now. So for the time being let's just add
a FIXME comment to remind us about it.

These are different ways we could handle them:

  - in case of a cli user and if there was an interactive mode, we could
    perhaps ask if the user wants to sign again
  - we could add an option to just fail if there are signed commits

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:48 +09:00
8259e4154f replay: change rev walking options
Let's force the rev walking options we need after calling
setup_revisions() instead of before.

This might override some user supplied rev walking command line options
though. So let's detect that and warn users by:

  a) setting the desired values, before setup_revisions(),
  b) checking after setup_revisions() whether these values differ from
     the desired values,
  c) if so throwing a warning and setting the desired values again.

We want the command to work from older commits to newer ones by default.
Also we don't want history simplification, as we want to deal with all
the commits in the affected range.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:48 +09:00
e787e664da replay: introduce pick_regular_commit()
Let's refactor the code to handle a regular commit (a commit that is
neither a root commit nor a merge commit) into a single function instead
of keeping it inside cmd_replay().

This is good for separation of concerns, and this will help further work
in the future to replay merge commits.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:48 +09:00
a9df61ace3 replay: die() instead of failing assert()
It's not a good idea for regular Git commands to use an assert() to
check for things that could happen but are not supported.

Let's die() with an explanation of the issue instead.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:48 +09:00
d46da6d90b replay: start using parse_options API
Instead of manually parsing arguments, let's start using the parse_options
API. This way this new builtin will look more standard, and in some
upcoming commits will more easily be able to handle more command line
options.

Note that we plan to later use standard revision ranges instead of
hardcoded "<oldbase> <branch>" arguments. When we will use standard
revision ranges, it will be easier to check if there are no spurious
arguments if we keep ARGV[0], so let's call parse_options() with
PARSE_OPT_KEEP_ARGV0 even if we don't need ARGV[0] right now to avoid
some useless code churn.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:48 +09:00
f920b0289b replay: introduce new builtin
For now, this is just a rename from `t/helper/test-fast-rebase.c` into
`builtin/replay.c` with minimal changes to make it build appropriately.

Let's add a stub documentation and a stub test script though.

Subsequent commits will flesh out the capabilities of the new command
and make it a more standard regular builtin.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:48 +09:00
b9d0991cc7 t6429: remove switching aspects of fast-rebase
At the time t6429 was written, merge-ort was still under development,
did not have quite as many tests, and certainly was not widely deployed.
Since t6429 was exercising some codepaths just a little differently, we
thought having them also test the "merge_switch_to_result()" bits of
merge-ort was useful even though they weren't intrinsic to the real
point of these tests.

However, the value provided by doing extra testing of the
"merge_switch_to_result()" bits has decreased a bit over time, and it's
actively making it harder to refactor `test-tool fast-rebase` into `git
replay`, which we are going to do in following commits.  Dispense with
these bits.

Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:47 +09:00
b1df3b3867 commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
In 7a5d604443 (commit: detect commits that exist in commit-graph but not
in the ODB, 2023-10-31), we have introduced a new object existence check
into `repo_parse_commit_internal()` so that we do not parse commits via
the commit-graph that don't have a corresponding object in the object
database. This new check of course comes with a performance penalty,
which the commit put at around 30% for `git rev-list --topo-order`. But
there are in fact scenarios where the performance regression is even
higher. The following benchmark against linux.git with a fully-build
commit-graph:

  Benchmark 1: git.v2.42.1 rev-list --count HEAD
    Time (mean ± σ):     658.0 ms ±   5.2 ms    [User: 613.5 ms, System: 44.4 ms]
    Range (min … max):   650.2 ms … 666.0 ms    10 runs

  Benchmark 2: git.v2.43.0-rc1 rev-list --count HEAD
    Time (mean ± σ):      1.333 s ±  0.019 s    [User: 1.263 s, System: 0.069 s]
    Range (min … max):    1.302 s …  1.361 s    10 runs

  Summary
    git.v2.42.1 rev-list --count HEAD ran
      2.03 ± 0.03 times faster than git.v2.43.0-rc1 rev-list --count HEAD

While it's a noble goal to ensure that results are the same regardless
of whether or not we have a potentially stale commit-graph, taking twice
as much time is a tough sell. Furthermore, we can generally assume that
the commit-graph will be updated by git-gc(1) or git-maintenance(1) as
required so that the case where the commit-graph is stale should not at
all be common.

With that in mind, default-disable GIT_COMMIT_GRAPH_PARANOIA and restore
the behaviour and thus performance previous to the mentioned commit. In
order to not be inconsistent, also disable this behaviour by default in
`lookup_commit_in_graph()`, where the object existence check has been
introduced right at its inception via f559d6d45e (revision: avoid
hitting packfiles when commits are in commit-graph, 2021-08-09).

This results in another speedup in commands that end up calling this
function, even though it's less pronounced compared to the above
benchmark. The following has been executed in linux.git with ~1.2
million references:

  Benchmark 1: GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted
    Time (mean ± σ):      2.947 s ±  0.003 s    [User: 2.412 s, System: 0.534 s]
    Range (min … max):    2.943 s …  2.949 s    3 runs

  Benchmark 2: GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted
    Time (mean ± σ):      2.724 s ±  0.030 s    [User: 2.207 s, System: 0.514 s]
    Range (min … max):    2.704 s …  2.759 s    3 runs

  Summary
    GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted ran
      1.08 ± 0.01 times faster than GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted

So whereas 7a5d604443 initially introduced the logic to start doing an
object existence check in `repo_parse_commit_internal()` by default, the
updated logic will now instead cause `lookup_commit_in_graph()` to stop
doing the check by default. This behaviour continues to be tweakable by
the user via the GIT_COMMIT_GRAPH_PARANOIA environment variable.

Note that this requires us to amend some tests to manually turn on the
paranoid checks again. This is because we cause repository corruption by
manually deleting objects which are part of the commit graph already.
These circumstances shouldn't usually happen in repositories.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:10:00 +09:00
62b4f7b9c6 doc: refer to internet archive
These pages are no longer reachable from their original locations,
which makes things difficult for readers. Instead, switch to linking to
the Internet Archive for the content.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:07:06 +09:00
28a0c65f5d doc: update links for andre-simon.de
Beyond the fact that it's somewhat traditional to respect sites'
self-identification, it's helpful for links to point to the things
that people expect them to reference. Here that means linking to
specific pages instead of a domain.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:07:05 +09:00
d05b08cd52 doc: switch links to https
These sites offer https versions of their content.
Using the https versions provides some protection for users.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:07:05 +09:00
65175d9ea2 doc: update links to current pages
It's somewhat traditional to respect sites' self-identification.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 10:07:05 +09:00
cbf498eb53 builtin/reflog.c: fix dry-run option short name
The documentation for reflog states that the --dry-run option of the
expire and delete subcommands has a corresponding short name, -n.
However, 33d7bdd645 (builtin/reflog.c: use parse-options api for expire,
delete subcommands, 2022-01-06) did not include this short name in the
new options parsing.

Re-add the short name in the new dry-run option definitions.

Signed-off-by: Josh Brobst <josh@brob.st>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-26 09:35:42 +09:00
d44b517137 orphan/unborn: fix use of 'orphan' in end-user facing messages
"orphan branch" is not even grammatical ("orphaned branch" is), and
we have been using "unborn branch" to mean the state where the HEAD
points at a branch that does not yet exist.

Update end-user facing messages to correct them.  There are cases
other random words are used (e.g., "unparented branch") but now we
have a glossary entry, use the term "unborn branch" consistently.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-24 12:11:23 +09:00
49dc156376 orphan/unborn: add to the glossary and use them consistently
To orphan is a verb that denotes the act of getting on an unborn
branch, and a few references to "orphan branch" in our documentation
are misuses of the word.  They caused end-user confusion, which was
made even worse because we did not have the term defined in the
glossary document.  Add entries for "unborn" branch and "orphan"
operation to the glossary, and adjust existing documentation
accordingly.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-24 12:11:23 +09:00
9263c40a0a checkout: refactor die_if_checked_out() caller
There is a bit dense logic to make a call to "die_if_checked_out()"
while trying to check out a branch.  Extract it into a helper
function and give it a bit of comment to describe what is going on.

The most important part of the refactoring is the separation of the
guarding logic before making the call to die_if_checked_out() into
the caller specific part (e.g., the logic that decides that the
caller is trying to check out an existing branch) and the bypass due
to the "--ignore-other-worktrees" option.  The latter will be common
no matter how the current or future callers decides they need this
protection.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-23 15:01:09 +09:00
16fa3eebc0 t0212: test URL redacting in EVENT format
In the added tests cases, skip testing the `GIT_TRACE2_REDACT=0` case
because we would need to exactly model the full JSON event stream like
we did in the preceding basic tests and I do not think it is worth it.

Furthermore, the Trace2 routines print the same content in normal, perf,
or event format, and in t0210 and t0211 we already tested the basic
functionality, so no need to repeat it here.

In this test, we use the test-helper to unit test each of the event
messages where URLs can appear and confirm that they are redacted in
each event.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-23 10:30:33 +09:00
c73e7f80d3 t0211: test URL redacting in PERF format
This transmogrifies the test case that was just added to t0210, to also
cover the `GIT_TRACE2_PERF` backend.

Just like t0211, we now have to toggle the `TEST_PASSES_SANITIZE_LEAK`
annotation.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-23 10:30:33 +09:00
b7d49ac1ec trace2: redact passwords from https:// URLs by default
It is an unsafe practice to call something like

	git clone https://user:password@example.com/

This not only risks leaking the password "over the shoulder" or into the
readline history of the current Unix shell, it also gets logged via
Trace2 if enabled.

Let's at least avoid logging such secrets via Trace2, much like we avoid
logging secrets in `http.c`. Much like the code in `http.c` is guarded
via `GIT_TRACE_REDACT` (defaulting to `true`), we guard the new code via
`GIT_TRACE2_REDACT` (also defaulting to `true`).

The new tests added in this commit uncover leaks in `builtin/clone.c`
and `remote.c`. Therefore we need to turn off
`TEST_PASSES_SANITIZE_LEAK`. The reasons:

- We observed that `the_repository->remote_status` is not released
  properly.

- We are using `url...insteadOf` and that runs into a code path where an
  allocated URL is replaced with another URL, and the original URL is
  never released.

- `remote_states` contains plenty of `struct remote`s whose refspecs
  seem to be usually allocated by never released.

More investigation is needed here to identify the exact cause and
proper fixes for these leaks/bugs.

Co-authored-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-23 10:30:33 +09:00
abcdb978ea trace2: fix signature of trace2_def_param() macro
Add `struct key_value_info` argument to `trace2_def_param()`.

In dc90208497 (trace2: plumb config kvi, 2023-06-28) a `kvi`
argument was added to `trace2_def_param_fl()` but the macro
was not up updated. Let's fix that.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-23 10:30:32 +09:00
4f7fd79e57 merge-file: add --diff-algorithm option
Make it possible to use other diff algorithms than the 'myers'
default algorithm, when using the 'git merge-file' command, to help
avoid spurious conflicts by selecting a more recent algorithm such
as 'histogram', for instance when using 'git merge-file' as part of
a custom merge driver.

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-22 14:23:06 +09:00
564d0252ca Git 2.43
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-20 10:28:15 +09:00
c3cc3e1da7 Merge tag 'l10n-2.43.0-rnd2' of https://github.com/git-l10n/git-po
l10n-2.43.0-rnd2

* tag 'l10n-2.43.0-rnd2' of https://github.com/git-l10n/git-po:
  l10n: zh-TW: Git 2.43.0-rc1
  l10n: Update German translation
  l10n: bg.po: Updated Bulgarian translation (5579t)
  l10n: zh_CN: for git 2.43.0-rc1
  l10n: Update Catalan translation
  l10n: po-id for 2.43 (round 1)
  l10n: fr: v2.43.0 rnd 2
  l10n: update uk localization for v2.43
  l10n: sv.po: Update Swedish translation (5579t)
  l10n: tr: v2.43.0
2023-11-20 10:27:33 +09:00
d003a26cca Merge branch 'vd/glossary-dereference-peel'
"To dereference" and "to peel" were sometimes used in in-code
comments and documentation but without description in the glossary.

* vd/glossary-dereference-peel:
  glossary: add definitions for dereference & peel
2023-11-20 09:57:23 +09:00
e9eb93bb2a Merge branch 'tz/send-email-helpfix'
Typoes in "git send-email -h" have been corrected.

* tz/send-email-helpfix:
  send-email: remove stray characters from usage
2023-11-20 09:57:22 +09:00
d303432667 Merge branch 'l10n/zh-TW/2023-11-19' of github.com:l10n-tw/git-po
* 'l10n/zh-TW/2023-11-19' of github.com:l10n-tw/git-po:
  l10n: zh-TW: Git 2.43.0-rc1
2023-11-20 07:57:09 +08:00
ee41e2d41f fuzz: add new oss-fuzz fuzzer for date.c / date.h
Signed-off-by: Arthur Chan <arthur.chan@adalogics.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-20 08:17:51 +09:00
50f1abcff6 packfile.c: fix a typo in each_file_in_pack_dir_fn()'s declaration
One parameter is called `file_pach`. On the face of it, this looks as if
it was supposed to talk about a `path` instead of a `pach`.

However, looking at the way this callback is called, it gets fed the
`d_name` from a directory entry, which provides just the file name, not
the full path. Therefore, let's fix this by calling the parameter
`file_name` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-20 08:15:50 +09:00
3d735322df l10n: zh-TW: Git 2.43.0-rc1
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2023-11-19 23:35:21 +08:00
88d26d3fa3 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5579t)
2023-11-19 20:56:21 +08:00
ed8b3c3078 l10n: Update German translation
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2023-11-18 14:19:34 +01:00
f42a8bb329 l10n: bg.po: Updated Bulgarian translation (5579t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2023-11-18 13:25:32 +01:00
64294acb07 l10n: zh_CN: for git 2.43.0-rc1
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2023-11-18 12:05:13 +08:00
4385297f0d Merge branch '2.43-uk-update' of github.com:arkid15r
* '2.43-uk-update' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: update uk localization for v2.43
2023-11-18 10:51:56 +08:00
e3bd57a674 Merge branch 'catalan' of github.com:Softcatala/git-po
* 'catalan' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2023-11-18 10:48:39 +08:00
ec93805ce3 Merge branch 'tr-l10n' of github.com:bitigchi/git-po
* 'tr-l10n' of github.com:bitigchi/git-po:
  l10n: tr: v2.43.0
2023-11-18 10:45:56 +08:00
4dcb316637 Merge branch 'fr_v2.43.0' of github.com:jnavila/git
* 'fr_v2.43.0' of github.com:jnavila/git:
  l10n: fr: v2.43.0 rnd 2
2023-11-18 10:43:22 +08:00
4dedb407b1 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.43 (round 1)
2023-11-18 10:42:48 +08:00
1af36a14a4 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5579t)
2023-11-18 10:42:04 +08:00
29a186917b refs: remove delete_refs callback from backends
Now that `refs_delete_refs` is implemented in a generic way via the ref
transaction interfaces there are no callers left that invoke the
`delete_refs` callback anymore. Remove it from all of our backends.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-17 10:12:12 +09:00
d6f8e72982 refs: deduplicate code to delete references
Both the files and the packed-refs reference backends now use the same
generic transactions-based code to delete references. Let's pull these
implementations up into `refs_delete_refs()` to deduplicate the code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-17 10:12:12 +09:00
e85e5dd78a refs/files: use transactions to delete references
In the `files_delete_refs()` callback function of the files backend we
implement deletion of references. This is done in two steps:

    1. We lock the packed-refs file and delete all references from it in
       a single transaction.

    2. We delete all loose references via separate calls to
       `refs_delete_ref()`.

These steps essentially duplicate the logic around locking and deletion
order that we already have in the transactional interfaces, where we do
know to lock and evict references from the packed-refs file. Despite the
fact that we duplicate the logic, it's also less efficient than if we
used a single generic transaction:

    - The transactional interface knows to skip locking of the packed
      refs in case they don't contain any of the refs which are about to
      be deleted.

    - We end up creating N+1 separate reference transactions, one for
      the packed-refs file and N for the individual loose references.

Refactor the code to instead delete references via a single transaction.
As we don't assert the expected old object ID this is equivalent to the
previous behaviour, and we already do the same in the packed-refs
backend.

Despite the fact that the result is simpler to reason about, this change
also results in improved performance. The following benchmarks have been
executed in linux.git:

```
$ hyperfine -n '{rev}, packed={packed} refcount={refcount}' \
    -L packed true,false -L refcount 1,1000 -L rev master,pks-ref-store-generic-delete-refs \
    --setup 'git -C /home/pks/Development/git switch --detach {rev} && make -C /home/pks/Development/git -j17' \
    --prepare 'printf "create refs/heads/new-branch-%d HEAD\n" $(seq {refcount}) | git -C /home/pks/Reproduction/linux.git update-ref --stdin && if test {packed} = true; then git pack-refs --all; fi' \
    --warmup=10 \
    '/home/pks/Development/git/bin-wrappers/git -C /home/pks/Reproduction/linux.git branch -d new-branch-{1..{refcount}}'

Benchmark 1: master packed=true refcount=1
  Time (mean ± σ):       7.8 ms ±   1.6 ms    [User: 3.4 ms, System: 4.4 ms]
  Range (min … max):     5.5 ms …  11.0 ms    120 runs

Benchmark 2: master packed=false refcount=1
  Time (mean ± σ):       7.0 ms ±   1.1 ms    [User: 3.2 ms, System: 3.8 ms]
  Range (min … max):     5.7 ms …   9.8 ms    180 runs

Benchmark 3: master packed=true refcount=1000
  Time (mean ± σ):     283.8 ms ±   5.2 ms    [User: 45.7 ms, System: 231.5 ms]
  Range (min … max):   276.7 ms … 291.6 ms    10 runs

Benchmark 4: master packed=false refcount=1000
  Time (mean ± σ):     284.4 ms ±   5.3 ms    [User: 44.2 ms, System: 233.6 ms]
  Range (min … max):   277.1 ms … 293.3 ms    10 runs

Benchmark 5: generic-delete-refs packed=true refcount=1
  Time (mean ± σ):       6.2 ms ±   1.8 ms    [User: 2.3 ms, System: 3.9 ms]
  Range (min … max):     4.1 ms …  12.2 ms    142 runs

Benchmark 6: generic-delete-refs packed=false refcount=1
  Time (mean ± σ):       7.1 ms ±   1.4 ms    [User: 2.8 ms, System: 4.3 ms]
  Range (min … max):     4.2 ms …  10.8 ms    157 runs

Benchmark 7: generic-delete-refs packed=true refcount=1000
  Time (mean ± σ):     198.9 ms ±   1.7 ms    [User: 29.5 ms, System: 165.7 ms]
  Range (min … max):   196.1 ms … 201.4 ms    10 runs

Benchmark 8: generic-delete-refs packed=false refcount=1000
  Time (mean ± σ):     199.7 ms ±   7.8 ms    [User: 32.2 ms, System: 163.2 ms]
  Range (min … max):   193.8 ms … 220.7 ms    10 runs

Summary
  generic-delete-refs packed=true refcount=1 ran
    1.14 ± 0.37 times faster than master packed=false refcount=1
    1.15 ± 0.40 times faster than generic-delete-refs packed=false refcount=1
    1.26 ± 0.44 times faster than master packed=true refcount=1
   32.24 ± 9.17 times faster than generic-delete-refs packed=true refcount=1000
   32.36 ± 9.29 times faster than generic-delete-refs packed=false refcount=1000
   46.00 ± 13.10 times faster than master packed=true refcount=1000
   46.10 ± 13.13 times faster than master packed=false refcount=1000
```

Especially in the case where we have many references we can see a clear
performance speedup of nearly 30%.

This is in contrast to the stated objecive in a27dcf89b6 (refs: make
delete_refs() virtual, 2016-09-04), where the virtual `delete_refs()`
function was introduced with the intent to speed things up rather than
making things slower. So it seems like we have outlived the need for a
virtual function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-17 10:12:12 +09:00
b02c7646f4 t5510: ensure that the packed-refs file needs locking
One of the tests in t5510 asserts that a `git fetch --prune` detects
failures to prune branches. This is done by locking the packed-refs
file, which would then later lead to a locking issue when Git tries to
rewrite the file to prune the branches from it.

Interestingly though, we do not pack the about-to-be-pruned branch into
the packed-refs file, so it never even contained that branch in the
first place. While this is good enough right now because the pruning
will always lock the file regardless of whether it contains the branch
or not, this is a mere implementation detail. In fact, we're about to
rewrite branch deletions to make use of the ref transaction interface,
which knows to skip rewrites of the packed-refs file in the case where
it does not contain the branches in the first place, and this will break
the test.

Prepare the test for that change by packing the refs before trying to
prune them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-17 10:12:12 +09:00
6ff658cc78 send-email: avoid duplicate specification warnings
A warning is issued for options which are specified more than once
beginning with perl-Getopt-Long >= 2.55.  In addition to causing users
to see warnings, this results in test failures which compare the output.
An example, from t9001-send-email.37:

  | +++ diff -u expect actual
  | --- expect      2023-11-14 10:38:23.854346488 +0000
  | +++ actual      2023-11-14 10:38:23.848346466 +0000
  | @@ -1,2 +1,7 @@
  | +Duplicate specification "no-chain-reply-to" for option "no-chain-reply-to"
  | +Duplicate specification "to-cover|to-cover!" for option "to-cover"
  | +Duplicate specification "cc-cover|cc-cover!" for option "cc-cover"
  | +Duplicate specification "no-thread" for option "no-thread"
  | +Duplicate specification "no-to-cover" for option "no-to-cover"
  |  fatal: longline.patch:35 is longer than 998 characters
  |  warning: no patches were sent
  | error: last command exited with $?=1
  | not ok 37 - reject long lines

Remove the duplicate option specs.  These are primarily the explicit
'--no-' prefix opts which were added in f471494303 (git-send-email.perl:
support no- prefix with older GetOptions, 2015-01-30).  This was done
specifically to support perl-5.8.0 which includes Getopt::Long 2.32[1].

Getopt::Long 2.33 added support for the '--no-' prefix natively by
appending '!' to the option specification string, which was included in
perl-5.8.1 and is not present in perl-5.8.0.  The previous commit bumped
the minimum supported Perl version to 5.8.1 so we no longer need to
provide the '--no-' variants for negatable options manually.

Teach `--git-completion-helper` to output the '--no-' options.  They are
not included in the options hash and would otherwise be lost.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-17 07:26:34 +09:00
d13a73e383 perl: bump the required Perl version to 5.8.1 from 5.8.0
The following commit will make use of a Getopt::Long feature which is
only present in Perl >= 5.8.1.  Document that as the minimum version we
support.

Many of our Perl scripts will continue to run with 5.8.0 but this change
allows us to adjust them as needed without breaking any promises to our
users.

The Perl requirement was last changed in d48b284183 (perl: bump the
required Perl version to 5.8 from 5.6.[21], 2010-09-24).  At that time,
5.8.0 was 8 years old.  It is now over 21 years old.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-17 07:26:32 +09:00
294bfc2441 t/perf: add perf tests for for-each-ref
Add performance tests for 'for-each-ref'. The tests exercise different
combinations of filters/formats/options, as well as the overall performance
of 'git for-each-ref | git cat-file --batch-check' to demonstrate the
performance difference vs. 'git for-each-ref' with "%(*fieldname)" format
specifiers.

All tests are run against a repository with 40k loose refs - 10k commits,
each having a unique:

- branch
- custom ref (refs/custom/special_*)
- annotated tag pointing at the commit
- annotated tag pointing at the other annotated tag (i.e., a nested tag)

After those tests are finished, the refs are packed with 'pack-refs --all'
and the same tests are rerun.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:03:01 +09:00
188782ecb1 ref-filter.c: use peeled tag for '*' format fields
In most builtins ('rev-parse <revision>^{}', 'show-ref --dereference'),
"dereferencing" a tag refers to a recursive peel of the tag object. Unlike
these cases, the dereferencing prefix ('*') in 'for-each-ref' format
specifiers triggers only a single, non-recursive dereference of a given tag
object. For most annotated tags, a single dereference is all that is needed
to access the tag's associated commit or tree; "recursive" and
"non-recursive" dereferencing are functionally equivalent in these cases.
However, nested tags (annotated tags whose target is another annotated tag)
dereferenced once return another tag, where a recursive dereference would
return the commit or tree.

Currently, if a user wants to filter & format refs and include information
about a recursively-dereferenced tag, they can do so with something like
'cat-file --batch-check':

    git for-each-ref --format="%(objectname)^{} %(refname)" <pattern> |
        git cat-file --batch-check="%(objectname) %(rest)"

But the combination of commands is inefficient. So, to improve the
performance of this use case and align the defererencing behavior of
'for-each-ref' with that of other commands, update the ref formatting code
to use the peeled tag (from 'peel_iterated_oid()') to populate '*' fields
rather than the tag's immediate target object (from 'get_tagged_oid()').

Additionally, add a test to 't6300-for-each-ref' to verify new nested tag
behavior and update 't6302-for-each-ref-filter.sh' to print the correct
value for nested dereferenced fields.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:03:01 +09:00
d1dfe6e936 for-each-ref: clean up documentation of --format
Move the description of the `*` prefix from the --format option
documentation to the part of the command documentation that deals with other
object type-specific modifiers. Also reorganize and reword the remaining
--format documentation so that the explanation of the default format doesn't
interrupt the details on format string interpolation.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:03:00 +09:00
bd98f9774e ref-filter.c: filter & format refs in the same callback
Update 'filter_and_format_refs()' to try to perform ref filtering &
formatting in a single ref iteration, without an intermediate 'struct
ref_array'. This can only be done if no operations need to be performed on a
pre-filtered array; specifically, if the refs are

- filtered on reachability,
- sorted, or
- formatted with ahead-behind information

they cannot be filtered & formatted in the same iteration. In that case,
fall back on the current filter-then-sort-then-format flow.

This optimization substantially improves memory usage due to no longer
storing a ref array in memory. In some cases, it also dramatically reduces
runtime (e.g. 'git for-each-ref --no-sort --count=1', which no longer loads
all refs into a 'struct ref_array' to printing only the first ref).

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:03:00 +09:00
613d991549 ref-filter.c: refactor to create common helper functions
Factor out parts of 'ref_array_push()', 'ref_filter_handler()', and
'filter_refs()' into new helper functions:

* Extract the code to grow a 'struct ref_array' and append a given 'struct
  ref_array_item *' to it from 'ref_array_push()' into 'ref_array_append()'.
* Extract the code to filter a given ref by refname & object ID then create
  a new 'struct ref_array_item *' from 'filter_one()' into
  'apply_ref_filter()'.
* Extract the code for filter pre-processing, contains cache creation, and
  ref iteration from 'filter_refs()' into 'do_filter_refs()'.

In later patches, these helpers will be used by new ref-filter API
functions. This patch does not result in any user-facing behavior changes or
changes to callers outside of 'ref-filter.c'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:03:00 +09:00
9c575b0202 ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
Rename 'ref_filter_handler()' to 'filter_one()' to more clearly distinguish
it from other ref filtering callbacks that will be added in later patches.
The "*_one()" naming convention is common throughout the codebase for
iteration callbacks.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:03:00 +09:00
e7574b0c6b ref-filter.h: add functions for filter/format & format-only
Add two new public methods to 'ref-filter.h':

* 'print_formatted_ref_array()' which, given a format specification & array
  of ref items, formats and prints the items to stdout.
* 'filter_and_format_refs()' which combines 'filter_refs()',
  'ref_array_sort()', and 'print_formatted_ref_array()' into a single
  function.

This consolidates much of the code used to filter and format refs in
'builtin/for-each-ref.c', 'builtin/tag.c', and 'builtin/branch.c', reducing
duplication and simplifying the future changes needed to optimize the filter
& format process.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:02:59 +09:00
6d6e5c53b0 ref-filter.h: move contains caches into filter
Move the 'contains_cache' and 'no_contains_cache' used in filter_refs into
an 'internal' struct of the 'struct ref_filter'. In later patches, the
'struct ref_filter *' will be a common data structure across multiple
filtering functions. These caches are part of the common functionality the
filter struct will support, so they are updated to be internally accessible
wherever the filter is used.

The design used here mirrors what was introduced in 576de3d956
(unpack_trees: start splitting internal fields from public API, 2023-02-27)
for 'unpack_trees_options'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:02:59 +09:00
9d4fcfe1ff ref-filter.h: add max_count and omit_empty to ref_format
Add an internal 'array_opts' struct to 'struct ref_format' containing
formatting options that pertain to the formatting of an entire ref array:
'max_count' and 'omit_empty'. These values are specified by the '--count'
and '--omit-empty' options, respectively, to 'for-each-ref'/'tag'/'branch'.
Storing these values in the 'ref_format' will simplify the consolidation of
ref array formatting logic across builtins in later patches.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:02:59 +09:00
56d26ade97 ref-filter.c: really don't sort when using --no-sort
When '--no-sort' is passed to 'for-each-ref', 'tag', and 'branch', the
printed refs are still sorted by ascending refname. Change the handling of
sort options in these commands so that '--no-sort' to truly disables
sorting.

'--no-sort' does not disable sorting in these commands is because their
option parsing does not distinguish between "the absence of '--sort'"
(and/or values for tag.sort & branch.sort) and '--no-sort'. Both result in
an empty 'sorting_options' string list, which is parsed by
'ref_sorting_options()' to create the 'struct ref_sorting *' for the
command. If the string list is empty, 'ref_sorting_options()' interprets
that as "the absence of '--sort'" and returns the default ref sorting
structure (equivalent to "refname" sort).

To handle '--no-sort' properly while preserving the "refname" sort in the
"absence of --sort'" case, first explicitly add "refname" to the string list
*before* parsing options. This alone doesn't actually change any behavior,
since 'compare_refs()' already falls back on comparing refnames if two refs
are equal w.r.t all other sort keys.

Now that the string list is populated by default, '--no-sort' is the only
way to empty the 'sorting_options' string list. Update
'ref_sorting_options()' to return a NULL 'struct ref_sorting *' if the
string list is empty, and add a condition to 'ref_array_sort()' to skip the
sort altogether if the sort structure is NULL. Note that other functions
using 'struct ref_sorting *' do not need any changes because they already
ignore NULL values.

Finally, remove the condition around sorting in 'ls-remote', since it's no
longer necessary. Unlike 'for-each-ref' et. al., it does *not* do any
sorting by default. This default is preserved by simply leaving its sort key
string list empty before parsing options; if no additional sort keys are
set, 'struct ref_sorting *' is NULL and sorting is skipped.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:02:58 +09:00
46edab516b send-email: remove stray characters from usage
A few stray single quotes crept into the usage string in a2ce608244
(send-email docs: add format-patch options, 2021-10-25).  Remove them.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 14:00:34 +09:00
cb00f524df rebase: rewrite --(no-)autosquash documentation
Rewrite the description of the rebase --(no-)autosquash options to try
to make it a bit clearer. Don't use "the '...'" to refer to part of a
commit message, mention how --interactive can be used to review the
todo list, and add a bit more detail on commit --squash/amend.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 09:18:22 +09:00
297be59456 rebase: support --autosquash without -i
The rebase --autosquash option is quietly ignored when used without
--interactive (apart from preventing preemptive fast-forwarding and
triggering conflicts with apply backend options).

Change that to support --autosquash without --interactive, by dropping
its restriction to REBASE_INTERACTIVE_EXCPLICIT mode. When used this
way, auto-squashing is done without opening the todo list editor.

Drop the -i requirement from the --autosquash description, and amend
t3415-rebase-autosquash.sh to test the option and the rebase.autoSquash
config variable with and without -i.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 09:18:22 +09:00
75cf39b117 rebase: fully ignore rebase.autoSquash without -i
Setting the rebase.autoSquash config variable to true implies a couple
of restrictions: it prevents preemptive fast-forwarding and it triggers
conflicts with apply backend options. However, it only actually results
in auto-squashing when combined with the --interactive (or -i) option,
due to code in run_specific_rebase() that disables auto-squashing unless
the REBASE_INTERACTIVE_EXPLICIT flag is set.

Doing autosquashing for rebase.autoSquash without --interactive would be
problematic in terms of backward compatibility, but conversely, there is
no need for the aforementioned restrictions without --interactive.

So drop the options.config_autosquash check from the conditions for
clearing allow_preemptive_ff, as the case where it is combined with
--interactive is already covered by the REBASE_INTERACTIVE_EXPLICIT
flag check above it.

Also drop the "apply options are incompatible with rebase.autoSquash"
error, because it is unreachable if it is restricted to --interactive,
as apply options already cause an error when used with --interactive.
Drop the tests for the error from t3422-rebase-incompatible-options.sh,
which has separate tests for the conflicts of --interactive with apply
options.

When neither --autosquash nor --no-autosquash are given, only set
options.autosquash to true if rebase.autosquash is combined with
--interactive.

Don't initialize options.config_autosquash to -1, as there is no need to
distinguish between rebase.autoSquash being unset or explicitly set to
false.

Finally, amend the rebase.autoSquash documentation to say it only
affects interactive mode.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-16 09:18:21 +09:00
cfb8a6e9a9 Git 2.43-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-14 15:14:45 +09:00
893dce2ffb glossary: add definitions for dereference & peel
Add 'gitglossary' definitions for "dereference" (as it used for both symrefs
and objects) and "peel". These terms are used in options and documentation
throughout Git, but they are not clearly defined anywhere and the behavior
they refer to depends heavily on context. Provide explicit definitions to
clarify existing documentation to users and help contributors to use the
most appropriate terminology possible in their additions to Git.

Update other definitions in the glossary that use the term "dereference" to
link to 'def_dereference'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-14 09:49:33 +09:00
e7e03ef995 ci: avoid running the test suite _twice_
This is a late amendment of 4a6e4b9602 (CI: remove Travis CI support,
2021-11-23), whereby the `.prove` file (being written by the `prove`
command that is used to run the test suite) is no longer retained
between CI builds: This feature was only ever used in the Travis CI
builds, we tried for a while to do the same in Azure Pipelines CI runs
(but I gave up on it after a while), and we never used that feature in
GitHub Actions (nor does the new GitLab CI code use it).

Retaining the Prove cache has been fragile from the start, even though
the idea seemed good at the time, the idea being that the `.prove` file
caches information about previous `prove` runs (`save`) and uses them
(`slow`) to run the tests in the order from longer-running to shorter
ones, making optimal use of the parallelism implied by `--jobs=<N>`.

However, using a Prove cache can cause some surprising behavior: When
the `prove` caches information about a test script it has run,
subsequent `prove` runs (with `--state=slow`) will run the same test
script again even if said script is not specified on the `prove`
command-line!

So far, this bug did not matter. Right until d8f416bbb8 (ci: run unit
tests in CI, 2023-11-09) did it not matter.

But starting with that commit, we invoke `prove` _twice_ in CI, once to
run the regular test suite of regression test scripts, and once to run
the unit tests. Due to the bug, the second invocation re-runs all of the
tests that were already run as part of the first invocation. This not
only wastes build minutes, it also frequently causes the `osx-*` jobs to
fail because they already take a long time and now are likely to run
into a timeout.

The worst part about it is that there is actually no benefit to keep
running with `--state=slow,save`, ever since we decided no longer to
try to reuse the Prove cache between CI runs.

So let's just drop that Prove option and live happily ever after.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-14 09:24:23 +09:00
7ba238d374 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2023-11-13 19:55:50 +01:00
ac9898a4bb l10n: po-id for 2.43 (round 1)
Update following components:

  * builtin/gc.c
  * builtin/interpret-trailers.c
  * builtin/merge-file.c
  * builtin/show-ref.c
  * builtin/update-index.c
  * chunk-format.c
  * parse-options.c
  * scalar.c

While at it, drop unused strings.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2023-11-12 20:35:53 +07:00
e0939bec27 RelNotes: minor wording fixes in 2.43.0 release notes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-12 09:59:28 +09:00
9836cb75d1 l10n: fr: v2.43.0 rnd 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2023-11-11 10:02:35 +01:00
615993d092 Makefile: stop using test -o when unlinking duplicate executables
When building executables we may end up with both `foo` and `foo.exe` in
the project's root directory. This can cause issues on Cygwin, which is
why we unlink the `foo` binary (see 6fc301bbf6 (Makefile: remove $foo
when $foo.exe is built/installed., 2007-01-10)). This step is skipped if
either:

    - `foo` is a directory, which can happen when building Git on
      Windows via MSVC (see ade2ca0ca9 (Do not try to remove
      directories when removing old links, 2009-10-27)).

    - `foo` is a hardlink to `foo.exe`, which can happen on Cygwin (see
      0d768f7c8f (Makefile: building git in cygwin 1.7.0, 2008-08-15)).

These two conditions are currently chained together via `test -o`, which
is discouraged by our code style guide. Convert the recipe to instead
use an `if` statement with `&&`'d conditions, which both matches our
style guide and is easier to ready.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-11 09:21:00 +09:00
47c39c28bc contrib/subtree: convert subtree type check to use case statement
The `subtree_for_commit ()` helper function asserts that the subtree
identified by its parameters are either a commit or tree. This is done
via the `-o` parameter of test, which is discouraged.

Refactor the code to instead use a switch statement over the type.
Despite being aligned with our coding guidelines, the resulting code is
arguably also easier to read.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-11 09:21:00 +09:00
88983946fa contrib/subtree: stop using -o to test for number of args
Functions in git-subtree.sh all assert that they are being passed the
correct number of arguments. In cases where we accept a variable number
of arguments we assert this via a single call to `test` with `-o`, which
is discouraged by our coding guidelines.

Convert these cases to stop doing so. This requires us to decompose
assertions of the style `assert test $# = 2 -o $# = 3` into two calls
because we have no easy way to logically chain statements passed to the
assert function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-11 09:21:00 +09:00
13420028e5 global: convert trivial usages of test <expr> -a/-o <expr>
Our coding guidelines say to not use `test` with `-a` and `-o` because
it can easily lead to bugs. Convert trivial cases where we still use
these to instead instead concatenate multiple invocations of `test` via
`&&` and `||`, respectively.

While not all of the converted instances can cause ambiguity, it is
worth getting rid of all of them regardless:

    - It becomes easier to reason about the code as we do not have to
      argue why one use of `-a`/`-o` is okay while another one isn't.

    - We don't encourage people to use these expressions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-11 09:21:00 +09:00
0856f13aba t9164: fix inability to find basename(1) in Subversion hooks
Hooks executed by Subversion are spawned with an empty environment. By
default, not even variables like PATH will be propagated to them. In
order to ensure that we're still able to find required executables, we
thus write the current PATH variable into the hook script itself and
then re-export it in t9164.

This happens too late in the script though, as we already tried to
execute the basename(1) utility before exporting the PATH variable. This
tends to work on most platforms as the fallback value of PATH for Bash
(see `getconf PATH`) is likely to contain this binary. But on more
exotic platforms like NixOS this is not the case, and thus the test
fails.

While we could work around this issue by simply setting PATH earlier, it
feels fragile to inject a user-controlled value into the script and have
the shell interpret it. Instead, we can refactor the hook setup to write
a `hooks-env` file that configures PATH for us. Like this, Subversion
will know to set up the environment as expected for all hooks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-11 09:00:42 +09:00
5d70afa5d8 t/lib-httpd: stop using legacy crypt(3) for authentication
When setting up httpd for our tests, we also install a passwd and
proxy-passwd file that contain the test user's credentials. These
credentials currently use crypt(3) as the password encryption schema.

This schema can be considered deprecated nowadays as it is not safe
anymore. Quoting Apache httpd's documentation [1]:

> Unix only. Uses the traditional Unix crypt(3) function with a
> randomly-generated 32-bit salt (only 12 bits used) and the first 8
> characters of the password. Insecure.

This is starting to cause issues in modern Linux distributions. glibc
has deprecated its libcrypt library that used to provide crypt(3) in
favor of the libxcrypt library. This newer replacement provides a
compile time switch to disable insecure password encryption schemata,
which causes crypt(3) to always return `EINVAL`. The end result is that
httpd tests that exercise authentication will fail on distros that use
libxcrypt without these insecure encryption schematas.

Regenerate the passwd files to instead use the default password
encryption schema, which is md5. While it feels kind of funny that an
MD5-based encryption schema should be more secure than anything else, it
is the current default and supported by all platforms. Furthermore, it
really doesn't matter all that much given that these files are only used
for testing purposes anyway.

[1]: https://httpd.apache.org/docs/2.4/misc/password_encryptions.html

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-11 09:00:42 +09:00
7d05974d72 t/lib-httpd: dynamically detect httpd and modules path
In order to set up the Apache httpd server, we need to locate both the
httpd binary and its default module path. This is done with a hardcoded
list of locations that we scan. While this works okayish with distros
that more-or-less follow the Filesystem Hierarchy Standard, it falls
apart on others like NixOS that don't.

While it is possible to specify these paths via `LIB_HTTPD_PATH` and
`LIB_HTTPD_MODULE_PATH`, it is not a nice experience for the developer
to figure out how to set those up. And in fact we can do better by
dynamically detecting both httpd and its module path at runtime:

    - The httpd binary can be located via PATH.

    - The module directory can (in many cases) be derived via the
      `HTTPD_ROOT` compile-time variable.

Amend the code to do so.

Note that the new runtime-detected paths will only be used as a fallback
in case none of the hardcoded paths are usable. For the PATH lookup this
is because httpd is typically installed into "/usr/sbin", which is often
not included in the user's PATH variable. And the module path detection
relies on a configured httpd installation and may thus not work in all
cases, either.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-11 09:00:42 +09:00
a6c8b7d632 l10n: update uk localization for v2.43
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2023-11-10 10:18:19 -08:00
219d54ae8c format-patch: fix ignored encode_email_headers for cover letter
When writing the cover letter, the encode_email_headers option was
ignored. That is, UTF-8 subject lines and email addresses were
written out as-is, without any Q-encoding, even if
--encode-email-headers was passed on the command line.

This is due to encode_email_headers not being copied over from
struct rev_info to struct pretty_print_context. Fix that and add
a test.

Signed-off-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 11:04:11 +09:00
a2c5e294db unit-tests: do show relative file paths
Visual C interpolates `__FILE__` with the absolute _Windows_ path of
the source file. GCC interpolates it with the relative path, and the
tests even verify that.

So let's make sure that the unit tests only emit such paths.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:16:27 +09:00
694e89baeb cmake: handle also unit tests
The unit tests should also be available e.g. in Visual Studio's Test
Explorer when configuring Git's source code via CMake.

Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:16:27 +09:00
0df903d402 unit-tests: do not mistake .pdb files for being executable
When building the unit tests via CMake, the `.pdb` files are built.
Those are, essentially, files containing the debug information
separately from the executables.

Let's not confuse them with the executables we actually want to run.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:16:27 +09:00
2f2729f3a4 cmake: use test names instead of full paths
The primary purpose of Git's CMake definition is to allow developing Git
in Visual Studio. As part of that, the CTest feature allows running
individual test scripts conveniently in Visual Studio's Test Explorer.

However, this Test Explorer's design targets object-oriented languages
and therefore expects the test names in the form
`<namespace>.<class>.<testname>`. And since we specify the full path
of the test scripts instead, including the ugly `/.././t/` part, these
dots confuse the Test Explorer and it uses a large part of the path as
"namespace".

Let's just use `t.suite.<name>` instead. This presents the tests in
Visual Studio's Test Explorer in the following form by default (i.e.
unless the user changes the view via the "Group by" menu):

	◢ ◈ git
	 ◢ ◈ t
	  ◢ ◈ suite
	     ◈ t0000-basic
	     ◈ t0001-init
	     ◈ t0002-gitfile
	     [...]

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:16:27 +09:00
a15d4465a9 cmake: also build unit tests
A new, better way to run unit tests was just added to Git. This adds
support for building those unit tests via CMake.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:16:27 +09:00
5bd7fb49af cmake: fix typo in variable name
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:16:27 +09:00
ca76cca3a6 artifacts-tar: when including .dll files, don't forget the unit-tests
As of recent, Git also builds executables in `t/unit-tests/`. For
technical reasons, when building with CMake and Visual C, the
dependencies (".dll files") need to be copied there, too, otherwise
running the executable will fail "due to missing dependencies".

The CMake definition already contains the directives to copy those
`.dll` files, but we also need to adjust the `artifacts-tar` rule in
the `Makefile` accordingly to let the `vs-test` job in the CI runs
pass successfully.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:16:27 +09:00
d8f416bbb8 ci: run unit tests in CI
Run unit tests in both Cirrus and GitHub CI. For sharded CI instances
(currently just Windows on GitHub), run only on the first shard. This is
OK while we have only a single unit test executable, but we may wish to
distribute tests more evenly when we add new unit tests in the future.

We may also want to add more status output in our unit test framework,
so that we can do similar post-processing as in
ci/lib.sh:handle_failed_tests().

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:15:32 +09:00
e137fe3b29 unit tests: add TAP unit test framework
This patch contains an implementation for writing unit tests with TAP
output. Each test is a function that contains one or more checks. The
test is run with the TEST() macro and if any of the checks fail then the
test will fail. A complete program that tests STRBUF_INIT would look
like

     #include "test-lib.h"
     #include "strbuf.h"

     static void t_static_init(void)
     {
             struct strbuf buf = STRBUF_INIT;

             check_uint(buf.len, ==, 0);
             check_uint(buf.alloc, ==, 0);
             check_char(buf.buf[0], ==, '\0');
     }

     int main(void)
     {
             TEST(t_static_init(), "static initialization works);

             return test_done();
     }

The output of this program would be

     ok 1 - static initialization works
     1..1

If any of the checks in a test fail then they print a diagnostic message
to aid debugging and the test will be reported as failing. For example a
failing integer check would look like

     # check "x >= 3" failed at my-test.c:102
     #    left: 2
     #   right: 3
     not ok 1 - x is greater than or equal to three

There are a number of check functions implemented so far. check() checks
a boolean condition, check_int(), check_uint() and check_char() take two
values to compare and a comparison operator. check_str() will check if
two strings are equal. Custom checks are simple to implement as shown in
the comments above test_assert() in test-lib.h.

Tests can be skipped with test_skip() which can be supplied with a
reason for skipping which it will print. Tests can print diagnostic
messages with test_msg().  Checks that are known to fail can be wrapped
in TEST_TODO().

There are a couple of example test programs included in this
patch. t-basic.c implements some self-tests and demonstrates the
diagnostic output for failing test. The output of this program is
checked by t0080-unit-test-output.sh. t-strbuf.c shows some example
unit tests for strbuf.c

The unit tests will be built as part of the default "make all" target,
to avoid bitrot. If you wish to build just the unit tests, you can run
"make build-unit-tests". To run the tests, you can use "make unit-tests"
or run the test binaries directly, as in "./t/unit-tests/bin/t-strbuf".

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:15:32 +09:00
581790eeee unit tests: add a project plan document
In our current testing environment, we spend a significant amount of
effort crafting end-to-end tests for error conditions that could easily
be captured by unit tests (or we simply forgo some hard-to-setup and
rare error conditions). Describe what we hope to accomplish by
implementing unit tests, and explain some open questions and milestones.
Discuss desired features for test frameworks/harnesses, and provide a
comparison of several different frameworks. Finally, document our
rationale for implementing a custom framework.

Co-authored-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-10 08:15:25 +09:00
f8fcea1a35 l10n: sv.po: Update Swedish translation (5579t)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2023-11-09 14:29:25 +01:00
e020391673 commit-graph: mark chunk error messages for translation
The patches from f32af12cee (Merge branch 'jk/chunk-bounds', 2023-10-23)
added many new untranslated error messages. While it's unlikely for most
users to see these messages at all, most of the other commit-graph error
messages are translated (and likewise for the matching midx messages).

Let's mark them all for consistency (and to help any poor unfortunate
user who does manage to find a broken graph file).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:54 +09:00
f4e4756c54 commit-graph: drop verify_commit_graph_lite()
As we've moved all of the checks from this function directly into the
chunk-reading code used by the caller (and there is only one caller), we
can just drop it entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:54 +09:00
06fb135f8e commit-graph: check order while reading fanout chunk
We read the fanout chunk, storing a pointer to it, but only confirm that
the entries are monotonic in a final "lite" verification step. Let's
move that into the actual OIDF chunk callback, so that we can report
problems immediately (for all the reasons given in the previous
"commit-graph: abort as soon as we see a bogus chunk" commit).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:53 +09:00
d3b6f6c631 commit-graph: use fanout value for graph size
Commit-graph, midx, and pack idx files all have both a lookup table of
oids and an oid fanout table. In midx and pack idx files, we take the
final entry of the fanout table as the source of truth for the number of
entries, and then verify that the size of the lookup table matches that.
But for commit-graph files, we do the opposite: we use the size of the
lookup table as the source of truth, and then check the final fanout
entry against it.

As noted in 4169d89645 (commit-graph: check consistency of fanout
table, 2023-10-09), either is correct. But there are a few reasons to
prefer the fanout table as the source of truth:

  1. The fanout entries are 32-bits on disk, and that defines the
     maximum number of entries we can store. But since the size of the
     lookup table is only bounded by the filesystem, it can be much
     larger. And hence computing it as the commit-graph does means that
     we may truncate the result when storing it in a uint32_t.

  2. We read the fanout first, then the lookup table. If we're verifying
     the chunks as we read them, then we'd want to take the fanout as
     truth (we have nothing yet to check it against) and then we can
     check that the lookup table matches what we already know.

  3. It is pointlessly inconsistent with the midx and pack idx code.
     Since the three have to do similar size and bounds checks, it is
     easier to reason about all three if they use the same approach.

So this patch moves the assignment of g->num_commits to the fanout
parser, and then we can check the size of the lookup chunk as soon as we
try to load it.

There's already a test covering this situation, which munges the final
fanout entry to 2^32-1. In the current code we complain that it does not
agree with the table size. But now that we treat the munged value as the
source of truth, we'll complain that the lookup table is the wrong size
(again, either is correct). So we'll have to update the message we
expect (and likewise for an earlier test which does similar munging).

There's a similar test for this situation on the midx side, but rather
than making a very-large fanout value, it just truncates the lookup
table. We could do that here, too, but the very-large fanout value
actually shows an interesting corner case. On a 32-bit system,
multiplying to find the expected table size would cause an integer
overflow. Using st_mult() would detect that, but cause us to die()
rather than falling back to the non-graph code path. Checking the size
using division (as we do with existing chunk-size checks) avoids the
overflow entirely, and the test demonstrates this when run on a 32-bit
system.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:53 +09:00
8bd40ed2ae commit-graph: abort as soon as we see a bogus chunk
The code to read commit-graph files tries to read all of the required
chunks, but doesn't abort if we can't find one (or if it's corrupted).
It's only at the end of reading the file that we then do some sanity
checks for NULL entries. But it's preferable to detect the errors and
bail immediately, for a few reasons:

  1. It's less error-prone. It's easy in the reader functions to flag an
     error but still end up setting some struct fields (an error I in
     fact made while working on this patch series).

  2. It's safer. Since verifying some chunks depends on the values of
     other chunks, we may be depending on not-yet-verified data. I don't
     know offhand of any case where this can cause problems, but it's
     one less subtle thing to worry about in the reader code.

  3. It prevents the user from seeing nonsense errors. If we're missing
     an OIDL chunk, then g->num_commits will be zero. And so we may
     complain that the size of our CDAT chunk (which should have a
     fixed-size record for each commit) is wrong unless it's also zero.
     But that's misleading; the problem is the missing OIDL chunk; the
     CDAT one might be fine!

So let's just check the return value from read_chunk(). This is exactly
how the midx chunk-reading code does it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:53 +09:00
93d2924729 commit-graph: clarify missing-chunk error messages
When a required commit-graph chunk cannot be loaded, we leave its entry
in the struct NULL, and then later complain that it is missing. But
that's just one reason we might not have loaded it, as we also do some
data quality checks.

Let's switch these messages to say "missing or corrupted", which is
exactly what the midx code says for the same cases. Likewise, we'll use
the same phrasing and capitalization as those for consistency. And while
we're here, we can mark them for translation (just like the midx ones).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:53 +09:00
92de4c5d56 commit-graph: drop redundant call to "lite" verification
The idea of verify_commit_graph_lite() is to have cheap verification
checks both for everyday use of the graph files (to avoid out of bounds
reads, etc) as well as for doing a full check via "commit-graph verify"
(which will also check the hash, etc).

But the expensive verification checks operate on a commit_graph struct,
which we get by using the normal everyday-reader code! So any problem
we'd find by calling it would have been found before we even got to the
verify_one_commit_graph() function.

Removing it simplifies the code a bit, but also frees us up to move the
"lite" verification steps around within that everyday-reader code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:53 +09:00
9d78fb0eb6 midx: check consistency of fanout table
The commit-graph, midx, and pack idx on-disk formats all have oid fanout
tables which are fed to bsearch_hash(). If these tables do not increase
monotonically, then the binary search may not only produce bogus values,
it may cause out of bounds reads.

We fixed this for commit graphs in 4169d89645 (commit-graph: check
consistency of fanout table, 2023-10-09). That commit argued that we did
not need to do the same for midx and pack idx files, because they
already did this check. However, that is wrong. We _do_ check the fanout
table for pack idx files when we load them, but we only do so for midx
files when running "git multi-pack-index verify". So it is possible to
get an out-of-bounds read by running a normal command with a specially
crafted midx file.

Let's fix this using the same solution (and roughly the same test) we
did for the commit-graph in 4169d89645. This replaces the same check
from "multi-pack-index verify", because verify uses the same read
routines, we'd bail on reading the midx much sooner now. So let's make
sure to copy its verbose error message.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:52 +09:00
4bc6d43271 commit-graph: handle overflow in chunk_size checks
We check the size of chunks with fixed records by multiplying the width
of each record by the number of commits in the file. Like:

  if (chunk_size != g->num_commits * GRAPH_DATA_WIDTH)

If this multiplication overflows, we may not notice a chunk is too small
(which could later lead to out-of-bound reads).

In the current code this is only possible for the CDAT chunk, but the
reasons are quite subtle. We compute g->num_commits by dividing the size
of the OIDL chunk by the hash length (since it consists of a bunch of
hashes). So we know that any size_t multiplication that uses a value
smaller than the hash length cannot overflow. And the CDAT records are
the only ones that are larger (the others are just 4-byte records). So
it's worth fixing all of these, to make it clear that they're not
subject to overflow (without having to reason about seemingly unrelated
code).

The obvious thing to do is add an st_mult(), like:

  if (chunk_size != st_mult(g->num_commits, GRAPH_DATA_WIDTH))

And that certainly works, but it has one downside: if we detect an
overflow, we'll immediately die(). But the commit graph is an optional
file; if we run into other problems loading it, we'll generally return
an error and fall back to accessing the full objects. Using st_mult()
means a malformed file will abort the whole process.

So instead, we can do a division like this:

  if (chunk_size / GRAPH_DATA_WIDTH != g->num_commits)

where there's no possibility of overflow. We do lose a little bit of
precision; due to integer division truncation we'd allow up to an extra
GRAPH_DATA_WIDTH-1 bytes of data in the chunk. That's OK. Our main goal
here is making sure we don't have too _few_ bytes, which would cause an
out-of-bounds read (we could actually replace our "!=" with "<", but I
think it's worth being a little pedantic, as a large mismatch could be a
sign of other problems).

I didn't add a test here. We'd need to generate a very large graph file
in order to get g->num_commits large enough to cause an overflow. And a
later patch in this series will use this same division technique in a
way that is much easier to trigger in the tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 19:07:52 +09:00
0e3b67e2aa ci: add support for GitLab CI
We already support Azure Pipelines and GitHub Workflows in the Git
project, but until now we do not have support for GitLab CI. While it is
arguably not in the interest of the Git project to maintain a ton of
different CI platforms, GitLab has recently ramped up its efforts and
tries to contribute to the Git project more regularly.

Part of a problem we hit at GitLab rather frequently is that our own,
custom CI setup we have is so different to the setup that the Git
project has. More esoteric jobs like "linux-TEST-vars" that also set a
couple of environment variables do not exist in GitLab's custom CI
setup, and maintaining them to keep up with what Git does feels like
wasted time. The result is that we regularly send patch series upstream
that fail to compile or pass tests in GitHub Workflows. We would thus
like to integrate the GitLab CI configuration into the Git project to
help us send better patch series upstream and thus reduce overhead for
the maintainer. Results of these pipeline runs will be made available
(at least) in GitLab's mirror of the Git project at [1].

This commit introduces the integration into our regular CI scripts so
that most of the setup continues to be shared across all of the CI
solutions. Note that as the builds on GitLab CI run as unprivileged
user, we need to pull in both sudo and shadow packages to our Alpine
based job to set this up.

[1]: https://gitlab.com/gitlab-org/git

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:10 +09:00
0d3911ad73 ci: install test dependencies for linux-musl
The linux-musl CI job executes tests on Alpine Linux, which is based on
musl libc instead of glibc. We're missing some test dependencies though,
which causes us to skip a subset of tests.

Install these test dependencies to increase our test coverage on this
platform. There are still some missing test dependecies, but these do
not have a corresponding package in the Alpine repositories:

    - p4 and p4d, both parts of the Perforce version control system.

    - cvsps, which generates patch sets for CVS.

    - Subversion and the SVN::Core Perl library, the latter of which is
      not available in the Alpine repositories. While the tool itself is
      available, all Subversion-related tests are skipped without the
      SVN::Core Perl library anyway.

The Apache2-based tests require a bit more care though. For one, the
module path is different on Alpine Linux, which requires us to add it to
the list of known module paths to detect it. But second, the WebDAV
module on Alpine Linux is broken because it does not bundle the default
database backend [1]. We thus need to skip the WebDAV-based tests on
Alpine Linux for now.

[1]: https://gitlab.alpinelinux.org/alpine/aports/-/issues/13112

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:10 +09:00
dd02c3b68c ci: squelch warnings when testing with unusable Git repo
Our CI jobs that run on Docker also use mostly the same architecture to
build and test Git via the "ci/run-build-and-tests.sh" script. These
scripts also provide some functionality to massage the Git repository
we're supposedly operating in.

In our Docker-based infrastructure we may not even have a Git repository
available though, which leads to warnings when those functions execute.
Make the helpers exit gracefully in case either there is no Git in our
PATH, or when not running in a Git repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:10 +09:00
9f17bef9a6 ci: unify setup of some environment variables
Both GitHub Actions and Azure Pipelines set up the environment variables
GIT_TEST_OPTS, GIT_PROVE_OPTS and MAKEFLAGS. And while most values are
actually the same, the setup is completely duplicate. With the upcoming
support for GitLab CI this duplication would only extend even further.

Unify the setup of those environment variables so that only the uncommon
parts are separated. While at it, we also perform some additional small
improvements:

    - We now always pass `--state=failed,slow,save` via GIT_PROVE_OPTS.
      It doesn't hurt on platforms where we don't persist the state, so
      this further reduces boilerplate.

    - When running on Windows systems we set `--no-chain-lint` and
      `--no-bin-wrappers`. Interestingly though, we did so _after_
      already having exported the respective environment variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:09 +09:00
e624f206bc ci: split out logic to set up failed test artifacts
We have some logic in place to create a directory with the output from
failed tests, which will then subsequently be uploaded as CI artifacts.
We're about to add support for GitLab CI, which will want to reuse the
logic.

Split the logic into a separate function so that it is reusable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:09 +09:00
412847ced4 ci: group installation of Docker dependencies
The output of CI jobs tends to be quite long-winded and hard to digest.
To help with this, many CI systems provide the ability to group output
into collapsible sections, and we're also doing this in some of our
scripts.

One notable omission is the script to install Docker dependencies.
Address it to bring more structure to the output for Docker-based jobs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:09 +09:00
a7d499cb93 ci: make grouping setup more generic
Make the grouping setup more generic by always calling `begin_group ()`
and `end_group ()` regardless of whether we have stubbed those functions
or not. This ensures we can more readily add support for additional CI
platforms.

Furthermore, the `group ()` function is made generic so that it is the
same for both GitHub Actions and for other platforms. There is a
semantic conflict here though: GitHub Actions used to call `set +x` in
`group ()` whereas the non-GitHub case unconditionally uses `set -x`.
The latter would get overriden if we kept the `set +x` in the generic
version of `group ()`. To resolve this conflict, we simply drop the `set
+x` in the generic variant of this function. As `begin_group ()` calls
`set -x` anyway this is not much of a change though, as the only
commands that aren't printed anymore now are the ones between the
beginning of `group ()` and the end of `begin_group ()`.

Last, this commit changes `end_group ()` to also accept a parameter that
indicates _which_ group should end. This will be required by a later
commit that introduces support for GitLab CI.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:09 +09:00
a4761b605c ci: reorder definitions for grouping functions
We define a set of grouping functions that are used to group together
output in our CI, where these groups then end up as collapsible sections
in the respective pipeline platform. The way these functions are defined
is not easily extensible though as we have an up front check for the CI
_not_ being GitHub Actions, where we define the non-stub logic in the
else branch.

Reorder the conditional branches such that we explicitly handle GitHub
Actions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-09 18:56:08 +09:00
345ac93c21 l10n: tr: v2.43.0
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2023-11-09 11:20:54 +03:00
dadef801b3 Git 2.43-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-08 15:04:42 +09:00
8ed4eb7538 Merge branch 'tb/rev-list-unpacked-fix'
"git rev-list --unpacked --objects" failed to exclude packed
non-commit objects, which has been corrected.

* tb/rev-list-unpacked-fix:
  pack-bitmap: drop --unpacked non-commit objects from results
  list-objects: drop --unpacked non-commit objects from results
2023-11-08 15:04:42 +09:00
c732f7430d Merge branch 'ps/leakfixes'
Leakfix.

* ps/leakfixes:
  setup: fix leaking repository format
  setup: refactor `upgrade_repository_format()` to have common exit
  shallow: fix memory leak when registering shallow roots
  test-bloom: stop setting up Git directory twice
2023-11-08 15:04:41 +09:00
98009afd24 Prepare for -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-08 11:04:03 +09:00
a8e2394704 Merge branch 'jc/test-i18ngrep'
Another step to deprecate test_i18ngrep.

* jc/test-i18ngrep:
  tests: teach callers of test_i18ngrep to use test_grep
  test framework: further deprecate test_i18ngrep
2023-11-08 11:04:02 +09:00
ca320b256c Merge branch 'la/strvec-header-fix'
Code clean-up.

* la/strvec-header-fix:
  strvec: drop unnecessary include of hex.h
2023-11-08 11:04:02 +09:00
259e30d2bb Merge branch 'bc/merge-file-object-input'
"git merge-file" learns a mode to read three contents to be merged
from blob objects.

* bc/merge-file-object-input:
  merge-file: add an option to process object IDs
  git-merge-file doc: drop "-file" from argument placeholders
2023-11-08 11:04:01 +09:00
57e216d03d Merge branch 'kn/rev-list-missing-fix'
"git rev-list --missing" did not work for missing commit objects,
which has been corrected.

* kn/rev-list-missing-fix:
  rev-list: add commit object support in `--missing` option
  rev-list: move `show_commit()` to the bottom
  revision: rename bit to `do_not_die_on_missing_objects`
2023-11-08 11:04:01 +09:00
fe84aa5228 Merge branch 'an/clang-format-typofix'
Typofix.

* an/clang-format-typofix:
  clang-format: fix typo in comment
2023-11-08 11:04:00 +09:00
ed14fa1c2a Merge branch 'tb/format-pack-doc-update'
Doc update.

* tb/format-pack-doc-update:
  Documentation/gitformat-pack.txt: fix incorrect MIDX documentation
  Documentation/gitformat-pack.txt: fix typo
2023-11-08 11:04:00 +09:00
d8972a5abd Merge branch 'ps/show-ref'
Teach "git show-ref" a mode to check the existence of a ref.

* ps/show-ref:
  t: use git-show-ref(1) to check for ref existence
  builtin/show-ref: add new mode to check for reference existence
  builtin/show-ref: explicitly spell out different modes in synopsis
  builtin/show-ref: ensure mutual exclusiveness of subcommands
  builtin/show-ref: refactor options for patterns subcommand
  builtin/show-ref: stop using global vars for `show_one()`
  builtin/show-ref: stop using global variable to count matches
  builtin/show-ref: refactor `--exclude-existing` options
  builtin/show-ref: fix dead code when passing patterns
  builtin/show-ref: fix leaking string buffer
  builtin/show-ref: split up different subcommands
  builtin/show-ref: convert pattern to a local variable
2023-11-08 11:04:00 +09:00
42b87f7ee6 Merge branch 'ps/do-not-trust-commit-graph-blindly-for-existence'
The codepath to traverse the commit-graph learned to notice that a
commit is missing (e.g., corrupt repository lost an object), even
though it knows something about the commit (like its parents) from
what is in commit-graph.

* ps/do-not-trust-commit-graph-blindly-for-existence:
  commit: detect commits that exist in commit-graph but not in the ODB
  commit-graph: introduce envvar to disable commit existence checks
2023-11-08 11:03:59 +09:00
234037dbec Merge branch 'js/ci-use-macos-13'
Replace macos-12 used at GitHub CI with macos-13.

* js/ci-use-macos-13:
  ci: upgrade to using macos-13
2023-11-08 11:03:59 +09:00
650963cfd0 Merge branch 'jk/chunk-bounds'
Test portability fix.

* jk/chunk-bounds:
  t: avoid perl's pack/unpack "Q" specifier
2023-11-08 11:03:58 +09:00
55f95ed8ac Merge branch 'jk/tree-name-and-depth-limit'
Further limit tree depth max to avoid Windows build running out of
the stack space.

* jk/tree-name-and-depth-limit:
  max_tree_depth: lower it for MSVC to avoid stack overflows
2023-11-08 11:03:58 +09:00
7b3c8e9f38 pack-bitmap: drop --unpacked non-commit objects from results
When performing revision queries with `--objects` and
`--use-bitmap-index`, the output may incorrectly contain objects which
are packed, even when the `--unpacked` option is given. This affects
traversals, but also other querying operations, like `--count`,
`--disk-usage`, etc.

Like in the previous commit, the fix is to exclude those objects from
the result set before they are shown to the user (or, in this case,
before the bitmap containing the result of the traversal is enumerated
and its objects listed).

This is performed by a new function in pack-bitmap.c, called
`filter_packed_objects_from_bitmap()`. Note that we do not have to
inspect individual bits in the result bitmap, since we know that the
first N (where N is the number of objects in the bitmap's pack/MIDX)
bits correspond to objects which packed by definition.

In other words, for an object to have a bitmap position (not in the
extended index), it must appear in either the bitmap's pack or one of
the packs in its MIDX.

This presents an appealing optimization to us, which is that we can
simply memset() the corresponding number of `eword_t`'s to zero,
provided that we handle any objects which spill into the next word (but
don't occupy all 64 bits of the word itself).

We only have to handle objects in the bitmap's extended index. These
objects may (or may not) appear in one or more pack(s). Since these
objects are known to not appear in either the bitmap's MIDX or pack,
they may be stored as loose, appear in other pack(s), or both.

Before returning a bitmap containing the result of the traversal back to
the caller, drop any bits from the extended index which appear in one or
more packs. This implements the correct behavior for rev-list operations
which use the bitmap index to compute their result.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 11:23:52 +09:00
4263f9279e list-objects: drop --unpacked non-commit objects from results
In git-rev-list(1), we describe the `--unpacked` option as:

    Only useful with `--objects`; print the object IDs that are not in
    packs.

This is true of commits, which we discard via get_commit_action(), but
not of the objects they reach. So if we ask for an --objects traversal
with --unpacked, we may get arbitrarily many objects which are indeed
packed.

I am nearly certain this behavior dates back to the introduction of
`--unpacked` via 12d2a18780 ("git rev-list --unpacked" shows only
unpacked commits, 2005-07-03), but I couldn't get that revision of Git
to compile for me. At least as early as v2.0.0 this has been subtly
broken:

    $ git.compile --version
    git version 2.0.0

    $ git.compile rev-list --objects --all --unpacked
    72791fe96c93f9ec5c311b8bc966ab349b3b5bbe
    05713d991c18bbeef7e154f99660005311b5004d v1.0
    153ed8b7719c6f5a68ce7ffc43133e95a6ac0fdb
    8e4020bb5a8d8c873b25de15933e75cc0fc275df one
    9200b628cf9dc883a85a7abc8d6e6730baee589c two
    3e6b46e1b7e3b91acce99f6a823104c28aae0b58 unpacked.t

There, only the first, third, and sixth entries are loose, with the
remaining set of objects belonging to at least one pack.

The implications for this are relatively benign: bare 'git repack'
invocations which invoke pack-objects with --unpacked are impacted, and
at worst we'll store a few extra objects that should have been excluded.

Arguably changing this behavior is a backwards-incompatible change,
since it alters the set of objects emitted from rev-list queries with
`--objects` and `--unpacked`. But I argue that this change is still
sensible, since the existing implementation deviates from
clearly-written documentation.

The fix here is straightforward: avoid showing any non-commit objects
which are contained in packs by discarding them within list-objects.c,
before they are shown to the user. Note that similar treatment for
`list-objects.c::show_commit()` is not needed, since that case is
already handled by `revision.c::get_commit_action()`.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 11:23:51 +09:00
8be77c5de6 RelNotes: improve wording of credential helper notes
Offer a slightly more verbose description of the issue fixed by
7144dee3ec (credential/libsecret: erase matching creds only, 2023-07-26)
and cb626f8e5c (credential/wincred: erase matching creds only,
2023-07-26).

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 10:27:12 +09:00
7bac6a4b1b RelNotes: minor typo fixes in 2.43.0 draft
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 10:27:12 +09:00
3596e182a2 A bit more before -rc1 2023-11-07 10:26:45 +09:00
a362332c56 Merge branch 'rc/trace-upload-pack'
Trace2 update.

* rc/trace-upload-pack:
  upload-pack: add tracing for fetches
2023-11-07 10:26:45 +09:00
840bd1c9ef Merge branch 'es/bugreport-no-extra-arg'
"git bugreport" learned to complain when it received a command line
argument that it will not use.

* es/bugreport-no-extra-arg:
  bugreport: reject positional arguments
  t0091-bugreport: stop using i18ngrep
2023-11-07 10:26:44 +09:00
9f7fbe07dc Merge branch 'js/my-first-contribution-update'
Documentation update.

* js/my-first-contribution-update:
  Include gettext.h in MyFirstContribution tutorial
2023-11-07 10:26:44 +09:00
00f372e2a4 Merge branch 'ms/send-email-validate-fix'
"git send-email" did not have certain pieces of data computed yet
when it tried to validate the outging messages and its recipient
addresses, which has been sorted out.

* ms/send-email-validate-fix:
  send-email: move validation code below process_address_list
2023-11-07 10:26:44 +09:00
dbffe54f8a Merge branch 'rs/reflog-expire-single-worktree-fix'
"git reflog expire --single-worktree" has been broken for the past
20 months or so, which has been corrected.

* rs/reflog-expire-single-worktree-fix:
  reflog: fix expire --single-worktree
2023-11-07 10:26:44 +09:00
c0329432ac Merge branch 'rs/fix-arghelp'
Doc and help update.

* rs/fix-arghelp:
  am, rebase: fix arghelp syntax of --empty
2023-11-07 10:26:43 +09:00
5f11becce0 Merge branch 'rs/parse-options-cmdmode'
parse-options improvements for OPT_CMDMODE options.

* rs/parse-options-cmdmode:
  am: simplify --show-current-patch handling
  parse-options: make CMDMODE errors more precise
2023-11-07 10:26:43 +09:00
2d2cd0a1bc Merge branch 'jc/grep-f-relative-to-cwd'
"cd sub && git grep -f patterns" tried to read "patterns" file at
the top level of the working tree; it has been corrected to read
"sub/patterns" instead.

* jc/grep-f-relative-to-cwd:
  grep: -f <path> is relative to $cwd
2023-11-07 10:26:43 +09:00
e6bb35d996 Merge branch 'ar/submitting-patches-doc-update'
Doc update.

* ar/submitting-patches-doc-update:
  SubmittingPatches: call gitk's command "Copy commit reference"
2023-11-07 10:26:42 +09:00
9972cd6004 setup: fix leaking repository format
While populating the `repository_format` structure may cause us to
allocate memory, we do not call `clear_repository_format()` in some
places and thus potentially leak memory. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 08:51:41 +09:00
4ce14e1325 setup: refactor upgrade_repository_format() to have common exit
The `upgrade_repository_format()` function has multiple exit paths,
which means that there is no common cleanup of acquired resources.
While this isn't much of a problem right now, we're about to fix a
memory leak that would require us to free the resource in every one of
those exit paths.

Refactor the code to have a common exit path so that the subsequent
memory leak fix becomes easier to implement.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 08:51:41 +09:00
568cc818cc shallow: fix memory leak when registering shallow roots
When registering shallow roots, we unset the list of parents of the
to-be-registered commit if it's already been parsed. This causes us to
leak memory though because we never free this list. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 08:51:41 +09:00
40e9136ff6 test-bloom: stop setting up Git directory twice
We're setting up the Git directory twice in the `test-tool bloom`
helper, once at the beginning of `cmd_bloom()` and once in the local
subcommand implementation `get_bloom_filter_for_commit()`. This can lead
to memory leaks as we'll overwrite variables of `the_repository` with
newly allocated data structures. On top of that it's simply unnecessary.

Fix this by only setting up the Git directory once.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 08:51:40 +09:00
1164c7232e attr: enable attr pathspec magic for git-add and git-stash
Allow users to limit or exclude files based on file attributes
during git-add and git-stash.

For example, the chromium project would like to use

    $ git add . ':(exclude,attr:submodule)'

as submodules are managed by an external tool, forbidding end users
to record changes with "git add".  Allowing "git add" to often
records changes that users do not want in their commits.

This commit does not change any attr magic implementation. It is
only adding attr as an allowed pathspec in git-add and git-stash,
which was previously blocked by GUARD_PATHSPEC and a pathspec mask
in parse_pathspec()).

However, we fix a bug in prefix_magic() where attr values were
unintentionally removed.  This was triggerable when parse_pathspec()
is called with PATHSPEC_PREFIX_ORIGIN as a flag, which was the case
for git-stash (Bug originally filed here [*])

Furthermore, while other commands hit this code path it did not
result in unexpected behavior because this bug only impacts the
pathspec->items->original field which is NOT used to filter
paths. However, git-stash does use pathspec->items->original when
building args used to call other git commands.  (See add_pathspecs()
usage and implementation in stash.c)

It is possible that when the attr pathspec feature was first added
in b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few
GUARD_PATHSPEC() invocations.

Later, to get a more user-friendly error message when attr was used
with git-add, PATHSPEC_ATTR was added as a mask to git-add's
invocation of parse_pathspec() 84d938b732 (add: do not accept
pathspec magic 'attr', 2018-09-18).  However, this user-friendly
error message was never added for git-stash.

[Reference]
 * https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/)

Signed-off-by: Joanna Wang <jojwang@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-04 17:00:27 +09:00
4815c3c4b2 t: avoid perl's pack/unpack "Q" specifier
The perl script introduced by 86b008ee61 (t: add library for munging
chunk-format files, 2023-10-09) uses pack("Q") and unpack("Q") to read
and write 64-bit values ("quadwords" in perl parlance) from the on-disk
chunk files. However, some builds of perl may not support 64-bit
integers at all, and throw an exception here. While some 32-bit
platforms may still support 64-bit integers in perl (such as our linux32
CI environment), others reportedly don't (the NonStop 32-bit builds).

We can work around this by treating the 64-bit values as two 32-bit
values. We can't ever combine them into a single 64-bit value, but in
practice this is OK. These are representing file offsets, and our files
are much smaller than 4GB. So the upper half of the 64-bit value will
always be 0.

We can just introduce a few helper functions which perform the
translation and double-check our assumptions.

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-04 15:54:25 +09:00
682a868f67 ci: upgrade to using macos-13
In April, GitHub announced that the `macos-13` pool is available:
https://github.blog/changelog/2023-04-24-github-actions-macos-13-is-now-available/.
It is only a matter of time until the `macos-12` pool is going away,
therefore we should switch now, without pressure of a looming deadline.

Since the `macos-13` runners no longer include Python2, we also drop
specifically testing with Python2 and switch uniformly to Python3, see
https://github.com/actions/runner-images/blob/HEAD/images/macos/macos-13-Readme.md
for details about the software available on the `macos-13` pool's
runners.

Also, on macOS 13, Homebrew seems to install a `gcc@9` package that no
longer comes with a regular `unistd.h` (there seems only to be a
`ssp/unistd.h`), and hence builds would fail with:

    In file included from base85.c:1:
    git-compat-util.h:223:10: fatal error: unistd.h: No such file or directory
      223 | #include <unistd.h>
          |          ^~~~~~~~~~
    compilation terminated.

The reason why we install GCC v9.x explicitly is historical, and back in
the days it was because it was the _newest_ version available via
Homebrew: 176441bfb5 (ci: build Git with GCC 9 in the 'osx-gcc' build
job, 2019-11-27).

To reinstate the spirit of that commit _and_ to fix that build failure,
let's switch to the now-newest GCC version: v13.x.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 18:52:02 +09:00
a6f43364e3 t: mark several tests that assume the files backend with REFFILES
Add the REFFILES prerequisite to several tests that assume we're using
the files backend. There are various reasons why we cannot easily
convert those tests to be backend-independent, where the most common
one is that we have no way to write corrupt references into the refdb
via our tooling. We may at a later point in time grow the tooling to
make this possible, but for now we just mark these tests as requiring
the files backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:07 +09:00
170ba45acf t7900: assert the absence of refs via git-for-each-ref(1)
We're asserting that a prefetch of remotes via git-maintenance(1)
doesn't write any references in refs/remotes by validating that the
directory ".git/refs/remotes" is missing. This is quite roundabout: we
don't care about the directory existing, we care about the references
not existing, and the way these are stored is on the behest of the
reference database.

Convert the test to instead check via git-for-each-ref(1) whether any
remote reference exist.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:07 +09:00
390c5b07e2 t7300: assert exact states of repo
Some of the tests in t7300 verify that git-clean(1) doesn't touch
repositories that are embedded into the main repository. This is done by
asserting a small set of substructures that are assumed to always exist,
like the "refs/", "objects/" or "HEAD". This has the downside that we
need to assume a specific repository structure that may be subject to
change when new backends for the refdb land. At the same time, we don't
thoroughly assert that git-clean(1) really didn't end up cleaning any
files in the repository either.

Convert the tests to instead assert that all files continue to exist
after git-clean(1) by comparing a file listing via find(1) before and
after executing clean. This makes our actual assertions stricter while
having to care less about the repository's actual on-disk format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:07 +09:00
c6429fb867 t4207: delete replace references via git-update-ref(1)
In t4207 we set up a set of replace objects via git-replace(1). Because
these references should not be impacting subsequent tests we also set up
some cleanup logic that deletes the replacement references via a call to
`rm -rf`. This reaches into the internal implementation details of the
reference backend and will thus break when we grow an alternative refdb
implementation.

Refactor the tests to delete the replacement refs via Git commands so
that we become independent of the actual refdb that's in use. As we
don't have a nice way to delete all replacements or all references in a
certain namespace, we opt for a combination of git-for-each-ref(1) and
git-update-ref(1)'s `--stdin` mode.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:07 +09:00
c603138e3d t1450: convert tests to remove worktrees via git-worktree(1)
Some of our tests in t1450 create worktrees and then corrupt them.
As it is impossible to delete such worktrees via a normal call to `git
worktree remove`, we instead opt to remove them manually by calling
rm(1) instead.

This is ultimately unnecessary though as we can use the `-f` switch to
remove the worktree. Let's convert the tests to do so such that we don't
have to reach into internal implementation details of worktrees.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:06 +09:00
668e31c690 t: convert tests to not access reflog via the filesystem
Some of our tests reach directly into the filesystem in order to both
read or modify the reflog, which will break once we have a second
reference backend in our codebase that stores reflogs differently.

Refactor these tests to either use git-reflog(1) or the ref-store test
helper. Note that the refactoring to use git-reflog(1) also requires us
to adapt our expectations in some cases where we previously verified the
exact on-disk log entries. This seems like an acceptable tradeoff though
to ensure that different backends have the same user-visible behaviour
as any user would typically use git-reflog(1) anyway to access the logs.
Any backend-specific verification of the written on-disk format should
be implemented in a separate, backend-specific test.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:06 +09:00
2393711681 t: convert tests to not access symrefs via the filesystem
Some of our tests access symbolic references via the filesystem
directly. While this works with the current files reference backend, it
this will break once we have a second reference backend in our codebase.

Refactor these tests to instead use git-symbolic-ref(1) or our
`ref-store` test tool. The latter is required in some cases where safety
checks of git-symbolic-ref(1) would otherwise reject writing a symbolic
reference.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:06 +09:00
1c6667cb9d t: convert tests to not write references via the filesystem
Some of our tests manually create, update or delete references by
writing the respective new values into the filesystem directly. While
this works with the current files reference backend, this will break
once we have a second reference backend implementation in our codebase.

Refactor these tests to instead use git-update-ref(1) or our `ref-store`
test tool. The latter is required in some cases where safety checks of
git-update-ref(1) would otherwise reject a reference update.

While at it, refactor some of the tests to schedule the cleanup command
via `test_when_finished` before modifying the repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:06 +09:00
9ddd5b883b t: allow skipping expected object ID in ref-store update-ref
We require the caller to pass both the old and new expected object ID to
our `test-tool ref-store update-ref` helper. When trying to update a
symbolic reference though it's impossible to specify the expected object
ID, which means that the test would instead have to force-update the
reference. This is currently impossible though.

Update the helper to optionally skip verification of the old object ID
in case the test passes in an empty old object ID as input.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:37:06 +09:00
d5dbad0c76 Merge branch 'ps/show-ref' into ps/ref-tests-update
* ps/show-ref:
  t: use git-show-ref(1) to check for ref existence
  builtin/show-ref: add new mode to check for reference existence
  builtin/show-ref: explicitly spell out different modes in synopsis
  builtin/show-ref: ensure mutual exclusiveness of subcommands
  builtin/show-ref: refactor options for patterns subcommand
  builtin/show-ref: stop using global vars for `show_one()`
  builtin/show-ref: stop using global variable to count matches
  builtin/show-ref: refactor `--exclude-existing` options
  builtin/show-ref: fix dead code when passing patterns
  builtin/show-ref: fix leaking string buffer
  builtin/show-ref: split up different subcommands
  builtin/show-ref: convert pattern to a local variable
2023-11-03 08:36:54 +09:00
3ca86adc2d strvec: drop unnecessary include of hex.h
In 41771fa435 (cache.h: remove dependence on hex.h; make other files
include it explicitly, 2023-02-24) we added this as part of a larger
mechanical refactor. But strvec doesn't actually depend on hex.h, so
remove it.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-03 08:26:55 +09:00
6789275d37 tests: teach callers of test_i18ngrep to use test_grep
They are equivalents and the former still exists, so as long as the
only change this commit makes are to rewrite test_i18ngrep to
test_grep, there won't be any new bug, even if there still are
callers of test_i18ngrep remaining in the tree, or when merged to
other topics that add new uses of test_i18ngrep.

This patch was produced more or less with

    git grep -l -e 'test_i18ngrep ' 't/t[0-9][0-9][0-9][0-9]-*.sh' |
    xargs perl -p -i -e 's/test_i18ngrep /test_grep /'

and a good way to sanity check the result yourself is to run the
above in a checkout of c4603c1c (test framework: further deprecate
test_i18ngrep, 2023-10-31) and compare the resulting working tree
contents with the result of applying this patch to the same commit.
You'll see that test_i18ngrep in a few t/lib-*.sh files corrected,
in addition to the manual reproduction.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02 17:13:44 +09:00
bc5204569f Git 2.43-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02 17:09:48 +09:00
61a22ddaf0 Git 2.42.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02 16:59:16 +09:00
b8e45c5aa2 Merge branch 'ms/doc-push-fix' into maint-2.42
Docfix.

* ms/doc-push-fix:
  git-push doc: more visibility for -q option
2023-11-02 16:53:28 +09:00
fdb233cefb Merge branch 'jc/commit-new-underscore-index-fix' into maint-2.42
Message fix.

* jc/commit-new-underscore-index-fix:
  commit: do not use cryptic "new_index" in end-user facing messages
2023-11-02 16:53:28 +09:00
d373ec0723 Merge branch 'wx/merge-ort-comment-typofix' into maint-2.42
Typofix.

* wx/merge-ort-comment-typofix:
  merge-ort.c: fix typo 'neeed' to 'needed'
2023-11-02 16:53:27 +09:00
8a26aaa91e Merge branch 'ps/git-repack-doc-fixes' into maint-2.42
Doc updates.

* ps/git-repack-doc-fixes:
  doc/git-repack: don't mention nonexistent "--unpacked" option
  doc/git-repack: fix syntax for `-g` shorthand option
2023-11-02 16:53:27 +09:00
382d55a9d3 Merge branch 'ni/die-message-fix-for-git-add' into maint-2.42
Message updates.

* ni/die-message-fix-for-git-add:
  builtin/add.c: clean up die() messages
2023-11-02 16:53:27 +09:00
f8685969f5 Merge branch 'jc/am-doc-whitespace-action-fix' into maint-2.42
Docfix.

* jc/am-doc-whitespace-action-fix:
  am: align placeholder for --whitespace option with apply
2023-11-02 16:53:27 +09:00
a40b8e9197 Merge branch 'jc/update-list-references-to-lore' into maint-2.42
Doc update.

* jc/update-list-references-to-lore:
  doc: update list archive reference to use lore.kernel.org
2023-11-02 16:53:27 +09:00
3a16179bfb Merge branch 'ps/rewritten-is-per-worktree-doc' into maint-2.42
Doc update.

* ps/rewritten-is-per-worktree-doc:
  doc/git-worktree: mention "refs/rewritten" as per-worktree refs
2023-11-02 16:53:26 +09:00
f6a567638b Merge branch 'sn/cat-file-doc-update' into maint-2.42
"git cat-file" documentation updates.

* sn/cat-file-doc-update:
  doc/cat-file: make synopsis and description less confusing
2023-11-02 16:53:26 +09:00
f92cea12c9 Merge branch 'jk/decoration-and-other-leak-fixes' into maint-2.42
Leakfix.

* jk/decoration-and-other-leak-fixes:
  daemon: free listen_addr before returning
  revision: clear decoration structs during release_revisions()
  decorate: add clear_decoration() function
2023-11-02 16:53:26 +09:00
1e315cab44 Merge branch 'rs/parse-opt-ctx-cleanup' into maint-2.42
Code clean-up.

* rs/parse-opt-ctx-cleanup:
  parse-options: drop unused parse_opt_ctx_t member
2023-11-02 16:53:26 +09:00
fa5799cd34 Merge branch 'ob/am-msgfix' into maint-2.42
The parameters to generate an error message have been corrected.

* ob/am-msgfix:
  am: fix error message in parse_opt_show_current_patch()
2023-11-02 16:53:25 +09:00
8a5b2e1157 Merge branch 'hy/doc-show-is-like-log-not-diff-tree' into maint-2.42
Doc update.

* hy/doc-show-is-like-log-not-diff-tree:
  show doc: redirect user to git log manual instead of git diff-tree
2023-11-02 16:53:25 +09:00
965d445b2d Merge branch 'ch/clean-docfix' into maint-2.42
Typofix.

* ch/clean-docfix:
  git-clean doc: fix "without do cleaning" typo
2023-11-02 16:53:25 +09:00
905765bc5b Merge branch 'eg/config-type-path-docfix' into maint-2.42
Typofix.

* eg/config-type-path-docfix:
  git-config: fix misworded --type=path explanation
2023-11-02 16:53:25 +09:00
71c614b9a2 Merge branch 'ob/t3404-typofix' into maint-2.42
Code clean-up.

* ob/t3404-typofix:
  t3404-rebase-interactive.sh: fix typos in title of a rewording test
2023-11-02 16:53:24 +09:00
cd41f66b9d Merge branch 'ob/sequencer-remove-dead-code' into maint-2.42
Code clean-up.

* ob/sequencer-remove-dead-code:
  sequencer: remove unreachable exit condition in pick_commits()
2023-11-02 16:53:24 +09:00
f76827da0e Merge branch 'rs/name-rev-use-opt-hidden-bool' into maint-2.42
Simplify use of parse-options API a bit.

* rs/name-rev-use-opt-hidden-bool:
  name-rev: use OPT_HIDDEN_BOOL for --peel-tag
2023-11-02 16:53:24 +09:00
2fdfd7594f Merge branch 'rs/grep-parseopt-simplify' into maint-2.42
Simplify use of parse-options API a bit.

* rs/grep-parseopt-simplify:
  grep: use OPT_INTEGER_F for --max-depth
2023-11-02 16:53:24 +09:00
18e0648b9b Merge branch 'ob/sequencer-reword-error-message' into maint-2.42
Update an error message (which would probably never been seen).

* ob/sequencer-reword-error-message:
  sequencer: fix error message on failure to copy SQUASH_MSG
2023-11-02 16:53:23 +09:00
535b30eb58 Merge branch 'bc/more-git-var' into maint-2.42
Fix-up for a topic that already has graduated.

* bc/more-git-var:
  var: avoid a segmentation fault when `HOME` is unset
2023-11-02 16:53:23 +09:00
0510d06b56 Merge branch 'jk/ci-retire-allow-ref' into maint-2.42
CI update.

* jk/ci-retire-allow-ref:
  ci: deprecate ci/config/allow-ref script
  ci: allow branch selection through "vars"
2023-11-02 16:53:23 +09:00
1a3712f06b Merge branch 'ws/git-svn-retire-faketerm' into maint-2.42
Code clean-up.

* ws/git-svn-retire-faketerm:
  git-svn: drop FakeTerm hack
2023-11-02 16:53:22 +09:00
43af21409e Merge branch 'ch/t6300-verify-commit-test-cleanup' into maint-2.42
Test clean-up.

* ch/t6300-verify-commit-test-cleanup:
  t/t6300: drop magic filtering
  t/lib-gpg: forcibly run a trustdb update
2023-11-02 16:53:22 +09:00
7c7f6d828b Merge branch 'jc/mv-d-to-d-error-message-fix' into maint-2.42
Typofix in an error message.

* jc/mv-d-to-d-error-message-fix:
  mv: fix error for moving directory to another
2023-11-02 16:53:22 +09:00
9d4a69f852 Merge branch 'ja/worktree-orphan' into maint-2.42
Typofix in an error message.

* ja/worktree-orphan:
  builtin/worktree.c: fix typo in "forgot fetch" msg
2023-11-02 16:53:21 +09:00
8db7d2d6bd Merge branch 'ob/t9001-indent-fix' into maint-2.42
Test style fix.

* ob/t9001-indent-fix:
  t9001: fix indentation in test_no_confirm()
2023-11-02 16:53:21 +09:00
b50a670153 Merge branch 'jk/function-pointer-mismatches-fix' into maint-2.42
Code clean-up to please clang-18.

* jk/function-pointer-mismatches-fix:
  hashmap: use expected signatures for comparison functions
2023-11-02 16:53:21 +09:00
9ae84d2e7f Merge branch 'ds/upload-pack-error-sequence-fix' into maint-2.42
Error message generation fix.

* ds/upload-pack-error-sequence-fix:
  upload-pack: fix exit code when denying fetch of unreachable object ID
  upload-pack: fix race condition in error messages
2023-11-02 16:53:20 +09:00
c78718c4b3 Merge branch 'ws/git-push-doc-grammofix' into maint-2.42
Doc update.

* ws/git-push-doc-grammofix:
  git-push.txt: fix grammar
2023-11-02 16:53:20 +09:00
07011e1480 Merge branch 'jk/test-pass-ubsan-options-to-http-test' into maint-2.42
UBSAN options were not propagated through the test framework to git
run via the httpd, unlike ASAN options, which has been corrected.

* jk/test-pass-ubsan-options-to-http-test:
  test-lib: set UBSAN_OPTIONS to match ASan
2023-11-02 16:53:20 +09:00
584cde766b Merge branch 'tb/send-email-extract-valid-address-error-message-fix' into maint-2.42
An error message given by "git send-email" when given a malformed
address did not give correct information, which has been corrected.

* tb/send-email-extract-valid-address-error-message-fix:
  git-send-email.perl: avoid printing undef when validating addresses
2023-11-02 16:53:20 +09:00
2a6806140e Merge branch 'jk/redact-h2h3-headers-fix' into maint-2.42
HTTP Header redaction code has been adjusted for a newer version of
cURL library that shows its traces differently from earlier
versions.

* jk/redact-h2h3-headers-fix:
  http: update curl http/2 info matching for curl 8.3.0
  http: factor out matching of curl http/2 trace lines
2023-11-02 16:53:19 +09:00
83442ded89 Merge branch 'pb/completion-aliases-doc' into maint-2.42
Clarify how "alias.foo = : git cmd ; aliased-command-string" should
be spelled with necessary whitespaces around punctuation marks to
work.

* pb/completion-aliases-doc:
  completion: improve doc for complex aliases
2023-11-02 16:53:19 +09:00
2fd4378d64 Merge branch 'js/diff-cached-fsmonitor-fix' into maint-2.42
"git diff --cached" codepath did not fill the necessary stat
information for a file when fsmonitor knows it is clean and ended
up behaving as if it is not clean, which has been corrected.

* js/diff-cached-fsmonitor-fix:
  diff-lib: fix check_removed when fsmonitor is on
2023-11-02 16:53:19 +09:00
56ee4a3578 Merge branch 'js/systemd-timers-wsl-fix' into maint-2.42
Update "git maintainance" timers' implementation based on systemd
timers to work with WSL.

* js/systemd-timers-wsl-fix:
  maintenance(systemd): support the Windows Subsystem for Linux
2023-11-02 16:53:18 +09:00
bfb8376d68 Merge branch 'pw/diff-no-index-from-named-pipes' into maint-2.42
"git diff --no-index -R <(one) <(two)" did not work correctly,
which has been corrected.

* pw/diff-no-index-from-named-pipes:
  diff --no-index: fix -R with stdin
2023-11-02 16:53:18 +09:00
d12df942ba Merge branch 'js/complete-checkout-t' into maint-2.42
The completion script (in contrib/) has been taught to treat the
"-t" option to "git checkout" and "git switch" just like the
"--track" option, to complete remote-tracking branches.

* js/complete-checkout-t:
  completion(switch/checkout): treat --track and -t the same
2023-11-02 16:53:18 +09:00
17ab51ee8f Merge branch 'rs/grep-no-no-or' into maint-2.42
"git grep -e A --no-or -e B" is accepted, even though the negation
of "or" did not mean anything, which has been tightened.

* rs/grep-no-no-or:
  grep: reject --no-or
2023-11-02 16:53:18 +09:00
9a4ae43f0b Merge branch 'so/diff-doc-for-patch-update' into maint-2.42
References from description of the `--patch` option in various
manual pages have been simplified and improved.

* so/diff-doc-for-patch-update:
  doc/diff-options: fix link to generating patch section
2023-11-02 16:53:17 +09:00
a70f725c06 Merge branch 'pw/rebase-i-after-failure' into maint-2.42
Various fixes to the behaviour of "rebase -i" when the command got
interrupted by conflicting changes.
cf. <6b927687-cf6e-d73e-78fb-bd4f46736928@gmx.de>

* pw/rebase-i-after-failure:
  rebase -i: fix adding failed command to the todo list
  rebase --continue: refuse to commit after failed command
  rebase: fix rewritten list for failed pick
  sequencer: factor out part of pick_commits()
  sequencer: use rebase_path_message()
  rebase -i: remove patch file after conflict resolution
  rebase -i: move unlink() calls
2023-11-02 16:53:17 +09:00
d97034b0a1 Merge branch 'ks/ref-filter-sort-numerically' into maint-2.42
"git for-each-ref --sort='contents:size'" sorts the refs according
to size numerically, giving a ref that points at a blob twelve-byte
(12) long before showing a blob hundred-byte (100) long.

* ks/ref-filter-sort-numerically:
  ref-filter: sort numerically when ":size" is used
2023-11-02 16:53:17 +09:00
57b52cec46 Merge branch 'jk/diff-result-code-cleanup' into maint-2.42
"git diff --no-such-option" and other corner cases around the exit
status of the "diff" command has been corrected.

* jk/diff-result-code-cleanup:
  diff: drop useless "status" parameter from diff_result_code()
  diff: drop useless return values in git-diff helpers
  diff: drop useless return from run_diff_{files,index} functions
  diff: die when failing to read index in git-diff builtin
  diff: show usage for unknown builtin_diff_files() options
  diff-files: avoid negative exit value
  diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
2023-11-02 16:53:16 +09:00
a00b1127ce Merge branch 'ob/sequencer-empty-hint-fix' into maint-2.42
The use of API between two calls to require_clean_work_tree() from
the sequencer code has been cleaned up for consistency.

* ob/sequencer-empty-hint-fix:
  sequencer: rectify empty hint in call of require_clean_work_tree()
2023-11-02 16:53:16 +09:00
7f8314f277 Merge branch 'ts/unpacklimit-config-fix' into maint-2.42
transfer.unpackLimit ought to be used as a fallback, but overrode
fetch.unpackLimit and receive.unpackLimit instead.

* ts/unpacklimit-config-fix:
  transfer.unpackLimit: fetch/receive.unpackLimit takes precedence
2023-11-02 16:53:16 +09:00
8764491463 Merge branch 'jc/diff-exit-code-with-w-fixes' into maint-2.42
"git diff -w --exit-code" with various options did not work
correctly, which is being addressed.

* jc/diff-exit-code-with-w-fixes:
  diff: the -w option breaks --exit-code for --raw and other output modes
  t4040: remove test that succeeded for a wrong reason
  diff: teach "--stat -w --exit-code" to notice differences
  diff: mode-only change should be noticed by "--patch -w --exit-code"
  diff: move the fallback "--exit-code" code down
2023-11-02 16:53:15 +09:00
1ea39ad467 Merge branch 'tb/commit-graph-verify-fix' into maint-2.42
The commit-graph verification code that detects mixture of zero and
non-zero generation numbers has been updated.

* tb/commit-graph-verify-fix:
  commit-graph: avoid repeated mixed generation number warnings
  t/t5318-commit-graph.sh: test generation zero transitions during fsck
  commit-graph: verify swapped zero/non-zero generation cases
  commit-graph: introduce `commit_graph_generation_from_graph()`
2023-11-02 16:53:15 +09:00
ec7cc187d4 Merge branch 'jc/ci-skip-same-commit' into maint-2.42
Tweak GitHub Actions CI so that pushing the same commit to multiple
branch tips at the same time will not waste building and testing
the same thing twice.

* jc/ci-skip-same-commit:
  ci: avoid building from the same commit in parallel
2023-11-02 16:53:15 +09:00
50758312f2 Merge branch 'ds/scalar-updates' into maint-2.42
Scalar updates.

* ds/scalar-updates:
  scalar reconfigure: help users remove buggy repos
  setup: add discover_git_directory_reason()
  scalar: add --[no-]src option
2023-11-02 16:53:15 +09:00
396a167bd4 Merge branch 'mp/rebase-label-length-limit' into maint-2.42
Overly long label names used in the sequencer machinery are now
chopped to fit under filesystem limitation.

* mp/rebase-label-length-limit:
  rebase: allow overriding the maximal length of the generated labels
  sequencer: truncate labels to accommodate loose refs
2023-11-02 16:53:14 +09:00
31730a30a0 Merge branch 'js/ci-coverity' into maint-2.42
GitHub CI workflow has learned to trigger Coverity check.

* js/ci-coverity:
  coverity: detect and report when the token or project is incorrect
  coverity: allow running on macOS
  coverity: support building on Windows
  coverity: allow overriding the Coverity project
  coverity: cache the Coverity Build Tool
  ci: add a GitHub workflow to submit Coverity scans
2023-11-02 16:53:14 +09:00
6d68ab0819 Merge branch 'jk/test-lsan-denoise-output' into maint-2.42
Tests with LSan from time to time seem to emit harmless message
that makes our tests unnecessarily flakey; we work it around by
filtering the uninteresting output.

* jk/test-lsan-denoise-output:
  test-lib: ignore uninteresting LSan output
2023-11-02 16:53:14 +09:00
19ab8e8dd5 Merge branch 'js/ci-san-skip-p4-and-svn-tests' into maint-2.42
Flakey "git p4" tests, as well as "git svn" tests, are now skipped
in the (rather expensive) sanitizer CI job.

* js/ci-san-skip-p4-and-svn-tests:
  ci(linux-asan-ubsan): let's save some time
2023-11-02 16:53:13 +09:00
fc0269be63 Merge branch 'tb/mark-more-tests-as-leak-free' into maint-2.42
Tests that are known to pass with LSan are now marked as such.

* tb/mark-more-tests-as-leak-free:
  leak tests: mark t5583-push-branches.sh as leak-free
  leak tests: mark t3321-notes-stripspace.sh as leak-free
  leak tests: mark a handful of tests as leak-free
2023-11-02 16:53:13 +09:00
b64d78ad02 max_tree_depth: lower it for MSVC to avoid stack overflows
There seems to be some internal stack overflow detection in MSVC's
`malloc()` machinery that seems to be independent of the `stack reserve`
and `heap reserve` sizes specified in the executable (editable via
`EDITBIN /STACK:<n> <exe>` and `EDITBIN /HEAP:<n> <exe>`).

In the newly test cases added by `jk/tree-name-and-depth-limit`, this
stack overflow detection is unfortunately triggered before Git can print
out the error message about too-deep trees and exit gracefully. Instead,
it exits with `STATUS_STACK_OVERFLOW`. This corresponds to the numeric
value -1073741571, something the MSYS2 runtime we sadly need to use to
run Git's test suite cannot handle and which it internally maps to the
exit code 127. Git's test suite, in turn, mistakes this to mean that the
command was not found, and fails both test cases.

Here is an example stack trace from an example run:

    [0x0]   ntdll!RtlpAllocateHeap+0x31   0x4212603f50   0x7ff9d6d4cd49
    [0x1]   ntdll!RtlpAllocateHeapInternal+0x6c9   0x42126041b0   0x7ff9d6e14512
    [0x2]   ntdll!RtlDebugAllocateHeap+0x102   0x42126042b0   0x7ff9d6dcd8b0
    [0x3]   ntdll!RtlpAllocateHeap+0x7ec70   0x4212604350   0x7ff9d6d4cd49
    [0x4]   ntdll!RtlpAllocateHeapInternal+0x6c9   0x42126045b0   0x7ff9596ed480
    [0x5]   ucrtbased!heap_alloc_dbg_internal+0x210   0x42126046b0   0x7ff9596ed20d
    [0x6]   ucrtbased!heap_alloc_dbg+0x4d   0x4212604750   0x7ff9596f037f
    [0x7]   ucrtbased!_malloc_dbg+0x2f   0x42126047a0   0x7ff9596f0dee
    [0x8]   ucrtbased!malloc+0x1e   0x42126047d0   0x7ff730fcc1ef
    [0x9]   git!do_xmalloc+0x2f   0x4212604800   0x7ff730fcc2b9
    [0xa]   git!do_xmallocz+0x59   0x4212604840   0x7ff730fca779
    [0xb]   git!xmallocz_gently+0x19   0x4212604880   0x7ff7311b0883
    [0xc]   git!unpack_compressed_entry+0x43   0x42126048b0   0x7ff7311ac9a4
    [0xd]   git!unpack_entry+0x554   0x42126049a0   0x7ff7311b0628
    [0xe]   git!cache_or_unpack_entry+0x58   0x4212605250   0x7ff7311ad3a8
    [0xf]   git!packed_object_info+0x98   0x42126052a0   0x7ff7310a92da
    [0x10]   git!do_oid_object_info_extended+0x3fa   0x42126053b0   0x7ff7310a44e7
    [0x11]   git!oid_object_info_extended+0x37   0x4212605460   0x7ff7310a38ba
    [0x12]   git!repo_read_object_file+0x9a   0x42126054a0   0x7ff7310a6147
    [0x13]   git!read_object_with_reference+0x97   0x4212605560   0x7ff7310b4656
    [0x14]   git!fill_tree_descriptor+0x66   0x4212605620   0x7ff7310dc0a5
    [0x15]   git!traverse_trees_recursive+0x3f5   0x4212605680   0x7ff7310dd831
    [0x16]   git!unpack_callback+0x441   0x4212605790   0x7ff7310b4c95
    [0x17]   git!traverse_trees+0x5d5   0x42126058a0   0x7ff7310dc0f2
    [0x18]   git!traverse_trees_recursive+0x442   0x4212605980   0x7ff7310dd831
    [0x19]   git!unpack_callback+0x441   0x4212605a90   0x7ff7310b4c95
    [0x1a]   git!traverse_trees+0x5d5   0x4212605ba0   0x7ff7310dc0f2
    [0x1b]   git!traverse_trees_recursive+0x442   0x4212605c80   0x7ff7310dd831
    [0x1c]   git!unpack_callback+0x441   0x4212605d90   0x7ff7310b4c95
    [0x1d]   git!traverse_trees+0x5d5   0x4212605ea0   0x7ff7310dc0f2
    [0x1e]   git!traverse_trees_recursive+0x442   0x4212605f80   0x7ff7310dd831
    [0x1f]   git!unpack_callback+0x441   0x4212606090   0x7ff7310b4c95
    [0x20]   git!traverse_trees+0x5d5   0x42126061a0   0x7ff7310dc0f2
    [0x21]   git!traverse_trees_recursive+0x442   0x4212606280   0x7ff7310dd831
    [...]
    [0xfad]   git!cmd_main+0x2a2   0x42126ff740   0x7ff730fb6345
    [0xfae]   git!main+0xe5   0x42126ff7c0   0x7ff730fbff93
    [0xfaf]   git!wmain+0x2a3   0x42126ff830   0x7ff731318859
    [0xfb0]   git!invoke_main+0x39   0x42126ff8a0   0x7ff7313186fe
    [0xfb1]   git!__scrt_common_main_seh+0x12e   0x42126ff8f0   0x7ff7313185be
    [0xfb2]   git!__scrt_common_main+0xe   0x42126ff960   0x7ff7313188ee
    [0xfb3]   git!wmainCRTStartup+0xe   0x42126ff990   0x7ff9d5ed257d
    [0xfb4]   KERNEL32!BaseThreadInitThunk+0x1d   0x42126ff9c0   0x7ff9d6d6aa78
    [0xfb5]   ntdll!RtlUserThreadStart+0x28   0x42126ff9f0   0x0

I verified manually that `traverse_trees_cur_depth` was 562 when that
happened, which is far below the 2048 that were already accepted into
Git as a hard limit.

Despite many attempts to figure out which of the internals trigger this
`STATUS_STACK_OVERFLOW` and how to maybe increase certain sizes to avoid
running into this issue and let Git behave the same way as under Linux,
I failed to find any build-time/runtime knob we could turn to that
effect.

Note: even switching to using a different allocator (I used mimalloc
because that's what Git for Windows uses for its GCC builds) does not
help, as the zlib code used to unpack compressed pack entries _still_
uses the regular `malloc()`. And runs into the same issue.

Note also: switching to using a different allocator _also_ for zlib code
seems _also_ not to help. I tried that, and it still exited with
`STATUS_STACK_OVERFLOW` that seems to have been triggered by a
`mi_assert_internal()`, i.e. an internal assertion of mimalloc...

So the best bet to work around this for now seems to just lower the
maximum allowed tree depth _even further_ for MSVC builds.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02 08:58:28 +09:00
e1068f0ad4 merge-file: add an option to process object IDs
git merge-file knows how to merge files on the file system already.  It
would be helpful, however, to allow it to also merge single blobs.
Teach it an `--object-id` option which means that its arguments are
object IDs and not files to allow it to do so.

We handle the empty blob specially since read_mmblob doesn't read it
directly and otherwise users cannot specify an empty ancestor.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02 08:51:40 +09:00
8077612ea1 git-merge-file doc: drop "-file" from argument placeholders
`git merge-file` takes three positional arguments. Each of them is
documented as `<foo-file>`. In preparation for teaching this command to
alternatively take three object IDs, make these placeholders a bit more
generic by dropping the "-file" parts. Instead, clarify early that the
three arguments are filenames. Even after the next commit, we can afford
to present this file-centric view up front and in the general
discussion, since it will remain the default one.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02 08:51:38 +09:00
1bd809938a Documentation/gitformat-pack.txt: fix incorrect MIDX documentation
Back in 32f3c541e3 (multi-pack-index: write pack names in chunk,
2018-07-12) the MIDX's "Packfile Names" (or "PNAM", for short) chunk was
described as containing an array of string entries. e0d1bcf825 notes
that this is the only chunk in the MIDX format's specification that is
not guaranteed to be 4-byte aligned, and so should be placed last.

This isn't quite accurate: the entries within the PNAM chunk are not
guaranteed to be 4-byte aligned since they are arbitrary strings, but
the chunk itself is 4-byte aligned since the ending is padded with NUL
bytes.

That padding has always been there since 32f3c541e3 via
midx.c::write_midx_pack_names(), which ended with:

    i = MIDX_CHUNK_ALIGNMENT - (written % MIDX_CHUNK_ALIGNMENT)
    if (i < MIDX_CHUNK_ALIGNMENT) {
      unsigned char padding[MIDX_CHUNK_ALIGNMENT];
      memset(padding, 0, sizeof(padding))
      hashwrite(f, padding, i);
      written += i;
    }

In fact, 32f3c541e3's log message itself describes the chunk in its
first paragraph with:

    Since filenames are not well structured, add padding to keep good
    alignment in later chunks.

So these have always been externally aligned. Correct the corresponding
part of our documentation to reflect that.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 13:25:04 +09:00
530a9f183f Documentation/gitformat-pack.txt: fix typo
e0d1bcf825 (multi-pack-index: add format details, 2018-07-12) describes
the MIDX's "PNAM" chunk as having entries which are "null-terminated
strings".

This is a typo, as strings are terminated with a NUL character, which is
a distinct concept from "NULL" or "null", which we typically reserve for
the void pointer to address 0.

Correct the documentation accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 13:25:02 +09:00
8f81532599 clang-format: fix typo in comment
Signed-off-by: Aditya Neelamraju <adityanv97@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:24:19 +09:00
0497e6c611 t: use git-show-ref(1) to check for ref existence
Convert tests that use `test_path_is_file` and `test_path_is_missing` to
instead use a set of helpers `test_ref_exists` and `test_ref_missing`.
These helpers are implemented via the newly introduced `git show-ref
--exists` command. Thus, we can avoid intimate knowledge of how the ref
backend stores references on disk.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:01 +09:00
9080a7f178 builtin/show-ref: add new mode to check for reference existence
While we have multiple ways to show the value of a given reference, we
do not have any way to check whether a reference exists at all. While
commands like git-rev-parse(1) or git-show-ref(1) can be used to check
for reference existence in case the reference resolves to something
sane, neither of them can be used to check for existence in some other
scenarios where the reference does not resolve cleanly:

    - References which have an invalid name cannot be resolved.

    - References to nonexistent objects cannot be resolved.

    - Dangling symrefs can be resolved via git-symbolic-ref(1), but this
      requires the caller to special case existence checks depending on
      whether or not a reference is symbolic or direct.

Furthermore, git-rev-list(1) and other commands do not let the caller
distinguish easily between an actually missing reference and a generic
error.

Taken together, this seems like sufficient motivation to introduce a
separate plumbing command to explicitly check for the existence of a
reference without trying to resolve its contents.

This new command comes in the form of `git show-ref --exists`. This
new mode will exit successfully when the reference exists, with a
specific exit code of 2 when it does not exist, or with 1 when there
has been a generic error.

Note that the only way to properly implement this command is by using
the internal `refs_read_raw_ref()` function. While the public function
`refs_resolve_ref_unsafe()` can be made to behave in the same way by
passing various flags, it does not provide any way to obtain the errno
with which the reference backend failed when reading the reference. As
such, it becomes impossible for us to distinguish generic errors from
the explicit case where the reference wasn't found.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:01 +09:00
1307d5e86f builtin/show-ref: explicitly spell out different modes in synopsis
The synopsis treats the `--verify` and the implicit mode the same. They
are slightly different though:

    - They accept different sets of flags.

    - The implicit mode accepts patterns while the `--verify` mode
      accepts references.

Split up the synopsis for these two modes such that we can disambiguate
those differences.

While at it, drop "--quiet" from the pattern mode's synopsis. It does
not make a lot of sense to list patterns, but squelch the listing output
itself. The description for "--quiet" is adapted accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
199970e72f builtin/show-ref: ensure mutual exclusiveness of subcommands
The git-show-ref(1) command has three different modes, of which one is
implicit and the other two can be chosen explicitly by passing a flag.
But while these modes are standalone and cause us to execute completely
separate code paths, we gladly accept the case where a user asks for
both `--exclude-existing` and `--verify` at the same time even though it
is not obvious what will happen. Spoiler: we ignore `--verify` and
execute the `--exclude-existing` mode.

Let's explicitly detect this invalid usage and die in case both modes
were requested.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
ee26f1e29a builtin/show-ref: refactor options for patterns subcommand
The patterns subcommand is the last command that still uses global
variables to track its options. Convert it to use a structure instead
with the same motivation as preceding commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
b0f0be9398 builtin/show-ref: stop using global vars for show_one()
The `show_one()` function implicitly receives a bunch of options which
are tracked via global variables. This makes it hard to see which
subcommands of git-show-ref(1) actually make use of these options.

Introduce a `show_one_options` structure that gets passed down to this
function. This allows us to get rid of more global state and makes it
more explicit which subcommands use those options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
84650989b7 builtin/show-ref: stop using global variable to count matches
When passing patterns to git-show-ref(1) we're checking whether any
reference matches -- if none do, we indicate this condition via an
unsuccessful exit code.

We're using a global variable to count these matches, which is required
because the counter is getting incremented in a callback function. But
now that we have the `struct show_ref_data` in place, we can get rid of
the global variable and put the counter in there instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
7907fb0c97 builtin/show-ref: refactor --exclude-existing options
It's not immediately obvious options which options are applicable to
what subcommand in git-show-ref(1) because all options exist as global
state. This can easily cause confusion for the reader.

Refactor options for the `--exclude-existing` subcommand to be contained
in a separate structure. This structure is stored on the stack and
passed down as required. Consequently, it clearly delimits the scope of
those options and requires the reader to worry less about global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
53921d5f8e builtin/show-ref: fix dead code when passing patterns
When passing patterns to `git show-ref` we have some code that will
cause us to die if `verify && !quiet` is true. But because `verify`
indicates a different subcommand of git-show-ref(1) that causes us to
execute `cmd_show_ref__verify()` and not `cmd_show_ref__patterns()`, the
condition cannot ever be true.

Let's remove this dead code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
dbabd0b023 builtin/show-ref: fix leaking string buffer
Fix a leaking string buffer in `git show-ref --exclude-existing`. While
the buffer is technically not leaking because its variable is declared
as static, there is no inherent reason why it should be.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
b14cbae2b5 builtin/show-ref: split up different subcommands
While not immediately obvious, git-show-ref(1) actually implements three
different subcommands:

    - `git show-ref <patterns>` can be used to list references that
      match a specific pattern.

    - `git show-ref --verify <refs>` can be used to list references.
      These are _not_ patterns.

    - `git show-ref --exclude-existing` can be used as a filter that
      reads references from standard input, performing some conversions
      on each of them.

Let's make this more explicit in the code by splitting up the three
subcommands into separate functions. This also allows us to address the
confusingly named `patterns` variable, which may hold either patterns or
reference names.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
ff546ebb59 builtin/show-ref: convert pattern to a local variable
The `pattern` variable is a global variable that tracks either the
reference names (not patterns!) for the `--verify` mode or the patterns
for the non-verify mode. This is a bit confusing due to the slightly
different meanings.

Convert the variable to be local. While this does not yet fix the double
meaning of the variable, this change allows us to address it in a
subsequent patch more easily by explicitly splitting up the different
subcommands of git-show-ref(1).

Note that we introduce a `struct show_ref_data` to pass the patterns to
`show_ref()`. While this is overengineered now, we will extend this
structure in a subsequent patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:09:00 +09:00
9830926c7d rev-list: add commit object support in --missing option
The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.

Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.

A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:07:18 +09:00
b49529230d rev-list: move show_commit() to the bottom
The `show_commit()` function already depends on `finish_commit()`, and
in the upcoming commit, we'll also add a dependency on
`finish_object__ma()`. Since in C symbols must be declared before
they're used, let's move `show_commit()` below both `finish_commit()`
and `finish_object__ma()`, so the code is cleaner as a whole without the
need for declarations.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:07:18 +09:00
ca556f4707 revision: rename bit to do_not_die_on_missing_objects
The bit `do_not_die_on_missing_tree` is used in revision.h to ensure the
revision walker does not die when encountering a missing tree. This is
currently exclusively set within `builtin/rev-list.c` to ensure the
`--missing` option works with missing trees.

In the upcoming commits, we will extend `--missing` to also support
missing commits. So let's rename the bit to
`do_not_die_on_missing_objects`, which is object type agnostic and can
be used for both trees/commits.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:07:18 +09:00
922cc26e41 Merge branch 'ps/do-not-trust-commit-graph-blindly-for-existence' into kn/rev-list-missing-fix
* ps/do-not-trust-commit-graph-blindly-for-existence:
  commit: detect commits that exist in commit-graph but not in the ODB
  commit-graph: introduce envvar to disable commit existence checks
2023-11-01 12:06:55 +09:00
7a5d604443 commit: detect commits that exist in commit-graph but not in the ODB
Commit graphs can become stale and contain references to commits that do
not exist in the object database anymore. Theoretically, this can lead
to a scenario where we are able to successfully look up any such commit
via the commit graph even though such a lookup would fail if done via
the object database directly.

As the commit graph is mostly intended as a sort of cache to speed up
parsing of commits we do not want to have diverging behaviour in a
repository with and a repository without commit graphs, no matter
whether they are stale or not. As commits are otherwise immutable, the
only thing that we really need to care about is thus the presence or
absence of a commit.

To address potentially stale commit data that may exist in the graph,
our `lookup_commit_in_graph()` function will check for the commit's
existence in both the commit graph, but also in the object database. So
even if we were able to look up the commit's data in the graph, we would
still pretend as if the commit didn't exist if it is missing in the
object database.

We don't have the same safety net in `parse_commit_in_graph_one()`
though. This function is mostly used internally in "commit-graph.c"
itself to validate the commit graph, and this usage is fine. We do
expose its functionality via `parse_commit_in_graph()` though, which
gets called by `repo_parse_commit_internal()`, and that function is in
turn used in many places in our codebase.

For all I can see this function is never used to directly turn an object
ID into a commit object without additional safety checks before or after
this lookup. What it is being used for though is to walk history via the
parent chain of commits. So when commits in the parent chain of a graph
walk are missing it is possible that we wouldn't notice if that missing
commit was part of the commit graph. Thus, a query like `git rev-parse
HEAD~2` can succeed even if the intermittent commit is missing.

It's unclear whether there are additional ways in which such stale
commit graphs can lead to problems. In any case, it feels like this is a
bigger bug waiting to happen when we gain additional direct or indirect
callers of `repo_parse_commit_internal()`. So let's fix the inconsistent
behaviour by checking for object existence via the object database, as
well.

This check of course comes with a performance penalty. The following
benchmarks have been executed in a clone of linux.git with stable tags
added:

    Benchmark 1: git -c core.commitGraph=true rev-list --topo-order --all (git = master)
      Time (mean ± σ):      2.913 s ±  0.018 s    [User: 2.363 s, System: 0.548 s]
      Range (min … max):    2.894 s …  2.950 s    10 runs

    Benchmark 2: git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
      Time (mean ± σ):      3.834 s ±  0.052 s    [User: 3.276 s, System: 0.556 s]
      Range (min … max):    3.780 s …  3.961 s    10 runs

    Benchmark 3: git -c core.commitGraph=false rev-list --topo-order --all (git = master)
      Time (mean ± σ):     13.841 s ±  0.084 s    [User: 13.152 s, System: 0.687 s]
      Range (min … max):   13.714 s … 13.995 s    10 runs

    Benchmark 4: git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
      Time (mean ± σ):     13.762 s ±  0.116 s    [User: 13.094 s, System: 0.667 s]
      Range (min … max):   13.645 s … 14.038 s    10 runs

    Summary
      git -c core.commitGraph=true rev-list --topo-order --all (git = master) ran
        1.32 ± 0.02 times faster than git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
        4.72 ± 0.05 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
        4.75 ± 0.04 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = master)

We look at a ~30% regression in general, but in general we're still a
whole lot faster than without the commit graph. To counteract this, the
new check can be turned off with the `GIT_COMMIT_GRAPH_PARANOIA` envvar.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:04:06 +09:00
e04838ea82 commit-graph: introduce envvar to disable commit existence checks
Our `lookup_commit_in_graph()` helper tries to look up commits from the
commit graph and, if it doesn't exist there, falls back to parsing it
from the object database instead. This is intended to speed up the
lookup of any such commit that exists in the database. There is an edge
case though where the commit exists in the graph, but not in the object
database. To avoid returning such stale commits the helper function thus
double checks that any such commit parsed from the graph also exists in
the object database. This makes the function safe to use even when
commit graphs aren't updated regularly.

We're about to introduce the same pattern into other parts of our code
base though, namely `repo_parse_commit_internal()`. Here the extra
sanity check is a bit of a tougher sell: `lookup_commit_in_graph()` was
a newly introduced helper, and as such there was no performance hit by
adding this sanity check. If we added `repo_parse_commit_internal()`
with that sanity check right from the beginning as well, this would
probably never have been an issue to begin with. But by retrofitting it
with this sanity check now we do add a performance regression to
preexisting code, and thus there is a desire to avoid this or at least
give an escape hatch.

In practice, there is no inherent reason why either of those functions
should have the sanity check whereas the other one does not: either both
of them are able to detect this issue or none of them should be. This
also means that the default of whether we do the check should likely be
the same for both. To err on the side of caution, we thus rather want to
make `repo_parse_commit_internal()` stricter than to loosen the checks
that we already have in `lookup_commit_in_graph()`.

The escape hatch is added in the form of a new GIT_COMMIT_GRAPH_PARANOIA
environment variable that mirrors GIT_REF_PARANOIA. If enabled, which is
the default, we will double check that commits looked up in the commit
graph via `lookup_commit_in_graph()` also exist in the object database.
This same check will also be added in `repo_parse_commit_internal()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-01 12:04:06 +09:00
2e87fca189 test framework: further deprecate test_i18ngrep
As an attempt to come up with a useful mechanism to ensure that
certain messages are left untranslated [*], we earlier wrote
GIT_TEST_GETTEXT_POISON off as a failed experiment.

But the output from the test helper was easier to use while
debugging failed tests, compared to the same test writtein with the
plain-vanilla "grep".  Especially when a test that expects a certain
string to appear in the output (e.g. "this test must fail with this
message") fails, "grep message output" would just silently fail and
in a &&-chained sequence of commands, it is hard to tell which step
failed.  test_i18ngrep explicitly said "we wanted to see a line that
match this pattern but did not see a hit in this file".

What we have as test_i18ngrep in our tree still retains this verbose
output (even though we got rid of the "poison" support).  Let's
rename it to test_grep (because it is no longer about i18n at all)
and then make test_i18ngrep a thin wrapper around it.  Existing
callers of test_i18ngrep can be mechanically rewritten to instead
use test_grep over time, but it does not have to be done in this
commit.

[Footnote]

 * The idea was that human-facing messages are often translated, but
   there are messages that should never be translated.  We use
   "grep" only for the latter kind of messages, and then run tests
   in "poison" mode that spew garbage for translatable messages.  If
   such a test run fails, it means these messages tested with "grep"
   were marked for translation by mistake.  test_i18ngrep was to be
   used for other messages that are to be translated, and was to
   always "succeed" when runing under the "poison" mode.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-31 14:24:35 +09:00
692be87cbb Merge branch 'jm/bisect-run-synopsis-fix'
Doc and usage message update.

* jm/bisect-run-synopsis-fix:
  doc/git-bisect: clarify `git bisect run` syntax
2023-10-31 12:57:44 +09:00
ece54894fe Merge branch 'ii/branch-error-messages-update'
Error message update.

* ii/branch-error-messages-update:
  builtin/branch.c: adjust error messages to coding guidelines
2023-10-31 12:57:44 +09:00
b8f58c200c upload-pack: add tracing for fetches
Information on how users are accessing hosted repositories can be
helpful to server operators. For example, being able to broadly
differentiate between fetches and initial clones; the use of shallow
repository features; or partial clone filters.

a29263c (fetch-pack: add tracing for negotiation rounds, 2022-08-02)
added some information on have counts to fetch-pack itself to help
diagnose negotiation; but from a git-upload-pack (server) perspective,
there's no means of accessing such information without using
GIT_TRACE_PACKET to examine the protocol packets.

Improve this by emitting a Trace2 JSON event from upload-pack with
summary information on the contents of a fetch request.

* haves, wants, and want-ref counts can help determine (broadly) between
  fetches and clones, and the use of single-branch, etc.
* shallow clone depth, tip counts, and deepening options.
* any partial clone filter type.

Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-30 21:43:21 +09:00
3130c155df The twenty-second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-30 07:09:59 +09:00
3adc25a695 Merge branch 'ms/doc-push-fix'
Docfix.

* ms/doc-push-fix:
  git-push doc: more visibility for -q option
2023-10-30 07:09:59 +09:00
5006bfc1f5 Merge branch 'jk/send-email-fix-addresses-from-composed-messages'
The codepath to handle recipient addresses `git send-email
--compose` learns from the user was completely broken, which has
been corrected.

* jk/send-email-fix-addresses-from-composed-messages:
  send-email: handle to/cc/bcc from --compose message
  Revert "send-email: extract email-parsing code into a subroutine"
  doc/send-email: mention handling of "reply-to" with --compose
2023-10-30 07:09:59 +09:00
90c8096657 Merge branch 'ob/rebase-cleanup'
Code clean-up.

* ob/rebase-cleanup:
  rebase: move parse_opt_keep_empty() down
  rebase: handle --strategy via imply_merge() as well
  rebase: simplify code related to imply_merge()
2023-10-30 07:09:58 +09:00
4fcbc5b94f Merge branch 'jc/commit-new-underscore-index-fix'
Message fix.

* jc/commit-new-underscore-index-fix:
  commit: do not use cryptic "new_index" in end-user facing messages
2023-10-30 07:09:58 +09:00
9a48da7843 Merge branch 'wx/merge-ort-comment-typofix'
Typofix.

* wx/merge-ort-comment-typofix:
  merge-ort.c: fix typo 'neeed' to 'needed'
2023-10-30 07:09:58 +09:00
39072d2496 Merge branch 'ps/git-repack-doc-fixes'
Doc updates.

* ps/git-repack-doc-fixes:
  doc/git-repack: don't mention nonexistent "--unpacked" option
  doc/git-repack: fix syntax for `-g` shorthand option
2023-10-30 07:09:57 +09:00
64912cc023 Merge branch 'kh/pathspec-error-wo-repository-fix'
The pathspec code carelessly dereferenced NULL while emitting an
error message, which has been corrected.

* kh/pathspec-error-wo-repository-fix:
  grep: die gracefully when outside repository
2023-10-30 07:09:57 +09:00
6597631888 Merge branch 'ni/die-message-fix-for-git-add'
Message updates.

* ni/die-message-fix-for-git-add:
  builtin/add.c: clean up die() messages
2023-10-30 07:09:57 +09:00
030c2fba90 Merge branch 'jc/am-doc-whitespace-action-fix'
Docfix.

* jc/am-doc-whitespace-action-fix:
  am: align placeholder for --whitespace option with apply
2023-10-30 07:09:56 +09:00
9030f85730 Merge branch 'mm/p4-symlink-with-lfs'
"git p4" tried to store symlinks to LFS when told, but has been
fixed not to do so, because it does not make sense.

* mm/p4-symlink-with-lfs:
  git-p4 shouldn't attempt to store symlinks in LFS
2023-10-30 07:09:56 +09:00
3a5e77e346 Merge branch 'da/t7601-style-fix'
Coding style update.

* da/t7601-style-fix:
  t7601: use "test_path_is_file" etc. instead of "test -f"
2023-10-30 07:09:56 +09:00
1551066dc5 Merge branch 'jc/update-list-references-to-lore'
Doc update.

* jc/update-list-references-to-lore:
  doc: update list archive reference to use lore.kernel.org
2023-10-30 07:09:56 +09:00
26dd307cfa Merge branch 'jc/attr-tree-config'
The attribute subsystem learned to honor `attr.tree` configuration
that specifies which tree to read the .gitattributes files from.

* jc/attr-tree-config:
  attr: add attr.tree for setting the treeish to read attributes from
  attr: read attributes from HEAD when bare repo
2023-10-30 07:09:55 +09:00
8183b63ff6 Merge branch 'sn/typo-grammo-phraso-fixes'
Many typos, ungrammatical sentences and wrong phrasing have been
fixed.

* sn/typo-grammo-phraso-fixes:
  t/README: fix multi-prerequisite example
  doc/gitk: s/sticked/stuck/
  git-jump: admit to passing merge mode args to ls-files
  doc/diff-options: improve wording of the log.diffMerges mention
  doc: fix some typos, grammar and wording issues
2023-10-30 07:09:55 +09:00
26d4c51d36 reflog: fix expire --single-worktree
33d7bdd645 (builtin/reflog.c: use parse-options api for expire, delete
subcommands, 2022-01-06) broke the option --single-worktree of git
reflog expire and added a non-printable short flag for it, presumably by
accident.  While before it set the variable "all_worktrees" to 0, now it
sets it to 1, its default value.  --no-single-worktree is required now
to set it to 0.

Fix it by replacing the variable with one that has the opposite meaning,
to avoid the negation and its potential for confusion.  The new variable
"single_worktree" directly captures whether --single-worktree was given.

Also remove the unprintable short flag SOH (start of heading) because it
is undocumented, hard to use and is likely to have been added by mistake
in connection with the negation bug above.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-29 12:19:28 +09:00
f7c1b23819 am, rebase: fix arghelp syntax of --empty
Use parentheses and pipes to present alternatives in the argument help
for the --empty options of git am and git rebase, like in the rest of
the documentation.

While at it remove a stray use of the enum empty_action value
STOP_ON_EMPTY_COMMIT to indicate that no short option is present.
While it has a value of 0 and thus there is no user-visible change,
that enum is not meant to hold short option characters.  Hard-code 0,
like we do for other options without a short option.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-29 12:10:45 +09:00
e5cf20e092 am: simplify --show-current-patch handling
Let the parse-options code detect and handle the use of options that are
incompatible with --show-current-patch.  This requires exposing the
distinction between the "raw" and "diff" sub-modes.  Do that by
splitting the mode RESUME_SHOW_PATCH into RESUME_SHOW_PATCH_RAW and
RESUME_SHOW_PATCH_DIFF and stop tracking sub-modes in a separate struct.

The result is a simpler callback function and more precise error
messages.  The original reports a spurious argument or a NULL pointer:

   $ git am --show-current-patch --show-current-patch=diff
   error: options '--show-current-patch=diff' and '--show-current-patch=raw' cannot be used together
   $ git am --show-current-patch=diff --show-current-patch
   error: options '--show-current-patch=(null)' and '--show-current-patch=diff' cannot be used together

With this patch we get the more precise:

   $ git am --show-current-patch --show-current-patch=diff
   error: --show-current-patch=diff is incompatible with --show-current-patch
   $ git am --show-current-patch=diff --show-current-patch
   error: --show-current-patch is incompatible with --show-current-patch=diff

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-29 12:05:59 +09:00
0025dde775 parse-options: make CMDMODE errors more precise
Only a single PARSE_OPT_CMDMODE option can be specified for the same
variable at the same time.  This is enforced by get_value(), but the
error messages are imprecise in three ways:

1. If a non-PARSE_OPT_CMDMODE option changes the value variable of a
PARSE_OPT_CMDMODE option then an ominously vague message is shown:

   $ t/helper/test-tool parse-options --set23 --mode1
   error: option `mode1' : incompatible with something else

Worse: If the order of options is reversed then no error is reported at
all:

   $ t/helper/test-tool parse-options --mode1 --set23
   boolean: 0
   integer: 23
   magnitude: 0
   timestamp: 0
   string: (not set)
   abbrev: 7
   verbose: -1
   quiet: 0
   dry run: no
   file: (not set)

Fortunately this can currently only happen in the test helper; actual
Git commands don't share the same variable for the value of options with
and without the flag PARSE_OPT_CMDMODE.

2. If there are multiple options with the same value (synonyms), then
the one that is defined first is shown rather than the one actually
given on the command line, which is confusing:

   $ git am --resolved --quit
   error: option `quit' is incompatible with --continue

3. Arguments of PARSE_OPT_CMDMODE options are not handled by the
parse-option machinery.  This is left to the callback function.  We
currently only have a single affected option, --show-current-patch of
git am.  Errors for it can show an argument that was not actually given
on the command line:

   $ git am --show-current-patch --show-current-patch=diff
   error: options '--show-current-patch=diff' and '--show-current-patch=raw' cannot be used together

The options --show-current-patch and --show-current-patch=raw are
synonyms, but the error accuses the user of input they did not actually
made.  Or it can awkwardly print a NULL pointer:

   $ git am --show-current-patch=diff --show-current-patch
   error: options '--show-current-patch=(null)' and '--show-current-patch=diff' cannot be used together

The reasons for these shortcomings is that the current code checks
incompatibility only when encountering a PARSE_OPT_CMDMODE option at the
command line, and that it searches the previous incompatible option by
value.

Fix the first two points by checking all PARSE_OPT_CMDMODE variables
after parsing each option and by storing all relevant details if their
value changed.  Do that whether or not the changing options has the flag
PARSE_OPT_CMDMODE set.  Report an incompatibility only if two options
change the variable to different values and at least one of them is a
PARSE_OPT_CMDMODE option.  This changes the output of the first three
examples above to:

   $ t/helper/test-tool parse-options --set23 --mode1
   error: --mode1 is incompatible with --set23
   $ t/helper/test-tool parse-options --mode1 --set23
   error: --set23 is incompatible with --mode1
   $ git am --resolved --quit
   error: --quit is incompatible with --resolved

Store the argument of PARSE_OPT_CMDMODE options of type OPTION_CALLBACK
as well to allow taking over the responsibility for compatibility
checking from the callback function.  The next patch will use this
capability to fix the messages for git am --show-current-patch.

Use a linked list for storing the PARSE_OPT_CMDMODE variables.  This
somewhat outdated data structure is simple and suffices, as the number
of elements per command is currently only zero or one.  We do support
multiple different command modes variables per command, but I don't
expect that we'd ever use a significant number of them.  Once we do we
can switch to a hashmap.

Since we no longer need to search the conflicting option, the all_opts
parameter of get_value() is no longer used.  Remove it.

Extend the tests to check for both conflicting option names, but don't
insist on a particular order.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-29 09:15:18 +09:00
681c0a247b bugreport: reject positional arguments
git-bugreport already rejected unrecognized flag arguments, like
`--diaggnose`, but this doesn't help if the user's mistake was to forget
the `--` in front of the argument. This can result in a user's intended
argument not being parsed with no indication to the user that something
went wrong. Since git-bugreport presently doesn't take any positionals
at all, let's reject all positionals and give the user a usage hint.

Signed-off-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-29 08:56:17 +09:00
831401bb14 t0091-bugreport: stop using i18ngrep
Since e6545201ad (Merge branch 'ab/detox-config-gettext', 2021-04-13),
test_i18ngrep is no longer required. Quit using it in the bugreport
tests, since it's setting a bad example for tests added later.

Signed-off-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-29 08:55:48 +09:00
6b79a2183c Include gettext.h in MyFirstContribution tutorial
The tutorial in Documentation/MyFirstContribution.txt has steps to print
some text using the "_" function. However, this leads to compiler errors
when running "make" since "gettext.h" is not #included.

Update docs with a note to #include "gettext.h" in "builtin/psuh.c".

Signed-off-by: Jacob Stopak <jacob@initialcommit.io>
Reviewed-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-28 09:02:06 +09:00
3b681e255c gitk: sv.po: Update Swedish translation (323t)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2023-10-26 21:47:18 +01:00
0d8647034e send-email: move validation code below process_address_list
Move validation logic below processing of email address lists so that
email validation gets the proper email addresses.  As a side effect,
some initialization needed to be moved down.  In order for validation
and the actual email sending to have the same initial state, the
initialized variables that get modified by pre_process_file are
encapsulated in a new function.

This fixes email address validation errors when the optional
perl module Email::Valid is installed and multiple addresses are passed
in on a single to/cc argument like --to=foo@example.com,bar@example.com.
A new test was added to t9001 to expose failures with this case in the
future.

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-26 21:46:10 +09:00
d15b85391a SubmittingPatches: call gitk's command "Copy commit reference"
Documentation/SubmittingPatches informs the contributor that gitk's
context menu command "Copy commit summary" can be used to obtain the
conventional format of referencing existing commits.  This command in
gitk was renamed to "Copy commit reference" in commit [1], following
implementation of Git's "reference" pretty format in [2].

Update mention of this gitk command in Documentation/SubmittingPatches
to its new name.

[1] b8b60957ce (gitk: rename "commit summary" to "commit reference",
    2019-12-12)
[2] commit 1f0fc1d (pretty: implement 'reference' format, 2019-11-20)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-24 15:27:23 -07:00
2e8e77cbac The twenty-first batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-23 13:56:38 -07:00
d12166d3c8 Merge branch 'en/docfixes'
Documentation typo and grammo fixes.

* en/docfixes: (25 commits)
  documentation: add missing parenthesis
  documentation: add missing quotes
  documentation: add missing fullstops
  documentation: add some commas where they are helpful
  documentation: fix whitespace issues
  documentation: fix capitalization
  documentation: fix punctuation
  documentation: use clearer prepositions
  documentation: add missing hyphens
  documentation: remove unnecessary hyphens
  documentation: add missing article
  documentation: fix choice of article
  documentation: whitespace is already generally plural
  documentation: fix singular vs. plural
  documentation: fix verb vs. noun
  documentation: fix adjective vs. noun
  documentation: fix verb tense
  documentation: employ consistent verb tense for a list
  documentation: fix subject/verb agreement
  documentation: remove extraneous words
  ...
2023-10-23 13:56:37 -07:00
5edbcead42 Merge branch 'bc/racy-4gb-files'
The index file has room only for lower 32-bit of the file size in
the cached stat information, which means cached stat information
will have 0 in its sd_size member for a file whose size is multiple
of 4GiB.  This is mistaken for a racily clean path.  Avoid it by
storing a bogus sd_size value instead for such files.

* bc/racy-4gb-files:
  Prevent git from rehashing 4GiB files
  t: add a test helper to truncate files
2023-10-23 13:56:37 -07:00
626f689f79 Merge branch 'jc/fail-stash-to-store-non-stash'
Feeding "git stash store" with a random commit that was not created
by "git stash create" now errors out.

* jc/fail-stash-to-store-non-stash:
  stash: be careful what we store
2023-10-23 13:56:37 -07:00
755fb09163 Merge branch 'so/diff-merges-dd'
"git log" and friends learned "--dd" that is a short-hand for
"--diff-merges=first-parent -p".

* so/diff-merges-dd:
  completion: complete '--dd'
  diff-merges: introduce '--dd' option
  diff-merges: improve --diff-merges documentation
2023-10-23 13:56:37 -07:00
f32af12cee Merge branch 'jk/chunk-bounds'
The codepaths that read "chunk" formatted files have been corrected
to pay attention to the chunk size and notice broken files.

* jk/chunk-bounds: (21 commits)
  t5319: make corrupted large-offset test more robust
  chunk-format: drop pair_chunk_unsafe()
  commit-graph: detect out-of-order BIDX offsets
  commit-graph: check bounds when accessing BIDX chunk
  commit-graph: check bounds when accessing BDAT chunk
  commit-graph: bounds-check generation overflow chunk
  commit-graph: check size of generations chunk
  commit-graph: bounds-check base graphs chunk
  commit-graph: detect out-of-bounds extra-edges pointers
  commit-graph: check size of commit data chunk
  midx: check size of revindex chunk
  midx: bounds-check large offset chunk
  midx: check size of object offset chunk
  midx: enforce chunk alignment on reading
  midx: check size of pack names chunk
  commit-graph: check consistency of fanout table
  midx: check size of oid lookup chunk
  commit-graph: check size of oid fanout chunk
  midx: stop ignoring malformed oid fanout chunk
  t: add library for munging chunk-format files
  ...
2023-10-23 13:56:36 -07:00
3f02785de9 doc/git-bisect: clarify git bisect run syntax
The description of the `git bisect run` command syntax at the beginning
of the manpage is `git bisect run <cmd>...`, which isn't quite clear
about what `<cmd>` is or what the `...` mean; one could think that it is
the whole (quoted) command line with all arguments in a single string,
or that it supports multiple commands, or that it doesn't accept
commands with arguments at all.

Change to `git bisect run <cmd> [<arg>...]` to clarify the syntax,
in both the manpage and the `git bisect -h` command output.

Additionally, change `--term-{new,bad}` et al to `--term-(new|bad)`
for consistency with the synopsis syntax conventions.

Signed-off-by: Javier Mora <cousteaulecommandant@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-23 13:04:47 -07:00
12b99928c8 builtin/branch.c: adjust error messages to coding guidelines
As per the CodingGuidelines document, it is recommended that error messages
such as die(), error() and warning(), should start with a lowercase letter
and should not end with a period.

This patch adjusts tests to match updated messages.

Signed-off-by: Isoken June Ibizugbe <isokenjune@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-23 12:22:57 -07:00
243c79fdc7 merge-ort.c: fix typo 'neeed' to 'needed'
Signed-off-by: 王常新 <wchangxin824@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-21 23:13:49 -07:00
ceadf0f3cf The twentieth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 16:23:11 -07:00
4835409be1 Merge branch 'ps/rewritten-is-per-worktree-doc'
Doc update.

* ps/rewritten-is-per-worktree-doc:
  doc/git-worktree: mention "refs/rewritten" as per-worktree refs
2023-10-20 16:23:11 -07:00
92741d83c0 Merge branch 'ak/pretty-decorate-more-fix'
Unlike "git log --pretty=%D", "git log --pretty="%(decorate)" did
not auto-initialize the decoration subsystem, which has been
corrected.

* ak/pretty-decorate-more-fix:
  pretty: fix ref filtering for %(decorate) formats
2023-10-20 16:23:11 -07:00
6b1e2254d6 Merge branch 'vd/loose-ref-iteration-optimization'
The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

* vd/loose-ref-iteration-optimization:
  files-backend.c: avoid stat in 'loose_fill_ref_dir'
  dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
  dir.[ch]: expose 'get_dtype'
  ref-cache.c: fix prefix matching in ref iteration
2023-10-20 16:23:11 -07:00
c662038629 Merge branch 'ty/merge-tree-strategy-options'
"git merge-tree" learned to take strategy backend specific options
via the "-X" option, like "git merge" does.

* ty/merge-tree-strategy-options:
  merge: introduce {copy|clear}_merge_options()
  merge-tree: add -X strategy option
2023-10-20 16:23:11 -07:00
f6d83e2115 git-push doc: more visibility for -q option
The "-v" option is shown in the SYNOPSIS section near the top, but
"-q" is not shown anywhere there.

List "-q" alongside "-v".

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 15:13:38 -07:00
96db17352d rebase: move parse_opt_keep_empty() down
This moves it right next to parse_opt_empty(), which is a much more
logical place. As a side effect, this removes the need for a forward
declaration of imply_merge().

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:47:44 -07:00
37e80a2471 rebase: handle --strategy via imply_merge() as well
At least after the successive trimming of enum rebase_type mentioned in
the previous commit, this code did exactly what imply_merge() does, so
just call it instead.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:47:44 -07:00
a5b5740bf6 rebase: simplify code related to imply_merge()
The code's evolution left in some bits surrounding enum rebase_type that
don't really make sense any more. In particular, it makes no sense to
invoke imply_merge() if the type is already known not to be
REBASE_APPLY, and it makes no sense to assign the type after calling
imply_merge().

enum rebase_type had more values until commit a74b35081c ("rebase: drop
support for `--preserve-merges`") and commit 10cdb9f38a ("rebase: rename
the two primary rebase backends"). The latter commit also renamed
imply_interactive() to imply_merge().

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:47:43 -07:00
3ec6167567 send-email: handle to/cc/bcc from --compose message
If the user writes a message via --compose, send-email will pick up
various headers like "From", "Subject", etc and use them for other
patches as if they were specified on the command-line. But we don't
handle "To", "Cc", or "Bcc" this way; we just tell the user "those
aren't interpeted yet" and ignore them.

But it seems like an obvious thing to want, especially as the same
feature exists when the cover letter is generated separately by
format-patch. There it is gated behind the --to-cover option, but I
don't think we'd need the same control here; since we generate the
--compose template ourselves based on the existing input, if the user
leaves the lines unchanged then the behavior remains the same.

So let's fill in the implementation; like those other headers we already
handle, we just need to assign to the initial_* variables. The only
difference in this case is that they are arrays, so we'll feed them
through parse_address_line() to split them (just like we would when
reading a single string via prompting).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:31:39 -07:00
637e8944a1 Revert "send-email: extract email-parsing code into a subroutine"
This reverts commit b6049542b9.

Prior to that commit, we read the results of the user editing the
"--compose" message in a loop, picking out parts we cared about, and
streaming the result out to a ".final" file. That commit split the
reading/interpreting into two phases; we'd now read into a hash, and
then pick things out of the hash.

The goal was making the code more readable. And in some ways it did,
because the ugly regexes are confined to the reading phase. But it also
introduced several bugs, because now the two phases need to match each
other. In particular:

  - we pick out headers like "Subject: foo" with a case-insensitive
    regex, and then use the user-provided header name as the key in a
    case-sensitive hash. So if the user wrote "subject: foo", we'd no
    longer recognize it as a subject.

  - the namespace for the hash keys conflates header names with meta
    information like "body". If you put "body: foo" in your message, it
    would be misinterpreted as the actual message body (nobody is likely
    to do that in practice, but it seems like an unnecessary danger).

  - the handling for to/cc/bcc is totally broken. The behavior before
    that commit is to recognize and skip those headers, with a note to
    the user that they are not yet handled. Not great, but OK. But
    after the patch, the reading side now splits the addresses into a
    perl array-ref. But the interpreting side doesn't handle this at
    all, and blindly prints the stringified array-ref value. This leads
    to garbage like:

      (mbox) Adding to: ARRAY (0x555b4345c428) from line 'To: ARRAY(0x555b4345c428)'
      error: unable to extract a valid address from: ARRAY (0x555b4345c428)
      What to do with this address? ([q]uit|[d]rop|[e]dit):

    Probably not a huge deal, since nobody should even try to use those
    headers in the first place (since they were not implemented). But
    the new behavior is worse, and indicative of the sorts of problems
    that come from having the two layers.

The revert had a few conflicts, due to later work in this area from
15dc3b9161 (send-email: rename variable for clarity, 2018-03-04) and
d11c943c78 (send-email: support separate Reply-To address, 2018-03-04).
I've ported the changes from those commits over as part of the conflict
resolution.

The new tests show the bugs. Note the use of GIT_SEND_EMAIL_NOTTY in the
second one. Without it, the test is happy to reach outside the test
harness to the developer's actual terminal (when run with the buggy
state before this patch).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:31:32 -07:00
e0c7e2c326 doc/send-email: mention handling of "reply-to" with --compose
The documentation for git-send-email lists the headers handled specially
by --compose in a way that implies that this is the complete set of
headers that are special. But one more was added by d11c943c78
(send-email: support separate Reply-To address, 2018-03-04) and never
documented.

Let's add it, and reword the documentation slightly to avoid having to
specify the list of headers twice (as it is growing and will continue to
do so as we add new features).

If you read the code, you may notice that we also handle MIME-Version
specially, in that we'll avoid over-writing user-provided MIME headers.
I don't think this is worth mentioning, as it's what you'd expect to
happen (as opposed to the other headers, which are picked up to be used
in later emails). And certainly this feature existed when the
documentation was expanded in 01d3861217 (git-send-email.txt: describe
--compose better, 2009-03-16), and we chose not to mention it then.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:31:30 -07:00
7cb26a1722 commit: ignore_non_trailer computes number of bytes to ignore
ignore_non_trailer() returns the _number of bytes_ that should be
ignored from the end of the log message. It does not by itself "ignore"
anything.

Rename this function to remove the leading "ignore" verb, to sound more
like a quantity than an action.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:25:12 -07:00
b1688ea02d grep: die gracefully when outside repository
Die gracefully when `git grep --no-index` is run outside of a Git
repository and the path is outside the directory tree.

If you are not in a Git repository and say:

    git grep --no-index search ..

You trigger a `BUG`:

    BUG: environment.c:213: git environment hasn't been setup
    Aborted (core dumped)

Because `..` is a valid path which is treated as a pathspec. Then
`pathspec` figures out that it is not in the current directory tree. The
`BUG` is triggered when `pathspec` tries to advise the user about how the
path is not in the current (non-existing) repository.

Reported-by: ks1322 ks1322 <ks1322@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 11:06:45 -07:00
10c89a02b0 git-p4 shouldn't attempt to store symlinks in LFS
git-p4.py would attempt to put a symlink in LFS if its file extension
matched git-p4.largeFileExtensions.

Git LFS doesn't store symlinks because smudge/clean filters don't handle
symlinks. They never get passed to the filter process nor the
smudge/clean filters, nor could that occur without a change to the
protocol or command-line interface. Unless Git learned how to send them
to the filters, Git LFS would have a hard time using them in any useful
way.

Git LFS's goal is to move large files out of the repository history, and
symlinks are functionally limited to 4 KiB or a similar size on most
systems.

Signed-off-by: Matthew McClain <mmcclain@noprivs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-19 10:57:44 -07:00
5abb758118 t7601: use "test_path_is_file" etc. instead of "test -f"
Some tests in t7601 use "test -f" and "test ! -f" to see if a path
exists or is missing.

Use test_path_is_file and test_path_is_missing helper functions to
clarify these tests a bit better. This especially matters for the
"missing" case because "test ! -f F" will be happy if "F" exists as a
directory, but the intent of the test is that "F" should not exist, even
as a directory. The updated code expresses this better.

Signed-off-by: Dorcas AnonoLitunya <anonolitunya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-18 16:57:49 -07:00
14d569b1a7 am: align placeholder for --whitespace option with apply
`git am` passes the value given to its `--whitespace` option through
to the underlying `git apply`, and the value is called <action> over
there.  Fix the documentation for the command that calls the value
<option> to say <action> instead.

Note that the option help given by `git am -h` already calls the
value <action>, so there is no need to make a matching change there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-18 16:35:44 -07:00
813d9a9188 The nineteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-18 13:25:42 -07:00
7906b5c957 Merge branch 'jc/merge-ort-attr-index-fix'
Fix "git merge-tree" to stop segfaulting when the --attr-source
option is used.

* jc/merge-ort-attr-index-fix:
  merge-ort: initialize repo in index state
2023-10-18 13:25:42 -07:00
cc7d7183f0 Merge branch 'sn/cat-file-doc-update'
"git cat-file" documentation updates.

* sn/cat-file-doc-update:
  doc/cat-file: make synopsis and description less confusing
2023-10-18 13:25:41 -07:00
0bc6bff9d5 Merge branch 'xz/commit-title-soft-limit-doc'
Doc update.

* xz/commit-title-soft-limit-doc:
  doc: correct the 50 characters soft limit (+)
2023-10-18 13:25:41 -07:00
79861babe2 Merge branch 'tb/repack-max-cruft-size'
"git repack" learned "--max-cruft-size" to prevent cruft packs from
growing without bounds.

* tb/repack-max-cruft-size:
  repack: free existing_cruft array after use
  builtin/repack.c: avoid making cruft packs preferred
  builtin/repack.c: implement support for `--max-cruft-size`
  builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE
  t7700: split cruft-related tests to t7704
2023-10-18 13:25:41 -07:00
a060705d94 commit: do not use cryptic "new_index" in end-user facing messages
These error messages say "new_index" as if that spelling has some
significance to the end users (e.g. the file "$GIT_DIR/new_index"
has some issues), but that is not the case at all.  The i18n folks
were made to include the word literally in the translated messages,
which was not a good idea at all.  Spell it "new index", as we are
just telling the users that we failed to create a new index file.
The term is expected to be translated to the end-users' languages,
not left as if it were a literal file name.

This dates all the way back to the first re-implemenation of "git
commit" command in C (the scripted version did not have such wording
in its error messages), in f5bbc322 (Port git commit to C.,
2007-11-08).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-17 22:09:54 -07:00
48399e9cf0 builtin/add.c: clean up die() messages
As described in the CodingGuidelines document, a single line message
given to die() and its friends should not capitalize its first word,
and should not add full-stop at the end.

Signed-off-by: Naomi Ibe <naomi.ibeh69@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-17 12:41:55 -07:00
990adccbdf status: fix branch shown when not only bisecting
In 83c750acde (wt-status.*: better advice for git status added,
2012-06-05), git-status received new informative messages to describe
the ongoing work in a worktree.

These messages were enhanced in 0722c805d6 (status: show the branch name
if possible in in-progress info, 2013-02-03), to show, if possible, the
branch where the operation was initiated.

Since then, we show incorrect information when several operations are in
progress and one of them is bisect:

   $ git checkout -b foo
   $ GIT_SEQUENCE_EDITOR='echo break >' git rebase -i HEAD~
   $ git checkout -b bar
   $ git bisect start
   $ git status
   ...

   You are currently editing a commit while rebasing branch 'bar' on '...'.

   You are currently bisecting, started from branch 'bar'.

   ...

Note that we erroneously say "while rebasing branch 'bar'" when we
should be referring to "foo".

This must have gone unnoticed for so long because it must be unusual to
start a bisection while another operation is in progress.  And even less
usual to involve different branches.

It caught my attention reviewing a leak introduced in 8b87cfd000
(wt-status: move strbuf into read_and_strip_branch(), 2013-03-16).

A simple change to deal with this situation can be to record in struct
wt_status_state, the branch where the bisect starts separately from the
branch related to other operations.

Let's do it and so we'll be able to display correct information and
we'll avoid the leak as well.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-16 15:05:27 -07:00
ca3285dd69 doc/git-repack: don't mention nonexistent "--unpacked" option
The documentation for geometric repacking mentions a "--unpacked" option
that supposedly changes how loose objects are rolled up. This option has
never existed, and the implied behaviour, namely to include all unpacked
objects into the resulting packfile, is in fact the default behaviour.

Correct the documentation to not mention this option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-16 14:21:59 -07:00
e9cc3a027b doc/git-repack: fix syntax for -g shorthand option
The `-g` switch is a shorthand for `--geometric=` and allows the user to
specify the geometric. The documentation is wrong though and indicates
that the syntax for the shorthand is `-g=<factor>`. In fact though, the
option must be specified without the equals sign via `-g<factor>`.

Fix the syntax accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-16 14:21:59 -07:00
7538f9d89b t5319: make corrupted large-offset test more robust
The test t5319.88 ("reader bounds-checks large offset table") can fail
intermittently. The failure mode looks like this:

  1. An earlier test sets up "objects64", a directory that can be used
     to produce a midx with a corrupted large-offsets table. To get the
     large offsets, it corrupts the normal ".idx" file to have a fake
     large offset, and then builds a midx from that.

     That midx now has a large offset table, which is what we want. But
     we also have a .idx on disk that has a corrupted entry. We'll call
     the object with the corrupted large-offset "X".

  2. In t5319.88, we further corrupt the midx by reducing the size of
     the large-offset chunk (because our goal is to make sure we do not
     do an out-of-bounds read on it).

  3. We then enumerate all of the objects with "cat-file --batch-check
     --batch-all-objects", expecting to see a complaint when we try to
     show object X. We use --batch-all-objects because our objects64
     repo doesn't actually have any refs (but if we check them all, one
     of them will be the failing one). The default batch-check format
     includes %(objecttype) and %(objectsize), both of which require us
     to access the actual pack data (and thus requires looking at the
     offset).

  4a. Usually, this succeeds. We try to output object X, do a lookup via
      the midx for the type/size lookup, and run into the corrupt
      large-offset table.

  4b. But sometimes we hit a different error. If another object points
      to X as a delta base, then trying to find the type of that object
      requires walking the delta chain to the base entry (since only the
      base has the concrete type; deltas themselves are either OFS_DELTA
      or REF_DELTA).

      Normally this would not require separate offset lookups at all, as
      deltas are usually stored as OFS_DELTA, specifying the relative
      offset to the base. But the corrupt idx created in step 1 is done
      directly with "git pack-objects" and does not pass the
      --delta-base-offset option, meaning we have REF_DELTA entries!
      Those do have to consult an index to find the location of the base
      object, and they use the pack .idx to do this. The same pack .idx
      that we know is corrupted from step 1!

      Git does notice the error, but it does so by seeing the corrupt
      .idx file, not the corrupt midx file, and the error it reports is
      different, causing the test to fail.

The set of objects created in the test is deterministic. But the delta
selection seems not to be (which is not too surprising, as it is
multi-threaded). I have seen the failure in Windows CI but haven't
reproduced it locally (not even with --stress). Re-running a failed
Windows CI job tends to work. But when I download and examine the trash
directory from a failed run, it shows a different set of deltas than I
get locally. But the exact source of non-determinism isn't that
important; our test should be robust against any order.

There are a few options to fix this:

  a. It would be OK for the "objects64" setup to "unbreak" the .idx file
     after generating the midx. But then it would be hard for subsequent
     tests to reuse it, since it is the corrupted idx that forces the
     midx to have a large offset table.

  b. The "objects64" setup could use --delta-base-offset. This would fix
     our problem, but earlier tests have many hard-coded offsets. Using
     OFS_DELTA would change the locations of objects in the pack (this
     might even be OK because I think most of the offsets are within the
     .idx file, but it seems brittle and I'm afraid to touch it).

  c. Our cat-file output is in oid order by default. Since we store
     bases before deltas, if we went in pack order (using the
     "--unordered" flag), we'd always see our corrupt X before any delta
     which depends on it. But using "--unordered" means we skip the midx
     entirely. That makes sense, since it is just enumerating all of
     the packs, using the offsets found in their .idx files directly.
     So it doesn't work for our test.

  d. We could ask directly about object X, rather than enumerating all
     of them. But that requires further hard-coding of the oid (both
     sha1 and sha256) of object X. I'd prefer not to introduce more
     brittleness.

  e. We can use a --batch-check format that looks at the pack data, but
     doesn't have to chase deltas. The problem in this case is
     %(objecttype), which has to walk to the base. But %(objectsize)
     does not; we can get the value directly from the delta itself.
     Another option would be %(deltabase), where we report the REF_DELTA
     name but don't look at its data.

I've gone with option (e) here. It's kind of subtle, but it's simple and
has no side effects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-14 10:17:25 -07:00
a9ecda2788 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 14:18:29 -07:00
2920971a7f Merge branch 'jk/decoration-and-other-leak-fixes'
Leakfix.

* jk/decoration-and-other-leak-fixes:
  daemon: free listen_addr before returning
  revision: clear decoration structs during release_revisions()
  decorate: add clear_decoration() function
2023-10-13 14:18:28 -07:00
09dcbb486d Merge branch 'ar/diff-index-merge-base-fix'
"git diff --merge-base X other args..." insisted that X must be a
commit and errored out when given an annotated tag that peels to a
commit, but we only need it to be a committish.  This has been
corrected.

* ar/diff-index-merge-base-fix:
  diff: fix --merge-base with annotated tags
2023-10-13 14:18:28 -07:00
b32f5b6b34 Merge branch 'js/submodule-fix-misuse-of-path-and-name'
In .gitmodules files, submodules are keyed by their names, and the
path to the submodule whose name is $name is specified by the
submodule.$name.path variable.  There were a few codepaths that
mixed the name and path up when consulting the submodule database,
which have been corrected.  It took long for these bugs to be found
as the name of a submodule initially is the same as its path, and
the problem does not surface until it is moved to a different path,
which apparently happens very rarely.

* js/submodule-fix-misuse-of-path-and-name:
  t7420: test that we correctly handle renamed submodules
  t7419: test that we correctly handle renamed submodules
  t7419, t7420: use test_cmp_config instead of grepping .gitmodules
  t7419: actually test the branch switching
  submodule--helper: return error from set-url when modifying failed
  submodule--helper: use submodule_from_path in set-{url,branch}
2023-10-13 14:18:28 -07:00
a45eddec40 Merge branch 'jk/commit-graph-leak-fixes'
Leakfix.

* jk/commit-graph-leak-fixes:
  commit-graph: clear oidset after finishing write
  commit-graph: free write-context base_graph_name during cleanup
  commit-graph: free write-context entries before overwriting
  commit-graph: free graph struct that was not added to chain
  commit-graph: delay base_graph assignment in add_graph_to_chain()
  commit-graph: free all elements of graph chain
  commit-graph: move slab-clearing to close_commit_graph()
  merge: free result of repo_get_merge_bases()
  commit-reach: free temporary list in get_octopus_merge_bases()
  t6700: mark test as leak-free
2023-10-13 14:18:28 -07:00
c75e91499b Merge branch 'la/trailer-test-and-doc-updates'
Test coverage for trailers has been improved.

* la/trailer-test-and-doc-updates:
  trailer doc: <token> is a <key> or <keyAlias>, not both
  trailer doc: separator within key suppresses default separator
  trailer doc: emphasize the effect of configuration variables
  trailer --unfold help: prefer "reformat" over "join"
  trailer --parse docs: add explanation for its usefulness
  trailer --only-input: prefer "configuration variables" over "rules"
  trailer --parse help: expose aliased options
  trailer --no-divider help: describe usual "---" meaning
  trailer: trailer location is a place, not an action
  trailer doc: narrow down scope of --where and related flags
  trailer: add tests to check defaulting behavior with --no-* flags
  trailer test description: this tests --where=after, not --where=before
  trailer tests: make test cases self-contained
2023-10-13 14:18:27 -07:00
e56b9edf22 Merge branch 'ds/mailmap-entry-update'
Update mailmap entry for Derrick.

* ds/mailmap-entry-update:
  mailmap: change primary address for Derrick Stolee
2023-10-13 14:18:27 -07:00
5143ac07b1 Prevent git from rehashing 4GiB files
The index stores file sizes using a uint32_t. This causes any file
that is a multiple of 2^32 to have a cached file size of zero.
Zero is a special value used by racily clean. This causes git to
rehash every file that is a multiple of 2^32 every time git status
or git commit is run.

This patch mitigates the problem by making all files that are a
multiple of 2^32 appear to have a size of 1<<31 instead of zero.

The value of 1<<31 is chosen to keep it as far away from zero
as possible to help prevent things getting mixed up with unpatched
versions of git.

An example would be to have a 2^32 sized file in the index of
patched git. Patched git would save the file as 2^31 in the cache.
An unpatched git would very much see the file has changed in size
and force it to rehash the file, which is safe. The file would
have to grow or shrink by exactly 2^31 and retain all of its
ctime, mtime, and other attributes for old git to not notice
the change.

This patch does not change the behavior of any file that is not
an exact multiple of 2^32.

Signed-off-by: Jason D. Hatton <jhatton@globalfinishing.com>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 13:33:35 -07:00
678eb55f5d t: add a test helper to truncate files
In a future commit, we're going to work with some large files which will
be at least 4 GiB in size.  To take advantage of the sparseness
functionality on most Unix systems and avoid running the system out of
disk, it would be convenient to use truncate(2) to simply create a
sparse file of sufficient size.

However, the GNU truncate(1) utility isn't portable, so let's write a
tiny test helper that does the work for us.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 13:33:35 -07:00
9f9c40cf34 attr: add attr.tree for setting the treeish to read attributes from
44451a2 (attr: teach "--attr-source=<tree>" global option to "git",
2023-05-06) provided the ability to pass in a treeish as the attr
source. In the context of serving Git repositories as bare repos like we
do at GitLab however, it would be easier to point --attr-source to HEAD
for all commands by setting it once.

Add a new config attr.tree that allows this.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 11:43:29 -07:00
2386535511 attr: read attributes from HEAD when bare repo
The motivation for 44451a2e5e (attr: teach "--attr-source=<tree>" global
option to "git" , 2023-05-06), was to make it possible to use
gitattributes with bare repositories.

To make it easier to read gitattributes in bare repositories however,
let's just make HEAD:.gitattributes the default. This is in line with
how mailmap works, 8c473cecfd (mailmap: default mailmap.blob in bare
repositories, 2012-12-13).

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 11:43:29 -07:00
59167d7d09 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-12 12:18:27 -07:00
4ae4c70577 Merge branch 'js/ci-coverity'
GitHub CI workflow has learned to trigger Coverity check.

* js/ci-coverity:
  coverity: detect and report when the token or project is incorrect
  coverity: allow running on macOS
  coverity: support building on Windows
  coverity: allow overriding the Coverity project
  coverity: cache the Coverity Build Tool
  ci: add a GitHub workflow to submit Coverity scans
2023-10-12 12:18:27 -07:00
c70e7a3cfd Merge branch 'jm/git-status-submodule-states-docfix'
Docfix.

* jm/git-status-submodule-states-docfix:
  git-status.txt: fix minor asciidoc format issue
2023-10-12 12:18:26 -07:00
6e47cfcffc Merge branch 'rs/parse-opt-ctx-cleanup'
Code clean-up.

* rs/parse-opt-ctx-cleanup:
  parse-options: drop unused parse_opt_ctx_t member
2023-10-12 12:18:26 -07:00
6e5457d8c7 mailmap: change primary address for Derrick Stolee
The previous primary address is no longer valid.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-12 10:59:36 -07:00
5b2424b658 grep: -f <path> is relative to $cwd
Just like OPT_FILENAME() does, "git grep -f <path>" should treat
the <path> relative to the original $cwd by paying attention to the
prefix the command is given.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-12 10:41:59 -07:00
d9b6634589 stash: be careful what we store
"git stash store" is meant to store what "git stash create"
produces, as these two are implementation details of the end-user
facing "git stash save" command.  Even though it is clearly
documented as such, users would try silly things like "git stash
store HEAD" to render their stash unusable.

Worse yet, because "git stash drop" does not allow such a stash
entry to be removed, "git stash clear" would be the only way to
recover from such a mishap.  Reuse the logic that allows "drop" to
refrain from working on such a stash entry to teach "store" to avoid
storing an object that is not a stash entry in the first place.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-11 16:27:30 -07:00
b182658e3e merge: introduce {copy|clear}_merge_options()
When mostly the same set of options are to be used to perform
multiple merges, one instance of the merge_options structure may
want to be created and used by copying from the same template
instance.  We saw such a use recently in "git merge-tree".

Let's make the pattern official by introducing copy_merge_options()
as a supported way to make a copy of the structure, and also give
clear_merge_options() to release any resources held by a copied
instance.  Currently we only make a shallow copy, so the former is a
mere structure assignment while the latter is a no-op, but this may
change in the future as the members of merge_options structure
evolve.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-11 13:37:47 -07:00
aab89be2eb The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-10 11:39:15 -07:00
1fdedb7c7d Merge branch 'cc/repack-sift-filtered-objects-to-separate-pack'
"git repack" machinery learns to pay attention to the "--filter="
option.

* cc/repack-sift-filtered-objects-to-separate-pack:
  gc: add `gc.repackFilterTo` config option
  repack: implement `--filter-to` for storing filtered out objects
  gc: add `gc.repackFilter` config option
  repack: add `--filter=<filter-spec>` option
  pack-bitmap-write: rebuild using new bitmap when remapping
  repack: refactor finding pack prefix
  repack: refactor finishing pack-objects command
  t/helper: add 'find-pack' test-tool
  pack-objects: allow `--filter` without `--stdout`
2023-10-10 11:39:15 -07:00
afb0d0880a Merge branch 'ds/init-diffstat-width'
Code clean-up.

* ds/init-diffstat-width:
  diff --stat: set the width defaults in a helper function
2023-10-10 11:39:14 -07:00
a7a2d10421 Merge branch 'cw/prelim-cleanup'
Shuffle some bits across headers and sources to prepare for
libification effort.

* cw/prelim-cleanup:
  parse: separate out parsing functions from config.h
  config: correct bad boolean env value error message
  wrapper: reduce scope of remove_or_warn()
  hex-ll: separate out non-hash-algo functions
2023-10-10 11:39:14 -07:00
3df51ea0a5 Merge branch 'eb/limit-bulk-checkin-to-blobs'
The "streaming" interface used for bulk-checkin codepath has been
narrowed to take only blob objects for now, with no real loss of
functionality.

* eb/limit-bulk-checkin-to-blobs:
  bulk-checkin: only support blobs in index_bulk_checkin
2023-10-10 11:39:14 -07:00
8b3aa36f5a doc/git-worktree: mention "refs/rewritten" as per-worktree refs
Some references are special in the context of worktrees as they are
considered to be per-worktree instead of shared across all of the
worktrees. Most importantly, this includes "refs/worktree/" that have
explicitly been designed such that users can create per-woorktree refs.
But there are also special references that have an associated meaning
like "refs/bisect/", which is used to track state of git-bisect(1).

These special per-worktree references are documented in git-worktree(1),
but one instance is missing. In a9be29c981 (sequencer: make refs
generated by the `label` command worktree-local, 2018-04-25), we have
converted "refs/rewritten/" to be a per-worktree reference as well.
These references are used by our sequencer infrastructure to generate
labels for rebased commits. So in order to allow for multiple concurrent
rebases to happen in different worktrees, these references need to be
tracked per worktree.

We forgot to update our documentation to mention these new per-worktree
references, which is fixed by this patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-10 09:23:16 -07:00
ca06f0fe5d chunk-format: drop pair_chunk_unsafe()
There are no callers left, and we don't want anybody to add new ones (they
should use the not-unsafe version instead). So let's drop the function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:02 -07:00
12192a9db9 commit-graph: detect out-of-order BIDX offsets
The BIDX chunk tells us the offsets at which each commit's Bloom filters
can be found in the BDAT chunk. We compute the length of each filter by
checking the offsets of neighbors and subtracting them.

If the offsets are out of order, then we'll get a negative length, which
we then store as a very large unsigned value. This can cause us to read
out-of-bounds memory, as we access the hash data modulo "filter->len *
BITS_PER_WORD".

We can easily detect this case when loading the individual filters.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:02 -07:00
581e0f8b18 commit-graph: check bounds when accessing BIDX chunk
We load the bloom_filter_indexes chunk using pair_chunk(), so we have no
idea how big it is. This can lead to out-of-bounds reads if it is
smaller than expected, since we index it based on the number of commits
found elsewhere in the graph file.

We can check the chunk size up front, like we do for CDAT and other
chunks with one fixed-size record per commit.

The test case demonstrates the problem. It actually won't segfault,
because we end up reading random data from the follow-on chunk (BDAT in
this case), and the bounds checks added in the previous patch complain.
But this is by no means assured, and you can craft a commit-graph file
with BIDX at the end (or a smaller BDAT) that does segfault.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
920f400e91 commit-graph: check bounds when accessing BDAT chunk
When loading Bloom filters from a commit-graph file, we use the offset
values in the BIDX chunk to index into the memory mapped for the BDAT
chunk. But since we don't record how big the BDAT chunk is, we just
trust that the BIDX offsets won't cause us to read outside of the chunk
memory. A corrupted or malicious commit-graph file will cause us to
segfault (in practice this isn't a very interesting attack, since
commit-graph files are local-only, and the worst case is an
out-of-bounds read).

We can't fix this by checking the chunk size during parsing, since the
data in the BDAT chunk doesn't have a fixed size (that's why we need the
BIDX in the first place). So we'll fix it in two parts:

  1. Record the BDAT chunk size during parsing, and then later check
     that the BIDX offsets we look up are within bounds.

  2. Because the offsets are relative to the end of the BDAT header, we
     must also make sure that the BDAT chunk is at least as large as the
     expected header size. Otherwise, we overflow when trying to move
     past the header, even for an offset of "0". We can check this
     early, during the parsing stage.

The error messages are rather verbose, but since this is not something
you'd expect to see outside of severe bugs or corruption, it makes sense
to err on the side of too many details. Sadly we can't mention the
filename during the chunk-parsing stage, as we haven't set g->filename
at this point, nor passed it down through the stack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
ee6a792412 commit-graph: bounds-check generation overflow chunk
If the generation entry in a commit-graph doesn't fit, we instead insert
an offset into a generation overflow chunk. But since we don't record
the size of the chunk, we may read outside the chunk if the offset we
find on disk is malicious or corrupted.

We can't check the size of the chunk up-front; it will vary based on how
many entries need overflow. So instead, we'll do a bounds-check before
accessing the chunk memory. Unfortunately there is no error-return from
this function, so we'll just have to die(), which is what it does for
other forms of corruption.

As with other cases, we can drop the st_mult() call, since we know our
bounds-checked value will fit within a size_t.

Before this patch, the test here actually "works" because we read
garbage data from the next chunk. And since that garbage data happens
not to provide a generation number which changes the output, it appears
to work. We could construct a case that actually segfaults or produces
wrong output, but it would be a bit tricky. For our purposes its
sufficient to check that we've detected the bounds error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
4a3c34662b commit-graph: check size of generations chunk
We neither check nor record the size of the generations chunk we parse
from a commit-graph file. This should have one uint32_t for each commit
in the file; if it is smaller (due to corruption, etc), we may read
outside the mapped memory.

The included test segfaults without this patch, as it shrinks the size
considerably (and the chunk is near the end of the file, so we read off
the end of the array rather than accidentally reading another chunk).

We can fix this by checking the size up front (like we do for other
fixed-size chunks, like CDAT).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
6cf61d0db5 commit-graph: bounds-check base graphs chunk
When we are loading a commit-graph chain, we check that each slice of the
chain points to the appropriate set of base graphs via its BASE chunk.
But since we don't record the size of the chunk, we may access
out-of-bounds memory if the file is corrupted.

Since we know the number of entries we expect to find (based on the
position within the commit-graph-chain file), we can just check the size
up front.

In theory this would also let us drop the st_mult() call a few lines
later when we actually access the memory, since we know that the
computed offset will fit in a size_t. But because the operands
"g->hash_len" and "n" have types "unsigned char" and "int", we'd have to
cast to size_t first. Leaving the st_mult() does that cast, and makes it
more obvious that we don't have an overflow problem.

Note that the test does not actually segfault before this patch, since
it just reads garbage from the chunk after BASE (and indeed, it even
rejects the file because that garbage does not have the expected hash
value). You could construct a file with BASE at the end that did
segfault, but corrupting the existing one is easy, and we can check
stderr for the expected message.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
9622610e55 commit-graph: detect out-of-bounds extra-edges pointers
If an entry in a commit-graph file has more than 2 parents, the
fixed-size parent fields instead point to an offset within an "extra
edges" chunk. We blindly follow these, assuming that the chunk is
present and sufficiently large; this can lead to an out-of-bounds read
for a corrupt or malicious file.

We can fix this by recording the size of the chunk and adding a
bounds-check in fill_commit_in_graph(). There are a few tricky bits:

  1. We'll switch from working with a pointer to an offset. This makes
     some corner cases just fall out naturally:

      a. If we did not find an EDGE chunk at all, our size will
         correctly be zero (so everything is "out of bounds").

      b. Comparing "size / 4" lets us make sure we have at least 4 bytes
         to read, and we never compute a pointer more than one element
         past the end of the array (computing a larger pointer is
         probably OK in practice, but is technically undefined
         behavior).

      c. The current code casts to "uint32_t *". Replacing it with an
         offset avoids any comparison between different types of pointer
         (since the chunk is stored as "unsigned char *").

  2. This is the first case in which fill_commit_in_graph() may return
     anything but success. We need to make sure to roll back the
     "parsed" flag (and any parents we might have added before running
     out of buffer) so that the caller can cleanly fall back to
     loading the commit object itself.

     It's a little non-trivial to do this, and we might benefit from
     factoring it out. But we can wait on that until we actually see a
     second case where we return an error.

As a bonus, this lets us drop the st_mult() call. Since we've already
done a bounds check, we know there won't be any integer overflow (it
would imply our buffer is larger than a size_t can hold).

The included test does not actually segfault before this patch (though
you could construct a case where it does). Instead, it reads garbage
from the next chunk which results in it complaining about a bogus parent
id. This is sufficient for our needs, though (we care that the fallback
succeeds, and that stderr mentions the out-of-bounds read).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
b72df612af commit-graph: check size of commit data chunk
We expect a commit-graph file to have a fixed-size data record for each
commit in the file (and we know the number of commits to expct from the
size of the lookup table). If we encounter a file where this is too
small, we'll look past the end of the chunk (and possibly even off the
mapped memory).

We can fix this by checking the size up front when we record the
pointer.

The included test doesn't segfault, since it ends up reading bytes
from another chunk. But it produces nonsense results, since the values
it reads are garbage. Our test notices this by comparing the output to a
non-corrupted run of the same command (and of course we also check that
the expected error is printed to stderr).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
c0fe9b2da5 midx: check size of revindex chunk
When we load a revindex from disk, we check the size of the file
compared to the number of objects we expect it to have. But when we use
a RIDX chunk stored directly in the midx, we just access the memory
directly. This can lead to out-of-bounds memory access for a corrupted
or malicious multi-pack-index file.

We can catch this by recording the RIDX chunk size, and then checking it
against the expected size when we "load" the revindex. Note that this
check is much simpler than the one that load_revindex_from_disk() does,
because we just have the data array with no header (so we do not need
to account for the header size, and nor do we need to bother validating
the header values).

The test confirms both that we catch this case, and that we continue the
process (the revindex is required to use the midx bitmaps, but we
fallback to a non-bitmap traversal).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
2abd56e9b2 midx: bounds-check large offset chunk
When we see a large offset bit in the regular midx offset table, we
use the entry as an index into a separate large offset table (just like
a pack idx does). But we don't bounds-check the access to that large
offset table (nor even record its size when we parse the chunk!).

The equivalent code for a regular pack idx is in check_pack_index_ptr().
But things are a bit simpler here because of the chunked format: we can
just check our array index directly.

As a bonus, we can get rid of the st_mult() here. If our array
bounds-check is successful, then we know that the result will fit in a
size_t (and the bounds check uses a division to avoid overflow
entirely).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
0924869b4e midx: check size of object offset chunk
The object offset chunk has one fixed-size entry for each object in the
midx. But since we don't check its size, we may access out-of-bounds
memory if we see a corrupt or malicious midx file.

Sine the entries are fixed-size, the total length can be known up-front,
and we can just check it while parsing the chunk (this is similar to
what we do when opening pack idx files, which contain a similar offset
table).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
c9b9fefc13 midx: enforce chunk alignment on reading
The midx reader assumes chunks are aligned to a 4-byte boundary: we
treat the fanout chunk as an array of uint32_t, indexing it to feed the
results to ntohl(). Without aligning the chunks, we may violate the
CPU's alignment constraints. Though many platforms allow this, some do
not. And certanily UBSan will complain, since it is undefined behavior.

Even though most chunks are naturally 4-byte-aligned (because they are
storing uint32_t or larger types), PNAM is not. It stores NUL-terminated
pack names, so you can have a valid chunk with any length. The writing
side handles this by 4-byte-aligning the chunk, introducing a few extra
NULs as necessary. But since we don't check this on the reading side, we
may end up with a misaligned fanout and trigger the undefined behavior.

We have two options here:

  1. Swap out ntohl(fanout[i]) for get_be32(fanout+i) everywhere. The
     latter handles alignment itself. It's possible that it's slightly
     slower (though in practice I'm not sure how true that is,
     especially for these code paths which then go on to do a binary
     search).

  2. Enforce the alignment when reading the chunks. This is easy to do,
     since the table-of-contents reader can check it in one spot.

I went with the second option here, just because it places less burden
on maintenance going forward (it is OK to continue using ntohl), and we
know it can't have any performance impact on the actual reads.

The commit-graph code uses the same chunk API. It's usually also 4-byte
aligned, but some chunks are not (like Bloom filter BDAT chunks). So
we'll pass "1" here to allow any alignment. It doesn't suffer from the
same problem as midx with its fanout because the fanout chunk is always
the first (and the rest of the format dictates that the first chunk will
start aligned).

The new test shows the effect on a midx with a misaligned PNAM chunk.
Note that the midx-reading code treats chunk-toc errors as soft, falling
back to the non-midx path rather than calling die(), as we do for other
parsing errors. Arguably we should make all of these behave the same,
but that's out of scope for this patch. For now the test just expects
the fallback behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
72a9a08283 midx: check size of pack names chunk
We parse the pack-name chunk as a series of NUL-terminated strings. But
since we don't look at the chunk size, there's nothing to guarantee that
we don't parse off the end of the chunk (or even off the end of the
mapped file).

We can record the length, and then as we parse make sure that we never
walk past it.

The new test exercises the case, though note that it does not actually
segfault before this patch. It hits a NUL byte somewhere in one of the
other chunks, and comes up with a garbage pack name. You could construct
one that reads out-of-bounds (e.g., a PNAM chunk at the end of file),
but this case is simple and sufficient to check that we detect the
problem.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:01 -07:00
4169d89645 commit-graph: check consistency of fanout table
We use bsearch_hash() to look up items in the oid index of a
commit-graph. It also has a fanout table to reduce the initial range in
which we'll search. But since the fanout comes from the on-disk file, a
corrupted or malicious file can cause us to look outside of the
allocated index memory.

One solution here would be to pass the total table size to
bsearch_hash(), which could then bounds check the values it reads from
the fanout. But there's an inexpensive up-front check we can do, and
it's the same one used by the midx and pack idx code (both of which
likewise have fanout tables and use bsearch_hash(), but are not affected
by this bug):

  1. We can check the value of the final fanout entry against the size
     of the table we got from the index chunk. These must always match,
     since the fanout is just slicing up the index.

       As a side note, the midx and pack idx code compute it the other
       way around: they use the final fanout value as the object count, and
       check the index size against it. Either is valid; if they
       disagree we cannot know which is wrong (a corrupted fanout value,
       or a too-small table of oids).

  2. We can quickly scan the fanout table to make sure it is
     monotonically increasing. If it is, then we know that every value
     is less than or equal to the final value, and therefore less than
     or equal to the table size.

     It would also be sufficient to just check that each fanout value is
     smaller than the final one, but the midx and pack idx code both do
     a full monotonicity check. It's the same cost, and it catches some
     other corruptions (though not all; the checks done by "commit-graph
     verify" are more complete but more expensive, and our goal here is
     to be fast and memory-safe).

There are two new tests. One just checks the final fanout value (this is
the mirror image of the "too small oid lookup" case added for the midx
in the previous commit; it's flipped here because commit-graph considers
the oid lookup chunk to be the source of truth).

The other actually creates a fanout with many out-of-bounds entries, and
prior to this patch, it does cause the segfault you'd expect. But note
that the error is not "your fanout entry is out-of-bounds", but rather
"fanout value out of order". That's because we leave the final fanout
value in place (to get past the table size check), making the index
non-monotonic (the second-to-last entry is big, but the last one must
remain small to match the actual table).

We need adjustments to a few existing tests, as well:

  - an earlier test in t5318 corrupts the fanout and runs "commit-graph
    verify". Its message is now changed, since we catch the problem
    earlier (during the load step, rather than the careful validation
    step).

  - in t5324, we test that "commit-graph verify --shallow" does not do
    expensive verification on the base file of the chain. But the
    corruption it uses (munging a byte at offset 1000) happens to be in
    the middle of the fanout table. And now we detect that problem in
    the cheaper checks that are performed for every part of the graph.
    We'll push this back to offset 1500, which is only caught by the
    more expensive checksum validation.

    Likewise, there's a later test in t5324 which munges an offset 100
    bytes into a file (also in the fanout table) that is referenced by
    an alternates file. So we now find that corruption during the load
    step, rather than the verification step. At the very least we need
    to change the error message (like the case above in t5318). But it
    is probably good to make sure we handle all parts of the
    verification even for alternate graph files. So let's likewise
    corrupt byte 1500 and make sure we found the invalid checksum.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:00 -07:00
fc926567ed midx: check size of oid lookup chunk
When reading an on-disk multi-pack-index, we take the number of objects
in the midx from the final value of the fanout table. But we just
blindly assume that the chunk containing the actual oid entries is the
correct size. This can lead to us reading out-of-bounds memory if the
lookup chunk is too small (or if the fanout is corrupted; when they
don't agree we cannot tell which one is wrong).

Note that we bump the assignment of m->num_objects into the fanout
parser callback, so that it's set when we parse the lookup table
(otherwise we'd have to manually record the lookup table size and check
it later).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:00 -07:00
52e2e8d43d commit-graph: check size of oid fanout chunk
We load the oid fanout chunk with pair_chunk(), which means we never see
the size of the chunk. We just assume the on-disk file uses the
appropriate size, and if it's too small we'll access random memory.

It's easy to check this up-front; the fanout always consists of 256
uint32's, since it is a fanout of the first byte of the hash pointing
into the oid index. These parameters can't be changed without
introducing a new chunk type.

This matches the similar check in the midx OIDF chunk (but note that
rather than checking for the error immediately, the graph code just
leaves parts of the struct NULL and checks for required fields later).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:00 -07:00
e3c9600397 midx: stop ignoring malformed oid fanout chunk
When we load the oid-fanout chunk, our callback checks that its size is
reasonable and returns an error if not. However, the caller only checks
our return value against CHUNK_NOT_FOUND, so we end up ignoring the
error completely! Using a too-small fanout table means we end up
accessing random memory for the fanout and segfault.

We can fix this by checking for any non-zero return value, rather than
just CHUNK_NOT_FOUND, and adjusting our error message to cover both
cases. We could handle each error code individually, but there's not
much point for such a rare case. The extra message produced in the
callback makes it clear what is going on.

The same pattern is used in the adjacent code. Those cases are actually
OK for now because they do not use a custom callback, so the only error
they can get is CHUNK_NOT_FOUND. But let's convert them, as this is an
accident waiting to happen (especially as we convert some of them away
from pair_chunk). The error messages are more verbose, but it should be
rare for a user to see these anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:00 -07:00
86b008ee61 t: add library for munging chunk-format files
When testing corruption of files using the chunk format (like
commit-graphs and midx files), it's helpful to be able to modify bytes
in specific chunks. This requires being able both to read the
table-of-contents (to find the chunk to modify) but also to adjust it
(to account for size changes in the offsets of subsequent chunks).

We have some tests already which corrupt chunk files, but they have some
downsides:

  1. They are very brittle, as they manually compute the expected size
     of a particular instance of the file (e.g., see the definitions
     starting with NUM_OBJECTS in t5319).

  2. Because they rely on manual offsets and don't read the
     table-of-contents, they're limited to overwriting bytes. But there
     are many interesting corruptions that involve changing the sizes of
     chunks (especially smaller-than-expected ones).

This patch adds a perl script which makes such corruptions easy. We'll
use it in subsequent patches.

Note that we could get by with just a big "perl -e" inside the helper
function. I chose to put it in a separate script for two reasons. One,
so we don't have to worry about the extra layer of shell quoting. And
two, the script is kind of big, and running the tests with "-x" would
repeatedly dump it into the log output.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:00 -07:00
570b8b8836 chunk-format: note that pair_chunk() is unsafe
The pair_chunk() function is provided as an easy helper for parsing
chunks that just want a pointer to a set of bytes. But every caller has
a hidden bug: because we return only the pointer without the matching
chunk size, the callers have no clue how many bytes they are allowed to
look at. And as a result, they may read off the end of the mmap'd data
when the on-disk file does not match their expectations.

Since chunk files are typically used for local-repository data like
commit-graph files and midx's, the security implications here are pretty
mild. The worst that can happen is that you hand somebody a corrupted
repository tarball, and running Git on it does an out-of-bounds read and
crashes. So it's worth being more defensive, but we don't need to drop
everything and fix every caller immediately.

I noticed the problem because the pair_chunk_fn() callback does not look
at its chunk_size argument, and wanted to annotate it to silence
-Wunused-parameter. We could do that now, but we'd lose the hint that
this code should be audited and fixed.

So instead, let's set ourselves up for going down that path:

  1. Provide a pair_chunk() function that does return the size, which
     prepares us for fixing these cases.

  2. Rename the existing function to pair_chunk_unsafe(). That gives us
     an easy way to grep for cases which still need to be fixed, and the
     name should cause anybody adding new calls to think twice before
     using it.

There are no callers of the "safe" version yet, but we'll add some in
subsequent patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:55:00 -07:00
2cdb796101 files-backend.c: avoid stat in 'loose_fill_ref_dir'
Modify the 'readdir' loop in 'loose_fill_ref_dir' to, rather than 'stat' a
file to determine whether it is a directory or not, use 'get_dtype'.

Currently, the loop uses 'stat' to determine whether each dirent is a
directory itself or not in order to construct the appropriate ref cache
entry. If 'stat' fails (returning a negative value), the dirent is silently
skipped; otherwise, 'S_ISDIR(st.st_mode)' is used to check whether the entry
is a directory.

On platforms that include an entry's d_type in in the 'dirent' struct, this
extra 'stat' check is redundant. We can use the 'get_dtype' method to
extract this information on platforms that support it (i.e. where
NO_D_TYPE_IN_DIRENT is unset), and derive it with 'stat' on platforms that
don't. Because 'stat' is an expensive call, this confers a
modest-but-noticeable performance improvement when iterating over large
numbers of refs (approximately 20% speedup in 'git for-each-ref' in a 30k
ref repo).

Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set
to 1 to replicate the existing handling of symlink dirents. This
unfortunately requires calling 'stat' on the associated entry regardless of
platform, but symlinks in the loose ref store are highly unlikely since
they'd need to be created manually by a user.

Note that this patch also changes the condition for skipping creation of a
ref entry from "when 'stat' fails" to "when the d_type is anything other
than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because
the platform doesn't support d_type in dirents or some other reason) or
DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If
the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be
skipped. However, it will also be skipped if it is any other valid d_type
(e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not
handle these properly anyway, so we can safely constrain accepted types to
directories and regular files.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:53:14 -07:00
aa79636fe7 dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
Add a 'follow_symlink' boolean option to 'get_type()'. If 'follow_symlink'
is enabled, DT_LNK (in addition to DT_UNKNOWN) d_types triggers the
stat-based d_type resolution, using 'stat' instead of 'lstat' to get the
type of the followed symlink. Note that symlinks are not followed
recursively, so a symlink pointing to another symlink will still resolve to
DT_LNK.

Update callers in 'diagnose.c' to specify 'follow_symlink = 0' to preserve
current behavior.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:53:13 -07:00
6dc1004333 dir.[ch]: expose 'get_dtype'
Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to
'dir.h' so that it is accessible to callers in other files. The function and
its documentation are moved verbatim except for a small addition to the
description clarifying what the 'path' arg represents.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:53:13 -07:00
5305474ec4 ref-cache.c: fix prefix matching in ref iteration
Update 'cache_ref_iterator_advance' to skip over refs that are not matched
by the given prefix.

Currently, a ref entry is considered "matched" if the entry name is fully
contained within the prefix:

* prefix: "refs/heads/v1"
* entry: "refs/heads/v1.0"

OR if the prefix is fully contained in the entry name:

* prefix: "refs/heads/v1.0"
* entry: "refs/heads/v1"

The first case is always correct, but the second is only correct if the ref
cache entry is a directory, for example:

* prefix: "refs/heads/example"
* entry: "refs/heads/"

Modify the logic in 'cache_ref_iterator_advance' to reflect these
expectations:

1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
   ref cache entry do not overlap at all. Skip this entry.
2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
   inside this entry if it is a directory. Skip if the entry is not a
   directory, otherwise iterate over it.
3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
   that the cache entry (directory or not) is fully contained by or equal to
   the prefix. Iterate over this entry.

Note that condition 2 relies on the names of directory entries having the
appropriate trailing slash. The existing function documentation of
'create_dir_entry' explicitly calls out the trailing slash requirement, so
this is a safe assumption to make.

This bug generally doesn't have any user-facing impact, since it requires:

1. using a non-empty prefix without a trailing slash in an iteration like
   'for_each_fullref_in',
2. the callback to said iteration not reapplying the original filter (as
   for-each-ref does) to ensure unmatched refs are skipped, and
3. the repository having one or more refs that match part of, but not all
   of, the prefix.

However, there are some niche scenarios that meet those criteria
(specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
tests covering those cases to demonstrate the fix in this patch.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:53:13 -07:00
e95bafc52f merge-ort: initialize repo in index state
initialize_attr_index() does not initialize the repo member of
attr_index. Starting in 44451a2e5e (attr: teach "--attr-source=<tree>"
global option to "git", 2023-05-06), this became a problem because
istate->repo gets passed down the call chain starting in
git_check_attr(). This gets passed all the way down to
replace_refs_enabled(), which segfaults when accessing r->gitdir.

Fix this by initializing the repository in the index state.

Signed-off-by: John Cai <johncai86@gmail.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 14:42:02 -07:00
7c446ac790 completion: complete '--dd'
'--dd' only makes sense for 'git log' and 'git show', so add it to
__git_log_show_options which is referenced in the completion for these
two commands.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:47:29 -07:00
c8e5cb0658 diff-merges: introduce '--dd' option
This option provides a shortcut to request diff with respect to first
parent for any kind of commit, universally. It's implemented as pure
synonym for "--diff-merges=first-parent --patch".

Gives user quick and universal way to see what changes, exactly, were
brought to a branch by merges as well as by regular commits.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:47:29 -07:00
be3820c60c diff-merges: improve --diff-merges documentation
* Put descriptions of convenience shortcuts first, so they are the
  first things reader observes rather than lengthy detailed stuff.

* Get rid of very long line containing all the --diff-merges formats
  by replacing them with <format>, and putting each supported format
  on its own line.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:47:29 -07:00
cebfaaa333 doc/cat-file: make synopsis and description less confusing
The DESCRIPTION's "first form" is actually the 1st, 2nd, 3rd and 5th
form in SYNOPSIS, the "second form" is the 4th one.

Interestingly, this state of affairs was introduced in
97fe725075 (cat-file docs: fix SYNOPSIS and "-h" output, 2021-12-28)
with the claim of "Now the two will match again." ("the two" being
DESCRIPTION and SYNOPSIS)...

The description also suffers from other correctness and clarity issues,
e.g., the "first form" paragraph discusses -p, -s and -t, but leaves out
-e, which is included in the corresponding SYNOPSIS section; the second
paragraph mentions <format>, which doesn't occur in SYNOPSIS at all, and
of the three batch options, really only describes the behavior of
--batch-check.  Also the mention of "drivers" seems an implementation
detail not adding much clarity in a short summary (and isn't expanded
upon in the rest of the man page, either).

Rather than trying to maintain one-to-one (or N-to-M) correspondence
between the DESCRIPTION and SYNOPSIS forms, creating duplication and
providing opportunities for error, shorten the former into a concise
summary describing the two general modes of operation: batch and
non-batch, leaving details to the subsequent manual sections.

While here, fix a grammar error in the description of -e and make the
following further minor improvements:

  NAME:
    shorten ("content or type and size" isn't the whole story; say
    "details" and leave the actual details to later sections)

  SYNOPSIS and --help:
    move the (--textconv | --filters) form before --batch, closer
    to the other non-batch forms

Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:46:33 -07:00
1627e6b4e4 doc: correct the 50 characters soft limit (+)
The soft limit of the first line of the commit message should be
"no more than 50 characters" or "50 characters or less", but not
"less than 50 character".

This is an addition to commit c2c349a15c (doc: correct the 50 characters
soft limit, 2023-09-28).

Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:07:26 -07:00
5fbcdb2082 documentation: add missing parenthesis
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:47 -07:00
798cddfa51 documentation: add missing quotes
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:47 -07:00
845c6ca90e documentation: add missing fullstops
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:47 -07:00
4d542687fc documentation: add some commas where they are helpful
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:44 -07:00
42bdb80a08 documentation: fix whitespace issues
Get rid of extraneous whitespace, replace tab-after-fullstop with
space, etc.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
2150b6fb47 documentation: fix capitalization
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
f4e1851a29 documentation: fix punctuation
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
9a9fd289cc documentation: use clearer prepositions
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
0cac690e1a documentation: add missing hyphens
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
f22fdf33af documentation: remove unnecessary hyphens
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
0a4f051f93 documentation: add missing article
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
3771d00257 documentation: fix choice of article
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
03b3431e6a documentation: whitespace is already generally plural
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
6cc668c0ab documentation: fix singular vs. plural
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
401a4e257e documentation: fix verb vs. noun
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
af181e4dbd documentation: fix adjective vs. noun
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
5676b04a44 documentation: fix verb tense
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
7f7e6bbe06 documentation: employ consistent verb tense for a list
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
ce14cc0b00 documentation: fix subject/verb agreement
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
859a6d6045 documentation: remove extraneous words
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
8936352242 documentation: add missing words
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
dbe33c5ad0 documentation: fix apostrophe usage
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:29 -07:00
384f7d17d2 documentation: fix typos
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:06:24 -07:00
82e81edf71 documentation: fix small error
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:04:21 -07:00
cf6cac2005 documentation: wording improvements
Diff best viewed with --color-diff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 12:04:21 -07:00
2b09d16aba pretty: fix ref filtering for %(decorate) formats
Mark pretty formats containing "%(decorate" as requiring decoration in
userformat_find_requirements(), same as "%d" and "%D".

Without this, cmd_log_init_finish() didn't invoke load_ref_decorations()
with the decoration_filter it puts together, and hence filtering options
such as --decorate-refs were quietly ignored.

Amend one of the %(decorate) checks in t4205-log-pretty-formats.sh to
test this.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 11:25:13 -07:00
c1b754d059 repack: free existing_cruft array after use
We allocate an array of packed_git pointers so that we can sort the list
of cruft packs, but we never free the array, causing a small leak. Note
that we don't need to free the packed_git structs themselves; they're
owned by the repository object.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 10:27:34 -07:00
ffbf6a748d doc: update list archive reference to use lore.kernel.org
No disrespect to other mailing list archives, but the local part of
their URLs will become pretty much meaningless once the archives go
out of service, and we learned the lesson hard way when $gmane
stopped serving.

Let's point into https://lore.kernel.org/ for an article that can be
found there, because the local part of the URL has the Message-Id:
that can be used to find the same message in other archives, even if
lore goes down.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-06 16:46:59 -07:00
badf2fe1c3 daemon: free listen_addr before returning
We build up a string list of listen addresses from the command-line
arguments, but never free it. This causes t5811 to complain of a leak
(though curiously it seems to do so only when compiled with gcc, not
with clang).

To handle this correctly, we have to do a little refactoring:

  - there are two exit points from the main function, depending on
    whether we are entering the main loop or serving a single client
    (since rather than a traditional fork model, we re-exec ourselves
    with the extra "--serve" argument to accommodate Windows).

    We don't need --listen at all in the --serve case, of course, but it
    is passed along by the parent daemon, which simply copies all of the
    command-line options it got.

  - we just "return serve()" to run the main loop, giving us no chance
    to do any cleanup

So let's use a "ret" variable to store the return code, and give
ourselves a single exit point at the end. That gives us one place to do
cleanup.

Note that this code also uses the "use a no-dup string-list, but
allocate strings we add to it" trick, meaning string_list_clear() will
not realize it should free them. We can fix this by switching to a "dup"
string-list, but using the "append_nodup" function to add to it (this is
preferable to tweaking the strdup_strings flag before clearing, as it
puts all the subtle memory-ownership code together).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 14:54:58 -07:00
8ef8da4842 revision: clear decoration structs during release_revisions()
The point of release_revisions() is to free memory associated with the
rev_info struct, but we have several "struct decoration" members that
are left untouched. Since the previous commit introduced a function to
do that, we can just call it.

We do have to provide some specialized callbacks to map the void
pointers onto real ones (the alternative would be casting the existing
function pointers; this generally works because "void *" is usually
interchangeable with a struct pointer, but it is technically forbidden
by the standard).

Since the line-log code does not expose the type it stores in the
decoration (nor of course the function to free it), I put this behind a
generic line_log_free() entry point. It's possible we may need to add
more line-log specific bits anyway (running t4211 shows a number of
other leaks in the line-log code).

While this doubtless cleans up many leaks triggered by the test suite,
the only script which becomes leak-free is t4217, as it does very little
beyond a simple traversal (its existing leak was from the use of
--children, which is now fixed).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 14:54:57 -07:00
771868243c decorate: add clear_decoration() function
There's not currently any way to free the resources associated with a
decoration struct. As a result, we have several memory leaks which
cannot easily be plugged.

Let's add a "clear" function and make use of it in the example code of
t9004. This removes the only leak from that script, so we can mark it as
passing the leak sanitizer.

Curiously this leak is found only when running SANITIZE=leak with clang,
but not with gcc.  But it is a bog-standard leak: we allocate some
memory in a local variable struct, and then exit main() without
releasing it. I'm not sure why gcc doesn't find it. After this
patch, both compilers report it as leak-free.

Note that the clear function takes a callback to free the individual
entries. That's not needed for our example (which is just decorating
with ints), but will be for real callers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 14:54:55 -07:00
3c1e2c2113 builtin/repack.c: avoid making cruft packs preferred
When doing a `--geometric` repack, we make sure that the preferred pack
(if writing a MIDX) is the largest pack that we *didn't* repack. That
has the effect of keeping the preferred pack in sync with the pack
containing a majority of the repository's reachable objects.

But if the repository happens to double in size, we'll repack
everything. Here we don't specify any `--preferred-pack`, and instead
let the MIDX code choose.

In the past, that worked fine, since there would only be one pack to
choose from: the one we just wrote. But it's no longer necessarily the
case that there is one pack to choose from. It's possible that the
repository also has a cruft pack, too.

If the cruft pack happens to come earlier in lexical order (and has an
earlier mtime than any non-cruft pack), we'll pick that pack as
preferred. This makes it impossible to reuse chunks of the reachable
pack verbatim from pack-objects, so is sub-optimal.

Luckily, this is a somewhat rare circumstance to be in, since we would
have to repack the entire repository during a `--geometric` repack, and
the cruft pack would have to sort ahead of the pack we just created.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 13:26:11 -07:00
37dc6d8104 builtin/repack.c: implement support for --max-cruft-size
Cruft packs are an alternative mechanism for storing a collection of
unreachable objects whose mtimes are recent enough to avoid being
pruned out of the repository.

When cruft packs were first introduced back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and
a7d493833f (builtin/pack-objects.c: --cruft with expiration,
2022-05-20), the recommended workflow consisted of:

  - Repacking periodically, either by packing anything loose in the
    repository (via `git repack -d`) or producing a geometric sequence
    of packs (via `git repack --geometric=<d> -d`).

  - Every so often, splitting the repository into two packs, one cruft
    to store the unreachable objects, and another non-cruft pack to
    store the reachable objects.

Repositories may (out of band with the above) choose periodically to
prune out some unreachable objects which have aged out of the grace
period by generating a pack with `--cruft-expiration=<approxidate>`.

This allowed repositories to maintain relatively few packs on average,
and quarantine unreachable objects together in a cruft pack, avoiding
the pitfalls of holding unreachable objects as loose while they age out
(for more, see some of the details in 3d89a8c118
(Documentation/technical: add cruft-packs.txt, 2022-05-20)).

This all works, but can be costly from an I/O-perspective when
frequently repacking a repository that has many unreachable objects.
This problem is exacerbated when those unreachable objects are rarely
(if every) pruned.

Since there is at most one cruft pack in the above scheme, each time we
update the cruft pack it must be rewritten from scratch. Because much of
the pack is reused, this is a relatively inexpensive operation from a
CPU-perspective, but is very costly in terms of I/O since we end up
rewriting basically the same pack (plus any new unreachable objects that
have entered the repository since the last time a cruft pack was
generated).

At the time, we decided against implementing more robust support for
multiple cruft packs. This patch implements that support which we were
lacking.

Introduce a new option `--max-cruft-size` which allows repositories to
accumulate cruft packs up to a given size, after which point a new
generation of cruft packs can accumulate until it reaches the maximum
size, and so on. To generate a new cruft pack, the process works like
so:

  - Sort a list of any existing cruft packs in ascending order of pack
    size.

  - Starting from the beginning of the list, group cruft packs together
    while the accumulated size is smaller than the maximum specified
    pack size.

  - Combine the objects in these cruft packs together into a new cruft
    pack, along with any other unreachable objects which have since
    entered the repository.

Once a cruft pack grows beyond the size specified via `--max-cruft-size`
the pack is effectively frozen. This limits the I/O churn up to a
quadratic function of the value specified by the `--max-cruft-size`
option, instead of behaving quadratically in the number of total
unreachable objects.

When pruning unreachable objects, we bypass the new code paths which
combine small cruft packs together, and instead start from scratch,
passing in the appropriate `--max-pack-size` down to `pack-objects`,
putting it in charge of keeping the resulting set of cruft packs sized
correctly.

This may seem like further I/O churn, but in practice it isn't so bad.
We could prune old cruft packs for whom all or most objects are removed,
and then generate a new cruft pack with just the remaining set of
objects. But this additional complexity buys us relatively little,
because most objects end up being pruned anyway, so the I/O churn is
well contained.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 13:26:11 -07:00
b5b1f4c0ec builtin/repack.c: parse --max-pack-size with OPT_MAGNITUDE
The repack builtin takes a `--max-pack-size` command-line argument which
it uses to feed into any of the pack-objects children that it may spawn
when generating a new pack.

This option is parsed with OPT_STRING, meaning that we'll accept
anything as input, punting on more fine-grained validation until we get
down into pack-objects.

This is fine, but it's wasteful to spend an entire sub-process just to
figure out that one of its option is bogus. Instead, parse the value of
`--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the
known-good result down to pack-objects.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 13:18:54 -07:00
f0a39ba504 t/README: fix multi-prerequisite example
With the broken quoting the test wouldn't even parse correctly, but
there's also the '==' instead of POSIX '=' (of the shells I tested,
busybox ash, bash and ksh (93 and OpenBSD) accept '==', dash and zsh do
not), and 'print 2' from Python 2 days.

(I assume the test failing due to 3 != 4 is intentional or immaterial.)

Fixes: 93a5724613 ("test-lib: Add support for multiple test prerequisites")
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 12:55:38 -07:00
72fac03522 doc/gitk: s/sticked/stuck/
The terminology was changed in b0d12fc9b2 (Use the word 'stuck'
instead of 'sticked').

Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 12:55:38 -07:00
a62a7060a5 git-jump: admit to passing merge mode args to ls-files
There's even an example of such usage in the README.

Fixes: 67ba13e5a4 ("git-jump: pass "merge" arguments to ls-files")
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 12:55:38 -07:00
043465a6cf doc/diff-options: improve wording of the log.diffMerges mention
Fix the grammar ("which default value is") and reword to match other
similar descriptions (say "configuration variable" instead of
"parameter", link to git-config(1)).

Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 12:55:38 -07:00
97509a3497 doc: fix some typos, grammar and wording issues
Signed-off-by: Štěpán Němec <stepnem@smrk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 12:55:38 -07:00
3349520e1a coverity: detect and report when the token or project is incorrect
When trying to obtain the MD5 of the Coverity Scan Tool (in order to
decide whether a cached version can be used or a new version has to be
downloaded), it is possible to get a 401 (Authorization required) due to
either an incorrect token, or even more likely due to an incorrect
Coverity project name.

Seeing an authorization failure that is caused by an incorrect project
name was somewhat surprising to me when developing the Coverity
workflow, as I found such a failure suggestive of an incorrect token
instead.

So let's provide a helpful error message about that specifically when
encountering authentication issues.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05 11:45:46 -07:00
3a06386e31 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-04 13:29:09 -07:00
64b2419ccc Merge branch 'bb/unicode-width-table-15'
The display width table for unicode characters has been updated for
Unicode 15.1

* bb/unicode-width-table-15:
  unicode: update the width tables to Unicode 15.1
2023-10-04 13:28:53 -07:00
ba7d57b8e5 Merge branch 'xz/commit-title-soft-limit-doc'
Doc tweak.

* xz/commit-title-soft-limit-doc:
  doc: correct the 50 characters soft limit
2023-10-04 13:28:53 -07:00
c3c0020673 Merge branch 'jk/commit-graph-verify-fix'
Various fixes to "git commit-graph verify".

* jk/commit-graph-verify-fix:
  commit-graph: report incomplete chains during verification
  commit-graph: tighten chain size check
  commit-graph: detect read errors when verifying graph chain
  t5324: harmonize sha1/sha256 graph chain corruption
  commit-graph: check mixed generation validation when loading chain file
  commit-graph: factor out chain opening function
2023-10-04 13:28:53 -07:00
42b495e9c5 Merge branch 'ks/ref-filter-mailmap'
"git for-each-ref" and friends learn to apply mailmap to authorname
and other fields.

* ks/ref-filter-mailmap:
  ref-filter: add mailmap support
  t/t6300: introduce test_bad_atom
  t/t6300: cleanup test_atom
2023-10-04 13:28:53 -07:00
3029189186 Merge branch 'ps/revision-cmdline-stdin-not'
"git rev-list --stdin" learned to take non-revisions (like "--not")
recently from the standard input, but the way such a "--not" was
handled was quite confusing, which has been rethought.  This is
potentially a change that breaks backward compatibility.

* ps/revision-cmdline-stdin-not:
  revision: make pseudo-opt flags read via stdin behave consistently
2023-10-04 13:28:52 -07:00
641307d3b6 git-status.txt: fix minor asciidoc format issue
The list of additional XY values for submodules in short format
isn't formatted consistently with the rest of the document.
Format as list for consistency.

Signed-off-by: Javier Mora <cousteaulecommandant@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-04 09:55:26 -07:00
bd1c20ccd7 t7420: test that we correctly handle renamed submodules
Create a second submodule with a name that differs from its path. Test
that calling set-url modifies the correct .gitmodules entries. Make sure
we don't create a section named after the path instead of the name.

Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 15:32:37 -07:00
32bff3675e t7419: test that we correctly handle renamed submodules
Add the submodule again with an explicitly different name and path. Test
that calling set-branch modifies the correct .gitmodules entries. Make
sure we don't create a section named after the path instead of the name.

Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 15:32:34 -07:00
5fc880632d t7419, t7420: use test_cmp_config instead of grepping .gitmodules
We have a test function to verify config files. Use it as it's more
precise.

Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 15:32:31 -07:00
b027fb0784 t7419: actually test the branch switching
The submodule repo the test set up had the 'topic' branch checked out,
meaning the repo's default branch (HEAD) is the 'topic' branch.

The following tests then pretended to switch between the default branch
and the 'topic' branch. This was papered over by continually adding
commits to the 'topic' branch and checking if the submodule gets updated
to this new commit.

Return the submodule repo to the 'main' branch after setup so we can
actually test the switching behavior.

Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 15:32:27 -07:00
387c122131 submodule--helper: return error from set-url when modifying failed
set-branch will return an error when setting the config fails so I don't
see why set-url shouldn't. Also skip the sync in this case.

Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 15:30:43 -07:00
6327085aa0 submodule--helper: use submodule_from_path in set-{url,branch}
The commands need a path to a submodule but treated it as the name when
modifying the .gitmodules file, leading to confusion when a submodule's
name does not match its path.

Because calling submodule_from_path initializes the submodule cache, we
need to manually trigger a reread before syncing, as the cache is
missing the config change we just made.

Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 15:30:42 -07:00
da09e7af68 commit-graph: clear oidset after finishing write
In graph_write() we store commits in an oidset, but never clean it up,
leaking the contents. We should clear it in the cleanup section.

The oidset comes from 6830c36077 (commit-graph.h: replace 'commit_hex'
with 'commits', 2020-04-13), but it was just replacing a string_list
that was also leaked. Curiously, we fixed the leak of some adjacent
variables in commit fa8953cb40 (builtin/commit-graph.c: extract
'read_one_commit()', 2020-05-18), but the oidset wasn't included for
some reason.

In combination with the preceding commits, this lets us mark t5324 as
leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
d9c84c6d67 commit-graph: free write-context base_graph_name during cleanup
Commit 6c622f9f0b (commit-graph: write commit-graph chains, 2019-06-18)
added a base_graph_name string to the write_commit_graph_context struct.
But the end-of-function cleanup forgot to free it, causing a leak.

This (presumably in combination with the preceding leak-fixes) lets us
mark t5328 as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
274bfa7f28 commit-graph: free write-context entries before overwriting
When writing a split graph file, we replace the final element of the
commit_graph_hash_after and commit_graph_filenames_after arrays. But
since these are allocated strings, we need to free them before
overwriting to avoid leaking the old string.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
1d94abfe1e commit-graph: free graph struct that was not added to chain
When reading the graph chain file, we open (and allocate) each
individual slice it mentions and then add them to a linked-list chain.
But if adding to the chain fails (e.g., because the base-graph chunk it
contains didn't match what we expected), we leave the function without
freeing the graph struct that caused the failure, leaking it.

We can fix it by calling free_graph_commit().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
991d549f74 commit-graph: delay base_graph assignment in add_graph_to_chain()
When adding a graph to a chain, we do some consistency checks and then
if everything looks good, set g->base_graph to add a link to the chain.
But when we added a new consistency check in 209250ef38 (commit-graph.c:
prevent overflow in add_graph_to_chain(), 2023-07-12), it comes _after_
we've already set g->base_graph. So we might return failure, even though
we actually added to the chain.

This hasn't caused a bug yet, because after failing to add to the chain,
we discard the failed graph struct completely, leaking it. But in order
to fix that, it's important that the struct be in a consistent and
predictable state after the failure.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
09a75abba4 commit-graph: free all elements of graph chain
When running "commit-graph verify", we call free_commit_graph(). That's
sufficient for the case of a single graph file, but if we loaded a chain
of split graph files, they form a linked list via the base_graph
pointers. We need to free all of them, or we leak all but the first
struct.

We can make this work by teaching free_commit_graph() to walk the
base_graph pointers and free each element. This in turn lets us simplify
close_commit_graph(), which does the same thing by recursion (we cannot
just use close_commit_graph() in "commit-graph verify", as the function
takes a pointer to an object store, and the verify command creates a
single one-off graph struct).

While indenting the code in free_commit_graph() for the loop, I noticed
that setting g->data to NULL is rather pointless, as we free the struct
a few lines later. So I cleaned that up while we're here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
ac6d45d11f commit-graph: move slab-clearing to close_commit_graph()
When closing and freeing a commit-graph, the main entry point is
close_commit_graph(), which then uses close_commit_graph_one() to
recurse through the base_graph links and free each one.

Commit 957ba814bf (commit-graph: when closing the graph, also release
the slab, 2021-09-08) put the call to clear the slab into the recursive
function, but this is pointless: there's only a single global slab
variable. It works OK in practice because clearing the slab is
idempotent, but it makes the code harder to reason about and refactor.

Move it into the parent function so it's only called once (and there are
no other direct callers of the recursive close_commit_graph_one(), so we
are not hurting them).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
716a6b2c3a merge: free result of repo_get_merge_bases()
We call repo_get_merge_bases(), which allocates a commit_list, but never
free the result, causing a leak.

The obvious solution is to free it, but we need to look at the contents
of the first item to decide whether to leave the loop. One option is to
free it in both code paths. But since the commit that the list points to
is longer-lived than the list itself, we can just dereference it
immediately, free the list, and then continue with the existing logic.
This is about the same amount of code, but keeps the list management all
in one place.

This lets us mark a number of merge-related test scripts as leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:24 -07:00
ec97ad120c commit-reach: free temporary list in get_octopus_merge_bases()
We loop over the set of commits to merge, and for each one compute the
merge base against the existing set of merge base candidates we've
found. Then we replace the candidate set with a simple assignment of the
list head, leaking the old list. We should free it first before
assignment.

This makes t5521 leak-free, so mark it as such.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:23 -07:00
be4b578c69 t6700: mark test as leak-free
This test has never leaked since it was added. Let's annotate it to make
sure it stays that way (and to reduce noise when looking for other
leak-free scripts after we fix some leaks).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 14:28:23 -07:00
f4cbb32c27 parse-options: drop unused parse_opt_ctx_t member
5c387428f1 (parse-options: don't emit "ambiguous option" for aliases,
2019-04-29) added "updated_options" to struct parse_opt_ctx_t, but it
has never been used.  Remove it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03 13:15:03 -07:00
78de1c6c32 t7700: split cruft-related tests to t7704
A small handful of the tests in t7700 (the main script for testing
functionality of 'git repack') are specifically related to cruft pack
operations.

Prepare for adding new cruft pack-related tests by moving the existing
set into a new test script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 18:28:47 -07:00
7673ecd2dc t1016-compatObjectFormat: add tests to verify the conversion between objects
For now my strategy is simple.  Create two identical repositories one
in each format.  Use fixed timestamps. Verify the dynamically computed
compatibility objects from one repository match the objects stored in
the other repository.

A general limitation of this strategy is that the git when generating
signed tags and commits with compatObjectFormat enabled will generate
a signature for both formats.  To overcome this limitation I have
added "test-tool delete-gpgsig" that when fed an signed commit or tag
with two signatures deletes one of the signatures.

With that in place I can have "git commit" and  "git tag" generate
signed objects, have my tool delete one, and feed the new object
into "git hash-object" to create the kinds of commits and tags
git without compatObjectFormat enabled will generate.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
3afa8d86ac t1006: test oid compatibility with cat-file
Update the existing tests that are oid based to test that cat-file
works correctly with the normal oid and the compat_oid.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
baab175c1d t1006: rename sha1 to oid
Before I extend this test, changing the naming of the relevant
hash from sha1 to oid.  Calling the hash sha1 is incorrect today
as it can be either sha1 or sha256 depending on the value of
GIT_DEFAULT_HASH_FUNCTION when the test is called.

I plan to test sha1 and sha256 simultaneously in the same repository.
Having a name like sha1 will be even more confusing.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
48b16ab231 test-lib: compute the compatibility hash so tests may use it
Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
c68be1fd31 builtin/ls-tree: let the oid determine the output algorithm
Update cmd_ls_tree to call get_oid_with_context and pass
GET_OID_HASH_ANY instead of calling the simpler repo_get_oid.

This implments in ls-tree the behavior that asking to display a sha1
hash displays the corrresponding sha1 encoded object and asking to
display a sha256 hash displayes the corresponding sha256 encoded
object.

This is useful for testing the conversion of an object to an
equivlanet object encoded with a different hash function.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
8d691757b8 object-file: handle compat objects in check_object_signature
Update check_object_signature to find the hash algorithm the exising
signature uses, and to use the same hash algorithm when recomputing it
to check the signature is valid.

This will be useful when teaching git ls-tree to display objects
encoded with the compat hash algorithm.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
efed687edc tree-walk: init_tree_desc take an oid to get the hash algorithm
To make it possible for git ls-tree to display the tree encoded
in the hash algorithm of the oid specified to git ls-tree, update
init_tree_desc to take as a parameter the oid of the tree object.

Update all callers of init_tree_desc and init_tree_desc_gently
to pass the oid of the tree object.

Use the oid of the tree object to discover the hash algorithm
of the oid and store that hash algorithm in struct tree_desc.

Use the hash algorithm in decode_tree_entry and
update_tree_entry_internal to handle reading a tree object encoded in
a hash algorithm that differs from the repositories hash algorithm.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
d6222a2d05 builtin/cat-file: let the oid determine the output algorithm
Use GET_OID_HASH_ANY when calling get_oid_with_context.  This
implements the semi-obvious behaviour that specifying a sha1 oid shows
the output for a sha1 encoded object, and specifying a sha256 oid
shows the output for a sha256 encoded object.

This is useful for testing the the conversion of an object to an
equivalent object encoded with a different hash function.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
d7446c89b8 rev-parse: add an --output-object-format parameter
The new --output-object-format parameter returns the oid in the
specified format.

This is a generally useful plumbing facility.  It is useful for writing
test cases and for directly querying the translation maps.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
9ae702faf1 repository: implement extensions.compatObjectFormat
Add a configuration option to enable updating and reading from
compatibility hash maps when git accesses the reposotiry.

Call the helper function repo_set_compat_hash_algo with the value
that compatObjectFormat is set to.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
2328ebaa4e object-file: update object_info_extended to reencode objects
oid_object_info_extended is updated to detect an oid encoding that
does not match the current repository, use repo_oid_to_algop to find
the correspoding oid in the current repository and to return the data
for the oid.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:40 -07:00
08a45903cb object-file-convert: convert commits that embed signed tags
As mentioned in the hash function transition plan commit mergetag
lines need to be handled.  The commit mergetag lines embed an entire
tag object in a commit object.

Keep the implementation sane if not fast by unembedding the tag
object, converting the tag object, and embedding the new tag object,
in the new commit object.

In the long run I don't expect any other approach is maintainable, as
tag objects may be extended in ways that require additional
translation.

To keep the implementation of convert_commit_object maintainable I
have modified convert_commit_object to process the lines in any order,
and to fail on unknown lines.  We can't know ahead of time if a new
line might embed something that needs translation or not so it is
better to fail and require the code to be updated instead of silently
mistranslating objects.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
318b023e4a object-file-convert: convert commit objects when writing
When writing a commit object in a repository with both SHA-1 and
SHA-256, we'll need to convert our commit objects so that we can write
the hash values for both into the repository.  To do so, let's add a
function to convert commit objects.

Read the commit object and map the tree value and any of the parent
values, and copy the rest of the commit through unmodified.  Note that
we don't need to modify the signature headers, because they are the
same under both algorithms.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
ac45d995f3 object-file-convert: don't leak when converting tag objects
Upon close examination I discovered that while brian's code to convert
tag objects was functionally correct, it leaked memory.

Rearrange the code so that all error checking happens before any
memory is allocated.

Add code to release the temporary strbufs the code uses.

The code pretty much assumes the tag object ends with a newline,
so add an explict test to verify that is the case.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
c8762c30df object-file-convert: convert tag objects when writing
When writing a tag object in a repository with both SHA-1 and SHA-256,
we'll need to convert our commit objects so that we can write the hash
values for both into the repository.  To do so, let's add a function to
convert tag objects.

Note that signatures for tag objects in the current algorithm trail the
message, and those for the alternate algorithm are in headers.
Therefore, we parse the tag object for both a trailing signature and a
header and then, when writing the other format, swap the two around.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
33a14e81ae object-file-convert: add a function to convert trees between algorithms
In the future, we're going to want to provide SHA-256 repositories that
have compatibility support for SHA-1 as well.  In order to do so, we'll
need to be able to convert tree objects from SHA-256 to SHA-1 by writing
a tree with each SHA-256 object ID mapped to a SHA-1 object ID.

We implement a function, convert_tree_object, that takes an existing
tree buffer and writes it to a new strbuf, converting between
algorithms.  Let's make this function generic, because while we only
need it to convert from the main algorithm to the compatibility
algorithm now, we may need to do the other way around in the future,
such as for transport.

We avoid reusing the code in decode_tree_entry because that code
normalizes data, and we don't want that here.  We want to produce a
complete round trip of data, so if, for example, the old entry had a
wrongly zero-padded mode, we'd want to preserve that when converting to
ensure a stable hash value.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
45b3b12141 object: factor out parse_mode out of fast-import and tree-walk into in object.h
builtin/fast-import.c and tree-walk.c have almost identical version of
get_mode.  The two functions started out the same but have diverged
slightly.  The version in fast-import changed mode to a uint16_t to
save memory.  The version in tree-walk started erroring if no mode was
present.

As far as I can tell both of these changes are valid for both of the
callers, so add the both changes and place the common parsing helper
in object.h

Rename the helper from get_mode to parse_mode so it does not
conflict with another helper named get_mode in diff-no-index.c

This will be used shortly in a new helper decode_tree_entry_raw
which is used to compute cmpatibility objects as part of
the sha256 transition.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
095261a18d cache: add a function to read an OID of a specific algorithm
Currently, we always read a object ID of the current algorithm with
oidread.  However, once we start converting objects, we'll need to
consider what happens when we want to read an object ID of a specific
algorithm, such as the compatibility algorithm.  To make this easier,
let's define oidread_algop, which specifies which algorithm we should
use for our object ID, and define oidread in terms of it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
867386d0c8 tag: sign both hashes
When we write a tag the object oid is specific to the hash algorithm.

This matters when a tag is signed.  The hash transition plan calls for
signatures on both the sha1 form and the sha256 form of the object,
and for both of those signatures to live in the tag object.

To generate tag object with multiple signatures, first compute the
unsigned form of the tag, and then if the tag is being signed compute
the unsigned form of the tag with the compatibilityr hash.  Then
compute compute the signatures of both buffers.

Once the signatures are computed add them to both buffers.  This
allows computing the compatibility hash in do_sign, saving
write_object_file the expense of recomputing the compatibility tag
just to compute it's hash.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
6bcc5fa20d commit: export add_header_signature to support handling signatures on tags
Rename add_commit_signature as add_header_signature, and expose it so
that it can be used for converting tags from one object format to
another.

Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
a3e8ae5473 commit: convert mergetag before computing the signature of a commit
It so happens that commit mergetag lines embed a tag object.  So to
compute the compatible signature of a commit object that has mergetag
lines the compatible embedded tag must be computed first.

Implement this by duplicating and converting the commit extra headers
into the compatible version of the commit extra headers, that need
to be passed to commit_tree_extended.

To handle merge tags only the compatible extra headers need to be
computed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
6206089cbd commit: write commits for both hashes
When we write a commit, we include data that is specific to the hash
algorithm, such as parents and the root tree.  In order to write both a
SHA-1 commit and a SHA-256 version, we need to convert between them.

However, a straightforward conversion isn't necessarily what we want.
When we sign a commit, we sign its data, so if we create a commit for
SHA-256 and then write a SHA-1 version, we'll still have only signed the
SHA-256 data.  While this is valid, it would be better to sign both
forms of data so people using SHA-1 can verify the signatures as well.

Consequently, we don't want to use the standard mapping that occurs when
we write an object.  Instead, let's move most of the writing of the
commit into a separate function which is agnostic of the hash algorithm
and which simply writes into a buffer and specify both versions of the
object ourselves.

We can then call this function twice: once with the SHA-256 contents,
and if SHA-1 is enabled, once with the SHA-1 contents.  If we're signing
the commit, we then sign both versions and append both signatures to
both buffers.  To produce a consistent hash, we always append the
signatures in the order in which Git implemented them: first SHA-1, then
SHA-256.

In order to make this signing code work, we split the commit signing
code into two functions, one which signs the buffer, and one which
appends the signature.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
c2538492df object-file: add a compat_oid_in parameter to write_object_file_flags
To create the proper signatures for commit objects both versions of
the commit object need to be generated and signed.  After that it is
a waste to throw away the work of generating the compatibility hash
so update write_object_file_flags to take a compatibility hash input
parameter that it can use to skip the work of generating the
compatability hash.

Update the places that don't generate the compatability hash to
pass NULL so it is easy to tell write_object_file_flags should
not attempt to use their compatability hash.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
63a6745a07 object-file: update the loose object map when writing loose objects
To implement SHA1 compatibility on SHA256 repositories the loose
object map needs to be updated whenver a loose object is written.
Updating the loose object map this way allows git to support
the old hash algorithm in constant time.

The functions write_loose_object, and stream_loose_object are
the only two functions that write to the loose object store.

Update stream_loose_object to compute the compatibiilty hash, update
the loose object, and then call repo_add_loose_object_map to update
the loose object map.

Update write_object_file_flags to convert the object into
it's compatibility encoding, hash the compatibility encoding,
write the object, and then update the loose object map.

Update force_object_loose to lookup the hash of the compatibility
encoding, write the loose object, and then update the loose object
map.

Update write_object_file_literally to convert the object into it's
compatibility hash encoding, hash the compatibility enconding, write
the object, and then update the loose object map, when the type string
is a known type.  For objects with an unknown type this results in a
partially broken repository, as the objects are not mapped.

The point of write_object_file_literally is to generate a partially
broken repository for testing.  For testing skipping writing the loose
object map is much more useful than refusing to write the broken
object at all.

Except that the loose objects are updated before the loose object map
I have not done any analysis to see how robust this scheme is in the
event of failure.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:39 -07:00
a2d923fb0d loose: compatibilty short name support
Update loose_objects_cache when udpating the loose objects map.  This
oidtree is used to discover which oids are possibilities when
resolving short names, and it can support a mixture of sha1
and sha256 oids.

With this any oid recorded objects/loose-objects-idx is usable
for resolving an oid to an object.

To make this maintainable a helper insert_loose_map is factored
out of load_one_loose_object_map and repo_add_loose_object_map,
and then modified to also update the loose_objects_cache.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:38 -07:00
23b2c7e95b loose: add a mapping between SHA-1 and SHA-256 for loose objects
As part of the transition plan, we'd like to add a file in the .git
directory that maps loose objects between SHA-1 and SHA-256.  Let's
implement the specification in the transition plan and store this data
on a per-repository basis in struct repository.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:38 -07:00
15a1ca1abe repository: add a compatibility hash algorithm
We currently have support for using a full stage 4 SHA-256
implementation.  However, we'd like to support interoperability with
SHA-1 repositories as well.  The transition plan anticipates a
compatibility hash algorithm configuration option that we can use to
implement support for this.  Let's add an element to the repository
structure that indicates the compatibility hash algorithm so we can use
it when we need to consider interoperability between algorithms.

Add a helper function repo_set_compat_hash_algo that takes a
compatibility hash algorithm and sets "repo->compat_hash_algo".  If
GIT_HASH_UNKNOWN is passed as the compatibility hash algorithm
"repo->compat_hash_algo" is set to NULL.

For now, the code results in "repo->compat_hash_algo" always being set
to NULL, but that will change once a configuration option is added.

Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:38 -07:00
52fca06db2 object-names: support input of oids in any supported hash
Support short oids encoded in any algorithm, while ensuring enough of
the oid is specified to disambiguate between all of the oids in the
repository encoded in any algorithm.

By default have the code continue to only accept oids specified in the
storage hash algorithm of the repository, but when something is
ambiguous display all of the possible oids from any accepted oid
encoding.

A new flag is added GET_OID_HASH_ANY that when supplied causes the
code to accept oids specified in any hash algorithm, and to return the
oids that were resolved.

This implements the functionality that allows both SHA-1 and SHA-256
object names, from the "Object names on the command line" section of
the hash function transition document.

Care is taken in get_short_oid so that when the result is ambiguous
the output remains the same if GIT_OID_HASH_ANY was not supplied.  If
GET_OID_HASH_ANY was supplied objects of any hash algorithm that match
the prefix are displayed.

This required updating repo_for_each_abbrev to give it a parameter so
that it knows to look at all hash algorithms.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:38 -07:00
d50cbe4a5d oid-array: teach oid-array to handle multiple kinds of oids
While looking at how to handle input of both SHA-1 and SHA-256 oids in
get_oid_with_context, I realized that the oid_array in
repo_for_each_abbrev might have more than one kind of oid stored in it
simultaneously.

Update to oid_array_append to ensure that oids added to an oid array
always have an algorithm set.

Update void_hashcmp to first verify two oids use the same hash algorithm
before comparing them to each other.

With that oid-array should be safe to use with different kinds of
oids simultaneously.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:38 -07:00
5e9d802a33 object-file-convert: stubs for converting from one object format to another
Two basic functions are provided:
- convert_object_file Takes an object file it's type and hash algorithm
  and converts it into the equivalent object file that would
  have been generated with hash algorithm "to".

  For blob objects there is no conversation to be done and it is an
  error to use this function on them.

  For commit, tree, and tag objects embedded oids are replaced by the
  oids of the objects they refer to with those objects and their
  object ids reencoded in with the hash algorithm "to".  Signatures
  are rearranged so that they remain valid after the object has
  been reencoded.

- repo_oid_to_algop which takes an oid that refers to an object file
  and returns the oid of the equivalent object file generated
  with the target hash algorithm.

The pair of files object-file-convert.c and object-file-convert.h are
introduced to hold as much of this logic as possible to keep this
conversion logic cleanly separated from everything else and in the
hopes that someday the code will be clean enough git can support
compiling out support for sha1 and the various conversion functions.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:57:38 -07:00
9b96046b92 gc: add gc.repackFilterTo config option
A previous commit implemented the `gc.repackFilter` config option
to specify a filter that should be used by `git gc` when
performing repacks.

Another previous commit has implemented
`git repack --filter-to=<dir>` to specify the location of the
packfile containing filtered out objects when using a filter.

Let's implement the `gc.repackFilterTo` config option to specify
that location in the config when `gc.repackFilter` is used.

Now when `git gc` will perform a repack with a <dir> configured
through this option and not empty, the repack process will be
passed a corresponding `--filter-to=<dir>` argument.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:31 -07:00
71c5aec1f5 repack: implement --filter-to for storing filtered out objects
A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to filter out some objects from the main pack and move them
into a new different pack.

It would be nice if this new different pack could be created in a
different directory than the regular pack. This would make it possible
to move large blobs into a pack on a different kind of storage, for
example cheaper storage.

Even in a different directory, this pack can be accessible if, for
example, the Git alternates mechanism is used to point to it. In fact
not using the Git alternates mechanism can corrupt a repo as the
generated pack containing the filtered objects might not be accessible
from the repo any more. So setting up the Git alternates mechanism
should be done before using this feature if the user wants the repo to
be fully usable while this feature is used.

In some cases, like when a repo has just been cloned or when there is no
other activity in the repo, it's Ok to setup the Git alternates
mechanism afterwards though. It's also Ok to just inspect the generated
packfile containing the filtered objects and then just move it into the
'.git/objects/pack/' directory manually. That's why it's not necessary
for this command to check that the Git alternates mechanism has been
already setup.

While at it, as an example to show that `--filter` and `--filter-to`
work well with other options, let's also add a test to check that these
options work well with `--max-pack-size`.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:31 -07:00
1cd43a9ed9 gc: add gc.repackFilter config option
A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to filter out some objects from the main pack and move them
into a new different pack.

Users might want to perform such a cleanup regularly at the same time as
they perform other repacks and cleanups, so as part of `git gc`.

Let's allow them to configure a <filter-spec> for that purpose using a
new gc.repackFilter config option.

Now when `git gc` will perform a repack with a <filter-spec> configured
through this option and not empty, the repack process will be passed a
corresponding `--filter=<filter-spec>` argument.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:30 -07:00
48a9b67b43 repack: add --filter=<filter-spec> option
This new option puts the objects specified by `<filter-spec>` into a
separate packfile.

This could be useful if, for example, some blobs take up a lot of
precious space on fast storage while they are rarely accessed. It could
make sense to move them into a separate cheaper, though slower, storage.

It's possible to find which new packfile contains the filtered out
objects using one of the following:

  - `git verify-pack -v ...`,
  - `test-tool find-pack ...`, which a previous commit added,
  - `--filter-to=<dir>`, which a following commit will add to specify
    where the pack containing the filtered out objects will be.

This feature is implemented by running `git pack-objects` twice in a
row. The first command is run with `--filter=<filter-spec>`, using the
specified filter. It packs objects while omitting the objects specified
by the filter. Then another `git pack-objects` command is launched using
`--stdin-packs`. We pass it all the previously existing packs into its
stdin, so that it will pack all the objects in the previously existing
packs. But we also pass into its stdin, the pack created by the previous
`git pack-objects --filter=<filter-spec>` command as well as the kept
packs, all prefixed with '^', so that the objects in these packs will be
omitted from the resulting pack. The result is that only the objects
filtered out by the first `git pack-objects` command are in the pack
resulting from the second `git pack-objects` command.

As the interactions with kept packs are a bit tricky, a few related
tests are added.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:30 -07:00
0e4747ec8b pack-bitmap-write: rebuild using new bitmap when remapping
`git repack` is about to learn a new `--filter=<filter-spec>` option and
we will want to check that this option is incompatible with
`--write-bitmap-index`.

Unfortunately it appears that a test like:

test_expect_success '--filter fails with --write-bitmap-index' '
       test_must_fail \
               env GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
               git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none
'

sometimes fail because when rebuilding bitmaps, it appears that we are
reusing existing bitmap information. So instead of detecting that some
objects are missing and erroring out as it should, the
`git repack --write-bitmap-index --filter=...` command succeeds.

Let's fix that by making sure we rebuild bitmaps using new bitmaps
instead of existing ones.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:30 -07:00
be315e9a3f repack: refactor finding pack prefix
Create a new find_pack_prefix() to refactor code that handles finding
the pack prefix from the packtmp and packdir global variables, as we are
going to need this feature again in following commit.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:30 -07:00
ff8504e4ec repack: refactor finishing pack-objects command
Create a new finish_pack_objects_cmd() to refactor duplicated code
that handles reading the packfile names from the output of a
`git pack-objects` command and putting it into a string_list, as well as
calling finish_command().

While at it, beautify a code comment a bit in the new function.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:29 -07:00
66589f89ab t/helper: add 'find-pack' test-tool
In a following commit, we will make it possible to separate objects in
different packfiles depending on a filter.

To make sure that the right objects are in the right packs, let's add a
new test-tool that can display which packfile(s) a given object is in.

Let's also make it possible to check if a given object is in the
expected number of packfiles with a `--check-count <n>` option.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:29 -07:00
6cfcabfb9f pack-objects: allow --filter without --stdout
9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21)
taught `git pack-objects` to use `--filter`, but required the use of
`--stdout` since a partial clone mechanism was not yet in place to
handle missing objects. Since then, changes like 9e27beaa23
(promisor-remote: implement promisor_remote_get_direct(), 2019-06-25)
and others added support to dynamically fetch objects that were missing.

Even without a promisor remote, filtering out objects can also be useful
if we can put the filtered out objects in a separate pack, and in this
case it also makes sense for pack-objects to write the packfile directly
to an actual file rather than on stdout.

Remove the `--stdout` requirement when using `--filter`, so that in a
follow-up commit, repack can pass `--filter` to pack-objects to omit
certain objects from the resulting packfile.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 14:54:29 -07:00
4adceb5a29 diff: fix --merge-base with annotated tags
Checking early for OBJ_COMMIT excludes other objects that can be
resolved to commits, like annotated tags.  If we remove it, annotated
tags will be resolved and handled just fine by
lookup_commit_reference(), and if we are given something that can't be
resolved to a commit, we'll still get a useful error message, e.g.:

> error: object 21ab162211ac3ef13c37603ca88b27e9c7e0d40b is a tree, not a commit
> fatal: no merge base found

Signed-off-by: Alyssa Ross <hi@alyssa.is>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 11:55:42 -07:00
d0e8084c65 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02 11:20:00 -07:00
4a0bcc832a Merge branch 'js/doc-status-with-submodules-mark-up-fix'
Docfix.

* js/doc-status-with-submodules-mark-up-fix:
  Documentation/git-status: add missing line breaks
2023-10-02 11:20:00 -07:00
5bb67fb7ab Merge branch 'jc/unresolve-removal'
"checkout --merge -- path" and "update-index --unresolve path" did
not resurrect conflicted state that was resolved to remove path,
but now they do.

* jc/unresolve-removal:
  checkout: allow "checkout -m path" to unmerge removed paths
  checkout/restore: add basic tests for --merge
  checkout/restore: refuse unmerging paths unless checking out of the index
  update-index: remove stale fallback code for "--unresolve"
  update-index: use unmerge_index_entry() to support removal
  resolve-undo: allow resurrecting conflicted state that resolved to deletion
  update-index: do not read HEAD and MERGE_HEAD unconditionally
2023-10-02 11:20:00 -07:00
4ca7a3fd26 diff --stat: set the width defaults in a helper function
Extract the commonly used initialization of the --stat-width=<width>,
--stat-name-width=<width> and --stat-graph-with=<width> parameters to their
internal default values into a helper function, to avoid repeating the same
initialization code in a few places.

Add a couple of tests to additionally cover existing configuration options
diff.statNameWidth=<width> and diff.statGraphWidth=<width> when used by
git-merge to generate --stat outputs.  This closes the gap that existed
previously in the --stat tests, and reduces the chances for having any
regressions introduced by this commit.

While there, perform a small bunch of minor wording tweaks in the improved
unit test, to improve its test-level consistency a bit.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-29 15:46:06 -07:00
b1bda75173 parse: separate out parsing functions from config.h
The files config.{h,c} contain functions that have to do with parsing,
but not config.

In order to further reduce all-in-one headers, separate out functions in
config.c that do not operate on config into its own file, parse.h,
and update the include directives in the .c files that need only such
functions accordingly.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-29 15:14:57 -07:00
e16be13cfa config: correct bad boolean env value error message
An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-29 15:14:56 -07:00
afd2a1d5f1 wrapper: reduce scope of remove_or_warn()
remove_or_warn() is only used by entry.c and apply.c, but it is
currently declared and defined in wrapper.{h,c}, so it has a scope much
greater than it needs. This needlessly large scope also causes wrapper.c
to need to include object.h, when this file is largely unconcerned with
Git objects.

Move remove_or_warn() to entry.{h,c}. The file apply.c still has access
to it, since it already includes entry.h for another reason.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-29 15:14:56 -07:00
d88e8106e8 hex-ll: separate out non-hash-algo functions
In order to further reduce all-in-one headers, separate out functions in
hex.h that do not operate on object hashes into its own file, hex-ll.h,
and update the include directives in the .c files that need only such
functions accordingly.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-29 15:14:56 -07:00
493f462273 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-29 09:04:16 -07:00
a03cc4ba1d Merge branch 'ob/am-msgfix'
The parameters to generate an error message have been corrected.

* ob/am-msgfix:
  am: fix error message in parse_opt_show_current_patch()
2023-09-29 09:04:16 -07:00
a4eebfadf2 Merge branch 'jk/test-pass-ubsan-options-to-http-test'
UBSAN options were not propagated through the test framework to git
run via the httpd, unlike ASAN options, which has been corrected.

* jk/test-pass-ubsan-options-to-http-test:
  test-lib: set UBSAN_OPTIONS to match ASan
2023-09-29 09:04:16 -07:00
d15f92e379 Merge branch 'jc/alias-completion'
The command line completion script (in contrib/) can be told to
complete aliases by including ": git <cmd> ;" in the alias to tell
it that the alias should be completed similar to how "git <cmd>" is
completed.  The parsing code for the alias as been loosened to
allow ';' without an extra space before it.

* jc/alias-completion:
  completion: loosen and document the requirement around completing alias
2023-09-29 09:04:15 -07:00
e076f3a23f Merge branch 'hy/doc-show-is-like-log-not-diff-tree'
Doc update.

* hy/doc-show-is-like-log-not-diff-tree:
  show doc: redirect user to git log manual instead of git diff-tree
2023-09-29 09:04:15 -07:00
5cd3f68add Merge branch 'kh/range-diff-notes'
"git range-diff --notes=foo" compared "log --notes=foo --notes" of
the two ranges, instead of using just the specified notes tree.

* kh/range-diff-notes:
  range-diff: treat notes like `log`
2023-09-29 09:04:15 -07:00
0b493d2986 Merge branch 'ds/stat-name-width-configuration'
"git diff" learned diff.statNameWidth configuration variable, to
give the default width for the name part in the "--stat" output.

* ds/stat-name-width-configuration:
  diff --stat: add config option to limit filename width
2023-09-29 09:04:15 -07:00
2affeb3cb5 Merge branch 'jk/fsmonitor-unused-parameter'
Unused parameters in fsmonitor related code paths have been marked
as such.

* jk/fsmonitor-unused-parameter:
  run-command: mark unused parameters in start_bg_wait callbacks
  fsmonitor: mark unused hashmap callback parameters
  fsmonitor/darwin: mark unused parameters in system callback
  fsmonitor: mark unused parameters in stub functions
  fsmonitor/win32: mark unused parameter in fsm_os__incompatible()
  fsmonitor: mark some maybe-unused parameters
  fsmonitor/win32: drop unused parameters
  fsmonitor: prefer repo_git_path() to git_pathdup()
2023-09-29 09:04:14 -07:00
2b04c3ce59 Merge branch 'ml/git-gui-exec-path-fix'
Fix recent regression in Git-GUI that fails to run hook scripts at
all.

* ml/git-gui-exec-path-fix:
  git-gui - use git-hook, honor core.hooksPath
  git-gui - re-enable use of hook scripts
2023-09-29 09:04:14 -07:00
c2c349a15c doc: correct the 50 characters soft limit
The soft limit of the first line of the commit message should be
"no more than 50 characters" or "50 characters or less", but not
"less than 50 character".

Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28 10:49:23 -07:00
5f259197ee commit-graph: report incomplete chains during verification
The load_commit_graph_chain_fd_st() function will stop loading chains
when it sees an error. But if it has loaded any graph slice at all, it
will return it. This is a good thing for normal use (we use what data we
can, and this is just an optimization). But it's a bad thing for
"commit-graph verify", which should be careful about finding any
irregularities. We do complain to stderr with a warning(), but the
verify command still exits with a successful return code.

The new tests here cover corruption of both the base and tip slices of
the chain. The corruption of the base file already works (it is the
first file we look at, so when we see the error we return NULL). The
"tip" case is what is fixed by this patch (it complains to stderr but
still returns the base slice).

Likewise the existing tests for corruption of the commit-graph-chain
file itself need to be updated. We already exited non-zero correctly for
the "base" case, but the "tip" case can now do so, too.

Note that this also causes us to adjust a test later in the file that
similarly corrupts a tip (though confusingly the test script calls this
"base"). It checks stderr but erroneously expects the whole "verify"
command to exit with a successful code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28 07:00:43 -07:00
7754a565e2 commit-graph: tighten chain size check
When we open a commit-graph-chain file, if it's smaller than a single
entry, we just quietly treat that as ENOENT. That make some sense if the
file is truly zero bytes, but it means that "commit-graph verify" will
quietly ignore a file that contains garbage if that garbage happens to
be short.

Instead, let's only simulate ENOENT when the file is truly empty, and
otherwise return EINVAL. The normal graph-loading routines don't care,
but "commit-graph verify" will notice and complain about the difference.

It's not entirely clear to me that the 0-is-ENOENT case actually happens
in real life, so we could perhaps just eliminate this special-case
altogether. But this is how we've always behaved, so I'm preserving it
in the name of backwards compatibility (though again, it really only
matters for "verify", as the regular routines are happy to load what
they can).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28 07:00:43 -07:00
47d06bb010 commit-graph: detect read errors when verifying graph chain
Because it's OK to not have a graph file at all, the graph_verify()
function needs to tell the difference between a missing file and a real
error.  So when loading a traditional graph file, we call
open_commit_graph() separately from load_commit_graph_chain_fd_st(), and
don't complain if the first one fails with ENOENT.

When the function learned about chain files in 3da4b609bb (commit-graph:
verify chains with --shallow mode, 2019-06-18), we couldn't be as
careful, since the only way to load a chain was with
read_commit_graph_one(), which did both the open/load as a single unit.
So we'll miss errors in chain files we load, thinking instead that there
was just no chain file at all.

Note that we do still report some of these problems to stderr, as the
loading function calls error() and warning(). But we'd exit with a
successful exit code, which is wrong.

We can fix that by using the recently split open/load functions for
chains. That lets us treat the chain file just like a single file with
respect to error handling here.

An existing test (from 3da4b609bb) shows off the problem; we were
expecting "commit-graph verify" to report success, but that makes no
sense. We did not even verify the contents of the graph data, because we
couldn't load it! I don't think this was an intentional exception, but
rather just the test covering what happened to occur.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28 07:00:43 -07:00
2d45710c5d t5324: harmonize sha1/sha256 graph chain corruption
In t5324.20, we corrupt a hex character 60 bytes into the graph chain
file. Since the file consists of two hash identifiers, one per line, the
corruption differs between sha1 and sha256. In a sha1 repository, the
corruption is on the second line, and in a sha256 repository, it is on
the first.

We should of course detect the problem with either line. But as the next
few patches will show (and fix), that is not the case (in fact, we
currently do not exit non-zero for either line!). And while at the end
of our series we'll catch all errors, our intermediate states will have
differing behavior between the two hashes.

Let's make sure we test corruption of both the first and second lines,
and do so consistently with either hash by choosing offsets which are
always in the first hash (30 bytes) or in the second (70).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28 07:00:43 -07:00
8298b54317 commit-graph: check mixed generation validation when loading chain file
In read_commit_graph_one(), we call validate_mixed_generation_chain()
after loading the graph. Even though we don't check the return value,
this has the side effect of clearing the read_generation_data flag,
which is important when working with mixed generation numbers.

But doing this in load_commit_graph_chain_fd_st() makes more sense:

  1. We are calling it even when we did not load a chain at all, which
     is pointless (you cannot have mixed generations in a single file).

  2. For now, all callers load the graph via read_commit_graph_one().
     But the point of factoring out the open/load in the previous commit
     was to let "commit-graph verify" call them separately. So it needs
     to trigger this function as part of the load.

     Without this patch, the mixed-generation tests in t5324 would start
     failing on "git commit-graph verify" calls, once we switch to using
     a separate open/load call there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28 07:00:43 -07:00
7ed76b4eb2 commit-graph: factor out chain opening function
The load_commit_graph_chain() function opens the chain file and all of
the slices of graph that it points to. If there is no chain file (which
is a totally normal condition), we return NULL. But if we run into
errors with the chain file or loading the actual graph data, we also
return NULL, and the caller cannot tell the difference.

The caller can check for ENOENT for the unremarkable "no such file"
case. But I'm hesitant to assume that the rest of the function would
never accidentally set errno to ENOENT itself, since it is opening the
slice files (and that would mean the caller fails to notice a real
error).

So let's break this into two functions: one to open the file, and one to
actually load it. This matches the interface we provide for the
non-chain graph file, which will also come in handy in a moment when we
fix some bugs in the "git commit-graph verify" code.

Some notes:

  - I've kept the "1 is good, 0 is bad" return convention (and the weird
    "fd" out-parameter) used by the matching open_commit_graph()
    function and other parts of the commit-graph code. This is unlike
    most of the rest of Git (which would just return the fd, with -1 for
    error), but it makes sense to stay consistent with the adjacent bits
    of the API here.

  - The existing chain loading function will quietly return if the file
    is too small to hold a single entry. I've retained that behavior
    (and explicitly set ENOENT in the opener function) for now, under
    the notion that it's probably valid (though I'd imagine unusual) to
    have an empty chain file.

There are two small behavior changes here, but I think both are strictly
positive:

  1. The original blindly did a stat() before checking if fopen()
     succeeded, meaning we were making a pointless extra stat call.

  2. We now use fstat() to check the file size. The previous code using
     a regular stat() on the pathname meant we could technically race
     with somebody updating the chain file, and end up with a size that
     does not match what we just opened with fopen(). I doubt anybody
     ever hit this in practice, but it may have caused an out-of-bounds
     read.

We'll retain the load_commit_graph_chain() function which does both the
open and reading steps (most existing callers do not care about seeing
errors anyway, since loading commit-graphs is optimistic).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28 07:00:43 -07:00
9eb5419799 bulk-checkin: only support blobs in index_bulk_checkin
As the code is written today index_bulk_checkin only accepts blobs.
Remove the enum object_type parameter and rename index_bulk_checkin to
index_blob_bulk_checkin, index_stream to index_blob_stream,
deflate_to_pack to deflate_blob_to_pack, stream_to_pack to
stream_blob_to_pack, to make this explicit.

Not supporting commits, tags, or trees has no downside as it is not
currently supported now, and commits, tags, and trees being smaller by
design do not have the problem that the problem that index_bulk_checkin
was built to solve.

Before we start adding code to support the hash function transition
supporting additional objects types in index_bulk_checkin has no real
additional cost, just an extra function parameter to know what the
object type is.  Once we begin the hash function transition this is not
the case.

The hash function transition document specifies that a repository with
compatObjectFormat enabled will compute and store both the SHA-1 and
SHA-256 hash of every object in the repository.

What makes this a challenge is that it is not just an additional hash
over the same object.  Instead the hash function transition document
specifies that the compatibility hash (specified with
compatObjectFormat) be computed over the equivalent object that another
git repository whose storage hash (specified with objectFormat) would
store.  When comparing equivalent repositories built with different
storage hash functions, the oids embedded in objects used to refer to
other objects differ and the location of signatures within objects
differ.

As blob objects have neither oids referring to other objects nor stored
signatures their storage hash and their compatibility hash are computed
over the same object.

The other kinds of objects: trees, commits, and tags, all store oids
referring to other objects.  Signatures are stored in commit and tag
objects.  As oids and the tags to store signatures are not the same size
in repositories built with different storage hashes the size of the
equivalent objects are also different.

A version of index_bulk_checkin that supports more than just blobs when
computing both the SHA-1 and the SHA-256 of every object added would
need a different, and more expensive structure.  The structure is more
expensive because it would be required to temporarily buffering the
equivalent object the compatibility hash needs to be computed over.

A temporary object is needed, because before a hash over an object can
computed it's object header needs to be computed.  One of the members of
the object header is the entire size of the object.  To know the size of
an equivalent object an entire pass over the original object needs to be
made, as trees, commits, and tags are composed of a variable number of
variable sized pieces.  Unfortunately there is no formula to compute the
size of an equivalent object from just the size of the original object.

Avoid all of those future complications by limiting index_bulk_checkin
to only work on blobs.

Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-26 10:17:56 -07:00
872976c37e unicode: update the width tables to Unicode 15.1
Unicode 15.1 has been announced on 2023-09-12 [0], so update the
character width tables to the new version.

[0] http://blog.unicode.org/2023/09/announcing-unicode-standard-version-151.html

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 16:17:28 -07:00
a3d2e83a17 ref-filter: add mailmap support
Add mailmap support to ref-filter formats which are similar in
pretty. This support is such that the following pretty placeholders are
equivalent to the new ref-filter atoms:

	%aN = authorname:mailmap
	%cN = committername:mailmap

	%aE = authoremail:mailmap
	%aL = authoremail:mailmap,localpart
	%cE = committeremail:mailmap
	%cL = committeremail:mailmap,localpart

Additionally, mailmap can also be used with ":trim" option for email by
doing something like "authoremail:mailmap,trim".

The above also applies for the "tagger" atom, that is,
"taggername:mailmap", "taggeremail:mailmap", "taggeremail:mailmap,trim"
and "taggername:mailmap,localpart".

The functionality of ":trim" and ":localpart" remains the same. That is,
":trim" gives the email, but without the angle brackets and ":localpart"
gives the part of the email before the '@' character (if such a
character is not found then we directly grab everything between the
angle brackets).

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 14:52:34 -07:00
0144f0de77 t/t6300: introduce test_bad_atom
Introduce a new function "test_bad_atom", which is similar to
"test_atom()" but should be used to check whether the correct error
message is shown on stderr.

Like "test_atom", the new function takes three arguments. The three
arguments specify the ref, the format and the expected error message
respectively, with an optional fourth argument for tweaking
"test_expect_*" (which is by default "success").

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 14:52:33 -07:00
04830eb762 t/t6300: cleanup test_atom
Previously, when the executable part of "test_expect_{success,failure}"
(inside "test_atom") got "eval"ed, it would have been syntactically
incorrect if the second argument ($2, which is the format) to "test_atom"
were enclosed in single quotes because the $variables would get
interpolated even before the arguments to "test_expect_{success,failure}"
are formed.

So fix this and also some style issues along the way.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 14:52:33 -07:00
6a4c9e7b32 merge-tree: add -X strategy option
Add merge strategy option to produce more customizable merge result such
as automatically resolving conflicts.

Signed-off-by: Tang Yuyi <winglovet@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 14:37:42 -07:00
c13d2adf8b coverity: allow running on macOS
For completeness' sake, let's add support for submitting macOS builds to
Coverity Scan.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 10:12:49 -07:00
d3c3ffa624 coverity: support building on Windows
By adding the repository variable `ENABLE_COVERITY_SCAN_ON_OS` with a
value, say, `["windows-latest"]`, this GitHub workflow now runs on
Windows, allowing to analyze Windows-specific issues.

This allows, say, the Git for Windows fork to submit Windows builds to
Coverity Scan instead of Linux builds.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 10:12:49 -07:00
7bc49e8f55 coverity: allow overriding the Coverity project
By default, the builds are submitted to the `git` project at
https://scan.coverity.com/projects/git.

The Git for Windows project would like to use this workflow, too,
though, and needs the builds to be submitted to the `git-for-windows`
Coverity project.

To that end, allow configuring the Coverity project name via the
repository variable, you guessed it, `COVERITY_PROJECT`. The default if
that variable is not configured or has an empty value is still `git`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 10:12:49 -07:00
002e5e9ad1 coverity: cache the Coverity Build Tool
It would add a 1GB+ download for every run, better cache it.

This is inspired by the GitHub Action `vapier/coverity-scan-action`,
however, it uses the finer-grained `restore`/`save` method to be able to
cache the Coverity Build Tool even if an unrelated step in the GitHub
workflow fails later on.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 10:12:48 -07:00
a56b6230d0 ci: add a GitHub workflow to submit Coverity scans
Coverity is a static analysis tool that detects and generates reports on
various security and code quality issues.

It is particularly useful when diagnosing memory safety issues which may
be used as part of exploiting a security vulnerability.

Coverity's website provides a service that accepts "builds" (which
contains the object files generated during a standard build as well as a
database generated by Coverity's scan tool).

Let's add a GitHub workflow to automate all of this. To avoid running it
without appropriate Coverity configuration (e.g. the token required to
use Coverity's services), the job only runs when the repository variable
"ENABLE_COVERITY_SCAN_FOR_BRANCHES" has been configured accordingly (see
https://docs.github.com/en/actions/learn-github-actions/variables for
details how to configure repository variables): It is expected to be a
valid JSON array of branch strings, e.g. `["main", "next"]`.

In addition, this workflow requires two repository secrets:

- COVERITY_SCAN_EMAIL: the email to send the report to, and

- COVERITY_SCAN_TOKEN: the Coverity token (look in the Project Settings
  tab of your Coverity project).

Note: The initial version of this patch used
`vapier/coverity-scan-action` to benefit from that Action's caching of
the Coverity tool, which is rather large. Sadly, that Action only
supports Linux, and we want to have the option of building on Windows,
too. Besides, in the meantime Coverity requires `cov-configure` to be
runantime, and that Action was not adjusted accordingly, i.e. it seems
not to be maintained actively. Therefore it would seem prudent to
implement the steps manually instead of using that Action.

Initial-patch-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 10:12:48 -07:00
f97c8b1e00 revision: make pseudo-opt flags read via stdin behave consistently
When reading revisions from stdin via git-rev-list(1)'s `--stdin` option
then these revisions never honor flags like `--not` which have been
passed on the command line. Thus, an invocation like e.g. `git rev-list
--all --not --stdin` will not treat all revisions read from stdin as
uninteresting. While this behaviour may be surprising to a user, it's
been this way ever since it has been introduced via 42cabc341c (Teach
rev-list an option to read revs from the standard input., 2006-09-05).

With that said, in c40f0b7877 (revision: handle pseudo-opts in `--stdin`
mode, 2023-06-15) we have introduced a new mode to read pseudo opts from
standard input where this behaviour is a lot more confusing. If you pass
`--not` via stdin, it will:

    - Influence subsequent revisions or pseudo-options passed on the
      command line.

    - Influence pseudo-options passed via standard input.

    - _Not_ influence normal revisions passed via standard input.

This behaviour is extremely inconsistent and bound to cause confusion.

While it would be nice to retroactively change the behaviour for how
`--not` and `--stdin` behave together, chances are quite high that this
would break existing scripts that expect the current behaviour that has
been around for many years by now. This is thus not really a viable
option to explore to fix the inconsistency.

Instead, we change the behaviour of how pseudo-opts read via standard
input influence the flags such that the effect is fully localized. With
this change, when reading `--not` via standard input, it will:

    - _Not_ influence subsequent revisions or pseudo-options passed on
      the command line, which is a change in behaviour.

    - Influence pseudo-options passed via standard input.

    - Influence normal revisions passed via standard input, which is a
      change in behaviour.

Thus, all flags read via standard input are fully self-contained to that
standard input, only.

While this is a breaking change as well, the behaviour has only been
recently introduced with Git v2.42.0. Furthermore, the current behaviour
can be regarded as a simple bug. With that in mind it feels like the
right thing to retroactively change it and make the behaviour sane.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25 09:59:04 -07:00
bcb6cae296 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-22 17:01:37 -07:00
fa7a594dac Merge branch 'tb/send-email-extract-valid-address-error-message-fix'
An error message given by "git send-email" when given a malformed
address did not give correct information, which has been corrected.

* tb/send-email-extract-valid-address-error-message-fix:
  git-send-email.perl: avoid printing undef when validating addresses
2023-09-22 17:01:37 -07:00
8ed1eee410 Merge branch 'ch/clean-docfix'
Typofix.

* ch/clean-docfix:
  git-clean doc: fix "without do cleaning" typo
2023-09-22 17:01:37 -07:00
1b46285770 Merge branch 'eg/config-type-path-docfix'
Typofix.

* eg/config-type-path-docfix:
  git-config: fix misworded --type=path explanation
2023-09-22 17:01:37 -07:00
7a90d1eb4d Merge branch 'jk/redact-h2h3-headers-fix'
HTTP Header redaction code has been adjusted for a newer version of
cURL library that shows its traces differently from earlier
versions.

* jk/redact-h2h3-headers-fix:
  http: update curl http/2 info matching for curl 8.3.0
  http: factor out matching of curl http/2 trace lines
2023-09-22 17:01:36 -07:00
fb6e6e06d5 Merge branch 'jk/ort-unused-parameter-cleanups'
Code clean-up.

* jk/ort-unused-parameter-cleanups:
  merge-ort: lowercase a few error messages
  merge-ort: drop unused "opt" parameter from merge_check_renames_reusable()
  merge-ort: drop unused parameters from detect_and_process_renames()
  merge-ort: stop passing "opt" to read_oid_strbuf()
  merge-ort: drop custom err() function
2023-09-22 17:01:36 -07:00
5c0f9933ec Merge branch 'tb/repack-existing-packs-cleanup'
The code to keep track of existing packs in the repository while
repacking has been refactored.

* tb/repack-existing-packs-cleanup:
  builtin/repack.c: extract common cruft pack loop
  builtin/repack.c: avoid directly inspecting "util"
  builtin/repack.c: store existing cruft packs separately
  builtin/repack.c: extract `has_existing_non_kept_packs()`
  builtin/repack.c: extract redundant pack cleanup for existing packs
  builtin/repack.c: extract redundant pack cleanup for --geometric
  builtin/repack.c: extract marking packs for deletion
  builtin/repack.c: extract structure to store existing packs
2023-09-22 17:01:36 -07:00
6a8bb340f2 Merge branch 'la/trailer-cleanups'
Code clean-up.

Keep only the first three clean-ups, and discard the rest to be replaced later.
cf. <owly1qetjqo1.fsf@fine.c.googlers.com>
cf. <owlyzg1dsswr.fsf@fine.c.googlers.com>

* la/trailer-cleanups:
  trailer: split process_command_line_args into separate functions
  trailer: split process_input_file into separate pieces
  trailer: separate public from internal portion of trailer_iterator
2023-09-22 17:01:36 -07:00
38a15f4755 Documentation/git-status: add missing line breaks
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-22 15:27:51 -07:00
252d693797 test-lib: set UBSAN_OPTIONS to match ASan
For a long time we have used ASAN_OPTIONS to set abort_on_error. This is
important because we want to notice detected problems even in programs
which are expected to fail. But we never did the same for UBSAN_OPTIONS.
This means that our UBSan test suite runs might silently miss some
cases.

It also causes a more visible effect, which is that t4058 complains
about unexpected "fixes" (and this is how I noticed the issue):

  $ make SANITIZE=undefined CC=gcc && (cd t && ./t4058-*)
  ...
  ok 8 - git read-tree does not segfault # TODO known breakage vanished
  ok 9 - reset --hard does not segfault # TODO known breakage vanished
  ok 10 - git diff HEAD does not segfault # TODO known breakage vanished

The tests themselves aren't that interesting. We have a known bug where
these programs segfault, and they do when compiled without sanitizers.
With UBSan, when the test runs:

  test_might_fail git read-tree --reset base

it gets:

  cache-tree.c:935:9: runtime error: member access within misaligned address 0x5a5a5a5a5a5a5a5a for type 'struct cache_entry', which requires 8 byte alignment

So that's garbage memory which would _usually_ cause us to segfault, but
UBSan catches it and complains first about the alignment. That makes
sense, but the weird thing is that UBSan then exits instead of aborting,
so our test_might_fail call considers that an acceptable outcome and the
test "passes".

Curiously, this historically seems to have aborted, because I've run
"make test" with UBSan many times (and so did our CI) and we never saw
the problem. Even more curiously, I see an abort if I use clang with
ASan and UBSan together, like:

  # this aborts!
  make SANITIZE=undefined,address CC=clang

But not with just UBSan, and not with both when used with gcc:

  # none of these do
  make SANITIZE=undefined CC=gcc
  make SANITIZE=undefined CC=clang
  make SANITIZE=undefined,address CC=gcc

Likewise moving to older versions of gcc (I tried gcc-11 and gcc-12 on
my Debian system) doesn't abort. Nor does moving around in Git's
history. Neither this test nor the relevant code have been touched in a
while, and going back to v2.41.0 produces the same outcome (even though
many UBSan CI runs have passed in the meantime).

So _something_ changed on my system (and likely will soon on other
people's, since this is stock Debian unstable), but I didn't track
it further. I don't know why it ever aborted in the past, but we
definitely should be explicit here and tell UBSan what we want to
happen.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-21 14:10:36 -07:00
43abaaf008 am: fix error message in parse_opt_show_current_patch()
The argument order was incorrect. This was introduced by 246cac8505
(i18n: turn even more messages into "cannot be used together" ones,
2022-01-05).

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-21 12:09:33 -07:00
8d73a2cc03 completion: loosen and document the requirement around completing alias
Recently we started to tell users to spell ": git foo ;" with
space(s) around 'foo' for an alias to be completed similarly
to the 'git foo' command.  It however is easy to also allow users to
spell it in a more natural way with the semicolon attached to 'foo',
i.e. ": git foo;".  Also, add a comment to note that 'git' is optional
and writing ": foo;" would complete the alias just fine.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-20 11:41:41 -07:00
6bdb5b11d6 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-20 10:45:58 -07:00
3c2af826a3 Merge branch 'jc/update-index-show-index-version'
"git update-index" learns "--show-index-version" to inspect
the index format version used by the on-disk index file.

* jc/update-index-show-index-version:
  test-tool: retire "index-version"
  update-index: add --show-index-version
  update-index doc: v4 is OK with JGit and libgit2
2023-09-20 10:45:16 -07:00
767e4d68c7 Merge branch 'ob/t3404-typofix'
Code clean-up.

* ob/t3404-typofix:
  t3404-rebase-interactive.sh: fix typos in title of a rewording test
2023-09-20 10:44:58 -07:00
0e72b42a52 Merge branch 'ob/sequencer-remove-dead-code'
Code clean-up.

* ob/sequencer-remove-dead-code:
  sequencer: remove unreachable exit condition in pick_commits()
2023-09-20 10:44:58 -07:00
8c71f082eb Merge branch 'pb/completion-aliases-doc'
Clarify how "alias.foo = : git cmd ; aliased-command-string" should
be spelled with necessary whitespaces around punctuation marks to
work.

* pb/completion-aliases-doc:
  completion: improve doc for complex aliases
2023-09-20 10:44:58 -07:00
e9dac4b86c Merge branch 'pb/complete-commit-trailers'
The command-line complation support (in contrib/) learned to
complete "git commit --trailer=" for possible trailer keys.

* pb/complete-commit-trailers:
  completion: commit: complete trailers tokens more robustly
  completion: commit: complete configured trailer tokens
2023-09-20 10:44:57 -07:00
671eaaac0c Merge branch 'js/diff-cached-fsmonitor-fix'
"git diff --cached" codepath did not fill the necessary stat
information for a file when fsmonitor knows it is clean and ended
up behaving as if it is not clean, which has been corrected.

* js/diff-cached-fsmonitor-fix:
  diff-lib: fix check_removed when fsmonitor is on
2023-09-20 10:44:57 -07:00
bd49a2998a Merge branch 'js/systemd-timers-wsl-fix'
Update "git maintainance" timers' implementation based on systemd
timers to work with WSL.

* js/systemd-timers-wsl-fix:
  maintenance(systemd): support the Windows Subsystem for Linux
2023-09-20 10:44:57 -07:00
7435d51bfd Merge branch 'pw/diff-no-index-from-named-pipes'
"git diff --no-index -R <(one) <(two)" did not work correctly,
which has been corrected.

* pw/diff-no-index-from-named-pipes:
  diff --no-index: fix -R with stdin
2023-09-20 10:44:57 -07:00
4fbe83fcd9 show doc: redirect user to git log manual instead of git diff-tree
While git show accepts options that apply to the git diff-tree command,
some options do not make sense in the context of git show.
The options of git show are handled using the machinery of git log.
The git log manual page is a better place to look into than git diff-tree
for options that are not in the git show manual page.

Signed-off-by: Han Young <hanyang.tony@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-20 08:52:59 -07:00
2e0d30d928 range-diff: treat notes like log
Currently, `range-diff` shows the default notes if no notes-related
arguments are given. This is also how `log` behaves. But unlike
`range-diff`, `log` does *not* show the default notes if
`--notes=<custom>` are given. In other words, this:

    git log --notes=custom

is equivalent to this:

    git log --no-notes --notes=custom

While:

    git range-diff --notes=custom

acts like this:

    git log --notes --notes-custom

This can’t be how the user expects `range-diff` to behave given that the
man page for `range-diff` under `--[no-]notes[=<ref>]` says:

> This flag is passed to the `git log` program (see git-log(1)) that
> generates the patches.

This behavior also affects `format-patch` since it uses `range-diff` for
the cover letter. Unlike `log`, though, `format-patch` is not supposed
to show the default notes if no notes-related arguments are given.[1]
But this promise is broken when the range-diff happens to have something
to say about the changes to the default notes, since that will be shown
in the cover letter.

Remedy this by introducing `--show-notes-by-default` that `range-diff` can
use to tell the `log` subprocess what to do.

§ Authors

• Fix by Johannes
• Tests by Kristoffer

† 1: See e.g. 66b2ed09c2 (Fix "log" family not to be too agressive about
    showing notes, 2010-01-20).

Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-19 14:40:19 -07:00
72da9832c2 run-command: mark unused parameters in start_bg_wait callbacks
The start_bg_command() function takes a callback to tell when the
background-ed process is "ready". The callback receives the
child_process struct as well as an extra void pointer. But curiously,
neither of the two users of this interface look at either parameter!

This makes some sense. The only non-test user of the API is fsmonitor,
which uses fsmonitor_ipc__get_state() to connect to a single global
fsmonitor daemon (i.e., the one we just started!).

So we could just drop these parameters entirely. But it seems like a
pretty reasonable interface for the "wait" callback to have access to
the details of the spawned process, and to have room for passing extra
data through a void pointer. So let's leave these in place but mark the
unused ones so that -Wunused-parameter does not complain.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:15 -07:00
1fe41944b2 fsmonitor: mark unused hashmap callback parameters
Like many hashmap comparison functions, our cookies_cmp() does not look
at its extra void data parameter. This should have been annotated in
02c3c59e62 (hashmap: mark unused callback parameters, 2022-08-19), but
this new case was added around the same time (plus fsmonitor is not
built at all on Linux, so it is easy to miss there).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:15 -07:00
997eb910a6 fsmonitor/darwin: mark unused parameters in system callback
We pass fsevent_callback() to the system FSEventStreamCreate() function
as a callback. So we must match the expected function signature, even
though we don't care about all of the parameters. Mark the unused ones
to satisfy -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:15 -07:00
4cb5e0b3b9 fsmonitor: mark unused parameters in stub functions
The fsmonitor code has some platform-specific functions for which one or
more platforms implement noop or stub functions. We can't get rid of
these functions nor change their interface, since the point is to match
their equivalents in other platforms. But let's annotate their
parameters to quiet the compiler's -Wunused-parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:15 -07:00
caf433bbdf fsmonitor/win32: mark unused parameter in fsm_os__incompatible()
We never look at the "ipc" argument we receive. It was added in
8f44976882 (fsmonitor: avoid socket location check if using hook,
2022-10-04) to support the darwin fsmonitor code. The win32 code has to
match the same interface, but we should use an annotation to silence
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:15 -07:00
f4c5778b2d fsmonitor: mark some maybe-unused parameters
There's a bit of conditionally-compiled code in fsmonitor, so some
function parameters may be unused depending on the build options:

  - in fsmonitor--daemon.c's try_to_run_foreground_daemon(), we take a
    detach_console argument, but it's only used on Windows. This seems
    intentional (and not mistakenly missing other platforms) based on
    the discussion in c284e27ba7 (fsmonitor--daemon: implement 'start'
    command, 2022-03-25), which introduced it.

  - in fsmonitor-setting.c's check_for_incompatible(), we pass the "ipc"
    flag down to the system-specific fsm_os__incompatible() helper. But
    we can only do so if our platform has such a helper.

In both cases we can mark the argument as MAYBE_UNUSED. That annotates
it enough to suppress the compiler's -Wunused-parameter warning, but
without making it impossible to use the variable, as a regular UNUSED
annotation would.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:14 -07:00
42e862c0b3 fsmonitor/win32: drop unused parameters
A few helper functions (centered around file-watch events) take extra
fsmonitor state parameters that they don't use. These are static helpers
local to the win32 implementation, and don't need to conform to any
particular interface. We can just drop the extra parameters, which
simplifies the code and silences -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:14 -07:00
00df20a7ab fsmonitor: prefer repo_git_path() to git_pathdup()
The fsmonitor_ipc__get_path() function ignores its repository argument.
It should use it when constructing repo paths (though in practice, it is
unlikely anything but the_repository is ever passed, so this is cleanup
and future proofing, not a bug fix).

Note that despite the lack of "dup" in the name, repo_git_path() behaves
like git_pathdup() and returns an allocated string.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 15:56:14 -07:00
d4a83d07b8 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 13:53:22 -07:00
f41c5a5eec Merge branch 'js/complete-checkout-t'
The completion script (in contrib/) has been taught to treat the
"-t" option to "git checkout" and "git switch" just like the
"--track" option, to complete remote-tracking branches.

* js/complete-checkout-t:
  completion(switch/checkout): treat --track and -t the same
2023-09-18 13:53:13 -07:00
921a713d66 Merge branch 'rs/grep-no-no-or'
"git grep -e A --no-or -e B" is accepted, even though the negation
of "or" did not mean anything, which has been tightened.

* rs/grep-no-no-or:
  grep: reject --no-or
2023-09-18 13:53:13 -07:00
12288cc44e git-send-email.perl: avoid printing undef when validating addresses
When validating email addresses with `extract_valid_address_or_die()`,
we print out a helpful error message when the given input does not
contain a valid email address.

However, the pre-image of this patch looks something like:

    my $address = shift;
    $address = extract_valid_address($address):
    die sprintf(__("..."), $address) if !$address;

which fails when given a bogus email address by trying to use $address
(which is undef) in a sprintf() expansion, like so:

    $ git.compile send-email --to="pi <pi@pi>" /tmp/x/*.patch --force
    Use of uninitialized value $address in sprintf at /home/ttaylorr/src/git/git-send-email line 1175.
    error: unable to extract a valid address from:

This regression dates back to e431225569 (git-send-email: remove invalid
addresses earlier, 2012-11-22), but became more noticeable in a8022c5f7b
(send-email: expose header information to git-send-email's
sendemail-validate hook, 2023-04-19), which validates SMTP headers in
the sendemail-validate hook.

Avoid trying to format an undef by storing the given and cleaned address
separately. After applying this fix, the error contains the invalid
email address, and the warning disappears:

    $ git.compile send-email --to="pi <pi@pi>" /tmp/x/*.patch --force
    error: unable to extract a valid address from: pi <pi@pi>

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 12:04:30 -07:00
c22e9efe9c Merge git-gui into ml/git-gui-exec-path-fix
* git-gui:
  git-gui - use git-hook, honor core.hooksPath
  git-gui - re-enable use of hook scripts
2023-09-18 10:52:30 -07:00
0730a5a3a5 git-gui - use git-hook, honor core.hooksPath
git-gui currently runs some hooks directly using its own code written
before 2010, long predating git v2.9 that added the core.hooksPath
configuration to override the assumed location at $GIT_DIR/hooks.  Thus,
git-gui looks for and runs hooks including prepare-commit-msg,
commit-msg, pre-commit, post-commit, and post-checkout from
$GIT_DIR/hooks, regardless of configuration. Commands (e.g., git-merge)
that git-gui invokes directly do honor core.hooksPath, meaning the
overall behaviour is inconsistent.

Furthermore, since v2.36 git exposes its hook execution machinery via
`git-hook run`, eliminating the need for others to maintain code
duplicating that functionality.  Using git-hook will both fix git-gui's
current issues on hook configuration and (presumably) reduce the
maintenance burden going forward. So, teach git-gui to use git-hook.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 10:51:32 -07:00
bd48adc31d diff --stat: add config option to limit filename width
Add new configuration option diff.statNameWidth=<width> that is equivalent
to the command-line option --stat-name-width=<width>, but it is ignored
by format-patch.  This follows the logic established by the already
existing configuration option diff.statGraphWidth=<width>.

Limiting the widths of names and graphs in the --stat output makes sense
for interactive work on wide terminals with many columns, hence the support
for these configuration options.  They don't affect format-patch because
it already adheres to the traditional 80-column standard.

Update the documentation and add more tests to cover new configuration
option diff.statNameWidth=<width>.  While there, perform a few minor code
and whitespace cleanups here and there, as spotted.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18 09:39:07 -07:00
3f71c97e18 git-gui - re-enable use of hook scripts
Earlier, commit aae9560a introduced search in $PATH to find executables
before running them, avoiding an issue where on Windows a same named
file in the current directory can be executed in preference to anything
in a directory in $PATH. This search is intended to find an absolute
path for a bare executable ( e.g, a function "foo") by finding the first
instance of "foo" in a directory given in $PATH, and this search works
correctly.  The search is explicitly avoided for an executable named
with an absolute path (e.g., /bin/sh), and that works as well.

Unfortunately, the search is also applied to commands named with a
relative path. A hook script (or executable) $HOOK is usually located
relative to the project directory as .git/hooks/$HOOK. The search for
this will generally fail as that relative path will (probably) not exist
on any directory in $PATH. This means that git hooks in general now fail
to run. Considerable mayhem could occur should a directory on $PATH be
git controlled. If such a directory includes .git/hooks/$HOOK, that
repository's $HOOK will be substituted for the one in the current
project, with unknown consequences.

This lookup failure also occurs in worktrees linked to a remote .git
directory using git-new-workdir. However, a worktree using a .git file
pointing to a separate git directory apparently avoids this: in that
case the hook command is resolved to an absolute path before being
passed down to the code introduced in aae9560a.

Fix this by replacing the test for an "absolute" pathname to a check for
a command name having more than one pathname component. This limits the
search and absolute pathname resolution to bare commands. The new test
uses tcl's "file split" command. Experiments on Linux and Windows, using
tclsh, show that command names with relative and absolute paths always
give at least two components, while a bare command gives only one.

	  Linux:   puts [file split {foo}]       ==>  foo
	  Linux:   puts [file split {/foo}]      ==>  / foo
	  Linux:   puts [file split {.git/foo}]  ==> .git foo
	  Windows: puts [file split {foo}]       ==>  foo
	  Windows: puts [file split {c:\foo}]    ==>  c:/ foo
	  Windows: puts [file split {.git\foo}]  ==> .git foo

The above results show the new test limits search and replacement
to bare commands on both Linux and Windows.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-16 17:46:25 -07:00
24c5a270d1 merge-ort: lowercase a few error messages
As noted in CodingGuidelines, error messages should not be capitalized.
Fix up a few of these that were copied verbatim from merge-recursive to
match our modern style.

We'll likewise fix up the matching ones from merge-recursive. We care a
bit less there, since the hope is that it will eventually go away. But
besides being the right thing to do in the meantime, it is necessary for
t6406 to pass both with and without GIT_TEST_MERGE_ALGORITHM set (one of
our CI jobs sets it to "recursive", which will use the merge-recursive.c
code). An alternative would be to use "grep -i" in the test to check
the message, but it's nice for the test suite to be be more exact (we'd
notice if the capitalization fix regressed).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-16 17:26:53 -07:00
811c9c2102 diff-lib: fix check_removed() when fsmonitor is active
`git diff-index` may return incorrect deleted entries when fsmonitor
is used in a repository with git submodules. This can be observed on
Mac machines, but it can affect all other supported platforms too.

If fsmonitor is used, `stat *st` is left uninitialied if cache_entry
has CE_FSMONITOR_VALID bit set.  But, there are three call sites
that rely on stat afterwards, which can result in incorrect results.

We can fill members of "struct stat" that matters well enough using
the information we have in "struct cache_entry" that fsmonitor told
us is up-to-date to solve this.

Helped-by: Josip Sokcevic <sokcevic@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-15 17:13:14 -07:00
9510fe8940 Merge branch 'jc/fake-lstat' into jc/diff-cached-fsmonitor-fix
* jc/fake-lstat:
  cache: add fake_lstat()
2023-09-15 17:09:32 -07:00
c33fa871a5 cache: add fake_lstat()
At times, we may already know that a path represented by a
cache_entry ce has no changes via some out-of-line means, like
fsmonitor, and yet need the control to go through a codepath that
requires us to have "struct stat" obtained by lstat() on the path,
for various purposes (e.g. "ie_match_stat()" wants cached stat-info
is still current wrt "struct stat", "diff" wants to know st_mode).

The callers of lstat() on a tracked file, when its cache_entry knows
it is up-to-date, can instead call this helper to pretend that it
called lstat() by faking the "struct stat" information.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-15 17:08:46 -07:00
161c35f93b Merge branch 'js/diff-cached-fsmonitor-fix' into jc/diff-cached-fsmonitor-fix
* js/diff-cached-fsmonitor-fix:
  diff-lib: fix check_removed when fsmonitor is on
2023-09-15 17:08:02 -07:00
563f339d98 git-clean doc: fix "without do cleaning" typo
"quit without do cleaning" is not grammatical.

Signed-off-by: Caleb Hill <chill389cc@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-15 16:05:01 -07:00
58be11432e git-config: fix misworded --type=path explanation
When `--type=<type>` was added as a prefered alias for `--<type>` by
fb0dc3bac1 (builtin/config.c: support `--type=<type>` as preferred
alias for `--<type>`), the explanation for the path type was
reworded.  Whereas the previous explanation said "expand a leading
`~`" this was changed to "adding a leading `~`".  Change "adding" to
"expanding" to correctly explain the canonicalization.

Signed-off-by: Evan Gates <evan.gates@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-15 14:09:37 -07:00
0763c3a2c4 http: update curl http/2 info matching for curl 8.3.0
To redact header lines in http/2 curl traces, we have to parse past some
prefix bytes that curl sticks in the info lines it passes to us. That
changed once already, and we adapted in db30130165 (http: handle both
"h2" and "h2h3" in curl info lines, 2023-06-17).

Now it has changed again, in curl's fbacb14c4 (http2: cleanup trace
messages, 2023-08-04), which was released in curl 8.3.0. Running a build
of git linked against that version will fail to redact the trace (and as
before, t5559 notices and complains).

The format here is a little more complicated than the other ones, as it
now includes a "stream id". This is not constant but is always numeric,
so we can easily parse past it.

We'll continue to match the old versions, of course, since we want to
work with many different versions of curl. We can't even select one
format at compile time, because the behavior depends on the runtime
version of curl we use, not the version we build against.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-15 10:54:11 -07:00
39fa527c89 http: factor out matching of curl http/2 trace lines
We have to parse out curl's http/2 trace lines so we can redact their
headers. We already match two different types of lines from various
vintages of curl. In preparation for adding another (which will be
slightly more complex), let's pull the matching into its own function,
rather than doing it in the middle of a conditional.

While we're doing so, let's expand the comment a bit to describe the two
matches. That probably should have been part of db30130165 (http: handle
both "h2" and "h2h3" in curl info lines, 2023-06-17), but will become
even more important as we add new types.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-15 10:54:08 -07:00
6eb0c0eb7a merge-ort: drop unused "opt" parameter from merge_check_renames_reusable()
The merge_options parameter has never been used since the function was
introduced in 64aceb6d73 (merge-ort: add code to check for whether
cached renames can be reused, 2021-05-20). In theory some merge options
might impact our decisions here, but that has never been the case so
far.

Let's drop it to appease -Wunused-parameter; it would be easy to add
back later if we need to (there is only one caller).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-14 12:01:29 -07:00
fce9ffb225 merge-ort: drop unused parameters from detect_and_process_renames()
This function takes three trees representing the merge base and both
sides of the merge, but never looks at any of them. This is due to
f78cf97617 (merge-ort: call diffcore_rename() directly, 2021-02-14).
Prior to that commit, we passed pairs of trees to diff_tree_oid(). But
after that commit, we collect a custom diff_queue for each pair in the
merge_options struct, and just run diffcore_rename() on the result. So
the function does not need to know about the original trees at all
anymore.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-14 12:01:29 -07:00
1c9419ae9d merge-ort: stop passing "opt" to read_oid_strbuf()
This function doesn't look at its merge_options parameter. It used to
pass it down to err(), but that function no longer exists (and didn't
look at "opt" anyway). We can drop it here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-14 12:01:29 -07:00
808e83f266 merge-ort: drop custom err() function
The merge-ort code has an err() function, but it's really just error()
in disguise. It differs in two ways:

  1. It takes a "struct merge_options" argument. But the function
     completely ignores it! We can simply remove it.

  2. It formats the error string into a strbuf, prepending "error: ",
     and then feeds the result into error(). But this is wrong! The
     error() function already adds the prefix, so we end up with:

        error: error: Failed to execute internal merge

So let's just drop this function entirely and call error() directly, as
the functions are otherwise identical (note that they both always return
-1).

Presumably nobody noticed the bogus messages because they are quite hard
to trigger (they are mostly internal errors reading and writing
objects). However, one easy trigger is a custom merge driver which dies
by signal; we have a test already here, but we were not checking the
contents of stderr.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-14 12:01:29 -07:00
bda494f404 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-14 11:17:00 -07:00
18ad82232f Merge branch 'so/diff-doc-for-patch-update'
References from description of the `--patch` option in various
manual pages have been simplified and improved.

* so/diff-doc-for-patch-update:
  doc/diff-options: fix link to generating patch section
2023-09-14 11:17:00 -07:00
b995e78147 Merge branch 'pw/rebase-i-after-failure'
Various fixes to the behaviour of "rebase -i" when the command got
interrupted by conflicting changes.

* pw/rebase-i-after-failure:
  rebase -i: fix adding failed command to the todo list
  rebase --continue: refuse to commit after failed command
  rebase: fix rewritten list for failed pick
  sequencer: factor out part of pick_commits()
  sequencer: use rebase_path_message()
  rebase -i: remove patch file after conflict resolution
  rebase -i: move unlink() calls
2023-09-14 11:17:00 -07:00
f73604fabf Merge branch 'ob/revert-of-revert-is-reapply'
The default log message created by "git revert", when reverting a
commit that records a revert, has been tweaked.

* ob/revert-of-revert-is-reapply:
  git-revert.txt: add discussion
  sequencer: beautify subject of reverts of reverts
2023-09-14 11:16:59 -07:00
86b56ff267 Merge branch 'ak/pretty-decorate-more'
"git log --format" has been taught the %(decorate) placeholder.

* ak/pretty-decorate-more:
  decorate: use commit color for HEAD arrow
  pretty: add pointer and tag options to %(decorate)
  pretty: add %(decorate[:<options>]) format
  decorate: color each token separately
  decorate: avoid some unnecessary color overhead
  decorate: refactor format_decorations()
  pretty-formats: enclose options in angle brackets
  pretty-formats: define "literal formatting code"
2023-09-14 11:16:59 -07:00
174dfe4637 Merge branch 'jk/tree-name-and-depth-limit'
We now limit depth of the tree objects and maximum length of
pathnames recorded in tree objects.

* jk/tree-name-and-depth-limit:
  lower core.maxTreeDepth default to 2048
  tree-diff: respect max_allowed_tree_depth
  list-objects: respect max_allowed_tree_depth
  read_tree(): respect max_allowed_tree_depth
  traverse_trees(): respect max_allowed_tree_depth
  add core.maxTreeDepth config
  fsck: detect very large tree pathnames
  tree-walk: rename "error" variable
  tree-walk: drop MAX_TRAVERSE_TREES macro
  tree-walk: reduce stack size for recursive functions
2023-09-14 11:16:59 -07:00
6a4e7440fb Merge branch 'ks/ref-filter-sort-numerically'
"git for-each-ref --sort='contents:size'" sorts the refs according
to size numerically, giving a ref that points at a blob twelve-byte
(12) long before showing a blob hundred-byte (100) long.

* ks/ref-filter-sort-numerically:
  ref-filter: sort numerically when ":size" is used
2023-09-14 11:16:59 -07:00
d4cab3717f Merge branch 'rs/name-rev-use-opt-hidden-bool'
Simplify use of parse-options API a bit.

* rs/name-rev-use-opt-hidden-bool:
  name-rev: use OPT_HIDDEN_BOOL for --peel-tag
2023-09-14 11:16:58 -07:00
19d5a0b2c1 Merge branch 'rs/grep-parseopt-simplify'
Simplify use of parse-options API a bit.

* rs/grep-parseopt-simplify:
  grep: use OPT_INTEGER_F for --max-depth
2023-09-14 11:16:58 -07:00
c6a0468f82 builtin/repack.c: extract common cruft pack loop
When generating the list of packs to store in a MIDX (when given the
`--write-midx` option), we include any cruft packs both during
--geometric and non-geometric repacks.

But the rules for when we do and don't have to check whether any of
those cruft packs were queued for deletion differ slightly between the
two cases.

But the two can be unified, provided there is a little bit of extra
detail added in the comment to clarify when it is safe to avoid checking
for any pending deletions (and why it is OK to do so even when not
required).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:48 -07:00
4a17e97246 builtin/repack.c: avoid directly inspecting "util"
The `->util` field corresponding to each string_list_item is used to
track the existence of some pack at the beginning of a repack operation
was originally intended to be used as a bitfield.

This bitfield tracked:

  - (1 << 0): whether or not the pack should be deleted
  - (1 << 1): whether or not the pack is cruft

The previous commit removed the use of the second bit, but a future
patch (from a different series than this one) will introduce a new use
of it.

So we could stop treating the util pointer as a bitfield and instead
start treating it as if it were a boolean. But this would require some
backtracking when that later patch is applied.

Instead, let's avoid touching the ->util field directly, and instead
introduce convenience functions like:

  - pack_mark_for_deletion()
  - pack_is_marked_for_deletion()

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:48 -07:00
eabfaf8e8d builtin/repack.c: store existing cruft packs separately
When repacking with the `--write-midx` option, we invoke the function
`midx_included_packs()` in order to produce the list of packs we want to
include in the resulting MIDX.

This list is comprised of:

  - existing .keep packs
  - any pack(s) which were written earlier in the same process
  - any unchanged packs when doing a `--geometric` repack
  - any cruft packs

Prior to this patch, we stored pre-existing cruft and non-cruft packs
together (provided those packs are non-kept). This meant we needed an
additional bit to indicate which non-kept pack(s) were cruft versus
those that aren't.

But alternatively we can store cruft packs in a separate list, avoiding
the need for this extra bit, and simplifying the code below.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:48 -07:00
4bbfb003c0 builtin/repack.c: extract has_existing_non_kept_packs()
When there is:

  - at least one pre-existing packfile (which is not marked as kept),
  - repacking with the `-d` flag, and
  - not doing a cruft repack

, then we pass a handful of additional options to the inner
`pack-objects` process, like `--unpack-unreachable`,
`--keep-unreachable`, and `--pack-loose-unreachable`, in addition to
marking any packs we just wrote for promisor remotes as kept in-core
(with `--keep-pack`, as opposed to the presence of a ".keep" file on
disk).

Because we store both cruft and non-cruft packs together in the same
`existing.non_kept_packs` list, it suffices to check its `nr` member to
see if it is zero or not.

But a following change will store cruft- and non-cruft packs separately,
meaning this check would break as a result. Prepare for this by
extracting this part of the check into a new helper function called
`has_existing_non_kept_packs()`.

This patch does not introduce any functional changes, but prepares us to
make a more isolated change in a subsequent patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:47 -07:00
f2d3bf178a builtin/repack.c: extract redundant pack cleanup for existing packs
To remove redundant packs at the end of a repacking operation, Git uses
its `remove_redundant_pack()` function in a loop over the set of
pre-existing, non-kept packs.

In a later commit, we will split this list into two, one for
pre-existing cruft pack(s), and another for non-cruft pack(s). Prepare
for this by factoring out the routine to loop over and delete redundant
packs into its own function.

Instead of calling `remove_redundant_pack()` directly, we now will call
`remove_redundant_existing_packs()`, which itself dispatches a call to
`remove_redundant_packs_1()`. Note that the geometric repacking code
will still call `remove_redundant_pack()` directly, but see the previous
commit for more details.

Having `remove_redundant_packs_1()` exist as a separate function may
seem like overkill in this patch. However, a later patch will call
`remove_redundant_packs_1()` once over two separate lists, so this
refactoring sets us up for that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:47 -07:00
639c4a3992 builtin/repack.c: extract redundant pack cleanup for --geometric
To reduce the complexity of the already quite-long `cmd_repack()`
implementation, extract out the parts responsible for deleting redundant
packs from a geometric repack out into its own sub-routine.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:47 -07:00
054b5e4873 builtin/repack.c: extract marking packs for deletion
At the end of a repack (when given `-d`), Git attempts to remove any
packs which have been made "redundant" as a result of the repacking
operation. For example, an all-into-one (`-A` or `-a`) repack makes
every pre-existing pack which is not marked as kept redundant. Geometric
repacks (with `--geometric=<n>`) make any packs which were rolled up
redundant, and so on.

But before deleting the set of packs we think are redundant, we first
check to see whether or not we just wrote a pack which is identical to
any one of the packs we were going to delete. When this is the case, Git
must avoid deleting that pack, since it matches a pack we just wrote
(so deleting it may cause the repository to become corrupt).

Right now we only process the list of non-kept packs in a single pass.
But a future change will split the existing non-kept packs further into
two lists: one for cruft packs, and another for non-cruft packs.

Factor out this routine to prepare for calling it twice on two separate
lists in a future patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:47 -07:00
e2b43831a5 builtin/repack.c: extract structure to store existing packs
The repack machinery needs to keep track of which packfiles were present
in the repository at the beginning of a repack, segmented by whether or
not each pack is marked as kept.

The names of these packs are stored in two `string_list`s, corresponding
to kept- and non-kept packs, respectively. As a consequence, many
functions within the repack code need to take both `string_list`s as
arguments, leading to code like this:

    ret = write_cruft_pack(&cruft_po_args, packtmp, pack_prefix,
                           cruft_expiration, &names,
                           &existing_nonkept_packs, /* <- */
                           &existing_kept_packs);   /* <- */

Wrap up this pair of `string_list`s into a single structure that stores
both. This saves us from having to pass both string lists separately,
and prepares for adding additional fields to this structure.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 12:32:47 -07:00
d6c51973e4 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13 10:07:57 -07:00
d070b77d25 Merge branch 'ob/sequencer-reword-error-message'
Update an error message (which would probably never been seen).

* ob/sequencer-reword-error-message:
  sequencer: fix error message on failure to copy SQUASH_MSG
2023-09-13 10:07:57 -07:00
877c9919d6 Merge branch 'bc/more-git-var'
Fix-up for a topic that already has graduated.

* bc/more-git-var:
  var: avoid a segmentation fault when `HOME` is unset
2023-09-13 10:07:57 -07:00
331f20d52d Merge branch 'ew/hash-with-openssl-evp'
Fix-up new-ish code to support OpenSSL EVP API.

* ew/hash-with-openssl-evp:
  treewide: fix various bugs w/ OpenSSL 3+ EVP API
2023-09-13 10:07:57 -07:00
c52a02a0f0 Merge branch 'jk/unused-post-2.42-part2'
Unused parameters to functions are marked as such, and/or removed,
in order to bring us closer to -Wunused-parameter clean.

* jk/unused-post-2.42-part2:
  parse-options: mark unused parameters in noop callback
  interpret-trailers: mark unused "unset" parameters in option callbacks
  parse-options: add more BUG_ON() annotations
  merge: do not pass unused opt->value parameter
  parse-options: mark unused "opt" parameter in callbacks
  parse-options: prefer opt->value to globals in callbacks
  checkout-index: delay automatic setting of to_tempfile
  format-patch: use OPT_STRING_LIST for to/cc options
  merge: simplify parsing of "-n" option
  merge: make xopts a strvec
2023-09-13 10:07:56 -07:00
4333267995 completion: improve doc for complex aliases
The completion code can be told to use a particular completion for
aliases that shell out by using ': git <cmd> ;' as the first command of
the alias. This only works if <cmd> and the semicolon are separated by a
space, since if the space is missing __git_aliased_command returns (for
example) 'checkout;' instead of just 'checkout', and then
__git_complete_command fails to find a completion for 'checkout;'.

The examples have that space but it's not clear if it's just for
style or if it's mandatory. Explicitly mention it.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-12 17:46:01 -07:00
0b658eae75 completion: commit: complete trailers tokens more robustly
In the previous commit, we added support for completing configured
trailer tokens in 'git commit --trailer'.

Make the implementation more robust by:

- using '__git' instead of plain 'git', as the rest of the completion
  script does
- using a stricter pattern for --get-regexp to avoid false hits
- using 'cut' and 'rev' instead of 'awk' to account for tokens including
  dots.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-12 17:33:59 -07:00
63642d58b4 sequencer: remove unreachable exit condition in pick_commits()
This was introduced by 56dc3ab04 ("sequencer (rebase -i): implement the
'edit' command", 2017-01-02), and was pointless from the get-go: all
early exits from the loop above are returns, so todo_list->current ==
todo_list->nr is an invariant after the loop.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-12 17:32:09 -07:00
8aae489756 t3404-rebase-interactive.sh: fix typos in title of a rewording test
This test was introduced by commit 0c164ae7a ("rebase -i: add another
reword test", 2021-08-20). I didn't quite get what it was meant to do,
so here's an explanation from Phillip:

The purpose of the test is to ensure that

  (i) There are no uncommitted changes when the editor runs. i.e., we
      commit without running the editor and then reword by amending
      that commit. This ensures that we have the same user experience
      whether or not the commit was fast-forwarded [1].

 (ii) That the todo list is re-read after the commit has been reworded.
      This is to allow the user to update the todo list while the rebase
      is paused for editing the commit message.

[1] https://lore.kernel.org/git/20190812175046.GM20404@szeder.dev/

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-12 17:24:56 -07:00
83708f80fc test-tool: retire "index-version"
As "git update-index --show-index-version" can do the same thing,
the 'index-version' subcommand in the test-tool lost its reason to
exist.  Remove it and replace its use with the end-user facing
'git update-index --show-index-version'.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-12 16:21:53 -07:00
606e088d5d update-index: add --show-index-version
"git update-index --index-version N" is used to set the index format
version to a specific version, but there was no way to query the
current version used in the on-disk index file.

Teach the command a new "--show-index-version" option, and also
teach the "--index-version N" option to report what the version was
when run with the "--verbose" option.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-12 16:21:53 -07:00
764b2330db update-index doc: v4 is OK with JGit and libgit2
Being invented in late 2012 no longer makes the index v4 format
"relatively young".

The support for the index version 4 was added to libgit2 with their
5625d86b (index: support index v4, 2016-05-17) and to JGit with
their e9cb0a8e (DirCache: support index V4, 2020-08-10).

Let's update the paragraph that discouraged its use for folks overly
cautious about cross-tool compatibility.

Helped-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-12 16:21:53 -07:00
6a044a2048 diff-lib: fix check_removed when fsmonitor is on
`git diff-index` may return incorrect deleted entries when fsmonitor
is used in a repository with git submodules. This can be observed on
Mac machines, but it can affect all other supported platforms too.

If fsmonitor is used, `stat *st` is not initialized if cache_entry has
CE_FSMONITOR_VALID set. But, there are three call sites that rely on stat
afterwards, which can result in incorrect results.

This change partially reverts commit 4f3d6d02 (fsmonitor: skip lstat
deletion check during git diff-index, 2021-03-17).

Signed-off-by: Josip Sokcevic <sokcevic@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11 16:45:49 -07:00
5e8515e8e8 maintenance(systemd): support the Windows Subsystem for Linux
When running in the Windows Subsystem for Linux (WSL), it is usually
necessary to use the Git Credential Manager for authentication when
performing the background fetches.

This requires interoperability between the Windows Subsystem for Linux
and the Windows host to work, which uses so-called vsocks, i.e. sockets
intended for communcations between virtual machines and the host they
are running on.

However, when Git is configured to run background maintenance via
`systemd`, the address families available to those maintenance processes
are restricted, and did not include `AF_VSOCK`. This leads to problems
e.g. when a background fetch tries to access github.com:

	systemd[437]: Starting Optimize Git repositories data...
	git[747387]: WSL (747387) ERROR: UtilBindVsockAnyPort:285: socket failed 97
	git[747381]: fatal: could not read Username for 'https://github.com': No such device or address
	git[747381]: error: failed to prefetch remotes
	git[747381]: error: task 'prefetch' failed
	systemd[437]: git-maintenance@hourly.service: Main process exited, code=exited, status=1/FAILURE
	systemd[437]: git-maintenance@hourly.service: Failed with result 'exit-code'.
	systemd[437]: Failed to start Optimize Git repositories data.

Address this (pun intended) by adding the `AF_VSOCK` address family to
the allow list.

This fixes https://github.com/microsoft/git/issues/604.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11 12:41:30 -07:00
48944f214c diff --no-index: fix -R with stdin
When -R is given, queue_diff() swaps the mode and name variables of the
two files to produce a reverse diff.  1e3f26542a (diff --no-index:
support reading from named pipes, 2023-07-05) added variables that
indicate whether files are special, i.e named pipes or - for stdin.
These new variables were not swapped, though, which broke the handling
of stdin with with -R.  Swap them like the other metadata variables.

Reported-by: Martin Storsjö <martin@martin.st>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11 12:05:37 -07:00
94430d03df trailer: split process_command_line_args into separate functions
Previously, process_command_line_args did two things:

    (1) parse trailers from the configuration, and
    (2) parse trailers defined on the command line.

Separate (1) outside to a new function, parse_trailers_from_config.
Rename the remaining logic to parse_trailers_from_command_line_args.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11 10:01:19 -07:00
c2a8edf997 trailer: split process_input_file into separate pieces
Currently, process_input_file does three things:

    (1) parse the input string for trailers,
    (2) print text before the trailers, and
    (3) calculate the position of the input where the trailers end.

Rename this function to parse_trailers(), and make it only do
(1). The caller of this function, process_trailers, becomes responsible
for (2) and (3). These items belong inside process_trailers because they
are both concerned with printing the surrounding text around
trailers (which is already one of the immediate concerns of
process_trailers).

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11 10:01:19 -07:00
13211ae23f trailer: separate public from internal portion of trailer_iterator
The fields here are not meant to be used by downstream callers, so put
them behind an anonymous struct named as "internal" to warn against
their use. This follows the pattern in 576de3d956 (unpack_trees: start
splitting internal fields from public API, 2023-02-27).

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11 10:01:18 -07:00
9f892830d6 completion(switch/checkout): treat --track and -t the same
When `git switch --track ` is to be completed, only remote refs are
eligible because that is what the `--track` option targets.

And when the short-hand `-t` is used instead, the same _should_ happen.
Let's make it so.

Note that the bug exists both in the completions of `switch` and
`completion`, even if it manifests in slightly different ways: While
the completion of `git switch -t ` will not even look at remote refs,
the completion of `git checkout -t ` will look at both remote _and_
local refs. Both should look only at remote refs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-08 09:26:15 -07:00
6ccbc66794 trailer doc: <token> is a <key> or <keyAlias>, not both
The `--trailer` option takes a "<token>=<value>" argument, for example

    --trailer "Acked-by=Bob"

And in this exampple it is understood that "Acked-by" is the <token>.
However, the user can use a shorter "ack" string by defining
configuration like

    git config trailer.ack.key "Acked-by"

However, in the docs we define the above configuration as

    trailer.<token>.key

so the <token> can mean either the longer "Acked-by" or the shorter
"ack".

Separate the two meanings of <token> into <key> and <keyAlias>, and
update the configuration syntax to say "trailer.<keyAlias>.key".

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
ab76661f22 trailer doc: separator within key suppresses default separator
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
db97296122 trailer doc: emphasize the effect of configuration variables
The sentence does not mention the effect of configuration variables at
all, when they are actively used by default (unless --parse is
specified) to potentially add new trailers, without the user having to
always supply --trailer manually.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
289a0b2447 trailer --unfold help: prefer "reformat" over "join"
The phrase "join whitespace-continued values" requires some additional
context. For example, "whitespace" means newlines (not just space
characters), and "join" means to join only the multiple lines together
for a single trailer (and not that we are joining multiple trailers
together). That is, "join" means to convert

    token: This is a very long value, with spaces and
      newlines in it.

to

    token: This is a very long value, with spaces and newlines in it.

and does not mean to convert

    token: value1
    token: value2

to

    token: value1 value2.

Update the help text to resolve the above ambiguity. While we're add it,
update the docs to use similar language as the change in the help text.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
cb088cbe0f trailer --parse docs: add explanation for its usefulness
For users who are skimming the docs to go straight to the individual
breakdown of each flag, it may not be clear why --parse is a convenience
alias (without them also looking at the other options that --parse turns
on). To save them the trouble of looking at the other options (and
computing what that would mean), describe a summary of the overall
effect.

Similarly update the area when we first mention --parse near the top of
the doc.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
a6c72e7046 trailer --only-input: prefer "configuration variables" over "rules"
Use the phrase "configuration variables" instead of "rules" because

(1) we already say "configuration variables" in multiple
    places in the docs (where the word "rules" is only used for describing
    "--only-input" behavior and for an unrelated case of mentioning how
    the trailers do not follow "rules for RFC 822 headers"), and

(2) this phrase is more specific than just "rules".

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
8c7d4acb07 trailer --parse help: expose aliased options
The existing description "set parsing options" is vague, because
arguably _all_ of the options for interpret-trailers have to do with
parsing to some degree.

Explain what this flag does to match what is in the docs, namely how
it is an alias for "--only-trailers --only-input --unfold".

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
b674f25b81 trailer --no-divider help: describe usual "---" meaning
It's unclear what treating something "specially" means.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
467bb1b97a trailer: trailer location is a place, not an action
Fix the help text to say "placement" instead of "action" because the
values are placements, not actions.

While we're at it, tweak the documentation to say "placements" instead
of "values", similar to how the existing language for "--if-exists" uses
the word "action" to describe both the syntax (with the phrase
"--if-exists <action>") and the possible values (with the phrase
"possible actions").

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
f659c56a8c trailer doc: narrow down scope of --where and related flags
The wording "all configuration variables" is misleading (the same could
be said to the descriptions of the "--[no-]if-exists" and the
"--[no-]if-missing" options).  Specifying --where=value overrides only
the trailer.where variable and applicable trailer.<token>.where
variables, and --no-where stops the overriding of these variables.
Ditto for the other two with their relevant configuration variables.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
73574f21b4 trailer: add tests to check defaulting behavior with --no-* flags
While the "--no-where" flag is tested, the "--no-if-exists" and
"--no-if-missing" flags are not, so add tests for them. But also add
tests for all "--no-*" flags to check their effects, both when (1) there
are relevant configuration variables set, and (2) they are not set.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:44 -07:00
e670ba2500 trailer test description: this tests --where=after, not --where=before
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:04:36 -07:00
84e53330f0 trailer tests: make test cases self-contained
By using "test_config" instead of "git config", we avoid leaking
configuration state across test cases. This in turn helps to make the
tests more self-contained, by explicitly capturing the configuration
setup. It then makes it easier to add tests anywhere in this 1500+ line
file, without worrying about what implicit state was set in some prior
test case defined earlier up in the script.

This commit was created mechanically as follows: we changed the first
occurrence of a particular "git config trailer.*" option, then ran the
tests repeatedly to see which ones broke, adding in the extra
"test_config" equivalents to make them pass again. In addition, in some
test cases we removed "git config --unset ..." lines because they were
no longer necessary (as the --unset was being used to clean up leaked
configuration state from earlier test cases).

The process described above was done repeatedly until there were no more
unbridled "git config" invocations. Some "git config" invocations still
do exist in the script, but they were already cleaned up properly with

    test_when_finished "git config --remove-section ..."

so they were left alone.

Note that these cleanups result in generally longer test case setups
because the previously hidden state is now being exposed. Although we
could then clean up the test cases' "expected" values to be less
verbose (the verbosity arising from the use of implicit state), we
choose not to do so here, to make sure that this cleanup does not change
any meanings behind the test cases.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 23:03:57 -07:00
94e83dcf5b The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 15:06:19 -07:00
09684a12b0 Merge branch 'dd/format-patch-rfc-updates'
"git format-patch --rfc --subject-prefix=<foo>" used to ignore the
"--subject-prefix" option and used "[RFC PATCH]"; now we will add
"RFC" prefix to whatever subject prefix is specified.

This is a backward compatible change that may deserve a note.

* dd/format-patch-rfc-updates:
  format-patch: --rfc honors what --subject-prefix sets
2023-09-07 15:06:08 -07:00
32de857fb2 Merge branch 'jk/ci-retire-allow-ref'
CI update.

* jk/ci-retire-allow-ref:
  ci: deprecate ci/config/allow-ref script
  ci: allow branch selection through "vars"
2023-09-07 15:06:08 -07:00
300b2a1047 Merge branch 'ws/git-svn-retire-faketerm'
Code clean-up.

* ws/git-svn-retire-faketerm:
  git-svn: drop FakeTerm hack
2023-09-07 15:06:08 -07:00
25ff15d108 Merge branch 'jk/unused-post-2.42'
Unused parameters to functions are marked as such, and/or removed,
in order to bring us closer to -Wunused-parameter clean.

* jk/unused-post-2.42: (22 commits)
  update-ref: mark unused parameter in parser callbacks
  gc: mark unused descriptors in scheduler callbacks
  bundle-uri: mark unused parameters in callbacks
  fetch: mark unused parameter in ref_transaction callback
  credential: mark unused parameter in urlmatch callback
  grep: mark unused parmaeters in pcre fallbacks
  imap-send: mark unused parameters with NO_OPENSSL
  worktree: mark unused parameters in noop repair callback
  negotiator/noop: mark unused callback parameters
  add-interactive: mark unused callback parameters
  grep: mark unused parameter in output function
  test-trace2: mark unused argv/argc parameters
  trace2: mark unused config callback parameter
  trace2: mark unused us_elapsed_absolute parameters
  stash: mark unused parameter in diff callback
  ls-tree: mark unused parameter in callback
  commit-graph: mark unused data parameters in generation callbacks
  worktree: mark unused parameters in each_ref_fn callback
  pack-bitmap: mark unused parameters in show_object callback
  ref-filter: mark unused parameters in parser callbacks
  ...
2023-09-07 15:06:07 -07:00
8af5aac986 Merge branch 'tb/multi-cruft-pack'
Use of --max-pack-size to allow multiple packfiles to be created is
now supported even when we are sending unreachable objects to cruft
packs.

* tb/multi-cruft-pack:
  Documentation/gitformat-pack.txt: drop mixed version section
  Documentation/gitformat-pack.txt: remove multi-cruft packs alternative
  builtin/pack-objects.c: support `--max-pack-size` with `--cruft`
  builtin/pack-objects.c: remove unnecessary strbuf_reset()
2023-09-07 15:06:07 -07:00
aae8558b10 grep: reject --no-or
Since 3e230fa1b2 (grep: use parseopt, 2009-05-07) git grep has been
accepting the option --no-or.  It does the same as --or: nothing.
That's confusing and unintended.  Forbid negating --or.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 13:35:07 -07:00
c7153fad2d completion: commit: complete configured trailer tokens
Since 2daae3d1d1 (commit: add --trailer option, 2021-03-23), 'git
commit' can add trailers to commit messages. To make that feature more
pleasant to use at the command line, update the Bash completion code to
offer configured trailer tokens.

Add a __git_trailer_tokens function to list the configured trailers
tokens, and use it in _git_commit to suggest the configured tokens,
suffixing the completion words with ':' so that the user only has to add
the trailer value.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-07 12:37:07 -07:00
203573b024 rebase -i: fix adding failed command to the todo list
When rebasing commands are moved from the todo list in "git-rebase-todo"
to the "done" file (which is used by "git status" to show the recently
executed commands) just before they are executed. This means that if a
command fails because it would overwrite an untracked file it has to be
added back into the todo list before the rebase stops for the user to
fix the problem.

Unfortunately when a failed command is added back into the todo list the
command preceding it is erroneously appended to the "done" file.  This
means that when rebase stops after "pick B" fails the "done" file
contains

	pick A
	pick B
	pick A

instead of

	pick A
	pick B

This happens because save_todo() updates the "done" file with the
previous command whenever "git-rebase-todo" is updated. When we add the
failed pick back into "git-rebase-todo" we do not want to update
"done". Fix this by adding a "reschedule" parameter to save_todo() which
prevents the "done" file from being updated when adding a failed command
back into the "git-rebase-todo" file. A couple of the existing tests are
modified to improve their coverage as none of them trigger this bug or
check the "done" file.

Reported-by: Stefan Haller <lists@haller-berlin.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 10:29:44 -07:00
405509cbd6 rebase --continue: refuse to commit after failed command
If a commit cannot be picked because it would overwrite an untracked
file then "git rebase --continue" should refuse to commit any staged
changes as the commit was not picked. This is implemented by refusing to
commit if the message file is missing. The message file is chosen for
this check because it is only written when "git rebase" stops for the
user to resolve merge conflicts.

Existing commands that refuse to commit staged changes when continuing
such as a failed "exec" rely on checking for the absence of the author
script in run_git_commit(). This prevents the staged changes from being
committed but prints

    error: could not open '.git/rebase-merge/author-script' for
    reading

before the message about not being able to commit. This is confusing to
users and so checking for the message file instead improves the user
experience. The existing test for refusing to commit after a failed exec
is updated to check that we do not print the error message about a
missing author script anymore.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 10:29:44 -07:00
e032abd5a0 rebase: fix rewritten list for failed pick
git rebase keeps a list that maps the OID of each commit before it was
rebased to the OID of the equivalent commit after the rebase.  This list
is used to drive the "post-rewrite" hook that is called at the end of a
successful rebase. When a rebase stops for the user to resolve merge
conflicts the OID of the commit being picked is written to
".git/rebase-merge/stopped-sha". Then when the rebase is continued that
OID is added to the list of rewritten commits. Unfortunately if a commit
cannot be picked because it would overwrite an untracked file we still
write the "stopped-sha1" file. This means that when the rebase is
continued the commit is added into the list of rewritten commits even
though it has not been picked yet.

Fix this by not calling error_with_patch() for failed commands. The pick
has failed so there is nothing to commit and therefore we do not want to
set up the state files for committing staged changes when the rebase
continues. This change means we no-longer write a patch for the failed
command or display the error message printed by error_with_patch(). As
the command has failed the patch isn't really useful and in any case the
user can inspect the commit associated with the failed command by
inspecting REBASE_HEAD. Unless the user has disabled it we already print
an advice message that is more helpful than the message from
error_with_patch() which the user will still see. Even if the advice is
disabled the user will see the messages from the merge machinery
detailing the problem.

The code to add a failed command back into the todo list is duplicated
between pick_one_commit() and the loop in pick_commits(). Both sites
print advice about the command being rescheduled, decrement the current
item and save the todo list. To avoid duplicating this code
pick_one_commit() is modified to set a flag to indicate that the command
should be rescheduled in the main loop. This simplifies things as only
the remaining copy of the code needs to be modified to set REBASE_HEAD
rather than calling error_with_patch().

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 10:29:43 -07:00
f2b5f41eda sequencer: factor out part of pick_commits()
This simplifies the next commit. If a pick fails we now return the error
at the end of the loop body rather than returning early, a successful
"edit" command continues to return early. There are three things to
check to ensure that removing the early return for an error does not
change the behavior of the code:

(1) We could enter the block guarded by "if (reschedule)". This block
    is not entered because "reschedlue" is always zero when picking a
    commit.

(2) We could enter the block guarded by
    "else if (is_rebase_i(opts) &&  check_todo && !res)". This block is
    not entered when returning an error because "res" is non-zero in
    that case.

(3) todo_list->current could be incremented before returning. That is
    avoided by moving the increment which is of course a potential
    change in behavior itself. The move is safe because none of the
    callers look at todo_list after this function returns. Moving the
    increment makes it clear we only want to advance the current item
    if the command was successful.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 10:29:43 -07:00
9f67899b41 sequencer: use rebase_path_message()
Rather than constructing the path in a struct strbuf use the ready
made function to get the path name instead. This was the last
remaining use of the strbuf so remove it as well.

As with the previous patch we now use a hard coded string rather than
git_dir() when constructing the path. This is safe for the same
reason (make_patch() is only called when rebasing) and is protected by
the assertion added in the previous patch.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 10:29:43 -07:00
206a78d710 rebase -i: remove patch file after conflict resolution
When a rebase stops for the user to resolve conflicts it writes a patch
for the conflicting commit to .git/rebase-merge/patch. This file has
been written since the introduction of "git-rebase-interactive.sh" in
1b1dce4bae (Teach rebase an interactive mode, 2007-06-25). I assume the
idea was to enable the user inspect the conflicting commit in the same
way as they could for the patch based rebase. This file should be
deleted when the rebase continues as if the rebase stops for a failed
"exec" command or a "break" command it is confusing to the user if there
is a stale patch lying around from an unrelated command. As the path is
now used in two different places rebase_path_patch() is added and used
to obtain the path for the patch.

To construct the path write_patch() previously used get_dir() which
returns different paths depending on whether we're rebasing or
cherry-picking/reverting. As this function is only called when
rebasing it is safe to use a hard coded string for the directory
instead. An assertion is added to make sure we don't starting calling
this function when cherry-picking in the future.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 10:29:43 -07:00
36ac861a30 rebase -i: move unlink() calls
At the start of each iteration the loop that picks commits removes the
state files from the previous pick. However some of these files are only
written if there are conflicts in which case we exit the loop before the
end of the loop body. Therefore they only need to be removed when the
rebase continues, not at the start of each iteration.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 10:29:43 -07:00
11422f23e3 doc/diff-options: fix link to generating patch section
When formatted as man-page, the section title is rendered
"GENERATING PATCH TEXT WITH -P" whereas reference still reads
"Generating patch text with -p", that is inconsistent and makes
searching harder than it needs to be.

Fix this by getting rid of custom reference text.

Also, documentation for every command that describes `-p` option by
including the "diff-options.txt" file does include the
"diff-generate-patch.txt" file as well (as it should), so the internal
link is in fact useful for any of them.

Fix this by getting rid of conditionals around the reference.

Fixes: ebdc46c242 (docs: link generating patch sections)
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06 08:58:45 -07:00
256a94ef6c var: avoid a segmentation fault when HOME is unset
The code introduced in 576a37fccb (var: add attributes files locations,
2023-06-27) paid careful attention to use `xstrdup()` for pointers known
never to be `NULL`, and `xstrdup_or_null()` otherwise.

One spot was missed, though: `git_attr_global_file()` can return `NULL`,
when the `HOME` variable is not set (and neither `XDG_CONFIG_HOME`), a
scenario not too uncommon in certain server scenarios.

Fix this, and add a test case to avoid future regressions.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 15:28:26 -07:00
82af2c639c sequencer: fix error message on failure to copy SQUASH_MSG
The message talked about renaming, while the actual action is copying.
This was introduced by 6e98de72c ("sequencer (rebase -i): add support
for the 'fixup' and 'squash' commands", 2017-01-02).

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Acked-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 15:27:22 -07:00
2a63c79dae grep: use OPT_INTEGER_F for --max-depth
a91f453f64 (grep: Add --max-depth option., 2009-07-22) added the option
--max-depth, defining it using a positional struct option initializer of
type OPTION_INTEGER.  It also sets defval to 1 for some reason, but that
value would only be used if the flag PARSE_OPT_OPTARG was given.

Use the macro OPT_INTEGER_F instead to standardize the definition and
specify only the necessary values.  This also normalizes argh to N_("n")
as a side-effect, which is OK.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:59:26 -07:00
078c42531e name-rev: use OPT_HIDDEN_BOOL for --peel-tag
adfc1857bd (describe: fix --contains when a tag is given as input,
2013-07-18) added the option --peel-tag, defining it using a positional
struct option initializer and a comment indicating that it's intended to
be a hidden OPT_BOOL.  4741edd549 (Remove deprecated OPTION_BOOLEAN for
parsing arguments, 2013-08-03) added the macro OPT_HIDDEN_BOOL, which
allows to express this more succinctly.  Use it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:58:44 -07:00
6d79cd8474 ref-filter: sort numerically when ":size" is used
Atoms like "raw" and "contents" have a ":size" option which can be used
to know the size of the data. Since these atoms have the cmp_type
FIELD_STR, they are sorted alphabetically from 'a' to 'z' and '0' to
'9'. Meaning, even when the ":size" option is used and what we
ultimatlely have is numbers, we still sort alphabetically.

For example, consider the the following case in a repo

refname			contents:size		raw:size
=======			=============		========
refs/heads/branch1	1130			1210
refs/heads/master	300			410
refs/tags/v1.0		140			260

Sorting with "--format="%(refname) %(contents:size) --sort=contents:size"
would give

refs/heads/branch1 1130
refs/tags/v1.0.0 140
refs/heads/master 300

which is an alphabetic sort, while what one might really expect is

refs/tags/v1.0.0 140
refs/heads/master 300
refs/heads/branch1 1130

which is a numeric sort (that is, a "$ sort -n file" as opposed to a
"$ sort file", where "file" contains only the "contents:size" or
"raw:size" info, each of which is on a newline).

Same is the case with "--sort=raw:size".

So, sort numerically whenever the sort is done with "contents:size" or
"raw:size" and do it the normal alphabetic way when "contents" or "raw"
are used with some other option (they are FIELD_STR anyways).

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:49:40 -07:00
0058b3d5ee parse-options: mark unused parameters in noop callback
Unsurprisingly, the noop options callback doesn't bother to look at any
of its parameters. Let's mark them so that -Wunused-parameter does not
complain.

Another option would be to drop the callback and have parse-options
itself recognize OPT_NOOP_NOARG. But that seems like extra work for no
real benefit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:48:17 -07:00
d775365db3 interpret-trailers: mark unused "unset" parameters in option callbacks
There are a few parse-option callbacks that do not look at their "unset"
parameters, but also do not set PARSE_OPT_NONEG. At first glance this
seems like a bug, as we'd ignore "--no-if-exists", etc.

But they do work fine, because when "unset" is true, then "arg" is NULL.
And all three functions pass "arg" on to helper functions which do the
right thing with the NULL.

Note that this shortcut would not be correct if any callback used
PARSE_OPT_NOARG (in which case "arg" would be NULL but "unset" would be
false). But none of these do.

So the code is fine as-is. But we'll want to mark the unused "unset"
parameters to quiet -Wunused-parameter. I've also added a comment to
make this rather subtle situation more explicit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:48:17 -07:00
abf2952f83 parse-options: add more BUG_ON() annotations
These callbacks are similar to the ones touched by 517fe807d6 (assert
NOARG/NONEG behavior of parse-options callbacks, 2018-11-05), but were
either missed in that commit (the one in add.c) or were added later (the
one in log.c).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:48:17 -07:00
62c5358a5e merge: do not pass unused opt->value parameter
The option_parse_strategy() callback does not look at opt->value;
instead it calls append_strategy(), which manipulates the global
use_strategies array directly. But the OPT_CALLBACK declaration assigns
"&use_strategies" to opt->value.

One could argue this is good, as it tells the reader what we generally
expect the callback to do. But it is also bad, because it can mislead
you into thinking that swapping out "&use_strategies" there might have
any effect. Let's switch it to pass NULL (which is what every other
"does not bother to look at opt->value" callback does). If you want to
know what the callback does, it's easy to read the function itself.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:48:17 -07:00
34bf44f2d5 parse-options: mark unused "opt" parameter in callbacks
The previous commit argued that parse-options callbacks should try to
use opt->value rather than touching globals directly. In some cases,
however, that's awkward to do. Some callbacks touch multiple variables,
or may even just call into an abstracted function that does so.

In some of these cases we _could_ convert them by stuffing the multiple
variables into a single struct and passing the struct pointer through
opt->value. But that may make other parts of the code less readable,
as the struct relationship has to be mentioned everywhere.

Let's just accept that these cases are special and leave them as-is. But
we do need to mark their "opt" parameters to satisfy -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:48:17 -07:00
66e3309294 parse-options: prefer opt->value to globals in callbacks
We have several parse-options callbacks that ignore their "opt"
parameters entirely. This is a little unusual, as we'd normally put the
result of the parsing into opt->value. In the case of these callbacks,
though, they directly manipulate global variables instead (and in
most cases the caller sets opt->value to NULL in the OPT_CALLBACK
declaration).

The immediate symptom we'd like to deal with is that the unused "opt"
variables trigger -Wunused-parameter. But how to fix that is debatable.
One option is to annotate them with UNUSED. But another is to have the
caller pass in the appropriate variable via opt->value, and use it. That
has the benefit of making the callbacks reusable (in theory at least),
and makes it clear from the OPT_CALLBACK declaration which variables
will be affected (doubly so for the cases in builtin/fast-export.c,
where we do set opt->value, but it is completely ignored!).

The slight downside is that we lose type safety, since they're now
passing through void pointers.

I went with the "just use them" approach here. The loss of type safety
is unfortunate, but that is already an issue with most of the other
callbacks. If we want to try to address that, we should do so more
consistently (and this patch would prepare these callbacks for whatever
we choose to do there).

Note that in the cases in builtin/fast-export.c, we are passing
anonymous enums. We'll have to give them names so that we can declare
the appropriate pointer type within the callbacks.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:48:17 -07:00
9b40386586 checkout-index: delay automatic setting of to_tempfile
Using --stage=all requires writing to tempfiles, since we cannot put
multiple stages into a single file. So --stage=all implies --temp.

But we do so by setting to_tempfile in the options callback for --stage,
rather than after all options have been parsed. This leads to two bugs:

  1. If you run "checkout-index --stage=all --stage=2", this should not
     imply --temp, but it currently does. The callback cannot just unset
     to_tempfile when it sees the "2" value, because it no longer knows
     if its value was from the earlier --stage call, or if the user
     specified --temp explicitly.

  2. If you run "checkout-index --stage=all --no-temp", the --no-temp
     will overwrite the earlier implied --temp. But this mode of
     operation cannot work, and the command will fail with "<path>
     already exists" when trying to write the higher stages.

We can fix both by lazily setting to_tempfile. We'll make it a tristate,
with -1 as "not yet given", and have --stage=all enable it only after
all options are parsed. Likewise, after all options are parsed we can
detect and reject the bogus "--no-temp" case.

Note that this does technically change the behavior for "--stage=all
--no-temp" for paths which have only one stage present (which
accidentally worked before, but is now forbidden). But this behavior was
never intended, and you'd have to go out of your way to try to trigger
it.

The new tests cover both cases, as well the general "--stage=all implies
--temp", as most of the other tests explicitly say "--temp". Ironically,
the test "checkout --temp within subdir" is the only one that _doesn't_
use "--temp", and so was implicitly covering this case. But it seems
reasonable to have a more explicit test alongside the other related
ones.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:47:29 -07:00
1fc548b2d6 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05 14:38:56 -07:00
4241eece79 Merge branch 'jk/test-lsan-denoise-output'
Tests with LSan from time to time seem to emit harmless message
that makes our tests unnecessarily flakey; we work it around by
filtering the uninteresting output.

* jk/test-lsan-denoise-output:
  test-lib: ignore uninteresting LSan output
2023-09-05 14:38:56 -07:00
3e2b0c2f94 Merge branch 'js/ci-san-skip-p4-and-svn-tests'
Flakey "git p4" tests, as well as "git svn" tests, are now skipped
in the (rather expensive) sanitizer CI job.

* js/ci-san-skip-p4-and-svn-tests:
  ci(linux-asan-ubsan): let's save some time
2023-09-05 14:38:56 -07:00
8cc32c6b37 Merge branch 'tb/mark-more-tests-as-leak-free'
Tests that are known to pass with LSan are now marked as such.

* tb/mark-more-tests-as-leak-free:
  leak tests: mark t5583-push-branches.sh as leak-free
  leak tests: mark t3321-notes-stripspace.sh as leak-free
  leak tests: mark a handful of tests as leak-free
2023-09-05 14:38:56 -07:00
27e2ea97da Merge branch 'rs/parse-options-help-text-is-optional'
It may be tempting to leave the help text NULL for a command line
option that is either hidden or too obvious, but "git subcmd -h"
and "git subcmd --help-all" would have segfaulted if done so.  Now
the help text is optional.

* rs/parse-options-help-text-is-optional:
  parse-options: allow omitting option help text
2023-09-05 14:38:56 -07:00
c9192f9e45 git-revert.txt: add discussion
The section is inspired by git-commit.txt.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-02 15:21:44 -07:00
883cb1b8f8 sequencer: beautify subject of reverts of reverts
Instead of generating a silly-looking `Revert "Revert "foo""`, make it
a more humane `Reapply "foo"`.

This is done for two reasons:
- To cover the actually common case of just a double revert.
- To encourage people to rewrite summaries of recursive reverts by
  setting an example (a subsequent commit will also do this explicitly
  in the documentation).

To achieve these goals, the mechanism does not need to be particularly
sophisticated. Therefore, more complicated alternatives which would
"compress more efficiently" have not been implemented.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-02 15:20:43 -07:00
d814540bb7 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-01 11:26:28 -07:00
3b4e395cb3 Merge branch 'ob/format-patch-description-file'
"git format-patch" learns a way to feed cover letter description,
that (1) can be used on detached HEAD where there is no branch
description available, and (2) also can override the branch
description if there is one.

* ob/format-patch-description-file:
  format-patch: add --description-file option
2023-09-01 11:26:28 -07:00
f137bd4358 Merge branch 'jk/diff-result-code-cleanup'
"git diff --no-such-option" and other corner cases around the exit
status of the "diff" command has been corrected.

* jk/diff-result-code-cleanup:
  diff: drop useless "status" parameter from diff_result_code()
  diff: drop useless return values in git-diff helpers
  diff: drop useless return from run_diff_{files,index} functions
  diff: die when failing to read index in git-diff builtin
  diff: show usage for unknown builtin_diff_files() options
  diff-files: avoid negative exit value
  diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
2023-09-01 11:26:28 -07:00
e0b8c84240 treewide: fix various bugs w/ OpenSSL 3+ EVP API
The OpenSSL 3+ EVP API for SHA-* cannot support our prior use cases
supported by other SHA-* implementations.  It has the following
differences:

1. ->init_fn is required before all use
2. struct assignments don't work and requires ->clone_fn
3. can't support ->update_fn after ->final_*fn

While fixing cases 1 and 2 is merely the matter of calling ->init_fn and
->clone_fn as appropriate, fixing case 3 requires calling ->final_*fn on
a temporary context that's cloned from the primary context.

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/ZPCL11k38PXTkFga@debian.me/
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Fixes: 3e440ea0ab ("sha256: avoid functions deprecated in OpenSSL 3+")
Fixes: bda9c12073 ("avoid SHA-1 functions deprecated in OpenSSL 3+")
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 22:26:01 -07:00
4d5693ba05 lower core.maxTreeDepth default to 2048
On my Linux system, all of our recursive tree walking algorithms can run
up to the 4096 default limit without segfaulting. But not all platforms
will have stack sizes as generous (nor might even Linux if we kick off a
recursive walk within a thread).

In particular, several of the tests added in the previous few commits
fail in our Windows CI environment. Through some guess-and-check
pushing, I found that 3072 is still too much, but 2048 is OK.

These are obviously vague heuristics, and there is nothing to promise
that another system might not have trouble at even lower values. But it
seems unlikely anybody will be too angry about a 2048-depth limit (this
is close to the default max-pathname limit on Linux even for a
pathological path like "a/a/a/..."). So let's just lower it.

Some alternatives are:

  - configure separate defaults for Windows versus other platforms.

  - just skip the tests on Windows. This leaves Windows users with the
    annoying case that they can be crashed by running out of stack
    space, but there shouldn't be any security implications (they can't
    go deep enough to hit integer overflow problems).

Since the original default was arbitrary, it seems less confusing to
just lower it, keeping behavior consistent across platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:08 -07:00
7b61bd18b1 tree-diff: respect max_allowed_tree_depth
When diffing trees, we recurse to handle subtrees. That means we may run
out of stack space and segfault. Let's teach this code path about
core.maxTreeDepth in order to fail more gracefully.

As with the previous patch, we have no way to return an error (and other
tree-loading problems would just cause us to die()). So we'll likewise
call die() if we exceed the maximum depth.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:08 -07:00
670a1dadc7 list-objects: respect max_allowed_tree_depth
The tree traversal in list-objects.c, which is used by "rev-list
--objects", etc, uses recursion and may run out of stack space. Let's
teach it about the new core.maxTreeDepth config option.

We unfortunately can't return an error here, as this code doesn't
produce an error return at all. We'll die() instead, which matches the
behavior when we see an otherwise broken tree.

Note that this will also generally reject such deep trees from entering
the repository from a fetch or push, due to the use of rev-list in the
connectivity check. But it's not foolproof! We stop traversing when we
see an UNINTERESTING object, and the connectivity check marks existing
ref tips as UNINTERESTING. So imagine commit X has a tree
with maximum depth N. If you then create a new commit Y with a tree
entry "Y:subdir" that points to "X^{tree}", then the depth of Y will be
N+1. But a connectivity check running "git rev-list --objects Y --not X"
won't realize that; it will stop traversing at X^{tree}, since that was
already reachable.

So this will stop naive pushes of too-deep trees, but not carefully
crafted malicious ones. Doing it robustly and efficiently would require
caching the maximum depth of each tree (i.e., the longest path to any
leaf entry). That's much more complex and not strictly needed. If each
recursive algorithm limits itself already, then that's sufficient.
Blocking the objects from entering the repo would be a nice
belt-and-suspenders addition, but it's not worth the extra cost.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:08 -07:00
1ee7a5c388 read_tree(): respect max_allowed_tree_depth
The read_tree() function reads trees recursively (via its read_tree_at()
helper). This can cause it to run out of stack space on very deep trees.
Let's teach it about the new core.maxTreeDepth option.

The easiest way to demonstrate this is via "ls-tree -r", which the test
covers. Note that I needed a tree depth of ~30k to trigger a segfault on
my Linux system, not the 4100 used by our "big" test in t6700. However,
that test still tells us what we want: that the default 4096 limit is
enough to prevent segfaults on all platforms. We could bump it, but that
increases the cost of the test setup for little gain.

As an interesting side-note: when I originally wrote this patch about 4
years ago, I needed a depth of ~50k to segfault. But porting it forward,
the number is much lower. Seemingly little things like cf0983213c (hash:
add an algo member to struct object_id, 2021-04-26) take it from 32,722
to 29,080.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:08 -07:00
f1f63a481b traverse_trees(): respect max_allowed_tree_depth
The tree-walk.c code walks trees recursively, and may run out of stack
space. The easiest way to see this is with git-archive; on my 64-bit
Linux system it runs out of stack trying to generate a tarfile with a
tree depth of 13,772.

I've picked 4100 as the depth for our "big" test. I ran it with a much
higher value to confirm that we do get a segfault without this patch.
But really anything over 4096 is sufficient for its stated purpose,
which is to find out if our default limit of 4096 is low enough to
prevent segfaults on all platforms. Keeping it small saves us time on
the test setup.

The tree-walk code that's touched here underlies unpack_trees(), so this
protects any programs which use it, not just git-archive (but archive is
easy to test, and was what alerted me to this issue in a real-world
case).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:08 -07:00
be20128bfa add core.maxTreeDepth config
Most of our tree traversal algorithms use recursion to visit sub-trees.
For pathologically large trees, this can cause us to run out of stack
space and abort in an uncontrolled way. Let's put our own limit here so
that we can fail gracefully rather than segfaulting.

In similar cases where we recursed along the commit graph, we rewrote
the algorithms to avoid recursion and keep any stack data on the heap.
But the commit graph is meant to grow without bound, whereas it's not an
imposition to put a limit on the maximum size of tree we'll handle.

And this has a bonus side effect: coupled with a limit on individual
tree entry names, this limits the total size of a path we may encounter.
This gives us an extra protection against code handling long path names
which may suffer from integer overflows in the size (which could then be
exploited by malicious trees).

The default of 4096 is set to be much longer than anybody would care
about in the real world. Even with single-letter interior tree names
(like "a/b/c"), such a path is at least 8191 bytes. While most operating
systems will let you create such a path incrementally, trying to
reference the whole thing in a system call (as Git would do when
actually trying to access it) will result in ENAMETOOLONG. Coupled with
the recent fsck.largePathname warning, the maximum total pathname Git
will handle is (by default) 16MB.

This config option doesn't do anything yet; future patches will convert
various algorithms to respect the limit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:07 -07:00
0fbcaef6b4 fsck: detect very large tree pathnames
In general, Git tries not to arbitrarily limit what it will store, and
there are currently no limits at all on the size of the path we find in
a tree. In theory you could have one that is gigabytes long.

But in practice this freedom is not really helping anybody, and is
potentially harmful:

  1. Most operating systems have much lower limits for the size of a
     single pathname component (e.g., on Linux you'll generally get
     ENAMETOOLONG for anything over 255 bytes). And while you _can_ use
     Git in a way that never touches the filesystem (manipulating the
     index and trees directly), it's still probably not a good idea to
     have gigantic tree names. Many operations load and traverse them,
     so any clever Git-as-a-database scheme is likely to perform poorly
     in that case.

  2. We still have a lot of code which assumes strings are reasonably
     sized, and I won't be at all surprised if you can trigger some
     interesting integer overflows with gigantic pathnames. Stopping
     malicious trees from entering the repository provides an extra line
     of defense, protecting downstream code.

This patch implements an fsck check so that such trees can be rejected
by transfer.fsckObjects. I've picked a reasonably high maximum depth
here (4096) that hopefully should not bother anybody in practice. I've
also made it configurable, as an escape hatch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:07 -07:00
c7cd0e34cd tree-walk: rename "error" variable
The "error" variable in traverse_trees() shadows the global error()
function (meaning we can't call error() from here). Let's call the local
variable "ret" instead, which matches the idiom in other functions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:07 -07:00
acd13d1eec tree-walk: drop MAX_TRAVERSE_TREES macro
Since the previous commit dropped the hard-coded limit in
traverse_trees(), we don't need this macro there anymore (the code can
handle any number of trees in parallel).

We do define MAX_UNPACK_TREES using MAX_TRAVERSE_TREES, due to
5290d45134 (tree-walk.c: break circular dependency with unpack-trees,
2020-02-01). So we can just directly define that as "8" now; we know
traverse_trees() can handle whatever we throw at it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:07 -07:00
59c4c7d1cb tree-walk: reduce stack size for recursive functions
The traverse_trees() and traverse_trees_recursive() functions call each
other recursively. In a deep tree, this can result in running out of
stack space and crashing.

There's obviously going to be some limit here based on available stack,
but the problem is exacerbated by a few large structs, many of which we
over-allocate. For example, in traverse_trees() we store a name_entry
and tree_desc_x per tree, both of which contain an object_id (which is
now 32 bytes). And we allocate 8 of them (from MAX_TRAVERSE_TREES), even
though many traversals will only look at 1 or 2.

Interestingly, we used to allocate these on the heap, prior to
8dd40c0472 (traverse_trees(): use stack array for name entries,
2020-01-30). That commit was trying to simplify away allocation size
computations, and naively assumed that the sizes were small enough not
to matter. And they don't in normal cases, but on my stock Debian system
I see a crash running "git archive" on a tree with ~3600 entries.
That's deep enough we wouldn't see it in practice, but probably shallow
enough that we'd prefer not to make it a hard limit. Especially because
other systems may have even smaller stacks.

We can replace these stack variables with a few malloc invocations. This
reduces the stack sizes for the two functions from 1128 and 752 bytes,
respectively, down to 40 and 92 bytes. That allows a depth of ~13000 on
my machine (the improvement isn't in linear proportion because my
numbers don't count the size of parameters and other function overhead).

The possible downsides are:

  1. We now have to remember to free(). But both functions have an easy
     single exit (and already had to clean up other bits anyway).

  2. The extra malloc()/free() overhead might be measurable. I tested
     this by setting up a 3000-depth tree with a single blob and running
     "git archive" on it. After switching to the heap, it consistently
     runs 2-3% faster! Presumably this is because the 1K+ of wasted
     stack space penalized memory caches.

On a more real-world case like linux.git, the speed difference isn't
measurable at all, simply because most trees aren't that deep and
there's so much other work going on (like accessing the objects
themselves). So the improvement I saw should be taken as evidence that
we're not making anything worse, but isn't really that interesting on
its own. The main motivation here is that we're now less likely to run
out of stack space and crash.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:07 -07:00
3ccb4c55ad format-patch: use OPT_STRING_LIST for to/cc options
The to_callback() and cc_callback() functions are identical to the
generic parse_opt_string_list() function (except that they don't handle
optional arguments, but that's OK because their callers do not use the
OPTARG flag).

Let's simplify the code by using OPT_STRING_LIST.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:07:10 -07:00
7fa701106d merge: simplify parsing of "-n" option
The "-n" option is implemented by an option callback, as it is really a
"reverse bool". If given, it sets show_diffstat to 0. In theory, when
negated, it would set the same flag to 1. But it's not possible to
trigger that, since short options cannot be negated.

So in practice this is really just a SET_INT to 0. Let's use that
instead, which shortens the code.

Note that negation here would do the wrong thing (as with any SET_INT
with a value of "0"). We could specify PARSE_OPT_NONEG to future-proof
ourselves against somebody adding a long option name (which would make
it possible to negate). But there's not much point:

  1. Nobody is going to do that, because the negated form already
     exists, and is called "--stat" (which is defined separately so that
     "--no-stat" works).

  2. If they did, the BUG() check added by 3284b93862 (parse-options:
     disallow negating OPTION_SET_INT 0, 2023-08-08) will catch it (and
     that check is smart enough to realize that our short-only option is
     OK).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:07:10 -07:00
dee02da826 merge: make xopts a strvec
The "xopts" variable uses a custom array with ALLOC_GROW(). Using a
strvec simplifies things a bit. We need fewer variables, and we can also
ditch our custom parseopt callback in favor of OPT_STRVEC().

As a bonus, this means that "--no-strategy-option", which was previously
a silent noop, now does something useful: like other list-like options,
it will clear the list of -X options seen so far. This matches the
behavior of revert/cherry-pick, which made the same change in fb60b9f37f
(sequencer: use struct strvec to store merge strategy options,
2023-04-10).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:07:10 -07:00
e0d7db7423 format-patch: --rfc honors what --subject-prefix sets
Rather than replacing the configured subject prefix (either through the
git config or command line) entirely with "RFC PATCH", this change
prepends RFC to whatever subject prefix was already in use.

This is useful, for example, when a user is working on a repository that
has a subject prefix considered to disambiguate patches:

	git config format.subjectPrefix 'PATCH my-project'

Prior to this change, formatting patches with --rfc would lose the
'my-project' information.

The data flow for the subject-prefix was that rev.subject_prefix
were to be kept the authoritative version of the subject prefix even
while parsing command line options, and sprefix variable was used as
a temporary area to futz with it.  Now, the parsing code has been
refactored to build the subject prefix into the sprefix variable and
assigns its value at the end to rev.subject_prefix, which makes the
flow easier to grasp.

Signed-off-by: Drew DeVault <sir@cmpwn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:02:21 -07:00
3525f1dbc1 Merge branch 'ob/sequencer-empty-hint-fix'
The use of API for consistency between two calls to
require_clean_work_tree() from the sequencer code has been cleaned
up.

* ob/sequencer-empty-hint-fix:
  sequencer: rectify empty hint in call of require_clean_work_tree()
2023-08-31 14:31:42 -07:00
967bfc5894 Merge branch 'ch/t6300-verify-commit-test-cleanup'
Test clean-up.

* ch/t6300-verify-commit-test-cleanup:
  t/t6300: drop magic filtering
  t/lib-gpg: forcibly run a trustdb update
2023-08-31 14:31:42 -07:00
aa4b83dd5e git-svn: drop FakeTerm hack
Drop the FakeTerm hack, just like dfd46bae (send-email: drop
FakeTerm hack, 2023-08-08) did, for exactly the same reason.

It has been obsolete in git-svn since 30d45f798d (git-svn: delay term
initialization, 2014-09-14). Note that unlike send-email, we already
make sure to load Term::ReadLine only once. So this is just a cleanup,
and not fixing any bug.

Signed-off-by: Wesley Schwengle <wesleys@opperschaap.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-30 17:20:31 -07:00
edf80d23f1 ci: deprecate ci/config/allow-ref script
Now that we have the CI_BRANCHES mechanism, there is no need for anybody
to use the ci/config/allow-ref mechanism. In the long run, we can
hopefully remove it and the whole "config" job, as it consumes CPU and
adds to the end-to-end latency of the whole workflow. But we don't want
to do that immediately, as people need time to migrate until the
CI_BRANCHES change has made it into the workflow file of every branch.

So let's issue a warning, which will appear in the "annotations" section
below the workflow result in GitHub's web interface. And let's remove
the sample allow-refs script, as we don't want to encourage anybody to
use it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-30 15:56:11 -07:00
21c82dcd62 ci: allow branch selection through "vars"
When we added config to skip CI for certain branches in e76eec3554 (ci:
allow per-branch config for GitHub Actions, 2020-05-07), there wasn't
any way to avoid spinning up a VM just to check the config. From the
developer's perspective this isn't too bad, as the "skipped" branches
complete successfully after running the config job (the workflow result
is "success" instead of "skipped", but that is a minor lie).

But we are still wasting time and GitHub's CPU to spin up a VM just to
check the result of a short shell script. At the time there wasn't any
way to avoid this. But they've since introduced repo-level variables
that should let us do the same thing:

  https://github.blog/2023-01-10-introducing-required-workflows-and-configuration-variables-to-github-actions/#configuration-variables

This is more efficient, and as a bonus is probably less confusing to
configure (the existing system requires sticking your config on a magic
ref).

See the included docs for how to configure it.

The code itself is pretty simple: it checks the variable and skips the
config job if appropriate (and everything else depends on the config job
already). There are two slight inaccuracies here:

  - we don't insist on branches, so this likewise applies to tag names
    or other refs. I think in practice this is OK, and keeping the code
    (and docs) short is more important than trying to be more exact. We
    are targeting developers of git.git and their limited workflows.

  - the match is done as a substring (so if you want to run CI for
    "foobar", then branch "foo" will accidentally match). Again, this
    should be OK in practice, as anybody who uses this is likely to only
    specify a handful of well-known names. If we want to be more exact,
    we can have the code check for adjoining spaces. Or even move to a
    more general CI_CONFIG variable formatted as JSON. I went with this
    scheme for the sake of simplicity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-30 15:56:09 -07:00
6e8611e90a The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-30 13:50:41 -07:00
cc48906c3b Merge branch 'ts/unpacklimit-config-fix'
transfer.unpackLimit ought to be used as a fallback, but overrode
fetch.unpackLimit and receive.unpackLimit instead.

* ts/unpacklimit-config-fix:
  transfer.unpackLimit: fetch/receive.unpackLimit takes precedence
2023-08-30 13:50:41 -07:00
74a2e88700 Merge branch 'jc/diff-exit-code-with-w-fixes'
"git diff -w --exit-code" with various options did not work
correctly, which is being addressed.

* jc/diff-exit-code-with-w-fixes:
  diff: the -w option breaks --exit-code for --raw and other output modes
  t4040: remove test that succeeded for a wrong reason
  diff: teach "--stat -w --exit-code" to notice differences
  diff: mode-only change should be noticed by "--patch -w --exit-code"
  diff: move the fallback "--exit-code" code down
2023-08-30 13:50:41 -07:00
5ba560ba76 Merge branch 'tb/commit-graph-verify-fix'
The commit-graph verification code that detects mixture of zero and
non-zero generation numbers has been updated.

* tb/commit-graph-verify-fix:
  commit-graph: avoid repeated mixed generation number warnings
  t/t5318-commit-graph.sh: test generation zero transitions during fsck
  commit-graph: verify swapped zero/non-zero generation cases
  commit-graph: introduce `commit_graph_generation_from_graph()`
2023-08-30 13:50:40 -07:00
44ad082968 update-ref: mark unused parameter in parser callbacks
The parsing of stdin is driven by a table of function pointers; mark
unused parameters in concrete functions to avoid -Wunused-parameter
warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:26 -07:00
316b3a226a gc: mark unused descriptors in scheduler callbacks
Each of the scheduler update callbacks gets the descriptor of the lock
file, but only the crontab updater needs it. We have to retain the
unused descriptors because these are dispatched from a table of function
pointers, but we should mark them to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:26 -07:00
fd3fe4914a bundle-uri: mark unused parameters in callbacks
The first hunk is similar to 02c3c59e62 (hashmap: mark unused callback
parameters, 2022-08-19), but was added after that commit.

The other two are used with for_all_bundles_in_list(), but don't use
their void data pointer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:26 -07:00
ccf759cdb7 fetch: mark unused parameter in ref_transaction callback
Since this callback is just trying to collect the set of queued tag
updates, there is no need for it to look at old_oid at all. Mark it as
unused to appease -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:26 -07:00
8ca199511b credential: mark unused parameter in urlmatch callback
Our select_all() callback does not need to actually look at its
parameters, since the point is to match everything. But we need to mark
its parameters to satisfy -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:26 -07:00
4548b0145f grep: mark unused parmaeters in pcre fallbacks
When USE_LIBPCRE2 is not defined, we compile several noop fallbacks.
These need to have their parameters annotated to avoid
-Wunused-parameter warnings (and obviously we cannot remove the
parameters, since the functions must match the non-fallback versions).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:26 -07:00
2c3c3d88fc imap-send: mark unused parameters with NO_OPENSSL
Earlier patches annotating unused parameters in imap-send missed a few
cases in code that is compiled only with NO_OPENSSL. These need to
retain the extra parameters to match the interfaces used when we compile
with openssl support.

Note in the case of socket_perror() that the function declaration and
parts of its code are shared between the two cases, and only the openssl
code looks at "sock". So we can't simply mark the parameter as always
unused. Instead, we can add a noop statement that references it. This is
ugly, but should be portable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:26 -07:00
2b0e46f563 worktree: mark unused parameters in noop repair callback
The noop repair callback unsurprisingly does not look at any of its
parameters. Mark them as unused to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:25 -07:00
06b217fc1f negotiator/noop: mark unused callback parameters
The noop negotiator unsurprisingly does not bother looking at any of its
parameters. Mark them unused to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:25 -07:00
57dbb70cd9 add-interactive: mark unused callback parameters
The interactive commands are dispatched from a table of abstract
pointers, but not every command uses every parameter it receives. Mark
the unused ones to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:25 -07:00
bcba446228 grep: mark unused parameter in output function
This is a callback used with grep_options.output, but we don't look at
the grep_opt parameter, as we're just writing the output to stdout.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:25 -07:00
51bf8676c0 test-trace2: mark unused argv/argc parameters
The trace2 test helper uses function pointers to dispatch to individual
tests. Not all tests bother looking at their argv/argc parameters. We
could tighten this up (e.g., complaining when seeing unexpected
parameters), but for internal test code it's not worth worrying about.

This is similar in spirit to 126e3b3d2a (t/helper: mark unused argv/argc
arguments, 2023-03-28).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:25 -07:00
4b8dd424d8 trace2: mark unused config callback parameter
This should have been part of 783a86c142 (config: mark unused callback
parameters, 2022-08-19), but was missed in that commit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:25 -07:00
e46a25b05d trace2: mark unused us_elapsed_absolute parameters
Many trace2 targets ignore the absolute elapsed time parameters.
However, the virtual interface needs to retain the parameter since it is
used by others (e.g., the perf target).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:25 -07:00
71006d77c5 stash: mark unused parameter in diff callback
This is similar to the cases in 61bdc7c5d8 (diff: mark unused parameters
in callbacks, 2022-12-13), but I missed it when making that commit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:24 -07:00
c5cb97cbbf ls-tree: mark unused parameter in callback
The formatting functions are dispatched from a table of function
pointers. The "path name only" function unsurprisingly does not need to
look at its "oid" parameter, but we must mark it as unused to make
-Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:24 -07:00
e1cba404db commit-graph: mark unused data parameters in generation callbacks
The compute_generation_info code uses function pointers to abstract the
get/set generation operations. Some callers don't need the extra void
data pointer, which should be annotated to appease -Wunused-parameter.

Note that we can drop the assignment of the "data" parameter in
compute_generation_numbers(), as we've just shown that neither of the
callbacks it uses will access it. This matches the caller in
ensure_generations_valid(), which already does not bother to set "data".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:24 -07:00
bbfc4f53b9 worktree: mark unused parameters in each_ref_fn callback
This is similar to the cases in 63e14ee2d6 (refs: mark unused
each_ref_fn parameters, 2022-08-19), but it was added after that commit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:24 -07:00
d79b9f7cdb pack-bitmap: mark unused parameters in show_object callback
This is similar to the cases in c50dca2a18 (list-objects: mark unused
callback parameters, 2023-02-24), but was added after that commit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:24 -07:00
29c9f2c366 ref-filter: mark unused parameters in parser callbacks
These are similar to the cases annotated in 5fe9e1ce2f (ref-filter: mark
unused callback parameters, 2023-02-24), but were added after that
commit.

Note that the ahead/behind callback ignores its "atom" parameter, which
is a little unusual, since that struct usually stores the result. But in
this case, the data is stored centrally in ref_array->counts, since we
want to compute all ahead/behinds at once, not per ref.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:24 -07:00
c9f7b1e8f2 sequencer: mark repository argument as unused
In sequencer_get_last_command(), we don't ever look at the repository
parameter. This is due to ed5b1ca10b (status: do not report errors in
sequencer/todo, 2019-06-27), which dropped the call to parse_insn_line().

However, it _should_ be used when calling into git_path_* functions,
but the one we use here is declared with the non-REPO variant of
GIT_PATH_FUNC(), and so just uses the_repository internally.

We could change the path helper to use REPO_GIT_PATH_FUNC(), but doing
so piecemeal is not great. There are 41 uses of GIT_PATH_FUNC() in
sequencer.c, and inconsistently switching one makes the code more
confusing. Likewise, this one function is used in half a dozen other
spots, all of which would need to start passing in a repository argument
(with rippling effects up the call stack).

So let's punt on that for now and just silence any -Wunused-parameter
warning.

Note that we could also drop this parameter entirely, as the function is
always called directly, and not as a callback that has to conform to
some external interface. But since we'd eventually want to use the
repository parameter, let's leave it in place to avoid disrupting the
callers twice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:23 -07:00
cb646ffb0a sequencer: use repository parameter in short_commit_name()
Instead of just using the_repository, we can take a repository parameter
from the caller. Most of them already have one, and doing so clears up a
few -Wunused-parameter warnings. There are still a few callers which use
the_repository, but this pushes us one small step forward to eventually
getting rid of those.

Note that a few of these functions have a "rev_info" whose "repo"
parameter could probably be used instead of the_repository. I'm leaving
that for further cleanups, as it's not immediately obvious that
revs->repo is always valid, and there's quite a bit of other possible
refactoring here (even getting rid of some "struct repository" arguments
in favor of revs->repo).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:23 -07:00
6ba913629f ci(linux-asan-ubsan): let's save some time
Every once in a while, the `git-p4` tests flake for reasons outside of
our control. It typically fails with "Connection refused" e.g. here:
https://github.com/git/git/actions/runs/5969707156/job/16196057724

	[...]
	+ git p4 clone --dest=/home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git //depot
	  Initialized empty Git repository in /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git/.git/
	  Perforce client error:
		Connect to server failed; check $P4PORT.
		TCP connect to localhost:9807 failed.
		connect: 127.0.0.1:9807: Connection refused
	  failure accessing depot: could not run p4
	  Importing from //depot into /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git
	 [...]

This happens in other jobs, too, but in the `linux-asan-ubsan` job it
hurts the most because that job often takes over a full hour to run,
therefore re-running a failed `linux-asan-ubsan` job is _very_ costly.

The purpose of the `linux-asan-ubsan` job is to exercise the C code of
Git, anyway, and any part of Git's source code that the `git-p4` tests
run and that would benefit from the attention of ASAN/UBSAN are run
better in other tests anyway, as debugging C code run via Python scripts
can get a bit hairy.

In fact, it is not even just `git-p4` that is the problem (even if it
flakes often enough to be problematic in the CI builds), but really the
part about Python scripts. So let's just skip any Python parts of the
tests from being run in that job.

For good measure, also skip the Subversion tests because debugging C
code run via Perl scripts is as much fun as debugging C code run via
Python scripts. And it will reduce the time this very expensive job
takes, which is a big benefit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 13:54:55 -07:00
1a190bc14a The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 13:51:44 -07:00
b0f704563a Merge branch 'py/git-gui-updates'
Git GUI updates.

* py/git-gui-updates:
  git-gui - use mkshortcut on Cygwin
  git-gui - use cygstart to browse on Cygwin
  git-gui - remove obsolete Cygwin specific code
  git gui Makefile - remove Cygwin modifications
  Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
  Work around Tcl's default `PATH` lookup
  Move the `_which` function (almost) to the top
  Move is_<platform> functions to the beginning
  is_Cygwin: avoid `exec`ing anything
  windows: ignore empty `PATH` elements
  git-gui: Fix a typo in README
2023-08-29 13:51:44 -07:00
3d0e70ae06 Merge branch 'jc/ci-skip-same-commit'
Tweak GitHub Actions CI so that pushing the same commit to multiple
branch tips at the same time will not waste building and testing
the same thing twice.

* jc/ci-skip-same-commit:
  ci: avoid building from the same commit in parallel
2023-08-29 13:51:44 -07:00
19cb1fc37b Merge branch 'ds/scalar-updates'
Scalar updates.

* ds/scalar-updates:
  scalar reconfigure: help users remove buggy repos
  setup: add discover_git_directory_reason()
  scalar: add --[no-]src option
2023-08-29 13:51:44 -07:00
a59dbae0b3 Merge branch 'jc/mv-d-to-d-error-message-fix'
Typofix in an error message.

* jc/mv-d-to-d-error-message-fix:
  mv: fix error for moving directory to another
2023-08-29 13:51:43 -07:00
354356feff Merge branch 'sl/sparse-check-attr'
Teach "git check-attr" work better with sparse-index.

* sl/sparse-check-attr:
  check-attr: integrate with sparse-index
  attr.c: read attributes in a sparse directory
  t1092: add tests for 'git check-attr'
2023-08-29 13:51:43 -07:00
c0b5d46ded Documentation/gitformat-pack.txt: drop mixed version section
This section was added in 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20) to highlight a potential pitfall when
deploying cruft packs in an environment where multiple versions of Git
are GC-ing the same repository.

Now that it has been more than a year since 3d89a8c118 was written,
let's drop this section as it is no longer relevant.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 11:58:26 -07:00
3843ef8931 Documentation/gitformat-pack.txt: remove multi-cruft packs alternative
This text, originally from 3d89a8c118 (Documentation/technical: add
cruft-packs.txt, 2022-05-20) lists multiple cruft packs as a potential
alternative to the design of cruft packs.

We have always supported multiple cruft packs (i.e. we use the most
recent mtime for a given object among all cruft packs which contain it,
etc.), but haven't encouraged its use.

We still aren't encouraging users to go out and generate multiple cruft
packs, but let's take a step in that direction by dropping language that
suggests we aren't capable of working with multiple cruft packs.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 11:58:26 -07:00
61568efa95 builtin/pack-objects.c: support --max-pack-size with --cruft
When pack-objects learned the `--cruft` option back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20), we
explicitly forbade `--cruft` with `--max-pack-size`.

At the time, there was no specific rationale given in the patch for not
supporting the `--max-pack-size` option with `--cruft`. (As best I can
remember, it's because we were trying to push users towards only ever
having a single cruft pack, but I cannot be sure).

However, `--max-pack-size` is flexible enough that it already works with
`--cruft` and can shard unreachable objects across multiple cruft packs,
creating separate ".mtimes" files as appropriate. In fact, the
`--max-pack-size` option worked with `--cruft` as far back as
b757353676!

This is because we overwrite the `written_list`, and pass down the
appropriate length, i.e. the number of objects written in each pack
shard.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 11:58:06 -07:00
e741c07872 builtin/pack-objects.c: remove unnecessary strbuf_reset()
When reading input with the `--cruft` option, `git pack-objects` reads
each line into a strbuf, and then moves it to either the list of
discarded or fresh packs, depending on whether or not the input line
starts with a '-' character.

At the beginning of each loop iteration, the next line of input is read
with `strbuf_getline()`, which calls `strbuf_reset()` (as a part of
`strbuf_getwholeline()`) before reading the next line of input.

Thus, the call to `strbuf_reset()` (added back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20)) at the
end of the loop is unnecessary, so let's remove it accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 10:26:44 -07:00
5fafe8c95f leak tests: mark t5583-push-branches.sh as leak-free
When t5583-push-branches.sh was originally introduced via 425b4d7f47
(push: introduce '--branches' option, 2023-05-06), it was not leak-free.
In fact, the test did not even run correctly until 022fbb655d (t5583:
fix shebang line, 2023-05-12), but after applying that patch, we see a
failure at t5583.8:

    ==2529087==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 384 byte(s) in 1 object(s) allocated from:
        #0 0x7fb536330986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x55e07606cbf9 in xrealloc wrapper.c:140
        #2 0x55e075fb6cb3 in prio_queue_put prio-queue.c:42
        #3 0x55e075ec81cb in get_reachable_subset commit-reach.c:917
        #4 0x55e075fe9cce in add_missing_tags remote.c:1518
        #5 0x55e075fea1e4 in match_push_refs remote.c:1665
        #6 0x55e076050a8e in transport_push transport.c:1378
        #7 0x55e075e2eb74 in push_with_options builtin/push.c:401
        #8 0x55e075e2edb0 in do_push builtin/push.c:458
        #9 0x55e075e2ff7a in cmd_push builtin/push.c:702
        #10 0x55e075d8aaf0 in run_builtin git.c:452
        #11 0x55e075d8af08 in handle_builtin git.c:706
        #12 0x55e075d8b12c in run_argv git.c:770
        #13 0x55e075d8b6a0 in cmd_main git.c:905
        #14 0x55e075e81f07 in main common-main.c:60
        #15 0x7fb5360ab6c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        #16 0x7fb5360ab784 in __libc_start_main_impl ../csu/libc-start.c:360
        #17 0x55e075d88f40 in _start (git+0x1ff40) (BuildId: 38ad998b85a535e786129979443630d025ec2453)

    SUMMARY: LeakSanitizer: 384 byte(s) leaked in 1 allocation(s).

This leak was addressed independently via 68b51172e3 (commit-reach: fix
memory leak in get_reachable_subset(), 2023-06-03), which makes t5583
leak-free.

But t5583 was not in the tree when 68b51172e3 was written, and the two
only met after the latter was merged back in via 693bde461c (Merge
branch 'mh/commit-reach-get-reachable-plug-leak', 2023-06-20).

At that point, t5583 was leak-free. Let's mark it as such accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 09:41:56 -07:00
bac3ccc290 leak tests: mark t3321-notes-stripspace.sh as leak-free
This test was leak-free when t3321 was originally introduced, but never
marked as such:

    $ rev="$(git log --format='%H' --reverse -1 HEAD^ -- t/t3321-notes-stripspace.sh)"
    $ git checkout $rev

    $ make SANITIZE=leak
    [...]

    $ make -C t GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_OPTS=--immediate t3321-notes-stripspace.sh
    [...]
    # passed all 27 test(s)
    1..27
    # faking up non-zero exit with --invert-exit-code
    make: *** [Makefile:66: t3321-notes-stripspace.sh] Error 1
    make: Leaving directory '/home/ttaylorr/src/git/t'

Mark this test as leak-free accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 09:41:56 -07:00
20debfb210 leak tests: mark a handful of tests as leak-free
In the topic merged via 5a4f8381b6 (Merge branch
'ab/mark-leak-free-tests', 2021-10-25), a handful of tests in the suite
were marked as leak-free.

Since then, a handful of tests have become leak-free due to changes like

  - 861c56f6f9 (branch: fix a leak in setup_tracking, 2023-06-11), and
  - 866b43e644 (do_read_index(): always mark index as initialized unless
    erroring out, 2023-06-29)

, but weren't updated at the time to mark themselves as such. This leads
to test "failures" when running:

    $ make SANITIZE=leak
    $ make -C t \
        GIT_TEST_PASSING_SANITIZE_LEAK=check \
        GIT_TEST_SANITIZE_LEAK_LOG=true \
        GIT_TEST_OPTS=-vi test

This patch closes those gaps by exporting TEST_PASSES_SANITIZE_LEAK=true
before sourcing t/test-lib.sh on most remaining leak-free tests.

There are a couple of other tests which are similarly leak-free, but not
included in the list of tests touched by this patch. The remaining tests
will be addressed in the subsequent two patches.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 09:41:56 -07:00
370ef7e40d test-lib: ignore uninteresting LSan output
When I run the tests in leak-checking mode the same way our CI job does,
like:

  make SANITIZE=leak \
       GIT_TEST_PASSING_SANITIZE_LEAK=true \
       GIT_TEST_SANITIZE_LEAK_LOG=true \
       test

then LSan can racily produce useless entries in the log files that look
like this:

  ==git==3034393==Unable to get registers from thread 3034307.

I think they're mostly harmless based on the source here:

  7e0a52e8e9/compiler-rt/lib/lsan/lsan_common.cpp (L414)

which reads:

    PtraceRegistersStatus have_registers =
        suspended_threads.GetRegistersAndSP(i, &registers, &sp);
    if (have_registers != REGISTERS_AVAILABLE) {
      Report("Unable to get registers from thread %llu.\n", os_id);
      // If unable to get SP, consider the entire stack to be reachable unless
      // GetRegistersAndSP failed with ESRCH.
      if (have_registers == REGISTERS_UNAVAILABLE_FATAL)
        continue;
      sp = stack_begin;
    }

The program itself still runs fine and LSan doesn't cause us to abort.
But test-lib.sh looks for any non-empty LSan logs and marks the test as
a failure anyway, under the assumption that we simply missed the failing
exit code somehow.

I don't think I've ever seen this happen in the CI job, but running
locally using clang-14 on an 8-core machine, I can't seem to make it
through a full run of the test suite without having at least one
failure. And it's a different one every time (though they do seem to
often be related to packing tests, which makes sense, since that is one
of our biggest users of threaded code).

We can hack around this by only counting LSan log files that contain a
line that doesn't match our known-uninteresting pattern.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-28 12:23:20 -07:00
5dc72c0fbc The extra batch to update credenthal helpers
These two topics did not see much interest and reviews while they
were on 'next'; let's "inflict" them to the general public and see
if anybody screams, which is much less nicer way than to merge
only topics that are well reviewed down in an orderly manner, but
that is the only thing we can do to these topics without any
development community help.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-28 09:52:28 -07:00
bc92d2c7ac Merge branch 'mh/credential-erase-improvements-more'
Update two credential helpers to correctly match which credential
to erase; they dropped not the ones with stale password.

* mh/credential-erase-improvements-more:
  credential/wincred: erase matching creds only
  credential/libsecret: erase matching creds only
2023-08-28 09:51:16 -07:00
e839608295 Merge branch 'mh/credential-libsecret-attrs'
The way authentication related data other than passwords (e.g.
oath token and password expiration data) are stored in libsecret
keyrings has been rethought.

* mh/credential-libsecret-attrs:
  credential/libsecret: store new attributes
2023-08-28 09:51:16 -07:00
f9a547d3a7 scalar reconfigure: help users remove buggy repos
When running 'scalar reconfigure -a', Scalar has warning messages about
the repository missing (or not containing a .git directory). Failures
can also happen while trying to modify the repository-local config for
that repository.

These warnings may seem confusing to users who don't understand what
they mean or how to stop them.

Add a warning that instructs the user how to remove the warning in
future installations.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-28 09:16:06 -07:00
26ae8da683 setup: add discover_git_directory_reason()
There are many reasons why discovering a Git directory may fail. In
particular, 8959555cee (setup_git_directory(): add an owner check for
the top-level directory, 2022-03-02) added ownership checks as a
security precaution.

Callers attempting to set up a Git directory may want to inform the user
about the reason for the failure. For that, expose the enum
discovery_result from within setup.c and move it into cache.h where
discover_git_directory() is defined.

I initially wanted to change the return type of discover_git_directory()
to be this enum, but several callers rely upon the "zero means success".
The two problems with this are:

1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
   results are errors.

2. There are multiple successful states; positive results are
   successful.

It is worth noting that GIT_DIR_NONE is not returned, so we remove this
option from the enum. We must be careful to keep the successful reasons
as positive values, so they are given explicit positive values.

Instead of updating all callers immediately, add a new method,
discover_git_directory_reason(), and convert discover_git_directory() to
be a thin shim on top of it.

One thing that is important to note is that discover_git_directory()
previously returned -1 on error, so let's continue that into the future.
There is only one caller (in scalar.c) that depends on that signedness
instead of a non-zero check, so clean that up, too.

Because there are extra checks that discover_git_directory_reason() does
after setup_git_directory_gently_1(), there are other modes that can be
returned for failure states. Add these modes to the enum, but be sure to
explicitly add them as BUG() states in the switch of
setup_git_directory_gently().

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-28 09:16:06 -07:00
4527db8ff8 scalar: add --[no-]src option
Some users have strong aversions to Scalar's opinion that the repository
should be in a 'src' directory, even though this creates a clean slate
for placing build artifacts in adjacent directories.

The new --no-src option allows users to opt out of the default behavior.

While adding options, make sure the usage output by 'scalar clone -h'
reports the same as the SYNOPSIS line in Documentation/scalar.txt.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-28 09:16:06 -07:00
cd52d9e90f parse-options: allow omitting option help text
1b68387e02 (builtin/receive-pack.c: use parse_options API, 2016-03-02)
added the options --stateless-rpc, --advertise-refs and
--reject-thin-pack-for-testing with a NULL `help` string; 03831ef7b5
(difftool: implement the functionality in the builtin, 2017-01-19)
similarly added the "helpless" option --prompt.  Presumably this was
done because all four options are hidden and self-explanatory.

They cause a NULL pointer dereference when using the option --help-all
with their respective tool, though.  Handle such options gracefully
instead by turning the NULL pointer into an empty string at the top of
the loop, always printing a newline at the end and passing through the
separating newlines from the help text.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-28 08:20:20 -07:00
6807fcfeda The second batch for 2.43
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-25 10:37:38 -07:00
eccee1854c Merge branch 'jk/function-pointer-mismatches-fix'
Code clean-up to please clang-18.

* jk/function-pointer-mismatches-fix:
  hashmap: use expected signatures for comparison functions
2023-08-25 10:37:38 -07:00
23013a49c8 Merge branch 'ob/t9001-indent-fix'
Test style fix.

* ob/t9001-indent-fix:
  t9001: fix indentation in test_no_confirm()
2023-08-25 10:37:37 -07:00
05c8603564 Merge branch 'ja/worktree-orphan'
Typofix in an error message.

* ja/worktree-orphan:
  builtin/worktree.c: fix typo in "forgot fetch" msg
2023-08-25 10:37:37 -07:00
6d159f5757 Merge branch 'rs/parse-options-negation-help'
"git cmd -h" learned to signal which options can be negated by
listing such options like "--[no-]opt".

* rs/parse-options-negation-help:
  parse-options: simplify usage_padding()
  parse-options: no --[no-]no-...
  parse-options: factor out usage_indent() and usage_padding()
  parse-options: show negatability of options in short help
  t1502: test option negation
  t1502: move optionspec help output to a file
  t1502, docs: disallow --no-help
  subtree: disallow --no-{help,quiet,debug,branch,message}
2023-08-25 10:37:37 -07:00
99fe06cbfd ci: avoid building from the same commit in parallel
At times, we may need to push the same commit to multiple branches
in the same push.  Rewinding 'next' to rebuild on top of 'master'
soon after a release is such an occasion.  Making sure 'main' stays
in sync with 'master' to help those who expect that primary branch
of the project is named either of these is another.

We already use the branch name as a "concurrency group" key, but
that does not address the situation illustrated above.

Let's introduce another `concurrency` attribute, using the commit
hash as the concurrency group key, on the workflow run level, to
address this. This will hold any workflow run in the queued state
when there is already a workflow run targeting the same commit,
until that latter run completed. The `skip-if-redundant` check of
the second run will then have a chance to see whether the first
run succeeded.

The only caveat with this strategy is that only one workflow run
will be kept in the queued state by the `concurrency` feature: if
another run targeting the same commit is triggered, the
previously-queued run will be canceled. Considering the benefit,
this seems the smaller price to pay than to overload Git's build
agent pool with undesired workflow runs.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-25 09:48:22 -07:00
a793520380 Merge https://github.com/prati0100/git-gui
* https://github.com/prati0100/git-gui:
  git-gui - use mkshortcut on Cygwin
  git-gui - use cygstart to browse on Cygwin
  git-gui - remove obsolete Cygwin specific code
  git gui Makefile - remove Cygwin modifications
  Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
  Work around Tcl's default `PATH` lookup
  Move the `_which` function (almost) to the top
  Move is_<platform> functions to the beginning
  is_Cygwin: avoid `exec`ing anything
  windows: ignore empty `PATH` elements
  git-gui: Fix a typo in README
2023-08-24 09:57:43 -07:00
cd9da15a85 Start the 2.43 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-24 09:32:34 -07:00
c7b6a6c0be Merge branch 'ds/maintenance-schedule-fuzz'
Hourly and other schedule of "git maintenance" jobs are randomly
distributed now.

* ds/maintenance-schedule-fuzz:
  maintenance: update schedule before config
  maintenance: fix systemd schedule overlaps
  maintenance: use random minute in systemd scheduler
  maintenance: swap method locations
  maintenance: use random minute in cron scheduler
  maintenance: use random minute in Windows scheduler
  maintenance: use random minute in launchctl scheduler
  maintenance: add get_random_minute()
2023-08-24 09:32:34 -07:00
004a383091 Merge branch 'ob/test-lib-rebase-fake-editor-updates'
Test updates.

* ob/test-lib-rebase-fake-editor-updates:
  t/lib-rebase: improve documentation of set_fake_editor()
  t/lib-rebase: set_fake_editor(): handle FAKE_LINES more consistently
  t/lib-rebase: set_fake_editor(): fix recognition of reset's short command
2023-08-24 09:32:34 -07:00
aaf0a421e2 Merge branch 'mp/rebase-label-length-limit'
Overly long label names used in the sequencer machinery are now
chopped to fit under filesystem limitation.

* mp/rebase-label-length-limit:
  rebase: allow overriding the maximal length of the generated labels
  sequencer: truncate labels to accommodate loose refs
2023-08-24 09:32:33 -07:00
84d79009d9 Merge branch 'ds/upload-pack-error-sequence-fix'
Error message generation fix.

* ds/upload-pack-error-sequence-fix:
  upload-pack: fix exit code when denying fetch of unreachable object ID
  upload-pack: fix race condition in error messages
2023-08-24 09:32:33 -07:00
2f8aa2c3a0 Merge branch 'ws/git-push-doc-grammofix'
Doc update.

* ws/git-push-doc-grammofix:
  git-push.txt: fix grammar
2023-08-24 09:32:33 -07:00
987a85accb Merge branch 'tb/repack-geometry-cleanup'
Code clean-up.

* tb/repack-geometry-cleanup:
  repack: move `pack_geometry` struct to the stack
2023-08-24 09:32:33 -07:00
e9608bbc35 Merge branch 'ob/sequencer-rearrange-cleanup'
Code clean-up.

* ob/sequencer-rearrange-cleanup:
  sequencer: simplify allocation of result array in todo_list_rearrange_squash()
2023-08-24 09:32:33 -07:00
f5f23a430f Merge branch 'rj/branch-in-use-error-message'
A message written in olden time prevented a branch from getting
checked out saying it is already checked out elsewhere, but these
days, we treat a branch that is being bisected or rebased just like
a branch that is checked out and protect it.  Rephrase the message
to say that the branch is in use.

* rj/branch-in-use-error-message:
  branch: error message checking out a branch in use
  branch: error message deleting a branch in use
2023-08-24 09:32:33 -07:00
a9b5955e07 sequencer: rectify empty hint in call of require_clean_work_tree()
The canonical way to represent "no error hint" is making it NULL, which
shortcuts the error() call altogether. This fixes the output by removing
the line which said just "error:", which would appear when the worktree
is dirtied while editing the initial rebase todo file. This was
introduced by 97e1873 (rebase -i: rewrite complete_action() in C,
2018-08-28), which did a somewhat inaccurate conversion from shell.

To avoid that such bugs re-appear, test for the condition in
require_clean_work_tree().

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-24 08:58:05 -07:00
e25cbdf357 Merge branch 'ml/cygwin-fixes'
Remove some code supporting ancient Cygwin Tcl/Tk versions. Also fix
exploring working directory and making desktop shortcuts on Cygwin.

* ml/cygwin-fixes:
  git-gui - use mkshortcut on Cygwin
  git-gui - use cygstart to browse on Cygwin
  git-gui - remove obsolete Cygwin specific code
  git gui Makefile - remove Cygwin modifications
2023-08-24 16:46:29 +02:00
b85c5a4ec6 git-gui - use mkshortcut on Cygwin
git-gui enables the "Repository->Create Desktop Icon" item on Cygwin,
offering to create a shortcut that starts git-gui on the current
repository. The code in do_cygwin_shortcut invokes function
win32_create_lnk to create the shortcut. This latter function is shared
between Cygwin and Git For Windows and expects Windows rather than unix
pathnames, though do_cygwin_shortcut provides unix pathnames. Also, this
function tries to invoke the Windows Script Host to run a javascript
snippet, but this fails under Cygwin's Tcl. So, win32_create_lnk just
does not support Cygwin.

However, Cygwin's default installation provides /bin/mkshortcut for
creating desktop shortcuts. This is compatible with exec under Cygwin's
Tcl, understands Cygwin's unix pathnames, and avoids the need for shell
escapes to encode troublesome paths. So, teach git-gui to use mkshortcut
on Cygwin, leaving win32_create_lnk unchanged and for exclusive use by
Git For Windows.

Notes: "CHERE_INVOKING=1" is recognized by Cygwin's /etc/profile and
prevents a "chdir $HOME", leaving the shell in the working directory
specified by the shortcut. That directory is written directly by
mkshortcut eliminating any problems with shell escapes and quoting.

The code being replaced includes the full pathname of the git-gui
creating the shortcut, but that git-gui might not be compatible with the
git found after /etc/profile sets the path, and might have a pathname
that defies encoding using shell escapes that can survive the multiple
incompatible interpreters involved in the chain of creating and using
this shortcut.  The new code uses bare "git gui" as the command to
execute, thus using the system git to launch the system git-gui, and
avoiding both issues.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-08-24 16:19:57 +02:00
4ed23c3c92 git-gui - use cygstart to browse on Cygwin
git-gui enables the "Repository->Explore Working Copy" menu on Cygwin,
offering to open a Windows graphical file browser at the root of the
working directory. This code, shared with Git For Windows support,
depends upon use of Windows pathnames. However, git gui on Cygwin uses
unix pathnames, so this shared code will not work on Cygwin.

A base install of Cygwin provides the /bin/cygstart utility that runs
a registered Windows application based upon the file type, after
translating unix pathnames to Windows.  Adding the --explore option
guarantees that the Windows file explorer is opened, regardless of the
supplied pathname's file type and avoiding possibility of some other
action being taken.

So, teach git-gui to use cygstart --explore on Cygwin, restoring the
pre-2012 behavior of opening a Windows file explorer for browsing. This
separates the Git For Windows and Cygwin code paths. Note that
is_Windows is never true on Cygwin, and is_Cygwin is never true on Git
for Windows, though this is not obvious by examining the code for those
independent functions.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-08-24 16:19:57 +02:00
7145c654ff git-gui - remove obsolete Cygwin specific code
In the current git release, git-gui runs on Cygwin without enabling any
of git-gui's Cygwin specific code.  This happens as the Cygwin specific
code in git-gui was (mostly) written in 2007-2008 to work with Cygwin's
then supplied Tcl/Tk which was an incompletely ported variant of the
8.4.1 Windows Tcl/Tk code.  In March, 2012, that 8.4.1 package was
replaced with a full port based upon the upstream unix/X11 code,
since maintained up to date. The two Tcl/Tk packages are completely
incompatible, and have different signatures.

When Cygwin's Tcl/Tk signature changed in 2012, git-gui no longer
detected Cygwin, so did not enable Cygwin specific code, and the POSIX
environment provided by Cygwin since 2012 supported git-gui as a generic
unix. Thus, no-one apparently noticed the existence of incompatible
Cygwin specific code.

However, since commit c5766eae6f in the git-gui source tree
(https://github.com/prati0100/git-gui, master at a5005ded), and not yet
pulled into the git repository, the is_Cygwin function does detect
Cygwin using the unix/X11 Tcl/Tk.  The Cygwin specific code is enabled,
causing use of Windows rather than unix pathnames, and enabling
incorrect warnings about environment variables that were relevant only
to the old Tcl/Tk.  The end result is that (upstream) git-gui is now
incompatible with Cygwin.

So, delete Cygwin specific code (code protected by "if is_Cygwin") that
is not needed in any form to work with the unix/X11 Tcl/Tk.

Cygwin specific code required to enable file browsing and shortcut
creation is not addressed in this patch, does not currently work, and
invocation of those items may leave git-gui in a confused state.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-08-24 16:19:57 +02:00
ae49066982 git gui Makefile - remove Cygwin modifications
git-gui's Makefile hardcodes the absolute Windows path of git-gui's libraries
into git-gui, destroying the ability to package git-gui on one machine and
distribute to others. The intent is to do this only if a non-Cygwin Tcl/Tk is
installed, but the test for this is wrong with the unix/X11 Tcl/Tk shipped
since 2012. Also, Cygwin does not support a non-Cygwin Tcl/Tk.

The Cygwin git maintainer disables this code, so this code is definitely
not in use in the Cygwin distribution.

The simplest fix is to just delete the Cygwin specific code,
allowing the Makefile to work out of the box on Cygwin. Do so.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-08-24 16:19:57 +02:00
d0fc552bfc t/t6300: drop magic filtering
Now that we ran a trustdb check forcibly, it no longer pollutes the
output, and filtering is no longer required.

Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-23 09:13:17 -07:00
f3d33f8cfe transfer.unpackLimit: fetch/receive.unpackLimit takes precedence
The transfer.unpackLimit configuration variable is documented to be
used only as a fallback value when the more operation-specific
fetch.unpackLimit and receive.unpackLimit variables are not set, but
the implementation had the precedence reversed.  Apparently this was
broken since the transfer.unpackLimit was introduced in e28714c5
(Consolidate {receive,fetch}.unpackLimit, 2007-01-24).

Often when documentation and code have diverged for so long, we
prefer to change the documentation instead, to avoid disrupting
users.  But doing so would make these weirdly unlike most other
"specific overrides general" config options. And the fact that the
bug has existed for so long without anyone noticing implies to me
that nobody really tries to mix and match them much.

Signed-off-by: Taylor Santiago <taylorsantiago@google.com>
[jc: rewrote the log message, added tests, covered receive-pack as well]
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-22 18:30:49 -07:00
031fff289a t/lib-gpg: forcibly run a trustdb update
We want to compare output later, so randomly popping up 'gpg: checking
the trustdb' breaks the tests. Run the trustdb update forcibly.

Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-22 09:29:06 -07:00
a64f8b2595 diff: the -w option breaks --exit-code for --raw and other output modes
The output from "--raw", "--name-status", and "--name-only" modes in
"git diff" does depend on and does not reflect how certain different
contents are considered equal, unlike "--patch" and "--stat" output
modes do, when used with options like "-w" (another way of thinking
about it is that it is not like we recompute the hash of the blob
after removing all whitespaces to show "git diff --raw -w" output).

But the fact that "--raw" and friends ignore "-w" is not a good
excuse for "diff --raw -w --exit-code" to also ignore the request to
report the differences with its exit status.  When run without "-w",
"git diff --exit-code --raw" does report with its exit status the
differences as requested, and we should do the same when run with
"-w", too.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 18:56:03 -07:00
db6044d762 commit-graph: avoid repeated mixed generation number warnings
When validating that a commit-graph has either all zero, or all non-zero
generation numbers, we emit a warning on both the rising and falling
edge of transitioning between the two.

So if we are unfortunate enough to see a commit-graph which has a
repeating sequence of zero, then non-zero generation numbers, we'll
generate many warnings that contain more or less the same information.

Avoid this by keeping track of a single example for a commit with zero-
and non-zero generation, and emit a single warning at the end of
verification if both are non-NULL.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 16:16:23 -07:00
ce7629a315 t/t5318-commit-graph.sh: test generation zero transitions during fsck
The second test called "detect incorrect generation number" asserts that
we correctly warn during an fsck when we see a non-zero generation
number after seeing a zero beforehand.

The other transition (going from non-zero to zero) was previously
untested. Test both directions, and rename the existing test to make
clear which direction it is exercising.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 16:07:34 -07:00
cc9c9a00a5 commit-graph: verify swapped zero/non-zero generation cases
In verify_one_commit_graph(), we have code that complains when a commit
is found with a generation number of zero, and then later with a
non-zero number. It works like this:

  1. When we see an entry with generation zero, we set the
     generation_zero flag to GENERATION_ZERO_EXISTS.

  2. When we later see an entry with a non-zero generation, we complain
     if the flag is GENERATION_ZERO_EXISTS.

There's a matching GENERATION_NUMBER_EXISTS value, which in theory would
be used to find the case that we see the entries in the opposite order:

  1. When we see an entry with a non-zero generation, we set the
     generation_zero flag to GENERATION_NUMBER_EXISTS.

  2. When we later see an entry with a zero generation, we complain if
     the flag is GENERATION_NUMBER_EXISTS.

But that doesn't work; step 2 is implemented, but there is no step 1. We
never use NUMBER_EXISTS at all, and Coverity rightly complains that step
2 is dead code.

We can fix that by implementing that step 1.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 16:07:33 -07:00
868c991155 commit-graph: introduce commit_graph_generation_from_graph()
In 2ee11f7261 (commit-graph: return generation from memory, 2023-03-20),
the `commit_graph_generation()` function stopped returning zeros when
asked to locate the generation number of a given commit.

This was done at the time to prepare for a later change which set
generation values in memory, meaning that we could no longer rely on
`graph_pos` alone to tell us whether or not to trust the generation
number returned by this function.

In 2ee11f7261, it was noted that this change only impacted very old
commit-graphs, which were written with all commits having generation
number 0. Indeed, zero is not a valid generation number, so we should
never expect to see that value outside of the aforementioned case.

The test fallout in 2ee11f7261 indicated that we were no longer able to
fsck a specific old case of commit-graph corruption, where we see a
non-zero generation number after having seen a generation number of 0
earlier.

Introduce a variant of `commit_graph_generation()` which behaves like
that function did prior to 2ee11f7261, known as
`commit_graph_generation_from_graph()`. Then use this function in the
context of `verify_one_commit_graph()`, where we only want to trust the
values from the graph.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 16:07:33 -07:00
5cc6b2d70b diff: drop useless "status" parameter from diff_result_code()
Many programs use diff_result_code() to get a user-visible program exit
code from a diff result (e.g., checking opts.found_changes if
--exit-code was requested).

This function also takes a "status" parameter, which seems at first
glance that it could be used to propagate an error encountered when
computing the diff. But it doesn't work that way:

  - negative values are passed through as-is, but are not appropriate as
    program exit codes

  - when --exit-code or --check is in effect, we _ignore_ the passed-in
    status completely. So a failed diff which did not have a chance to
    set opts.found_changes would erroneously report "success, no
    changes" instead of propagating the error.

After recent cleanups, neither of these bugs is possible to trigger, as
every caller just passes in "0". So rather than fixing them, we can
simply drop the useless parameter instead.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:33:24 -07:00
c0049ca0d7 diff: drop useless return values in git-diff helpers
Since git-diff has many diff modes, it dispatches to many helpers to
perform each one. But every helper simply returns "0", as it exits
directly if there are serious errors (and options like --exit-code are
handled afterwards). So let's get rid of these useless return values,
which makes the code flow more clear.

There's very little chance that we'd later want to propagate errors
instead of dying immediately. These are all static-local helpers for the
git-diff program implementing its various modes. More "lib-ified" code
would directly call the underlying functions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:33:24 -07:00
25bd3acd04 diff: drop useless return from run_diff_{files,index} functions
Neither of these functions ever returns a value other than zero.
Instead, they expect unrecoverable errors to exit immediately, and
things like "--exit-code" are stored inside the diff_options struct to
be handled later via diff_result_code().

Some callers do check the return values, but many don't bother. Let's
drop the useless return values, which are misleading callers about how
the functions work. This could be seen as a step in the wrong direction,
as we might want to eventually "lib-ify" these to more cleanly return
errors up the stack, in which case we'd have to add the return values
back in. But there are some benefits to doing this now:

  1. In the current code, somebody could accidentally add a "return -1"
     to one of the functions, which would be erroneously ignored by many
     callers. By removing the return code, the compiler can notice the
     mismatch and force the developer to decide what to do.

     Obviously the other option here is that we could start consistently
     checking the error code in every caller. But it would be dead code,
     and we wouldn't get any compile-time help in catching new cases.

  2. It communicates the situation to callers, who may want to choose a
     different function. These functions are really thin wrappers for
     doing git-diff-files and git-diff-index within the process. But
     callers who care about recovering from an error here are probably
     better off using the underlying library functions, many of
     which do return errors.

If somebody eventually wants to teach these functions to propagate
errors, they'll have to switch back to returning a value, effectively
reverting this patch. But at least then they will be starting with a
level playing field: they know that they will need to inspect each
caller to see how it should handle the error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:33:24 -07:00
3755077b50 diff: die when failing to read index in git-diff builtin
When the git-diff program fails to read the index in its diff-files or
diff-index helper functions, it propagates the error up the stack. This
eventually lands in diff_result_code(), which does not handle it well
(as discussed in the previous patch).

Since the only sensible thing here is to exit with an error code (and
what we were expecting the propagated error code to cause), let's just
do that directly.

There's no test here, as I'm not even sure this case can be triggered.
The index-reading functions tend to die() themselves when encountering
any errors, and the return value is just the number of entries in the
file (and so always 0 or positive). But let's err on the conservative
side and keep checking the return value. It may be worth digging into as
a separate topic (though index-reading is low-level enough that we
probably want to eventually teach it to propagate errors anyway for
lib-ification purposes, at which point this code would already be doing
the right thing).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:33:24 -07:00
5ad6e2b495 diff: show usage for unknown builtin_diff_files() options
The git-diff command has many modes (comparing worktree to index, index
to HEAD, individual blobs, etc). As a result, it dispatches to many
helper functions and cannot completely parse its options until we're in
those helper functions.

Most of them, when seeing an unknown option, exit immediately by calling
usage(). But builtin_diff_files(), which is the default if no revision
or blob arguments are given, instead prints an error() and returns -1.

One obvious shortcoming here is that the user doesn't get to see the
usual usage message. But there's a much more important bug: the -1
return is fed to diff_result_code(), which is not ready to handle it.
By default, it passes the code along as an exit code. We try to avoid
negative exit codes because they get converted to unsigned values, but
it should at least consistently show up as non-zero (i.e., a failure).

But much worse is that when --exit-code is in effect, diff_result_code()
will _ignore_ the status passed in by the caller, and instead only
report on whether the diff found changes. It didn't, of course, because
we never ran the diff, and the program unexpectedly exits with success!

We can fix this bug by just calling usage(), like the other helpers do.
Another option would of course be to teach diff_result_code() to handle
this value. But as we'll see in the next few patches, it can be cleaned
up even further. Let's just fix this bug directly to start with.

Reported-by: Romain Chossart <romainchossart@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:33:24 -07:00
f126f6ec22 diff-files: avoid negative exit value
If loading the index fails, we print an error and then return "-1" from
the function. But since this is a builtin, we end up with exit(-1),
which produces odd results since program exit codes are unsigned.
Because of integer conversion, it usually becomes 255, which is at least
still an error, but values above 128 are usually interpreted as signal
death.

Since we know the program is exiting immediately, we can just replace
the error return with a die().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:33:24 -07:00
976b97e3fd diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
Many callers of run_diff_index() passed literal "1" for the option
flag word, which should better be spelled out as DIFF_INDEX_CACHED
for readablity.  Everybody else passes "0" that can stay as-is.

The other bit in the option flag word is DIFF_INDEX_MERGE_BASE, but
curiously there is only one caller that can pass it, which is "git
diff-index --merge-base" itself---no internal callers uses the
feature.

A bit tricky call to the function is in builtin/submodule--helper.c
where the .cached member in a private struct is set/reset as a plain
Boolean flag, which happens to be "1" and happens to match the value
of DIFF_INDEX_CACHED.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:33:23 -07:00
67f4b36e33 format-patch: add --description-file option
This patch makes it possible to directly feed a branch description to
derive the cover letter from. The use case is formatting dynamically
created temporary commits which are not referenced anywhere.

The most obvious alternative would be creating a temporary branch and
setting a description on it, but that doesn't seem particularly elegant.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 15:03:47 -07:00
1e63b34a44 decorate: use commit color for HEAD arrow
Use the commit color instead of the HEAD color for the arrow or custom
symbol in "HEAD -> branch" decorations, for visual consistency with the
prefix, separator and suffix symbols, which are also colored with the
commit color.

This change was triggered by the possibility that one could choose to
use the same symbol for the pointer and the separator options in
%(decorate), in which case they ought to be the same color.

A related precedent is 'ls -l', where the arrow for symlinks gets the
default color rather than that of the symlink name.

Amend test t4207-log-decoration-colors.sh accordingly.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:10 -07:00
f1f8a25856 pretty: add pointer and tag options to %(decorate)
Add pointer and tag options to %(decorate) format, to allow to override
the " -> " string used to show where HEAD points and the "tag: " string
used to mark tags.

Document in pretty-formats.txt and test in t4205-log-pretty-formats.sh.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:10 -07:00
a58dd835e9 pretty: add %(decorate[:<options>]) format
Add %(decorate[:<options>]) format that lists ref names similarly to the
%d format, but which allows the otherwise fixed prefix, suffix and
separator strings to be customized. Omitted options default to the
strings used in %d.

Rename expand_separator() function used to expand %x literal formatting
codes to expand_string_arg(), as it is now used on strings other than
separators.

Examples:
- %(decorate) is equivalent to %d.
- %(decorate:prefix=,suffix=) is equivalent to %D.
- %(decorate:prefix=[,suffix=],separator=%x3B) produces a list enclosed
in square brackets and separated by semicolons.

Test the format in t4205-log-pretty-formats.sh and document it in
pretty-formats.txt.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:09 -07:00
dcb347f837 decorate: color each token separately
Wrap "tag:" prefixes and the arrows in "HEAD -> branch" decorations in
their own color sequences. Otherwise, if --graph is used, tag names or
arrows can end up uncolored when %w width formatting breaks a line just
before them. This is because --graph resets the color after doing its
drawing at the start of a line.

Amend test t4207-log-decoration-colors.sh accordingly.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:09 -07:00
b87a9a2c1e decorate: avoid some unnecessary color overhead
In format_decorations(), don't obtain color sequences if there are no
decorations, and don't emit color sequences around empty strings.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:09 -07:00
a3883a6532 decorate: refactor format_decorations()
Rename the format_decorations_extended function to format_decorations
and drop the format_decorations wrapper macro. Pass the prefix, suffix
and separator strings as a single 'struct format_decorations' pointer
argument instead of separate arguments. Use default values defined in
the function when either the struct pointer or any of the struct fields
are NULL. This is to ease extension with additional options.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:09 -07:00
31a922f838 pretty-formats: enclose options in angle brackets
Enclose the 'options' placeholders in the documentation of the
%(describe) and %(trailers) format specifiers in angle brackets to
clarify that they are placeholders rather than keywords.

Also remove the indentation from their descriptions, instead of
increasing it to account for the extra two angle brackets in the
headings. The indentation isn't required by asciidoc, it doesn't reflect
how the output text is formatted, and it's inconsistent with the
following bullet points that are at the same level in the output.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:09 -07:00
014aa1d1aa pretty-formats: define "literal formatting code"
The description for a %(trailer) option already uses this term without
having a definition anywhere in the document, and we are about to add
another one in %(decorate) that uses it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 11:40:09 -07:00
43c8a30d15 Git 2.42
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 09:34:58 -07:00
915e51b74e Merge branch 'jk/function-pointer-mismatches-fix' (early part)
Fix a minor regression that some compiler might notice.

* 'jk/function-pointer-mismatches-fix' (early part):
  fsck: use enum object_type for fsck_walk callback
2023-08-21 09:27:44 -07:00
5a50dd7eda Merge tag 'l10n-2.42.0-rnd2' of https://github.com/git-l10n/git-po
l10n-2.42.0-rnd2

* tag 'l10n-2.42.0-rnd2' of https://github.com/git-l10n/git-po:
  l10n: zh_TW.po: Git 2.42
  l10n: zh_CN: 2.42.0 round 2
  l10n: zh_CN: v2.42.0 round 1
  l10n: Update German translation
  l10n: Update Catalan translation
  l10n: tr: git 2.42.0
  l10n: fr v2.42.0 rnd 2
  l10n: fr v2.42.0 rnd 1
  l10n: sv.po: Update Swedish translation 5549t0f0u
  l10n: uk: update translation (2.42.0)
  l10n: po-id for 2.42 (round 1)
  l10n: ru.po: update Russian translation
2023-08-21 08:43:46 -07:00
d1f87c2148 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.42 (round 1)
2023-08-21 07:05:38 +08:00
5e2dff212a l10n: zh_TW.po: Git 2.42
Co-authored-by: Lumynous <lumynou5.tw@gmail.com>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2023-08-20 22:01:37 +08:00
beaa1d952b hashmap: use expected signatures for comparison functions
We prefer for callback functions to match the signature with which
they'll be called, rather than casting them to the correct type when
assigning function pointers. Even though casting often works in the real
world, it is a violation of the standard.

We did a mass conversion in 939af16eac (hashmap_cmp_fn takes
hashmap_entry params, 2019-10-06), but have grown a few new cases since
then. Because of the cast, the compiler does not complain. However, as
of clang-18, UBSan will catch these at run-time, and the case in
range-diff.c triggers when running t3206.

After seeing that one, I scanned the results of:

  git grep '_fn)[^(]' '*.c' | grep -v typedef

and found a similar case in compat/terminal.c (which presumably isn't
called in the test suite, since it doesn't trigger UBSan). There might
be other cases lurking if the cast is done using a typedef that doesn't
end in "_fn", but loosening it finds too many false positives. I also
looked for:

  git grep ' = ([a-z_]*) *[a-z]' '*.c'

to find assignments that cast, but nothing looked like a function.

The resulting code is unfortunately a little longer, but the bonus of
using container_of() is that we are no longer restricted to the
hashmap_entry being at the start of the struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-19 21:17:53 -07:00
2bbeddee5d fsck: use enum object_type for fsck_walk callback
We switched the function interface for fsck callbacks in a1aad71601
(fsck.h: use "enum object_type" instead of "int", 2021-03-28). However,
we accidentally flipped the type back to "int" as part of 0b4e9013f1
(fsck: mark unused parameters in various fsck callbacks, 2023-07-03).
The mistake happened because that commit was written before a1aad71601
and rebased forward, and I screwed up while resolving the conflict.

Curiously, the compiler does not warn about this mismatch, at least not
when using gcc and clang on Linux (nor in any of our CI environments).
Based on 28abf260a5 (builtin/fsck.c: don't conflate "int" and "enum" in
callback, 2021-06-01), I'd guess that this would cause the AIX xlc
compiler to complain. I noticed because clang-18's UBSan now identifies
mis-matched function calls at runtime, and does complain of this case
when running the test suite.

I'm not entirely clear on whether this mismatch is a problem in
practice. Compilers are certainly free to make enums smaller than "int"
if they don't need the bits, but I suspect that they have to promote
back to int for function calls (though I didn't dig in the standard, and
I won't be surprised if I'm simply wrong and the real-world impact would
depend on the ABI).

Regardless, switching it back to enum is obviously the right thing to do
here; the switch to "int" was simply a mistake.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-19 21:17:32 -07:00
3cf978718f Merge branch 'l10n-de-2.42' of github.com:ralfth/git
* 'l10n-de-2.42' of github.com:ralfth/git:
  l10n: Update German translation
2023-08-19 21:09:31 +08:00
6fb0e532d5 Merge branch 'catalan' of github.com:Softcatala/git-po
* 'catalan' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2023-08-19 21:08:22 +08:00
d731a52e4d Merge branch 'update-uk-l10n' of github.com:arkid15r/git-ukrainian-l10n
* 'update-uk-l10n' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: uk: update translation (2.42.0)
2023-08-19 21:07:47 +08:00
c04e26683f Merge branch 'tl/zh_CN_2.42.0_rnd1' of github.com:dyrone/git
* 'tl/zh_CN_2.42.0_rnd1' of github.com:dyrone/git:
  l10n: zh_CN: 2.42.0 round 2
  l10n: zh_CN: v2.42.0 round 1
2023-08-19 21:07:03 +08:00
7fdd36c22b Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation 5549t0f0u
2023-08-19 21:05:00 +08:00
d0d403b8bc Merge branch 'l10n-tr' of github.com:bitigchi/git-po
* 'l10n-tr' of github.com:bitigchi/git-po:
  l10n: tr: git 2.42.0
2023-08-19 21:04:09 +08:00
5626558e63 t4040: remove test that succeeded for a wrong reason
"diff-tree -b --exit-code" without "--patch" exits with 0 status,
not because it finds that the two input files are equivalent while
ignoring whitespaces, but because the implied "--raw" mode always
exits with 0 when whitespace tweaking options like "-b" and "-w"
are given, which is a long-standing bug.

We are about to fix it so that "--raw" and friends report the
differences with the exit status (even though they ignore the
whitespace tweaking options when producing their output), which
will make this test, which succeeded for a wrong reason, start
failing.  Remove it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18 17:01:12 -07:00
e8efd86369 diff: teach "--stat -w --exit-code" to notice differences
When options like "-w" is used while "--exit-code" option is in
effect, instead of the usual "do we have any filepair whose preimage
and postimage have different <mode,object>?" check, we need to compare
the contents of the blobs, taking into account that certain changes
are considered no-op.

With the previous step, we taught "--patch" codepath to set the
.found_changes bit correctly, even for a change that only affects
the mode and not object.  The "--stat" codepath, however, did not
set the .found_changes bit at all.  This lead to

    $ git diff --stat -w --exit-code

for a change that does have an output to exit with status 0.

Set the bit by inspecting the list of paths the diffstat output is
given for (a mode-only change will still appear as a "0-line added
0-line deleted" change) to fix it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18 17:01:11 -07:00
c9a3e724cf diff: mode-only change should be noticed by "--patch -w --exit-code"
The codepath to notice the content-level changes, taking certain
no-op changes like "ignore whitespace" into account, forgot that
a mode-only change is still a change.  This resulted in

    $ git diff --patch --exit-code -w

to exit with status 0 even when there is such a mode-only change,
breaking both "--patch" and "--quiet" output formats.

Teach the builtin_diff() codepath that creation and deletion as well
as mode changes are all interesting changes.

Note that the test specifically checks removal of an empty file,
because if there is anything in the preimage (i.e. the removed file
is not empty), the removal would still trigger textual patch output
and the codepath for that does update .found_changes bit to report
that it found an interesting change.  We need to make sure that the
.found_changes bit is set even without triggering textual patch
output.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18 17:01:11 -07:00
5f107caed7 diff: move the fallback "--exit-code" code down
When "--exit-code" is asked and the code cannot just answer by
comparing the object names on both sides but needs to inspect and
compare the contents, there are two ways that the result is found
out.

Some output modes, like "--stat" and "--patch", inherently have to
inspect the contents in order to show the differences in the way
they do.  The codepaths for these modes set the .found_changes bit
as they compute what to show.

However, other output modes do not need to inspect the contents to
show the differences in the way they do.  The most notable example
is "--quiet", which does not need to compute any output to show.
When they are asked to report "--exit-code", they run the codepaths
for the "--patch" output with their output redirected to "/dev/null",
only to set the .found_changes bit.

Currently, this fallback invocation of "--patch" output is done
after the "--stat" output format and its friends and before the
"--patch" and internal callback logic.  Move it to the end of
the sequence to clarify the fallback status of this code block.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18 17:01:11 -07:00
9441efe212 l10n: zh_CN: 2.42.0 round 2
Signed-off-by: Teng Long <dyroneteng@gmail.com>
2023-08-18 19:30:03 +08:00
bb9c886334 l10n: zh_CN: v2.42.0 round 1
Signed-off-by: Teng Long <dyroneteng@gmail.com>
2023-08-18 19:29:01 +08:00
f9972720e9 Merge branch 'ps/revision-stdin-with-options'
Typofix to documentation added during this cycle.

* ps/revision-stdin-with-options:
  rev-list-options: fix typo in `--stdin` documentation
2023-08-17 15:50:05 -07:00
62ce3dcd67 Merge branch 'sa/doc-ls-remote'
Mark-up fix to documentation added during this cycle.

* sa/doc-ls-remote:
  show-ref doc: fix carets in monospace
2023-08-17 15:50:05 -07:00
fa43131a09 Merge branch 'tl/notes-separator'
Typo/grammofix to documentation added during this cycle.

* tl/notes-separator:
  notes doc: tidy up `--no-stripspace` paragraph
  notes doc: split up run-on sentences
2023-08-17 15:50:05 -07:00
a1d7c65007 l10n: Update German translation
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2023-08-17 17:00:36 +02:00
c81f1a1676 rev-list-options: fix typo in --stdin documentation
With `--stdin`, we read *from* standard input, not *for*.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-16 11:42:54 -07:00
18c4aac0dd show-ref doc: fix carets in monospace
When commit 00bf685975 (show-ref doc: update for internal consistency,
2023-05-19) switched from double quotes to backticks around our {caret}
macro, we started rendering "{caret}" literally. Fix this by replacing
by a "^" character.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-16 11:40:10 -07:00
3a6e1ad80b notes doc: tidy up --no-stripspace paragraph
Where we document the `--no-stripspace` option, remove a superfluous
"For" to fix the grammar. Mark option names and command names using
`backticks` to set them in monospace.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-16 11:37:25 -07:00
95b6ae9d74 notes doc: split up run-on sentences
When commit c4e2aa7d45 (notes.c: introduce "--[no-]stripspace" option,
2023-05-27) mentioned the new `--no-stripspace` in the documentation for
`-m` and `-F`, it created run-on sentences. It also used slightly
different language in the two sections for no apparent reason. Split the
sentences in two to improve readability, and while touching the two
sites, make them more similar.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-16 11:36:36 -07:00
f8a7795b7a l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2023-08-16 18:25:02 +02:00
5f33a843de upload-pack: fix exit code when denying fetch of unreachable object ID
In 7ba7c52d76 (upload-pack: fix race condition in error messages,
2023-08-10), we have fixed a race in t5516-fetch-push.sh where sometimes
error messages got intermingled. This was done by splitting up the call
to `die()` such that we print the error message before writing to the
remote side, followed by a call to `exit(1)` afterwards.

This causes a subtle regression though as `die()` causes us to exit with
exit code 128, whereas we now call `exit(1)`. It's not really clear
whether we want to guarantee any specific error code in this case, and
neither do we document anything like that. But on the other hand, it
seems rather clear that this is an unintended side effect of the change
given that this change in behaviour was not mentioned at all.

Restore the status-quo by exiting with 128.  The test in t5703 to
ensure that "git fetch" fails by using test_must_fail, which does
not care between exiting 1 and 128, so this changes will not affect
any test.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-16 09:17:46 -07:00
d9dec13dde l10n: tr: git 2.42.0
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2023-08-16 14:40:44 +03:00
87afb88801 l10n: fr v2.42.0 rnd 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2023-08-16 11:50:23 +02:00
f846e08312 l10n: fr v2.42.0 rnd 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2023-08-16 11:48:16 +02:00
b90a4a25e6 l10n: sv.po: Update Swedish translation 5549t0f0u
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2023-08-16 07:42:51 +01:00
5bf602fc3b l10n: uk: update translation (2.42.0)
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2023-08-15 18:55:13 -07:00
62a26b36bd Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (34 commits)
  Git 2.42-rc2
  t4053: avoid writing to unopened pipe
  t4053: avoid race when killing background processes
  Git 2.42-rc1
  git maintenance: avoid console window in scheduled tasks on Windows
  win32: add a helper to run `git.exe` without a foreground window
  t9001: remove excessive GIT_SEND_EMAIL_NOTTY=1
  mv: handle lstat() failure correctly
  parse-options: disallow negating OPTION_SET_INT 0
  repack: free geometry struct
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
  t0040: declare non-tab indentation to be okay in this script
  advice: handle "rebase" in error_resolve_conflict()
  A few more topics before -rc1
  mailmap: change primary address for Glen Choo
  gitignore: ignore clangd .cache directory
  docs: update when `git bisect visualize` uses `gitk`
  compat/mingw: implement a native locate_in_PATH()
  run-command: conditionally define locate_in_PATH()
  ...
2023-08-16 07:24:56 +08:00
f1ed9d7dc0 Git 2.42-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-15 10:20:02 -07:00
f9fe84b5a2 Merge branch 'pw/diff-no-index-from-named-pipes'
Test updates.

* pw/diff-no-index-from-named-pipes:
  t4053: avoid writing to unopened pipe
  t4053: avoid race when killing background processes
2023-08-15 10:19:47 -07:00
8e12aaa7ce Merge branch 'st/mv-lstat-fix'
Correct use of lstat() that assumed a failing call would not
clobber the statbuf.

* st/mv-lstat-fix:
  mv: handle lstat() failure correctly
2023-08-15 10:19:47 -07:00
cecd6a5ffc Merge branch 'jc/send-email-pre-process-fix'
Test fix.

* jc/send-email-pre-process-fix:
  t9001: remove excessive GIT_SEND_EMAIL_NOTTY=1
2023-08-15 10:19:47 -07:00
32f4fa8d3b Merge branch 'ds/maintenance-on-windows-fix'
Windows updates.

* ds/maintenance-on-windows-fix:
  git maintenance: avoid console window in scheduled tasks on Windows
  win32: add a helper to run `git.exe` without a foreground window
2023-08-15 10:19:47 -07:00
fc6bba66bc Merge branch 'js/allow-t4000-to-be-indented-with-spaces'
File attribute update.

* js/allow-t4000-to-be-indented-with-spaces:
  t0040: declare non-tab indentation to be okay in this script
2023-08-14 13:26:41 -07:00
fc71d024ad Merge branch 'jk/send-email-with-new-readline'
Adjust to newer Term::ReadLine to prevent it from breaking
the interactive prompt code in send-email.

* jk/send-email-with-new-readline:
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2023-08-14 13:26:41 -07:00
6df312ad31 Merge branch 'jk/repack-leakfix'
Leakfix.

* jk/repack-leakfix:
  repack: free geometry struct
2023-08-14 13:26:40 -07:00
aea6c0531c Merge branch 'rs/parse-opt-forbid-set-int-0-without-noneg'
Developer support to detect meaningless combination of options.

* rs/parse-opt-forbid-set-int-0-without-noneg:
  parse-options: disallow negating OPTION_SET_INT 0
2023-08-14 13:26:40 -07:00
f12cb5052d Merge branch 'ob/rebase-conflict-advice-i18n-fix'
i18n coverage improvement and avoidance of sentence lego.

* ob/rebase-conflict-advice-i18n-fix:
  advice: handle "rebase" in error_resolve_conflict()
2023-08-14 13:26:40 -07:00
fdc9914c28 builtin/worktree.c: fix typo in "forgot fetch" msg
Replace misspelled word "overide" with correctly spelled "override".

Reported-By: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-13 16:35:37 -07:00
b46d806ea5 t9001: fix indentation in test_no_confirm()
The continuations of the compound command were indented as if they were
continuations of the embedded pipe, which was misleading.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-13 16:32:28 -07:00
e5cb1e3f09 t4053: avoid writing to unopened pipe
This fixes an occasional hang I see when running t4053 with
--verbose-log using dash.

Commit 1e3f26542a (diff --no-index: support reading from named pipes,
2023-07-05) added a test that "diff --no-index" will complain when
comparing a named pipe and a directory. The minimum we need to test this
is to mkfifo the pipe, and then run "git diff --no-index pipe some_dir".
But the test does one thing more: it spawns a background shell process
that opens the pipe for writing, like this:

        {
                (>pipe) &
        } &&

This extra writer _could_ be useful if Git misbehaves and tries to open
the pipe for reading. Without the writer, Git would block indefinitely
and the test would never end. But since we do not have such a bug, Git
does not open the pipe and it is the writing process which will block
indefinitely, since there are no readers. The test addresses this by
running "kill $!" in a test_when_finished block. Since the writer should
be blocking forever, this kill command will reliably find it waiting.

However, this seems to be somewhat racy, in that the writing process
sometimes hangs around even after the "kill". In a normal run of the
test script without options, this doesn't have any effect; the
main test script completes anyway. But with --verbose-log, we spawn a
"tee" process that reads the script output, and it won't end until all
descriptors pointing to its input pipe are closed. And the background
process that is hanging around still has its stderr, etc, pointed into
that pipe.

You can reproduce the situation like this:

  cd t
  ./t4053-diff-no-index.sh --verbose-log --stress

Let that run for a few minutes, and then you'll find that some of the
runs have hung. For example, at 11:53, I ran:

  $ ps xk start o pid,start,command | grep tee | head
   713459 11:48:06 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-9.out
   713527 11:48:06 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-15.out
   719434 11:48:07 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-1.out
   728117 11:48:08 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-5.out
   738738 11:48:09 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-31.out
   739457 11:48:09 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-27.out
   744432 11:48:10 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-21.out
   744471 11:48:10 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-29.out
   761961 11:48:12 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-0.out
   812299 11:48:19 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-8.out

All of these have been hung for several minutes. We can investigate one
and see that it's waiting to get EOF on its input:

  $ strace -p 713459
  strace: Process 713459 attached
  read(0,
  ^C

Who else has that descriptor open?

  $ lsof -a -p 713459 -d 0 +E
  COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
  tee     713459 peff    0r  FIFO   0,13      0t0 3943636 pipe 719203,sh,5w 719203,sh,7w 719203,sh,12w 719203,sh,13w
  sh      719203 peff    5w  FIFO   0,13      0t0 3943636 pipe 713459,tee,0r 719203,sh,7w 719203,sh,12w 719203,sh,13w
  sh      719203 peff    7w  FIFO   0,13      0t0 3943636 pipe 713459,tee,0r 719203,sh,5w 719203,sh,12w 719203,sh,13w
  sh      719203 peff   12w  FIFO   0,13      0t0 3943636 pipe 713459,tee,0r 719203,sh,5w 719203,sh,7w 719203,sh,13w
  sh      719203 peff   13w  FIFO   0,13      0t0 3943636 pipe 713459,tee,0r 719203,sh,5w 719203,sh,7w 719203,sh,12w

It's a shell, presumably a subshell spawned by the main script. Though
it may seem odd, having the same descriptor open several times is not
unreasonable (they're all basically the original stdout/stderr of the
script that has been copied). And they should all close when the process
exits. So what's it doing? Curiously, it will exit as soon as we strace
it:

  $ strace -s 64 -p 719203
  strace: Process 719203 attached
  openat(AT_FDCWD, "pipe", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOENT (No such file or directory)
  write(2, "./t4053-diff-no-index.sh: 7: eval: ", 35) = 35
  write(2, "cannot create pipe: Directory nonexistent", 41) = 41
  write(2, "\n", 1)                       = 1
  exit_group(2)                           = ?
  +++ exited with 2 +++

I think what happens is this:

  - it is blocking in the openat() call for the pipe, as we expect (so
    this is definitely the backgrounded subshell mentioned above)

  - strace sends signals (probably STOP/CONT); those cause the kernel to
    stop blocking, but libc will restart the system call automatically

  - by this time, the "pipe" fifo is gone, so we'll actually try to
    create a regular file. But of course the surrounding directory is
    gone, too! So we get ENOENT, and then exit as normal.

So the blocking is something we expect to happen. But what we didn't
expect is for the process to still exist at all! It should have been
killed earlier when the parent process called "kill", but it wasn't. And
we can't catch the race at this point, because it happened much earlier.

One can guess, though, that there is some race with the shell setting up
the signal handling in the backgrounded subshell, and possibly blocking
or ignoring signals at the time that the "kill" is received.  Curiously,
the race does not seem to happen if I use "bash" instead of "dash", so
presumably bash's setup here is more atomic.

One fix might be to try killing the subshell more aggressively, either
using SIGKILL, or looping on kill/wait. But that seems complex and
likely to introduce new problems/races. Instead, we can observe that the
writer is not needed at all. Git will notice the pipe via stat() before
it is ever opened. So we can simply drop the writer subshell entirely.

If we ever changed Git to open the path and fstat() it, this would
result in the test hanging. But we're not likely to do that. After all,
we have to stat() paths to see if they are openable at all (e.g., it
could be a directory), so this seems like a low risk. And anybody who
does make such a change will immediately see the issue, as Git would
hang consistently.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-13 16:30:36 -07:00
c2cbefc510 mv: fix error for moving directory to another
If both directories D1 and D2 already exists, and further there is a
filesystem entity D2/D1, "git mv D1 D2" would fail, and we get an
error message that says:

    "cannot move directory over file, source=D1, destination=D2/D1"

regardless of the type of existing "D2/D1".  If it is a file, the
message is correct, but if it is a directory, it is not (we could
make the D2/D1 directory a union of its original contents and what
was in D1/, but that is not what we do).

The code that decies to issue the error message only checks for
existence of "D2/D1" and does not care what kind of thing sits at
the path.

Rephrase the message to say

    "destination already exists, source=D1, destination=D2/D1"

that would be suitable for any kind of thing being in the way.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-11 18:16:48 -07:00
f9815878c1 check-attr: integrate with sparse-index
Set the requires-full-index to false for "check-attr".

Add a test to ensure that the index is not expanded whether the files
are outside or inside the sparse-checkout cone when the sparse index is
enabled.

The `p2000` tests demonstrate a ~63% execution time reduction for
'git check-attr' using a sparse index.

Test                                            before  after
-----------------------------------------------------------------------
2000.106: git check-attr -a f2/f4/a (full-v3)    0.05   0.05 +0.0%
2000.107: git check-attr -a f2/f4/a (full-v4)    0.05   0.05 +0.0%
2000.108: git check-attr -a f2/f4/a (sparse-v3)  0.04   0.02 -50.0%
2000.109: git check-attr -a f2/f4/a (sparse-v4)  0.04   0.01 -75.0%

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-11 09:44:52 -07:00
4723ae1007 attr.c: read attributes in a sparse directory
Before this patch, git check-attr was unable to read the attributes from
a .gitattributes file within a sparse directory. The original comment
was operating under the assumption that users are only interested in
files or directories inside the cones. Therefore, in the original code,
in the case of a cone-mode sparse-checkout, we didn't load the
.gitattributes file.

However, this behavior can lead to missing attributes for files inside
sparse directories, causing inconsistencies in file handling.

To resolve this, revise 'git check-attr' to allow attribute reading for
files in sparse directories from the corresponding .gitattributes files:

1.Utilize path_in_cone_mode_sparse_checkout() and index_name_pos_sparse
to check if a path falls within a sparse directory.

2.If path is inside a sparse directory, employ the value of
index_name_pos_sparse() to find the sparse directory containing path and
path relative to sparse directory. Proceed to read attributes from the
tree OID of the sparse directory using read_attr_from_blob().

3.If path is not inside a sparse directory,ensure that attributes are
fetched from the index blob with read_blob_data_from_index().

Change the test 'check-attr with pathspec outside sparse definition' to
'test_expect_success' to reflect that the attributes inside a sparse
directory can now be read. Ensure that the sparse index case works
correctly for git check-attr to illustrate the successful handling of
attributes within sparse directories.

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-11 09:44:51 -07:00
fd4faf7a5d t1092: add tests for 'git check-attr'
Add tests for `git check-attr`, make sure attribute file does get read
from index when path is either inside or outside of sparse-checkout
definition.

Add a test named 'diff --check with pathspec outside sparse definition'.
It starts by disabling the trailing whitespace and space-before-tab
checks using the core. whitespace configuration option. Then, it
specifically re-enables the trailing whitespace check for a file located
in a sparse directory by adding a whitespace=trailing-space rule to the
.gitattributes file within that directory. Next, create and populate the
folder1 directory, and then add the .gitattributes file to the index.
Edit the contents of folder1/a, add it to the index, and proceed to
"re-sparsify" 'folder1/' with 'git sparse-checkout reapply'. Finally,
use 'git diff --check --cached' to compare the 'index vs. HEAD',
ensuring the correct application of the attribute rules even when the
file's path is outside the sparse-checkout definition.

Mark the two tests 'check-attr with pathspec outside sparse definition'
and 'diff --check with pathspec outside sparse definition' as
'test_expect_failure' to reflect an existing issue where the attributes
inside a sparse directory are ignored. Ensure that the 'check-attr'
command fails to read the required attributes to demonstrate this
expected failure.

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-11 09:44:51 -07:00
69ecfcacfd maintenance: update schedule before config
When running 'git maintenance start', the current pattern is to
configure global config settings to enable maintenance on the current
repository and set 'maintenance.auto' to false and _then_ to set up the
schedule with the system scheduler.

This has a problematic error condition: if the scheduler fails to
initialize, the repository still will not use automatic maintenance due
to the 'maintenance.auto' setting.

Fix this gap by swapping the order of operations. If Git fails to
initialize maintenance, then the config changes should never happen.

Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:17 -07:00
c97ec0378b maintenance: fix systemd schedule overlaps
The 'git maintenance run' command prevents concurrent runs in the same
repository using a 'maintenance.lock' file. However, when using systemd
the hourly maintenance runs the same time as the daily and weekly runs.
(Similarly, daily maintenance runs at the same time as weekly
maintenance.) These competing commands result in some maintenance not
actually being run.

This overlap was something we could not fix until we made the recent
change to not use the builting 'hourly', 'daily', and 'weekly' schedules
in systemd. We can adjust the schedules such that:

 1. Hourly runs avoid the 0th hour.
 2. Daily runs avoid Monday.

This will keep maintenance runs from colliding when using systemd.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:17 -07:00
daa787010c maintenance: use random minute in systemd scheduler
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the systemd integration.

This integration is more complicated than similar changes for other
schedulers because of a neat trick that systemd allows: templating.

The previous implementation generated two template files with names
of the form 'git-maintenance@.(timer|service)'. The '.timer' or
'.service' indicates that this is a template that is picked up when we
later specify '...@<schedule>.timer' or '...@<schedule>.service'. The
'<schedule>' string is then used to insert into the template both the
'OnCalendar' schedule setting and the '--schedule' parameter of the
'git maintenance run' command.

In order to set these schedules to a given minute, we can no longer use
the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
need to abandon the template model for the .timer files. We can still
use templates for the .service files. For this reason, we split these
writes into two methods.

Modify the template with a custom schedule in the 'OnCalendar' setting.
This schedule has some interesting differences from cron-like patterns,
but is relatively easy to figure out from context. The one that might be
confusing is that '*-*-*' is a date-based pattern, but this must be
omitted when using 'Mon' to signal that we care about the day of the
week. Monday is used since that matches the day used for the 'weekly'
schedule used previously.

Now that the timer files are not templates, we might want to abandon the
'@' symbol in the file names. However, this would cause users with
existing schedules to get two competing schedules due to different
names. The work to remove the old schedule name is one thing that we can
avoid by keeping the '@' symbol in our unit names. Since we are locked
into this name, it makes sense that we keep the template model for the
.service files.

The rest of the change involves making sure we are writing these .timer
and .service files before initializing the schedule with 'systemctl' and
deleting the files when we are done. Some changes are also made to share
the random minute along with a single computation of the execution path
of the current Git executable.

In addition, older Git versions may have written a
'git-maintenance@.timer' template file. Be sure to remove this when
successfully enabling maintenance (or disabling maintenance).

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:16 -07:00
f44d7d00e5 maintenance: swap method locations
The systemd_timer_write_unit_templates() method writes a single template
that is then used to start the hourly, daily, and weekly schedules with
systemd.

However, in order to schedule systemd maintenance on a given minute,
these templates need to be replaced with specific schedules for each of
these jobs.

Before modifying the schedules, move the writing method above the
systemd_timer_enable_unit() method, so we can write a specific schedule
for each unit.

The diff is computed smaller by showing systemd_timer_enable_unit() and
systemd_timer_delete_units()  move instead of
systemd_timer_write_unit_templates() and
systemd_timer_delete_unit_templates().

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:16 -07:00
9b43399057 maintenance: use random minute in cron scheduler
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the cron integration.

The cron schedule specification starts with a minute indicator, which
was previously inserted as the "0" string but now takes the given minute
as an integer parameter.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:16 -07:00
62a239987c maintenance: use random minute in Windows scheduler
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the Windows scheduler integration.

We need only to modify the minute value for the 'StartBoundary' tag
across the three schedules.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:16 -07:00
ec5d9d684c maintenance: use random minute in launchctl scheduler
The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Use get_random_minute() when constructing the schedules for launchctl.

The format already includes a 'Minute' key which is modified from 0 to
the random minute.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:16 -07:00
89024a0ab0 maintenance: add get_random_minute()
When we initially created background maintenance -- with its hourly,
daily, and weekly schedules -- we considered the effects of all clients
launching fetches to the server every hour on the hour. The worry of
DDoSing server hosts was noted, but left as something we would consider
for a future update.

As background maintenance has gained more adoption over the past three
years, our worries about DDoSing the big Git hosts has been unfounded.
Those systems, especially those serving public repositories, are already
resilient to thundering herds of much smaller scale.

However, sometimes organizations spin up specific custom server
infrastructure either in addition to or on top of their Git host. Some
of these technologies are built for a different range of scale, and can
hit concurrency limits sooner. Organizations with such custom
infrastructures are more likely to recommend tools like `scalar` which
furthers their adoption of background maintenance.

To help solve for this, create get_random_minute() as a method to help
Git select a random minute when creating schedules in the future. The
integrations with this method do not yet exist, but will follow in
future changes.

To avoid multiple sources of randomness in the Git codebase, create a
new helper function, git_rand(), that returns a random uint32_t. This is
similar to how rand() returns a random nonnegative value, except it is
based on csprng_bytes() which is cryptographic and will return values
larger than RAND_MAX.

One thing that is important for testability is that we notice when we
are under a test scenario and return a predictable result. The schedules
themselves are not checked for this value, but at least one launchctl
test checks that we do not unnecessarily reboot the schedule if it has
not changed from a previous version.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 14:04:16 -07:00
ac300bda10 rebase: allow overriding the maximal length of the generated labels
With this change, users can override the compiled-in default for the
maximal length of the label names generated by `git rebase
--rebase-merges`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Mark Ruvald Pedersen <mped@demant.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 10:12:31 -07:00
7481d2bfca sequencer: truncate labels to accommodate loose refs
Some commits may have unusually long subject lines. When those subject
lines are used as labels in the `--rebase-merges` mode of `git rebase`,
they can cause errors when writing the corresponding loose refs because
most file systems have a maximal file name length of 255 (`NAME_MAX`).
The symptom looks like this:

	$ git rebase --continue
	error: cannot lock ref 'refs/rewritten/SANITIZED-SUBJECT': Unable to create '.git/refs/rewritten/SANITIZED-SUBJECT.lock': File name too long - where SANITIZED-SUBJECT is very long

Let's accommodate this situation by truncating the labels.

Care must be taken in case the subject line contains multi-byte
characters so as not to truncate in the middle of a character.

Signed-off-by: Mark Ruvald Pedersen <mped@demant.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 10:12:29 -07:00
20a0bd45fa t/lib-rebase: improve documentation of set_fake_editor()
Firstly, make it reflect better what actually happens. Not omitting some
possibilities makes it easier to fully exploit them, and not
contradicting the implementation makes it easier to grok and thus modify
the code.

Secondly, improve the overall structure, putting more general info
further up.

Thirdly, document `merge`, `fakesha`, and `break`, which were previously
omitted entirely.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 09:18:07 -07:00
231e86c10c t4053: avoid race when killing background processes
The test 'diff --no-index reads from pipes' starts a couple of
background processes that write to the pipes that are passed to "diff
--no-index". If the test passes then we expect these processes to exit
as all their output will have been read. However if the test fails
then we want to make sure they do not hang about on the users machine
and the test remembers they should be killed by calling

      test_when_finished  "! kill $!"

after each background process is created. Unfortunately there is a
race where test_when_finished may run before the background process
exits even when all its output has been read resulting in the kill
command succeeding which causes the test to fail. Fix this by ignoring
the exit status of the kill command. If the diff is successful we
could instead wait for the background process to exit and check their
status but that feels like it is testing the platform's printf
implementation rather than git's code.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 09:16:27 -07:00
7ba7c52d76 upload-pack: fix race condition in error messages
Test t5516-fetch-push.sh has a test 'deny fetch unreachable SHA1,
allowtipsha1inwant=true' that checks stderr for a specific error
string from the remote. In some build environments the error sent
over the remote connection gets mingled with the error from the
die() statement. Since both signals are being output to the same
file descriptor (but from parent and child processes), the output
we are matching with grep gets split.

To reduce the risk of this failure, follow this process instead:

1. Write an error message to stderr.
2. Write an error message across the connection.
3. exit(1).

This reorders the events so the error is written entirely before
the client receives a message from the remote, removing the race
condition.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-10 09:15:27 -07:00
fd3ba590d8 git-push.txt: fix grammar
While working on a blog post and using grammarly it suggested this
change.

Signed-off-by: Wesley Schwengle <wesleys@opperschaap.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 21:08:10 -07:00
fac96dfbb1 Git 2.42-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 16:18:16 -07:00
e8c53ff912 Merge branch 'pw/rebase-skip-commit-message-fix'
"git rebase -i" with a series of squash/fixup, when one of the
steps stopped in conflicts and ended up getting skipped, did not
handle the accumulated commit log messages, which has been
corrected.

* pw/rebase-skip-commit-message-fix:
  rebase --skip: fix commit message clean up when skipping squash
2023-08-09 16:18:16 -07:00
8cdd5e713d Merge branch 'ma/locate-in-path-for-windows'
"git bisect visualize" stopped running "gitk" on Git for Windows
when the command was reimplemented in C around Git 2.34 timeframe.
This has been corrected.

* ma/locate-in-path-for-windows:
  docs: update when `git bisect visualize` uses `gitk`
  compat/mingw: implement a native locate_in_PATH()
  run-command: conditionally define locate_in_PATH()
2023-08-09 16:18:16 -07:00
b6e2a0c0b3 Merge branch 'bc/ignore-clangd-cache'
.gitignore update.

* bc/ignore-clangd-cache:
  gitignore: ignore clangd .cache directory
2023-08-09 16:18:15 -07:00
cf07e53bae Merge branch 'bc/ident-dot-is-no-longer-crud-letter'
Exclude "." from the set of characters to be removed from the
beginning and the end of the human-readable name.

* bc/ident-dot-is-no-longer-crud-letter:
  ident: don't consider '.' a crud
2023-08-09 16:18:15 -07:00
889c94d2a0 Merge branch 'ew/hash-with-openssl-evp'
Adjust to OpenSSL 3+, which deprecates its SHA-1 functions based on
its traditional API, by using its EVP API instead.

* ew/hash-with-openssl-evp:
  avoid SHA-1 functions deprecated in OpenSSL 3+
  sha256: avoid functions deprecated in OpenSSL 3+
2023-08-09 16:18:15 -07:00
99d51978be repack: move pack_geometry struct to the stack
The `pack_geometry` struct is used to maintain and partition a list of
packfiles into a "frozen" set (to be left alone), and a non-frozen set
(to be combined into a single new pack). In the previous commit, we
removed a leak caused by neglecting to free() the heap allocated space
used to store the structure itself.

But there is no need for this structure to live on the heap anyway.
Instead, let's move it to be stack allocated, eliminating the
possibility of a direct leak like the one addressed in the previous
patch.

The one minor hitch is that we use the NULL-ness of the pack_geometry's
struct pointer to determine whether or not we are performing a geometric
repack with `--geometric=<d>`. But since we only initialize the
pack_geometry structure when the `geometric_factor` is non-zero, we can
use that variable (based on whether or not it is equal to zero) to
determine whether or not we are performing a geometric repack.

There are a couple of spots that have access to a pointer to the
pack_geometry struct, but not the geometric_factor itself. Instead of
passing in an additional variable, let's make the geometric_factor a
field of the pack_geometry struct.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 14:31:01 -07:00
0050f8e401 git maintenance: avoid console window in scheduled tasks on Windows
We just introduced a helper to avoid showing a console window when the
scheduled task runs `git.exe`. Let's actually use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 13:58:15 -07:00
4b8a2717bb win32: add a helper to run git.exe without a foreground window
On Windows, there are two kinds of executables, console ones and
non-console ones. Git's executables are all console ones.

When launching the former e.g. in a scheduled task, a CMD window pops
up. This is not what we want for the tasks installed via the `git
maintenance` command.

To work around this, let's introduce `headless-git.exe`, which is a
non-console program that does _not_ pop up any window. All it does is to
re-launch `git.exe`, suppressing that console window, passing through
all command-line arguments as-are.

Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Helped-by: Yuyi Wang <Strawberry_Str@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 13:58:13 -07:00
82dc42cbd1 sequencer: simplify allocation of result array in todo_list_rearrange_squash()
The operation doesn't change the number of elements in the array, so we do
not need to allocate the result piecewise.

This moves the re-assignment of todo_list->alloc at the end slighly up,
so it's right after the newly added assert which also refers to `nr`
(and which indeed should come first). Also, the value is more likely to
be still in a register at that point.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 13:52:11 -07:00
b3dcd24b8a t9001: remove excessive GIT_SEND_EMAIL_NOTTY=1
This was added by 3ece9bf0f9 (send-email: clear the $message_id after
validation, 2023-05-17) for no apparent reason, as this is required only
in cases when git's stdin is (must be) redirected, which isn't the case
here.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 12:44:07 -07:00
72695d8214 mv: handle lstat() failure correctly
When moving a directory onto another with `git mv` various checks are
performed. One of of these validates that the destination is not existing.

When calling `lstat` on the destination path and it fails as the path
doesn't exist, some environments seem to overwrite the passed  in
`stat` memory nonetheless (I observed this issue on debian 12 of x86_64,
running on OrbStack on ARM, emulated with Rosetta).

This would affect the code that followed as it would still acccess a now
modified `st` structure, which now seems to contain uninitialized memory.
`S_ISDIR(st_dir_mode)` would then typically return false causing the code
to run into a bad case.

The fix avoids overwriting the existing `st` structure, providing an
alternative that exists only for that purpose.

Note that this patch minimizes complexity instead of stack-frame size.

Signed-off-by: Sebastian Thiel <sebastian.thiel@icloud.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-09 11:46:12 -07:00
2a499264d3 branch: error message checking out a branch in use
Let's update the error message we show when the user tries to check out
a branch which is being used in another worktree, following the
guideline reasoned in 4970bedef2 (branch: update the message to refuse
touching a branch in-use, 2023-07-21).

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-08 18:27:30 -07:00
3284b93862 parse-options: disallow negating OPTION_SET_INT 0
An option of type OPTION_SET_INT can be defined to set its variable to
zero.  It's negated variant will do the same, though, which is
confusing.  Several such options were fixed by disabling negation,
changing the value to set or using a different option type:

991c552916 (ls-tree: fix --no-full-name, 2023-07-18)
e12cb98e1e (branch: reject "--no-all" and "--no-remotes" early, 2023-07-18)
68cbb20e73 (show-branch: reject --[no-](topo|date)-order, 2023-07-19)
3821eb6c3d (reset: reject --no-(mixed|soft|hard|merge|keep) option, 2023-07-19)
36f76d2a25 (pack-objects: fix --no-quiet, 2023-07-21)
3a5f308741 (pack-objects: fix --no-keep-true-parents, 2023-07-21)
c95ae3ff9c (describe: fix --no-exact-match, 2023-07-21)
d089a06421 (bundle: use OPT_PASSTHRU_ARGV, 2023-07-29)

Check for such options that allow negation in parse_options_check() and
report them to find future cases quicker.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-08 16:55:07 -07:00
cb888bb699 repack: free geometry struct
When the program is ending, we call clear_pack_geometry() to free any
resources in the pack_geometry struct. But the struct itself is
allocated on the heap, and leak-checkers will complain about the
resulting small leak.

This one was marked by Coverity as a "new" leak, though it has existed
since 0fabafd0b9 (builtin/repack.c: add '--geometric' option,
2021-02-22). This might be because recent unrelated changes in the file
confused it about what is new and what is not. But regardless, it is
worth addressing.

We can fix it easily by free-ing the struct. We'll convert our "clear"
function to "free", since the allocation happens in the matching init()
function (though since there is only one call to each, and the struct is
local to this file, it's mostly academic).

Another option would be to put the struct on the stack rather than the
heap. However, this gets tricky, as we check the pointer against NULL in
several places to decide whether we're in geometric mode.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-08 16:49:10 -07:00
c016726c2d send-email: avoid creating more than one Term::ReadLine object
Every time git-send-email calls its ask() function to prompt the user,
we call term(), which instantiates a new Term::ReadLine object. But in
v1.46 of Term::ReadLine::Gnu (which provides the Term::ReadLine
interface on some platforms), its constructor refuses to create a second
instance[1]. So on systems with that version of the module, most
git-send-email instances will fail (as we usually prompt for both "to"
and "in-reply-to" unless the user provided them on the command line).

We can fix this by keeping a single instance variable and returning it
for each call to term(). In perl 5.10 and up, we could do that with a
"state" variable. But since we only require 5.008, we'll do it the
old-fashioned way, with a lexical "my" in its own scope.

Note that the tests in t9001 detect this problem as-is, since the
failure mode is for the program to die. But let's also beef up the
"Prompting works" test to check that it correctly handles multiple
inputs (if we had chosen to keep our FakeTerm hack in the previous
commit, then the failure mode would be incorrectly ignoring prompts
after the first).

[1] For discussion of why multiple instances are forbidden, see:
    https://github.com/hirooih/perl-trg/issues/16

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-08 16:48:17 -07:00
dfd46bae92 send-email: drop FakeTerm hack
Back in 280242d1cc (send-email: do not barf when Term::ReadLine does not
like your terminal, 2006-07-02), we added a fallback for when
Term::ReadLine's constructor failed: we'd have a FakeTerm object
instead, which would then die if anybody actually tried to call
readline() on it. Since we instantiated the $term variable at program
startup, we needed this workaround to let the program run in modes when
we did not prompt the user.

But later, in f4dc9432fd (send-email: lazily load modules for a big
speedup, 2021-05-28), we started loading Term::ReadLine lazily only when
ask() is called. So at that point we know we're trying to prompt the
user, and we can just die if ReadLine instantiation fails, rather than
making this fake object to lazily delay showing the error.

This should be OK even if there is no tty (e.g., we're in a cron job),
because Term::ReadLine will return a stub object in that case whose "IN"
and "OUT" functions return undef. And since 5906f54e47 (send-email:
don't attempt to prompt if tty is closed, 2009-03-31), we check for that
case and skip prompting.

And we can be sure that FakeTerm was not kicking in for such a
situation, because it has actually been broken since that commit! It
does not define "IN" or "OUT" methods, so perl would barf with an error.
If FakeTerm was in use, we were neither honoring what 5906f54e47 tried
to do, nor producing the readable message that 280242d1cc intended.

So we're better off just dropping FakeTerm entirely, and letting the
error reported by constructing Term::ReadLine through.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-08 16:48:15 -07:00
12009a182b t0040: declare non-tab indentation to be okay in this script
By necessity, this script needs to verify that certain Git output
matches expectations, including text indented with spaces instead of
tabs.

Most recently, such a check was introduced in 448abbba63 (short help:
allow multi-line opthelp, 2023-07-18) which is reported by `git diff
--check 448abbba6347^!` as having whitespace issues.

Let's not complain about this because it is intentional.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-08 16:47:26 -07:00
92edf61870 branch: error message deleting a branch in use
Let's update the error message we show when the user tries to delete a
branch which is being used in another worktree, following the guideline
reasoned in 4970bedef2 (branch: update the message to refuse touching a
branch in-use, 2023-07-21).

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-07 14:22:11 -07:00
ff29a61cbb advice: handle "rebase" in error_resolve_conflict()
This makes sure that we get a properly translated message rather than
inserting the command (which we failed to translate) into a generic
fallback message.

The function is called indirectly via die_resolve_conflict() with fixed
strings, and directly with the string obtained via action_name(), which
in turn returns a string from a fixed set. Hence we know that the now
covered set of strings is exhausitive, and will therefore BUG() out when
encountering an unexpected string. We also know that all covered strings
are actually used.

Arguably, the above suggests that it would be cleaner to pass the
command as an enum in the first place, but that's left for another time.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-07 13:21:00 -07:00
010a0b62e0 t/lib-rebase: set_fake_editor(): handle FAKE_LINES more consistently
Default next action after 'fakesha' to preserving the command instead
of forcing 'pick', consistently with other "instant-effect" keywords.
There is no reason why one would want that inconsistency, so this was
clearly just an oversight in commit 5dcdd740 ("t/lib-rebase: prepare
for testing `git rebase --rebase-merges`"). Rectifying it makes the
behavior easier to reason about and document.

This would affect hypothetical "fakesha <n>" sequences where line <n>
already isn't a pick, which currently don't appear.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-07 12:09:34 -07:00
1cc462446e t/lib-rebase: set_fake_editor(): fix recognition of reset's short command
... in FAKE_LINES.

This has been broken ever since it was introduced in 5dcdd7409a
(t/lib-rebase: prepare for testing `git rebase --rebase-merges`,
2019-07-31), but it's not actually used, so it's a cosmetic defect
only.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-07 12:09:33 -07:00
a82fb66fed A few more topics before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-07 11:58:17 -07:00
1221e94bd0 mailmap: change primary address for Glen Choo
Glen will lose access to his work email soon.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-07 11:58:17 -07:00
b2797581d0 Merge branch 'ew/sha256-gcrypt-leak-fixes'
Leakfixes.

* ew/sha256-gcrypt-leak-fixes:
  sha256/gcrypt: die on gcry_md_open failures
  sha256/gcrypt: fix memory leak with SHA-256 repos
  sha256/gcrypt: fix build with SANITIZE=leak
2023-08-07 11:57:18 -07:00
a04cef9fd7 Merge branch 'rs/bundle-parseopt-cleanup'
Code clean-up.

* rs/bundle-parseopt-cleanup:
  bundle: use OPT_PASSTHRU_ARGV
2023-08-07 11:57:18 -07:00
e48d9c78cc Merge branch 'am/doc-sha256'
Tone down the warning on SHA-256 repositories being an experimental
curiosity.  We do not have support for them to interoperate with
traditional SHA-1 repositories, but at this point, we do not plan
to make breaking changes to SHA-256 repositories and there is no
longer need for such a strongly phrased warning.

* am/doc-sha256:
  doc: sha256 is no longer experimental
2023-08-07 11:57:18 -07:00
dee27be905 Merge branch 'tb/commit-graph-tests'
Test updates.

* tb/commit-graph-tests:
  t/lib-commit-graph.sh: avoid sub-shell in `graph_git_behavior()`
  t5328: avoid top-level directory changes
  t5318: avoid top-level directory changes
  t/lib-commit-graph.sh: avoid directory change in `graph_git_behavior()`
  t/lib-commit-graph.sh: allow `graph_read_expect()` in sub-directories
2023-08-07 11:57:18 -07:00
311c8ff11c parse-options: simplify usage_padding()
c512643e67 (short help: allow a gap smaller than USAGE_GAP, 2023-07-18)
effectively did away with the two-space gap between options and their
description; one space is enough now.  Incorporate USAGE_GAP into
USAGE_OPTS_WIDTH, merge the two cases with enough space on the line and
incorporate the newline into the format for the remaining case.  The
output remains the same.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:51 -07:00
2a409a1d12 parse-options: no --[no-]no-...
Avoid showing an optional "no-" for options that already start with a
"no-" in the short help, as that double negation is confusing.  Document
the opposite variant on its own line with a generated help text instead,
unless it's defined and documented explicitly already.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:51 -07:00
652a6b15bc parse-options: factor out usage_indent() and usage_padding()
Extract functions for printing spaces before and after options.  We'll
need them in the next commit.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:50 -07:00
e8e5d294dc parse-options: show negatability of options in short help
Add a "[no-]" prefix to options without the flag PARSE_OPT_NONEG to
document the fact that you can negate them.

This looks a bit strange for options that already start with "no-", e.g.
for the option --no-name of git show-branch:

    --[no-]no-name        suppress naming strings

You can actually use --no-no-name as an alias of --name, so the short
help is not wrong.  If we strip off any of the "no-"s, we lose either
the ability to see if the remaining one belongs to the documented
variant or to see if it can be negated.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:50 -07:00
d5dc68f730 t1502: test option negation
Add tests for checking the "git rev-parse --parseopt" flag "!" and
whether options can be negated with a "no-" prefix.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:50 -07:00
8dcb49021e t1502: move optionspec help output to a file
"git rev-parse --parseopt" shows the short help with its description of
all recognized options twice: When called with -h or --help, and after
reporting an unknown option.  Move the one for optionspec into a file
and use it in two tests to deduplicate that part.

"git rev-parse --parseopt -- --h" wraps the help text in "cat <<\EOF"
and "EOF".  Keep that part in the file to use it as is in the test that
needs it and simply remove it in the other one using sed.

Disable whitespace checking for the file using an attribute, as we need
to keep its spaces intact and wouldn't want a stray --whitespace=fix
turn them into tabs.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:50 -07:00
aa43619bdf t1502, docs: disallow --no-help
"git rev-parse --parseopt" handles the built-in options -h and --help,
but not --no-help.  Make test definitions and documentation examples
more realistic by disabling negation.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:50 -07:00
d716512870 subtree: disallow --no-{help,quiet,debug,branch,message}
"git subtree" only handles the negated variant of the options annotate,
prefix, onto, rejoin, ignore-joins and squash explicitly.  help is
handled by "git rev-parse --parseopt" implicitly, but not its negated
form.  Disable negation for it and the for the rest of the options to
get a helpful error message when trying them.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-06 17:16:49 -07:00
e600568929 l10n: po-id for 2.42 (round 1)
Update following components:

* commit-graph.c
* diff-no-index.c
* builtin/notes.c
* builtin/pack-refs.c
* builtin/worktree.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2023-08-06 19:41:06 +07:00
450f2c9e3e Merge branch 'russian-l10n' of https://github.com/DJm00n/git-po-ru
* 'russian-l10n' of https://github.com/DJm00n/git-po-ru:
  l10n: ru.po: update Russian translation
2023-08-05 23:26:39 +08:00
a5c01603b3 gitignore: ignore clangd .cache directory
In at least some versions of clangd, including version 15 in Ubuntu
23.04, a directory, .cache, is written in the root of the repository
with index information about the files in the repository.  Since clangd
is the most common language server protocol (LSP) implementation for C,
and we already support it using the GENERATE_COMPILATION_DATABASE flags
to make it functional, it's likely many users are using or will want to
use it.

As a result, ignore the ".cache" directory to help avoid users
accidentally committing the data.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-04 10:56:51 -07:00
ac83bc5054 Git 2.42-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-04 10:52:31 -07:00
65e25ae522 Merge branch 'jc/branch-in-use-error-message'
"git branch -f X" to repoint the branch X said that X was "checked
out" in another worktree, even when branch X was not and instead
being bisected or rebased.  The message was reworded to say the
branch was "in use".

* jc/branch-in-use-error-message:
  branch: update the message to refuse touching a branch in-use
2023-08-04 10:52:31 -07:00
f4a7c24c09 Merge branch 'hy/blame-in-bare-with-contents'
"git blame --contents=file" has been taught to work in a bare
repository.

* hy/blame-in-bare-with-contents:
  blame: allow --contents to work with bare repo
2023-08-04 10:52:31 -07:00
f9712d75e6 Merge branch 'jc/parse-options-short-help'
Command line parser fix, and a small parse-options API update.

* jc/parse-options-short-help:
  short help: allow a gap smaller than USAGE_GAP
  remote: simplify "remote add --tags" help text
  short help: allow multi-line opthelp
2023-08-04 10:52:31 -07:00
23b20fff3a Merge branch 'jc/doc-sent-patch-now-what'
Process document update.

* jc/doc-sent-patch-now-what:
  MyFirstContribution: refrain from self-iterating too much
2023-08-04 10:52:31 -07:00
840affcb8d Merge branch 'la/doc-choose-starting-point-fixup'
Clarify how to pick a starting point for a new topic in the
SubmittingPatches document.

* la/doc-choose-starting-point-fixup:
  SubmittingPatches: use of older maintenance tracks is an exception
  SubmittingPatches: explain why 'next' and above are inappropriate base
  SubmittingPatches: choice of base for fixing an older maintenance track
2023-08-04 10:52:30 -07:00
a53e8a6488 Merge branch 'pv/doc-submodule-update-settings'
Rewrite the description of giving a custom command to the
submodule.<name>.update configuration variable.

* pv/doc-submodule-update-settings:
  doc: highlight that .gitmodules does not support !command
2023-08-04 10:52:30 -07:00
4d06001846 Merge branch 'ja/worktree-orphan-fix'
Fix tests with unportable regex patterns.

* ja/worktree-orphan-fix:
  t2400: rewrite regex to avoid unintentional PCRE
  builtin/worktree.c: convert tab in advice to space
  t2400: drop no-op `--sq` from rev-parse call
2023-08-04 10:52:30 -07:00
3365e2675e Merge branch 'jc/retire-get-sha1-hex'
The implementation of "get_sha1_hex()" that reads a hexadecimal
string that spells a full object name has been extended to cope
with any hash function used in the repository, but the "sha1" in
its name survived.  Rename it to get_hash_hex(), a name that is
more consistent within its friends like get_hash_hex_algop().

* jc/retire-get-sha1-hex:
  hex: retire get_sha1_hex()
2023-08-04 10:52:30 -07:00
dd68b57fc4 Merge branch 'la/doc-choose-starting-point'
Clarify how to choose the starting point for a new topic in
developer guidance document.

* la/doc-choose-starting-point:
  SubmittingPatches: simplify guidance for choosing a starting point
  SubmittingPatches: emphasize need to communicate non-default starting points
  SubmittingPatches: de-emphasize branches as starting points
  SubmittingPatches: discuss subsystems separately from git.git
  SubmittingPatches: reword awkward phrasing
2023-08-04 10:52:30 -07:00
fff1594fa7 docs: update when git bisect visualize uses gitk
This check has involved more environment variables than just `DISPLAY` since
508e84a790 (bisect view: check for MinGW32 and MacOSX in addition to X11,
2008-02-14), so let's update the documentation accordingly.

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-04 09:47:10 -07:00
2bf46a9f62 compat/mingw: implement a native locate_in_PATH()
since 5e1f28d (bisect--helper: reimplement `bisect_visualize()` shell
 function in C, 2021-09-13) `git bisect visualize` uses exists_in_PATH()
to check wether it should call `gitk`, but exists_in_PATH() relies on
locate_in_PATH() which currently only understands POSIX-ish PATH variables
(a list of paths, separated by colons) on native Windows executables
we encounter Windows PATH variables (a list of paths that often contain
drive letters (and thus colons), separated by semicolons). Luckily we do
already have a function that can lookup executables on windows PATHs:
path_lookup(). Implement a small replacement for the existing
locate_in_PATH() based on path_lookup().

Reported-by: Louis Strous <Louis.Strous@intellimagic.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-03 21:21:10 -07:00
bb532b5345 run-command: conditionally define locate_in_PATH()
This commit doesn't change any behaviour by itself, but allows us to easily
define compat replacements for locate_in_PATH(). It prepares us for the next
commit that adds a native Windows implementation of locate_in_PATH().

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-03 21:21:07 -07:00
6ce7afe163 rebase --skip: fix commit message clean up when skipping squash
During a series of "fixup" and/or "squash" commands, the interactive
rebase accumulates a commit message from all the commits that are being
squashed together. If one of the commits has conflicts when it is picked
and the user chooses to skip that commit then we need to remove that
commit's message from accumulated messages.  To do this 15ef69314d
(rebase --skip: clean up commit message after a failed fixup/squash,
2018-04-27) updated commit_staged_changes() to reset the accumulated
message to the commit message of HEAD (which does not contain the
message from the skipped commit) when the last command was "fixup" or
"squash" and there are no staged changes. Unfortunately the code to do
this contains two bugs.

(1) If parse_head() fails we pass an invalid pointer to
    unuse_commit_buffer().

(2) The reconstructed message uses the entire commit buffer from HEAD
    including the headers, rather than just the commit message.

The first issue is fixed by splitting up the "if" condition into several
statements each with its own error handling. The second issue is fixed
by finding the start of the commit message within the commit buffer
using find_commit_subject().

The existing test added by 15ef69314d is modified to show the effect of
this bug.  The bug is triggered when skipping the first command in the
chain (as the test does before this commit) but the effect is hidden
because opts->current_fixup_count is set to zero which leads
update_squash_messages() to recreate the squash message file from
scratch overwriting the bad message created by
commit_staged_changes(). The test is also updated to explicitly check
the commit messages rather than relying on grep to ensure they do not
contain any stray commit headers.

To check the commit message the function test_commit_message() is moved
from t3437-rebase-fixup-options.sh to test-lib.sh. As the function is
now publicly available it is updated to provide better error detection
and avoid overwriting the commonly used files "actual" and "expect".
Support for reading the expected commit message from stdin is also
added.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-03 13:42:54 -07:00
1c04cb0744 ident: don't consider '.' a crud
When we process a user's name (as in user.name), we strip all
leading and trailing crud from it.  Right now, we consider a dot
a crud character, and strip it off.

However, this is unsuitable for many personal names because humans
frequently have abbreviated suffixes, such as "Jr." or "Sr." at the end
of their names, and this corrupts them.  Some other users may wish to
use an abbreviated name or initial, which will pose a problem especially
in cultures that write the family name first, followed by the personal
name.

Since the current approach causes lots of practical problems, let's
avoid it by no longer considering a dot to be crud.

Note that "." in the name forces the entire name to be quoted to
please mailers, but stripping "." only at the beginning and the end
does not help a name with "." in the middle (like "brian m. carlson")
so this change will not make it much worse.  A name like "Given
Family, Jr." that did not have to be quoted now would need to be, in
order to be placed on the e-mail headers, though.

This is based on a weather-balloon patch by Jeff King sent in Aug 2021
https://lore.kernel.org/git/YSKm8Q8nyTavQaox@coredump.intra.peff.net/

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-02 09:50:52 -07:00
1b0a512956 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-02 09:37:52 -07:00
955c2b1c6a Documentation/RelNotes/2.42.0.txt: typofix
Fix a typo introduced in aa9166bcc0 (The ninth batch, 2023-07-08).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-02 09:37:52 -07:00
70e5c5dddd Merge branch 'ks/ref-filter-describe'
"git branch --list --format=<format>" and friends are taught
a new "%(describe)" placeholder.

* ks/ref-filter-describe:
  ref-filter: add new "describe" atom
  ref-filter: add multiple-option parsing functions
2023-08-02 09:37:24 -07:00
8bfb359844 Merge branch 'ah/sequencer-rewrite-todo-fix'
When the user edits "rebase -i" todo file so that it starts with a
"fixup", which would make it invalid, the command truncated the
rest of the file before giving an error and returning the control
back to the user.  Stop truncating to make it easier to correct
such a malformed todo file.

* ah/sequencer-rewrite-todo-fix:
  sequencer: finish parsing the todo list despite an invalid first line
2023-08-02 09:37:24 -07:00
52d9dc20e1 Merge branch 'bb/use-trace2-counters-for-fsync-stats'
Instead of inventing a custom counter variables for debugging,
use existing trace2 facility in the fsync customization codepath.

* bb/use-trace2-counters-for-fsync-stats:
  wrapper: use trace2 counters to collect fsync stats
2023-08-02 09:37:23 -07:00
99acb0fa54 Merge branch 'ah/autoconf-fixes'
"./configure --with-expat=no" did not work as a way to refuse use
of the expat library on a system with the library installed, which
has been corrected.

* ah/autoconf-fixes:
  configure.ac: always save NO_ICONV to config.status
  configure.ac: don't overwrite NO_CURL option
  configure.ac: don't overwrite NO_EXPAT option
2023-08-02 09:37:23 -07:00
fea92e4cac Merge branch 'jc/tree-walk-drop-base-offset'
Code simplification.

* jc/tree-walk-drop-base-offset:
  tree-walk: drop unused base_offset from do_match()
  tree-walk: lose base_offset that is never used in tree_entry_interesting
2023-08-02 09:37:23 -07:00
bda9c12073 avoid SHA-1 functions deprecated in OpenSSL 3+
OpenSSL 3+ deprecates the SHA1_Init, SHA1_Update, and SHA1_Final
functions, leading to errors when building with `DEVELOPER=1'.

Use the newer EVP_* API with OpenSSL 3+ (only) despite being more
error-prone and less efficient due to heap allocations.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-01 08:34:56 -07:00
3e440ea0ab sha256: avoid functions deprecated in OpenSSL 3+
OpenSSL 3+ deprecates the SHA256_Init, SHA256_Update, and SHA256_Final
functions, leading to errors when building with `DEVELOPER=1'.

Use the newer EVP_* API with OpenSSL 3+ despite being more
error-prone and less efficient due to heap allocations.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-01 08:34:54 -07:00
5bdedac3c7 checkout: allow "checkout -m path" to unmerge removed paths
"git checkout -m -- path" uses the unmerge_marked_index() API, whose
implementation is incapable of unresolving a path that was resolved
as removed.  Extend the unmerge_index() API function so that we can
mark the ce_flags member of the cache entries we add to the index as
unmerged, and replace use of unmerge_marked_index() with it.

Now, together with its unmerge_index_entry_at() helper function,
unmerge_marked_index() function is no longer called by anybody, and
can safely be removed.

This makes two known test failures in t2070 and t7201 to succeed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 16:16:44 -07:00
ed3789f2f0 checkout/restore: add basic tests for --merge
Even though "checkout --merge -- paths" had some tests, we never
made sure it worked to recreate the conflicted state _after_ the
resolution was recorded in the index.  Also "restore --merge" did
not even have any tests.

Currently these commands use the unmerge_marked_index() interface
that cannot handle paths that have been resolved as removal, and
tests for that case are marked with test_expect_failure; these
should eventually be fixed, but not in this patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 16:13:49 -07:00
54f98fee50 checkout/restore: refuse unmerging paths unless checking out of the index
Recreating unmerged index entries using resolve-undo data,
recreating conflicted working tree files using unmerged index
entries, and writing data out of unmerged index entries, make
sense only when we are checking paths out of the index and not when
we are checking paths out of a tree-ish.

Add an extra check to make sure "--merge" and "--ours/--theirs"
options are rejected when checking out from a tree-ish, update the
document (especially the SYNOPSIS section) to highlight that they
are incompatible, and add a few tests to make sure the combination
fails.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 16:10:54 -07:00
c0a4ae7f4e update-index: remove stale fallback code for "--unresolve"
The "update-index --unresolve" is a relatively old feature that was
introduced in Git v1.4.1 (June 2006), which predates the
resolve-undo extension introduced in Git v1.7.0 (February 2010).
The original code that was limited only to work during a merge (and
not during a rebase or a cherry-pick) has been kept as the fallback
codepath to be used as a transition measure.

By now, for more than 10 years we have stored resolve-undo extension
in the index file, and the fallback code way outlived its usefulness.

Remove it, together with two file-scope static global variables.
One of these variables is still used by surviving function, but it
does not have to be a global at all, so move it to local to that
function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 16:08:00 -07:00
35901f1c24 update-index: use unmerge_index_entry() to support removal
"update-index --unresolve" uses the unmerge_index_entry_at() that
assumes that the path to be unresolved must be in the index, which
makes it impossible to unresolve a path that was resolved as removal.

Rewrite unresolve_one() to use the unmerge_index_entry() to support
unresolving such a path.

Existing tests for "update-index --unresolve" forgot to check one
thing that tests for "checkout --merge -- paths" tested, which is to
make sure that resolve-undo record that has already been used to
recreate higher-stage index entries is removed.  Add new invocations
of "ls-files --resolve-undo" after running "update-index --unresolve"
to make sure that unresolving with update-index does remove the used
resolve-undo records.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 16:02:17 -07:00
fe83269e16 resolve-undo: allow resurrecting conflicted state that resolved to deletion
The resolve-undo index extension records up to three (mode, object
name) tuples for non-zero stages for each path that was resolved,
to be used to recreate the original conflicted state later when the
user requests.

The unmerge_index_entry_at() function uses the resolve-undo data to
do so, but it assumes that the path for which the conflicted state
needs to be recreated can be specified by the position in the
active_cache[] array.  This obviously cannot salvage the state of
conflicted paths that were resolved by removing them.  For example,
a delete-modify conflict, in which the change whose "modify" side
made is a trivial typofix, may legitimately be resolved to remove
the path, and resolve-undo extension does record the two (mode,
object name) tuples for the common ancestor version and their
version, lacking our version.  But after recording such a removal of
the path, you should be able to use resolve-undo data to recreate
the conflicted state.

Introduce a new unmerge_index_entry() helper function that takes the
path (which does not necessarily have to exist in the active_cache[]
array) and resolve-undo data, and use it to reimplement unmerge_index()
public function that is used by "git rerere".

The limited interface is still kept for now, as it is used by "git
checkout -m" and "git update-index --unmerge", but these two codepaths
will be updated to lift the assumption to allow conflicts that resolved
to deletion can be recreated.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 15:46:18 -07:00
91e07058c5 update-index: do not read HEAD and MERGE_HEAD unconditionally
When "update-index --unresolve $path" cannot find the resolve-undo
record for the path the user requested to unresolve, it stuffs the
blobs from HEAD and MERGE_HEAD to stage #2 and stage #3 as a
fallback.  For this reason, the operation does not even start unless
both "HEAD" and "MERGE_HEAD" exist.

This is suboptimal in a few ways:

 * It does not recreate stage #1.  Even though it is a correct
   design decision not to do so (because it is impossible to
   recreate in general cases, without knowing how we got there,
   including what merge strategy was used), it is much less useful
   not to have that information in the index.

 * It limits the "unresolve" operation only during a conflicted "git
   merge" and nothing else.  Other operations like "rebase",
   "cherry-pick", and "switch -m" may result in conflicts, and the
   user may want to unresolve the conflict that they incorrectly
   resolved in order to redo the resolution, but the fallback would
   not kick in.

 * Most importantly, the entire "unresolve" operation is disabled
   after a conflicted merge is committed and MERGE_HEAD is removed,
   even though the index has perfectly usable resolve-undo records.

By lazily reading the HEAD and MERGE_HEAD only when we need to go to
the fallback codepath, we will allow cases where resolve-undo
records are available (which is 100% of the time, unless the user is
reading from an index file created by Git more than 10 years ago) to
proceed even after a conflicted merge was committed, during other
mergy operations that do not use MERGE_HEAD, or after the result of
such mergy operations has been committed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 15:46:17 -07:00
8e42eb0e9a doc: sha256 is no longer experimental
Remove scary wording that basically stops people using sha256
repositories not because of interoperability issues with sha1
repositories, but from fear that their work will suddenly become
incompatible in some future version of git.

We should be clear that currently sha256 repositories will not work with
sha1 repositories but stop the scary words.

Signed-off-by: Adam Majer <adamm@zombino.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 09:11:04 -07:00
823839bda1 sha256/gcrypt: die on gcry_md_open failures
`gcry_md_open' allocates memory and must (like all allocation
functions) be checked for failure.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 08:57:24 -07:00
8b608f3fb8 sha256/gcrypt: fix memory leak with SHA-256 repos
`gcry_md_open' needs to be paired with `gcry_md_close' to ensure
resources are released.  Since our internal APIs don't have
separate close/release callbacks, sticking it into the finalization
callback seems appropriate.

Building with SANITIZE=leak and running `git fsck' on a SHA-256
repository no longer reports leaks.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 08:57:15 -07:00
b4b85e41a7 sha256/gcrypt: fix build with SANITIZE=leak
Non-static functions cause `undefined reference' errors when
building with `SANITIZE=leak' due to the lack of prototypes.
Mark all these functions as `static inline' as we do in
sha256/nettle.h to avoid the need to maintain prototypes.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 08:56:54 -07:00
d089a06421 bundle: use OPT_PASSTHRU_ARGV
"git bundle" passes the progress control options to "git pack-objects"
by parsing and then recreating them explicitly.  Simplify that process
by using OPT_PASSTHRU_ARGV instead.

This also fixes --no-quiet, which has been doing the same as --quiet
since its introduction by 79862b6b77 (bundle-create: progress output
control, 2019-11-10) because it had been defined using OPT_SET_INT with
a value of 0, which sets 0 when negated as well.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-31 08:33:53 -07:00
ee48e70a82 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-28 09:45:22 -07:00
ddcb8fd8b9 Merge branch 'rs/pack-objects-parseopt-fix'
Command line parser fix.

* rs/pack-objects-parseopt-fix:
  pack-objects: fix --no-quiet
  pack-objects: fix --no-keep-true-parents
2023-07-28 09:45:22 -07:00
3085f949bf Merge branch 'rs/describe-parseopt-fix'
Command line parser fix.

* rs/describe-parseopt-fix:
  describe: fix --no-exact-match
2023-07-28 09:45:21 -07:00
c8a33b44c5 Merge branch 'bb/trace2-comment-fix'
In-code comment fix.

* bb/trace2-comment-fix:
  trace2: fix a comment
2023-07-28 09:45:21 -07:00
010447cf09 MyFirstContribution: refrain from self-iterating too much
Finding mistakes in and improving your own patches is a good idea,
but doing so too quickly is being inconsiderate to reviewers who
have just seen the initial iteration and taking their time to review
it.  Encourage new developers to perform such a self review before
they send out their patches, not after.  After sending a patch that
they immediately found mistakes in, they are welcome to comment on
them, mentioning what and how they plan to improve them in an
updated version, before sending out their updates.

Helped-by: Torsten Bögershausen <tboegi@web.de>
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-27 17:44:07 -07:00
bfce02c22f The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-27 15:26:37 -07:00
e672bc4f76 Merge branch 'jc/parse-options-reset'
Command line parser fix.

* jc/parse-options-reset:
  reset: reject --no-(mixed|soft|hard|merge|keep) option
2023-07-27 15:26:37 -07:00
d6966f6fff Merge branch 'jc/parse-options-show-branch'
Command line parser fixes.

* jc/parse-options-show-branch:
  show-branch: reject --[no-](topo|date)-order
  show-branch: --no-sparse should give dense output
2023-07-27 15:26:37 -07:00
9562f19026 Merge branch 'jc/transport-parseopt-fix'
Command line parser fixes.

* jc/transport-parseopt-fix:
  fetch: reject --no-ipv[46]
  parse-options: introduce OPT_IPVERSION()
2023-07-27 15:26:37 -07:00
7fb1483c27 Merge branch 'jc/gitignore-doc-pattern-markup'
Doc mark-up update.

* jc/gitignore-doc-pattern-markup:
  gitignore.txt: mark up explanation of patterns consistently
2023-07-27 15:26:37 -07:00
369998df83 SubmittingPatches: use of older maintenance tracks is an exception
While we could technically fix each and every bug on top of the
commit that introduced it, it is not necessarily practical.  For
trivial and low-value bugfixes, it often is simpler and sufficient
to just fix it in the current maintenance track, leaving the bug
unfixed in the older maintenance tracks.

Demote the "use older maintenance track to fix old bugs" as a side
note, and explain that the choice is used only in exceptional cases.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-27 13:07:40 -07:00
f835de52d7 SubmittingPatches: explain why 'next' and above are inappropriate base
The 'next' branch is primarily meant to be a testing ground to make
sure that topics that are reasonably well done work well together.
Building a new work on it would mean everything that was already in
'next' must have graduated to 'master' before the new work can also
be merged to 'master', and that is why we do not encourage basing
new work on 'next'.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-27 13:06:20 -07:00
d1b72cb364 t2400: rewrite regex to avoid unintentional PCRE
Replace all cases of `\s` with ` ` as it is not part of POSIX BRE or ERE
and therefore not all versions of grep handle it.

For the same reason all cases of `\S` are replaced with `[^ ]`. It is
not an exact replacement but it is close enough for this use case.

Also, do not write `\+` in BRE and expect it to mean 1 or more;
it is a GNU extension that may not work everywhere.

Remove `.*` from the end of a pattern that is not right-anchored.

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-26 14:49:02 -07:00
7e42d4bf15 builtin/worktree.c: convert tab in advice to space
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-26 14:49:02 -07:00
9111ea1cbe t2400: drop no-op --sq from rev-parse call
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-26 14:49:02 -07:00
b4fce4b6e4 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-26 14:13:16 -07:00
9a5e3b5f47 Merge branch 'jc/branch-parseopt-fix'
Command line parser fixes.

* jc/branch-parseopt-fix:
  branch: reject "--no-all" and "--no-remotes" early
2023-07-26 14:13:15 -07:00
914a353a12 Merge branch 'jc/am-parseopt-fix'
Code simplification.

* jc/am-parseopt-fix:
  am: simplify parsing of "--[no-]keep-cr"
2023-07-26 14:13:15 -07:00
8ae477e2b4 Merge branch 'rs/ls-tree-no-full-name-fix'
Command line parser fix.

* rs/ls-tree-no-full-name-fix:
  ls-tree: fix --no-full-name
2023-07-26 14:13:15 -07:00
89672f14d5 Merge branch 'jr/gitignore-doc-example-markup'
Doc update.

* jr/gitignore-doc-example-markup:
  gitignore.txt: use backticks instead of double quotes
2023-07-26 14:13:15 -07:00
cb626f8e5c credential/wincred: erase matching creds only
The credential erase request typically includes protocol, host, username
and password.

credential-wincred erases stored credentials that match protocol,
host and username, regardless of password.

This is confusing in the case the stored password differs from that
in the request. This case can occur when multiple credential helpers are
configured.

Only erase credential if stored password matches request (or request
omits password).

This fixes test "helper (wincred) does not erase a password distinct
from input" when t0303 is run with GIT_TEST_CREDENTIAL_HELPER set to
"wincred". This test was added in aeb21ce22e (credential: avoid
erasing distinct password, 2023-06-13).

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-26 13:27:34 -07:00
7144dee3ec credential/libsecret: erase matching creds only
The credential erase request typically includes protocol, host, username
and password.

credential-libsecret erases a stored credential if it matches protocol,
host and username, regardless of password.

This is confusing in the case the stored password differs from that
in the request. This case can occur when multiple credential helpers are
configured.

Only erase credential if stored password matches request (or request
omits password).

This fixes test "helper (libsecret) does not erase a password distinct
from input" when t0303 is run with GIT_TEST_CREDENTIAL_HELPER set to
"libsecret". This test was added in aeb21ce22e (credential: avoid
erasing distinct password, 2023-06-13).

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-26 13:27:31 -07:00
37f6040764 SubmittingPatches: choice of base for fixing an older maintenance track
When working on an high-value bugfix that must be given to ancient
maintenance tracks, a starting point that is older than `maint` may
have to be chosen.

Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-26 09:39:00 -07:00
7cebc5bd78 doc: highlight that .gitmodules does not support !command
Bugfix for fc01a5d2 (submodule update documentation: don't repeat
ourselves, 2016-12-27).

The `custom command` and `none` options are described as sharing the
same limitations, but one is allowed in .gitmodules and the other is
not.

Rewrite the description for custom commands to be more precise,
and make it easier for readers to notice that custom commands cannot
be used in the .gitmodules file.

Signed-off-by: Petar Vutov <pvutov@imap.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-25 14:55:07 -07:00
a80be15292 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-25 12:05:40 -07:00
5929e66755 Merge branch 'jk/nested-points-at'
"git tag --list --points-at X" showed tags that directly refers to
object X, but did not list a tag that points at such a tag, which
has been corrected.

* jk/nested-points-at:
  ref-filter: simplify return type of match_points_at
  ref-filter: avoid parsing non-tags in match_points_at()
  ref-filter: avoid parsing tagged objects in match_points_at()
  ref-filter: handle nested tags in --points-at option
2023-07-25 12:05:24 -07:00
02f50d0d19 Merge branch 'rs/strbuf-addftime-simplify'
Code clean-up.

* rs/strbuf-addftime-simplify:
  strbuf: use skip_prefix() in strbuf_addftime()
2023-07-25 12:05:24 -07:00
261ff512e1 Merge branch 'rs/ref-filter-signature-fix'
Test fix.

* rs/ref-filter-signature-fix:
  t6300: fix setup with GPGSSH but without GPG
2023-07-25 12:05:24 -07:00
c5fcd34e1b Merge branch 'jk/unused-parameter'
Mark-up unused parameters in the code so that we can eventually
enable -Wunused-parameter by default.

* jk/unused-parameter:
  t/helper: mark unused callback void data parameters
  tag: mark unused parameters in each_tag_name_fn callbacks
  rev-parse: mark unused parameter in for_each_abbrev callback
  replace: mark unused parameter in each_mergetag_fn callback
  replace: mark unused parameter in ref callback
  merge-tree: mark unused parameter in traverse callback
  fsck: mark unused parameters in various fsck callbacks
  revisions: drop unused "opt" parameter in "tweak" callbacks
  count-objects: mark unused parameter in alternates callback
  am: mark unused keep_cr parameters
  http-push: mark unused parameter in xml callback
  http: mark unused parameters in curl callbacks
  do_for_each_ref_helper(): mark unused repository parameter
  test-ref-store: drop unimplemented reflog-expire command
2023-07-25 12:05:24 -07:00
dd224ce15d Merge branch 'dk/bundle-i18n-more'
Update message mark-up for i18n in "git bundle".

* dk/bundle-i18n-more:
  i18n: mark more bundle.c strings for translation
2023-07-25 12:05:24 -07:00
0e30958044 Merge branch 'mh/mingw-case-sensitive-build'
Names of MinGW header files are spelled in mixed case in some
source files, but the build host can be using case sensitive
filesystem with header files with their name spelled in all
lowercase.

* mh/mingw-case-sensitive-build:
  mingw: use lowercase includes for some Windows headers
2023-07-25 12:05:23 -07:00
d4ce18536a Merge branch 'dk/t4002-syntaxo-fix'
Test fix.

* dk/t4002-syntaxo-fix:
  t4002: fix "diff can read from stdin" syntax
2023-07-25 12:05:23 -07:00
4488bb3bed Merge branch 'tb/object-access-overflow-protection'
Various offset computation in the code that accesses the packfiles
and other data in the object layer has been hardened against
arithmetic overflow, especially on 32-bit systems.

* tb/object-access-overflow-protection:
  commit-graph.c: prevent overflow in `verify_commit_graph()`
  commit-graph.c: prevent overflow in `write_commit_graph()`
  commit-graph.c: prevent overflow in `merge_commit_graph()`
  commit-graph.c: prevent overflow in `split_graph_merge_strategy()`
  commit-graph.c: prevent overflow in `load_tree_for_commit()`
  commit-graph.c: prevent overflow in `fill_commit_in_graph()`
  commit-graph.c: prevent overflow in `fill_commit_graph_info()`
  commit-graph.c: prevent overflow in `load_oid_from_graph()`
  commit-graph.c: prevent overflow in add_graph_to_chain()
  commit-graph.c: prevent overflow in `write_commit_graph_file()`
  pack-bitmap.c: ensure that eindex lookups don't overflow
  midx.c: prevent overflow in `fill_included_packs_batch()`
  midx.c: prevent overflow in `write_midx_internal()`
  midx.c: store `nr`, `alloc` variables as `size_t`'s
  midx.c: prevent overflow in `nth_midxed_offset()`
  midx.c: prevent overflow in `nth_midxed_object_oid()`
  midx.c: use `size_t`'s for fanout nr and alloc
  packfile.c: use checked arithmetic in `nth_packed_object_offset()`
  packfile.c: prevent overflow in `load_idx()`
  packfile.c: prevent overflow in `nth_packed_object_id()`
2023-07-25 12:05:23 -07:00
88d08c342a Merge branch 'ah/advise-force-pushing'
Help newbies by suggesting that there are cases where force-pushing
is a valid and sensible thing to update a branch at a remote
repository, rather than reconciling with merge/rebase.

* ah/advise-force-pushing:
  push: don't imply that integration is always required before pushing
  remote: don't imply that integration is always required before pushing
  wt-status: don't show divergence advice when committing
2023-07-25 12:05:23 -07:00
08e5fb1296 hex: retire get_sha1_hex()
The naming convention around get_sha1_hex() and its friends is
awkward these days, after "struct object_id" was introduced.

There are three public functions around this area:

 * get_sha1_hex()       - use the implied the_hash_algo, fill uchar *
 * get_oid_hex()        - use the implied the_hash_algo, fill oid *
 * get_oid_hex_algop()  - use the passed algop, fill oid *

Between the latter two, the "_algop" suffix signals whether the
the_hash_algo is used as the implied algorithm or the caller should
pass an algorithm explicitly.  That is very much understandable and
is a good convention.

Between the former two, however, the "SHA1" vs "OID" in the names
differentiate in what type of variable the result is stored.

We could argue that it makes sense to use "SHA1" to mean "flat byte
buffer" to honor the historical practice in the days before "struct
object_id" was invented, but the natural fourth friend of the above
group would take an algop and fill a flat byte buffer, and it would
be strange to name it get_sha1_hex_algop().  Do we use the passed in
algo, or are we limited to SHA-1 ;-)?

In fact, such a function exists, albeit as a private helper function
used by the implementation of these functions, and is named a lot
more sensibly: get_hash_hex_algop().

Correct the misnomer of get_sha1_hex() and use "hash", instead of
"sha1", as "flat byte buffer that stores binary (as opposed to
hexadecimal) representation of the hash".

The four (2x2) friends now become:

 * get_hash_hex()       - use the implied the_hash_algo, fill uchar *
 * get_oid_hex()        - use the implied the_hash_algo, fill oid *
 * get_hash_hex_algop() - use the passed algop, fill uchar *
 * get_oid_hex_algop()  - use the passed algop, fill oid *

As there are only two remaining calls to get_sha1_hex() in the
codebase right now, the blast radious of this change is fairly
small.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 16:11:23 -07:00
f1b9cebc8b t/lib-commit-graph.sh: avoid sub-shell in graph_git_behavior()
In a previous commit, we introduced a sub-shell in the implementation of
`graph_git_behavior()`, in order to allow us to pass `-C "$DIR"`
directly to the git processes spawned by `graph_git_two_modes()`.

Now that its callers are always operating from the "$TRASH_DIRECTORY"
instead of one of its sub-directories, we can drop the inner sub-shell,
as it is no longer required.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 14:35:22 -07:00
749f126b29 t5328: avoid top-level directory changes
In a similar spirit as the last commit, avoid top-level directory
changes in the last remaining commit-graph related test, t5328.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 14:35:22 -07:00
51550d03e4 t5318: avoid top-level directory changes
Avoid changing the current working directory from outside of a sub-shell
during the tests in t5318.

Each test has mostly straightforward changes, either:

  - Removing the top-level `cd "$TRASH_DIRECTORY/full"`, which is
    unnecessary after ensuring that other tests don't change their
    working directory outside of a sub-shell.

  - Changing any Git invocations which want to be in a sub-directory by
    either (a) adding a "-C $DIR" argument, or (b) moving the whole test
    into a sub-shell.

While we're here, remove any explicit "git config core.commitGraph true"
invocations which were designed to enable use of the commit-graph. These
are unnecessary following 31b1de6a09 (commit-graph: turn on
commit-graph by default, 2019-08-13).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 14:35:22 -07:00
a953d2b628 t/lib-commit-graph.sh: avoid directory change in graph_git_behavior()
The `graph_git_behavior()` helper asserts that a number of common Git
operations (such as `git log --oneline`, `git log --topo-order`, etc.)
produce identical output regardless of whether or not a commit-graph is
in use.

This helper takes as its second argument the location (relative to the
`$TRASH_DIRECTORY`) of the Git repostiory under test. In order to run
each of its commands within that repository, it first changes into that
directory, without the use of a sub-shell.

This pollutes future tests which expect to be run in the top-level
`$TRASH_DIRECTORY` as usual. We could wrap `graph_git_behavior()` in a
sub-shell, like:

    graph_git_behavior() {
      # ...
      (
        cd "$TRASH_DIRECTORY/$DIR" &&
        graph_git_two_modesl
      )
    }

, but since we're invoking git directly, we can pass along a "-C $DIR"
when "$DIR" is non-empty.

Note, however, that until the remaining callers are cleaned up to avoid
changing working directories outside of a sub-shell, that we need to
ensure that we are operating in the top-level $TRASH_DIRECTORY. The
inner-subshell will go away in a future commit once it is no longer
necessary.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 14:35:22 -07:00
c355b64176 t/lib-commit-graph.sh: allow graph_read_expect() in sub-directories
The `graph_read_expect()` function is used to ensure that the output of
the "read-graph" test helper matches certain parameters (e.g., how many
commits are in the graph, which chunks were written, etc.).

It expects the Git repository being tested to be at the current working
directory. However, a handful of t5318 tests use different repositories
stored in sub-directories. To work around this, several tests in t5318
change into the relevant repository outside of a sub-shell, altering the
context for the rest of the suite.

Prepare to remove these globally-scoped directory changes by teaching
`graph_read_expect()` to take an optional "-C dir" to specify where the
repository containing the commit-graph being tested is.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 14:35:21 -07:00
f5d18f8c0e ref-filter: add new "describe" atom
Duplicate the logic of %(describe) and friends from pretty to
ref-filter. In the future, this change helps in unifying both the
formats as ref-filter will be able to do everything that pretty is doing
and we can have a single interface.

The new atom "describe" and its friends are equivalent to the existing
pretty formats with the same name.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 10:42:29 -07:00
f46094a5e6 ref-filter: add multiple-option parsing functions
The functions

	match_placeholder_arg_value()
	match_placeholder_bool_arg()

were added in pretty 4f732e0fd7 (pretty: allow %(trailers) options
with explicit value, 2019-01-29) to parse multiple options in an
argument to --pretty. For example,

	git log --pretty="%(trailers:key=Signed-Off-By,separator=%x2C )"

will output all the trailers matching the key and seperates them by
a comma followed by a space per commit.

Add similar functions,

	match_atom_arg_value()
	match_atom_bool_arg()

in ref-filter.

There is no atom yet that can use these functions in ref-filter, but we
are going to add a new %(describe) atom in a subsequent commit where we
parse options like tags=<bool-value> or match=<pattern> given to it.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 09:55:00 -07:00
9645a087c2 sequencer: finish parsing the todo list despite an invalid first line
Before the todo list is edited it is rewritten to shorten the OIDs of
the commits being picked and to append advice about editing the list.
The exact advice depends on whether the todo list is being edited for
the first time or not. After the todo list has been edited it is
rewritten to lengthen the OIDs of the commits being picked and to remove
the advice. If the edited list cannot be parsed then this last step is
skipped.

Prior to db81e50724 (rebase-interactive: use todo_list_write_to_file()
in edit_todo_list(), 2019-03-05) if the existing todo list could not be
parsed then the initial rewrite was skipped as well. This had the
unfortunate consequence that if the list could not be parsed after the
initial edit the advice given to the user was wrong when they re-edited
the list. This change relied on todo_list_parse_insn_buffer() returning
the whole todo list even when it cannot be parsed. Unfortunately if the
list starts with a "fixup" command then it will be truncated and the
remaining lines are lost. Fix this by continuing to parse after an
initial "fixup" commit as we do when we see any other invalid line.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
[jc: removed an apparently unneeded subshell around the test body]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-24 09:49:44 -07:00
4970bedef2 branch: update the message to refuse touching a branch in-use
The "git branch -f" command can refuse to force-update a branch that
is used by another worktree.  The original rationale for this
behaviour was that updating a branch that is checked out in another
worktree, without making a matching change to the index and the
working tree files in that worktree, will lead to a very confused
user.  "git diff HEAD" will no longer give a useful patch, because
HEAD is a commit unrelated to what the index and the working tree in
the worktree were based on, for example.

These days, the same mechanism also protects branches that are being
rebased or bisected, and the same machanism is expected to be the
right place to add more checks, when we decide to protect branches
undergoing other kinds of operations.  We however forgot to rethink
the messaging, which originally said that we are refusing to touch
the branch because it is "checked out" elsewhere, when d2ba271a
(branch: check for bisects and rebases, 2022-06-14) started to
protect branches that are being rebased or bisected.

The spirit of the check has always been that we do not want to
disrupt the use of the same branch in other worktrees.  Let's reword
the message slightly to say that the branch is "used by" another
worktree, instead of "checked out".

We could teach the branch.c:prepare_checked_out_branches() function
to remember why it decided that a particular branch needs protecting
(i.e. was it because it was checked out?  being bisected?  something
else?) in addition to which worktree the branch was in use, and use
that in the error message to say "you cannot force update this
branch because it is being bisected in the worktree X", etc., but it
is dubious that such extra complexity is worth it.  The message
already tells which directory the worktree in question is, and it
should be just a "chdir" away for the user to find out what state it
is in, if the user felt curious enough.  So let's not go there yet.

Helped-by: Josh Sref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21 15:30:57 -07:00
e43f4fd0bd The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21 13:47:26 -07:00
39fe402d67 Merge branch 'tb/refs-exclusion-and-packed-refs'
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.

* tb/refs-exclusion-and-packed-refs:
  ls-refs.c: avoid enumerating hidden refs where possible
  upload-pack.c: avoid enumerating hidden refs where possible
  builtin/receive-pack.c: avoid enumerating hidden references
  refs.h: implement `hidden_refs_to_excludes()`
  refs.h: let `for_each_namespaced_ref()` take excluded patterns
  revision.h: store hidden refs in a `strvec`
  refs/packed-backend.c: add trace2 counters for jump list
  refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
  refs/packed-backend.c: refactor `find_reference_location()`
  refs: plumb `exclude_patterns` argument throughout
  builtin/for-each-ref.c: add `--exclude` option
  ref-filter.c: parameterize match functions over patterns
  ref-filter: add `ref_filter_clear()`
  ref-filter: clear reachable list pointers after freeing
  ref-filter.h: provide `REF_FILTER_INIT`
  refs.c: rename `ref_filter`
2023-07-21 13:47:26 -07:00
36f76d2a25 pack-objects: fix --no-quiet
Since 99fb6e04cb (pack-objects: convert to use parse_options(),
2012-02-01) git pack-objects has accepted the option --no-quiet, but it
does the same as --quiet.  That's because it's defined using OPT_SET_INT
with a value of 0, which sets 0 when negated, too.

Make --no-quiet equivalent to --progress and ignore it if --all-progress
was given.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21 10:04:04 -07:00
3a5f308741 pack-objects: fix --no-keep-true-parents
Since 99fb6e04cb (pack-objects: convert to use parse_options(),
2012-02-01) git pack-objects has accepted --no-keep-true-parents, but
this option does the same as --keep-true-parents.  That's because it's
defined using OPT_SET_INT with a value of 0, which sets 0 when negated
as well.

Turn --no-keep-true-parents into the opposite of --keep-true-parents by
using OPT_BOOL and storing the option's status directly in a variable
named "grafts_keep_true_parents" instead of in negative form in
"grafts_replace_parents".

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21 10:02:59 -07:00
c95ae3ff9c describe: fix --no-exact-match
Since 2c33f75754 (Teach git-describe --exact-match to avoid expensive
tag searches, 2008-02-24) git describe accepts --no-exact-match, but it
does the same as --exact-match, an alias for --candidates=0.  That's
because it's defined using OPT_SET_INT with a value of 0, which sets 0
when negated as well.

Let --no-exact-match set the number of candidates to the default value
instead.  Users that need a more specific lack of exactitude can specify
their preferred value using --candidates, as before.

The "--no-exact-match" option was not covered in the tests, so let's
add a few.  Also add a case where --exact-match option is used on a
commit that cannot be described without distance from tags and make
sure the command fails.

Signed-off-by: René Scharfe <l.s.r@web.de>
[jc: added trivial tests]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21 09:57:15 -07:00
835950bd19 blame: allow --contents to work with bare repo
The --contents option can be used with git blame to blame the file
as if it had the contents from the specified file. Since 1a3119ed
(blame: allow --contents to work with non-HEAD commit, 2023-03-24),
the --contents option can work with non-HEAD commit. However, if you
try to use --contents in a bare repository, you get the following
error:

    fatal: this operation must be run in a work tree

This is because before trying to generate a fake working tree
commit, we always call setup_work_tree(). But in a bare repo,
working tree is not available. The call to setup_work_tree is used
to prepare the reading of the blamed file in the working tree, which
isn't necessary if we are reading the contents from the specific
file instead of the file in the working tree.

Add a check in setup_scoreboard to skip setup_work_tree if we are
reading from the file specified in --contents.

This enables us to use --contents in a bare repo. This is a nice
addition on top of 1a3119ed, having a working tree to use --contents
is optional.

Add test for the --contents option with bare repo to the
annotate-tests.sh test script.

Signed-off-by: Han Young <hanyang.tony@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21 07:32:58 -07:00
a27eecea75 wrapper: use trace2 counters to collect fsync stats
As mentioned in the thread starting at [1], trace2 counters should be
used to count events instead of ad-hoc static variables.

Convert the two fsync static variables to trace2 counters, reducing the
coupling between wrapper.c and the trace2 subsystem. Adjust t/t5351 to
match the trace2 counter output format.

The counters are not per-thread because the ones being replaced also
were not.

[1] https://lore.kernel.org/git/20230627195251.1973421-2-calvinwan@google.com/

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-20 11:52:53 -07:00
3821eb6c3d reset: reject --no-(mixed|soft|hard|merge|keep) option
"git reset --no-mixed" behaved exactly like "git reset --mixed",
which was nonsense.

If there were only two kinds, e.g. "mixed" vs "separate", it might
have made sense to make "git reset --no-mixed" behave identically to
"git reset --separate" and vice-versa, but because we have many
types of reset, let's just forbid "--no-mixed" and negated form of
other types.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 22:02:53 -07:00
68cbb20e73 show-branch: reject --[no-](topo|date)-order
"git show-branch --no-topo-order" behaved exactly the same way as
"git show-branch --topo-order" did, which was nonsense.  This was
because we choose between topo- and date- by setting a variable to
either REV_SORT_IN_GRAPH_ORDER or REV_SORT_BY_COMMIT_DATE with
OPT_SET_INT() and REV_SORT_IN_GRAPH_ORDER happens to be 0.  The
OPT_SET_INT() macro assigns 0 to the target variable in respose to
the negated form of its option.

"--no-date-order" by luck behaves identically to "--topo-order"
exactly for the same reason, and it sort-of makes sense right now,
but the "sort-of makes sense" will quickly break down once we add a
third way to sort.  Not-A may be B when there are only two choices
between A and B, but once your choices become among A, B, and C,
not-A does not mean B.

Just mark these two ordering options to reject negation, and add a
test, which was missing.  "git show-branch --no-reflog" is also
unnegatable, so throw in a test for that while we are at it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 22:00:39 -07:00
c48af99a3e trace2: fix a comment
When the trace2 counter mechanism was added in 81071626ba (trace2: add
global counter mechanism, 2022-10-24), the name of the file where new
counters are added was misspelled in a comment.

Use the correct file name.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 17:02:16 -07:00
c512643e67 short help: allow a gap smaller than USAGE_GAP
The parse-options API responds to "git cmd -h" by listing the option
flag (padded to the USAGE_OPTS_WIDTH column), followed by USAGE_GAP
(set to 2) whitespaces, followed by the help text.  If the flags
part does not fit within the USAGE_OPTS_WIDTH, the help text is given
on its own line.  Imagine that "@" below depicts the USAGE_OPTS_WIDTH'th
column, and "#" are for the usage help text, the output may look
like this:

    @@@@@@@@@@@@@  ########################################
    -f		   description of the flag '-f' comes here
    --short=<num>  description of the flag '--short'
    --very-long-option=<number>
                   description of the flag '--very-long-option'

This is all good and nice in principle, but it becomes awkward when
the flags part is just one column over the limit and forces a line
break.  See the description of the "--almost" option below:

    @@@@@@@@@@@@@  ########################################
    -f		   description of the flag '-f' comes here
    --short=<num>  description of the flag '--short'
    --almost=<num>
                   description of the flag '--almost'
    --very-long-option=<number>
                   description of the flag '--very-long-option'

If we allow shrinking the gap to a single whitespace only in such a
case, we would instead get:

    @@@@@@@@@@@@@  ########################################
    -f		   description of the flag '-f' comes here
    --short=<num>  description of the flag '--short'
    --almost=<num> description of the flag '--almost'
    --very-long-option=<number>
                   description of the flag '--very-long-option'

and the boundary between the flags and their descriptions does not
become any harder to see, while saving precious vertical screen real
estate.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 16:39:02 -07:00
d86a8f386d remote: simplify "remote add --tags" help text
The help text for the --tags option was split into two option[]
entries, which was a hacky way to give two lines of help text (the
second entry did not have either short or long help, and there was
no way to invoke its entry---it was there only for the help text).

As we now support multi-line text in the option help, let's make
the second line of the help a proper second line and remove the
hacky second entry.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 16:39:02 -07:00
448abbba63 short help: allow multi-line opthelp
When "-h" triggers the short-help in a command that implements its
option parsing using the parse-options API, the option help text is
shown with a single fprintf() as a long line.  When the text is
multi-line, the second and subsequent lines are not left padded,
that breaks the alignment across options.

Borrowing the idea from the advice API where its hint strings are
shown with (localized) "hint:" prefix, let's internally split the
(localized) help text into lines, and showing the first line, pad
the remaining lines to align.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 16:30:06 -07:00
fb8f7269c2 configure.ac: always save NO_ICONV to config.status
In case 'configure --with-iconv=no' is used, NO_ICONV is not saved to
config.status and thus git is built with iconv support.

Always save NO_ICONV to config.status to honor what user selected
during configure step.

Signed-off-by: Andreas Herrmann <aherrmann@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 10:07:55 -07:00
92d8f00a11 configure.ac: don't overwrite NO_CURL option
Even if 'configure --with-curl=no' was run, curl support is used,
because library detection overwrites it. Avoid this overwrite.
Configure should obey what the user has specified.

Signed-off-by: Andreas Herrmann <aherrmann@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 10:07:55 -07:00
0dd79e0d49 configure.ac: don't overwrite NO_EXPAT option
Even if 'configure --with-expat=no' was run, expat support is used,
because library detection overwrites it. Avoid this overwrite.
Configure should obey what the user has specified.

Signed-off-by: Andreas Herrmann <aherrmann@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 10:07:55 -07:00
83bb8e5a06 show-branch: --no-sparse should give dense output
"git show-branch --no-sparse" behaved exactly the same way as "git
show-branch --sparse", which did not make any sense.  This was
because it used a variable "dense" initialized to 1 by default to
give "non sparse" behaviour, and OPT_SET_INT() to set the varilable
to 0 in response to the "--sparse" option.  Unfortunately,
OPT_SET_INT() sets 0 to the given variable when the option is
negated.

Flip the polarity of the variable "dense" by renaming it to "sparse"
and initializing it to 0, and have OPT_SET_INT() set the variable to
1 when "--sparse" is given.  This way, "--no-sparse" would set 0 to
the variable and would give us the "dense" behaviour.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 09:16:37 -07:00
a2dad4868b fetch: reject --no-ipv[46]
Now we have introduced OPT_IPVERSION(), tweak its implementation so
that "git clone", "git fetch", and "git push" reject the negated
form of "Use only IP version N" options.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 14:47:30 -07:00
ae2c912c04 parse-options: introduce OPT_IPVERSION()
The command line option parsing for "git clone", "git fetch", and
"git push" have duplicated implementations of parsing "--ipv4" and
"--ipv6" options, by having two OPT_SET_INT() for "ipv4" and "ipv6".

Introduce a new OPT_IPVERSION() macro and use it in these three
commands.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 14:35:54 -07:00
e12cb98e1e branch: reject "--no-all" and "--no-remotes" early
As the command line parser for "git branch --all" forgets to use
PARSE_OPT_NONEG, it accepted "git branch --no-all", and then passed
a nonsense value to the underlying machinery, leading to a fatal
error "filter_refs: invalid type".  The "--remotes" option had
exactly the same issue.

Catch the unsupported options early in the option parser.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 12:19:53 -07:00
947ebd62a0 am: simplify parsing of "--[no-]keep-cr"
Command line options "--keep-cr" and its negation trigger
OPT_SET_INT_F(PARSE_OPT_NONEG) to set a variable to 1 and 0
respectively.  Using OPT_SET_INT() to implement the positive variant
that sets the variable to 1 without specifying PARSE_OPT_NONEG gives
us the negative variant to set it to 0 for free.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 12:19:31 -07:00
d6f598e443 gitignore.txt: mark up explanation of patterns consistently
In the "PATTERN FORMAT" section, all the other pattern elements are
shown as `monospace` literals inside "double quoted" strings.  Do
the same for the explanation of a slash to make it consistent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 12:19:08 -07:00
991c552916 ls-tree: fix --no-full-name
Since 61fdbcf98b (ls-tree: migrate to parse-options, 2009-11-13) git
ls-tree has accepted the option --no-full-name, but it does the same
as --full-name, contrary to convention.  That's because it's defined
using OPT_SET_INT with a value of 0, where the negative variant sets
0 as well.

Turn --no-full-name into the opposite of --full-name by using OPT_BOOL
instead and storing the option's status directly in a variable named
"full_name" instead of in negated form in "chomp_prefix".

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 09:38:24 -07:00
cba07a324d The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 07:29:00 -07:00
6016ee0a71 Merge branch 'tb/fsck-no-progress'
"git fsck --no-progress" still spewed noise from the commit-graph
subsystem, which has been corrected.

* tb/fsck-no-progress:
  commit-graph.c: avoid duplicated progress output during `verify`
  commit-graph.c: pass progress to `verify_one_commit_graph()`
  commit-graph.c: iteratively verify commit-graph chains
  commit-graph.c: extract `verify_one_commit_graph()`
  fsck: suppress MIDX output with `--no-progress`
  fsck: suppress commit-graph output with `--no-progress`
2023-07-18 07:28:53 -07:00
c6a5e1a22e Merge branch 'tb/repack-cleanup'
The recent change to "git repack" made it react less nicely when a
leftover .idx file that no longer has the corresponding .pack file
in the repository, which has been corrected.

* tb/repack-cleanup:
  builtin/repack.c: avoid dir traversal in `collect_pack_filenames()`
  builtin/repack.c: only repack `.pack`s that exist
2023-07-18 07:28:53 -07:00
d6e67222c1 Merge branch 'mh/doc-credential-helpers'
Doc update.

* mh/doc-credential-helpers:
  doc: gitcredentials: link to helper list
2023-07-18 07:28:52 -07:00
3437f549dd gitignore.txt: use backticks instead of double quotes
Among four examples, only this one used "double quoted" sample
patterns, but all others marked up the patterns in `monospace`.

Signed-off-by: Johan Ruokangas <johan@latehours.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 06:56:19 -07:00
d9e0062159 ref-filter: simplify return type of match_points_at
We return the oid that matched, but the sole caller only cares whether
we matched anything at all. This is mostly academic, since there's only
one caller, but the lifetime of the returned pointer is not immediately
clear. Sometimes it points to an oid in a tag struct, which should live
forever. And sometimes to the oid passed in, which only lives as long as
the each_ref_fn callback we're called from.

Simplify this to a boolean return which is more direct and obvious. As a
bonus, this lets us avoid the weird pattern of overwriting our "oid"
parameter in the loop (since we now only refer to the tagged oid one
time, and can just inline the call to get it).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17 14:16:05 -07:00
870eb53ab2 ref-filter: avoid parsing non-tags in match_points_at()
When handling --points-at, we have to try to peel each ref to see if
it's a tag that points at a requested oid. We start this process by
calling parse_object() on the oid pointed to by each ref.

The cost of parsing each object adds up, especially in an output that
doesn't otherwise need to open the objects at all. Ideally we'd use
peel_iterated_oid() here, which uses the cached information in the
packed-refs file. But we can't, because our --points-at must match not
only the fully peeled value, but any interim values (so if tag A points
to tag B which points to commit C, we should match --points-at=B, but
peel_iterated_oid() will only tell us about C).

So the best we can do (absent changes to the packed-refs peel traits) is
to avoid parsing non-tags. The obvious way to do that is to call
oid_object_info() to check the type before parsing. But there are a few
gotchas there, like checking if the object has already been parsed.

Instead we can just tell parse_object() that we are OK skipping the hash
check, which lets it turn on several optimizations. Commits can be
loaded via the commit graph (so it's both fast and we have the benefit
of the parsed data if we need it later at the output stage). Blobs are
not loaded at all. Trees are still loaded, but it's rather rare to have
a ref point directly to a tree (and since this is just an optimization,
kicking in 99% of the time is OK).

Even though we're paying for an extra lookup, the cost to avoid parsing
the non-tags is a net benefit. In my git.git repository with 941 tags
and 1440 other refs pointing to commits, this significantly cuts the
runtime:

  Benchmark 1: ./git.old for-each-ref --points-at=HEAD
    Time (mean ± σ):      26.8 ms ±   0.5 ms    [User: 24.5 ms, System: 2.2 ms]
    Range (min … max):    25.9 ms …  29.2 ms    107 runs

  Benchmark 2: ./git.new for-each-ref --points-at=HEAD
    Time (mean ± σ):       9.1 ms ±   0.3 ms    [User: 6.8 ms, System: 2.2 ms]
    Range (min … max):     8.6 ms …  10.2 ms    308 runs

  Summary
    './git.new for-each-ref --points-at=HEAD' ran
      2.96 ± 0.10 times faster than './git.old for-each-ref --points-at=HEAD'

In a repository that is mostly annotated tags, we'd expect less
improvement (we might still skip a few object loads, but that's balanced
by the extra lookups). In my clone of linux.git, which has 782 tags and
3 branches, the run-time is about the same (it's actually ~1% faster on
average after this patch, but that's within the run-to-run noise).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17 14:16:05 -07:00
b9584c5858 ref-filter: avoid parsing tagged objects in match_points_at()
When we peel tags to check if they match a --points-at oid, we
recursively parse the tagged object to see if it is also a tag. But
since the tag itself tells us the type of the object it points to (and
even gives us the appropriate object struct via its "tagged" member), we
can use that directly.

We do still have to make sure to call parse_tag() before looking at each
tag. This is redundant for the outermost tag (since we did call
parse_object() to find its type), but that's OK; parse_tag() is smart
enough to make this a noop when the tag has already been parsed.

In my clone of linux.git, with 782 tags (and only 3 non-tags), this
yields a significant speedup (bringing us back where we were before the
commit before this one started recursively dereferencing tags):

  Benchmark 1: ./git.old for-each-ref --points-at=HEAD --format="%(refname)"
    Time (mean ± σ):      20.3 ms ±   0.5 ms    [User: 11.1 ms, System: 9.1 ms]
    Range (min … max):    19.6 ms …  21.5 ms    141 runs

  Benchmark 2: ./git.new for-each-ref --points-at=HEAD --format="%(refname)"
    Time (mean ± σ):      11.4 ms ±   0.2 ms    [User: 6.3 ms, System: 5.0 ms]
    Range (min … max):    11.0 ms …  12.2 ms    250 runs

  Summary
    './git.new for-each-ref --points-at=HEAD --format="%(refname)"' ran
      1.79 ± 0.05 times faster than './git.old for-each-ref --points-at=HEAD --format="%(refname)"'

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17 14:16:05 -07:00
468887f0f8 ref-filter: handle nested tags in --points-at option
Tags are dereferenced until reaching a different object type to handle
nested tags, e.g. on checkout. In contrast, "git tag --points-at=..."
fails to list such nested tags because only one level of indirection is
obtained in filter_refs(). Implement the recursive dereferencing for the
"--points-at" option when filtering refs to unify the behaviour.

Signed-off-by: Jan Klötzke <jan@kloetzke.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17 14:16:05 -07:00
5e238546dc The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17 11:30:43 -07:00
13ed10efd4 Merge branch 'jc/pathspec-match-with-common-prefix'
"git ls-files '(attr:X)D/'" that triggers the common prefix
optimization codepath failed to read from "D/.gitattributes",
which has been corrected.

* jc/pathspec-match-with-common-prefix:
  dir: match "attr" pathspec magic with correct paths
  t6135: attr magic with path pattern
2023-07-17 11:30:43 -07:00
ce481ac8b3 Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.

* cw/compat-util-header-cleanup:
  git-compat-util: move alloc macros to git-compat-util.h
  treewide: remove unnecessary includes for wrapper.h
  kwset: move translation table from ctype
  sane-ctype.h: create header for sane-ctype macros
  git-compat-util: move wrapper.c funcs to its header
  git-compat-util: move strbuf.c funcs to its header
2023-07-17 11:30:42 -07:00
d5bb430ec6 Merge branch 'vd/adjust-mfow-doc-to-updated-headers'
Code snippets in a tutorial document no longer compiled after
recent header shuffling, which have been corrected.

* vd/adjust-mfow-doc-to-updated-headers:
  docs: add necessary headers to Documentation/MFOW.txt
2023-07-17 11:30:42 -07:00
0e074fb4e5 Merge branch 'rs/ls-tree-prefix-simplify'
Code simplification.

* rs/ls-tree-prefix-simplify:
  ls-tree: simplify prefix handling
2023-07-17 11:30:42 -07:00
d383b4f24e Merge branch 'rs/userformat-find-requirements-simplify'
Code simplification.

* rs/userformat-find-requirements-simplify:
  pretty: use strchr(3) in userformat_find_requirements()
2023-07-17 11:30:41 -07:00
55e8fad660 Merge branch 'rs/pretty-format-double-negation-fix'
Code clarification.

* rs/pretty-format-double-negation-fix:
  pretty: avoid double negative in format_commit_item()
2023-07-17 11:30:41 -07:00
377d1ca423 Merge branch 'rs/packet-length-simplify'
Code simplification.

* rs/packet-length-simplify:
  pkt-line: add size parameter to packet_length()
2023-07-17 11:30:41 -07:00
9187b276e9 Merge branch 'pw/diff-no-index-from-named-pipes'
"git diff --no-index" learned to read from named pipes as if they
were regular files, to allow "git diff <(process) <(substitution)"
some shells support.

* pw/diff-no-index-from-named-pipes:
  diff --no-index: support reading from named pipes
  t4054: test diff --no-index with stdin
  diff --no-index: die on error reading stdin
  diff --no-index: refuse to compare stdin to a directory
2023-07-17 11:30:41 -07:00
945c72250a strbuf: use skip_prefix() in strbuf_addftime()
Use the now common skip_prefix() cascade instead of a case statement to
parse the strftime(3) format in strbuf_addftime().  skip_prefix() parses
the "fmt" pointer and advances it appropriately, making additional
pointer arithmetic unnecessary.  The resulting code is more compact and
consistent with most other strbuf_expand_step() loops.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17 09:24:49 -07:00
065135fc0b t6300: fix setup with GPGSSH but without GPG
In a test introduced by 26c9c03f0a (ref-filter: add new "signature"
atom, 2023-06-04) the file named "file" is added by a setup step that
requires GPG and modified by a second setup step that requires GPGSSH.
Systems lacking the first prerequisite skip the initial setup step and
then "git commit -a" in the second one doesn't find the modified file.
Add it explicitly.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17 09:15:18 -07:00
830b4a04c4 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 10:46:08 -07:00
daa2589b63 Merge branch 'jk/imap-send-unused-variable-cleanup'
"imap-send" codepaths got cleaned up to get rid of unused
parameters.

* jk/imap-send-unused-variable-cleanup:
  imap-send: drop unused fields from imap_cmd_cb
  imap-send: drop unused parameter from imap_cmd_cb callback
  imap-send: use server conf argument in setup_curl()
2023-07-14 10:46:07 -07:00
ce36dea07b Merge branch 'ma/t0091-fixup'
"git bugreport" tests did not test what it wanted to test, which
has been corrected.

* ma/t0091-fixup:
  t0091-bugreport.sh: actually verify some content of report
2023-07-14 10:46:07 -07:00
81ebc54e81 Merge branch 'ks/ref-filter-signature'
The "git for-each-ref" family of commands learned placeholders
related to GPG signature verification.

* ks/ref-filter-signature:
  ref-filter: add new "signature" atom
  t/lib-gpg: introduce new prereq GPG2
2023-07-14 10:46:07 -07:00
0a02ca2383 SubmittingPatches: simplify guidance for choosing a starting point
Background: The guidance to "base your work on the oldest branch that
your change is relevant to" was added in d0c26f0f56 (SubmittingPatches:
Add new section about what to base work on, 2010-04-19). That commit
also added the bullet points which describe the scenarios where one
would use one of the following named branches: "maint", "master",
"next", and "seen" ("pu" in the original as that was the name of this
branch before it was renamed, per 828197de8f (docs: adjust for the
recent rename of `pu` to `seen`, 2020-06-25)). The guidance was probably
taken from existing similar language introduced in the "Merge upwards"
section of gitworkflows in f948dd8992 (Documentation: add manpage about
workflows, 2008-10-19).

Summary: This change simplifies the guidance by pointing users to just
"maint" or "master". But it also gives an explanation of why that is
preferred and what is meant by preferring "older" branches (which might
be confusing to some because "old" here is meant in relative terms
between these named branches, not in terms of the age of the branches
themselves). We also add an example to illustrate why it would be a bad
idea to use "next" as a starting point, which may not be so obvious to
new contributors.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 10:31:43 -07:00
5c98149ce4 SubmittingPatches: emphasize need to communicate non-default starting points
The phrase

    and unless it targets the `master` branch (which is the default),
    mark your patches as such.

is tightly packed with several things happening in just two lines of
text. It also feels like it is not that important because of the terse
treatment. This is a problem because (1) it has the potential to confuse
new contributors, and (2) it may be glossed over for those skimming the
docs.

Emphasize and elaborate on this guidance by promoting it to its own
separate paragraph.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 10:31:43 -07:00
b5dbfe28a4 SubmittingPatches: de-emphasize branches as starting points
It could be that a suitable branch does not exist, so instead just use
the phrase "starting point". Technically speaking the starting point
would be a commit (not a branch) anyway.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 10:31:43 -07:00
3423e372e4 SubmittingPatches: discuss subsystems separately from git.git
The discussion around subsystems disrupts the flow of discussion in the
surrounding area, which only deals with starting points used for the
git.git project. So move this bullet point out to the end.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 10:31:43 -07:00
fc0825d561 SubmittingPatches: reword awkward phrasing
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 10:31:43 -07:00
e3a567ff42 t4002: fix "diff can read from stdin" syntax
I noticed this test was producing output like

```
t4002-diff-basic.sh: test_expect_successdiff can read from stdin: not found
```

which is rather odd. Investigation shows an error of shell syntax:
foo'abc' is the same as fooabc to the shell. Perhaps obviously, this is
not a valid command for the test.

I am surprised this doesn't count as an error in the test, but that
accounts for it going unnoticed.

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:53:06 -07:00
9a25cad7e0 commit-graph.c: prevent overflow in verify_commit_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an OID out of an existing commit-graph during
verification.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
588af1bfd3 commit-graph.c: prevent overflow in write_commit_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an existing OID while writing a new commit-graph.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
d76e0a744d commit-graph.c: prevent overflow in merge_commit_graph()
When merging two commit graphs, ensure that we don't attempt to merge
two graphs which, when combined, have more total commits than the 32-bit
unsigned maximum.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
19565d093d commit-graph.c: prevent overflow in split_graph_merge_strategy()
In a similar spirit as previous commits, ensure that we don't overflow
when choosing how to split and merge different layers of the
commit-graph.

In particular, avoid a potential overflow between `size_mult` and
`num_commits`, as well as a potential overflow between the number of
commits currently in the merged graph, and the number of commits in the
graph about to be merged.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
51c31a6408 commit-graph.c: prevent overflow in load_tree_for_commit()
In a similar spirit as previous commits, ensure that we don't overflow
when computing an offset into the commit_data chunk when the (relative)
graph position exceeds 2^32-1/GRAPH_DATA_WIDTH.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
50a71c2942 commit-graph.c: prevent overflow in fill_commit_in_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when the lex_index of the commit we are trying to fill out exceeds
2^32-1/(g->hash_len+16).

The other hunk touched in this patch is not susceptible to overflow,
since an explicit cast is made to a 64-bit unsigned value. For clarity
and consistency with the rest of the commits in this series, avoid a
tricky to reason about cast, and use `st_mult()` directly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
2740ed1c76 commit-graph.c: prevent overflow in fill_commit_graph_info()
In a similar spirit as previous commits, ensure that we don't overflow
in a few spots within `fill_commit_graph_info()`:

  - First, when computing an offset into the commit data chunk, which
    can occur when the `lex_index` of the item we're looking up exceeds
    2^32-1/GRAPH_DATA_WIDTH.

  - A similar issue when computing the generation date offset for
    commits with `lex_index` greater than 2^32-1/4. Note that in
    practice this will never overflow, since the left-hand operand is
    from calling `sizeof(...)` and is thus already a `size_t`. But wrap
    that in an `st_mult()` to make it clear that we intend to perform
    this computation using 64-bit operands.

  - Finally, a nearly identical issue as above when computing an offset
    into the `generation_data_overflow` chunk.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
0bd8f30a0e commit-graph.c: prevent overflow in load_oid_from_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when trying to compute an offset into the `chunk_oid_lookup` table when
the `lex_index` of the item we're trying to look up exceeds
`2^32-1/g->hash_len`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
209250ef38 commit-graph.c: prevent overflow in add_graph_to_chain()
The commit-graph uses a fanout table with 4-byte entries to store the
number of commits at each shard of the commit-graph. So it is OK to have
a commit graph with as many as 2^32-1 stored commits. But we risk
overflowing any computation which may exceed the 32-bit (unsigned)
maximum when those computations are (incorrectly) performed using 32-bit
operands.

There are a couple of spots in `add_graph_to_chain()` where we could
potentially overflow the result:

  - First, when comparing the list of existing entries in the
    commit-graph chain. It is unlikely that this should ever overflow,
    since it would require having roughly 2^32-1/g->hash_len
    commit-graphs in the chain. But let's guard that computation with a
    `st_mult()` just to be safe.

  - Second, when computing the number of commits in the graph added to
    the front of the chain. This value is also a 32-bit unsigned, but we
    should make sure that it does not grow beyond the maximum value.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
48f3f8cf37 commit-graph.c: prevent overflow in write_commit_graph_file()
When writing a commit-graph, we use the chunk-format API to write out
each individual chunk of the commit-graph. Each chunk of the
commit-graph is tracked via a call to `add_chunk()`, along with the
expected size of that chunk.

Similar to an earlier commit which handled the identical issue in the
MIDX machinery, guard against overflow when dealing with a commit-graph
with a large number of entries to avoid corrupting the contents of the
commit-graph itself.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
0948c50176 pack-bitmap.c: ensure that eindex lookups don't overflow
When a bitmap is used to answer some reachability query, it creates a
pseudo-bitmap called the "extended index" on top of any existing bitmaps
to store objects that are relevant to the query, but not mentioned in
the bitmap.

When looking up the ith object in the extended index in a bitmap, it is
common to write something like:

    bitmap_get(result, i + bitmap_num_objects(bitmap_git))

, indicating that we want the ith object following all other objects
mentioned in the bitmap_git.

Since the type of `i` and the return type of `bitmap_num_objects()` are
both `uint32_t`s,  But if there are either a large number of objects in
the bitmap, or a large number of objects in the extended index (or
both), this addition can overflow when the sum is greater than 2^32-1.

Having that large of a bitmap position is entirely acceptable, but we
need to ensure that the computed bitmap position for that object is
performed using 64-bits and doesn't overflow.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
d67609bdde midx.c: prevent overflow in fill_included_packs_batch()
In a similar spirit as in previous commits, avoid an integer overflow
when computing the expected size of a MIDX.

(Note that this is also OK as-is, since `p->pack_size` is an `off_t`, so
this computation should already be done as 64-bit integers. But again,
let's use `st_mult()` to make this fact clear).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
2bc764c1d4 midx.c: prevent overflow in write_midx_internal()
When writing a MIDX, we use the chunk-format API to write out each
individual chunk of the MIDX. Each chunk of the MIDX is tracked via a
call to `add_chunk()`, along with the expected size of that chunk.

Guard against overflow when dealing with a MIDX with a large number of
entries (and consequently, large chunks within the MIDX file itself) to
avoid corrupting the contents of the MIDX itself.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
cc38127439 midx.c: store nr, alloc variables as size_t's
In the `write_midx_context` structure, we use two `uint32_t`'s to track
the length and allocated size of the packs, and one `uint32_t` to track
the number of objects in the MIDX.

In practice, having these be 32-bit unsigned values shouldn't cause any
problems since we are unlikely to have that many objects or packs in any
real-world repository. But these values should be `size_t`'s, so change
their type to reflect that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
5675150cc3 midx.c: prevent overflow in nth_midxed_offset()
In a similar spirit as previous patches, avoid an overflow when looking
up object offsets in the MIDX's large offset table by guarding the
computation via `st_mult()`.

This instance is also OK as-is, since the left operand is the result of
`sizeof(...)`, which is already a `size_t`. But use `st_mult()` instead
here to make it explicit that this computation is to be performed using
64-bit unsigned integers.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
c2b24ede22 midx.c: prevent overflow in nth_midxed_object_oid()
In a similar spirit as previous commits, avoid overflow when looking up
an object's OID in a MIDX when its position is greater than
`2^32-1/m->hash_len`.

As usual, it is perfectly OK for a MIDX to have as many as 2^32-1
objects (since we use 32-bit fields to count the number of objects at
each fanout layer). But if we have more than `2^32-1/m->hash_len` number
of objects, we will incorrectly perform the computation using 32-bit
integers, overflowing the result.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
e6c71f239d midx.c: use size_t's for fanout nr and alloc
The `midx_fanout` struct is used to keep track of a set of OIDs
corresponding to each layer of the MIDX's fanout table. It stores an
array of entries, along with the number of entries in the table, and the
allocated size of the array.

Both `nr` and `alloc` are stored as 32-bit unsigned integers. In
practice, this should never cause any problems, since most packs have
far fewer than 2^32-1 objects.

But storing these as `size_t`'s is more appropriate, and prevents us
from accidentally overflowing some result when multiplying or adding to
either of these values. Update these struct members to be `size_t`'s as
appropriate.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
a519abca02 packfile.c: use checked arithmetic in nth_packed_object_offset()
In a similar spirit as the previous commits, ensure that we use
`st_add()` or `st_mult()` when computing values that may overflow the
32-bit unsigned limit.

Note that in each of these instances, we prevent 32-bit overflow
already since we have explicit casts to `size_t`.

So this code is OK as-is, but let's clarify it by using the `st_xyz()`
helpers to make it obvious that we are performing the relevant
computations using 64 bits.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
42be681b33 packfile.c: prevent overflow in load_idx()
Prevent an overflow when locating a pack's CRC offset when the number
of packed items is greater than 2^32-1/hashsz by guarding the
computation with an `st_mult()`.

Note that to avoid truncating the result, the `crc_offset` member must
itself become a `size_t`. The only usage of this variable (besides the
assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the
index-pack code. There we use the `crc_offset` as a pointer offset, so
we are already equipped to handle the type change.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:31:34 -07:00
1e9cb3487a t/helper: mark unused callback void data parameters
Many callback interfaces have an extra void data parameter, but we don't
always need it (especially for dumping functions like the ones in test
helpers). Mark them as unused to avoid -Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
cc2f810172 tag: mark unused parameters in each_tag_name_fn callbacks
We iterate over the set of input tag names using callbacks. But not all
operations need the same inputs, so some parameters go unused (but of
course not the same ones for each operation). Mark the unused ones to
avoid -Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
1e6459efca rev-parse: mark unused parameter in for_each_abbrev callback
We don't need to use the "data" parameter in this instance. Let's mark
it to avoid -Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
4c7b06f208 replace: mark unused parameter in each_mergetag_fn callback
We don't look at the "commit" parameter to our callback, as our
"mergetag_data" pointer contains the original name "ref", which we use
instead. But we can't get rid of it, since other for_each_mergetag
callbacks do use it. Let's mark the parameter to avoid
-Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
80d4e5f3a5 replace: mark unused parameter in ref callback
We don't look at the "flags" parameter, which is natural for something
that is just printing the contents of the replace refs. But let's mark
it to appease -Wunused-parameter.

This probably should have been part of 63e14ee2d6 (refs: mark unused
each_ref_fn parameters, 2022-08-19), but I missed it as this one is a
repo_each_ref_fn, which takes an extra repository argument.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
ee550abcce merge-tree: mark unused parameter in traverse callback
Our threeway_callback() does not bother to look at its "n" parameter. It
is static in this file and used only by trivial_merge_trees(), which
always passes 3 trees (hence the name "threeway"). It also does not look
at "dirmask". This is OK, as it handles directories specifically by
looking at the mode bits.

Other traverse_info callbacks need these, so we can't get drop them from
the interface. But let's annotate these ones to avoid complaints from
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
0b4e9013f1 fsck: mark unused parameters in various fsck callbacks
There are a few callback functions which are used with the fsck code,
but it's natural that not all callbacks need all parameters. For
reporting, even something as obvious as "the oid of the object which had
a problem" is not always used, as some callers are only checking a
single object in the first place. And for both reporting and walking,
things like void data pointers and the fsck_options aren't always
necessary.

But since each such parameter is used by _some_ callback, we have to
keep them in the interface. Mark the unused ones in specific callbacks
to avoid triggering -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
cc88afad62 revisions: drop unused "opt" parameter in "tweak" callbacks
The setup_revision_opt struct has a "tweak" function pointer, which can
be used to adjust parameters after setup_revisions() parses arguments,
but before it finalizes setup. In addition to the rev_info struct, the
callback receives a pointer to the setup_revision_opt, as well.

But none of the existing callbacks looks at the extra "opt" parameter,
leading to -Wunused-parameter warnings.

We could mark it as UNUSED, but instead let's remove it entirely. It's
conceivable that it could be useful for a callback to have access to the
"opt" struct. But in the 13 years that this mechanism has existed,
nobody has used it. So let's just drop it in the name of simplifying.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
506d35f13d count-objects: mark unused parameter in alternates callback
Callbacks to for_each_altodb() get a void data pointer, but we don't
need it here. Mark it as unused to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
a8a8e75e9e am: mark unused keep_cr parameters
When parsing the input, we have a "keep_cr" parameter to tell us how to
handle line endings. But this doesn't apply to stgit or hg patches
(which are not mailbox formats where we have to worry about that), so we
ignore the parameter entirely in those functions.

Let's mark these as unused so that -Wunused-parameter does not complain
about them.

Note that we could just drop these parameters entirely. They are
necessary to conform to the mail_conv_fn interface used by
split_mail_conv(), but these two callbacks are the only ones used with
that function. The other formats (which _do_ care about keep_cr) use
split_mail_mbox(). But it's conceivable that we'd eventually add another
format that does care about this option, so let's leave it as part of
the generic interface.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:23:59 -07:00
e519ac35af http-push: mark unused parameter in xml callback
The xml_start_tag() function is passed the expat library's
XML_SetElementHandler() function, so it has to conform to the
expected interface. But we don't actually care about the attributes
list. Mark it so that -Wunused-parameter does not complain.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:23:59 -07:00
d0144007b1 http: mark unused parameters in curl callbacks
These functions are all used as callbacks for curl, so they have to
conform to a particular interface. But they don't need all of their
parameters:

  - fwrite_null() throws away the input, so it doesn't look at most
    parameters

  - fwrite_wwwauth() in theory could take the auth struct in its void
    pointer, but instead we just access it as the global http_auth
    (matching the rest of the code in this file)

  - curl_trace() always writes via the trace mechanism, so it doesn't
    need its void pointer to know where to send things. Likewise, it
    ignores the CURL parameter, since nothing we trace requires querying
    the handle.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:23:59 -07:00
1779deed39 do_for_each_ref_helper(): mark unused repository parameter
This function gets a repository parameter because it's a callback for
do_for_each_repo_ref_iterator(). But it's just a wrapper that passes
along each call to a regular each_ref_fn callback, and the latter
doesn't accept a repository argument.

Probably in the long run all of the each_ref_fn callbacks should get a
repository parameter, too. But changing that now would require updates
all over the code base. Until that happens, let's annotate this wrapper
callback to quiet the compiler's -Wunused-parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:23:59 -07:00
b8ef49d54c test-ref-store: drop unimplemented reflog-expire command
The reflog-expire command has been unimplemented since it was added in
80f2a6097c (t/helper: add test-ref-store to test ref-store functions,
2017-03-26). This causes -Wunused-parameter to complain, since the
function just calls die() without looking at its arguments.

We could mark these as UNUSED to silence the warning. But let's just
drop the function. It has no callers in the test suite and is not doing
anything useful, beyond perhaps reminding us that it's something we
_could_ be testing.

But since the bulk of the work in adding such tests would be the shell
bits that actually examine the reflog state before and after expiration,
this is not even a useful step in that direction. Somebody who wants to
do that work later can easily add this function back.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:23:59 -07:00
bbb6acd998 i18n: mark more bundle.c strings for translation
These two messages were introduced in 8ba221e245 (bundle: output hash
information in 'verify', 2022-03-22) and 105c6f14ad (bundle: parse
filter capability, 2022-03-09) but never for translation.

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 15:21:10 -07:00
c577d65158 push: don't imply that integration is always required before pushing
In a narrow but common case, the user is the only author of a branch and
doesn't mind overwriting the corresponding branch on the remote. This
workflow is especially common on GitHub, GitLab, and Gerrit, which keep
a permanent record of every version of a branch that is pushed while a
pull request is open for that branch. On those platforms, force-pushing
is encouraged and is analogous to emailing a new version of a patchset.

When giving advice about divergent branches, tell the user about
`git pull`, but don't unconditionally instruct the user to do it. A less
prescriptive message will help prevent users from thinking that they are
required to create an integrated history instead of simply replacing the
previous history. Also, don't put `git pull` in an awkward
parenthetical, because `git pull` can always be used to reconcile
branches and is the normal way to do so.

Due to the difficulty of knowing which command for force-pushing is best
suited to the user's situation, no specific advice is given about
force-pushing. Instead, the user is directed to the Git documentation to
read about possible ways forward that do not involve integration.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 09:14:58 -07:00
d92304ff5c remote: don't imply that integration is always required before pushing
In a narrow but common case, the user is the only author of a branch and
doesn't mind overwriting the corresponding branch on the remote. This
workflow is especially common on GitHub, GitLab, and Gerrit, which keep
a permanent record of every version of a branch that is pushed while a
pull request is open for that branch. On those platforms, force-pushing
is encouraged and is analogous to emailing a new version of a patchset.

When giving advice about divergent branches, tell the user about
`git pull`, but don't unconditionally instruct the user to do it. A less
prescriptive message will help prevent users from thinking that they are
required to create an integrated history instead of simply replacing the
previous history. Likewise, don't imply that `git pull` is only for
merging.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 09:14:58 -07:00
b6f3da5132 wt-status: don't show divergence advice when committing
When the user is in the middle of making a commit, they are not yet at
the point where they are ready to think about integrating their local
branch with the corresponding remote branch or force-pushing over the
remote branch. Don't include advice on how to deal with divergent
branches in the commit template, to avoid giving the impression that the
divergence needs to be dealt with immediately. Similar advice will be
printed when it is most relevant, that is, if the user does try to push
without first reconciling the two branches.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 09:14:58 -07:00
de41d03e1c packfile.c: prevent overflow in nth_packed_object_id()
In 37fec86a83 (packfile: abstract away hash constant values,
2018-05-02), `nth_packed_object_id()` started using the variable
`the_hash_algo->rawsz` instead of a fixed constant when trying to
compute an offset into the ".idx" file for some object position.

This can lead to surprising truncation when looking for an object
towards the end of a large enough pack, like the following:

    (gdb) p hashsz
    $1 = 20
    (gdb) p n
    $2 = 215043814
    (gdb) p hashsz * n
    $3 = 5908984

, which is a debugger session broken on a known-bad call to the
`nth_packed_object_id()` function.

This behavior predates 37fec86a83, and is original to the v2 index
format, via: 74e34e1fca (sha1_file.c: learn about index version 2,
2007-04-09).

This is due to §6.4.4.1 of the C99 standard, which states that an
untyped integer constant will take the first type in which the value can
be accurately represented, among `int`, `long int`, and `long long int`.

Since 20 can be represented as an `int`, and `n` is a 32-bit unsigned
integer, the resulting computation is defined by §6.3.1.8, and the
(signed) integer value representing `n` is converted to an unsigned
type, meaning that `20 * n` (for `n` having type `uint32_t`) is
equivalent to a multiplication between two unsigned 32-bit integers.

When multiplying a sufficiently large `n`, the resulting value can
exceed 2^32-1, wrapping around and producing an invalid result. Let's
follow the example in f86f769550 (compute pack .idx byte offsets using
size_t, 2020-11-13) and replace this computation with `st_mult()`, which
will ensure that the computation is done using 64-bits.

While here, guard the corresponding computation for packs with v1
indexes, too. Though the likelihood of seeing a bug there is much
smaller, since (a) v1 indexes are generated far less frequently than v2
indexes, and (b) they all correspond to packs no larger than 2 GiB, so
having enough objects to trigger this overflow is unlikely if not
impossible.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-12 21:44:59 -07:00
def390d593 builtin/repack.c: avoid dir traversal in collect_pack_filenames()
When repacking, the function `collect_pack_filenames()` is responsible
for collecting the set of existing packs in the repository, and
partitioning them into "kept" (if the pack has a ".keep" file or was
given via `--keep-pack`) and "nonkept" (otherwise) lists.

This function comes from the original C port of git-repack.sh from back
in a1bbc6c017 (repack: rewrite the shell script in C, 2013-09-15),
where it first appears as `get_non_kept_pack_filenames()`. At the time,
the implementation was a fairly direct translation from the relevant
portion of git-repack.sh, which looped over the results of

    find "$PACKDIR" -type f -name '*.pack'

either ignoring the pack as kept, or adding it to the list of existing
packs.

So the choice to directly translate this function in terms of
`readdir()` in a1bbc6c017 made sense. At the time, it was possible to
refine the C version in terms of packed_git structs, but was never done.

However, manually enumerating a repository's packs via `readdir()` is
confusing and error-prone. It leads to frustrating inconsistencies
between which packs Git considers to be part of a repository (i.e.,
could be found in the list of packs from `get_all_packs()`), and which
packs `collect_pack_filenames()` considers to meet the same criteria.

This bit us in 73320e49ad (builtin/repack.c: only collect fully-formed
packs, 2023-06-07), and again in the previous commit.

Prevent these issues from biting us in the future by implementing the
`collect_pack_filenames()` function by looping over an array of pointers
to `packed_git` structs, ensuring that we use the same criteria to
determine the set of available packs.

One gotcha here is that we have to ignore non-local packs, since the
original version of `collect_pack_filenames()` only looks at the local
pack directory to collect existing packs.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-11 13:07:51 -07:00
0af067276e builtin/repack.c: only repack .packs that exist
In 73320e49ad (builtin/repack.c: only collect fully-formed packs,
2023-06-07), we switched the check for which packs to collect by
starting at the .idx files and looking for matching .pack files. This
avoids trying to repack pack-files that have not had their pack-indexes
installed yet.

However, it does cause maintenance to halt if we find the (problematic,
but not insurmountable) case of a .idx file without a corresponding
.pack file. In an environment where packfile maintenance is a critical
function, such a hard stop is costly and requires human intervention to
resolve (by deleting the .idx file).

This was not the case before. We successfully repacked through this
scenario until the recent change to scan for .idx files.

Further, if we are actually in a case where objects are missing, we
detect this at a different point during the reachability walk.

In other cases, Git prepares its list of packfiles by scanning .idx
files and then only adds it to the packfile list if the corresponding
.pack file exists. It even does so without a warning! (See
add_packed_git() in packfile.c for details.)

This case is much less likely to occur than the failures seen before
73320e49ad. Packfiles are "installed" by writing the .pack file before
the .idx and that process can be interrupted. Packfiles _should_ be
deleted by deleting the .idx first, followed by the .pack file, but
unlink_pack_path() does not do this: it deletes the .pack _first_,
allowing a window where this process could be interrupted. We leave the
consideration of changing this order as a separate concern. Knowing that
this condition is possible from interrupted Git processes and not other
tools lends some weight that Git should be more flexible around this
scenario.

Add a check to see if the .pack file exists before adding it to the list
for repacking. This will stop a number of maintenance failures seen in
production but fixed by deleting the .idx files.

This brings us closer to the case before 73320e49ad in that 'git
repack' will not fail when there is an orphaned .idx file, at least, not
due to the way we scan for packfiles. In the case that the .pack file
was erroneously deleted without copies of its objects in other installed
packfiles, then 'git repack' will fail due to the reachable object walk.

This does resolve the case where automated repacks will no longer be
halted on this case. The tests in t7700 show both these successful
scenarios and the case of failing if the .pack was truly required.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-11 13:07:50 -07:00
98456eff08 ls-refs.c: avoid enumerating hidden refs where possible
In a similar fashion as in previous commits, teach `ls-refs` to avoid
enumerating hidden references where possible.

As before, this is linux.git with one hidden reference per commit.

    $ hyperfine -L v ,.compile 'git{v} -c protocol.version=2 ls-remote .'
    Benchmark 1: git -c protocol.version=2 ls-remote .
      Time (mean ± σ):      89.8 ms ±   0.6 ms    [User: 84.3 ms, System: 5.7 ms]
      Range (min … max):    88.8 ms …  91.3 ms    32 runs

    Benchmark 2: git.compile -c protocol.version=2 ls-remote .
      Time (mean ± σ):       6.5 ms ±   0.1 ms    [User: 2.4 ms, System: 4.3 ms]
      Range (min … max):     6.2 ms …   8.3 ms    397 runs

    Summary
      'git.compile -c protocol.version=2 ls-remote .' ran
       13.85 ± 0.33 times faster than 'git -c protocol.version=2 ls-remote .'

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
18b6b1b5c5 upload-pack.c: avoid enumerating hidden refs where possible
In a similar fashion as a previous commit, teach `upload-pack` to avoid
enumerating hidden references where possible.

Note, however, that there are certain cases where cannot avoid
enumerating even hidden references, in particular when either of:

  - `uploadpack.allowTipSHA1InWant`, or
  - `uploadpack.allowReachableSHA1InWant`

are set, corresponding to `ALLOW_TIP_SHA1` and `ALLOW_REACHABLE_SHA1`,
respectively.

When either of these bits are set, upload-pack's `is_our_ref()` function
needs to consider the `HIDDEN_REF` bit of the referent's object flags.
So we must visit all references, including the hidden ones, in order to
mark their referents with the `HIDDEN_REF` bit.

When neither `ALLOW_TIP_SHA1` nor `ALLOW_REACHABLE_SHA1` are set, the
`is_our_ref()` function considers only the `OUR_REF` bit, and not the
`HIDDEN_REF` one. `OUR_REF` is applied via `mark_our_ref()`, and only
to objects at the tips of non-hidden references, so we do not need to
visit hidden references in this case.

When neither of those bits are set, `upload-pack` can potentially avoid
enumerating a large number of references. In the same example as a
previous commit (linux.git with one hidden reference per commit,
"refs/pull/N"):

    $ printf 0000 >in
    $ hyperfine --warmup=1 \
      'git -c transfer.hideRefs=refs/pull upload-pack . <in' \
      'git.compile -c transfer.hideRefs=refs/pull -c uploadpack.allowTipSHA1InWant upload-pack . <in' \
      'git.compile -c transfer.hideRefs=refs/pull upload-pack . <in'
    Benchmark 1: git -c transfer.hideRefs=refs/pull upload-pack . <in
      Time (mean ± σ):     406.9 ms ±   1.1 ms    [User: 357.3 ms, System: 49.5 ms]
      Range (min … max):   405.7 ms … 409.2 ms    10 runs

    Benchmark 2: git.compile -c transfer.hideRefs=refs/pull -c uploadpack.allowTipSHA1InWant upload-pack . <in
      Time (mean ± σ):     406.5 ms ±   1.3 ms    [User: 356.5 ms, System: 49.9 ms]
      Range (min … max):   404.6 ms … 408.8 ms    10 runs

    Benchmark 3: git.compile -c transfer.hideRefs=refs/pull upload-pack . <in
      Time (mean ± σ):       4.7 ms ±   0.2 ms    [User: 0.7 ms, System: 3.9 ms]
      Range (min … max):     4.3 ms …   6.1 ms    472 runs

    Summary
      'git.compile -c transfer.hideRefs=refs/pull upload-pack . <in' ran
       86.62 ± 4.33 times faster than 'git.compile -c transfer.hideRefs=refs/pull -c uploadpack.allowTipSHA1InWant upload-pack . <in'
       86.70 ± 4.33 times faster than 'git -c transfer.hideRefs=refs/pull upload-pack . <in'

As above, we must visit every reference when
uploadPack.allowTipSHA1InWant is set. But when it is unset, we can visit
far fewer references.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
cc2a1f98ac builtin/receive-pack.c: avoid enumerating hidden references
Now that `refs_for_each_fullref_in()` has the ability to avoid
enumerating references matching certain pattern(s), use that to avoid
visiting hidden refs when constructing the ref advertisement via
receive-pack.

Note that since this exclusion is best-effort, we still need
`show_ref_cb()` to check whether or not each reference is hidden or not
before including it in the advertisement.

As was the case when applying this same optimization to `upload-pack`,
`receive-pack`'s reference advertisement phase can proceed much quicker
by avoiding enumerating references that will not be part of the
advertisement.

(Below, we're still using linux.git with one hidden refs/pull/N ref per
commit):

    $ hyperfine -L v ,.compile 'git{v} -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'
    Benchmark 1: git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
      Time (mean ± σ):      89.1 ms ±   1.7 ms    [User: 82.0 ms, System: 7.0 ms]
      Range (min … max):    87.7 ms …  95.5 ms    31 runs

    Benchmark 2: git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
      Time (mean ± σ):       4.5 ms ±   0.2 ms    [User: 0.5 ms, System: 3.9 ms]
      Range (min … max):     4.1 ms …   5.6 ms    508 runs

    Summary
      'git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git' ran
       20.00 ± 1.05 times faster than 'git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
15af64dcfd refs.h: implement hidden_refs_to_excludes()
In subsequent commits, we'll teach `receive-pack` and `upload-pack` to
use the new jump list feature in the packed-refs iterator by ignoring
references which are mentioned via its respective hideRefs lists.

However, the packed-ref jump lists cannot handle un-hiding rules (that
begin with '!'), or namespace comparisons (that begin with '^'). Add a
convenience function to the refs.h API to detect when either of these
conditions are met, and returns an appropriate value to pass as excluded
patterns.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
e6bf24d39a refs.h: let for_each_namespaced_ref() take excluded patterns
A future commit will want to call `for_each_namespaced_ref()` with
a list of excluded patterns.

We could introduce a variant of that function, say,
`for_each_namespaced_ref_exclude()` which takes the extra parameter, and
reimplement the original function in terms of that. But all but one
caller (in `http-backend.c`) will supply the new parameter, so add the
new parameter to `for_each_namespaced_ref()` itself instead of
introducing a new function.

For now, supply NULL for the list of excluded patterns at all callers to
avoid changing behavior, which we will do in a future change.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
c45841fff8 revision.h: store hidden refs in a strvec
In subsequent commits, it will be convenient to have a 'const char **'
of hidden refs (matching `transfer.hiderefs`, `uploadpack.hideRefs`,
etc.), instead of a `string_list`.

Convert spots throughout the tree that store the list of hidden refs
from a `string_list` to a `strvec`.

Note that in `parse_hide_refs_config()` there is an ugly const-cast used
to avoid an extra copy of each value before trimming any trailing slash
characters. This could instead be written as:

    ref = xstrdup(value);
    len = strlen(ref);
    while (len && ref[len - 1] == '/')
            ref[--len] = '\0';
    strvec_push(hide_refs, ref);
    free(ref);

but the double-copy (once when calling `xstrdup()`, and another via
`strvec_push()`) is wasteful.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
c489f47a64 refs/packed-backend.c: add trace2 counters for jump list
The previous commit added low-level tests to ensure that the packed-refs
iterator did not enumerate excluded sections of the refspace.

However, there was no guarantee that these sections weren't being
visited, only that they were being suppressed from the output. To harden
these tests, add a trace2 counter which tracks the number of regions
skipped by the packed-refs iterator, and assert on its value.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
59c35fac54 refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
When iterating through the `packed-refs` file in order to answer a query
like:

    $ git for-each-ref --exclude=refs/__hidden__

it would be useful to avoid walking over all of the entries in
`refs/__hidden__/*` when possible, since we know that the ref-filter
code is going to throw them away anyways.

In certain circumstances, doing so is possible. The algorithm for doing
so is as follows:

  - For each excluded pattern, find the first record that matches it,
    and the first record that *doesn't* match it (i.e. the location
    you'd next want to consider when excluding that pattern).

  - Sort the set of excluded regions from the previous step in ascending
    order of the first location within the `packed-refs` file that
    matches.

  - Clean up the results from the previous step: discard empty regions,
    and combine adjacent regions. The set of regions which remains is
    referred to as the "jump list", and never contains any references
    which should be included in the result set.

Then when iterating through the `packed-refs` file, if `iter->pos` is
ever contained in one of the regions from the previous steps, advance
`iter->pos` past the end of that region, and continue enumeration.

Note that we only perform this optimization when none of the excluded
pattern(s) have special meta-characters in them. For a pattern like
"refs/foo[ac]", the excluded regions ("refs/fooa", "refs/fooc", and
everything underneath them) are not connected. A future implementation
that handles this case may split the character class (pretending as if
two patterns were excluded: "refs/fooa", and "refs/fooc").

There are a few other gotchas worth considering. First, note that the
jump list is sorted, so once we jump past a region, we can avoid
considering it (or any regions preceding it) again. The member
`jump_pos` is used to track the first next-possible region to jump
through.

Second, note that the jump list is best-effort, since we do not handle
loose references, and because of the meta-character issue above. The
jump list may not skip past all references which won't appear in the
results, but will never skip over a reference which does appear in the
result set.

In repositories with a large number of hidden references, the speed-up
can be significant. Tests here are done with a copy of linux.git with a
reference "refs/pull/N" pointing at every commit, as in:

    $ git rev-list HEAD | awk '{ print "create refs/pull/" NR " " $0 }' |
        git update-ref --stdin
    $ git pack-refs --all

, it is significantly faster to have `for-each-ref` jump over the
excluded references, as opposed to filtering them out after the fact:

    $ hyperfine \
      'git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"' \
      'git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"' \
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"'
    Benchmark 1: git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"
      Time (mean ± σ):     798.1 ms ±   3.3 ms    [User: 687.6 ms, System: 146.4 ms]
      Range (min … max):   794.5 ms … 805.5 ms    10 runs

    Benchmark 2: git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"
      Time (mean ± σ):      98.9 ms ±   1.4 ms    [User: 93.1 ms, System: 5.7 ms]
      Range (min … max):    97.0 ms … 104.0 ms    29 runs

    Benchmark 3: git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"
      Time (mean ± σ):       4.5 ms ±   0.2 ms    [User: 0.7 ms, System: 3.8 ms]
      Range (min … max):     4.1 ms …   5.8 ms    524 runs

    Summary
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"' ran
       21.87 ± 1.05 times faster than 'git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"'
      176.52 ± 8.19 times faster than 'git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"'

(Comparing stock git and this patch isn't quite fair, since an earlier
commit in this series adds a naive implementation of the `--exclude`
option. `git.prev` is built from the previous commit and includes this
naive implementation).

Using the jump list is fairly straightforward (see the changes to
`refs/packed-backend.c::next_record()`), but constructing the list is
not. To ensure that the construction is correct, add a new suite of
tests in t1419 covering various corner cases (overlapping regions,
partially overlapping regions, adjacent regions, etc.).

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
d22d941ac0 refs/packed-backend.c: refactor find_reference_location()
The function `find_reference_location()` is used to perform a
binary search-like function over the contents of a repository's
`$GIT_DIR/packed-refs` file.

The search it implements is unlike a standard binary search in that the
records it searches over are not of a fixed width, so the comparison
must locate the end of a record before comparing it.

Extract the core routine of `find_reference_location()` in order to
implement a function in the following patch which will find the first
location in the `packed-refs` file that *doesn't* match the given
pattern.

The behavior of `find_reference_location()` is unchanged.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
b269ac53c0 refs: plumb exclude_patterns argument throughout
The subsequent patch will want to access an optional `excluded_patterns`
array within `refs/packed-backend.c` that will cull out certain
references matching any of the given patterns on a best-effort basis.

To do so, the refs subsystem needs to be updated to pass this value
across a number of different locations.

Prepare for a future patch by introducing this plumbing now, passing
NULLs at top-level APIs in order to make that patch less noisy and more
easily readable.

Signed-off-by: Taylor Blau <me@ttaylorr.co>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
8255dd8a3d builtin/for-each-ref.c: add --exclude option
When using `for-each-ref`, it is sometimes convenient for the caller to
be able to exclude certain parts of the references.

For example, if there are many `refs/__hidden__/*` references, the
caller may want to emit all references *except* the hidden ones.
Currently, the only way to do this is to post-process the output, like:

    $ git for-each-ref --format='%(refname)' | grep -v '^refs/hidden/'

Which is do-able, but requires processing a potentially large quantity
of references.

Teach `git for-each-ref` a new `--exclude=<pattern>` option, which
excludes references from the results if they match one or more excluded
patterns.

This patch provides a naive implementation where the `ref_filter` still
sees all references (including ones that it will discard) and is left to
check whether each reference matches any excluded pattern(s) before
emitting them.

By culling out references we know the caller doesn't care about, we can
avoid allocating memory for their storage, as well as spending time
sorting the output (among other things). Even the naive implementation
provides a significant speed-up on a modified copy of linux.git (that
has a hidden ref pointing at each commit):

    $ hyperfine \
      'git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"' \
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/'
    Benchmark 1: git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"
      Time (mean ± σ):     820.1 ms ±   2.0 ms    [User: 703.7 ms, System: 152.0 ms]
      Range (min … max):   817.7 ms … 823.3 ms    10 runs

    Benchmark 2: git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/
      Time (mean ± σ):     106.6 ms ±   1.1 ms    [User: 99.4 ms, System: 7.1 ms]
      Range (min … max):   104.7 ms … 109.1 ms    27 runs

    Summary
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/' ran
        7.69 ± 0.08 times faster than 'git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"'

Subsequent patches will improve on this by avoiding visiting excluded
sections of the `packed-refs` file in certain cases.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
284c55deb5 ref-filter.c: parameterize match functions over patterns
`match_pattern()` and `match_name_as_path()` both take a `struct
ref_filter *`, and then store a stack variable `patterns` pointing at
`filter->patterns`.

The subsequent patch will add a new array of patterns to match over (the
excluded patterns, via a new `git for-each-ref --exclude` option),
treating the return value of these functions differently depending on
which patterns are being used to match.

Tweak `match_pattern()` and `match_name_as_path()` to take an array of
patterns to prepare for passing either in.

Once we start passing either in, `match_pattern()` will have little to
do with a particular `struct ref_filter *` instance. To clarify this,
drop it from the argument list, and replace it with the only bit of the
`ref_filter` that we care about (`filter->ignore_case`).

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
b571fb9800 ref-filter: add ref_filter_clear()
We did not bother to clean up at all in `git branch` or `git tag`, and
`git for-each-ref` only cleans up a couple of members.

Add and call `ref_filter_clear()` when cleaning up a `struct
ref_filter`. Running this patch (without any test changes) indicates a
couple of now leak-free tests. This was found by running:

    $ make SANITIZE=leak
    $ make -C t GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_OPTS=--immediate

(Note that the `reachable_from` and `unreachable_from` lists should be
cleaned as they are used. So this is just covering any case where we
might bail before running the reachability check.)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
311bfe18ce ref-filter: clear reachable list pointers after freeing
In `reach_filter()`, we pop all commits from the reachable lists,
leaving them empty. But because we're operating on a list pointer that
was passed by value, the original `filter.reachable_from` and
`filter.unreachable_from` pointers are left dangling.

As is the case with the previous commit, nobody touches either of these
fields after calling `reach_filter()`, so leaving them dangling is OK.
But this future proofs against dangerous situations.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
b9f7daa6ef ref-filter.h: provide REF_FILTER_INIT
Provide a sane initialization value for `struct ref_filter`, which in a
subsequent patch will be used to initialize a new field.

In the meantime, ensure that the `ref_filter` struct used in the
test-helper's `cmd__reach()` is zero-initialized. The lack of
initialization is OK, since `commit_contains()` only looks at the single
`with_commit_tag_algo` field that *is* initialized directly above.

So this does not fix a bug, but rather prevents one from biting us in
the future.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
bf1377a12b refs.c: rename ref_filter
The refs machinery has its own implementation of a `ref_filter` (used by
`for-each-ref`), which is distinct from the `ref-filter.h` API (also
used by `for-each-ref`, among other things).

Rename the one within refs.c to more clearly indicate its purpose.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
4c9cb51fe7 doc: gitcredentials: link to helper list
Link to community list of credential helpers. This is useful information
for users.

Describe how OAuth credential helpers work. OAuth is a user-friendly
alternative to personal access tokens and SSH keys. Reduced setup cost
makes it easier for users to contribute to projects across multiple
forges.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:35:55 -07:00
9281cd07f0 commit-graph.c: avoid duplicated progress output during verify
When `git commit-graph verify` was taught how to verify commit-graph
chains in 3da4b609bb (commit-graph: verify chains with --shallow mode,
2019-06-18), it produced one line of progress per layer of the
commit-graph chain.

    $ git.compile commit-graph verify
    Verifying commits in commit graph: 100% (4356/4356), done.
    Verifying commits in commit graph: 100% (131912/131912), done.

This could be somewhat confusing to users, who may wonder why there are
multiple occurrences of "Verifying commits in commit graph".

There are likely good arguments on whether or not there should be
one line of progress output per commit-graph layer. On the one hand, the
existing output shows us verifying each individual layer of the chain.
But on the other hand, the fact that a commit-graph may be stored among
multiple layers is an implementation detail that the caller need not be
aware of.

Clarify this by showing a single progress meter regardless of the number
of layers in the commit-graph chain. After this patch, the output
reflects the logical contents of a commit-graph chain, instead of
showing one line of output per commit-graph layer:

    $ git.compile commit-graph verify
    Verifying commits in commit graph: 100% (136268/136268), done.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:45 -07:00
7248857b6e commit-graph.c: pass progress to verify_one_commit_graph()
This is the final step to prepare for consolidating the output of `git
commit-graph verify`. Instead of having each call to
`verify_one_commit_graph()` initialize its own progress struct, have the
caller pass one in instead.

This patch does not alter the output of `git commit-graph verify`, but
the next commit will consolidate the output.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:43 -07:00
f5facaa465 commit-graph.c: iteratively verify commit-graph chains
Now that we have a function which can verify a single layer of a
commit-graph chain, implement `verify_commit_graph()` in terms of
iterating over commit-graphs along their `->base_graph` pointers.

This further prepares us to consolidate the progress output of `git
commit-graph verify`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:42 -07:00
eb319d6771 commit-graph.c: extract verify_one_commit_graph()
When the `verify_commit_graph()` function was extended to support
commit-graph chains via 3da4b609bb (commit-graph: verify chains with
--shallow mode, 2019-06-18), it did so by recursively calling itself on
each layer of the commit-graph chain.

In practice this poses no issues, since commit-graph chains do not loop,
and there are few enough of them that adding additional frames to the
stack is not a problem.

A future commit will consolidate the progress output from `git
commit-graph verify` when verifying chained commit-graphs to print a
single line instead of one progress meter per commit-graph layer.
Prepare for this by extracting a routine to verify a single layer of a
commit-graph.

Note that `verify_commit_graph()` is still recursive after this patch,
but this will change in the subsequent patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:41 -07:00
39bdd30377 fsck: suppress MIDX output with --no-progress
In a similar spirit as the previous commit, address a bug where `git
fsck` produces output when calling `git multi-pack-index verify` even
when invoked with `--no-progress`.

    $ git.compile fsck --connectivity-only --no-progress --no-dangling
    Verifying OID order in multi-pack-index: 100% (605677/605677), done.
    Sorting objects by packfile: 100% (605678/605678), done.
    Verifying object offsets: 100% (605678/605678), done.

The three lines produced by `git fsck` come from `git multi-pack-index
verify`, but should be squelched due to `--no-progress`.

The MIDX machinery learned to generate these progress messages as early
as 430efb8a74 (midx: add progress indicators in multi-pack-index
verify, 2019-03-21), but did not respect `--progress` or `--no-progress`
until ad60096d1c (midx: honor the MIDX_PROGRESS flag in
verify_midx_file, 2019-10-21).

But the `git multi-pack-index verify` step was added to fsck in
66ec0390e7 (fsck: verify multi-pack-index, 2018-09-13), pre-dating any
of the above patches.

Pass `--[no-]progress` as appropriate to ensure that we don't produce
output when told not to.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:40 -07:00
eda206f611 fsck: suppress commit-graph output with --no-progress
Since e0fd51e1d7 (fsck: verify commit-graph, 2018-06-27), `fsck` runs
`git commit-graph verify` to check the integrity of any commit-graph(s).

Originally, the `git commit-graph verify` step would always print to
stdout/stderr, regardless of whether or not `fsck` was invoked with
`--[no-]progress` or not. But in 7371612255 (commit-graph: add
--[no-]progress to write and verify, 2019-08-26), the commit-graph
machinery learned the `--[no-]progress` option, though `fsck` was not
updated to pass this new flag (or not).

This led to seeing output from running `git fsck`, even with
`--no-progress` on repositories that have a commit-graph:

    $ git.compile fsck --connectivity-only --no-progress --no-dangling
    Verifying commits in commit graph: 100% (4356/4356), done.
    Verifying commits in commit graph: 100% (131912/131912), done.

Ensure that `fsck` passes `--[no-]progress` as appropriate when calling
`git commit-graph verify`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:37 -07:00
f4a8fde057 dir: match "attr" pathspec magic with correct paths
The match_pathspec_item() function takes "prefix" value, allowing a
caller to chop off the common leading prefix of pathspec pattern
strings from the path and only use the remainder of the path to
match the pathspec patterns (after chopping the same leading prefix
of them, of course).

This "common leading prefix" optimization has two main features:

 * discard the entries in the in-core index that are outside of the
   common leading prefix; if you are doing "ls-files one/a one/b",
   we know all matches must be from "one/", so first the code
   discards all entries outside the "one/" directory from the
   in-core index.  This allows us to work on a smaller dataset.

 * allow skipping the comparison of the leading bytes when matching
   pathspec with path.  When "ls-files" finds the path "one/a/1" in
   the in-core index given "one/a" and "one/b" as the pathspec,
   knowing that common leading prefix "one/" was found lets the
   pathspec matchinery not to bother comparing "one/" part, and
   allows it to feed "a/1" down, as long as the pathspec element
   "one/a" gets corresponding adjustment to "a".

When the "attr" pathspec magic is in effect, however, the current
code breaks down.

The attributes, other than the ones that are built-in and the ones
that come from the $GIT_DIR/info/attributes file and the top-level
.gitattributes file, are lazily read from the filesystem on-demand,
as we encounter each path and ask if it matches the pathspec.  For
example, if you say "git ls-files "(attr:label)sub/" in a repository
with a file "sub/file" that is given the 'label' attribute in
"sub/.gitattributes":

 * The common prefix optimization finds that "sub/" is the common
   prefix and prunes the in-core index so that it has only entries
   inside that directory.  This is desirable.

 * The code then walks the in-core index, finds "sub/file", and
   eventually asks do_match_pathspec() if it matches the given
   pathspec.

 * do_match_pathspec() calls match_pathspec_item() _after_ stripping
   the common prefix "sub/" from the path, giving it "file", plus
   the length of the common prefix (4-bytes), so that the pathspec
   element "(attr:label)sub/" can be treated as if it were "(attr:label)".

The last one is what breaks the match in the current code, as the
pathspec subsystem ends up asking the attribute subsystem to find
the attribute attached to the path "file".  We need to ask about the
attributes on "sub/file" when calling match_pathspec_attrs(); this
can be done by looking at "prefix" bytes before the beginning of
"name", which is the same trick already used by another piece of the
code in the same match_pathspec_item() function.

Unfortunately this was not discovered so far because the code works
with slightly different arguments, e.g.

 $ git ls-files "(attr:label)sub"
 $ git ls-files "(attr:label)sub/" "no/such/dir/"

would have reported "sub/file" as a path with the 'label' attribute
just fine, because neither would trigger the common prefix
optimization.

Reported-by: Matthew Hughes <mhughes@uw.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-08 14:36:31 -07:00
aa9166bcc0 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-08 11:23:08 -07:00
b00ec259e7 Merge branch 'jk/fsck-indices-in-worktrees'
Code clarification.

* jk/fsck-indices-in-worktrees:
  fsck: avoid misleading variable name
2023-07-08 11:23:08 -07:00
7f5ad0ca8d Merge branch 'js/empty-index-fixes'
A few places failed to differenciate the case where the index is
truly empty (nothing added) and we haven't yet read from the
on-disk index file, which have been corrected.

* js/empty-index-fixes:
  commit -a -m: allow the top-level tree to become empty again
  split-index: accept that a base index can be empty
  do_read_index(): always mark index as initialized unless erroring out
2023-07-08 11:23:07 -07:00
d52a45cf56 Merge branch 'ks/t4205-test-describe-with-abbrev-fix'
Test update.

* ks/t4205-test-describe-with-abbrev-fix:
  t4205: correctly test %(describe:abbrev=...)
2023-07-08 11:23:07 -07:00
bd19ee9c45 pretty: use strchr(3) in userformat_find_requirements()
The strbuf_expand_step() loop in userformat_find_requirements() iterates
through the percent signs in the string "fmt", but we're not interested
in its effect on the strbuf "dummy".  Use strchr(3) instead and get rid
of the strbuf that we no longer need.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:32:57 -07:00
3e81b896f7 pkt-line: add size parameter to packet_length()
hex2chr() takes care not to run over the end of a NUL-terminated string.
It's used in packet_length(), but both callers of that function pass a
four-byte buffer, making NUL-checks unnecessary.  packet_length() could
accidentally be used with a pointer to a buffer of unknown size at new
call-sites, though, and the compiler wouldn't complain.

Add a size parameter plus check, and remove the NUL-checks by calling
hexval() directly.  This trades three NUL checks against one size check
and the ability to report the use of a short buffer at runtime.

If any of the four bytes is NUL or -- more generally -- not a
hexadecimal digit, then packet_length() still returns a negative value.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:30:16 -07:00
30c8c55cbf tree-walk: drop unused base_offset from do_match()
The tree-walk.c:do_match() function takes base_offset but just like
tree_entry_interesting() we dealt with earlier, nobody passes a
value other than 0 in it.  Get rid of the parameter to avoid having
to worry about potential bugs lurking unexercised.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:27:28 -07:00
0ad927e9e0 tree-walk: lose base_offset that is never used in tree_entry_interesting
The tree_entry_interesting() function takes base_offset, allowing
its callers to potentially pass a non-zero number to skip the early
part of the path string.

The feature is never exercised and we do not even know what bugs are
lurking there, as all callers pass 0 to the parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:27:28 -07:00
7e360bc626 t6135: attr magic with path pattern
The test coverage on attribute magic combined with path pattern
was a bit thin.  Let's add a few and make sure "(attr:X)sub" and
"(attr:X)sub/" behave the same.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:23:42 -07:00
1dd14e8e93 pretty: avoid double negative in format_commit_item()
Test for equality with no_flush, which has enough negation already.  Get
rid of the unnecessary parentheses while at it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 12:05:42 -07:00
7b7203e78a ls-tree: simplify prefix handling
git ls-tree has two prefixes: The one handed to cmd_ls_tree(), i.e. the
current subdirectory in the repository (if any) and the "display" prefix
used by the show_tree_*() functions.  The option --full-name clears the
last one, i.e. it shows full paths, and --full-tree clears both, i.e. it
acts as if the command was started in the root of the repository.

The show_tree_*() functions use the ls_tree_options members chomp_prefix
and ls_tree_prefix to determine their prefix values.  Calculate it once
in cmd_ls_tree() instead, once the main prefix value is finalized.

This allows chomp_prefix to become a local variable.  Stop using
strlen(3) to determine its initial value -- we only care whether we got
a non-empty string, not precisely how long it is.

Rename ls_tree_prefix to prefix to demonstrate that we converted all
users and because the ls_tree_ part is no longer necessary since
030a3d5d9e (ls-tree: use a "struct options", 2023-01-12) turned it from
a global variable to a struct member.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 11:57:13 -07:00
061c58647e The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-06 11:54:48 -07:00
b3d1c85d48 Merge branch 'gc/config-context'
Reduce reliance on a global state in the config reading API.

* gc/config-context:
  config: pass source to config_parser_event_fn_t
  config: add kvi.path, use it to evaluate includes
  config.c: remove config_reader from configsets
  config: pass kvi to die_bad_number()
  trace2: plumb config kvi
  config.c: pass ctx with CLI config
  config: pass ctx with config files
  config.c: pass ctx in configsets
  config: add ctx arg to config_fn_t
  urlmatch.h: use config_fn_t type
  config: inline git_color_default_config
2023-07-06 11:54:48 -07:00
1d76e69212 Merge branch 'jc/doc-hash-object-types'
Doc update.

* jc/doc-hash-object-types:
  docs: add git hash-object -t option's possible values
2023-07-06 11:54:47 -07:00
391414e971 Merge branch 'jk/cherry-pick-revert-status'
During a cherry-pick or revert session that works on multiple
commits, "git status" did not give correct information, which has
been corrected.

* jk/cherry-pick-revert-status:
  fix cherry-pick/revert status when doing multiple commits
2023-07-06 11:54:47 -07:00
84b889bd03 Merge branch 'pw/apply-too-large'
"git apply" punts when it is fed too large a patch input; the error
message it gives when it happens has been clarified.

* pw/apply-too-large:
  apply: improve error messages when reading patch
2023-07-06 11:54:47 -07:00
a9cc3b8fc7 Merge branch 'tl/notes-separator'
'git notes append' was taught '--separator' to specify string to insert
between paragraphs.

* tl/notes-separator:
  notes: introduce "--no-separator" option
  notes.c: introduce "--[no-]stripspace" option
  notes.c: append separator instead of insert by pos
  notes.c: introduce '--separator=<paragraph-break>' option
  t3321: add test cases about the notes stripspace behavior
  notes.c: use designated initializers for clarity
  notes.c: cleanup 'strbuf_grow' call in 'append_edit'
2023-07-06 11:54:47 -07:00
5a1d9e8f87 Merge branch 'gc/config-partial-submodule-kvi-fix'
Partially revert a sanity check that the rest of the config code
was not ready, to avoid triggering it in a corner case.

* gc/config-partial-submodule-kvi-fix:
  config: don't BUG when both kvi and source are set
2023-07-06 11:54:46 -07:00
f4c18e58be Merge branch 'pb/complete-diff-options'
Completion updates.

* pb/complete-diff-options: (24 commits)
  diff.c: mention completion above add_diff_options
  completion: complete --remerge-diff
  completion: complete --diff-merges, its options and --no-diff-merges
  completion: move --pickaxe-{all,regex} to __git_diff_common_options
  completion: complete --ws-error-highlight
  completion: complete --unified
  completion: complete --output-indicator-{context,new,old}
  completion: complete --output
  completion: complete --no-stat
  completion: complete --no-relative
  completion: complete --line-prefix
  completion: complete --ita-invisible-in-index and --ita-visible-in-index
  completion: complete --irreversible-delete
  completion: complete --ignore-matching-lines
  completion: complete --function-context
  completion: complete --find-renames
  completion: complete --find-object
  completion: complete --find-copies
  completion: complete --default-prefix
  completion: complete --compact-summary
  ...
2023-07-06 11:54:46 -07:00
67e7305e64 Merge branch 'cw/strbuf-cleanup'
Move functions that are not about pure string manipulation out of
strbuf.[ch]

* cw/strbuf-cleanup:
  strbuf: remove global variable
  path: move related function to path
  object-name: move related functions to object-name
  credential-store: move related functions to credential-store file
  abspath: move related functions to abspath
  strbuf: clarify dependency
  strbuf: clarify API boundary
2023-07-06 11:54:46 -07:00
da269af920 Merge branch 'rs/strbuf-expand-step'
Code clean-up around strbuf_expand() API.

* rs/strbuf-expand-step:
  strbuf: simplify strbuf_expand_literal_cb()
  replace strbuf_expand() with strbuf_expand_step()
  replace strbuf_expand_dict_cb() with strbuf_expand_step()
  strbuf: factor out strbuf_expand_step()
  pretty: factor out expand_separator()
2023-07-06 11:54:45 -07:00
1e3f26542a diff --no-index: support reading from named pipes
In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:

  diff -u <(printf "a\nb\n") <(printf "a\nc\n")

However, this syntax does not produce useful results with "git diff
--no-index". On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named pipe. On
Linux, the arguments are symlinks to pipes, so git diff "helpfully"
diffs these symlinks, comparing their targets like "pipe:[1234]" and
"pipe:[5678]".

To address this "diff --no-index" is changed so that if a path given on
the commandline is a named pipe or a symbolic link that resolves to a
named pipe then we read the data to diff from that pipe. This is
implemented by generalizing the code that already exists to handle
reading from stdin when the user passes the path "-".

If the user tries to compare a named pipe to a directory then we die as
we do when trying to compare stdin to a directory.

As process substitution is not support by POSIX this change is tested by
using a pipe and a symbolic link to a pipe.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 14:00:28 -07:00
df521462f0 t4054: test diff --no-index with stdin
"git diff --no-index" supports reading from stdin with the path "-".
There is no test coverage for this so add a regression test before
changing the code in the next commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 14:00:28 -07:00
4e61e0f680 diff --no-index: die on error reading stdin
If there is an error when reading from stdin then "diff --no-index"
prints an error message but continues to try and diff a file named "-"
resulting in an error message that looks like

    error: error while reading from stdin: Invalid argument
    fatal: stat '-': No such file or directory

assuming that no file named "-" exists. If such a file exists it prints
the first error message and generates the diff from that file which is
not what the user wanted. Instead just die() straight away if we cannot
read from stdin.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 14:00:28 -07:00
498198453d diff --no-index: refuse to compare stdin to a directory
When the user runs

    git diff --no-index file directory

we follow the behavior of POSIX diff and rewrite the arguments as

    git diff --no-index file directory/file

Doing that when "file" is "-" (which means "read from stdin") does not
make sense so we should error out if the user asks us to compare "-" to
a directory. This matches the behavior of GNU diff and diff on *BSD.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 14:00:28 -07:00
1aa92b8500 t0091-bugreport.sh: actually verify some content of report
In the first test in this script, 'creates a report with content in the
right places', we generate a report and pipe it into our helper
`check_all_headers_populated()`. The idea of the helper is to find all
lines that look like headers ("[Some Header Here]") and to check that
the next line is non-empty. This is supposed to catch erroneous outputs
such as the following:

  [A Header]
  something
  more here

  [Another Header]

  [Too Early Header]
  contents

However, we provide the lines of the bug report as filenames to grep,
meaning we mostly end up spewing errors:

  grep: : No such file or directory
  grep: [System Info]: No such file or directory
  grep: git version:: No such file or directory
  grep: git version 2.41.0.2.gfb7d80edca: No such file or directory

This doesn't disturb the test, which tugs along and reports success, not
really having verified the contents of the report at all.

Note that after 788a776069 ("bugreport: collect list of populated
hooks", 2020-05-07), the bug report, which is created in our hook-less
test repo, contains an empty section with the enabled hooks. Thus, even
the intention of our helper is a bit misguided: there is nothing
inherently wrong with having an empty section in the bug report.

Let's instead split this test into three: first verify that we generate
a report at all, then check that the introductory blurb looks the way it
should, then verify that the "[System Info]" seems to contain the right
things. (The "[Enabled Hooks]" section is tested later in the script.)

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:45:46 -07:00
91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
28aed75a9f kwset: move translation table from ctype
This table was originally introduced to solely be used with kwset
machinery (0f871cf56e), so it would make sense for it to belong in
kwset.[ch] rather than ctype.c and git-compat-util.h. It is only used in
diffcore-pickaxe.c, which already includes kwset.h so no other headers
have to be modified.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:18 -07:00
1890ce84bd sane-ctype.h: create header for sane-ctype macros
Splitting these macros from git-compat-util.h cleans up the file and
allows future third-party sources to not use these overrides if they do
not wish to.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:18 -07:00
382f6940af git-compat-util: move wrapper.c funcs to its header
Since the functions in wrapper.c are widely used across the codebase,
include it by default in git-compat-util.h. A future patch will remove
now unnecessary inclusions of wrapper.h from other files.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:18 -07:00
fda5d9595d git-compat-util: move strbuf.c funcs to its header
While functions like starts_with() probably should not belong in the
boundaries of the strbuf library, this commit focuses on first splitting
out headers from git-compat-util.h.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:18 -07:00
d378637d2f imap-send: drop unused fields from imap_cmd_cb
The imap_cmd_cb struct has several fields which are totally unused.
Presumably they did useful things in the upstream isync code from which
this is derived, but they don't in our more limited program. This is
particularly confusing for the "done" callback, which (as of the
previous patch) no longer matches the signature of the adjacent "cont"
callback.

Since we're unlikely to share code with isync going forward, we should
feel free to simplify the code here. Note that "done" is examined but
never set, so we can also drop a little bit of code outside of the
struct definition.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 10:16:53 -07:00
ec9e358a4a imap-send: drop unused parameter from imap_cmd_cb callback
There's a generic callback mechanism for handling plus-continuation of
IMAP commands. It takes the imap_cmd struct itself as an argument. That
seems reasonable, and in a larger imap-using program it might be used.
But in imap-send, we have only one such callback (auth_cram_md5) and it
doesn't use this value, triggering -Wunused-parameter warnings.

We could just mark the parameter as UNUSED. But since this is the only
such function, and because we are not likely to share code with the
upstream isync anymore, we can just simplify the interface to remove
this parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 10:16:53 -07:00
84d689a810 imap-send: use server conf argument in setup_curl()
Our caller passes in an imap_server_conf struct, but we ignore it
totally, and instead read the config directly from the global "server"
variable. This works OK, since our sole caller will pass in that same
global variable. But the intent seems to have been to use the passed-in
variable, as otherwise it has no purpose (and many other functions use
the same pattern).

Let's use the passed-in value, which also silences a -Wunused-parameter
warning.

It would be nice if "server" was not a global here, as we could avoid
making similar mistakes. But changing that would be a larger refactor,
as it must be accessed as a global in a few spots (e.g., filling it in
with the config callback).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 10:16:53 -07:00
bbd7c7b7c0 docs: add necessary headers to Documentation/MFOW.txt
The tutorial in Documentation/MyFirstObjectWalk.txt
contains the functions trace_printf(), oid_to_hex(),
and pp_commit_easy(), and struct oidset, which are used
without any hint of where they are defined. When the provided
code is compiled, the compiler returns an error, stating that
the functions and the struct are used before declaration. Therefore,include
necessary header files (the ones which have no mentions in the tutorial).

Signed-off-by: Vinayak Dev <vinayakdev.sci@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-04 23:11:49 -07:00
a646b86cd1 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-04 16:08:18 -07:00
89d62d5e8e Merge branch 'bc/more-git-var'
Add more "git var" for toolsmiths to learn various locations Git is
configured with either via the configuration or hardcoded defaults.

* bc/more-git-var:
  var: add config file locations
  var: add attributes files locations
  attr: expose and rename accessor functions
  var: adjust memory allocation for strings
  var: format variable structure with C99 initializers
  var: add support for listing the shell
  t: add a function to check executable bit
  var: mark unused parameters in git_var callbacks
2023-07-04 16:08:18 -07:00
812907d16f Merge branch 'ps/revision-stdin-with-options'
The set-up code for the get_revision() API now allows feeding
options like --all and --not in the --stdin mode.

* ps/revision-stdin-with-options:
  revision: handle pseudo-opts in `--stdin` mode
  revision: small readability improvement for reading from stdin
  revision: reorder `read_revisions_from_stdin()`
2023-07-04 16:08:18 -07:00
4f3e5af03a l10n: ru.po: update Russian translation
Signed-off-by: Dimitriy Ryazantcev <dimitriy.ryazantcev@gmail.com>
2023-06-30 13:21:19 +03:00
9748a68200 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 16:43:21 -07:00
4c237d2ca2 Merge branch 'tb/gc-recent-object-hook'
Test update.

* tb/gc-recent-object-hook:
  t7701: make annotated tag unreachable
2023-06-29 16:43:21 -07:00
3ea43bbe17 Merge branch 'jc/abort-ll-merge-with-a-signal'
When the external merge driver is killed by a signal, its output
should not be trusted as a resolution with conflicts that is
proposed by the driver, but the code did.

* jc/abort-ll-merge-with-a-signal:
  t6406: skip "external merge driver getting killed by a signal" test on Windows
  ll-merge: killing the external merge driver aborts the merge
2023-06-29 16:43:21 -07:00
a1264a08a1 Merge branch 'en/header-split-cache-h-part-3'
Header files cleanup.

* en/header-split-cache-h-part-3: (28 commits)
  fsmonitor-ll.h: split this header out of fsmonitor.h
  hash-ll, hashmap: move oidhash() to hash-ll
  object-store-ll.h: split this header out of object-store.h
  khash: name the structs that khash declares
  merge-ll: rename from ll-merge
  git-compat-util.h: remove unneccessary include of wildmatch.h
  builtin.h: remove unneccessary includes
  list-objects-filter-options.h: remove unneccessary include
  diff.h: remove unnecessary include of oidset.h
  repository: remove unnecessary include of path.h
  log-tree: replace include of revision.h with simple forward declaration
  cache.h: remove this no-longer-used header
  read-cache*.h: move declarations for read-cache.c functions from cache.h
  repository.h: move declaration of the_index from cache.h
  merge.h: move declarations for merge.c from cache.h
  diff.h: move declaration for global in diff.c from cache.h
  preload-index.h: move declarations for preload-index.c from elsewhere
  sparse-index.h: move declarations for sparse-index.c from cache.h
  name-hash.h: move declarations for name-hash.c from cache.h
  run-command.h: move declarations for run-command.c from cache.h
  ...
2023-06-29 16:43:21 -07:00
b2166b0d49 Merge branch 'ds/remove-idx-before-pack'
We create .pack and then .idx, we consider only packfiles that have
.idx usable (those with only .pack are not ready yet), so we should
remove .idx before removing .pack for consistency.

* ds/remove-idx-before-pack:
  packfile: delete .idx files before .pack files
2023-06-29 16:43:20 -07:00
6e6a529b57 fsck: avoid misleading variable name
When reporting a problem, `git fsck` emits a message such as:

    missing blob 1234abcd (:file)

However, this can be ambiguous when the problem is detected in the index
of a worktree other than the one in which `git fsck` was invoked. To
address this shortcoming, 592ec63b38 (fsck: mention file path for index
errors, 2023-02-24) enhanced the output to mention the path of the index
when the problem is detected in some other worktree:

    missing blob 1234abcd (.git/worktrees/wt/index:file)

Unfortunately, the variable in fsck_index() which controls whether the
index path should be shown is misleadingly named "is_main_index" which
can be misunderstood as referring to the main worktree (i.e. the one
housing the .git/ repository) rather than to the current worktree (i.e.
the one in which `git fsck` was invoked). Avoid such potential confusion
by choosing a name more reflective of its actual purpose.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 13:58:57 -07:00
1876a5ae15 t4205: correctly test %(describe:abbrev=...)
The pretty format %(describe:abbrev=<number>) tells describe to use
at least <number> digits of the oid to generate the human-readable
format of the commit-ish.

There are three things to test here:
  - Check that we can describe a commit that is not tagged (that is,
    for example our HEAD is at least one commit ahead of some reachable
    commit which is tagged) with at least <number> digits of the oid
    being used for describing it.

  - Check that when using such a commit-ish, we always use at least
    <number> digits of the oid to describe it.

  - Check that we can describe a tag. This just gives the name of the
    tag irrespective of abbrev (abbrev doesn't make sense here).

Do this, instead of the current test which only tests the last case.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 12:20:35 -07:00
2ee045eea1 commit -a -m: allow the top-level tree to become empty again
In 03267e8656 (commit: discard partial cache before (re-)reading it,
2022-11-08), a memory leak was plugged by discarding any partial index
before re-reading it.

The problem with this memory leak fix is that it was based on an
incomplete understanding of the logic introduced in 7168624c35 (Do not
generate full commit log message if it is not going to be used,
2007-11-28).

That logic was introduced to add a shortcut when committing without
editing the commit message interactively. A part of that logic was to
ensure that the index was read into memory:

	if (!active_nr && read_cache() < 0)
		die(...)

Translation to English: If the index has not yet been read, read it, and
if that fails, error out.

That logic was incorrect, though: It used `!active_nr` as an indicator
that the index was not yet read. Usually this is not a problem because
in the vast majority of instances, the index contains at least one
entry.

And it was natural to do it this way because at the time that condition
was introduced, the `index_state` structure had no explicit flag to
indicate that it was initialized: This flag was only introduced in
913e0e99b6 (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), but that commit did not adjust the code path
where no index file was found and a new, pristine index was initialized.

Now, when the index does not contain any entry (which is quite
common in Git's test suite because it starts quite a many repositories
from scratch), subsequent calls to `do_read_index()` will mistake the
index not to be initialized, and read it again unnecessarily.

This is a problem because after initializing the empty index e.g. the
`cache_tree` in that index could have been initialized before a
subsequent call to `do_read_index()` wants to ensure an initialized
index. And if that subsequent call mistakes the index not to have been
initialized, it would lead to leaked memory.

The correct fix for that memory leak is to adjust the condition so that
it does not mistake `active_nr == 0` to mean that the index has not yet
been read.

Using the `initialized` flag instead, we avoid that mistake, and as a
bonus we can fix a bug at the same time that was introduced by the
memory leak fix: When deleting all tracked files and then asking `git
commit -a -m ...` to commit the result, Git would internally update the
index, then discard and re-read the index undoing the update, and fail
to commit anything.

This fixes https://github.com/git-for-windows/git/issues/4462

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 12:20:04 -07:00
7667f4f0a3 split-index: accept that a base index can be empty
We are about to fix an ancient bug where `do_read_index()` pretended
that the index was not initialized when there are no index entries.

Before the `index_state` structure gained the `initialized` flag in
913e0e99b6 (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), that was the best we could do (even if it was
incorrect: it is totally possible to read a Git index file that contains
no index entries).

This pattern was repeated also in 998330ac2e (read-cache: look for
shared index files next to the index, too, 2021-08-26), which we fix
here by _not_ mistaking an empty base index for a missing
`sharedindex.*` file.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 12:20:04 -07:00
866b43e644 do_read_index(): always mark index as initialized unless erroring out
In 913e0e99b6 (unpack_trees(): protect the handcrafted in-core index
from read_cache(), 2008-08-23) a flag was introduced into the
`index_state` structure to indicate whether it had been initialized (or
more correctly: read and parsed).

There was one code path that was not handled, though: when the index
file does not yet exist (but the `must_exist` parameter is set to 0 to
indicate that that's okay). In this instance, Git wants to go forward
with a new, pristine Git index, almost as if the file had existed and
contained no index entries or extensions.

Since Git wants to handle this situation the same as if an "empty" Git
index file existed, let's set the `initialized` flag also in that case.

This is necessary to prepare for fixing the bug where the condition
`cache_nr == 0` is incorrectly used as an indicator that the index was
already read, and the condition `initialized != 0` needs to be used
instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 12:20:04 -07:00
d4f28279ad docs: add git hash-object -t option's possible values
The summary under the NAME section for git hash-object can mislead
readers to conclude that the command can only be used to create blobs,
whereas the description makes it clear that it can be used to create
objects, not just blobs. Let's clarify the one-line summary.

Further, the description for the option -t does not list out other types
that can be used when creating objects. Let's make this explicit by
listing out the different object types.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 23:00:10 -07:00
6e8e7981eb config: pass source to config_parser_event_fn_t
..so that the callback can use a "struct config_source" parameter
instead of "config_reader.source". "struct config_source" is internal to
config.c, so we are adding a pointer to a struct defined in config.c
into a public function signature defined in config.h, but this is okay
because this function has only ever been (and probably ever will be)
used internally by config.c.

As a result, the_reader isn't used anywhere, so "struct config_reader"
is obsolete (it was only intended to be used with the_reader). Remove
them.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:40 -07:00
908857a9f8 config: add kvi.path, use it to evaluate includes
Include directives are evaluated using the path of the config file. To
reduce the dependence on "config_reader.source", add a new
"key_value_info.path" member and use that instead of
"config_source.path". This allows us to remove a "struct config_reader
*" field from "struct config_include_data", which will subsequently
allow us to remove "struct config_reader" entirely.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:40 -07:00
f6c213a0cb config.c: remove config_reader from configsets
Remove the last usage of "struct config_reader" from configsets by
copying the "kvi" arg instead of recomputing "kvi" from
config_reader.source. Since we no longer need to pass both "struct
config_reader" and "struct config_set" in a single "void *cb", remove
"struct configset_add_data" too.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:40 -07:00
8868b1ebfb config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.

In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.

Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.

The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure
not to change the message.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:40 -07:00
dc90208497 trace2: plumb config kvi
There is a code path starting from trace2_def_param_fl() that eventually
calls current_config_scope(), and thus it needs to have "kvi" plumbed
through it. Additional plumbing is also needed to get "kvi" to
trace2_def_param_fl(), which gets called by two code paths:

- Through tr2_cfg_cb(), which is a config callback, so it trivially
  receives "kvi" via the "struct config_context ctx" parameter.

- Through tr2_list_env_vars_fl(), which is a high level function that
  lists environment variables for tracing. This has been secretly
  behaving like git_config_from_parameters() (in that it parses config
  from environment variables/the CLI), but does not set config source
  information.

  Teach tr2_list_env_vars_fl() to be well-behaved by using
  kvi_from_param(), which is used elsewhere for CLI/environment
  variable-based config.

As a result, current_config_scope() has no more callers, so remove it.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
26b669324b config.c: pass ctx with CLI config
Pass config_context when parsing CLI config. To provide the .kvi member,
refactor out kvi_from_param() from the logic that caches CLI config in
configsets. Now that config_context and config_context.kvi is always
present when config machinery calls config callbacks, plumb "kvi" so
that we can remove all calls of current_config_scope() except for
trace2/*.c (which will be handled in a later commit), and remove all
other current_config_*() (the functions themselves and their calls).
Note that this results in .kvi containing a different, more complete
set of information than the mocked up "struct config_source" in
git_config_from_parameters().

Plumbing "kvi" reveals a few places where we've been doing the wrong
thing:

* git_config_parse_parameter() hasn't been setting config source
  information, so plumb "kvi" there too.

* Several sites in builtin/config.c have been calling current_config_*()
  functions outside of config callbacks (indirectly, via the
  format_config() helper), which means they're reading state that isn't
  set correctly:

  * "git config --get-urlmatch --show-scope" iterates config to collect
    values, but then attempts to display the scope after config
    iteration, causing the "unknown" scope to be shown instead of the
    config file's scope. It's clear that this wasn't intended: we knew
    that "--get-urlmatch" couldn't show config source metadata, which is
    why "--show-origin" was marked incompatible with "--get-urlmatch"
    when it was introduced [1]. It was most likely a mistake that we
    allowed "--show-scope" to sneak through.

    Fix this by copying the "kvi" value in the collection phase so that
    it can be read back later. This means that we can now support "git
    config --get-urlmatch --show-origin", but that is left unchanged
    for now.

  * "git config --default" doesn't have config source metadata when
    displaying the default value, so "--show-scope" also results in
    "unknown", and "--show-origin" results in a BUG(). Fix this by
    treating the default value as if it came from the command line (e.g.
    like we do with "git -c" or "git config --file"), using
    kvi_from_param().

[1] https://lore.kernel.org/git/20160205112001.GA13397@sigill.intra.peff.net/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
809d868061 config: pass ctx with config files
Pass config_context to config_callbacks when parsing config files. To
provide the .kvi member, refactor out the configset logic that caches
"struct config_source" and "enum config_scope" as a "struct
key_value_info". Make the "enum config_scope" available to the config
file machinery by plumbing an additional arg through
git_config_from_file_with_options().

We do not exercise ctx yet because the remaining current_config_*()
callers may be used with config_with_options(), which may read config
from parameters, but parameters don't pass ctx yet.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
6021e1d158 config.c: pass ctx in configsets
Pass config_context to config callbacks in configset_iter(), trivially
setting the .kvi member to the cached key_value_info. Then, in config
callbacks that are only used with configsets, use the .kvi member to
replace calls to current_config_*(), and delete current_config_line()
because it has no remaining callers.

This leaves builtin/config.c and config.c as the only remaining users of
current_config_*().

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
a4e7e317f8 config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
e0f9a51c32 urlmatch.h: use config_fn_t type
These are actually used as config callbacks, so use the typedef-ed type
and make future refactors easier.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:38 -07:00
97eeeea2dc config: inline git_color_default_config
git_color_default_config() is a shorthand for calling two other config
callbacks. There are no other non-static functions that do this and it
will complicate our refactoring of config_fn_t so inline it instead.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:38 -07:00
a096a889f4 fix cherry-pick/revert status when doing multiple commits
The status report for an in-progress cherry-pick does not show the
current commit if the cherry-pick happens as part of a series of
multiple commits:

 $ git cherry-pick <commit1> <commit2>
 < one of the cherry-picks fails to merge clean >
 Cherry-pick currently in progress.
  (run "git cherry-pick --continue" to continue)
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

 $ git status
 On branch <branch>
 Your branch is ahead of '<upstream>' by 1 commit.
   (use "git push" to publish your local commits)

 Cherry-pick currently in progress.
   (run "git cherry-pick --continue" to continue)
   (use "git cherry-pick --skip" to skip this patch)
   (use "git cherry-pick --abort" to cancel the cherry-pick operation)

The show_cherry_pick_in_progress() function prints "Cherry-pick
currently in progress". That function does have a more verbose print
based on whether the cherry_pick_head_oid is null or not. If it is not
null, then a more helpful message including which commit is actually
being picked is displayed.

The introduction of the "Cherry-pick currently in progress" message
comes from 4a72486de9 ("fix cherry-pick/revert status after commit",
2019-04-17). This commit modified wt_status_get_state() in order to
detect that a cherry-pick was in progress even if the user has used `git
commit` in the middle of the sequence.

The check used to detect this is the call to sequencer_get_last_command.
If the sequencer indicates that the lass command was a REPLAY_PICK, then
the state->cherry_pick_in_progress is set to 1 and the
cherry_pick_head_oid is initialized to the null_oid. Similar behavior is
done for the case of REPLAY_REVERT.

It happens that this call of sequencer_get_last_command will always
report the action even if the user hasn't interrupted anything. Thus,
during a range of cherry-picks or reverts, the cherry_pick_head_oid and
revert_head_oid will always be overwritten and initialized to the null
oid.

This results in status always displaying the terse message which does
not include commit information.

Fix this by adding an additional check so that we do not re-initialize
the cherry_pick_head_oid or revert_head_oid if we have already set the
cherry_pick_in_progress or revert_in_progress bits. This ensures that
git status will display the more helpful information when its available.
Add a test case covering this behavior.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 15:48:55 -07:00
ed773a18c6 var: add config file locations
Much like with attributes files, sometimes programs would like to know
the location of configuration files at the global or system levels.
However, it isn't always clear where these may live, especially for the
system file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.

Since other parties cannot intuitively know how Git was compiled and
where it looks for these files, help them by providing variables that
can be queried.  Because we have multiple paths for global config
values, print them in order from highest to lowest priority, and be sure
to split on newlines so that "git var -l" produces two entries for the
global value.

However, be careful not to split all values on newlines, since our
editor values could well contain such characters, and we don't want to
split them in such a case.

Note in the documentation that some values may contain multiple paths
and that callers should be prepared for that fact.  This helps people
write code that will continue to work in the event we allow multiple
items elsewhere in the future.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
576a37fccb var: add attributes files locations
Currently, there are some programs which would like to read and parse
the gitattributes files at the global or system levels.  However, it's
not always obvious where these files live, especially for the system
file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.

It's not reasonable to expect all callers of Git to intuitively know
where the Git distributor or user has configured these locations to
be, so add some entries to allow us to determine their location.  Honor
the GIT_ATTR_NOSYSTEM environment variable if one is specified.  Expose
the accessor functions in a way that we can reuse them from within the
var code.

In order to make our paths consistent on Windows and also use the same
form as paths use in "git rev-parse", let's normalize the path before we
return it.  This results in Windows-style paths that use slashes, which
is convenient for making our tests function in a consistent way across
platforms.  Note that this requires that some of our values be freed, so
let's add a flag about whether the value needs to be freed and use it
accordingly.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
15780bb4f0 attr: expose and rename accessor functions
Right now, the functions which determine the current system and global
gitattributes files are not exposed.  We'd like to use them in a future
commit, but they're not ideally named.  Rename them to something more
suitable as a public interface, expose them, and document them.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
cdd489eaf9 var: adjust memory allocation for strings
Right now, all of our values are constants whose allocation is managed
elsewhere.  However, in the future, we'll have some variables whose
memory we will need to free.  To keep things consistent, let's make each
of our functions allocate its own memory and make the caller responsible
for freeing it.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
f74c90dcf7 var: format variable structure with C99 initializers
Right now, we have only two items in our variable struct.  However, in
the future, we're going to add two more items.  To help keep our diffs
nice and tidy and make this structure easier to read, switch to use
C99-style initializers for our data.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
1e65721227 var: add support for listing the shell
On most Unix systems, finding a suitable shell is easy: one simply uses
"sh" with an appropriate PATH value.  However, in many Windows
environments, the shell is shipped alongside Git, and it may or may not
be in PATH, even if Git is.

In such an environment, it can be very helpful to query Git for the
shell it's using, since other tools may want to use the same shell as
well.  To help them out, let's add a variable, GIT_SHELL_PATH, that
points to the location of the shell.

On Unix, we know our shell must be executable to be functional, so
assume that the distributor has correctly configured their environment,
and use that as a basic test.  On Git for Windows, we know that our
shell will be one of a few fixed values, all of which end in "sh" (such
as "bash").  This seems like it might be a nice test on Unix as well,
since it is customary for all shells to end in "sh", but there probably
exist such systems that don't have such a configuration, so be careful
here not to break them.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:05 -07:00
d6546af75c t: add a function to check executable bit
In line with our other helper functions for paths, let's add a function
to check whether a path is executable, and if not, print a suitable
error message.  Document this function, and note that it must only be
used under the POSIXPERM prerequisite, since it doesn't otherwise work
on Windows.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:05 -07:00
4db16f58c7 var: mark unused parameters in git_var callbacks
We abstract the set of variables into a table, with a "read" callback to
provide the value of each. Each callback takes a "flag" argument, but
most callbacks don't make use of it.

This flag is a bit odd. It may be set to IDENT_STRICT, which make sense
for ident-based callbacks, but is just confusing for things like
GIT_EDITOR.

At first glance, it seems like this is just a hack to let us directly
stick the generic git_committer_info() and git_author_info() functions
into our table. And we'd be better off to wrap them with local functions
which pass IDENT_STRICT, and have our callbacks take no option at all.

But that doesn't quite work. We pass IDENT_STRICT when the caller asks
for a specific variable, but otherwise do not (so that "git var -l" does
not bail if the committer ident cannot be formed).

So we really do need to pass in the flag to each invocation, even if the
individual callback doesn't care about it. Let's mark the unused ones so
that -Wunused-parameter does not complain. And while we're here, let's
rename them so that it's clear that the flag values we get will be from
the IDENT_* set. That may prevent confusion for future readers of the
code.

Another option would be to define our own local "strict" flag for the
callbacks, and then have wrappers that translate that to IDENT_STRICT
where it matters. But that would be more boilerplate for little gain
(most functions would still ignore the "strict" flag anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:05 -07:00
a53f43f900 config: don't BUG when both kvi and source are set
When iterating through config, we read config source metadata from
global values - either a "struct config_source + enum config_scope"
or a "struct key_value_info", using the current_config* functions. Prior
to the series starting from 0c60285147 (config.c: create config_reader
and the_reader, 2023-03-28), we weren't very picky about which values we
should read in which situation; we did note that both groups of values
generally shouldn't be set together, but if both were set,
current_config* preferentially reads key_value_info. When that series
added more structure, we enforced that either the former (when parsing a
config source) can be set, or the latter (when iterating a config set),
but *never* both at the same time. See 9828453ff0 (config.c: remove
current_config_kvi, 2023-03-28) and 5cdf18e7cd (config.c: remove
current_parsing_scope, 2023-03-28).

That was a good simplifying constraint that helped us reason about the
global state, but it turns out that there is at least one situation
where we need both to be set at the same time: in a blobless partial
clone where .gitmodules is missing. "git fetch" in such a repo will
start a config parse over .gitmodules (setting the config_source), and
Git will attempt to lazy-fetch it from the promisor remote. However,
when we try to read the promisor configuration, we start iterating a
config set (setting the key_value_info), and we BUG() out because that's
not allowed any more.

Teaching config_reader to gracefully handle this is somewhat
complicated, but fortunately, there are proposed changes to the config.c
machinery to get rid of this global state, and make the BUG() obsolete
[1]. We should rely on that as the eventual solution, and avoid doing
yet another refactor in the meantime.

Therefore, fix the bug by removing the BUG() check. We're reverting to
an older, less safe state, but that's generally okay since
key_value_info is always preferentially read, so we'd always read the
correct values when we iterate a config set in the middle of a config
parse (like we are here). The reverse would be wrong, but extremely
unlikely to happen since very few callers parse config without going
through a config set.

[1] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 12:07:47 -07:00
0a868031ed diff.c: mention completion above add_diff_options
Add a comment on top of add_diff_options, where common diff options are
listed, mentioning __git_diff_common_options in the completion script,
in the hope that contributors update it when they add new diff flags.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:14 -07:00
55245d669a completion: complete --remerge-diff
--remerge-diff only makes sense for 'git log' and 'git show', so add it
to __git_log_show_options which is referenced in the completion for
these two commands.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:14 -07:00
98aaeb2f77 completion: complete --diff-merges, its options and --no-diff-merges
The flags --[no-]diff-merges only make sense for 'git log' and 'git
show', so add a new variable __git_log_show_options for options only
relevant to these two commands, and add them there. Also add
__git_diff_merges_opts and list the accepted values for --diff-merges,
and use it in _git_log and _git_show.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:14 -07:00
d520d98382 completion: move --pickaxe-{all,regex} to __git_diff_common_options
The options --pickaxe-all and --pickaxe-regex are listed in
__git_diff_difftool_options and repeated in _git_log. Move them to
__git_diff_common_options instead, which makes them available
automatically in the completion of other commands referencing this
variable.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:13 -07:00
da260f6188 completion: complete --ws-error-highlight
Add --ws-error-highlight= to the list in __git_diff_common_options, and
add the accepted values in a new list __git_ws_error_highlight_opts.

Use __git_ws_error_highlight_opts in _git_diff, _git_log and _git_show
to offer the accepted values.

As noted in fd0bc17557 (completion: add diff --color-moved[-ws],
2020-02-21), there is no easy way to offer completion for several
comma-separated values, so this is limited to completing a single
value.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:13 -07:00
3b698111d5 completion: complete --unified
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:13 -07:00
b5c3edc17e completion: complete --output-indicator-{context,new,old}
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:13 -07:00
8fe2bd7905 completion: complete --output
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:13 -07:00
5727be06c6 completion: complete --no-stat
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:13 -07:00
ffcccc62b0 completion: complete --no-relative
Add --no-relative to __git_diff_common_options in the completion script,
and move --relative from __git_diff_difftool_options to
__git_diff_common_options since it applies to more than just diff and
difftool.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:12 -07:00
3a4d453246 completion: complete --line-prefix
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:12 -07:00
08a9a6d615 completion: complete --ita-invisible-in-index and --ita-visible-in-index
The options --ita-invisible-in-index and --ita-visible-in-index are
listed in diff-options.txt and so are included in the documentation of
commands which include this file (diff, diff-*, log, show, format-patch)
but they only make sense for diffs relating to the index. As such, add
them to '__git_diff_difftool_options' instead of
'__git_diff_common_options' since it makes more sense to add them there.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:12 -07:00
8885532abb completion: complete --irreversible-delete
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:12 -07:00
b454ed2c5d completion: complete --ignore-matching-lines
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:11 -07:00
57159d5daa completion: complete --function-context
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:11 -07:00
1eb638a5db completion: complete --find-renames
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:11 -07:00
04d430ff8d completion: complete --find-object
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:11 -07:00
2f677ad348 completion: complete --find-copies
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:11 -07:00
a4d13aaf22 completion: complete --default-prefix
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:11 -07:00
fd9058649a completion: complete --compact-summary
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:10 -07:00
de4898d7d8 completion: complete --combined-all-paths
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:10 -07:00
93ec6111a6 completion: complete --cc
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:10 -07:00
6a38b0a0dc completion: complete --break-rewrites
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:10 -07:00
f0b9e15378 completion: add comments describing __git_diff_* globals
Add descriptive comments for '__git_diff_common_options' and
'__git_diff_difftool_options', so that it is clearer when looking at
these variables to know in which command's completion they are used.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:40:10 -07:00
a9e066fa63 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 09:29:50 -07:00
e224f26892 Merge branch 'tb/collect-pack-filenames-fix'
Avoid breakage of "git pack-objects --cruft" due to inconsistency
between the way the code enumerates packfiles in the repository.

* tb/collect-pack-filenames-fix:
  builtin/repack.c: only collect fully-formed packs
2023-06-26 09:29:50 -07:00
8d5c5a05d7 Merge branch 'jk/commit-use-no-divider-with-interpret-trailers'
When "git commit --trailer=..." invokes the interpret-trailers
machinery, it knows what it feeds to interpret-trailers is a full
log message without any patch, but failed to express that by
passing the "--no-divider" option, which has been corrected.

* jk/commit-use-no-divider-with-interpret-trailers:
  commit: pass --no-divider to interpret-trailers
2023-06-26 09:29:49 -07:00
42612e18d2 apply: improve error messages when reading patch
Commit f1c0e3946e (apply: reject patches larger than ~1 GiB, 2022-10-25)
added a limit on the size of patch that apply will process to avoid
integer overflows. The implementation re-used the existing error message
for when we are unable to read the patch. This is unfortunate because (a) it
does not signal to the user that the patch is being rejected because it
is too large and (b) it uses error_errno() without setting errno.

This patch adds a specific error message for the case when a patch is
too large. It also updates the existing message to make it clearer that
it is the patch that cannot be read rather than any other file and marks
both messages for translation. The "git apply" prefix is also dropped to
match most of the rest of the error messages in apply.c (there are still
a few error messages that prefixed with "git apply" and are not marked
for translation after this patch). The test added in f1c0e3946e is
updated accordingly.

Reported-by: Premek Vysoky <Premek.Vysoky@microsoft.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 08:58:50 -07:00
25d59524bb t7701: make annotated tag unreachable
In 4dc16e2cb0 (gc: introduce `gc.recentObjectsHook`, 2023-06-07), we
added tests to ensure that prune-able (i.e. unreachable and with mtime
older than the cutoff) objects which are marked as recent via the new
`gc.recentObjectsHook` configuration are unpacked as loose with
`--unpack-unreachable`.

In that test, we also ensure that objects which are reachable from other
unreachable objects which were *not* pruned are kept as well, regardless
of their mtimes. For this, we use an annotated tag pointing at a blob
($obj2) which would otherwise be pruned.

But after pruning, that object is kept around for two reasons. One, the
tag object's mtime wasn't adjusted to be beyond the 1-hour cutoff, so it
would be kept as due to its recency regardless. The other reason is
because the tag itself is reachable.

Use mktag to write the tag object directly without pointing a reference
at it, and adjust the mtime of the tag object to be older than the
cutoff to ensure that our `gc.recentObjectsHook` configuration is
working as intended.

Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-24 15:50:41 -07:00
94486b6763 Merge branch 'maint'
* maint:
  http: handle both "h2" and "h2h3" in curl info lines
2023-06-24 15:05:06 -07:00
fb7d80edca Merge branch 'jk/redact-h2h3-headers-fix' into maint-2.41
* jk/redact-h2h3-headers-fix:
  http: handle both "h2" and "h2h3" in curl info lines
2023-06-24 15:04:48 -07:00
34d765e736 t6406: skip "external merge driver getting killed by a signal" test on Windows
The run_command() on the platform does not seem to report death by
signal as the caller expects.  For now, skip the test on Windows.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-23 16:34:40 -07:00
6ff334181c The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-23 11:21:28 -07:00
4ee088deb8 Merge branch 'js/defeat-ignore-submodules-config-with-explicit-addition'
Even when diff.ignoreSubmodules tells us to ignore submodule
changes, "git commit" with an index that already records changes to
submodules should include the submodule changes in the resulting
commit, but it did not.

* js/defeat-ignore-submodules-config-with-explicit-addition:
  diff-lib: honor override_submodule_config flag bit
2023-06-23 11:21:17 -07:00
4e4fc50cf7 Merge branch 'rj/leakfixes'
Leakfixes

* rj/leakfixes:
  tests: mark as passing with SANITIZE=leak
  config: fix a leak in git_config_copy_or_rename_section_in_file
  branch: fix a leak in cmd_branch
  branch: fix a leak in setup_tracking
  rev-parse: fix a leak with --abbrev-ref
  branch: fix a leak in setup_tracking
  branch: fix a leak in check_tracking_branch
  branch: fix a leak in inherit_tracking
  branch: fix a leak in dwim_and_setup_tracking
  remote: fix a leak in query_matches_negative_refspec
  config: fix a leak in git_config_copy_or_rename_section_in_file
2023-06-23 11:21:17 -07:00
1d15be363c Merge branch 'tb/open-midx-bitmap-fallback'
Gracefully deal with a stale MIDX file that lists a packfile that
no longer exists.

* tb/open-midx-bitmap-fallback:
  pack-bitmap.c: gracefully degrade on failure to load MIDX'd pack
2023-06-23 11:21:17 -07:00
58ecb2e383 Merge branch 'tb/gc-recent-object-hook'
"git pack-objects" learned to invoke a new hook program that
enumerates extra objects to be used as anchoring points to keep
otherwise unreachable objects in cruft packs.

* tb/gc-recent-object-hook:
  gc: introduce `gc.recentObjectsHook`
  reachable.c: extract `obj_is_recent()`
2023-06-23 11:21:17 -07:00
891e631401 Merge branch 'tz/lib-gpg-prereq-fix'
Test update.

* tz/lib-gpg-prereq-fix:
  t/lib-gpg: require GPGSSH for GPGSSH_VERIFYTIME prereq
2023-06-23 11:21:17 -07:00
a813d9e239 Merge branch 'sl/worktree-sparse'
"git worktree" learned to work better with sparse index feature.

* sl/worktree-sparse:
  worktree: integrate with sparse-index
2023-06-23 11:21:16 -07:00
dcedba13b3 Merge branch 'rs/run-command-exec-error-on-noent'
Simplify error message when run-command fails to start a command.

* rs/run-command-exec-error-on-noent:
  run-command: report exec error even on ENOENT
  t1800: loosen matching of error message for bad shebang
2023-06-23 11:21:16 -07:00
5ee8fcdabc Merge branch 'mh/credential-erase-improvements'
* mh/credential-erase-improvements:
  credential: erase all matching credentials
  credential: avoid erasing distinct password
2023-06-23 11:21:16 -07:00
01202f5f68 Merge branch 'gc/discover-not-setup'
discover_git_directory() no longer touches the_repository.

* gc/discover-not-setup:
  setup.c: don't setup in discover_git_directory()
2023-06-23 11:21:16 -07:00
2b7b788fb3 ll-merge: killing the external merge driver aborts the merge
When an external merge driver dies with a signal, we should not
expect that the result left on the filesystem is in any useful
state.  However, because the current code uses the return value from
run_command() and declares any positive value as a sign that the
driver successfully left conflicts in the result, and because the
return value from run_command() for a subprocess that died upon a
signal is positive, we end up treating whatever garbage left on the
filesystem as the result the merge driver wanted to leave us.

run_command() returns larger than 128 (WTERMSIG(status) + 128, to be
exact) when it notices that the subprocess died with a signal, so
detect such a case and return LL_MERGE_ERROR from ll_ext_merge().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
2023-06-23 09:27:10 -07:00
0bfa463d37 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-22 16:29:07 -07:00
5fd4e2f6d1 Merge branch 'jt/doc-use-octal-with-printf'
Suggest to refrain from using hex literals that are non-portable
when writing printf(1) format strings.

* jt/doc-use-octal-with-printf:
  CodingGuidelines: use octal escapes, not hex
2023-06-22 16:29:07 -07:00
e0e8a2dfa0 Merge branch 'rs/doc-ls-tree-hex-literal'
Doc update.

* rs/doc-ls-tree-hex-literal:
  ls-tree: fix documentation of %x format placeholder
2023-06-22 16:29:07 -07:00
ad6d37ea7e Merge branch 'la/docs-typofixes'
Typofixes.

* la/docs-typofixes:
  docs: typofixes
2023-06-22 16:29:06 -07:00
1bff6a97fe Merge branch 'as/dtype-compilation-fix'
Compilation fix for platforms without D_TYPE in struct dirent.

* as/dtype-compilation-fix:
  statinfo.h: move DTYPE defines from dir.h
2023-06-22 16:29:06 -07:00
644591bd06 Merge branch 'ds/add-i-color-configuration-fix'
The reimplemented "git add -i" did not honor color.ui configuration.

* ds/add-i-color-configuration-fix:
  add: test use of brackets when color is disabled
  add: check color.ui for interactive add
2023-06-22 16:29:06 -07:00
a9ea4c23dc Merge branch 'ps/cat-file-null-output'
"git cat-file --batch" and friends learned "-Z" that uses NUL
delimiter for both input and output.

* ps/cat-file-null-output:
  cat-file: add option '-Z' that delimits input and output with NUL
  cat-file: simplify reading from standard input
  strbuf: provide CRLF-aware helper to read until a specified delimiter
  t1006: modernize test style to use `test_cmp`
  t1006: don't strip timestamps from expected results
2023-06-22 16:29:06 -07:00
d9f9f6b358 Merge branch 'ds/disable-replace-refs'
Introduce a mechanism to disable replace refs globally and per
repository.

* ds/disable-replace-refs:
  repository: create read_replace_refs setting
  replace-objects: create wrapper around setting
  repository: create disable_replace_refs()
2023-06-22 16:29:06 -07:00
f2ffc74186 Merge branch 'tb/pack-bitmap-traversal-with-boundary'
The object traversal using reachability bitmap done by
"pack-object" has been tweaked to take advantage of the fact that
using "boundary" commits as representative of all the uninteresting
ones can save quite a lot of object enumeration.

* tb/pack-bitmap-traversal-with-boundary:
  pack-bitmap.c: use commit boundary during bitmap traversal
  pack-bitmap.c: extract `fill_in_bitmap()`
  object: add object_array initializer helper function
2023-06-22 16:29:05 -07:00
4dd0469328 Merge branch 'ja/worktree-orphan'
'git worktree add' learned how to create a worktree based on an
orphaned branch with `--orphan`.

* ja/worktree-orphan:
  worktree add: emit warn when there is a bad HEAD
  worktree add: extend DWIM to infer --orphan
  worktree add: introduce "try --orphan" hint
  worktree add: add --orphan flag
  t2400: add tests to verify --quiet
  t2400: refactor "worktree add" opt exclusion tests
  t2400: cleanup created worktree in test
  worktree add: include -B in usage docs
2023-06-22 16:29:05 -07:00
68d686460f fsmonitor-ll.h: split this header out of fsmonitor.h
This creates a new fsmonitor-ll.h with most of the functions from
fsmonitor.h, though it leaves three inline functions where they were.
Two-thirds of the files that previously included fsmonitor.h did not
need those three inline functions or the six extra includes those inline
functions required, so this allows them to only include the lower level
header.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
b9a7ac2c68 hash-ll, hashmap: move oidhash() to hash-ll
oidhash() was used by both hashmap and khash, which makes sense.
However, the location of this function in hashmap.[ch] meant that
khash.h had to depend upon hashmap.h, making people unfamiliar with
khash think that it was built upon hashmap.  (Or at least, I personally
was confused for a while about this in the past.)

Move this function to hash-ll, so that khash.h can stop depending upon
hashmap.h.

This has another benefit as well: it allows us to remove hashmap.h's
dependency on hash-ll.h.  While some callers of hashmap.h were making
use of oidhash, most were not, so this change provides another way to
reduce the number of includes.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
8043418b77 khash: name the structs that khash declares
khash.h lets you instantiate custom hash types that map between two
types. These are defined as a struct, as you might expect, and khash
typedef's that to kh_foo_t. But it declares the struct anonymously,
which doesn't give a name to the struct type itself; there is no
"struct kh_foo". This has two small downsides:

  - when using khash, we declare "kh_foo_t *the_foo".  This is
    unlike our usual naming style, which is "struct kh_foo *the_foo".

  - you can't forward-declare a typedef of an unnamed struct type in
    C. So we might do something like this in a header file:

        struct kh_foo;
        struct bar {
                struct kh_foo *the_foo;
        };

    to avoid having to include the header that defines the real
    kh_foo. But that doesn't work with the typedef'd name. Without the
    "struct" keyword, the compiler doesn't know we mean that kh_foo is
    a type.

So let's always give khash structs the name that matches our
conventions ("struct kh_foo" to match "kh_foo_t"). We'll keep doing
the typedef to retain compatibility with existing callers.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
6723899932 merge-ll: rename from ll-merge
A long term (but rather minor) pet-peeve of mine was the name
ll-merge.[ch].  I thought it made it harder to realize what stuff was
related to merging when I was working on the merge machinery and trying
to improve it.

Further, back in d1cbe1e6d8 ("hash-ll.h: split out of hash.h to remove
dependency on repository.h", 2023-04-22), we have split the portions of
hash.h that do not depend upon repository.h into a "hash-ll.h" (due to
the recommendation to use "ll" for "low-level" in its name[1], but which
I used as a suffix precisely because of my distaste for "ll-merge").
When we discussed adding additional "*-ll.h" files, a request was made
that we use "ll" consistently as either a prefix or a suffix.  Since it
is already in use as both a prefix and a suffix, the only way to do so
is to rename some files.

Besides my distaste for the ll-merge.[ch] name, let me also note that
the files
  ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h
would have essentially nothing to do with each other and make no sense
to group.  But giving them the common "ll-" prefix would group them.  Using
"-ll" as a suffix thus seems just much more logical to me.  Rename
ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to
ensure we get a more logical grouping of files.

[1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
dd77d58795 git-compat-util.h: remove unneccessary include of wildmatch.h
The include of wildmatch.h in git-compat-util.h was added in cebcab189a
(Makefile: add USE_WILDMATCH to use wildmatch as fnmatch, 2013-01-01) as
a way to be able to compile-time force any calls to fnmatch() to instead
invoke wildmatch().  The defines and inline function were removed in
70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15),
and this include in git-compat-util.h has been unnecessary ever since.

Remove the include from git-compat-util.h, but add it to the .c files
that had omitted the direct #include they needed.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
88e4e18325 builtin.h: remove unneccessary includes
This also made it clear that a few .c files under builtin/ were
depending upon some headers but had forgotten to #include them.  Add the
missing direct includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
768122900f list-objects-filter-options.h: remove unneccessary include
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
df6e874496 diff.h: remove unnecessary include of oidset.h
This also made it clear that several .c files depended upon various
things that oidset included, but had omitted the direct #include for
those headers.  Add those now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
c339932bd8 repository: remove unnecessary include of path.h
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
0fd2e21571 log-tree: replace include of revision.h with simple forward declaration
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
bc5c5ec044 cache.h: remove this no-longer-used header
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.

Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first).  This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
08c46a499a read-cache*.h: move declarations for read-cache.c functions from cache.h
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h.  Also move some related inline
functions from cache.h to read-cache.h.  The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
bc47f16db2 repository.h: move declaration of the_index from cache.h
the_index is a global variable defined in repository.c; as such, its
declaration feels better suited living in repository.h rather than
cache.h.  Move it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
750324ddb8 merge.h: move declarations for merge.c from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
eaa966db79 diff.h: move declaration for global in diff.c from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
fbffdfb11c preload-index.h: move declarations for preload-index.c from elsewhere
We already have a preload-index.c file; move the declarations for the
functions in that file into a new preload-index.h.  These were
previously split between cache.h and repository.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
baf889c2cd sparse-index.h: move declarations for sparse-index.c from cache.h
Note in particular that this reverses the decision made in 118a2e8bde
("cache: move ensure_full_index() to cache.h", 2021-04-01).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
f5653856c2 name-hash.h: move declarations for name-hash.c from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
64c8559575 run-command.h: move declarations for run-command.c from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
90cbae9ce5 statinfo: move stat_{data,validity} functions from cache/read-cache
These functions do not depend upon struct cache_entry or struct
index_state in any way, and it seems more logical to break them out into
this file, especially since statinfo.h already has the struct stat_data
declaration.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
6cee5ebc7a read-cache: move shared add/checkout/commit code
The function add_files_to_cache(), plus associated helper functions,
were defined in builtin/add.c, but also shared with builtin/checkout.c
and builtin/commit.c.  Move these shared functions to read-cache.c.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
50c37ee839 add: modify add_files_to_cache() to avoid globals
The function add_files_to_cache() is used by all three of builtin/{add,
checkout, commit}.c.  That suggests this is common library code, and
should be moved somewhere else, like read-cache.c.  However, the
function and its helpers made use of two global variables that made
straight code movement difficult:
  * the_index
  * include_sparse
The latter was perhaps more problematic since it was only accessible in
builtin/add.c but was still affecting builtin/checkout.c and
builtin/commit.c without this fact being very clear from the code.  I'm
not sure if the other two callers would want to add a `--sparse` flag
similar to add.c to get non-default behavior, but exposing this
dependence will help if we ever decide we do want to add such a flag.

Modify add_files_to_cache() and its helpers to accept the necessary
arguments instead of relying on globals.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
1a40e7be6c read-cache: move shared commit and ls-files code
The function overlay_tree_on_index(), plus associated helper functions,
were defined in builtin/ls-files.c, but also shared with
builtin/commit.c.  Move these shared functions to read-cache.c.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
e8cf8ef507 setup: adopt shared init-db & clone code
The functions init_db() and initialize_repository_version() were shared
by builtin/init-db.c and builtin/clone.c, and declared in cache.h.

Move these functions, plus their several helpers only used by these
functions, to setup.[ch].

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
fc81735057 init-db, clone: change unnecessary global into passed parameter
Much like the parent commit, this commit was prompted by a desire to
move the functions which builtin/init-db.c and builtin/clone.c share out
of the former file and into setup.c.  A secondary issue that made it
difficult was the init_shared_repository global variable; replace it
with a simple parameter that is passed to the relevant functions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 11:14:34 -07:00
c2f76965d0 init-db: remove unnecessary global variable
This commit was prompted by a desire to move the functions which
builtin/init-db.c and builtin/clone.c share out of the former file and
into setup.c.  One issue that made it difficult was the
init_is_bare_repository global variable.

init_is_bare_repository's sole use in life it to cache a value in
init_db(), and then be used in create_default_files().  This is a bit
odd since init_db() directly calls create_default_files(), and is the
only caller of that function.  Convert the global to a simple function
parameter instead.

(Of course, this doesn't fix the fact that this value is then ignored by
create_default_files(), as noted in a big TODO comment in that function,
but it at least includes no behavioral change other than getting rid of
a very questionable global variable.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 11:14:34 -07:00
0f7443bdc7 init-db: document existing bug with core.bare in template config
The comments in create_default_files() talks about reading config from
the config file in the specified `--templates` directory, which leads to
the question of whether core.bare could be set in such a config file and
thus whether the code is doing the right thing.  It turns out, that it
doesn't; it unconditionally ignores core.bare in the config file in any
--templates directory.  It is not clear to me that fixing it can be done
within this function; it seems to occur too late:
  * create_default_files() is called by init_db()
  * init_db() is called by both builtin/{clone.c,init-db.c}
  * both callers of init_db() call set_git_work_tree() before init_db()
and in order to actual affect whether a repository is bear, we'd need to
somewhere reset these values, not just the is_bare_repository_cfg
setting.

I do not want to open this can of worms at this time; I'm trying to
clean up some headers, for which I need to move some functions, for
which I need to clean up some globals, and that's far enough down the
rabbit hole.  So, simply document the issue with a careful TODO comment
and a few testcases.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 11:14:33 -07:00
3d6a316464 notes: introduce "--no-separator" option
Sometimes, the user may want to add or append multiple notes
without any separator to be added between them.

Disscussion:

  https://public-inbox.org/git/3f86a553-246a-4626-b1bd-bacd8148318a@app.fastmail.com/

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:01 -07:00
c4e2aa7d45 notes.c: introduce "--[no-]stripspace" option
This commit introduces a new option "--[no-]stripspace" to git notes
append, git notes edit, and git notes add. This option allows users to
control whether the note message need to stripped out.

For the consideration of backward compatibility, let's look at the
behavior about "stripspace" in "git notes" command:

1. "Edit Message" case: using the default editor to edit the note
message.

    In "edit" case, the edited message will always be stripped out, the
    implementation which can be found in the "prepare_note_data()". In
    addition, the "-c" option supports to reuse an existing blob as a
    note message, then open the editor to make a further edition on it,
    the edited message will be stripped.

    This commit doesn't change the default behavior of "edit" case by
    using an enum "notes_stripspace", only when "--no-stripspace" option
    is specified, the note message will not be stripped out. If you do
    not specify the option or you specify "--stripspace", clearly, the
    note message will be stripped out.

2. "Assign Message" case: using the "-m"/"-F"/"-C" option to specify the
note message.

    In "assign" case, when specify message by "-m" or "-F", the message
    will be stripped out by default, but when specify message by "-C",
    the message will be copied verbatim, in other word, the message will
    not be stripped out. One more thing need to note is "the order of
    the options matter", that is, if you specify "-C" before "-m" or
    "-F", the reused message by "-C" will be stripped out together,
    because everytime concat "-m" or "-F" message, the concated message
    will be stripped together. Oppositely, if you specify "-m" or "-F"
    before "-C", the reused message by "-C" will not be stripped out.

    This commit doesn't change the default behavior of "assign" case by
    extending the "stripspace" field in "struct note_msg", so we can
    distinguish the different behavior of "-m"/"-F" and "-C" options
    when we need to parse and concat the message.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:01 -07:00
b7d87ad537 notes.c: append separator instead of insert by pos
Rename "insert_separator" to "append_separator" and also remove the
"postion" argument, this serves two purpose:

The first is that when specifying more than one "-m" ( like "-F", etc)
to "git notes add" or "git notes append", the order of them matters,
which means we need to append the each separator and message in turn,
so we don't have to make the caller specify the position, the "append"
operation is enough and clear.

The second is that when we execute the "git notes append" subcommand,
we need to combine the "prev_note" and "current_note" to get the
final result. Before, we inserted a newline character at the beginning
of "current_note". Now, we will append a newline to the end of
"prev_note" instead, this will give the consisitent results.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
90bc19b3ae notes.c: introduce '--separator=<paragraph-break>' option
When adding new notes or appending to an existing notes, we will
insert a blank line between the paragraphs, like:

     $ git notes add -m foo -m bar
     $ git notes show HEAD
     foo

     bar

The default behavour sometimes is not enough, the user may want
to use a custom delimiter between paragraphs, like when
specifying '-m', '-F', '-C', '-c' options. So this commit
introduce a new '--separator' option for 'git notes add' and
'git notes append', for example when executing:

    $ git notes add -m foo -m bar --separator="-"
    $ git notes show HEAD
    foo
    -
    bar

a newline is added to the value given to --separator if it
does not end with one already. So when executing:

      $ git notes add -m foo -m bar --separator="-"
and
      $ export LF="
      "
      $ git notes add -m foo -m bar --separator="-$LF"

Both the two exections produce the same result.

The reason we use a "strbuf" array to concat but not "string_list", is
that the binary file content may contain '\0' in the middle, this will
cause the corrupt result if using a string to save.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
59587049e2 t3321: add test cases about the notes stripspace behavior
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
3d27ae0712 notes.c: use designated initializers for clarity
The "struct note_data d = { 0, 0, NULL, STRBUF_INIT };" style could be
replaced with designated initializer for clarity.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
ef48fcc432 notes.c: cleanup 'strbuf_grow' call in 'append_edit'
Let's cleanup the unnecessary 'strbuf_grow' call in 'append_edit'. This
"strbuf_grow(&d.buf, size + 1);" is prepared for insert a blank line if
needed, but actually when inserting, "strbuf_insertstr(&d.buf, 0,
"\n");" will do the "grow" for us.

348f199b (builtin-notes: Refactor handling of -F option to allow
combining -m and -F, 2010-02-13) added these to mimic the code
introduced by 2347fae5 (builtin-notes: Add "append" subcommand for
appending to note objects, 2010-02-13) that reads in previous note
before the message.  And the resulting code with explicit sizing is
carried to this day.

In the context of reading an existing note in, exact sizing may have
made sense, but because the resulting note needs cleansing with
stripspace() when appending with this option, such an exact sizing
does not buy us all that much in practice.

It may help avoiding overallocation due to ALLOC_GROW() slop, but
nobody can feed so many long messages for it to matter from the
command line.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
6640c2d06d The second batch for 2.42
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-20 15:53:13 -07:00
917d4c2569 Merge branch 'la/doc-interpret-trailers'
Doc update.

* la/doc-interpret-trailers:
  doc: trailer: add more examples in DESCRIPTION
  doc: trailer: mention 'key' in DESCRIPTION
  doc: trailer.<token>.command: emphasize deprecation
  doc: trailer: use angle brackets for <token> and <value>
  doc: trailer: remove redundant phrasing
  doc: trailer: examples: avoid the word "message" by itself
  doc: trailer: drop "commit message part" phrasing
  doc: trailer: swap verb order
  doc: trailer: fix grammar
2023-06-20 15:53:13 -07:00
de00f4b7f3 Merge branch 'jk/log-follow-with-non-literal-pathspec'
"git [-c log.follow=true] log [--follow] ':(glob)f**'" used to barf.

* jk/log-follow-with-non-literal-pathspec:
  diff: detect pathspec magic not supported by --follow
  diff: factor out --follow pathspec check
  pathspec: factor out magic-to-name function
2023-06-20 15:53:13 -07:00
7cb4274d26 Merge branch 'vd/worktree-config-is-per-repository'
The value of config.worktree is per-repository, but has been kept
in a singleton global variable per process. This has been OK as
most Git operations interacted with a single repository at a time,
but not right for operations like recursive "grep" that want to
access multiple repositories from a single process without forking.

The global variable has been eliminated and made into a member in
the per-repository data structure.

* vd/worktree-config-is-per-repository:
  repository: move 'repository_format_worktree_config' to repo scope
  config: pass 'repo' directly to 'config_with_options()'
  config: use gitdir to get worktree config
2023-06-20 15:53:13 -07:00
9cd234e646 Merge branch 'tb/submodule-null-deref-fix'
"git submodule" code trusted the data coming from the config (and
the in-tree .gitmodules file) too much without validating, leading
to NULL dereference if the user mucks with a repository (e.g.
submodule.<name>.url is removed).  This has been corrected.

* tb/submodule-null-deref-fix:
  builtin/submodule--helper.c: handle missing submodule URLs
2023-06-20 15:53:13 -07:00
098a191a97 Merge branch 'jc/test-modernization-2'
Test style updates.

* jc/test-modernization-2:
  t9400-git-cvsserver-server: modernize test format
  t9200-git-cvsexportcommit: modernize test format
  t9104-git-svn-follow-parent: modernize test format
  t9100-git-svn-basic: modernize test format
  t7700-repack: modernize test format
  t7600-merge: modernize test format
  t7508-status: modernize test format
  t7201-co: modernize test format
  t7111-reset-table: modernize test format
  t7110-reset-merge: modernize test format
2023-06-20 15:53:12 -07:00
208a28ec08 Merge branch 'jc/test-modernization'
* jc/test-modernization:
  t7101-reset-empty-subdirs: modernize test format
  t6050-replace: modernize test format
  t5306-pack-nobase: modernize test format
  t5303-pack-corruption-resilience: modernize test format
  t5301-sliding-window: modernize test format
  t5300-pack-object: modernize test format
  t4206-log-follow-harder-copies: modernize test format
  t4202-log: modernize test format
  t4004-diff-rename-symlink: modernize test format
  t4003-diff-rename-1: modernize test format
  t4002-diff-basic: modernize test format
  t3903-stash: modernize test format
  t3700-add: modernize test format
  t3500-cherry: modernize test format
  t1006-cat-file: modernize test format
  t1002-read-tree-m-u-2way: modernize test format
  t1001-read-tree-m-2way: modernize test format
  t3210-pack-refs: modernize test format
  t0030-stripspace: modernize test format
  t0000-basic: modernize test format
2023-06-20 15:53:12 -07:00
6069c1a5a7 Merge branch 'kh/use-default-notes-doc'
Doc update.

* kh/use-default-notes-doc:
  notes: move the documentation to the struct
  notes: update documentation for `use_default_notes`
2023-06-20 15:53:12 -07:00
0899beb63c Merge branch 'pb/complete-and-document-auto-merge-and-friends'
Document more pseudo-refs and teach the command line completion
machinery to complete AUTO_MERGE.

* pb/complete-and-document-auto-merge-and-friends:
  completion: complete AUTO_MERGE
  Documentation: document AUTO_MERGE
  git-merge.txt: modernize word choice in "True merge" section
  completion: complete REVERT_HEAD and BISECT_HEAD
  revisions.txt: document more special refs
  revisions.txt: use description list for special refs
2023-06-20 15:53:12 -07:00
693bde461c Merge branch 'mh/commit-reach-get-reachable-plug-leak'
Plug memory leak.

* mh/commit-reach-get-reachable-plug-leak:
  commit-reach: fix memory leak in get_reachable_subset()
2023-06-20 15:53:11 -07:00
7f9b5ff41e Merge branch 'tz/test-fix-pthreads-prereq'
Test fix.

* tz/test-fix-pthreads-prereq:
  trace2 tests: fix PTHREADS prereq
2023-06-20 15:53:11 -07:00
40693ae926 Merge branch 'tz/test-ssh-verifytime-fix'
Test fix.

* tz/test-ssh-verifytime-fix:
  t/lib-gpg: fix ssh-keygen -Y check-novalidate with openssh-9.0
2023-06-20 15:53:11 -07:00
056d16406d Merge branch 'jk/ci-use-clang-for-sanitizer-jobs'
Clang's sanitizer implementation seems to work better than GCC's.

* jk/ci-use-clang-for-sanitizer-jobs:
  ci: drop linux-clang job
  ci: run ASan/UBSan in a single job
  ci: use clang for ASan/UBSan checks
2023-06-20 15:53:11 -07:00
ae19633021 Merge branch 'tl/quote-problematic-arg-for-clarity'
Error message fix.

* tl/quote-problematic-arg-for-clarity:
  surround %s with quotes when failed to lookup commit
2023-06-20 15:53:10 -07:00
06cff0c8d4 Merge branch 'ps/fetch-cleanups'
Code clean-up.

* ps/fetch-cleanups:
  fetch: use `fetch_config` to store "submodule.fetchJobs" value
  fetch: use `fetch_config` to store "fetch.parallel" value
  fetch: use `fetch_config` to store "fetch.recurseSubmodules" value
  fetch: use `fetch_config` to store "fetch.showForcedUpdates" value
  fetch: use `fetch_config` to store "fetch.pruneTags" value
  fetch: use `fetch_config` to store "fetch.prune" value
  fetch: pass through `fetch_config` directly
  fetch: drop unneeded NULL-check for `remote_ref`
  fetch: drop unused DISPLAY_FORMAT_UNKNOWN enum value
2023-06-20 15:53:10 -07:00
0dd1324a73 packfile: delete .idx files before .pack files
When installing a packfile, we place the .pack file before the .idx
file. The intention is that Git scans for .idx files in the pack
directory and then loads the .pack files from that list.

However, when we delete packfiles, we do not do this in the reverse
order as we should. The unlink_pack_path() method deletes the .pack
followed by the .idx.

This creates a window where the process could be interrupted between
the .pack deletion and the .idx deletion, leaving the repository in a
state that looks strange, but isn't actually too problematic if we
assume the pack was safe to delete. The .idx without a .pack will cause
some overhead, but will not interrupt other Git processes.

This ordering was introduced into the 'git repack' builtin by
a1bbc6c017 (repack: rewrite the shell script in C, 2013-09-15), though
we must be careful to track history through the code move in 8434e85d5f
(repack: refactor pack deletion for future use, 2019-06-10) to see that.

This became more important after 73320e49ad (builtin/repack.c: only
collect fully-formed packs, 2023-06-07) changed how 'git repack' scanned
for packfiles for use in the cruft pack process. It previously looked
for .pack files, but that was problematic due to the order that packs
are installed: repacks between the creation of a .pack and the creation
of its .idx would result in hard failures.

There is an independent proposal about what to do in the case of a .idx
without a .pack during this 'git repack' scenario, but this change is
focused on deleting .pack files more safely.

Modify the order to delete the .idx before the .pack. The rest of the
modifiers on the .pack should still come after the .pack so we know all
of the presumed properties of the packfile as long as it exists in the
filesystem, in case we wish to reinstate it by re-indexing the .pack
file.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-20 13:38:41 -07:00
4416b86c6b strbuf: simplify strbuf_expand_literal_cb()
Now that strbuf_expand_literal_cb() is no longer used as a callback,
drop its "_cb" name suffix and unused context parameter.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:30 -07:00
6f1e2d5279 replace strbuf_expand() with strbuf_expand_step()
Avoid the overhead of passing context to a callback function of
strbuf_expand() by using strbuf_expand_step() in a loop instead.  It
requires explicit handling of %% and unrecognized placeholders, but is
simpler, more direct and avoids void pointers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:30 -07:00
39dbd49b41 replace strbuf_expand_dict_cb() with strbuf_expand_step()
Avoid the overhead of setting up a dictionary and passing it via
strbuf_expand() to strbuf_expand_dict_cb() by using strbuf_expand_step()
in a loop instead.  It requires explicit handling of %% and unrecognized
placeholders, but is more direct and simpler overall, and expands only
on demand.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:30 -07:00
44ccb337f1 strbuf: factor out strbuf_expand_step()
Extract the part of strbuf_expand that finds the next placeholder into a
new function.  It allows to build parsers without callback functions and
the overhead imposed by them.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:30 -07:00
3c3d0c4242 pretty: factor out expand_separator()
Deduplicate the code for setting the options "separator" and
"key_value_separator" by moving it into a new helper function,
expand_separator().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:29 -07:00
db30130165 http: handle both "h2" and "h2h3" in curl info lines
When redacting auth tokens in trace output from curl, we look for http/2
headers of the form "h2h3 [header: value]". This comes from b637a41ebe
(http: redact curl h2h3 headers in info, 2022-11-11).

But the "h2h3" prefix changed to just "h2" in curl's fc2f1e547 (http2:
support HTTP/2 to forward proxies, non-tunneling, 2023-04-14). That's in
released version curl 8.1.0; linking against that version means we'll
fail to correctly redact the trace. Our t5559.17 notices and fails.

We can fix this by matching either prefix, which should handle both old
and new versions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:08:31 -07:00
80d32e84b5 tests: mark as passing with SANITIZE=leak
The tests listed below, since previous commits, no longer trigger any
leak.

   + t1507-rev-parse-upstream.sh
   + t1508-at-combinations.sh
   + t1514-rev-parse-push.sh
   + t2027-checkout-track.sh
   + t3200-branch.sh
   + t3204-branch-name-interpretation.sh
   + t5404-tracking-branches.sh
   + t5517-push-mirror.sh
   + t5525-fetch-tagopt.sh
   + t6040-tracking-info.sh
   + t7508-status.sh

Let's mark them with "TEST_PASSES_SANITIZE_LEAK=true" to notice and fix
promptly any new leak that may be introduced and triggered by them in
the future.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:02:48 -07:00
5e786ed3ee config: fix a leak in git_config_copy_or_rename_section_in_file
A branch can have its configuration spread over several configuration
sections.  This situation was already foreseen in 52d59cc645 (branch:
add a --copy (-c) option to go with --move (-m), 2017-06-18) when
"branch -c" was introduced.

Unfortunately, a leak was also introduced:

   $ git branch foo
   $ cat >> .git/config <<EOF
   [branch "foo"]
   	some-key-a = a value
   [branch "foo"]
   	some-key-b = b value
   [branch "foo"]
   	some-key-c = c value
   EOF
   $ git branch -c foo bar

   Direct leak of 130 byte(s) in 2 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_vaddf strbuf.c
       ... in strbuf_addf strbuf.c
       ... in store_create_section config.c
       ... in git_config_copy_or_rename_section_in_file config.c
       ... in git_config_copy_section_in_file config.c
       ... in git_config_copy_section config.c
       ... in copy_or_rename_branch builtin/branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Let's fix it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:02:48 -07:00
2935a97836 branch: fix a leak in cmd_branch
In 98e7ab6d42 (for-each-ref: delay parsing of --sort=<atom> options,
2021-10-20) a new string_list was introduced to accumulate any
"branch.sort" setting.

That string_list is cleared in ref_sorting_options(), which is only
called when processing the "--list" sub-command.  Therefore, with other
sub-command, while having any sort option set, a leak is produced, e.g.:

   $ git config branch.sort invalid_sort_option
   $ git branch --edit-description

   Direct leak of 384 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in string_list_append_nodup string-list.c
       ... in string_list_append string-list.c
       ... in git_branch_config builtin/branch.c
       ... in configset_iter config.c
       ... in repo_config config.c
       ... in git_config config.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

   Indirect leak of 20 byte(s) in 1 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in string_list_append string-list.c
       ... in git_branch_config builtin/branch.c
       ... in configset_iter config.c
       ... in repo_config config.c
       ... in git_config config.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

We don't have a common clean-up section in cmd_branch().  To avoid
refactoring, keep the fix simple, and while we find a better solution
which hopefuly will avoid entirely that string_list, when no sort
options are needed; let's squelch the leak sanitizer using UNLEAK().

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:02:48 -07:00
5ace483a15 branch: fix a leak in setup_tracking
In bdaf1dfae7 (branch: new autosetupmerge option "simple" for matching
branches, 2022-04-29) a new exit for setup_tracking() missed the
clean-up, producing a leak.

   $ git config branch.autoSetupMerge simple
   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch bar local/foo

   Direct leak of 384 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in string_list_append_nodup string-list.c
       ... in find_tracked_branch branch.c
       ... in for_each_remote remote.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtinbranch.c
       ... in run_builtin git.c

   Indirect leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in find_tracked_branch branch.c
       ... in for_each_remote remote.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtinbranch.c
       ... in run_builtin git.c

The return introduced in bdaf1dfae7 was to avoid setting up the
tracking, but even in that case it is still necessary to do the
clean-up.  Let's do it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:02:47 -07:00
05e717d556 rev-parse: fix a leak with --abbrev-ref
To handle "--abbrev-ref" we use shorten_unambiguous_ref().  This
function takes a refname and returns a shortened refname, which is a
newly allocated string that needs to be freed.

Unfortunately, the refname variable is reused to receive the shortened
one.  Therefore, we lose the original refname, which needs to be freed
as well, producing a leak.

This leak can be reviewed with:

   $ for a in {1..10}; do git branch foo_${a}; done
   $ git rev-parse --abbrev-ref refs/heads/foo_{1..10}

   Direct leak of 171 byte(s) in 10 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in expand_ref refs.c
       ... in repo_dwim_ref refs.c
       ... in show_rev builtin/rev-parse.c
       ... in cmd_rev_parse builtin/rev-parse.c
       ... in run_builtin git.c

We have this leak since a45d34691e (rev-parse: --abbrev-ref option to
shorten ref name, 2009-04-13) when "--abbrev-ref" was introduced.

Let's fix it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:02:47 -07:00
be3d654343 commit: pass --no-divider to interpret-trailers
When git-commit sees any "--trailer" options, it passes the
COMMIT_EDITMSG file through git-interpret-trailers. But it does so
without passing --no-divider, which means that interpret-trailers will
look for a "---" divider to signal the end of the commit message.

That behavior doesn't make any sense in this context; we know we have a
complete and solitary commit message, not something we have to further
parse. And as a result, we'll do the wrong thing if the commit message
contains a "---" marker (which otherwise is not syntactically
significant), inserting any new trailers at the wrong spot.

We can fix this by passing --no-divider. This is the exact situation for
which it was added in 1688c9a489 (interpret-trailers: allow suppressing
"---" divider, 2018-08-22). As noted in the message for that commit, it
just adds the mechanism, and further patches were needed to trigger it
from various callers.  We did that back then in a few spots, like
ffce7f590f (sequencer: ignore "---" divider when parsing trailers,
2018-08-22), but obviously missed this one.

Reported-by: <eric.frederich@siemens.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-16 21:47:40 -07:00
0ce02e2fec credential/libsecret: store new attributes
d208bfd (credential: new attribute password_expiry_utc, 2023-02-18)
and a5c76569e7 (credential: new attribute oauth_refresh_token)
introduced new credential attributes.

libsecret assumes attribute values are non-confidential and
unchanging, so we encode the new attributes in the secret, separated by
newline:

    hunter2
    password_expiry_utc=1684189401
    oauth_refresh_token=xyzzy

This is extensible and backwards compatible. The credential protocol
already assumes that attribute values do not contain newlines.

Alternatives considered: store password_expiry_utc in a libsecret
attribute. This has the problem that libsecret creates new items
rather than overwrites when attribute values change.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-16 13:06:57 -07:00
69f4da8ead setup.c: don't setup in discover_git_directory()
discover_git_directory() started modifying the_repository in ebaf3bcf1a
(repository: move global r_f_p_c to repo struct, 2021-06-17), when, in
the repository setup process, we started copying members from the
"struct repository_format" we're inspecting to the appropriate "struct
repository". However, discover_git_directory() isn't actually used in
the setup process (its only caller in the Git binary is
read_early_config()), so it shouldn't be doing this setup at all!

As explained by 16ac8b8db6 (setup: introduce the
discover_git_directory() function, 2017-03-13) and the comment on its
declaration, discover_git_directory() is intended to be an entrypoint
into setup.c machinery that allows the Git directory to be discovered
without side effects, e.g. so that read_early_config() can read
".git/config" before the_repository has been set up.

Fortunately, we didn't start to rely on this unintended behavior between
then and now, so we let's just remove it. It isn't harming anyone, but
it's confusing.

Signed-off-by: Glen Choo <chooglen@google.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-16 08:35:06 -07:00
6c26da8404 credential: erase all matching credentials
`credential reject` sends the erase action to each helper, but the
exact behaviour of erase isn't specified in documentation or tests.
Some helpers (such as credential-store and credential-libsecret) delete
all matching credentials, others (such as credential-cache) delete at
most one matching credential.

Test that helpers erase all matching credentials. This behaviour is
easiest to reason about. Users expect that `echo
"url=https://example.com" | git credential reject` or `echo
"url=https://example.com\nusername=tim" | git credential reject` erase
all matching credentials.

Fix credential-cache.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 13:26:41 -07:00
aeb21ce22e credential: avoid erasing distinct password
Test that credential helpers do not erase a password distinct from the
input. Such calls can happen when multiple credential helpers are
configured.

Fixes for credential-cache and credential-store.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 13:26:39 -07:00
c40f0b7877 revision: handle pseudo-opts in --stdin mode
While both git-rev-list(1) and git-log(1) support `--stdin`, it only
accepts commits and files. Most notably, it is impossible to pass any of
the pseudo-opts like `--all`, `--glob=` or others via stdin.

This makes it hard to use this function in certain scripted scenarios,
like when one wants to support queries against specific revisions, but
also against reference patterns. While this is theoretically possible by
using arguments, this may run into issues once we hit platform limits
with sufficiently large queries. And because `--stdin` cannot handle
pseudo-opts, the only alternative would be to use a mixture of arguments
and standard input, which is cumbersome.

Implement support for handling pseudo-opts in both commands to support
this usecase better. One notable restriction here is that `--stdin` only
supports "stuck" arguments in the form of `--glob=foo`. This is because
"unstuck" arguments would also require us to read the next line, which
would add quite some complexity to the code. This restriction should be
fine for scripted usage though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 12:09:31 -07:00
af37a209ad revision: small readability improvement for reading from stdin
The code that reads lines from standard input manually compares whether
the read line matches "--", which is a bit awkward to read. Furthermore,
we're about to extend the code to also support reading pseudo-options
via standard input, requiring more elaborate handling of lines with a
leading dash.

Refactor the code by hoisting out the check for "--" outside of the
block that checks for a leading dash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 12:09:31 -07:00
cc8045018d revision: reorder read_revisions_from_stdin()
Reorder `read_revisions_from_stdin()` so that we can start using
`handle_revision_pseudo_opt()` without a forward declaration in a
subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 12:09:31 -07:00
3744ffcbcd ls-tree: fix documentation of %x format placeholder
ls-tree --format expands %x followed by two hexadecimal digits to the
character indicated by that hexadecimal number, e.g.:

   $ git ls-tree --format=%x41 HEAD | head -1
   A

It rejects % followed by a hexadecimal digit, e.g.:

   $ git ls-tree --format=%41 HEAD | head -1
   fatal: bad ls-tree format: element '41' does not start with '('

This functionality is provided by strbuf_expand_literal_cb(), which has
not been changed since it was factored out by fd2015b323 (strbuf:
separate callback for strbuf_expand:ing literals, 2019-01-28).

Adjust the documentation accordingly.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 11:19:02 -07:00
d57fa7fc73 doc: trailer: add more examples in DESCRIPTION
Be more up-front about what trailers are in practice with examples, to
give the reader a visual cue while they go on to read the rest of the
description.

Also add an example for multiline values.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
eda2c44c8b doc: trailer: mention 'key' in DESCRIPTION
The 'key' option is used frequently in the examples at the bottom but
there is no mention of it in the description.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
dc8937fbb9 doc: trailer.<token>.command: emphasize deprecation
This puts the deprecation notice up front, instead of leaving it to the
next paragraph.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
8e80f2916b doc: trailer: use angle brackets for <token> and <value>
We already use angle brackets elsewhere, so this makes things more
consistent.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
74a50fbd7f doc: trailer: remove redundant phrasing
The phrase "many rules" gets essentially repeated again with "many other
rules", so remove this repetition.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
229d6ab6bf doc: trailer: examples: avoid the word "message" by itself
Previously, "message" could mean the input, output, commit message, or
"internal body text inside the commit message" (in the EXAMPLES
section). Avoid overloading this term by using the appropriate meanings
explicitly.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
94f15fe5d5 doc: trailer: drop "commit message part" phrasing
The command can take inputs that are either just a commit message, or
an email-like output such as git-format-patch which includes a commit
message, "---" divider, and patch part. The existing explanation blends
these two inputs together in the first sentence

    This command reads some patches or commit messages

which then necessitates using the "commit message part" phrasing (as
opposed to just "commit message") because the input is ambiguous per the
above definition.

This change separates the two input types and explains them separately,
and so there is no longer a need to use the "commit message part"
phrase.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
00432a36e2 doc: trailer: swap verb order
This matches the order already used in the NAME section.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
bfb5f57bb3 doc: trailer: fix grammar
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
f0b68f0546 CodingGuidelines: use octal escapes, not hex
Extend the shell-scripting section of CodingGuidelines to suggest octal
escape sequences (e.g. "\302\242") over hexadecimal (e.g. "\xc2\xa2")
since the latter can be a source of portability problems.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 14:44:19 -07:00
5768478edc diff-lib: honor override_submodule_config flag bit
When `diff.ignoreSubmodules = all` is set and submodule commits are
manually staged (e.g. via `git-update-index`), `git-commit` should
record the commit  with updated submodules.

`index_differs_from` is called from `prepare_to_commit` with flags set to
`override_submodule_config = 1`. `index_differs_from` then merges the
default diff flags and passed flags.

When `diff.ignoreSubmodules` is set to "all", `flags` ends up having
both `override_submodule_config` and `ignore_submodules` set to 1. This
results in `git-commit` ignoring staged commits.

This patch restores original `flags.ignore_submodule` if
`flags.override_submodule_config` is set.

Signed-off-by: Josip Sokcevic <sokcevic@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 11:28:12 -07:00
d7d8841f67 Start the 2.42 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-13 12:29:46 -07:00
32fe7fff0c Merge branch 'zh/ls-files-format-atoms'
Some atoms that can be used in "--format=<format>" for "git ls-tree"
were not supported by "git ls-files", even though they were relevant
in the context of the latter.

* zh/ls-files-format-atoms:
  ls-files: align format atoms with ls-tree
2023-06-13 12:29:46 -07:00
ca9c063c18 Merge branch 'sl/diff-tree-sparse'
"git diff-tree" has been taught to take advantage of the
sparse-index feature.

* sl/diff-tree-sparse:
  diff-tree: integrate with sparse index
2023-06-13 12:29:46 -07:00
e490bea8a6 Merge branch 'jk/format-patch-message-id-unleak'
Leakfix.

* jk/format-patch-message-id-unleak:
  format-patch: free elements of rev.ref_message_ids list
  format-patch: free rev.message_id when exiting
2023-06-13 12:29:46 -07:00
cbc882ea38 Merge branch 'jc/pack-ref-exclude-include'
"git pack-refs" learns "--include" and "--exclude" to tweak the ref
hierarchy to be packed using pattern matching.

* jc/pack-ref-exclude-include:
  pack-refs: teach pack-refs --include option
  pack-refs: teach --exclude option to exclude refs from being packed
  docs: clarify git-pack-refs --all will pack all refs
2023-06-13 12:29:45 -07:00
ebd07c9f7e Merge branch 'sa/doc-ls-remote'
Doc update.

* sa/doc-ls-remote:
  ls-remote doc: document the output format
  ls-remote doc: explain what each example does
  ls-remote doc: show peeled tags in examples
  ls-remote doc: remove redundant --tags example
  show-branch doc: say <ref>, not <reference>
  show-ref doc: update for internal consistency
2023-06-13 12:29:45 -07:00
4c7d878df6 Merge branch 'gc/doc-cocci-updates'
Update documentation regarding Coccinelle patches.

* gc/doc-cocci-updates:
  cocci: codify authoring and reviewing practices
  cocci: add headings to and reword README
2023-06-13 12:29:45 -07:00
6901ffe80c Merge branch 'jc/diff-s-with-other-options'
The "-s" (silent, squelch) option of the "diff" family of commands
did not interact with other options that specify the output format
well.  This has been cleaned up so that it will clear all the
formatting options given before.

* jc/diff-s-with-other-options:
  diff: fix interaction between the "-s" option and other options
2023-06-13 12:29:45 -07:00
6d2a88c728 Merge branch 'kh/keep-tag-editmsg-upon-failure'
"git tag" learned to leave the "$GIT_DIR/TAG_EDITMSG" file when the
command failed, so that the user can salvage what they typed.

* kh/keep-tag-editmsg-upon-failure:
  tag: keep the message file in case ref transaction fails
  t/t7004-tag: add regression test for successful tag creation
  doc: tag: document `TAG_EDITMSG`
2023-06-13 12:29:44 -07:00
861c56f6f9 branch: fix a leak in setup_tracking
The commit d3115660b4 (branch: add flags and config to inherit tracking,
2021-12-20) replaced in "struct tracking", the member "char *src" by a
new "struct string_list *srcs".

This caused a modification in find_tracked_branch().  The string
returned by remote_find_tracking(), previously assigned to "src", is now
added to the string_list "srcs".

That string_list is initialized with STRING_LIST_INIT_DUP, which means
that what is added is not the given string, but a duplicate.  Therefore,
the string returned by remote_find_tracking() is leaked.

The leak can be reviewed with:

   $ git branch foo
   $ git remote add local .
   $ git fetch local
   $ git branch --track bar local/foo

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in find_tracked_branch branch.c
       ... in for_each_remote remote.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Let's fix the leak, using the "_nodup" API to add to the string_list.
This way, the string itself will be added instead of being strdup()'d.
And when the string_list is cleared, the string will be freed.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:06:13 -07:00
caee1d669c branch: fix a leak in check_tracking_branch
Let's fix a leak we have in check_tracking_branch() since the function
was introduced in 41c21f22d0 (branch.c: Validate tracking branches with
refspecs instead of refs/remotes/*, 2013-04-21).

The leak can be reviewed with:

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --track bar local/foo

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in check_tracking_branch branch.c
       ... in for_each_remote remote.c
       ... in validate_remote_tracking_branch branch.c
       ... in dwim_branch_start branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:06:03 -07:00
a88a3d7cd7 branch: fix a leak in inherit_tracking
In d3115660b4 (branch: add flags and config to inherit tracking,
2021-12-20) a new option was introduced to allow creating a new branch,
inheriting the tracking of another branch.

The new code, strdup()'d the remote_name of the existing branch, but
unfortunately it was not freed, producing a leak.

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --track bar local/foo
   branch 'bar' set up to track 'local/foo'.
   $ git branch --track=inherit baz bar
   branch 'baz' set up to track 'local/foo'.

   Direct leak of 6 byte(s) in 1 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in inherit_tracking branch.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Actually, the string we're strdup()'ing is from the struct branch
returned by get_branch().  Which, in turn, retrieves the string from the
global "struct repository".  This makes perfectly valid to use the
string throughout the entire execution of create_branch().  There is no
need to duplicate it.

Let's fix the leak, removing the strdup().

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:05:44 -07:00
1533bda770 branch: fix a leak in dwim_and_setup_tracking
In e89f151db1 (branch: move --set-upstream-to behavior to
dwim_and_setup_tracking(), 2022-01-28) the string returned by
dwim_branch_start() was not freed, resulting in a memory leak.

It can be reviewed with:

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --set-upstream-to local/foo foo

   Direct leak of 23 byte(s) in 1 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in expand_ref refs.c
       ... in repo_dwim_ref refs.c
       ... in dwim_branch_start branch.c
       ... in dwim_and_setup_tracking branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Let's free it now.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:05:09 -07:00
4689101a40 remote: fix a leak in query_matches_negative_refspec
In c0192df630 (refspec: add support for negative refspecs, 2020-09-30)
query_matches_negative_refspec() was introduced.

The function was implemented as a two-loop process, where the former
loop accumulates and the latter evaluates.  To accumulate, a string_list
is used.

Within the first loop, there are three cases where a string is added to
the string_list.  Two of them add strings that do not need to be
freed.  But in the third case, the string added is returned by
match_name_with_pattern(), which needs to be freed.

The string_list is initialized with STRING_LIST_INIT_NODUP, i.e.  when
cleared, the strings added are not freed.  Therefore, the string
returned by match_name_with_pattern() is not freed, so we have a leak.

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --track bar local/foo

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_matches_negative_refspec remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in find_tracked_branch branch.c
       ... in for_each_remote remote.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_matches_negative_refspec remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in check_tracking_branch branch.c
       ... in for_each_remote remote.c
       ... in validate_remote_tracking_branch branch.c
       ... in dwim_branch_start branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

An interesting point to note is that while string_list_append() is used
in the first two cases described, string_list_append_nodup() is used in
the third.  This seems to indicate an intention to delegate the
responsibility for freeing the string, to the string_list.  As if the
string_list had been initialized with STRING_LIST_INIT_DUP, i.e.  the
strings are strdup()'d when added (except if the "_nodup" API is used)
and freed when cleared.

Switching to STRING_LIST_INIT_DUP fixes the leak and probably is what we
wanted to do originally.  Let's do it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:04:28 -07:00
003c1f1171 config: fix a leak in git_config_copy_or_rename_section_in_file
In 52d59cc645 (branch: add a --copy (-c) option to go with --move (-m),
2017-06-18) a new strbuf variable was introduced, but not released.

Thus, when copying a branch that has any configuration, we have a
leak.

   $ git branch foo
   $ git config branch.foo.some-key some_value
   $ git branch -c foo bar

   Direct leak of 65 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_vaddf strbuf.c
       ... in strbuf_addf strbuf.c
       ... in store_create_section config.c
       ... in git_config_copy_or_rename_section_in_file config.c
       ... in git_config_copy_section_in_file config.c
       ... in git_config_copy_section config.c
       ... in copy_or_rename_branch builtin/branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Let's fix that leak.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:04:16 -07:00
06f3867865 pack-bitmap.c: gracefully degrade on failure to load MIDX'd pack
When opening a MIDX bitmap, we the pack-bitmap machinery eagerly calls
`prepare_midx_pack()` on each of the packs contained in the MIDX. This
is done in order to populate the array of `struct packed_git *`s held by
the MIDX, which we need later on in `load_reverse_index()`, since it
calls `load_pack_revindex()` on each of the MIDX'd packs, and requires
that the caller provide a pointer to a `struct packed_git`.

When opening one of these packs fails, the pack-bitmap code will `die()`
indicating that it can't open one of the packs in the MIDX. This
indicates that the MIDX is somehow broken with respect to the current
state of the repository. When this is the case, we indeed cannot make
use of the MIDX bitmap to speed up reachability traversals.

However, it does not mean that we can't perform reachability traversals
at all. In other failure modes, that same function calls `warning()` and
then returns -1, indicating to its caller (`open_bitmap()`) that we
should either look for a pack bitmap if one is available, or perform
normal object traversal without using bitmaps at all.

There is no reason why this case should cause us to die. If we instead
continued (by jumping to `cleanup` as this patch does) and avoid using
bitmaps altogether, we may again try and query the MIDX, which will also
fail. But when trying to call `fill_midx_entry()` fails, it also returns
a signal of its failure, and prompts the caller to try and locate the
object elsewhere.

In other words, the normal object traversal machinery works fine in the
presence of a corrupt MIDX, so there is no reason that the MIDX bitmap
machinery should abort in that case when we could easily continue.

Note that we *could* in theory try again to load a MIDX bitmap after
calling `reprepare_packed_git()`. Even though the `prepare_packed_git()`
code is careful to avoid adding a pack that we already have,
`prepare_midx_pack()` is not. So if we got part of the way through
calling `prepare_midx_pack()` on a stale MIDX, and then tried again on a
fresh MIDX that contains some of the same packs, we would end up with a
loop through the `->next` pointer.

For now, let's do the simplest thing possible and fallback to the
non-bitmap code when we detect a stale MIDX so that the complete fix as
above can be implemented carefully.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 14:19:31 -07:00
4dc16e2cb0 gc: introduce gc.recentObjectsHook
This patch introduces a new multi-valued configuration option,
`gc.recentObjectsHook` as a means to mark certain objects as recent (and
thus exempt from garbage collection), regardless of their age.

When performing a garbage collection operation on a repository with
unreachable objects, Git makes its decision on what to do with those
object(s) based on how recent the objects are or not. Generally speaking,
unreachable-but-recent objects stay in the repository, and older objects
are discarded.

However, we have no convenient way to keep certain precious, unreachable
objects around in the repository, even if they have aged out and would
be pruned. Our options today consist of:

  - Point references at the reachability tips of any objects you
    consider precious, which may be undesirable or infeasible if there
    are many such objects.

  - Track them via the reflog, which may be undesirable since the
    reflog's lifetime is limited to that of the reference it's tracking
    (and callers may want to keep those unreachable objects around for
    longer).

  - Extend the grace period, which may keep around other objects that
    the caller *does* want to discard.

  - Manually modify the mtimes of objects you want to keep. If those
    objects are already loose, this is easy enough to do (you can just
    enumerate and `touch -m` each one).

    But if they are packed, you will either end up modifying the mtimes
    of *all* objects in that pack, or be forced to write out a loose
    copy of that object, both of which may be undesirable. Even worse,
    if they are in a cruft pack, that requires modifying its `*.mtimes`
    file by hand, since there is no exposed plumbing for this.

  - Force the caller to construct the pack of objects they want
    to keep themselves, and then mark the pack as kept by adding a
    ".keep" file. This works, but is burdensome for the caller, and
    having extra packs is awkward as you roll forward your cruft pack.

This patch introduces a new option to the above list via the
`gc.recentObjectsHook` configuration, which allows the caller to
specify a program (or set of programs) whose output is treated as a set
of objects to treat as recent, regardless of their true age.

The implementation is straightforward. Git enumerates recent objects via
`add_unseen_recent_objects_to_traversal()`, which enumerates loose and
packed objects, and eventually calls add_recent_object() on any objects
for which `want_recent_object()`'s conditions are met.

This patch modifies the recency condition from simply "is the mtime of
this object more recent than the cutoff?" to "[...] or, is this object
mentioned by at least one `gc.recentObjectsHook`?".

Depending on whether or not we are generating a cruft pack, this allows
the caller to do one of two things:

  - If generating a cruft pack, the caller is able to retain additional
    objects via the cruft pack, even if they would have otherwise been
    pruned due to their age.

  - If not generating a cruft pack, the caller is likewise able to
    retain additional objects as loose.

A potential alternative here is to introduce a new mode to alter the
contents of the reachable pack instead of the cruft one. One could
imagine a new option to `pack-objects`, say `--extra-reachable-tips`
that does the same thing as above, adding the visited set of objects
along the traversal to the pack.

But this has the unfortunate side-effect of altering the reachability
closure of that pack. If parts of the unreachable object graph mentioned
by one or more of the "extra reachable tips" programs is not closed,
then the resulting pack won't be either. This makes it impossible in the
general case to write out reachability bitmaps for that pack, since
closure is a requirement there.

Instead, keep these unreachable objects in the cruft pack (or set of
unreachable, loose objects) instead, to ensure that we can continue to
have a pack containing just reachable objects, which is always safe to
write a bitmap over.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 14:12:20 -07:00
01e9ca4a40 reachable.c: extract obj_is_recent()
When enumerating objects in order to add recent ones (i.e. those whose
mtime is strictly newer than the cutoff) as tips of a reachability
traversal, `add_recent_object()` discards objects which do not meet the
recency criteria.

The subsequent commit will make checking whether or not an object is
recent also consult the list of hooks in `pack.recentHook`. Isolate this
check in its own function to keep the additional complexity outside of
`add_recent_object()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 14:08:51 -07:00
73320e49ad builtin/repack.c: only collect fully-formed packs
To partition the set of packs based on which ones are "kept" (either
they have a .keep file, or were otherwise marked via the `--keep-pack`
option) and "non-kept" ones (anything else), `git repack` uses its
`collect_pack_filenames()` function.

Ordinarily, we would rely on a convenience function such as
`get_all_packs()` to enumerate and partition the set of packs. But
`collect_pack_filenames()` uses `readdir()` directly to read the
contents of the "$GIT_DIR/objects/pack" directory, and adds each entry
ending in ".pack" to the appropriate list (either kept, or non-kept as
above).

This is subtly racy, since `collect_pack_filenames()` may see a pack
that is not fully staged (i.e., it is missing its ".idx" file).
Ordinarily, this doesn't cause a problem. But it can cause issues when
generating a cruft pack.

This is because `git repack` feeds (among other things) the list of
existing kept packs down to `git pack-objects --cruft` to indicate that
any kept packs will not be removed from the repository (so that the
cruft pack machinery can avoid packing objects that appear in those
packs as cruft).

But `read_cruft_objects()` lists packfiles by calling `get_all_packs()`.
So if a ".pack" file exists (necessary to get that pack to appear to
`collect_pack_filenames()`), but doesn't have a corresponding ".idx"
file (necessary to get that pack to appear via `get_all_packs()`), we'll
complain with:

    fatal: could not find pack '.tmp-5841-pack-a6b0150558609c323c496ced21de6f4b66589260.pack'

Fix the above by teaching `collect_pack_filenames()` to only collect
packs with their corresponding `*.idx` files in place, indicating that
those packs have been fully staged.

There are a couple of things worth noting:

  - Since each entry in the `extra_keep` list (which contains the
    `--keep-pack` names) has a `*.pack` suffix, we'll have to swap the
    suffix from ".pack" to ".idx", and compare that instead.

  - Since we use the the `fname_kept_list` to figure out which packs to
    delete (with `git repack -d`), we would have previously deleted a
    `*.pack` with no index (since the existince of a ".pack" file is
    necessary and sufficient to include that pack in the list of
    existing non-kept packs).

    Now we will leave it alone (since that pack won't appear in the
    list). This is far more correct behavior, since we don't want
    to race with a pack being staged. Deleting a partially staged pack
    is unlikely, however, since the window of time between staging a
    pack and moving its .idx file into place is miniscule.

    Note that this window does *not* include the time it takes to
    receive and index the pack, since the incoming data goes into
    "$GIT_DIR/objects/tmp_pack_XXXXXX", which does not end in ".pack"
    and is thus ignored by collect_pack_filenames().

In the future, this function should probably be rewritten as a callback
to `for_each_file_in_pack_dir()`, but this is the simplest change we
could do in the short-term.

Reported-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:54:38 -07:00
548afb0d9a docs: typofixes
These were found with an automated CLI tool [1]. Only the
"Documentation" subfolder (and not source code files) was considered
because the docs are user-facing.

[1]: https://crates.io/crates/typos-cli

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:52:51 -07:00
78e56cff69 t/lib-gpg: require GPGSSH for GPGSSH_VERIFYTIME prereq
The GPGSSH_VERIFYTIME prequeq makes use of "${GNUPGHOME}" but does not
create it.  Require GPGSSH which creates the "${GNUPGHOME}" directory.

Additionally, it makes sense to require GPGSSH in GPGSSH_VERIFYTIME
because the latter builds on the former.  If we can't use GPGSSH,
there's little point in checking whether GPGSSH_VERIFYTIME is usable.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:50:39 -07:00
787cb8a48a strbuf: remove global variable
As a library that only interacts with other primitives, strbuf should
not utilize the comment_line_char global variable within its
functions. Therefore, add an additional parameter for functions that use
comment_line_char and refactor callers to pass it in instead.
strbuf_stripspace() removes the skip_comments boolean and checks if
comment_line_char is a non-NUL character to determine whether to skip
comments or not.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
aba0706832 path: move related function to path
Move path-related function from strbuf.[ch] to path.[ch] so that strbuf
is focused on string manipulation routines with minimal dependencies.

repository.h is no longer a necessary dependency after moving this
function out.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
f94018506c object-name: move related functions to object-name
Move object-name-related functions from strbuf.[ch] to object-name.[ch]
so that strbuf is focused on string manipulation routines with minimal
dependencies.

dir.h relied on the forward declration of the repository struct in
strbuf.h. Since that is removed in this patch, add the forward
declaration to dir.h.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
f89854362c credential-store: move related functions to credential-store file
is_rfc3986_unreserved() and is_rfc3986_reserved_or_unreserved() are only
called from builtin/credential-store.c and they are only relevant to that
file so move those functions and make them static.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
5d1344b497 abspath: move related functions to abspath
Move abspath-related functions from strbuf.[ch] to abspath.[ch] so that
strbuf is focused on string manipulation routines with minimal
dependencies.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:35 -07:00
16b171fda0 strbuf: clarify dependency
refs.h was once needed but is no longer so as of 6bab74e7fb ("strbuf:
move strbuf_branchname to sha1_name.c", 2010-11-06). strbuf.h was
included thru refs.h, so removing refs.h requires strbuf.h to be added
back.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:35 -07:00
4557779660 strbuf: clarify API boundary
strbuf, as a generic and widely used structure across the codebase,
should be limited as a library to only interact with primitives. Add
documentation so future functions can appropriately be placed. Older
functions that do not follow this boundary should eventually be moved or
refactored.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:35 -07:00
9c7d1b057f repository: create read_replace_refs setting
The 'read_replace_refs' global specifies whether or not we should
respect the references of the form 'refs/replace/<oid>' to replace which
object we look up when asking for '<oid>'. This global has caused issues
when it is not initialized properly, such as in b6551feadf (merge-tree:
load default git config, 2023-05-10).

To make this more robust, move its config-based initialization out of
git_default_config and into prepare_repo_settings(). This provides a
repository-scoped version of the 'read_replace_refs' global.

The global still has its purpose: it is disabled process-wide by the
GIT_NO_REPLACE_OBJECTS environment variable or by a call to
disable_replace_refs() in some specific Git commands.

Since we already encapsulated the use of the constant inside
replace_refs_enabled(), we can perform the initialization inside that
method, if necessary. This solves the problem of forgetting to check the
config, as we will check it before returning this value.

Due to this encapsulation, the global can move to be static within
replace-object.c.

There is an interesting behavior change possible here: we now have a
repository-scoped understanding of this config value. Thus, if there was
a command that recurses into submodules and might follow replace refs,
then it would now respect the core.useReplaceRefs config value in each
repository.

'git grep --recurse-submodules' is such a command that recurses into
submodules in-process. We can demonstrate the granularity of this config
value via a test in t7814.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
f1178380ac replace-objects: create wrapper around setting
The 'read_replace_objects' constant is initialized by git_default_config
(if core.useReplaceRefs is disabled) and within setup_git_env (if
GIT_NO_REPLACE_OBJECTS) is set. To ensure that this variable cannot be
set accidentally in other places, wrap it in a replace_refs_enabled()
method.

Since we still assign this global in config.c, we are not able to remove
the global scope of this variable and make it a static within
replace-object.c. This will happen in a later change which will also
prevent the variable from being read before it is initialized.

Centralizing read access to the variable is an important first step.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
d24eda4e03 repository: create disable_replace_refs()
Several builtins depend on being able to disable the replace references
so we actually operate on each object individually. These currently do
so by directly mutating the 'read_replace_refs' global.

A future change will move this global into a different place, so it will
be necessary to change all of these lines. However, we can simplify that
transition by abstracting the purpose of these global assignments with a
method call.

We will need to keep this read_replace_refs global forever, as we want
to make sure that we never use replace refs throughout the life of the
process if this method is called. Future changes may present a
repository-scoped version of the variable to represent that repository's
core.useReplaceRefs config value, but a zero-valued read_replace_refs
will always override such a setting.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
f79e18849b cat-file: add option '-Z' that delimits input and output with NUL
In db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with
`-z`, 2022-07-22), we have introduced a new mode to read the input via
NUL-delimited records instead of newline-delimited records. This allows
the user to query for revisions that have newlines in their path
component. While unusual, such queries are perfectly valid and thus it
is clear that we should be able to support them properly.

Unfortunately, the commit only changed the input to be NUL-delimited,
but didn't change the output at the same time. While this is fine for
queries that are processed successfully, it is less so for queries that
aren't. In the case of missing commits for example the result can become
entirely unparsable:

```
$ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" |
    git cat-file --batch -z
7ce4f05bae blob 10
1234567890

commit missing
```

This is of course a crafted query that is intentionally gaming the
deficiency, but more benign queries that contain newlines would have
similar problems.

Ideally, we should have also changed the output to be NUL-delimited when
`-z` is specified to avoid this problem. As the input is NUL-delimited,
it is clear that the output in this case cannot ever contain NUL
characters by itself. Furthermore, Git does not allow NUL characters in
revisions anyway, further stressing the point that using NUL-delimited
output is safe. The only exception is of course the object data itself,
but as git-cat-file(1) prints the size of the object data clients should
read until that specified size has been consumed.

But even though `-z` has only been introduced a few releases ago in Git
v2.38.0, changing the output format retroactively to also NUL-delimit
output would be a backwards incompatible change. And while one could
make the argument that the output is inherently broken already, we need
to assume that there are existing users out there that use it just fine
given that revisions containing newlines are quite exotic.

Instead, introduce a new option `-Z` that switches to NUL-delimited
input and output. While this new option could arguably only switch the
output format to be NUL-delimited, the consequence would be that users
have to always specify both `-z` and `-Z` when the input may contain
newlines. On the other hand, if the user knows that there never will be
newlines in the input, they don't have to use either of those options.
There is thus no usecase that would warrant treating input and output
format separately, which is why we instead opt to "do the right thing"
and have `-Z` mean to NUL-terminate both formats.

The old `-z` option is marked as deprecated with a hint that its output
may become unparsable. It is thus hidden both from the synopsis as well
as the command's help output.

Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:46 -07:00
3217f52a49 cat-file: simplify reading from standard input
The batch modes of git-cat-file(1) read queries from stantard input that
are either newline- or NUL-delimited. This code was introduced via
db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`,
2022-07-22), which notes that:

"""
The refactoring here is slightly unfortunate, since we turn loops like:

     while (strbuf_getline(&buf, stdin) != EOF)

 into:

     while (1) {
         int ret;
         if (opt->nul_terminated)
             ret = strbuf_getline_nul(&input, stdin);
         else
             ret = strbuf_getline(&input, stdin);

         if (ret == EOF)
             break;
     }
"""

The commit proposed introducing a helper function that is easier to use,
which is just what we have done in the preceding commit. Refactor the
code to use this new helper to simplify the loop.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:24 -07:00
af35e56b0f strbuf: provide CRLF-aware helper to read until a specified delimiter
Many of our commands support reading input that is separated either via
newlines or via NUL characters. Furthermore, in order to be a better
cross platform citizen, these commands typically know to strip the CRLF
sequence so that we also support reading newline-separated inputs on
e.g. the Windows platform. This results in the following kind of awkward
pattern:

```
struct strbuf input = STRBUF_INIT;

while (1) {
	int ret;

	if (nul_terminated)
		ret = strbuf_getline_nul(&input, stdin);
	else
		ret = strbuf_getline(&input, stdin);
	if (ret)
		break;

	...
}
```

Introduce a new CRLF-aware helper function that can read up to a user
specified delimiter. If the delimiter is `\n` the function knows to also
strip CRLF, otherwise it will only strip the specified delimiter. This
results in the following, much more readable code pattern:

```
struct strbuf input = STRBUF_INIT;

while (strbuf_getdelim_strip_crlf(&input, stdin, delim) != EOF) {
	...
}
```

The new function will be used in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:24 -07:00
b116c77307 t1006: modernize test style to use test_cmp
The tests for git-cat-file(1) are quite old and haven't ever been
updated since they were introduced. They thus tend to use old idioms
that have since grown outdated. Most importantly, many of the tests use
`test $A = $B` to compare expected and actual output. This has the
downside that it is impossible to tell what exactly is different between
both versions in case the test fails.

Refactor the tests to instead use `test_cmp`. While more verbose, it
both tends to be more readable and will result in a nice diff in case
states don't match.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:24 -07:00
c7309f63c6 t1006: don't strip timestamps from expected results
In t1006 we have a bunch of tests that verify the output format of the
git-cat-file(1) command. But while part of the output for some tests
would include commit timestamps, we don't verify those but instead strip
them before comparing expected with actual results. This is done by the
function `maybe_remove_timestamp`, which goes all the way back to the
ancient commit b335d3f121 (Add tests for git cat-file, 2008-04-23).

Our tests had been in a different shape back then. Most importantly we
didn't yet have the infrastructure to create objects with deterministic
timestamps. Nowadays we do though, and thus there is no reason anymore
to strip the timestamps.

Refactor the tests to not strip the timestamp anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:24 -07:00
4a53d0d0bc mingw: use lowercase includes for some Windows headers
When cross-compiling with the mingw toolchain on a system with a case
sensitive filesystem, the mixed case (which is technically correct as
per the contents of MS Visual C++) doesn't work (the corresponding mingw
headers are all lowercase for some reason).

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 12:31:52 -07:00
8fac776f44 worktree: integrate with sparse-index
The index is read in 'worktree.c' at two points:

1.The 'validate_no_submodules' function, which checks if there are any
submodules present in the worktree.

2.The 'check_clean_worktree' function, which verifies if a worktree is
'clean', i.e., there are no untracked or modified but uncommitted files.
This is done by running the 'git status' command, and an error message
is thrown if the worktree is not clean. Given that 'git status' is
already sparse-aware, the function is also sparse-aware.

Hence we can just set the requires-full-index to false for
"git worktree".

Add tests that verify that 'git worktree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.

The `p2000` tests demonstrate a ~20% execution time reduction for
'git worktree' using a sparse index:

(Note:the p2000 test results didn't reflect the huge speedup because of
the index reading time is minuscule comparing to the filesystem
operations.)

Test                                       before  after
-----------------------------------------------------------------------
2000.102: git worktree add....(full-v3)    3.15    2.82  -10.5%
2000.103: git worktree add....(full-v4)    3.14    2.84  -9.6%
2000.104: git worktree add....(sparse-v3)  2.59    2.14  -16.4%
2000.105: git worktree add....(sparse-v4)  2.10    1.57  -25.2%

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 12:13:58 -07:00
6d224ac286 run-command: report exec error even on ENOENT
If execve(2) fails with ENOENT and we report the error, we use the
format "cannot run %s", followed by the actual error message.  For other
errors we use "cannot exec '%s'".

Stop making this subtle distinction and use the second format for all
execve(2) errors.  This simplifies the code and makes the prefix more
precise by indicating the failed operation.  It also allows us to
slightly simplify t1800.16.

On Windows -- which lacks execve(2) -- we already use a single format in
all cases: "cannot spawn %s".

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 11:00:22 -07:00
6b6fe8b43e t1800: loosen matching of error message for bad shebang
t1800.16 checks whether an attempt to run a hook script with a missing
executable in its #! line fails and reports that error.  The expected
error message differs between platforms.  The test handles two common
variants, but on NonStop OS we get a third one: "fatal: cannot exec
'bad-hooks/test-hook': ...", which causes the test to fail there.

We don't really care about the specific message text all that much here.
Use grep and a single regex with alternations to ascertain that we get
an error message (fatal or otherwise) about the failed invocation of the
hook, but don't bother checking if we get the right variant for the
platform the test is running on or whether quoting is done.  This looser
check let's the test pass on NonStop OS.

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 11:00:21 -07:00
03bf92b9bf statinfo.h: move DTYPE defines from dir.h
592fc5b3 (dir.h: move DTYPE defines from cache.h, 2023-04-22) moved
DTYPE macros from cache.h to dir.h, but they are still used by cache.h
to implement ce_to_dtype(); cache.h cannot include dir.h because that
would cause name-hash.c to have two different and conflicting
definitions of `struct dir_entry`. (That should be separately fixed.)

Both dir.h and cache.h include statinfo.h, and this seems a reasonable
place for these definitions.

This change fixes a broken build issue on old SunOS.

Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Signed-off-by: Alejandro R Sedeño <asedeno@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 10:59:01 -07:00
6f74648cea add: test use of brackets when color is disabled
From 02156b81bbb2cafb19d702c55d45714fcf224048 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Wed, 7 Jun 2023 09:39:01 -0400
Subject: [PATCH v2 2/2] add: test use of brackets when color is disabled

The interactive add command, 'git add -i', displays a menu of options
using full words. When color is enabled, the first letter of each word
is changed to a highlight color to signal that the first letter could be
used as a command. Without color, brackets ("[]") are used around these
first letters.

This behavior was not previously tested directly in t3701, so add a test
for it now. Since we use 'git add -i >actual <input' without
'force_color', the color system recognizes that colors are not available
on stdout and will be disabled by default.

This test would reproduce correctly with or without the fix in the
previous commit to make sure that color.ui is respected in 'git add'.

Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 10:50:18 -07:00
7cf3b49f47 add: check color.ui for interactive add
When 'git add -i' and 'git add -p' were converted to a builtin, they
introduced a color bug: the 'color.ui' config setting is ignored.

The included test demonstrates an example that is similar to the
previous test, which focuses on customizing colors. Here, we are
demonstrating that colors are not being used at all by comparing the raw
output and the color-decoded version of that output.

The fix is simple, to use git_color_default_config() as the fallback for
git_add_config(). A more robust change would instead encapsulate the
git_use_color_default global in methods that would check the config
setting if it has not been initialized yet. Some ideas are being
discussed on this front [1], but nothing has been finalized.

[1] https://lore.kernel.org/git/pull.1539.git.1685716420.gitgitgadget@gmail.com/

This test case naturally bisects down to 0527ccb1b5 (add -i: default to
the built-in implementation, 2021-11-30), but the fix makes it clear
that this would be broken even if we added the config to use the builtin
earlier than this.

Reported-by: Greg Alexander <gitgreg@galexander.org>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 10:49:16 -07:00
aeee1408ce notes: move the documentation to the struct
Its better to document the struct members directly instead of on a
function that takes a pointer to the struct. This will also make it
easier to update the documentation in the future.

Make adjustments for this new context. Also drop “may contain” since we
don’t need to emphasize that a list could be empty.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-06 09:35:05 +09:00
a2e9dbb884 notes: update documentation for use_default_notes
`suppress_default_notes` was renamed to `use_default_notes` in
3a03cf6b1d (notes: refactor display notes default handling,
2011-03-29).

The commit message says that “values less than one [indicates] “not
set” ”, but what was meant was probably “less than zero” (the author of
3a03cf6b1d agrees on this point).

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-06 09:35:03 +09:00
26c9c03f0a ref-filter: add new "signature" atom
Duplicate the code for outputting the signature and its other
parameters for commits and tags in ref-filter from pretty. In the
future, this will help in getting rid of the current duplicate
implementations of such logic everywhere, when ref-filter can do
everything that pretty is doing.

The new atom "signature" and its friends are equivalent to the existing
pretty formats as follows:

	%(signature) = %GG
	%(signature:grade) = %G?
	%(siganture:signer) = %GS
	%(signature:key) = %GK
	%(signature:fingerprint) = %GF
	%(signature:primarykeyfingerprint) = %GP
	%(signature:trustlevel) = %GT

Co-authored-by: Hariom Verma <hariom18599@gmail.com>
Co-authored-by: Jaydeep Das <jaydeepjd.8914@gmail.com>
Co-authored-by: Nsengiyumva Wilberforce <nsengiyumvawilberforce@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-06 09:32:15 +09:00
2f36339fa8 t/lib-gpg: introduce new prereq GPG2
GnuPG v2.0.0 released in 2006, which according to its release notes

	https://gnupg.org/download/release_notes.html

is the "First stable version of GnuPG integrating OpenPGP and S/MIME".

Use this version or its successors for tests that will fail for
versions less than v2.0.0 because of the difference in the output on
stderr between the versions (v2.* vs v0.* or v2.* vs v1.*). Skip if
the GPG version detected is less than v2.0.0.

Do not, however, remove the existing prereq GPG yet since a lot of tests
still work with the prereq GPG (that is even with versions v0.* or v1.*)
and some systems still use these versions.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-06 09:32:12 +09:00
68b51172e3 commit-reach: fix memory leak in get_reachable_subset()
This is a leak that has existed since the method was first created
in fcb2c0769d (commit-reach: implement get_reachable_subset,
2018-11-02).

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-04 13:43:48 +09:00
d88d727143 ci: drop linux-clang job
Since the linux-asan-ubsan job runs using clang under Linux, there is
not much point in running a separate clang job. Any errors that a normal
clang compile-and-test cycle would find are likely to be a subset of
what the sanitizer job will find. Since this job takes ~14 minutes to
run in CI, this shaves off some of our CPU load (though it does not
affect end-to-end runtime, since it's typically run in parallel and is
not the longest job).

Technically this provides us with slightly less signal for a given run,
since you won't immediately know if a failure in the sanitizer job is
from using clang or from the sanitizers themselves. But it's generally
obvious from the logs, and anyway your next step would be to fix the
probvlem and re-run CI, since we expect all of these jobs to pass
normally.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 10:35:13 +09:00
ec6915265a ci: run ASan/UBSan in a single job
When we started running sanitizers in CI via 1c0962c0c4 (ci: add address
and undefined sanitizer tasks, 2022-10-20), we ran them as two separate
CI jobs, since as that commit notes, the combination "seems to take
forever".

And indeed, it does with gcc. However, since the previous commit
switched to using clang, the situation is different, and we can save
some CPU by using a single job for both. Comparing before/after CI runs,
this saved about 14 minutes (the single combined job took 54m, versus
44m plus 24m for ASan and UBSan jobs, respectively). That's wall-clock
and not CPU, but since our jobs are mostly CPU-bound, the two should be
closely proportional.

This does increase the end-to-end time of a CI run, though, since before
this patch the two jobs could run in parallel, and the sanitizer job is
our longest single job. It also means that we won't get a separate
result for "this passed with UBSan but not with ASan" or vice versa).
But as 1c0962c0c4 noted, that is not a very useful signal in practice.

Below are some more detailed timings of gcc vs clang that I measured by
running the test suite on my local workstation. Each measurement counts
only the time to run the test suite with each compiler (not the compile
time itself). We'll focus on the wall-clock times for simplicity, though
the CPU times follow roughly similar trends.

Here's a run with CC=gcc as a baseline:

  real	1m12.931s
  user	9m30.566s
  sys	8m9.538s

Running with SANITIZE=address increases the time by a factor of ~4.7x:

  real	5m40.352s
  user	49m37.044s
  sys	36m42.950s

Running with SANITIZE=undefined increases the time by a factor of ~1.7x:

  real	2m5.956s
  user	12m42.847s
  sys	19m27.067s

So let's call that 6.4 time units to run them separately (where a unit
is the time it takes to run the test suite with no sanitizers). As a
simplistic model, we might imagine that running them together would take
5.4 units (we save 1 unit because we are no longer running the test
suite twice, but just paying the sanitizer overhead on top of a single
run).

But that's not what happens. Running with SANITIZE=address,undefined
results in a factor of 9.3x:

  real	11m9.817s
  user	77m31.284s
  sys	96m40.454s

So not only did we not get faster when doing them together, we actually
spent 1.5x as much CPU as doing them separately! And while those
wall-clock numbers might not look too terrible, keep in mind that this
is on an unloaded 8-core machine. In the CI environment, wall-clock
times will be much closer to CPU times. So not only are we wasting CPU,
but we risk hitting timeouts.

Now let's try the same thing with clang. Here's our no-sanitizer
baseline run, which is almost identical to the gcc one (which is quite
convenient, because we can keep using the same "time units" to get an
apples-to-apples comparison):

  real	1m11.844s
  user	9m28.313s
  sys	8m8.240s

And now again with SANITIZE=address, we get a 5x factor (so slightly
worse than gcc's 4.7x, though I wouldn't read too much into it; there is
a fair bit of run-to-run noise):

  real	6m7.662s
  user	49m24.330s
  sys	44m13.846s

And with SANITIZE=undefined, we are at 1.5x, slightly outperforming gcc
(though again, that's probably mostly noise):

  real	1m50.028s
  user	11m0.973s
  sys	16m42.731s

So running them separately, our total cost is 6.5x. But if we combine
them in a single run (SANITIZE=address,undefined), we get:

  real	6m51.804s
  user	52m32.049s
  sys	51m46.711s

which is a factor of 5.7x. That's along the lines we'd hoped for!
Running them together saves us almost a whole time unit. And that's not
counting any time spent outside the test suite itself (starting the job,
setting up the environment, compiling) that we're no longer duplicating
by having two jobs.

So clang behaves like we'd hope: the overhead to run the sanitizers is
additive as you add more sanitizers. Whereas gcc's numbers seem very
close to multiplicative, almost as if the sanitizers were enforcing
their overheads on each other (though that is purely a guess on what is
going on; ultimately what matters to us is the amount of time it takes).

And that roughly matches the CI improvement I saw. A "time unit" there
is more like 12 minutes, and the observed time savings was 14 minutes
(with the extra presumably coming from avoiding duplicated setup, etc).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 10:35:13 +09:00
85a62951e5 ci: use clang for ASan/UBSan checks
Both gcc and clang support the "address" and "undefined" sanitizers.
However, they may produce different results. We've seen at least two
real world cases where gcc missed a UBSan problem but clang found it:

  1. Clang's UBSan (using clang 14.0.6) found a string index that was
     subtracted to "-1", causing an out-of-bounds read (curiously this
     didn't trigger ASan, but that may be because the string was in the
     argv memory, not stack or heap). Using gcc (version 12.2.0) didn't
     find the same problem.

     Original thread:
     https://lore.kernel.org/git/20230519005447.GA2955320@coredump.intra.peff.net/

  2. Clang's UBSan (using clang 4.0.1) complained about pointer
     arithmetic with NULL, but gcc at the time did not. This was in
     2017, and modern gcc does seem to find the issue, though.

     Original thread:
     https://lore.kernel.org/git/32a8b949-638a-1784-7fba-948ae32206fc@web.de/

Since we don't otherwise have a particular preference for one compiler
over the other for this test, let's switch to the one that we think may
be more thorough.

Note that it's entirely possible that the two are simply _different_,
and we are trading off problems that gcc would find that clang wouldn't.
However, my subjective and anecdotal experience has been that clang's
sanitizer support is a bit more mature (e.g., I recall other oddities
around leak-checking where clang performed more sensibly).

Obviously running both and cross-checking the results would give us the
best coverage, but that's very expensive to run (and these are already
some of our most expensive CI jobs). So let's use clang as our best
guess, and we can re-evaluate if we get more data points.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 10:35:13 +09:00
8260bc5902 diff: detect pathspec magic not supported by --follow
The --follow code doesn't handle most forms of pathspec magic. We check
that no unexpected ones have made it to try_to_follow_renames() with a
runtime GUARD_PATHSPEC() check, which gives behavior like this:

  $ git log --follow ':(icase)makefile' >/dev/null
  BUG: tree-diff.c:596: unsupported magic 10
  Aborted

The same is true of ":(glob)", ":(attr)", and so on. It's good that we
notice the problem rather than continuing and producing a wrong answer.
But there are two non-ideal things:

  1. The idea of GUARD_PATHSPEC() is to catch programming errors where
     low-level code gets unexpected pathspecs. We'd usually try to catch
     unsupported pathspecs by passing a magic_mask to parse_pathspec(),
     which would give the user a much better message like:

       pathspec magic not supported by this command: 'icase'

     That doesn't happen here because git-log usually _does_ support
     all types of pathspec magic, and so it passes "0" for the mask
     (this call actually happens in setup_revisions()). It needs to
     distinguish the normal case from the "--follow" one but currently
     doesn't.

  2. In addition to --follow, we have the log.follow config option. When
     that is set, we try to turn on --follow mode only when there is a
     single pathspec (since --follow doesn't handle anything else). But
     really, that ought to be expanded to "use --follow when the
     pathspec supports it". Otherwise, we'd complain any time you use an
     exotic pathspec:

       $ git config log.follow true
       $ git log ':(icase)makefile' >/dev/null
       BUG: tree-diff.c:596: unsupported magic 10
       Aborted

     We should instead just avoid enabling follow mode if it's not
     supported by this particular invocation.

This patch expands our diff_check_follow_pathspec() function to cover
pathspec magic, solving both problems.

A few final notes:

  - we could also solve (1) by passing the appropriate mask to
    parse_pathspec(). But that's not great for two reasons. One is that
    the error message is less precise. It says "magic not supported by
    this command", but really it is not the command, but rather the
    --follow option which is the problem. The second is that it always
    calls die(). But for our log.follow code, we want to speculatively
    ask "is this pathspec OK?" and just get a boolean result.

  - This is obviously the right thing to do for ':(icase)' and most
    other magic options. But ':(glob)' is a bit odd here. The --follow
    code doesn't support wildcards, but we allow them anyway. From
    try_to_follow_renames():

	#if 0
	        /*
	         * We should reject wildcards as well. Unfortunately we
	         * haven't got a reliable way to detect that 'foo\*bar' in
	         * fact has no wildcards. nowildcard_len is merely a hint for
	         * optimization. Let it slip for now until wildmatch is taught
	         * about dry-run mode and returns wildcard info.
	         */
	        if (opt->pathspec.has_wildcard)
	                BUG("wildcards are not supported");
	#endif

    So something like "git log --follow 'Make*'" is already doing the
    wrong thing, since ":(glob)" behavior is already the default (it is
    used only to countermand an earlier --noglob-pathspecs).

    So we _could_ loosen the guard to allow :(glob), since it just
    behaves the same as pathspecs do by default. But it seems like a
    backwards step to do so. It already doesn't work (it hits the BUG()
    case currently), and given that the user took an explicit step to
    say "this pathspec should glob", it is reasonable for us to say "no,
    --follow does not support globbing" (or in the case of log.follow,
    avoid turning on follow mode). Which is what happens after this
    patch.

  - The set of allowed pathspec magic is obviously the same as in
    GUARD_PATHSPEC(). We could perhaps factor these out to avoid
    repetition. The point of having separate masks and GUARD calls is
    that we don't necessarily know which parsed pathspecs will be used
    where. But in this case, the two are heavily correlated. Still,
    there may be some value in keeping them separate; it would make
    anyone think twice about adding new magic to the list in
    diff_check_follow_pathspec(). They'd need to touch
    try_to_follow_renames() as well, which is the code that would
    actually need to be updated to handle more exotic pathspecs.

  - The documentation for log.follow says that it enables --follow
    "...when a single <path> is given". We could possibly expand that to
    say "with no unsupported pathspec magic", but that raises the
    question of documenting which magic is supported. I think the
    existing wording of "single <path>" sufficiently encompasses the
    idea (the forbidden magic is stuff that might match multiple
    entries), and the spirit remains the same.

Reported-by: Jim Pryor <dubiousjim@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 10:34:25 +09:00
9eac5954e8 diff: factor out --follow pathspec check
In --follow mode, we require exactly one pathspec. We check this
condition in two places:

  - in diff_setup_done(), we complain if --follow is used with an
    inapropriate pathspec

  - in git-log's revision "tweak" function, we enable log.follow only if
    the pathspec allows it

The duplication isn't a big deal right now, since the logic is so
simple. But in preparation for it becoming more complex, let's pull it
into a shared function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 10:34:25 +09:00
8e32caaa78 pathspec: factor out magic-to-name function
When we have unsupported magic in a pathspec (because a command or code
path does not support particular items), we list the unsupported ones in
an error message.

Let's factor out the code here that converts the bits back into their
human-readable names, so that it can be used from other callers, which
may want to provide more flexible error messages.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 10:34:25 +09:00
e4cf013468 surround %s with quotes when failed to lookup commit
The output may become confusing to recognize if the user
accidentally gave an extra opening space, like:

   $ git commit --fixup=" 6d6360b67e99c2fd82d64619c971fdede98ee74b"
   fatal: could not lookup commit  6d6360b67e99c2fd82d64619c971fdede98ee74b

and it will be better if we surround the %s specifier with single quotes.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 09:01:10 +09:00
fe86abd751 Git 2.41
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-01 15:28:26 +09:00
ee65a63819 Merge tag 'l10n-2.41.0-2' of https://github.com/git-l10n/git-po
l10n-2.41.0-2

* tag 'l10n-2.41.0-2' of https://github.com/git-l10n/git-po:
  l10n: zh_TW.po: Git 2.41.0
  l10n: sv.po: Update Swedish translation (5515t0f0u)
  l10n: Update Catalan translation
  l10n: Update German translation
  l10n: po-id for 2.41 (round 1)
  l10n: Update Catalan translation
  l10n: tr: Update Turkish translations for 2.41.0
  l10n: fr.po v2.41.0 rnd2
  l10n: fr.po v2.41.0 rnd1
  l10n: fr: fix translation of stash save help
  l10n: zh_CN: Git 2.41.0 round #1
  l10n: bg.po: Updated Bulgarian translation (5515t)
  l10n: update uk localization
  l10n: uk: remove stale lines
  l10n: uk: add initial translation
  l10n: TEAMS: Update pt_PT repo link
2023-06-01 15:27:43 +09:00
f86de088f8 l10n: zh_TW.po: Git 2.41.0
Co-authored-by: Peter Dave Hello <hsu@peterdavehello.org>
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2023-06-01 00:53:09 +08:00
81a797fcdf Merge branch 'add-uk-initial-l10n' of github.com:arkid15r/git-ukrainian-l10n
* 'add-uk-initial-l10n' of github.com:arkid15r/git-ukrainian-l10n:
  l10n: update uk localization
  l10n: uk: remove stale lines
  l10n: uk: add initial translation
2023-05-31 21:11:25 +08:00
308f3f4e9a l10n: sv.po: Update Swedish translation (5515t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2023-05-31 13:16:21 +01:00
aa1991d080 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2023-05-26 20:02:14 +02:00
3867f6d650 repository: move 'repository_format_worktree_config' to repo scope
Move 'repository_format_worktree_config' out of the global scope and into
the 'repository' struct. This change is similar to how
'repository_format_partial_clone' was moved in ebaf3bcf1a (repository: move
global r_f_p_c to repo struct, 2021-06-17), adding it to the 'repository'
struct and updating 'setup.c' & 'repository.c' functions to assign the value
appropriately.

The primary goal of this change is to be able to load the worktree config of
a submodule depending on whether that submodule - not its superproject - has
'extensions.worktreeConfig' enabled. To ensure 'do_git_config_sequence()'
has access to the newly repo-scoped configuration, add a 'struct repository'
argument to 'do_git_config_sequence()' and pass it the 'repo' value from
'config_with_options()'.

Finally, add/update tests in 't3007-ls-files-recurse-submodules.sh' to
verify 'extensions.worktreeConfig' is read an used independently by
superprojects and submodules.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:53:41 +09:00
9b6b06c159 config: pass 'repo' directly to 'config_with_options()'
Add a 'struct repository' argument to 'config_with_options()' and remove the
'repo' field from 'struct git_config_source'.

A 'struct repository' instance was originally added to the config source in
e3e8bf046e (submodule-config: pass repo upon blob config read, 2021-08-16)
to improve how submodule blob config content was accessed. At the time, this
was the only use for a 'repository' instance, so it was naturally added only
where it was needed: to 'struct git_config_source'. However, in upcoming
patches, 'config_with_options()' will need the repository instance to access
extension information (regardless of whether a 'config_source' exists). To
make the 'struct repository' instance more easily accessible, move it into
the function's arguments.

Update all callers of 'config_with_options()' to pass the appropriate 'repo'
value:

* in 'builtin/config.c', use 'the_repository'
* in 'submodule--config.c', use the 'repo' arg in 'config_from_gitmodules()'
* in 'read_[very_]early_config()' & 'read_protected_config()', set 'repo' to
  NULL (repository instances aren't available there)
* in 'populate_remote_urls()', use the repo instance that has been added to
  the 'struct config_include_data'
* in 'repo_read_config()', use the given 'repo' arg

Finally, note that this patch eliminates the fallback to 'the_repository'
that previously existed for the 'config_source' repo instance if it was
NULL. The fallback is no longer necessary, as the 'repo' is set explicitly
in all cases where it is needed.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:53:40 +09:00
847d0027d2 config: use gitdir to get worktree config
Update 'do_git_config_sequence()' to read the worktree config from
'config.worktree' in 'opts->git_dir' rather than the gitdir of
'the_repository'.

The worktree config is loaded from the path returned by
'git_pathdup("config.worktree")', the 'config.worktree' relative to the
gitdir of 'the_repository'. If loading the config for a submodule, this path
is incorrect, since 'the_repository' is the superproject. 'opts->git_dir' is
the gitdir of the submodule being configured, so the config file in that
location should be read instead.

To ensure the use of 'opts->git_dir' is safe, require that 'opts->git_dir'
is set if-and-only-if 'opts->commondir' is set (rather than "only-if" as it
is now). In all current usage of 'config_options', these values are set
together, so the stricter check does not change any behavior.

Finally, add tests to 't3007-ls-files-recurse-submodules.sh' to verify the
corrected config is loaded. Use 'ls-files' to test this because, unlike some
other '--recurse-submodules' commands, 'ls-files' parses the config of the
submodule in the same process as the superproject (via 'show_submodule()' ->
'repo_read_index()' -> 'prepare_repo_settings()'). As a result,
'the_repository' points to the config of the superproject but the
commondir/gitdir in the config sequence will be that of the submodule,
providing the exact scenario needed to verify this patch.

The first test ('--recurse-submodules parses submodule repo config') checks
that the submodule's *repo* config is read when running 'ls-files' on the
superproject; this confirms already-working behavior, serving as a reference
for how worktree config parsing should behave. The second test
('--recurse-submodules parses submodule worktree config') tests the same
scenario as the previous but instead using the *worktree* config,
demonstrating the corrected behavior. The 'test_config' helper is extended
for this case so that it properly applies the '--worktree' option to the
configure/unconfigure operations it performs.

Note that, although the submodule worktree config is now parsed instead of
the superproject's, 'extensions.worktreeConfig' in the superproject still
controls whether or not the worktree config is enabled at all in the
submodule. This will be fixed in a later patch.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:53:40 +09:00
20025fdfc7 t/lib-gpg: fix ssh-keygen -Y check-novalidate with openssh-9.0
OpenSSH-9.0 requires a namespace option with `-Y check-novalidate`.
This was added in openssh-portable commit a0b5816f8 (upstream:
ssh-keygen -Y check-novalidate requires namespace or SEGV, 2022-03-18).

The -n option was documented as a required option since check-novalidate
was added in openssh-portable 8aa2aa3cd (upstream: Allow testing
signature syntax and validity without verifying, 2019-09-16).

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:37:08 +09:00
e48a21df65 trace2 tests: fix PTHREADS prereq
The prereq guard added in 14903c8e92 (trace2 tests: guard pthread test
with "PTHREAD", 2022-11-24) lacks the S in PTHREADS, causing it to never
be satisfied.  Fix the spelling of the prereq.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:31:15 +09:00
97e736c4cc Merge branch 'l10n-de-2.41' of github.com:ralfth/git
* 'l10n-de-2.41' of github.com:ralfth/git:
  l10n: Update German translation
2023-05-25 14:06:01 +08:00
2f88193dac Merge branch 'catalan' of github.com:Softcatala/git-po
* 'catalan' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2023-05-25 14:05:21 +08:00
4d34454e2c Merge branch 'tr' of github.com:bitigchi/git-po
* 'tr' of github.com:bitigchi/git-po:
  l10n: tr: Update Turkish translations for 2.41.0
2023-05-25 14:04:42 +08:00
08af8c4c48 Merge branch 'main' of github.com:alshopov/git-po
* 'main' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5515t)
2023-05-25 14:03:58 +08:00
10553acb0e Merge branch 'fr_2.41.0_rnd1' of github.com:jnavila/git
* 'fr_2.41.0_rnd1' of github.com:jnavila/git:
  l10n: fr.po v2.41.0 rnd2
  l10n: fr.po v2.41.0 rnd1
  l10n: fr: fix translation of stash save help
2023-05-25 14:03:22 +08:00
07d6730b39 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.41 (round 1)
2023-05-25 14:01:59 +08:00
f9bb784ab9 Merge branch 'tl/zh_CN_2.41.0_rnd1' of github.com:dyrone/git
* 'tl/zh_CN_2.41.0_rnd1' of github.com:dyrone/git:
  l10n: zh_CN: Git 2.41.0 round #1
2023-05-25 13:58:10 +08:00
79bdd48716 Git 2.41-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-25 05:55:19 +09:00
6a6621fe9a Merge branch 'sl/sparse-write-tree-part-2'
Fix-up to a topic already graduated to 'master'.

* sl/sparse-write-tree-part-2:
  t1092: update a write-tree test
2023-05-25 05:53:55 +09:00
fbc806acd1 builtin/submodule--helper.c: handle missing submodule URLs
In e0a862fdaf (submodule helper: convert relative URL to absolute URL if
needed, 2018-10-16), `prepare_to_clone_next_submodule()` lost the
ability to handle URL-less submodules, due to a change from:

    if (repo_get_config_string_const(the_repostiory, sb.buf, &url))
        url = sub->url;

to

    if (repo_get_config_string_const(the_repostiory, sb.buf, &url)) {
        if (starts_with_dot_slash(sub->url) ||
            starts_with_dot_dot_slash(sub->url)) {
                /* ... */
            }
    }

, which will segfault when `sub->url` is NULL, since both
`starts_with_dot_slash()` does not guard its arguments as non-NULL.

Guard the checks to both of the above functions by first checking
whether `sub->url` is non-NULL. There is no need to check whether `sub`
itself is NULL, since we already perform this check earlier in
`prepare_to_clone_next_submodule()`.

By adding a NULL-ness check on `sub->url`, we'll fall into the 'else'
branch, setting `url` to `sub->url` (which is NULL). Before attempting
to invoke `git submodule--helper clone`, check whether `url` is NULL,
and die() if it is.

Reported-by: Tribo Dar <3bodar@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-25 05:26:59 +09:00
4d28c4f75f ls-files: align format atoms with ls-tree
"git ls-files --format" can be used to format the output of
multiple file entries in the index, while "git ls-tree --format"
can be used to format the contents of a tree object. However,
the current set of %(objecttype), "(objectsize)", and
"%(objectsize:padded)" atoms supported by "git ls-files --format"
is a subset of what is available in "git ls-tree --format".

Users sometimes need to establish a unified view between the index
and tree, which can help with comparison or conversion between the two.

Therefore, this patch adds the missing atoms to "git ls-files --format".
"%(objecttype)" can be used to retrieve the object type corresponding
to a file in the index, "(objectsize)" can be used to retrieve the
object size corresponding to a file in the index, and "%(objectsize:padded)"
is the same as "(objectsize)", except with padded format.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 20:12:57 +09:00
982ff3a649 completion: complete AUTO_MERGE
The pseudoref AUTO_MERGE is documented since the previous commit. To
make it easier to use, let __git_refs in the Bash completion code
complete it.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 17:21:47 +09:00
4fa1edb988 Documentation: document AUTO_MERGE
Since 5291828df8 (merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a
conflict, 2021-03-20), when using the 'ort' merge strategy, the special
ref AUTO_MERGE is written when a merge operation results in conflicts.
This ref points to a tree recording the conflicted state of the working
tree and is very useful during conflict resolution. However, this ref is
not documented.

Add some documentation for AUTO_MERGE in git-diff(1), git-merge(1),
gitrevisions(7) and in the user manual.

In git-diff(1), mention it at the end of the description section, when
we mention that the command also accepts trees instead of commits, and
also add an invocation to the "Various ways to check your working tree"
example.

In git-merge(1), add a step to the list of things that happen "when it
is not obvious how to reconcile the changes", under the "True merge"
section. Also mention AUTO_MERGE in the "How to resolve conflicts"
section, when mentioning 'git diff'.

In gitrevisions(7), add a mention of AUTO_MERGE along with the other
special refs.

In the user manual, add a paragraph describing AUTO_MERGE to the
"Getting conflict-resolution help during a merge" section, and include
an example of a 'git diff AUTO_MERGE' invocation for the example
conflict used in that section. Note that for uniformity we do not use
backticks around AUTO_MERGE here since the rest of the document does not
typeset special refs differently.

Closes: https://github.com/gitgitgadget/git/issues/1471
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 17:21:47 +09:00
b7dd54a2c7 git-merge.txt: modernize word choice in "True merge" section
The "True merge" section of the 'git merge' documentation mentions that
in case of conflicts, the conflicted working tree files contain "the
result of the "merge" program". This probably refers to RCS's 'merge'
program, which is mentioned further down under "How conflicts are
presented".

Since it is not clear at that point of the document which program is
referred to, and since most modern readers probably do not relate to RCS
anyway, let's just write "the merge operation" instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 17:21:47 +09:00
1ef3c61b78 completion: complete REVERT_HEAD and BISECT_HEAD
The pseudorefs REVERT_HEAD and BISECT_HEAD are not suggested
by the __git_refs function. Add them there.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 17:21:47 +09:00
6ec5f46071 revisions.txt: document more special refs
Some special refs, namely HEAD, FETCH_HEAD, ORIG_HEAD, MERGE_HEAD and
CHERRY_PICK_HEAD, are mentioned and described in 'gitrevisions', but some
others, namely REBASE_HEAD, REVERT_HEAD, and BISECT_HEAD, are not.

Add a small description of these special refs.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 17:21:46 +09:00
bc11bac329 revisions.txt: use description list for special refs
The special refs listed in 'gitrevisions' (under the '<refname>' entry)
are on separate lines in the Asciidoc source, but end up as a single
continuous paragraph in the rendered documentation (see e.g. [1]). In
following commits we will mention additional special refs, so to improve
legibility, use a description list such that every entry appears on its
own line. Since we are already in a description list, use ':::' as the
term delimiter.

In order for the new description list to be aligned with the description
under the '<refname>' entry, instead of being aligned with the last
entry of the "in the following rules" nested list, use the "ancestor
list continuation" syntax [2], i.e., leave an empty line before the
continuation '+'. Do the same for the paragraph following the new
description list ("Note that any...").

While at it, also use a continuation '+' before the "in the following
rules" list, for correctness. The parser seems not to care here, but
it's best to keep the sources correct.

[1] https://git-scm.com/docs/gitrevisions#Documentation/gitrevisions.txt-emltrefnamegtemegemmasterememheadsmasterememrefsheadsmasterem
[2] https://docs.asciidoctor.org/asciidoc/latest/lists/continuation/#ancestor-list-continuation

Suggested-by: Victoria Dye <vdye@github.com>
Based-on-patch-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 17:21:46 +09:00
447a3b7331 t9400-git-cvsserver-server: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:29 +09:00
7dac6347c5 t9200-git-cvsexportcommit: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:29 +09:00
7d7097bf59 t9104-git-svn-follow-parent: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:29 +09:00
be1fce6dae t9100-git-svn-basic: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:29 +09:00
3a3b98be91 t7700-repack: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:29 +09:00
bd48dfad69 t7600-merge: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:29 +09:00
a6171e1478 t7508-status: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:28 +09:00
10dae78533 t7201-co: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:28 +09:00
c970681f50 t7111-reset-table: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:28 +09:00
a32a724b03 t7110-reset-merge: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 12:54:28 +09:00
5eaa027972 l10n: Update German translation
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Reviewed-by: Matthias Rüster <matthias.ruester@gmail.com>
2023-05-22 17:17:49 +02:00
5aab7179a2 l10n: po-id for 2.41 (round 1)
Update following components:

  * advice.c
  * archive.c
  * attr.c
  * config.c
  * pack-revindex.c
  * builtin/branch.c
  * builtin/bundle.c
  * builtin/pack-redundant.c
  * builtin/rebase.c
  * builtin/sparse-checkout.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2023-05-22 15:25:32 +07:00
0c0ffcd2b8 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2023-05-20 14:09:46 +02:00
6f20bdbffe l10n: tr: Update Turkish translations for 2.41.0
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2023-05-20 13:58:15 +03:00
82e70690d4 l10n: fr.po v2.41.0 rnd2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2023-05-20 12:43:10 +02:00
5076d955f3 l10n: fr.po v2.41.0 rnd1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2023-05-20 12:23:19 +02:00
460ba0869d l10n: fr: fix translation of stash save help
Signed-off-by: Benjamin Jorand <benjamin.jorand@doctolib.com>
2023-05-20 12:23:19 +02:00
407b144f35 l10n: zh_CN: Git 2.41.0 round #1
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Reviewed-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Reviewed-by: 依云 <lilydjwg@gmail.com>
Reviewed-by: pan93412 <pan93412@gmail.com>
Reviewed0by: Fangyi Zhou <me@fangyi.io>
2023-05-20 18:06:20 +08:00
68a86d028b Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git:
  A few more topics after 2.41-rc1
  Git 2.41-rc1
  t/lib-httpd: make CGIPassAuth support conditional
  t9001: mark the script as no longer leak checker clean
  send-email: clear the $message_id after validation
  upload-pack: advertise capabilities when cloning empty repos
  A bit more before -rc1
  imap-send: include strbuf.h
  run-command.c: fix missing include under `NO_PTHREADS`
  test: do not negate test_path_is_* to assert absense
  t2021: do not negate test_path_is_dir
  tests: do not negate test_path_exists
  doc/git-config: add unit for http.lowSpeedLimit
  rebase -r: fix the total number shown in the progress
  rebase --update-refs: fix loops
  attr: teach "--attr-source=<tree>" global option to "git"
2023-05-20 08:44:08 +08:00
9e49351c30 A few more topics after 2.41-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-20 05:35:57 +09:00
cacc15ee3f Merge branch 'js/rebase-count-fixes'
A few bugs in the sequencer machinery that results in miscounting
the steps have been corrected.

* js/rebase-count-fixes:
  rebase -r: fix the total number shown in the progress
  rebase --update-refs: fix loops
2023-05-20 05:35:57 +09:00
dc3fd2486f Merge branch 'jc/do-not-negate-test-helpers'
Small fixes.

* jc/do-not-negate-test-helpers:
  test: do not negate test_path_is_* to assert absense
  t2021: do not negate test_path_is_dir
  tests: do not negate test_path_exists
2023-05-20 05:35:56 +09:00
1f141d6cb2 Merge branch 'cg/doc-http-lowspeed-limit'
Doc update.

* cg/doc-http-lowspeed-limit:
  doc/git-config: add unit for http.lowSpeedLimit
2023-05-20 05:35:56 +09:00
6d438bf3e4 l10n: bg.po: Updated Bulgarian translation (5515t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2023-05-19 19:58:56 +02:00
3b8724bce6 t7101-reset-empty-subdirs: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:12 -07:00
32942346aa t6050-replace: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:12 -07:00
a45bb750db t5306-pack-nobase: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:12 -07:00
aac864059f t5303-pack-corruption-resilience: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:12 -07:00
cc0c1ad9ad t5301-sliding-window: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:12 -07:00
e478a52087 t5300-pack-object: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
1afebc92ef t4206-log-follow-harder-copies: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
3da9be913a t4202-log: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
93fc423e9a t4004-diff-rename-symlink: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
9297229d15 t4003-diff-rename-1: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
9cfcbcc095 t4002-diff-basic: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
a8fcc0ac89 t3903-stash: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
0aa0266c4b t3700-add: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
0a6cb5c42f t3500-cherry: modernize test format
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
350c484239 t1006-cat-file: modernize test format
Some tests in t1006-cat-file.sh used the older four space indent format.
Update these to use tabs.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
a10bb2ded5 t1002-read-tree-m-u-2way: modernize test format
Some tests are still using the older four space indent format. Update
these to use tabs.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
58db6e450b t1001-read-tree-m-2way: modernize test format
Some tests are still using the older four space indent format. Update
these to use tabs.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:11 -07:00
3c2f5d26c0 t3210-pack-refs: modernize test format
Some tests in t3210-pack-refs.sh used the older four space indent format.
Update these to use tabs.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:10 -07:00
27990663f0 t0030-stripspace: modernize test format
Some tests in t0030-stripspace.sh used the older four space indent
format. Update these to use tabs.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:10 -07:00
2f68c99a3b t0000-basic: modernize test format
Some tests in t0000-basic.sh used the older four space indent format.
Update these to use tabs.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 10:08:10 -07:00
c6d26a9dda format-patch: free elements of rev.ref_message_ids list
When we are showing multiple patches with format-patch, we have to
repeatedly overwrite the rev.message_id field. We take care to avoid
leaking the old value by either freeing it, or adding it to
ref_message_ids, a string list of ids to reference in subsequent
messages.

But unfortunately we do leak the value via that string list. We try
to clear the string list, courtesy of 89f45cf4eb (format-patch: don't
leak "extra_headers" or "ref_message_ids", 2022-04-13). But since it was
initialized as "nodup", the string list doesn't realize it owns the
strings, and it leaks them.

We have two options here:

  1. Continue to init with "nodup", but then tweak the value of
     ref_message_ids.strdup_strings just before clearing.

  2. Init with "dup", but use "append_nodup" when transferring ownership
     of strings to the list. Clearing just works.

I picked the second here, as I think it calls attention to the tricky
part (transferring ownership via the nodup call).

There's one other related fix we have to do, though. We also insert the
result of clean_message_id() into the list. This _sometimes_ allocates
and sometimes does not, depending on whether we have to remove cruft
from the end of the string. Let's teach it to consistently return an
allocated string, so that the caller knows it must be freed.

There's no new test here, as the leak can already be seen in t4014.44 (as
well as others in that script). We can't mark all of t4014 as leak-free,
though, as there are other unrelated leaks that it triggers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 09:42:26 -07:00
4a714b3702 Git 2.41-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 09:27:07 -07:00
646ca89558 Merge branch 'jk/http-test-cgipassauth-unavailable-in-older-apache'
We started unconditionally testing with CGIPassAuth directive but
it is unavailable in older Apache that ships with CentOS 7 that has
about a year of shelf-life still left.  The test has conditionally
been disabled when running with an ancient Apache.  This was a fix
for a recent regression caught before the release, so no need to
mention it in the release notes.

* jk/http-test-cgipassauth-unavailable-in-older-apache:
  t/lib-httpd: make CGIPassAuth support conditional
2023-05-19 09:27:07 -07:00
633390bd08 Merge branch 'bc/clone-empty-repo-via-protocol-v0'
The server side of "git clone" now advertises the necessary hints
to clients to help them to clone from an empty repository and learn
object hash algorithm and the (unborn) branch pointed at by HEAD,
even over the older v0/v1 protocol.

* bc/clone-empty-repo-via-protocol-v0:
  upload-pack: advertise capabilities when cloning empty repos
2023-05-19 09:27:06 -07:00
b04671b638 Merge branch 'jc/send-email-pre-process-fix'
When "git send-email" that uses the validate hook is fed a message
without and then with Message-ID, it failed to auto-assign a unique
Message-ID to the former and instead reused the Message-ID from the
latter, which has been corrected.  This was a fix for a recent
regression caught before the release, so no need to mention it in
the release notes.

* jc/send-email-pre-process-fix:
  t9001: mark the script as no longer leak checker clean
  send-email: clear the $message_id after validation
2023-05-19 09:27:06 -07:00
75ab1fa5ab Merge branch 'tb/run-command-needs-alloc-h'
Fix the build problem with NO_PTHREADS defined, a fallout from
recent header file shuffling.

* tb/run-command-needs-alloc-h:
  run-command.c: fix missing include under `NO_PTHREADS`
2023-05-19 09:27:06 -07:00
51f9d2e563 ls-remote doc: document the output format
While well-established, the output format of ls-remote was not actually
documented. This patch adds an OUTPUT section to the documentation
following the format of git-show-ref.txt (which has similar semantics).

Add a basic example immediately after this to solidify the 'normal'
output format.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 08:19:34 -07:00
a5b076321a ls-remote doc: explain what each example does
While it's good to have several examples to solidify the output pattern
and generally demonstrate how to use the command, most other EXAMPLES
sections (e.g., git-show-branch.txt, git-remote.txt) additionally
describe the problem/situation to which the example is applicable.

Follow this example in the ls-remote documentation.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 08:19:34 -07:00
e959fa452f ls-remote doc: show peeled tags in examples
Without `--refs`, this command will show peeled tags. Make this clearer
in the examples to further mitigate the possibility of surprises in
consuming scripts.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 08:19:34 -07:00
21c9bac2c7 ls-remote doc: remove redundant --tags example
The --tags option is already demonstrated in the later example that
lists version-patterned tags. As it doesn't appear to add anything to
the documentation, it ought to be removed to keep the documentation
easier to read.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 08:19:34 -07:00
0f45b5bc32 show-branch doc: say <ref>, not <reference>
The glossary defines 'ref' as the official name of the thing,
and the output from "git grep -e '<ref' Documentation/" shows
that most everybody uses <ref>, not <reference>.  In addition,
the page already says <ref> in its SYNOPSIS section for the
command when it is used in the mode to follow the reflogs.

Strictly speaking, many references of these should be updated to
<commit> after adding an explanation on how these <commit>s are
discovered (i.e. we take <rev>, <glob>, or <ref> and starting from
these commits, follow their ancestry or reflog entries to list
commits), but that would be a lot bigger change I would rather not
to do in this patch, whose primary purpose is to make the existing
documentation more consistent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 08:19:34 -07:00
00bf685975 show-ref doc: update for internal consistency
- Use inline-code syntax for options where appropriate.
- Use code blocks to clarify output format.
- Use 'OID' (for 'object ID') instead of 'SHA-1' as we support
  different hashing algorithms these days.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 08:19:34 -07:00
cfa120947e format-patch: free rev.message_id when exiting
We may allocate a message-id string via gen_message_id(), but we never
free it, causing a small leak. This can be demonstrated by running t9001
with a leak-checking build. The offending test is the one touched by
3ece9bf0f9 (send-email: clear the $message_id after validation,
2023-05-17), but the leak is much older than that. The test was simply
unlucky enough to trigger the leaking code path for the first time.

We can fix this by freeing the string at the end of the function. We can
also re-mark the test script as leak-free, effectively reverting
20bd08aefb (t9001: mark the script as no longer leak checker clean,
2023-05-17).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-18 18:33:04 -07:00
eb1c42da8e t/lib-httpd: make CGIPassAuth support conditional
Commit 988aad99b4 (t5563: add tests for basic and anoymous HTTP access,
2023-02-27) added tests that require Apache to support the CGIPassAuth
directive, which was added in Apache 2.4.13. This is fairly old (~8
years), but recent enough that we still encounter it in the wild (e.g.,
RHEL/CentOS 7, which is not EOL until June 2024).

We can live with skipping the new tests on such a platform. But
unfortunately, since the directive is used unconditionally in our
apache.conf, it means the web server fails to start entirely, and we
cannot run other HTTP tests at all (e.g., the basic ones in t5551).

We can fix that by making the config conditional, and only triggering it
for t5563. That solves the problem for t5551 (which then ignores the
directive entirely). For t5563, we'd see apache complain in start_httpd;
with the default setting of GIT_TEST_HTTPD, we'd then skip the whole
script.

But that leaves one small problem: people may set GIT_TEST_HTTPD=1
explicitly, which instructs the tests to fail (rather than skip) when we
can't start the webserver (to avoid accidentally missing some tests).

This could be worked around by having the user manually set
GIT_SKIP_TESTS on a platform with an older Apache. But we can be a bit
friendlier by doing the version check ourselves and setting an
appropriate prereq. We'll use the (lack of) prereq to then skip the rest
of t5563. In theory we could use the prereq to skip individual tests, but
in practice this whole script depends on it.

Reported-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-18 14:29:32 -07:00
48c5fbfb89 diff-tree: integrate with sparse index
The index is read in 'cmd_diff_tree' at two points:

1. The first index read was added in fd66bcc31f (diff-tree: read the
index so attribute checks work in bare repositories, 2017-12-06) to deal
with reading '.gitattributes' content. 77efbb366a (attr: be careful
about sparse directories, 2021-09-08) established that, in a sparse
index, we do _not_ try to load a '.gitattributes' file from within a
sparse directory.

2. The second index access point is involved in rename detection,
specifically when reading from stdin.This was initially added in
f0c6b2a2fd ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where
'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE |DIFF_SETUP_USE_CACHE'.
That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in
ff7fe37b05 (diff.c: move read_index() code back to the caller,
2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as
of 6e0b8ed6d3 (diff.c: do not use a separate "size cache"., 2007-05-07)
and nothing about 'detect_rename' otherwise indicates index usage.

Hence we can just set the requires-full-index to false for "diff-tree".

Add tests that verify that 'git diff-tree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.

The `p2000` tests demonstrate a ~98% execution time reduction for
'git diff-tree' using a sparse index:

Test                                                before  after
-----------------------------------------------------------------------
2000.94: git diff-tree HEAD (full-v3)                0.05   0.04 -20.0%
2000.95: git diff-tree HEAD (full-v4)                0.06   0.05 -16.7%
2000.96: git diff-tree HEAD (sparse-v3)              0.59   0.01 -98.3%
2000.97: git diff-tree HEAD (sparse-v4)              0.61   0.01 -98.4%
2000.98: git diff-tree HEAD -- f2/f4/a (full-v3)     0.05   0.05 +0.0%
2000.99: git diff-tree HEAD -- f2/f4/a (full-v4)     0.05   0.04 -20.0%
2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3)  0.58   0.01 -98.3%
2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4)  0.55   0.01 -98.2%

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-18 10:40:33 -07:00
20bd08aefb t9001: mark the script as no longer leak checker clean
The test uses "format-patch --thread" which is known to leak the
generated message ID list.

Plugging these leaks involves straightening out the memory ownership
rules around rev_info.message_id and rev_info.ref_message_ids, and
is beyond the scope of send-email fix, so for now mark the test as
leaky to unblock the topic before the release.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 16:47:36 -07:00
926c40d04b worktree add: emit warn when there is a bad HEAD
Add a warning to `worktree add` when the command tries to reference
HEAD, there exist valid local branches, and the HEAD points to a
non-existent reference.

Current Behavior:
% git -C foo worktree list
/path/to/repo/foo     dadc8e6dac [main]
/path/to/repo/foo_wt  0000000000 [badref]
% git -C foo worktree add ../wt1
Preparing worktree (new branch 'wt1')
HEAD is now at dadc8e6dac dummy commit
% git -C foo_wt worktree add ../wt2
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%

New Behavior:
% git -C foo worktree list
/path/to/repo/foo     dadc8e6dac [main]
/path/to/repo/foo_wt  0000000000 [badref]
% git -C foo worktree add ../wt1
Preparing worktree (new branch 'wt1')
HEAD is now at dadc8e6dac dummy commit
% git -C foo_wt worktree add ../wt2
warning: HEAD points to an invalid (or orphaned) reference.
HEAD path: '/path/to/repo/foo/.git/worktrees/foo_wt/HEAD'
HEAD contents: 'ref: refs/heads/badref'
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:25 -07:00
128e5496b3 worktree add: extend DWIM to infer --orphan
Extend DWIM to try to infer `--orphan` when in an empty repository. i.e.
a repository with an invalid/unborn HEAD, no local branches, and if
`--guess-remote` is used then no remote branches.

This behavior is equivalent to `git switch -c` or `git checkout -b` in
an empty repository.

Also warn the user (overriden with `-f`/`--force`) when they likely
intend to checkout a remote branch to the worktree but have not yet
fetched from the remote. i.e. when using `--guess-remote` and there is a
remote but no local or remote refs.

Current Behavior:
% git --no-pager branch --list --remotes
% git remote
origin
% git workree add ../main
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git workree add --guess-remote ../main
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git fetch --quiet
% git --no-pager branch --list --remotes
origin/HEAD -> origin/main
origin/main
% git workree add --guess-remote ../main
Preparing worktree (new branch 'main')
branch 'main' set up to track 'origin/main'.
HEAD is now at dadc8e6dac commit message
%

New Behavior:
% git --no-pager branch --list --remotes
% git remote
origin
% git workree add ../main
No possible source branch, inferring '--orphan'
Preparing worktree (new branch 'main')
% git worktree remove ../main
% git workree add --guess-remote ../main
fatal: No local or remote refs exist despite at least one remote
present, stopping; use 'add -f' to overide or fetch a remote first
% git workree add --guess-remote -f ../main
No possible source branch, inferring '--orphan'
Preparing worktree (new branch 'main')
% git worktree remove ../main
% git fetch --quiet
% git --no-pager branch --list --remotes
origin/HEAD -> origin/main
origin/main
% git workree add --guess-remote ../main
Preparing worktree (new branch 'main')
branch 'main' set up to track 'origin/main'.
HEAD is now at dadc8e6dac commit message
%

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:25 -07:00
35f0383ca6 worktree add: introduce "try --orphan" hint
Add a new advice/hint in `git worktree add` for when the user
tries to create a new worktree from a reference that doesn't exist.

Current Behavior:

% git init foo
Initialized empty Git repository in /path/to/foo/
% touch file
% git -C foo commit -q -a -m "test commit"
% git -C foo switch --orphan norefbranch
% git -C foo worktree add newbranch/
Preparing worktree (new branch 'newbranch')
fatal: invalid reference: HEAD
%

New Behavior:

% git init --bare foo
Initialized empty Git repository in /path/to/foo/
% touch file
% git -C foo commit -q -a -m "test commit"
% git -C foo switch --orphan norefbranch
% git -C foo worktree add newbranch/
Preparing worktree (new branch 'newbranch')
hint: If you meant to create a worktree containing a new orphan branch
hint: (branch with no commits) for this repository, you can do so
hint: using the --orphan option:
hint:
hint:   git worktree add --orphan newbranch/
hint:
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git -C foo worktree add -b newbranch2 new_wt/
Preparing worktree (new branch 'newbranch')
hint: If you meant to create a worktree containing a new orphan branch
hint: (branch with no commits) for this repository, you can do so
hint: using the --orphan option:
hint:
hint:   git worktree add --orphan -b newbranch2 new_wt/
hint:
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
7ab8918985 worktree add: add --orphan flag
Add support for creating an orphan branch when adding a new worktree.
The functionality of this flag is equivalent to git switch's --orphan
option.

Current Behavior:
% git -C foo.git --no-pager branch -l
+ main
% git -C foo.git worktree add main/
Preparing worktree (new branch 'main')
HEAD is now at 6c93a75 a commit
%

% git init bar.git
Initialized empty Git repository in /path/to/bar.git/
% git -C bar.git --no-pager branch -l

% git -C bar.git worktree add main/
Preparing worktree (new branch 'main')
fatal: not a valid object name: 'HEAD'
%

New Behavior:

% git -C foo.git --no-pager branch -l
+ main
% git -C foo.git worktree add main/
Preparing worktree (new branch 'main')
HEAD is now at 6c93a75 a commit
%

% git init --bare bar.git
Initialized empty Git repository in /path/to/bar.git/
% git -C bar.git --no-pager branch -l

% git -C bar.git worktree add main/
Preparing worktree (new branch 'main')
fatal: invalid reference: HEAD
% git -C bar.git worktree add --orphan -b main/
Preparing worktree (new branch 'main')
% git -C bar.git worktree add --orphan -b newbranch worktreedir/
Preparing worktree (new branch 'newbranch')
%

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
9ccdace1e8 t2400: add tests to verify --quiet
Add tests to verify that the command performs operations the same with
`--quiet` as without it. Additionally verifies that all non-fatal output
is suppressed.

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
ed6db0e9ff t2400: refactor "worktree add" opt exclusion tests
Pull duplicate test code into a function so that additional opt
combinations can be tested succinctly.

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
1b28fbd218 t2400: cleanup created worktree in test
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
b71f919dda worktree add: include -B in usage docs
Document `-B` next to where `-b` is already documented to bring the
usage docs in line with other commands such as git checkout.

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
85ec240849 l10n: update uk localization
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2023-05-17 14:51:29 -07:00
3ece9bf0f9 send-email: clear the $message_id after validation
Recently git-send-email started parsing the same message twice, once
to validate _all_ the message before sending even the first one, and
then after the validation hook is happy and each message gets sent,
to read the contents to find out where to send to etc.

Unfortunately, the effect of reading the messages for validation
lingered even after the validation is done.  Namely $message_id gets
assigned if exists in the input files but the variable is global,
and it is not cleared before pre_process_file runs.  This causes
reading a message without a message-id followed by reading a message
with a message-id to misbehave---the sub reports as if the message
had the same id as the previously written one.

Clear the variable before starting to read the headers in
pre_process_file.

Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 14:11:38 -07:00
933e3a4ee2 upload-pack: advertise capabilities when cloning empty repos
When cloning an empty repository, protocol versions 0 and 1 currently
offer nothing but the header and flush packets for the /info/refs
endpoint. This means that no capabilities are provided, so the client
side doesn't know what capabilities are present.

However, this does pose a problem when working with SHA-256
repositories, since we use the capabilities to know the remote side's
object format (hash algorithm).  As of 8b214c2e9d ("clone: propagate
object-format when cloning from void", 2023-04-05), this has been fixed
for protocol v2, since there we always read the hash algorithm from the
remote.

Fortunately, the push version of the protocol already indicates a clue
for how to solve this.  When the /info/refs endpoint is accessed for a
push and the remote is empty, we include a dummy "capabilities^{}" ref
pointing to the all-zeros object ID.  The protocol documentation already
indicates this should _always_ be sent, even for fetches and clones, so
let's just do that, which means we'll properly announce the hash
algorithm as part of the capabilities.  This just works with the
existing code because we share the same ref code for fetches and clones,
and libgit2, JGit, and dulwich do as well.

There is one minor issue to fix, though.  If we called send_ref with
namespaces, we would return NULL with the capabilities entry, which
would cause a crash.  Instead, let's refactor out a function to print
just the ref itself without stripping the namespace and use it for our
special capabilities entry.

Add several sets of tests for HTTP as well as for local clones.  The
behavior can be slightly different for HTTP versus a local or SSH clone
because of the stateless-rpc functionality, so it's worth testing both.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 13:22:46 -07:00
004e0f790f A bit more before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 10:13:09 -07:00
67a3b2b39f Merge branch 'jc/attr-source-tree'
"git --attr-source=<tree> cmd $args" is a new way to have any
command to read attributes not from the working tree but from the
given tree object.

* jc/attr-source-tree:
  attr: teach "--attr-source=<tree>" global option to "git"
2023-05-17 10:11:41 -07:00
f7e063f326 fetch: use fetch_config to store "submodule.fetchJobs" value
Move the parsed "submodule.fetchJobs" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
ac197cc094 fetch: use fetch_config to store "fetch.parallel" value
Move the parsed "fetch.parallel" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
56e8bb4fb4 fetch: use fetch_config to store "fetch.recurseSubmodules" value
Move the parsed "fetch.recurseSubmodules" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
ba28b2ca5d fetch: use fetch_config to store "fetch.showForcedUpdates" value
Move the parsed "fetch.showForcedUpdaets" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
2b472cfeac fetch: use fetch_config to store "fetch.pruneTags" value
Move the parsed "fetch.pruneTags" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
b779a25e05 fetch: use fetch_config to store "fetch.prune" value
Move the parsed "fetch.prune" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
d1adf85b0a fetch: pass through fetch_config directly
The `fetch_config` structure currently only has a single member, which
is the `display_format`. We're about extend it to contain all parsed
config values and will thus need it available in most of the code.

Prepare for this change by passing through the `fetch_config` directly
instead of only passing its single member.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
6bc7a37e79 fetch: drop unneeded NULL-check for remote_ref
Drop the `NULL` check for `remote_ref` in `update_local_ref()`. The
function only has a single caller, and that caller always passes in a
non-`NULL` value.

This fixes a false positive in Coverity.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
a40449bcd4 fetch: drop unused DISPLAY_FORMAT_UNKNOWN enum value
With 50957937f9 (fetch: introduce `display_format` enum, 2023-05-10), a
new enumeration was introduced to determine the display format that is
to be used by git-fetch(1). The `DISPLAY_FORMAT_UNKNOWN` value isn't
ever used though, and neither do we rely on the explicit `0` value for
initialization anywhere.

Remove the enum value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
3307f7dde2 imap-send: include strbuf.h
We make liberal use of the strbuf API functions and types, but the
inclusion of <strbuf.h> comes indirectly by including <http.h>,
which does not happen if you build with NO_CURL.

Signed-off-by: Christian Hesse <mail@eworm.de>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:54:07 -07:00
52c0f3318d run-command.c: fix missing include under NO_PTHREADS
Git 2.41-rc0 fails to compile run-command.c with `NO_PTHREADS` defined,
i.e.

  $ make NO_PTHREADS=1 run-command.o

as `ALLOC_GROW()` macro is used in the `atexit()` emulation but the
macro definition is not available.

This bisects to 36bf195890 (alloc.h: move ALLOC_GROW() functions from
cache.h, 2023-02-24), which replaced includes of <cache.h> with
<alloc.h>, which is the new home of `ALLOC_GROW()` (and
`ALLOC_GROW_BY()`).

We can see that run-command.c is the only one that try to use these
macros without including <alloc.h>.

  $ git grep -l -e '[^_]ALLOC_GROW(' -e 'ALLOC_GROW_BY(' \*.c | sort >/tmp/1
  $ git grep -l 'alloc\.h' \*.c | sort >/tmp/2
  $ comm -23 /tmp/[12]
  compat/simple-ipc/ipc-unix-socket.c
  run-command.c

The "compat/" file only talks about the macro in the comment,
and is not broken.

We could fix this by conditionally including "alloc.h" when
`NO_PTHREADS` is defined.  But we have relatively few examples of
conditional includes throughout the tree[^1].

Instead, include "alloc.h" unconditionally in run-command.c to allow it
to successfully compile even when NO_PTHREADS is defined.

[^1]: With `git grep -E -A1 '#if(n)?def' -- **/*.c | grep '#include' -B1`.

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 15:17:17 -07:00
08c12ec1d0 tag: keep the message file in case ref transaction fails
The ref transaction can fail after the user has written their tag
message. In particular, if there exists a tag `foo/bar` and `git tag -a
foo` is said then the command will only fail once it tries to write
`refs/tags/foo`, which is after the file has been unlinked.

Hold on to the message file for a little longer so that it is not
unlinked before the fatal error occurs.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 11:38:14 -07:00
669c11de85 t/t7004-tag: add regression test for successful tag creation
The standard tag message file is unlinked if the tag is created.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 11:38:14 -07:00
719515fdd0 doc: tag: document TAG_EDITMSG
Document `TAG_EDITMSG` which we have told the user about on unsuccessful
command invocations since commit 3927bbe9a4 (tag: delete TAG_EDITMSG
only on successful tag, 2008-12-06).

Introduce this documentation since we are going to add tests for the
lifetime of this file in the case of command failure and success.

Use the documentation for `COMMIT_EDITMSG` from `git-commit.txt` as a
template since these two files share the same purpose.[1]

† 1: from commit 3927bbe9a4:

     “ This matches the behavior of COMMIT_EDITMSG, which stays around
       in case of error.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 11:38:14 -07:00
b126b65b33 test: do not negate test_path_is_* to assert absense
These tests use "! test_path_is_dir" or "! test_path_is_file" to
ensure that the path is not recursively checked out or "submodule
update" did not touch the working tree.

Use "test_path_is_missing" to assert that the path does not exist,
instead of negating test_path_is_* helpers; they are designed to be
loud in wrong occasions.  Besides, negating "test_path_is_dir" would
mean we would be happy if a file exists there, which is not the case
for these tests.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 09:14:23 -07:00
eab648d2b4 t2021: do not negate test_path_is_dir
In this test, a path (some_dir) that is originally a directory is to
be removed and then to be replaced with a file of the same name.
The expectation is that the path becomes a file at the end.
However, "!  test_path_is_dir some_dir" is used to catch a breakage
that leaves the path as a directory.

But as with all the "test_path_is..." helpers, this use of the
helper makes it loud when the expectation (i.e. it is a directory)
is met, and otherwise is silent when it is not---this does not help
debugging.

Be more explicit and state that we expect the path to become a file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 09:14:21 -07:00
c205923649 tests: do not negate test_path_exists
As a way to assert the path 'foo' is missing, "! test_path_exists
foo" is a poor way to do so, as the helper is designed to complain
when 'foo' is missing, but the intention of the author who used
negated form was to make sure it does not exist.  This does not
help debugging the tests.

Use test_path_is_missing instead, which is a more appropriate helper.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 09:13:55 -07:00
03d05937a7 Merge tag 'v2.41.0-rc0'
Git 2.41-rc0

* tag 'v2.41.0-rc0': (508 commits)
  Git 2.41-rc0
  t5583: fix shebang line
  merge-tree: load default git config
  fetch: introduce machine-parseable "porcelain" output format
  fetch: move option related variables into main function
  fetch: lift up parsing of "fetch.output" config variable
  fetch: introduce `display_format` enum
  fetch: refactor calculation of the display table width
  fetch: print left-hand side when fetching HEAD:foo
  fetch: add a test to exercise invalid output formats
  fetch: split out tests for output format
  fetch: fix `--no-recurse-submodules` with multi-remote fetches
  The eighteenth batch
  The seventeenth batch
  diff-files: integrate with sparse index
  t1092: add tests for `git diff-files`
  test: rev-parse-upstream: add missing cmp
  t: drop "verbose" helper function
  t7001: use "ls-files --format" instead of "cut"
  t7001: avoid git on upstream of pipe
  ...
2023-05-16 10:19:48 +08:00
0df2c18090 Git 2.41-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-15 13:59:07 -07:00
15ba44f1b4 Merge branch 'ps/fetch-output-format'
"git fetch" learned the "--porcelain" option that emits what it did
in a machine-parseable format.

* ps/fetch-output-format:
  fetch: introduce machine-parseable "porcelain" output format
  fetch: move option related variables into main function
  fetch: lift up parsing of "fetch.output" config variable
  fetch: introduce `display_format` enum
  fetch: refactor calculation of the display table width
  fetch: print left-hand side when fetching HEAD:foo
  fetch: add a test to exercise invalid output formats
  fetch: split out tests for output format
  fetch: fix `--no-recurse-submodules` with multi-remote fetches
2023-05-15 13:59:07 -07:00
ef06676c36 Merge branch 'sg/retire-unused-cocci'
Retire a rather expensive-to-run Coccinelle check patch.

* sg/retire-unused-cocci:
  cocci: remove 'unused.cocci'
2023-05-15 13:59:06 -07:00
5ca11547bb Merge branch 'sl/diff-files-sparse'
Teach "diff-files" not to expand sparse-index unless needed.

* sl/diff-files-sparse:
  diff-files: integrate with sparse index
  t1092: add tests for `git diff-files`
2023-05-15 13:59:06 -07:00
80754c5cc0 Merge branch 'ds/merge-tree-use-config'
Allow git forges to disable replace-refs feature while running "git
merge-tree".

* ds/merge-tree-use-config:
  merge-tree: load default git config
2023-05-15 13:59:06 -07:00
db13ea835b Merge branch 'js/subtree-fully-spelt-quiet-and-debug-options'
"git subtree" (in contrib/) update.

* js/subtree-fully-spelt-quiet-and-debug-options:
  subtree: support long global flags
2023-05-15 13:59:06 -07:00
85cee30566 Merge branch 'ar/test-cleanup-unused-file-creation'
Test fix.

* ar/test-cleanup-unused-file-creation:
  test: rev-parse-upstream: add missing cmp
2023-05-15 13:59:06 -07:00
5334592b1d Merge branch 'jk/test-verbose-no-more'
Retire "verbose" helper function from the test framework.

* jk/test-verbose-no-more:
  t: drop "verbose" helper function
  t7001: use "ls-files --format" instead of "cut"
  t7001: avoid git on upstream of pipe
2023-05-15 13:59:05 -07:00
f37da97723 Merge branch 'tl/push-branches-is-an-alias-for-all'
"git push --all" gained an alias "git push --branches".

* tl/push-branches-is-an-alias-for-all:
  t5583: fix shebang line
  push: introduce '--branches' option
2023-05-15 13:59:05 -07:00
be2fd0edb1 Merge branch 'jc/name-rev-deprecate-stdin-further'
The "--stdin" option of "git name-rev" has been replaced with
the "--annotate-stdin" option more than a year ago.  We stop
advertising it in the "git name-rev -h" output.

* jc/name-rev-deprecate-stdin-further:
  name-rev: make --stdin hidden
2023-05-15 13:59:05 -07:00
3fb8a0f0a2 Merge branch 'jc/t9800-fix-use-of-show-s-raw'
A test fix.

* jc/t9800-fix-use-of-show-s-raw:
  t9800: correct misuse of 'show -s --raw' in a test
2023-05-15 13:59:05 -07:00
1e1dcb2a42 Merge branch 'jc/dirstat-plug-leaks'
"git diff --dirstat" leaked memory, which has been plugged.

* jc/dirstat-plug-leaks:
  diff: plug leaks in dirstat
  diff: refactor common tail part of dirstat computation
2023-05-15 13:59:05 -07:00
cd2b740ca9 Merge branch 'ds/fsck-bitmap'
"git fsck" learned to detect bit-flip breakages in the reachability
bitmap files.

* ds/fsck-bitmap:
  fsck: use local repository
  fsck: verify checksums of all .bitmap files
2023-05-15 13:59:04 -07:00
29b8a3f49d Merge branch 'js/gitk-fixes-from-gfw'
Gitk updates from GfW project.

* js/gitk-fixes-from-gfw:
  gitk: escape file paths before piping to git log
  gitk: prevent overly long command lines
2023-05-15 13:59:04 -07:00
f87d5aa383 Merge branch 'fc/doc-use-datestamp-in-commit'
An earlier change broke "doc-diff", which has been corrected.

* fc/doc-use-datestamp-in-commit:
  doc-diff: drop SOURCE_DATE_EPOCH override
  doc: doc-diff: specify date
2023-05-15 13:59:04 -07:00
2bb14fbf2f Merge branch 'ar/config-count-tests-updates'
Test updates.

* ar/config-count-tests-updates:
  t1300: add tests for missing keys
  t1300: check stderr for "ignores pairs" tests
  t1300: drop duplicate test
2023-05-15 13:59:04 -07:00
66077a29e1 Merge branch 'kh/doc-interpret-trailers-updates'
Doc update.

* kh/doc-interpret-trailers-updates:
  doc: interpret-trailers: fix example
  doc: interpret-trailers: don’t use deprecated config
  doc: interpret-trailers: use input redirection
  doc: interpret-trailers: don’t use heredoc in examples
2023-05-15 13:59:03 -07:00
fa889347e3 Merge branch 'gc/trace-bare-repo-setup'
The tracing mechanism learned to notice and report when
auto-discovered bare repositories are being used, as allowing so
without explicitly stating the user intends to do so (with setting
GIT_DIR for example) can be used with social engineering as an
attack vector.

* gc/trace-bare-repo-setup:
  setup: trace bare repository setups
2023-05-15 13:59:03 -07:00
64477d20d7 Merge branch 'mc/send-email-header-cmd'
"git send-email" learned "--header-cmd=<cmd>" that can inject
arbitrary e-mail header lines to the outgoing messages.

* mc/send-email-header-cmd:
  send-email: detect empty blank lines in command output
  send-email: add --header-cmd, --no-header-cmd options
  send-email: extract execute_cmd from recipients_cmd
2023-05-15 13:59:03 -07:00
b14a73097c Merge branch 'jc/doc-clarify-git-default-hash-variable'
The documentation was misleading about the interaction between
GIT_DEFAULT_HASH and "git clone", which has been clarified to
stress that the variable is to be ignored by the command.

* jc/doc-clarify-git-default-hash-variable:
  doc: GIT_DEFAULT_HASH is and will be ignored during "clone"
2023-05-15 13:59:03 -07:00
d3f2e4ab13 Merge branch 'rj/branch-unborn-in-other-worktrees'
Error messages given when working on an unborn branch that is
checked out in another worktree have been improved.

* rj/branch-unborn-in-other-worktrees:
  branch: avoid unnecessary worktrees traversals
  branch: rename orphan branches in any worktree
  branch: description for orphan branch errors
  branch: use get_worktrees() in copy_or_rename_branch()
  branch: test for failures while renaming branches
2023-05-15 13:59:03 -07:00
0aefe4c841 doc/git-config: add unit for http.lowSpeedLimit
Add the unit (bytes per second) for http.lowSpeedLimit
in the documentation.

Signed-off-by: Corentin Garcia <corenting@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-14 21:51:33 -07:00
170eea9750 rebase -r: fix the total number shown in the progress
For regular, non-`--rebase-merges` runs, there is very little work to do
for the parser when determining the total number of commands in a rebase
script: it is simply the number of lines after stripping the commented
lines and then trimming the trailing empty line, if any.

The `--rebase-merges` mode complicates things by introducing empty lines
and comments in the middle of the script. These should _not_ be counted
as commands, and indeed, when an interactive rebase is interrupted and
subsequently resumed, the total number of commands can magically shrink,
sometimes dramatically.

The reason for this strange behavior is that empty lines _are_ counted
in `edit_todo_list()` (but not the comments, as they are stripped via
`strbuf_stripspace(..., 1)`, which is a bug.

Let's fix this so that the correct total number is shown from the
get-go, by carefully adjusting it according to what's in the rebase
script. Extra care needs to be taken in case the user edits the script:
the number of commands might be different after the user edited than
beforehand.

Note: Even though commented lines are skipped in `edit_todo_list()`, we
still need to handle `TODO_COMMENT` items by decrementing the
already-incremented `total_nr` again: empty lines are also marked as
`TODO_COMMENT`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-14 21:44:50 -07:00
fa5103dd89 rebase --update-refs: fix loops
The `total_nr` field in the `todo_list` structure merely serves display
purposes, and should only be used when generating the progress message.

In these two instances, however, we want to loop over all of the
commands in the parsed rebase script. The loop limit therefore needs to
be `nr`, which refers to the count of commands in the current
`todo_list`.

This is important because the two numbers, `nr` and `total_nr` can
differ wildly, e.g. due to `total_nr` _not_ counting comments or empty
lines, while `nr` skips any commands that already moved from the
`git-rebase-todo` file to the `done` file.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-14 21:44:50 -07:00
4fe42f326e pack-refs: teach pack-refs --include option
Allow users to be more selective over which refs to pack by adding an
--include option to git-pack-refs.

The existing options allow some measure of selectivity. By default
git-pack-refs packs all tags. --all can be used to include all refs,
and the previous commit added the ability to exclude certain refs with
--exclude.

While these knobs give the user some selection over which refs to pack,
it could be useful to give more control. For instance, a repository may
have a set of branches that are rarely updated and would benefit from
being packed. --include would allow the user to easily include a set of
branches to be packed while leaving everything else unpacked.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-12 14:54:14 -07:00
826ae79fca pack-refs: teach --exclude option to exclude refs from being packed
At GitLab, we have a system that creates ephemeral internal refs that
don't live long before getting deleted. Having an option to exclude
certain refs from a packed-refs file allows these internal references to
be deleted much more efficiently.

Add an --exclude option to the pack-refs builtin, and use the ref
exclusions API to exclude certain refs from being packed into the final
packed-refs file

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-12 14:54:14 -07:00
283174b214 docs: clarify git-pack-refs --all will pack all refs
--all packs not just branch tips but anything under refs/ with the
exception of hidden refs and broken refs. Clarify this in the
documentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-12 14:54:13 -07:00
022fbb655d t5583: fix shebang line
The shebang was missing the leading `/` character, resulting in:

    $ ./t5583-push-branches.sh
    bash: ./t5583-push-branches.sh: cannot execute: required file not found

Add the missing character so the test can run.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-12 10:02:18 -07:00
5bc069e383 Merge branch 'mh/credential-password-expiry-wincred'
Teach the recently invented "password expiry time" trait to the
wincred credential helper.

* mh/credential-password-expiry-wincred:
  credential/wincred: store password_expiry_utc
2023-05-11 12:16:16 -07:00
cb29fb86f3 Merge branch 'mh/use-wincred-from-system'
Code clean-up.

* mh/use-wincred-from-system:
  credential/wincred: include wincred.h
2023-05-11 12:16:15 -07:00
b6551feadf merge-tree: load default git config
The 'git merge-tree' command handles creating root trees for merges
without using the worktree. This is a critical operation in many Git
hosts, as they typically store bare repositories.

This builtin does not load the default Git config, which can have
several important ramifications.

In particular, one config that is loaded by default is
core.useReplaceRefs. This is typically disabled in Git hosts due to
the ability to spoof commits in strange ways.

Since this config is not loaded specifically during merge-tree, users
were previously able to use refs/replace/ references to make pull
requests that looked valid but introduced malicious content. The
resulting merge commit would have the correct commit history, but the
malicious content would exist in the root tree of the merge.

The fix is simple: load the default Git config in cmd_merge_tree().
This may also fix other behaviors that are effected by reading default
config. The only possible downside is a little extra computation time
spent reading config. The config parsing is placed after basic argument
parsing so it does not slow down usage errors.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 12:20:44 -07:00
dd781e3856 fetch: introduce machine-parseable "porcelain" output format
The output of git-fetch(1) is obviously designed for consumption by
users, only: we neatly columnize data, we abbreviate reference names, we
print neat arrows and we don't provide information about actual object
IDs that have changed. This makes the output format basically unusable
in the context of scripted invocations of git-fetch(1) that want to
learn about the exact changes that the command performs.

Introduce a new machine-parseable "porcelain" output format that is
supposed to fix this shortcoming. This output format is intended to
provide information about every reference that is about to be updated,
the old object ID that the reference has been pointing to and the new
object ID it will be updated to. Furthermore, the output format provides
the same flags as the human-readable format to indicate basic conditions
for each reference update like whether it was a fast-forward update, a
branch deletion, a rejected update or others.

The output format is quite simple:

```
<flag> <old-object-id> <new-object-id> <local-reference>\n
```

We assume two conditions which are generally true:

    - The old and new object IDs have fixed known widths and cannot
      contain spaces.

    - References cannot contain newlines.

With these assumptions, the output format becomes unambiguously
parseable. Furthermore, given that this output is designed to be
consumed by scripts, the machine-readable data is printed to stdout
instead of stderr like the human-readable output is. This is mostly done
so that other data printed to stderr, like error messages or progress
meters, don't interfere with the parseable data.

A notable ommission here is that the output format does not include the
remote from which a reference was fetched, which might be important
information especially in the context of multi-remote fetches. But as
such a format would require us to print the remote for every single
reference update due to parallelizable fetches it feels wasteful for the
most likely usecase, which is when fetching from a single remote.

In a similar spirit, a second restriction is that this cannot be used
with `--recurse-submodules`. This is because any reference updates would
be ambiguous without also printing the repository in which the update
happens.

Considering that both multi-remote and submodule fetches are user-facing
features, using them in conjunction with `--porcelain` that is intended
for scripting purposes is likely not going to be useful in the majority
of cases. With that in mind these restrictions feel acceptable. If
usecases for either of these come up in the future though it is easy
enough to add a new "porcelain-v2" format that adds this information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
cdc034a0ac fetch: move option related variables into main function
The options of git-fetch(1) which we pass to `parse_options()` are
declared globally in `builtin/fetch.c`. This means we're forced to use
global variables for all the options, which is more likely to cause
confusion than explicitly passing state around.

Refactor the code to move the options into `cmd_fetch()`. Move variables
that were previously forced to be declared globally and which are only
used by `cmd_fetch()` into function-local scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
58afbe885c fetch: lift up parsing of "fetch.output" config variable
Parsing the display format happens inside of `display_state_init()`. As
we only need to check for a simple config entry, this is a natural
location to put this code as it means that display-state logic is neatly
contained in a single location.

We're about to introduce a new "porcelain" output format though that is
intended to be parseable by machines, for example inside of a script.
This format can be enabled by passing the `--porcelain` switch to
git-fetch(1). As a consequence, we'll have to add a second callsite that
influences the output format, which will become awkward to handle.

Refactor the code such that callers are expected to pass the display
format that is to be used into `display_state_init()`. This allows us to
lift up the code into the main function, where we can then hook it into
command line options parser in a follow-up commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
50957937f9 fetch: introduce display_format enum
We currently have two different display formats in git-fetch(1) with the
"full" and "compact" formats. This is tracked with a boolean value that
simply denotes whether the display format is supposed to be compacted
or not. This works reasonably well while there are only two formats, but
we're about to introduce another format that will make this a bit more
awkward to use.

Introduce a `enum display_format` that is more readily extensible.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
9539638a2b fetch: refactor calculation of the display table width
When displaying reference updates, we try to print the references in a
neat table. As the table's width is determined its contents we thus need
to precalculate the overall width before we can start printing updated
references.

The calculation is driven by `display_state_init()`, which invokes
`refcol_width()` for every reference that is to be printed. This split
is somewhat confusing. For one, we filter references that shall be
attributed to the overall width in both places. And second, we
needlessly recalculate the maximum line length based on the terminal
columns and display format for every reference.

Refactor the code so that the complete width calculations are neatly
contained in `refcol_width()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
1c31764dda fetch: print left-hand side when fetching HEAD:foo
`store_updated_refs()` parses the remote reference for two purposes:

    - It gets used as a note when writing FETCH_HEAD.

    - It is passed through to `display_ref_update()` to display
      updated references in the following format:

      ```
       * branch               master          -> master
      ```

In most cases, the parsed remote reference is the prettified reference
name and can thus be used for both cases. But if the remote reference is
HEAD, the parsed remote reference becomes empty. This is intended when
we write the FETCH_HEAD, where we skip writing the note in that case.
But when displaying the updated references this leads to inconsistent
output where the left-hand side of reference updates is missing in some
cases:

```
$ git fetch origin HEAD HEAD:explicit-head :implicit-head main
From https://github.com/git/git
 * branch                  HEAD       -> FETCH_HEAD
 * [new ref]                          -> explicit-head
 * [new ref]                          -> implicit-head
 * branch                  main       -> FETCH_HEAD
```

This behaviour has existed ever since the table-based output has been
introduced for git-fetch(1) via 165f390250 (git-fetch: more terse fetch
output, 2007-11-03) and was never explicitly documented either in the
commit message or in any of our tests. So while it may not be a bug per
se, it feels like a weird inconsistency and not like it was a concious
design decision.

The logic of how we compute the remote reference name that we ultimately
pass to `display_ref_update()` is not easy to follow. There are three
different cases here:

    - When the remote reference name is "HEAD" we set the remote
      reference name to the empty string. This is the case that causes
      the left-hand side to go missing, where we would indeed want to
      print "HEAD" instead of the empty string. This is what
      `prettify_refname()` would return.

    - When the remote reference name has a well-known prefix then we
      strip this prefix. This matches what `prettify_refname()` does.

    - Otherwise, we keep the fully qualified reference name. This also
      matches what `prettify_refname()` does.

As the return value of `prettify_refname()` would do the correct thing
for us in all three cases, we can thus fix the inconsistency by passing
through the full remote reference name to `display_ref_update()`, which
learns to call `prettify_refname()`. At the same time, this also
simplifies the code a bit.

Note that this patch also changes formatting of the block that computes
the "kind" (which is the category like "branch" or "tag") and "what"
(which is the prettified reference name like "master" or "v1.0")
variables. This is done on purpose so that it is part of the diff,
hopefully making the change easier to comprehend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
3daf6558ed fetch: add a test to exercise invalid output formats
Add a testcase that exercises the logic when an invalid output format is
passed via the `fetch.output` configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:24 -07:00
2c5691d6cf fetch: split out tests for output format
We're about to introduce a new porcelain mode for the output of
git-fetch(1). As part of that we'll be introducing a set of new tests
that only relate to the output of this command.

Split out tests that exercise the output format of git-fetch(1) so that
it becomes easier to verify this functionality as a standalone unit. As
the tests assume that the default branch is called "main" we set up the
corresponding GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME environment variable
accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:24 -07:00
5667141e3b fetch: fix --no-recurse-submodules with multi-remote fetches
When running `git fetch --no-recurse-submodules`, the exectation is that
we don't fetch any submodules. And while this works for fetches of a
single remote, it doesn't when fetching multiple remotes at once. The
result is that we do recurse into submodules even though the user has
explicitly asked us not to.

This is because while we pass on `--recurse-submodules={yes,on-demand}`
if specified by the user, we don't pass on `--no-recurse-submodules` to
the subprocess spawned to perform the submodule fetch.

Fix this by also forwarding this flag as expected.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:24 -07:00
91428f078b The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:23:29 -07:00
f7947450de Merge branch 'sd/doc-gitignore-and-rm-cached'
Doc update.

* sd/doc-gitignore-and-rm-cached:
  docs: clarify git rm --cached function in gitignore note
2023-05-10 10:23:29 -07:00
40a5d2b79b Merge branch 'fc/doc-man-lift-title-length-limit'
The titles of manual pages used to be chomped at an unreasonably
short limit, which has been removed.

* fc/doc-man-lift-title-length-limit:
  doc: manpage: remove maximum title length
2023-05-10 10:23:29 -07:00
8d6d9529cb Merge branch 'fc/doc-drop-custom-callout-format'
Our custom callout formatter is no longer used in the documentation
formatting toolchain, as the upstream default ones give better
output these days.

* fc/doc-drop-custom-callout-format:
  doc: remove custom callouts format
2023-05-10 10:23:29 -07:00
2ca91d1ee0 Merge branch 'mh/credential-oauth-refresh-token'
The credential subsystem learns to help OAuth framework.

* mh/credential-oauth-refresh-token:
  credential: new attribute oauth_refresh_token
2023-05-10 10:23:29 -07:00
c05615e1c5 Merge branch 'ah/doc-attributes-text'
Doc update to clarify how text and eol attributes interact to
specify the end-of-line conversion.

* ah/doc-attributes-text:
  docs: rewrite the documentation of the text and eol attributes
2023-05-10 10:23:28 -07:00
7f3cc51b28 Merge branch 'ar/test-cleanup-unused-file-creation-part2'
Test cleanup.

* ar/test-cleanup-unused-file-creation-part2:
  t2019: don't create unused files
  t1502: don't create unused files
  t1450: don't create unused files
  t1300: don't create unused files
  t1300: fix config file syntax error descriptions
  t0300: don't create unused file
2023-05-10 10:23:28 -07:00
b6e9521956 Merge branch 'ms/send-email-feed-header-to-validate-hook'
"git send-email" learned to give the e-mail headers to the validate
hook by passing an extra argument from the command line.

* ms/send-email-feed-header-to-validate-hook:
  send-email: expose header information to git-send-email's sendemail-validate hook
  send-email: refactor header generation functions
2023-05-10 10:23:28 -07:00
e2abfa7212 Merge branch 'hx/negotiator-non-recursive'
The implementation of the default "negotiator", used to find common
ancestor over the network for object tranfer, used to be recursive;
it was updated to be iterative to conserve stackspace usage.

* hx/negotiator-non-recursive:
  negotiator/skipping: fix some problems in mark_common()
  negotiator/default: avoid stack overflow
2023-05-10 10:23:28 -07:00
07ac32fff9 Merge branch 'ma/gittutorial-fixes'
Doc fixes.

* ma/gittutorial-fixes:
  gittutorial: wrap literal examples in backticks
  gittutorial: drop early mention of origin
2023-05-10 10:23:27 -07:00
fbbf60a9bc Merge branch 'tb/credential-long-lines'
The implementation of credential helpers used fgets() over fixed
size buffers to read protocol messages, causing the remainder of
the folded long line to trigger unexpected behaviour, which has
been corrected.

* tb/credential-long-lines:
  contrib/credential: embiggen fixed-size buffer in wincred
  contrib/credential: avoid fixed-size buffer in libsecret
  contrib/credential: .gitignore libsecret build artifacts
  contrib/credential: remove 'gnome-keyring' credential helper
  contrib/credential: avoid fixed-size buffer in osxkeychain
  t/lib-credential.sh: ensure credential helpers handle long headers
  credential.c: store "wwwauth[]" values in `credential_read()`
2023-05-10 10:23:27 -07:00
6710b68db1 Merge branch 'rs/test-ctype-eof'
ctype tests have been taught to test EOF, too.

* rs/test-ctype-eof:
  test-ctype: check EOF
2023-05-10 10:23:27 -07:00
5597cfdf47 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09 16:45:47 -07:00
0004d97099 Merge branch 'ob/t3501-retitle'
Retitle a test script with an overly narrow name.

* ob/t3501-retitle:
  t/t3501-revert-cherry-pick.sh: clarify scope of the file
2023-05-09 16:45:46 -07:00
53b29442a8 Merge branch 'jw/send-email-update-gmail-insn'
Doc update to drop use of deprecated app-specific password against
gmail.

* jw/send-email-update-gmail-insn:
  send-email docs: Remove mention of discontinued gmail feature
2023-05-09 16:45:46 -07:00
461eea3fb8 Merge branch 'ob/messages-capitalize-exception'
Message update.

* ob/messages-capitalize-exception:
  messages: capitalization and punctuation exceptions
2023-05-09 16:45:46 -07:00
d6b7f01cd7 Merge branch 'ob/sequencer-i18n-fix'
Message update.

* ob/sequencer-i18n-fix:
  sequencer: actually translate report in do_exec()
2023-05-09 16:45:46 -07:00
ccd12a3d6c Merge branch 'en/header-split-cache-h-part-2'
More header clean-up.

* en/header-split-cache-h-part-2: (22 commits)
  reftable: ensure git-compat-util.h is the first (indirect) include
  diff.h: reduce unnecessary includes
  object-store.h: reduce unnecessary includes
  commit.h: reduce unnecessary includes
  fsmonitor: reduce includes of cache.h
  cache.h: remove unnecessary headers
  treewide: remove cache.h inclusion due to previous changes
  cache,tree: move basic name compare functions from read-cache to tree
  cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
  hash-ll.h: split out of hash.h to remove dependency on repository.h
  tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
  dir.h: move DTYPE defines from cache.h
  versioncmp.h: move declarations for versioncmp.c functions from cache.h
  ws.h: move declarations for ws.c functions from cache.h
  match-trees.h: move declarations for match-trees.c functions from cache.h
  pkt-line.h: move declarations for pkt-line.c functions from cache.h
  base85.h: move declarations for base85.c functions from cache.h
  copy.h: move declarations for copy.c functions from cache.h
  server-info.h: move declarations for server-info.c functions from cache.h
  packfile.h: move pack_window and pack_entry from cache.h
  ...
2023-05-09 16:45:46 -07:00
ab828cde84 Merge branch 'mh/fix-detect-compilers-with-nondigit-versions'
The detect-compilers script to help auto-tweaking the build system
had trouble working with compilers whose version number has extra
suffixes.  The script has been taught that certain suffixes (like
"-win32" in "gcc 10-win32") can be safely stripped as they share
the same features and bugs with the version without the suffix.

* mh/fix-detect-compilers-with-nondigit-versions:
  Handle some compiler versions containing a dash
2023-05-09 16:45:45 -07:00
620e92b845 Merge branch 'jk/parse-commit-with-malformed-ident'
The commit object parser has been taught to be a bit more lenient
to parse timestamps on the author/committer line with a malformed
author/committer ident.

* jk/parse-commit-with-malformed-ident:
  parse_commit(): describe more date-parsing failure modes
  parse_commit(): handle broken whitespace-only timestamp
  parse_commit(): parse timestamp from end of line
  t4212: avoid putting git on left-hand side of pipe
2023-05-09 16:45:45 -07:00
8c30be9176 diff-files: integrate with sparse index
Remove full index requirement for `git diff-files`. Refactor the
ensure_expanded and ensure_not_expanded functions by introducing a
common helper function, ensure_index_state. Add test to ensure the index
is no expanded in `git diff-files`.

The `p2000` tests demonstrate a ~96% execution time reduction for 'git
diff-files' and a ~97% execution time reduction for 'git diff-files'
for a file using a sparse index:

Test                                               before  after
-----------------------------------------------------------------------
2000.94: git diff-files (full-v3)                  0.09    0.08 -11.1%
2000.95: git diff-files (full-v4)                  0.09    0.09 +0.0%
2000.96: git diff-files (sparse-v3)                0.52    0.02 -96.2%
2000.97: git diff-files (sparse-v4)                0.51    0.02 -96.1%
2000.98: git diff-files -- f2/f4/a (full-v3)       0.06    0.07 +16.7%
2000.99: git diff-files -- f2/f4/a (full-v4)       0.08    0.08 +0.0%
2000.100: git diff-files -- f2/f4/a (sparse-v3)    0.46    0.01 -97.8%
2000.101: git diff-files -- f2/f4/a (sparse-v4)    0.51    0.02 -96.1%

Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09 14:26:36 -07:00
0aba1a989c t1092: add tests for git diff-files
Before integrating the 'git diff-files' builtin with the sparse index
feature, add tests to t1092-sparse-checkout-compatibility.sh to ensure
it currently works with sparse-checkout and will still work with sparse
index after that integration.

When adding tests against a sparse-checkout definition, we test two
modes: all changes are within the sparse-checkout cone and some changes
are outside the sparse-checkout cone.

In order to have staged changes outside of the sparse-checkout cone,
make a directory called 'folder1' and copy `a` into 'folder1/a'.
'folder1/a' is identical to `a` in the base commit. These make
'folder1/a' in the index, while leaving it outside of the
sparse-checkout definition. Change content inside 'folder1/a' in order
to test 'folder1/a' being present on-disk with modifications.

Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09 14:26:34 -07:00
159f4b9c3b test: rev-parse-upstream: add missing cmp
It seems pretty clear 5236fce6b4 (t1507: stop losing return codes of git
commands, 2019-12-20) missed a test_cmp.

Cc: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09 09:25:53 -07:00
8ddfce7144 t: drop "verbose" helper function
We have a small helper function called "verbose", with the idea that you
can write:

  verbose foo

to get a message to stderr when the "foo" command fails, even if it does
not produce any output itself. This goes back to 8ad1652418 (t5304: use
helper to report failure of "test foo = bar", 2014-10-10). It does work,
but overall it has not been a big success for two reasons:

  1. Test writers have to remember to put it there (and the resulting
     test code is longer as a result).

  2. It doesn't handle the opposite case (we expect "foo" to fail, but
     it succeeds), leading to inconsistencies in tests (which you can
     see in many hunks of this patch, e.g. ones involving "has_cr").

Most importantly, we added a136f6d8ff (test-lib.sh: support -x option
for shell-tracing, 2014-10-10) at the same time, and it does roughly the
same thing. The output is not quite as succinct as "verbose", and you
have to watch out for stray shell-traces ending up in stderr. But it
solves both of the problems above, and has clearly become the preferred
tool.

Let's consider the "verbose" function a failed experiment and remove the
last few callers (which are all many years old, and have been dwindling
as we remove them from scripts we touch for other reasons). It will be
one less thing for new test writers to see and wonder if they should be
using themselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 14:50:28 -07:00
a9ea5296b7 t7001: use "ls-files --format" instead of "cut"
Since ls-files recently learned a "--format" option, we can use that
rather than asking for all of "--stage" and then pulling out the bits we
want with "cut". That's simpler and avoids two extra processes (one for
cut, and one for the subshell to hold the intermediate result).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 14:50:28 -07:00
b1c8ac3996 t7001: avoid git on upstream of pipe
We generally avoid git on the left-hand side of a pipe, because it loses
the exit code of the command (and thus we'd miss things like segfaults
or unexpected failures). In the cases in t7001, we wouldn't expect
failures (they are just inspecting the repository state, and are not the
main point of the test), but it doesn't hurt to be careful.

In all but one case here we're piping "ls-files --stage" to cut off the
pathname (since we compare entries before and after moving). Let's pull
that into a helper function to avoid repeating the slightly awkward
replacement.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 14:50:28 -07:00
6e210175c7 t1092: update a write-tree test
* 'on all' in the title of the test 'write-tree on all' was unclear;
remove it.

* Add a baseline 'test_all_match git write-tree' before making any
changes to the index, providing a reference point for the 'write-tree'
prior to any modifications.

* Add a comparison of the output of 'git status --porcelain=v2' to test
the working tree after 'write-tree' exits.

* Ensure SKIP_WORKTREE files weren't materialized on disk by using
"test_path_is_missing".

Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 14:41:30 -07:00
b0afdce5da pack-bitmap.c: use commit boundary during bitmap traversal
When reachability bitmap coverage exists in a repository, Git will use a
different (and hopefully faster) traversal to compute revision walks.

Consider a set of positive and negative tips (which we'll refer to with
their standard bitmap parlance by "wants", and "haves"). In order to
figure out what objects exist between the tips, the existing traversal
in `prepare_bitmap_walk()` does something like:

  1. Consider if we can even compute the set of objects with bitmaps,
     and fall back to the usual traversal if we cannot. For example,
     pathspec limiting traversals can't be computed using bitmaps (since
     they don't know which objects are at which paths). The same is true
     of certain kinds of non-trivial object filters.

  2. If we can compute the traversal with bitmaps, partition the
     (dereferenced) tips into two object lists, "haves", and "wants",
     based on whether or not the objects have the UNINTERESTING flag,
     respectively.

  3. Fall back to the ordinary object traversal if either (a) there are
     more than zero haves, none of which are in the bitmapped pack or
     MIDX, or (b) there are no wants.

  4. Construct a reachability bitmap for the "haves" side by walking
     from the revision tips down to any existing bitmaps, OR-ing in any
     bitmaps as they are found.

  5. Then do the same for the "wants" side, stopping at any objects that
     appear in the "haves" bitmap.

  6. Filter the results if any object filter (that can be easily
     computed with bitmaps alone) was given, and then return back to the
     caller.

When there is good bitmap coverage relative to the traversal tips, this
walk is often significantly faster than an ordinary object traversal
because it can visit far fewer objects.

But in certain cases, it can be significantly *slower* than the usual
object traversal. Why? Because we need to compute complete bitmaps on
either side of the walk. If either one (or both) of the sides require
walking many (or all!) objects before they get to an existing bitmap,
the extra bitmap machinery is mostly or all overhead.

One of the benefits, however, is that even if the walk is slower, bitmap
traversals are guaranteed to provide an *exact* answer. Unlike the
traditional object traversal algorithm, which can over-count the results
by not opening trees for older commits, the bitmap walk builds an exact
reachability bitmap for either side, meaning the results are never
over-counted.

But producing non-exact results is OK for our traversal here (both in
the bitmap case and not), as long as the results are over-counted, not
under.

Relaxing the bitmap traversal to allow it to produce over-counted
results gives us the opportunity to make some significant improvements.
Instead of the above, the new algorithm only has to walk from the
*boundary* down to the nearest bitmap, instead of from each of the
UNINTERESTING tips.

The boundary-based approach still has degenerate cases, but we'll show
in a moment that it is often a significant improvement.

The new algorithm works as follows:

  1. Build a (partial) bitmap of the haves side by first OR-ing any
     bitmap(s) that already exist for UNINTERESTING commits between the
     haves and the boundary.

  2. For each commit along the boundary, add it as a fill-in traversal
     tip (where the traversal terminates once an existing bitmap is
     found), and perform fill-in traversal.

  3. Build up a complete bitmap of the wants side as usual, stopping any
     time we intersect the (partial) haves side.

  4. Return the results.

And is more-or-less equivalent to using the *old* algorithm with this
invocation:

    $ git rev-list --objects --use-bitmap-index $WANTS --not \
        $(git rev-list --objects --boundary $WANTS --not $HAVES |
          perl -lne 'print $1 if /^-(.*)/')

The new result performs significantly better in many cases, particularly
when the distance from the boundary commit(s) to an existing bitmap is
shorter than the distance from (all of) the have tips to the nearest
bitmapped commit.

Note that when using the old bitmap traversal algorithm, the results can
be *slower* than without bitmaps! Under the new algorithm, the result is
computed faster with bitmaps than without (at the cost of over-counting
the true number of objects in a similar fashion as the non-bitmap
traversal):

    # (Computing the number of tagged objects not on any branches
    # without bitmaps).
    $ time git rev-list --count --objects --tags --not --branches
    20

    real	0m1.388s
    user	0m1.092s
    sys	0m0.296s

    # (Computing the same query using the old bitmap traversal).
    $ time git rev-list --count --objects --tags --not --branches --use-bitmap-index
    19

    real	0m22.709s
    user	0m21.628s
    sys	0m1.076s

    # (this commit)
    $ time git.compile rev-list --count --objects --tags --not --branches --use-bitmap-index
    19

    real	0m1.518s
    user	0m1.234s
    sys	0m0.284s

The new algorithm is still slower than not using bitmaps at all, but it
is nearly a 15-fold improvement over the existing traversal.

In a more realistic setting (using my local copy of git.git), I can
observe a similar (if more modest) speed-up:

    $ argv="--count --objects --branches --not --tags"
    hyperfine \
      -n 'no bitmaps' "git.compile rev-list $argv" \
      -n 'existing traversal' "git.compile rev-list --use-bitmap-index $argv" \
      -n 'boundary traversal' "git.compile -c pack.useBitmapBoundaryTraversal=true rev-list --use-bitmap-index $argv"
    Benchmark 1: no bitmaps
      Time (mean ± σ):     124.6 ms ±   2.1 ms    [User: 103.7 ms, System: 20.8 ms]
      Range (min … max):   122.6 ms … 133.1 ms    22 runs

    Benchmark 2: existing traversal
      Time (mean ± σ):     368.6 ms ±   3.0 ms    [User: 325.3 ms, System: 43.1 ms]
      Range (min … max):   365.1 ms … 374.8 ms    10 runs

    Benchmark 3: boundary traversal
      Time (mean ± σ):     167.6 ms ±   0.9 ms    [User: 139.5 ms, System: 27.9 ms]
      Range (min … max):   166.1 ms … 169.2 ms    17 runs

    Summary
      'no bitmaps' ran
        1.34 ± 0.02 times faster than 'boundary traversal'
        2.96 ± 0.05 times faster than 'existing traversal'

Here, the new algorithm is also still slower than not using bitmaps, but
represents a more than 2-fold improvement over the existing traversal in
a more modest example.

Since this algorithm was originally written (nearly a year and a half
ago, at the time of writing), the bitmap lookup table shipped, making
the new algorithm's result more competitive. A few other future
directions for improving bitmap traversal times beyond not using bitmaps
at all:

  - Decrease the cost to decompress and OR together many bitmaps
    together (particularly when enumerating the uninteresting side of
    the walk). Here we could explore more efficient bitmap storage
    techniques, like Roaring+Run and/or use SIMD instructions to speed
    up ORing them together.

  - Store pseudo-merge bitmaps, which could allow us to OR together
    fewer "summary" bitmaps (which would also help with the above).

Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 12:05:55 -07:00
47ff853f02 pack-bitmap.c: extract fill_in_bitmap()
To prepare for the boundary-based bitmap walk to perform a fill-in
traversal using the boundary of either side as the tips, extract routine
used to perform fill-in traversal by `find_objects()` so that it can be
used in both places.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 12:05:55 -07:00
fe90355361 object: add object_array initializer helper function
The object_array API has an OBJECT_ARRAY_INIT macro, but lacks a
function to initialize an object_array at a given location in memory.

Introduce `object_array_init()` to implement such a function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 12:05:55 -07:00
99e70f3077 Merge gitk changes into js/gitk-fixes-from-gfw
* .tmp-gitk:
  gitk: escape file paths before piping to git log
  gitk: prevent overly long command lines
2023-05-08 09:16:57 -07:00
7dd272eca1 gitk: escape file paths before piping to git log
We just started piping the file paths via `stdin` instead of passing
them via the command-line, to avoid running into command-line
limitations.

However, since we now pipe the file paths, we need to take care of
special characters.

This fixes https://github.com/git-for-windows/git/issues/2293

Signed-off-by: Nico Rieck <nico.rieck@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 09:15:24 -07:00
bb5cb23daf gitk: prevent overly long command lines
To avoid running into command line limitations, some of Git's commands
support the `--stdin` option.

Let's use exactly this option in the three rev-list/log invocations in
gitk that would otherwise possibly run the danger of trying to invoke a
too-long command line.

While it is easy to redirect either stdin or stdout in Tcl/Tk scripts,
what we need here is both. We need to capture the output, yet we also
need to pipe in the revs/files arguments via stdin (because stdin does
not have any limit, unlike the command line). To help this, we use the
neat Tcl feature where you can capture stdout and at the same time feed
a fixed string as stdin to the spawned process.

One non-obvious aspect about this change is that the `--stdin` option
allows to specify revs, the double-dash, and files, but *no* other
options such as `--not`. This is addressed by prefixing the "negative"
revs with `^` explicitly rather than relying on the `--not` option
(thanks for coming up with that idea, Max!).

This fixes https://github.com/git-for-windows/git/issues/1987

Analysis-and-initial-patch-by: Max Kirillov <max@max630.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 09:15:24 -07:00
b4de9239bf subtree: support long global flags
The documentation at e75d1da38a claimed support, but it was never present

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 07:58:27 -07:00
425b4d7f47 push: introduce '--branches' option
The '--all' option of git-push built-in cmd support to push all branches
(refs under refs/heads) to remote. Under the usage, a user can easlily
work in some scenarios, for example, branches synchronization and batch
upload.

The '--all' was introduced for a long time, meanwhile, git supports to
customize the storage location under "refs/". when a new git user see
the usage like, 'git push origin --all', we might feel like we're
pushing _all_ the refs instead of just branches without looking at the
documents until we found the related description of it or '--mirror'.

To ensure compatibility, we cannot rename '--all' to another name
directly, one way is, we can try to add a new option '--heads' which be
identical with the functionality of '--all' to let the user understand
the meaning of representation more clearly. Actually, We've more or less
named options this way already, for example, in 'git-show-ref' and 'git
ls-remote'.

At the same time, we fix a related issue about the wrong help
information of '--all' option in code and add some test cases in
t5523, t5543 and t5583.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-06 14:36:43 -07:00
44451a2e5e attr: teach "--attr-source=<tree>" global option to "git"
Earlier, 47cfc9bd (attr: add flag `--source` to work with tree-ish,
2023-01-14) taught "git check-attr" the "--source=<tree>" option to
allow it to read attribute files from a tree-ish, but did so only
for the command.  Just like "check-attr" users wanted a way to use
attributes from a tree-ish and not from the working tree files,
users of other commands (like "git diff") would benefit from the
same.

Undo most of the UI change the commit made, while keeping the
internal logic to read attributes from a given tree-ish. Expose the
internal logic via a new "--attr-source=<tree>" command line option
given to "git", so that it can be used with any git command that
runs as part of the main git process.

Additionally, add an environment variable GIT_ATTR_SOURCE that is set
when --attr-source is passed in, so that subprocesses use the same value
for the attributes source tree.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-06 14:34:09 -07:00
9019d7dceb name-rev: make --stdin hidden
In 34ae3b70 (name-rev: deprecate --stdin in favor of --annotate-stdin),
we renamed --stdin to --annotate-stdin for the sake of a clearer name
for the option, and added text that indicates --stdin is deprecated. The
next step is to hide --stdin completely.

Make the option hidden. Also, update documentation to remove all
mentions of --stdin.

Signed-off-by: "John Cai" <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-06 14:32:20 -07:00
b7cf25c8f4 t9800: correct misuse of 'show -s --raw' in a test
There is $(git show -s --raw --pretty=format:%at HEAD) in this test
that is meant to grab the author time of the commit.  We used to
have a bug in the command line option parser of the diff family of
commands, where "show -s --raw" was identical to "show -s".

With the "-s" bug fixed, "show -s --raw" would mean the same thing
as "show --raw", i.e. show the output from the diff machinery in the
"raw" format.  And this test will start failing, so fix it before
that happens.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-06 14:30:51 -07:00
836088d80c doc-diff: drop SOURCE_DATE_EPOCH override
The original doc-diff script set SOURCE_DATE_EPOCH to make asciidoc's
output deterministic. Otherwise, the mtime of the source files would end
up in the footer of the manpage, causing noisy and uninteresting diff
hunks.

But this has been unused since 28fde3a1f4 (doc: set actual revdate for
manpages, 2023-04-13), as the footer uses the externally-specified
GIT_DATE instead (that needs to be set consistently, too, which it now
is as of the previous commit).

Asciidoc sets several automatic attributes based on the mtime (or manual
epoch), so it's still possible to write a document that would need
SOURCE_DATE_EPOCH set to be deterministic. But if we wrote such a thing,
it's probably a mistake, and we're better off having doc-diff loudly
show it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05 14:28:03 -07:00
9d484b92ed diff: fix interaction between the "-s" option and other options
Sergey Organov noticed and reported "--patch --no-patch --raw"
behaves differently from just "--raw".  It turns out that there are
a few interesting bugs in the implementation and documentation.

 * First, the documentation for "--no-patch" was unclear that it
   could be read to mean "--no-patch" countermands an earlier
   "--patch" but not other things.  The intention of "--no-patch"
   ever since it was introduced at d09cd15d (diff: allow --no-patch
   as synonym for -s, 2013-07-16) was to serve as a synonym for
   "-s", so "--raw --patch --no-patch" should have produced no
   output, but it can be (mis)read to allow showing only "--raw"
   output.

 * Then the interaction between "-s" and other format options were
   poorly implemented.  Modern versions of Git uses one bit each to
   represent formatting options like "--patch", "--stat" in a single
   output_format word, but for historical reasons, "-s" also is
   represented as another bit in the same word.  This allows two
   interesting bugs to happen, and we have both X-<.

   (1) After setting a format bit, then setting NO_OUTPUT with "-s",
       the code to process another "--<format>" option drops the
       NO_OUTPUT bit to allow output to be shown again.  However,
       the code to handle "-s" only set NO_OUTPUT without unsetting
       format bits set earlier, so the earlier format bit got
       revealed upon seeing the second "--<format>" option.  This is
       the problem Sergey observed.

   (2) After setting NO_OUTPUT with "-s", code to process
       "--<format>" option can forget to unset NO_OUTPUT, leaving
       the command still silent.

It is tempting to change the meaning of "--no-patch" to mean
"disable only the patch format output" and reimplement "-s" as "not
showing anything", but it would be an end-user visible change in
behavior.  Let's fix the interactions of these bits to first make
"-s" work as intended.

The fix is conceptually very simple.

 * Whenever we set DIFF_FORMAT_FOO because we saw the "--foo"
   option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
   given), we make sure we drop DIFF_FORMAT_NO_OUTPUT.  We forgot to
   do so in some of the options and caused (2) above.

 * When processing "-s" option, we should not just set
   DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits.
   We didn't do so and retained format bits set by options
   previously seen, causing (1) above.

It is even more tempting to lose NO_OUTPUT bit and instead take
output_format word being 0 as its replacement, but that would break
the mechanism "git show" uses to default to "--patch" output, where
the distinction between telling the command to be silent with "-s"
and having no output format specified on the command line matters,
and an explicit output format given on the command line should not
be "combined" with the default "--patch" format.

So, while we cannot lose the NO_OUTPUT bit, as a follow-up work, we
may want to replace it with OPTION_GIVEN bit, and

 * make "--patch", "--raw", etc. set DIFF_FORMAT_$format bit and
   DIFF_FORMAT_OPTION_GIVEN bit on for each format.  "--no-raw",
   etc. will set off DIFF_FORMAT_$format bit but still record the
   fact that we saw an option from the command line by setting
   DIFF_FORMAT_OPTION_GIVEN bit.

 * make "-s" (and its synonym "--no-patch") clear all other bits
   and set only the DIFF_FORMAT_OPTION_GIVEN bit on.

which I suspect would make the code much cleaner without breaking
any end-user expectations.

Once that is in place, transitioning "--no-patch" to mean the
counterpart of "--patch", just like "--no-raw" only defeats an
earlier "--raw", would be quite simple at the code level.  The
social cost of migrating the end-user expectations might be too
great for it to be worth, but at least the "GIVEN" bit clean-up
alone may be worth it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05 14:24:32 -07:00
83973981eb diff: plug leaks in dirstat
The array of dirstat_file contained in the dirstat_dir structure is
not freed after the processing ends.  Unfortunately, the member that
points at the array, .files, is incremented as the gather_dirstat()
function recursively walks it, and this needs to be plugged by
remembering the beginning of the array before gather_dirstat() mucks
with it and freeing it after we are done.

We can mark t4047 as leak-free.  t4000, which is marked as
leak-free, now can exercise dirstat in it, which will happen next.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05 14:24:09 -07:00
34a94897e0 diff: refactor common tail part of dirstat computation
This will become useful when we plug leaks in these two functions.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05 14:24:07 -07:00
1c301bcaa5 doc: doc-diff: specify date
Earlier we changed the manual page formatting machinery to use the
dates from the commit the documentation source was taken from,
instead of the date the manual page was produced.  When "doc-diff"
compares two commits from different dates, the different dates from
the two commits would result in unnecessary differences in the
output because of the change.

Compensate by setting a fixed date when "doc-diff" formats the pages
to be compared to work around this issue.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-04 18:16:29 -07:00
0c5308af30 docs: clarify git rm --cached function in gitignore note
Explain to users that the step to untrack a file will not also prevent them
from getting added in the future.

Signed-off-by: Sohom Datta <sohom.datta@learner.manipal.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 14:59:34 -07:00
d832f2ac55 doc: manpage: remove maximum title length
DocBook Stylesheets limit the size of the manpage titles for some
reason.

Even some of the longest git commands have no trouble fitting in 80
character terminals, so it's not clear why we would want to limit titles
to 20 characters, especially when modern terminals are much bigger.

For example:

  --- a/git-credential-cache--daemon.1
  +++ b/git-credential-cache--daemon.1
  @@ -1,4 +1,4 @@
  -GIT-CREDENTIAL-CAC(1)             Git Manual             GIT-CREDENTIAL-CAC(1)
  +GIT-CREDENTIAL-CACHE--DAEMON(1)   Git Manual   GIT-CREDENTIAL-CACHE--DAEMON(1)

   NAME
          git-credential-cache--daemon - Temporarily store user credentials in
  @@ -24,4 +24,4 @@ DESCRIPTION
   GIT
          Part of the git(1) suite

  -Git omitted                       2023-05-02             GIT-CREDENTIAL-CAC(1)
  +Git omitted                       2023-05-02   GIT-CREDENTIAL-CACHE--DAEMON(1)

Moreover, asciidoctor manpage backend doesn't limit the title length, so
we probably want to do the same for docbook backends for consistency.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 10:58:50 -07:00
6696077ace docs: rewrite the documentation of the text and eol attributes
These two sentences are confusing because the description of the text
attribute sounds exactly the same as the description of the text=auto
attribute:

"Setting the text attribute on a path enables end-of-line normalization"

"When text is set to "auto", the path is marked for automatic
end-of-line conversion"

Unless the reader is already familiar with the two variants, there's a
high probability that they will think that "end-of-line normalization"
is the same thing as "automatic end-of-line conversion".

It's also not clear that the phrase "When the file has been committed
with CRLF, no conversion is done" in the paragraph for text=auto does
not apply equally to the bare text attribute which is described earlier.
Moreover, it falsely implies that normalization is only suppressed if
the file has been committed. In fact, running `git add` on a CRLF file,
adding the text=auto attribute to the file, and running `git add` again
does not do anything to the line endings either.

On top of that, in several places the documentation for the eol
attribute sounds like either it does not affect normalization on checkin
or it forces normalization on checkin. It also sounds like setting eol
(or setting a config variable) is required to turn on conversion on
checkout, but the text attribute can turn on conversion on checkout by
itself if eol is unspecified.

Rephrase the documentation of text, text=auto, eol, eol=crlf, and eol=lf
to be clear about how they are the same, how they are different, and in
what cases conversion is performed.

Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 09:02:11 -07:00
a5855fd8d4 t2019: don't create unused files
Tests in t2019-checkout-ambiguous-ref.sh redirect two invocations of
"git checkout" to files "stdout" and "stderr".  Several assertions are
made using file "stderr".  File "stdout", however, is unused.

Don't redirect standard output of "git checkout" to file "stdout" in
t2019-checkout-ambiguous-ref.sh to avoid creating unnecessary files.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:53:10 -07:00
dca675c6ef t1502: don't create unused files
Three tests in file t1502-rev-parse-parseopt.sh use three redirections
with invocation of "git rev-parse --parseopt --".  All three tests
redirect standard output to file "out" and file "spec" to standard
input.  Two of the tests redirect standard output a second time to file
"actual", and the third test redirects standard error to file "err".
These tests check contents of files "actual" and "err", but don't use
the files named "out" for assertions.  The two tests that redirect to
standard output twice might also be confusing to the reader.

Don't redirect standard output of "git rev-parse" to file "out" in
t1502-rev-parse-parseopt.sh to avoid creating unnecessary files.

Acked-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:53:06 -07:00
59162ece57 t1450: don't create unused files
Test 'fsck error and recovery on invalid object type' in file
t1450-fsck.sh redirects output of a failing "git fsck" invocation to
files "out" and "err" to assert presence of error messages in the output
of the command.  Commit 31deb28f5e (fsck: don't hard die on invalid
object types, 2021-10-01) changed the way assertions in this test are
performed.  The test doesn't compare the whole standard error with
prepared file "err.expect" and it doesn't assert that standard output is
empty.

Don't create unused files "err.expect" and "out" in test 'fsck error and
recovery on invalid object type'.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:53:03 -07:00
a7cae2905b t1300: don't create unused files
Three tests in t1300-config.sh check that "git config --get" barfs when
syntax errors are present in the config file.  The tests redirect
standard output and standard error of "git config --get" to files,
"actual" and "error" correspondingly.  They assert presence of an error
message in file "error".  However, these tests don't use file "actual"
for assertions.

Don't redirect standard output of "git config --get" to file "actual" in
t1300-config.sh to avoid creating unnecessary files.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:52:48 -07:00
6fc68e7ca3 t1300: fix config file syntax error descriptions
Three tests in t1300-config.sh check that "git config --get" barfs when
the config file contains various syntax errors: key=value pair without
equals sign, broken section line, and broken value string.  The sample
config files include a comment describing the kind of broken syntax.
This description seems to have been copy-pasted from the "broken section
line" sample to the other two samples.

Fix descriptions of broken config file syntax in samples used in
t1300-config.sh.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:52:45 -07:00
ed5288cff2 t0300: don't create unused file
Test 'credential config with partial URLs' in t0300-credentials.sh
contains three "git credential fill" invocations.  For two of the
invocations, the test asserts presence or absence of string "yep" in the
standard output.  For the third test it checks for an error message in
standard error.

Don't redirect standard output of "git credential" to file "stdout" in
t0300-credentials.sh to avoid creating an unnecessary file when only
standard error is checked.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:52:17 -07:00
756991bc88 doc: remove custom callouts format
The code to render callouts for manpages comes from 17 years ago:
776e994af5 (Properly render asciidoc "callouts" in git man pages.,
2006-04-28), and it was needed back then, but DocBook Stylesheets added
support for that in 2008 [1], since 1.74.0 it hasn't been necessary.

What's worse: the format of the upstream callouts is much nicer than our
hacked version.

Compare this:

     $ git diff            (1)
     $ git diff --cached   (2)
     $ git diff HEAD       (3)

  1. Changes in the working tree not yet staged for the next
     commit.
  2. Changes between the index and your last commit; what you
     would be committing if you run git commit without -a
     option.
  3. Changes in the working tree since your last commit; what
     you would be committing if you run git commit -a

To this:

     $ git diff            (1)
     $ git diff --cached   (2)
     $ git diff HEAD       (3)

 1. Changes in the working tree not yet staged for the next commit.
 2. Changes between the index and your last commit; what you would
 be committing if you run git commit without -a option.
 3. Changes in the working tree since your last commit; what you
 would be committing if you run git commit -a

Let's drop our unnecessary inferior custom format and use the official
one.

[1] https://sourceforge.net/p/docbook/code/7842/

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:42:36 -07:00
69c786637d The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 10:13:50 -07:00
d699e27bd4 Merge branch 'tb/ban-strtok'
Mark strtok() and strtok_r() to be banned.

* tb/ban-strtok:
  banned.h: mark `strtok()` and `strtok_r()` as banned
  t/helper/test-json-writer.c: avoid using `strtok()`
  t/helper/test-oidmap.c: avoid using `strtok()`
  t/helper/test-hashmap.c: avoid using `strtok()`
  string-list: introduce `string_list_setlen()`
  string-list: multi-delimiter `string_list_split_in_place()`
2023-05-02 10:13:35 -07:00
cf85f4b3bd Merge branch 'jk/blame-fake-commit-label'
The output given by "git blame" that attributes a line to contents
taken from the file specified by the "--contents" option shows it
differently from a line attributed to the working tree file.

* jk/blame-fake-commit-label:
  blame: use different author name for fake commit generated by --contents
2023-05-02 10:13:35 -07:00
f357d46ada Merge branch 'jk/misc-null-check-fixes'
Code clean-up.

* jk/misc-null-check-fixes:
  fetch_bundle_uri(): drop pointless NULL check
  notes: clean up confusing NULL checks in init_notes()
2023-05-02 10:13:34 -07:00
3927312601 Merge branch 'en/ort-finalize-after-0-merges-fix'
A small API fix to the ort merge strategy backend.

* en/ort-finalize-after-0-merges-fix:
  merge-ort: fix calling merge_finalize() with no intermediate merge
2023-05-02 10:13:34 -07:00
4ca12e10e6 Merge branch 'ek/completion-use-read-r-to-read-literally'
The completion script used to use bare "read" without the "-r"
option to read the contents of various state files, which risked
getting confused with backslashes in them.  This has been
corrected.

* ek/completion-use-read-r-to-read-literally:
  completion: suppress unwanted unescaping of `read`
2023-05-02 10:13:34 -07:00
31885f64e9 test-ctype: check EOF
The character classifiers are supposed to allow passing EOF to them, a
negative value.  It isn't part of any character class.  Extend the tests
to cover that.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 09:25:54 -07:00
cf9cd8b55c fsck: use local repository
In 0d30feef3c (fsck: create scaffolding for rev-index checks,
2023-04-17) and later 5a6072f631 (fsck: validate .rev file header,
2023-04-17), the check_pack_rev_indexes() method was created with a
'struct repository *r' parameter. However, this parameter was unused and
instead 'the_repository' was used in its place.

Fix this situation with the obvious replacement.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 08:48:23 -07:00
756f1bcd29 fsck: verify checksums of all .bitmap files
If a filesystem-level corruption occurs in a .bitmap file, Git can react
poorly. This could take the form of a run-time error due to failing to
parse an EWAH bitmap or be more subtle such as returning the wrong set
of objects to a fetch or clone.

A natural first response to either of these kinds of errors is to run
'git fsck' to see if any files are corrupt. This currently ignores all
.bitmap files.

Add checks to 'git fsck' for all .bitmap files that are currently
associated with a multi-pack-index or pack file. Verify their checksums
using the hashfile API.

We iterate through all multi-pack-indexes and pack-files to be sure to
check all .bitmap files, not just the one that would be read by the
process. For example, a multi-pack-index bitmap overrules a pack-bitmap.
However, if the multi-pack-index is removed, the pack-bitmap may be
selected instead. Be thorough to include every file that could become
active in such a way. This includes checking files in alternates.

There is potential that we could extend this effort to check the
structure of the reachability bitmaps themselves, but it is very
expensive to do so. At minimum, it's as expensive as generating the
bitmaps in the first place, and that's assuming that we don't use the
trivial algorithm of verifying each bitmap individually. The trivial
algorithm will result in quadratic behavior (number of objects times
number of bitmapped commits) while the bitmap building operation
constructs a lattice of commits to build bitmaps incrementally and then
generate the final bitmaps from a subset of those commits.

If we were to extend 'git fsck' to check .bitmap file contents more
closely like this, then we would likely want to hide it behind an option
that signals the user is more willing to do expensive operations such as
this.

For testing, set up a repository with a pack-bitmap _and_ a
multi-pack-index bitmap. This requires some file movement to avoid
deleting the pack-bitmap during the repack that creates the
multi-pack-index bitmap. We can then verify that 'git fsck' is checking
all files, not just the "active" bitmap.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 08:48:22 -07:00
cbb83daeaf doc: interpret-trailers: fix example
We need to provide `--trailer sign` since the command won’t output
anything if you don’t give it an input and/or a
`--trailer`. Furthermore, the message which already contains an s-o-b is
wrong:

    $ git interpret-trailers --trailer sign <msg.txt
    Signed-off-by: Alice <alice@example.com>

    Signed-off-by: Alice <alice@example.com>

This can’t be what was originally intended.

So change the messages in this example to use the typical
“subject/message” file.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 13:26:42 -07:00
f68c26873d doc: interpret-trailers: don’t use deprecated config
`command` has been deprecated since commit c364b7ef51 (trailer: add new
.cmd config option, 2021-05-03).

Use the commit message of c364b7ef51 as a guide to replace the use of
`$ARG` and to use a script instead of an inline command.[1] Also,
explicitly trigger the command by passing in `--trailer=see`, since
this config is not automatically used.[2]

[1]: “Instead of "$ARG", users can refer to the value as positional
   argument, $1, in their scripts.”
[2]: “At the same time, in order to allow `git interpret-trailers` to
   better simulate the behavior of `git command -s`,
   'trailer.<token>.cmd' will not automatically execute.”

Acked-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 13:26:41 -07:00
b032a2bfe7 doc: interpret-trailers: use input redirection
Use input redirection instead of invoking cat(1) on a single file. This
is more straightforward, saves a process, and often makes the line
shorter.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 13:26:41 -07:00
c892bcc944 doc: interpret-trailers: don’t use heredoc in examples
This file contains four instances of trailing spaces from its inception
in commit [1]. These spaces might be intentional, since a user would be
prompted with `> ` in an interactive session. On the one hand, this is a
whitespace error according to `git diff --check`; on the other hand, the
raw documentation—it makes no difference in the rendered output—is just
staying faithful to the simulation of the interactive prompt.

Let’s get rid of these whitespace errors and also make the examples more
friendly to cut-and-paste by replacing the heredocs with files which are
shown with cat(1).

[1]: dfd66ddf5a (Documentation: add documentation for 'git
    interpret-trailers', 2014-10-13)

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 13:26:41 -07:00
e35f202b45 setup: trace bare repository setups
safe.bareRepository=explicit is a safer default mode of operation, since
it guards against the embedded bare repository attack [1]. Most end
users don't use bare repositories directly, so they should be able to
set safe.bareRepository=explicit, with the expectation that they can
reenable bare repositories by specifying GIT_DIR or --git-dir.

However, the user might use a tool that invokes Git on bare repositories
without setting GIT_DIR (e.g. "go mod" will clone bare repositories
[2]), so even if a user wanted to use safe.bareRepository=explicit, it
wouldn't be feasible until their tools learned to set GIT_DIR.

To make this transition easier, add a trace message to note when we
attempt to set up a bare repository without setting GIT_DIR. This allows
users and tool developers to audit which of their tools are problematic
and report/fix the issue.  When they are sufficiently confident, they
would switch over to "safe.bareRepository=explicit".

Note that this uses trace2_data_string(), which isn't supported by the
"normal" GIT_TRACE2 target, only _EVENT or _PERF.

[1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com/
[2] https://go.dev/ref/mod

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 11:20:33 -07:00
0a3a972c16 contrib/credential: embiggen fixed-size buffer in wincred
As in previous commits, harden the wincred credential helper against the
aforementioned protocol injection attack.

Unlike the approached used for osxkeychain and libsecret, where a
fixed-size buffer was replaced with `getline()`, we must take a
different approach here. There is no `getline()` equivalent in Windows,
and the function is not available to us with ordinary compiler settings.

Instead, allocate a larger (still fixed-size) buffer in which to process
each line. The value of 100 KiB is chosen to match the maximum-length
header that curl will allow, CURL_MAX_HTTP_HEADER.

To ensure that we are reading complete lines at a time, and that we
aren't susceptible to a similar injection attack (albeit with more
padding), ensure that each read terminates at a newline (i.e., that no
line is more than 100 KiB long).

Note that it isn't sufficient to turn the old loop into something like:

    while (len && strchr("\r\n", buf[len - 1])) {
      buf[--len] = 0;
      ends_in_newline = 1;
    }

because if an attacker sends something like:

    [aaaaa.....]\r
    host=example.com\r\n

the credential helper would fill its buffer after reading up through the
first '\r', call fgets() again, and then see "host=example.com\r\n" on
its line.

Note that the original code was written in a way that would trim an
arbitrary number of "\r" and "\n" from the end of the string. We should
get only a single "\n" (since the point of `fgets()` is to return the
buffer to us when it sees one), and likewise would not expect to see
more than one associated "\r". The new code trims a single "\r\n", which
matches the original intent.

[1]: https://curl.se/libcurl/c/CURLOPT_HEADERFUNCTION.html

Tested-by: Matthew John Cheetham <mjcheetham@outlook.com>
Helped-by: Matthew John Cheetham <mjcheetham@outlook.com>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:02 -07:00
64f1e658e9 contrib/credential: avoid fixed-size buffer in libsecret
The libsecret credential helper reads the newline-delimited
protocol stream one line at a time by repeatedly calling fgets() into a
fixed-size buffer, and is thus affected by the vulnerability described
in the previous commit.

To mitigate this attack, avoid using a fixed-size buffer, and instead
rely on getline() to allocate a buffer as large as necessary to fit the
entire content of the line, preventing any protocol injection.

In most parts of Git we don't assume that every platform has getline().
But libsecret is primarily used on Linux, where we do already assume it
(using a knob in config.mak.uname). POSIX also added getline() in 2008,
so we'd expect other recent Unix-like operating systems to have it
(e.g., FreeBSD also does).

Note that the buffer was already allocated on the heap in this case, but
we'll swap `g_free()` for `free()`, since it will now be allocated by
the system `getline()`, rather than glib's `g_malloc()`.

Tested-by: Jeff King <peff@peff.net>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:02 -07:00
de2fb99006 contrib/credential: .gitignore libsecret build artifacts
The libsecret credential helper does not mark its build artifact as
ignored, so running "make" results in a dirty working tree.

Mark the "git-credential-libsecret" binary as ignored to avoid the above.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:02 -07:00
048b673d72 contrib/credential: remove 'gnome-keyring' credential helper
libgnome-keyring was deprecated in 2014 (in favor of libsecret), more
than nine years ago [1].

The credential helper implemented using libgnome-keyring has had a small
handful of commits since 2013, none of which implemented or changed any
functionality. The last commit to do substantial work in this area was
15f7221686 (contrib/git-credential-gnome-keyring.c: support really
ancient gnome-keyring, 2013-09-23), just shy of nine years ago.

This credential helper suffers from the same `fgets()`-related injection
attack (using the new "wwwauth[]" feature) as in the previous commit.
Instead of patching it, let's remove this helper as deprecated.

[1]: https://mail.gnome.org/archives/commits-list/2014-January/msg01585.html

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:01 -07:00
5747c8072b contrib/credential: avoid fixed-size buffer in osxkeychain
The macOS Keychain-based credential helper reads the newline-delimited
protocol stream one line at a time by repeatedly calling fgets() into a
fixed-size buffer, and is thus affected by the vulnerability described
in the previous commit.

To mitigate this attack, avoid using a fixed-size buffer, and instead
rely on getline() to allocate a buffer as large as necessary to fit the
entire content of the line, preventing any protocol injection.

We solved a similar problem in a5bb10fd5e (config: avoid fixed-sized
buffer when renaming/deleting a section, 2023-04-06) by switching to
strbuf_getline(). We can't do that here because the contrib helpers do
not link with the rest of Git, and so can't use a strbuf. But we can use
the system getline() directly, which works similarly.

In most parts of Git we don't assume that every platform has getline().
But this helper is run only on OS X, and that platform added support in
10.7 ("Lion") which was released in 2011.

Tested-by: Taylor Blau <me@ttaylorr.com>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:01 -07:00
71201ab0e5 t/lib-credential.sh: ensure credential helpers handle long headers
Add a test ensuring that the "wwwauth[]" field cannot be used to
inject malicious data into the credential helper stream.

Many of the credential helpers in contrib/credential read the
newline-delimited protocol stream one line at a time by repeatedly
calling fgets() into a fixed-size buffer.

This assumes that each line is no more than 1024 characters long, since
each iteration of the loop assumes that it is parsing starting at the
beginning of a new line in the stream. However, similar to a5bb10fd5e
(config: avoid fixed-sized buffer when renaming/deleting a section,
2023-04-06), if a line is longer than 1024 characters, a malicious actor
can embed another command within an existing line, bypassing the usual
checks introduced in 9a6bbee800 (credential: avoid writing values with
newlines, 2020-03-11).

As with the problem fixed in that commit, specially crafted input can
cause the helper to return the credential for the wrong host, letting an
attacker trick the victim into sending credentials for one host to
another.

Luckily, all parts of the credential helper protocol that are available
in a tagged release of Git are immune to this attack:

  - "protocol" is restricted to known values, and is thus immune.

  - "host" is immune because curl will reject hostnames that have a '='
    character in them, which would be required to carry out this attack.

  - "username" is immune, because the buffer characters to fill out the
    first `fgets()` call would pollute the `username` field, causing the
    credential helper to return nothing (because it would match a
    username if present, and the username of the credential to be stolen
    is likely not 1024 characters).

  - "password" is immune because providing a password instructs
    credential helpers to avoid filling credentials in the first place.

  - "path" is similar to username; if present, it is not likely to match
    any credential the victim is storing. It's also not enabled by
    default; the victim would have to set credential.useHTTPPath
    explicitly.

However, the new "wwwauth[]" field introduced via 5f2117b24f
(credential: add WWW-Authenticate header to cred requests, 2023-02-27)
can be used to inject data into the credential helper stream. For
example, running:

    {
      printf 'HTTP/1.1 401\r\n'
      printf 'WWW-Authenticate: basic realm='
      perl -e 'print "a" x 1024'
      printf 'host=victim.com\r\n'
    } | nc -Nlp 8080

in one terminal, and then:

    git clone http://localhost:8080

in another would result in a line like:

    wwwauth[]=basic realm=aaa[...]aaahost=victim.com

being sent to the credential helper. If we tweak that "1024" to align
our output with the helper's buffer size and the rest of the data on the
line, it can cause the helper to see "host=victim.com" on its own line,
allowing motivated attackers to exfiltrate credentials belonging to
"victim.com".

The below test demonstrates these failures and provides us with a test
to ensure that our fix is correct. That said, it has a couple of
shortcomings:

  - it's in t0303, since that's the only mechanism we have for testing
    random helpers. But that means nobody is going to run it under
    normal circumstances.

  - to get the attack right, it has to line up the stuffed name with the
    buffer size, so we depend on the exact buffer size. I parameterized
    it so it could be used to test other helpers, but in practice it's
    not likely for anybody to do that.

Still, it's the best we can do, and will help us confirm the presence of
the problem (and our fixes) in the new few patches.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:01 -07:00
16b305cd2b credential.c: store "wwwauth[]" values in credential_read()
Teach git-credential to read "wwwauth[]" value(s) when parsing the
output of a credential helper.

These extra headers are not needed for Git's own HTTP support to use the
feature internally, but the feature would not be available for a
scripted caller (say, git-remote-mediawiki providing the header in the
same way).

As a bonus, this also makes it easier to use wwwauth[] in synthetic
credential inputs in our test suite.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:00 -07:00
3a7a18a045 send-email: detect empty blank lines in command output
The email format does not allow blank lines in headers; detect such
input and report it as malformed and add a test for it.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 08:55:52 -07:00
ba92106e93 send-email: add --header-cmd, --no-header-cmd options
Sometimes, adding a header different than CC or TO is desirable; for
example, when using Debbugs, it is best to use 'X-Debbugs-Cc' headers
to keep people in CC; this is an example use case enabled by the new
'--header-cmd' option.

The header unfolding logic is extracted to a subroutine so that it can
be reused; a test is added for coverage.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 08:55:52 -07:00
03056ce796 send-email: extract execute_cmd from recipients_cmd
This refactor is to pave the way for the addition of the new
'--header-cmd' option to the send-email command.

Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 08:55:52 -07:00
8bb19c14fb t/t3501-revert-cherry-pick.sh: clarify scope of the file
The file started out as a test for picks and reverts with renames, but
has been subsequently populated with all kinds of basic tests, in
accordance with its generic name. Adjust the description to reflect
that.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 08:24:58 -07:00
fc23c397c7 Merge branch 'tb/enable-cruft-packs-by-default'
When "gc" needs to retain unreachable objects, packing them into
cruft packs (instead of exploding them into loose object files) has
been offered as a more efficient option for some time.  Now the use
of cruft packs has been made the default and no longer considered
an experimental feature.

* tb/enable-cruft-packs-by-default:
  repository.h: drop unused `gc_cruft_packs`
  builtin/gc.c: make `gc.cruftPacks` enabled by default
  t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  t/t6500-gc.sh: add additional test cases
  t/t6500-gc.sh: refactor cruft pack tests
  t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  t/t5304-prune.sh: prepare for `gc --cruft` by default
  builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  builtin/repack.c: fix incorrect reference to '-C'
  pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-28 16:03:03 -07:00
48d89b51b3 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-28 16:03:03 -07:00
aabc69885e Merge branch 'jk/gpg-trust-level-fix'
The "%GT" placeholder for the "--format" option of "git log" and
friends caused BUG() to trigger on a commit signed with an unknown
key, which has been corrected.

* jk/gpg-trust-level-fix:
  gpg-interface: set trust level of missing key to "undefined"
2023-04-28 16:03:03 -07:00
839ebad442 send-email docs: Remove mention of discontinued gmail feature
Support for "less secure apps" ended May 30, 2022.

This effectively reverts 155067a (git-send-email.txt: mention less secure
app access with Gmail, 2021-01-08).

Signed-off-by: Jouke Witteveen <j.witteveen@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-28 13:46:58 -07:00
b734fe49fd messages: capitalization and punctuation exceptions
These are conscious violations of the usual rules for error messages,
based on this reasoning:

 - If an error message is directly followed by another sentence, it
   needs to be properly terminated with a period, lest the grammar
   looks broken and becomes hard to read.

 - That second sentence isn't actually an error message any more, so
   it should abide to conventional language rules for good looks and
   legibility. Arguably, these should be converted to advice
   messages (which the user can squelch, too), but that's a much
   bigger effort to get right.

 - Neither of these apply to the first hunk in do_exec(), but this
   two-line message looks just too much like a real sentence to not
   terminate it. Also, leaving it alone would make it asymmetrical
   to the other hunk.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-28 12:06:27 -07:00
d45cbe3fe0 sequencer: actually translate report in do_exec()
N_() is meant to be used on strings that are subsequently _()'d, which
isn't the case here.

The affected construct is a bit questionable from an i18n perspective,
as it pieces together a sentence from separate strings. However, it
doesn't appear to be that bad, as the "assembly instructions" are in a
translatable message as well. Lacking specific complaints from
translators, it doesn't seem worth changing this.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-28 12:03:40 -07:00
3bd0097cfc cocci: codify authoring and reviewing practices
These practices largely reflect what we are already doing on the mailing
list, which should help new Coccinelle authors and reviewers get up to
speed.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 16:49:15 -07:00
bd111141aa cocci: add headings to and reword README
- Drop "examples" since we actually use the patches.
- Drop sentences that could be headings instead

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 16:49:15 -07:00
f85cd430b1 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 16:00:59 -07:00
57a3b971e9 Merge branch 'fc/doc-checkout-markup-updates'
Doc mark-up update.

* fc/doc-checkout-markup-updates:
  doc: git-checkout: reorganize examples
  doc: git-checkout: trivial callout cleanup
2023-04-27 16:00:59 -07:00
d6661e6843 Merge branch 'fc/doc-use-datestamp-in-commit'
Instead of the time the formatter was run, show the timestamp
recorded in the commit in the documentation.

* fc/doc-use-datestamp-in-commit:
  doc: set actual revdate for manpages
2023-04-27 16:00:59 -07:00
a02675ad90 Merge branch 'ds/fsck-pack-revindex'
"git fsck" learned to validate the on-disk pack reverse index files.

* ds/fsck-pack-revindex:
  fsck: validate .rev file header
  fsck: check rev-index position values
  fsck: check rev-index checksums
  fsck: create scaffolding for rev-index checks
2023-04-27 16:00:59 -07:00
849c8b3dbf Merge branch 'tb/pack-revindex-on-disk'
The on-disk reverse index that allows mapping from the pack offset
to the object name for the object stored at the offset has been
enabled by default.

* tb/pack-revindex-on-disk:
  t: invert `GIT_TEST_WRITE_REV_INDEX`
  config: enable `pack.writeReverseIndex` by default
  pack-revindex: introduce `pack.readReverseIndex`
  pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK
  pack-revindex: make `load_pack_revindex` take a repository
  t5325: mark as leak-free
  pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-27 16:00:59 -07:00
90ef0f14eb parse_commit(): describe more date-parsing failure modes
The previous few commits improved the parsing of dates in malformed
commit objects. But there's one big case left implicit: we may still
feed garbage to parse_timestamp(). This is preferable to trying to be
more strict, but let's document the thinking in a comment.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 09:31:46 -07:00
089d9adff6 parse_commit(): handle broken whitespace-only timestamp
The comment in parse_commit_date() claims that parse_timestamp() will
not walk past the end of the buffer we've been given, since it will hit
the newline at "eol" and stop. This is usually true, when dateptr
contains actual numbers to parse. But with a line like:

   committer name <email>   \n

with just whitespace, and no numbers, parse_timestamp() will consume
that newline as part of the leading whitespace, and we may walk past our
"tail" pointer (which itself is set from the "size" parameter passed in
to parse_commit_buffer()).

In practice this can't cause us to walk off the end of an array, because
we always add an extra NUL byte to the end of objects we load from disk
(as a defense against exactly this kind of bug). However, you can see
the behavior in action when "committer" is the final header (which it
usually is, unless there's an encoding) and the subject line can be
parsed as an integer. We walk right past the newline on the committer
line, as well as the "\n\n" separator, and mistake the subject for the
timestamp.

We can solve this by trimming the whitespace ourselves, making sure that
it has some non-whitespace to parse. Note that we need to be a bit
careful about the definition of "whitespace" here, as our isspace()
doesn't match exotic characters like vertical tab or formfeed. We can
work around that by checking for an actual number (see the in-code
comment). This is slightly more restrictive than the current code, but
in practice the results are either the same (we reject "foo" as "0", but
so would parse_timestamp()) or extremely unlikely even for broken
commits (parse_timestamp() would allow "\v123" as "123", but we'll now
make it "0").

I did also allow "-" here, which may be controversial, as we don't
currently support negative timestamps. My reasoning was two-fold. One,
the design of parse_timestamp() is such that we should be able to easily
switch it to handling signed values, and this otherwise creates a
hard-to-find gotcha that anybody doing that work would get tripped up
on. And two, the status quo is that we currently parse them, though the
result of course ends up as a very large unsigned value (which is likely
to just get clamped to "0" for display anyway, since our date routines
can't handle it).

The new test checks the commit parser (via "--until") for both vanilla
spaces and the vertical-tab case. I also added a test to check these
against the pretty-print formatter, which uses split_ident_line().  It's
not subject to the same bug, because it already insists that there be
one or more digits in the timestamp.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 08:53:53 -07:00
ea1615dfdd parse_commit(): parse timestamp from end of line
To find the committer timestamp, we parse left-to-right looking for the
closing ">" of the email, and then expect the timestamp right after
that. But we've seen some broken cases in the wild where this fails, but
we _could_ find the timestamp with a little extra work. E.g.:

  Name <Name<email>> 123456789 -0500

This means that features that rely on the committer timestamp, like
--since or --until, will treat the commit as happening at time 0 (i.e.,
1970).

This is doubly confusing because the pretty-print parser learned to
handle these in 03818a4a94 (split_ident: parse timestamp from end of
line, 2013-10-14). So printing them via "git show", etc, makes
everything look normal, but --until, etc are still broken (despite the
fact that that commit explicitly mentioned --until!).

So let's use the same trick as 03818a4a94: find the end of the line, and
parse back to the final ">". In theory we could use split_ident_line()
here, but it's actually a bit more strict. In particular, it requires a
valid time-zone token, too. That should be present, of course, but we
wouldn't want to break --until for cases that are working currently.

We might want to teach split_ident_line() to become more lenient there,
but it would require checking its many callers (since right now they can
assume that if date_start is non-NULL, so is tz_start).

So for now we'll just reimplement the same trick in the commit parser.

The test is in t4212, which already covers similar cases, courtesy of
03818a4a94. We'll just adjust the broken commit to munge both the author
and committer timestamps. Note that we could match (author|committer)
here, but alternation can't be used portably in sed. Since we wouldn't
expect to see ">" except as part of an ident line, we can just match
that character on any line.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 08:53:35 -07:00
2063b86b81 t4212: avoid putting git on left-hand side of pipe
We wouldn't expect cat-file to fail here, but it's good practice to
avoid putting git on the upstream of a pipe, as we otherwise ignore its
exit code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 08:53:32 -07:00
60ff56f503 banned.h: mark strtok() and strtok_r() as banned
`strtok()` has a couple of drawbacks that make it undesirable to have
any new instances. In addition to being thread-unsafe, it also
encourages confusing data flows, where `strtok()` may be called from
multiple functions with its first argument as NULL, making it unclear
from the immediate context which string is being tokenized.

Now that we have removed all instances of `strtok()` from the tree,
let's ban `strtok()` to avoid introducing new ones in the future. If new
callers should arise, they are encouraged to use
`string_list_split_in_place()` (and `string_list_remove_empty_items()`,
if applicable).

string_list_split_in_place() is not a perfect drop-in replacement
for `strtok_r()`, particularly if the caller is processing a string with
an arbitrary number of tokens, and wants to process each token one at a
time.

But there are no instances of this in Git's tree which are more
well-suited to `strtok_r()` than the friendlier
`string_list_split_in_place()`, so ban `strtok_r()`, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 08:51:11 -07:00
10e8a52ef1 negotiator/skipping: fix some problems in mark_common()
The mark_common() method in negotiator/skipping.c was converted
from recursive to iterative in 4654134976 (negotiator/skipping:
avoid stack overflow, 2022-10-25), but there is some more work
to do:

1. prio_queue() should be used with clear_prio_queue(), otherwise there
   will be a memory leak.
2. It does not do duplicate protection before prio_queue_put().
   (The COMMON bit would work here, too.)
3. When it translated from recursive to iterative it kept "return"
   statements that should probably be "continue" statements.
4. It does not attempt to parse commits, and instead returns
   immediately when finding an unparsed commit. This is something
   that it did in its original version, so maybe it is by design,
   but it doesn't match the doc comment for the method.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-26 10:38:57 -07:00
8e21ff5edb negotiator/default: avoid stack overflow
mark_common() in negotiator/default.c may overflow the stack due to
recursive function calls. Avoid this by instead recursing using a
heap-allocated data structure.

This is the same case as 4654134976 (negotiator/skipping: avoid
stack overflow, 2022-10-25)

Reported-by: Xin Xing <xingxin.xx@bytedance.com>
Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-26 10:38:54 -07:00
382a946414 Handle some compiler versions containing a dash
The version reported by e.g. x86_64-w64-mingw32-gcc on Debian bullseye
looks like:
  gcc version 10-win32 20210110 (GCC)

This ends up with detect-compiler failing with:
  ./detect-compiler: 30: test: Illegal number: 10-win32

This change removes the two known suffixes known to exist in GCC versions
in Debian: -win32 and -posix.

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-26 09:20:50 -07:00
5f0e37b4c1 doc: GIT_DEFAULT_HASH is and will be ignored during "clone"
The phrasing "is currently ignored" was prone to be misinterpreted
as if we were wishing if it were honored.  Rephrase it to make it
clear that the experimental variable will be ignored.

In the longer term, after/when we allow incremental/over-the-wire
migration of the object-format, i.e. cloning from an SHA-1
repository to create an SHA-256 repository (or vice versa) and
fetching and pushing between them would bidirectionally convert the
object format on the fly, it is likely that we would teach a new
option "--object-format" to "git clone" to say "you would use
whatever object format the origin uses by default, but this time, I
am telling you to use this format on our side, doing on-the-fly
object format conversion as needed".  So it is perfectly OK to
ignore the settings of this experimental variable, even after such
an extension happens that makes it necessary for us to have a way to
create a new repository that uses different object format from the
origin repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-26 08:17:04 -07:00
2807bd2c10 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25 13:56:20 -07:00
36628c56ed Merge branch 'ps/fix-geom-repack-with-alternates'
Geometric repacking ("git repack --geometric=<n>") in a repository
that borrows from an alternate object database had various corner
case bugs, which have been corrected.

* ps/fix-geom-repack-with-alternates:
  repack: disable writing bitmaps when doing a local repack
  repack: honor `-l` when calculating pack geometry
  t/helper: allow chmtime to print verbosely without modifying mtime
  pack-objects: extend test coverage of `--stdin-packs` with alternates
  pack-objects: fix error when same packfile is included and excluded
  pack-objects: fix error when packing same pack twice
  pack-objects: split out `--stdin-packs` tests into separate file
  repack: fix generating multi-pack-index with only non-local packs
  repack: fix trying to use preferred pack in alternates
  midx: fix segfault with no packs and invalid preferred pack
2023-04-25 13:56:20 -07:00
c4c9d5586f Merge branch 'rj/send-email-validate-hook-count-messages'
The sendemail-validate validate hook learned to pass the total
number of input files and where in the sequence each invocation is
via environment variables.

* rj/send-email-validate-hook-count-messages:
  send-email: export patch counters in validate environment
2023-04-25 13:56:20 -07:00
80d268f309 Merge branch 'jk/protocol-cap-parse-fix'
The code to parse capability list for v0 on-wire protocol fell into
an infinite loop when a capability appears multiple times, which
has been corrected.

* jk/protocol-cap-parse-fix:
  v0 protocol: use size_t for capability length/offset
  t5512: test "ls-remote --heads --symref" filtering with v0 and v2
  t5512: allow any protocol version for filtered symref test
  t5512: add v2 support for "ls-remote --symref" test
  v0 protocol: fix sha1/sha256 confusion for capabilities^{}
  t5512: stop referring to "v1" protocol
  v0 protocol: fix infinite loop when parsing multi-valued capabilities
2023-04-25 13:56:20 -07:00
0807e57807 Merge branch 'en/header-split-cache-h'
Header clean-up.

* en/header-split-cache-h: (24 commits)
  protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
  mailmap, quote: move declarations of global vars to correct unit
  treewide: reduce includes of cache.h in other headers
  treewide: remove double forward declaration of read_in_full
  cache.h: remove unnecessary includes
  treewide: remove cache.h inclusion due to pager.h changes
  pager.h: move declarations for pager.c functions from cache.h
  treewide: remove cache.h inclusion due to editor.h changes
  editor: move editor-related functions and declarations into common file
  treewide: remove cache.h inclusion due to object.h changes
  object.h: move some inline functions and defines from cache.h
  treewide: remove cache.h inclusion due to object-file.h changes
  object-file.h: move declarations for object-file.c functions from cache.h
  treewide: remove cache.h inclusion due to git-zlib changes
  git-zlib: move declarations for git-zlib functions from cache.h
  treewide: remove cache.h inclusion due to object-name.h changes
  object-name.h: move declarations for object-name.c functions from cache.h
  treewide: remove unnecessary cache.h inclusion
  treewide: be explicit about dependence on mem-pool.h
  treewide: be explicit about dependence on oid-array.h
  ...
2023-04-25 13:56:20 -07:00
9ce9dea4e1 Sync with Git 2.40.1 2023-04-24 22:31:32 -07:00
a2742f8c59 t/helper/test-json-writer.c: avoid using strtok()
Apply similar treatment as in the previous commit to remove usage of
`strtok()` from the "oidmap" test helper.

Each of the different commands that the "json-writer" helper accepts
pops the next space-delimited token from the current line and interprets
it as a string, integer, or double (with the exception of the very first
token, which is the command itself).

To accommodate this, split the line in place by the space character, and
pass the corresponding string_list to each of the specialized `get_s()`,
`get_i()`, and `get_d()` functions.

`get_i()` and `get_d()` are thin wrappers around `get_s()` that convert
their result into the appropriate type by either calling `strtol()` or
`strtod()`, respectively. In `get_s()`, we mark the token as "consumed"
by incrementing the `consumed_nr` counter, indicating how many tokens we
have read up to that point.

Because each of these functions needs the string-list parts, the number
of tokens consumed, and the line number, these three are wrapped up in
to a struct representing the line state.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
deeabc1ff0 t/helper/test-oidmap.c: avoid using strtok()
Apply similar treatment as in the previous commit to remove usage of
`strtok()` from the "oidmap" test helper.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
826f0e33ab t/helper/test-hashmap.c: avoid using strtok()
Avoid using the non-reentrant `strtok()` to separate the parts of each
incoming command. Instead of replacing it with `strtok_r()`, let's
instead use the more friendly pair of `string_list_split_in_place()` and
`string_list_remove_empty_items()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
492ba81346 string-list: introduce string_list_setlen()
It is sometimes useful to reduce the size of a `string_list`'s list of
items without having to re-allocate them. For example, doing the
following:

    struct strbuf buf = STRBUF_INIT;
    struct string_list parts = STRING_LIST_INIT_NO_DUP;
    while (strbuf_getline(&buf, stdin) != EOF) {
      parts.nr = 0;
      string_list_split_in_place(&parts, buf.buf, ":", -1);
      /* ... */
    }
    string_list_clear(&parts, 0);

is preferable over calling `string_list_clear()` on every iteration of
the loop. This is because `string_list_clear()` causes us free our
existing `items` array. This means that every time we call
`string_list_split_in_place()`, the string-list internals re-allocate
the same size array.

Since in the above example we do not care about the individual parts
after processing each line, it is much more efficient to pretend that
there aren't any elements in the `string_list` by setting `list->nr` to
0 while leaving the list of elements allocated as-is.

This allows `string_list_split_in_place()` to overwrite any existing
entries without needing to free and re-allocate them.

However, setting `list->nr` manually is not safe in all instances. There
are a couple of cases worth worrying about:

  - If the `string_list` is initialized with `strdup_strings`,
    truncating the list can lead to overwriting strings which are
    allocated elsewhere. If there aren't any other pointers to those
    strings other than the ones inside of the `items` array, they will
    become unreachable and leak.

    (We could ourselves free the truncated items between
    string_list->items[nr] and `list->nr`, but no present or future
    callers would benefit from this additional complexity).

  - If the given `nr` is larger than the current value of `list->nr`,
    we'll trick the `string_list` into a state where it thinks there are
    more items allocated than there actually are, which can lead to
    undefined behavior if we try to read or write those entries.

Guard against both of these by introducing a helper function which
guards assignment of `list->nr` against each of the above conditions.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
52acddf36c string-list: multi-delimiter string_list_split_in_place()
Enhance `string_list_split_in_place()` to accept multiple characters as
delimiters instead of a single character.

Instead of using `strchr(2)` to locate the first occurrence of the given
delimiter character, `string_list_split_in_place_multi()` uses
`strcspn(2)` to move past the initial segment of characters comprised of
any characters in the delimiting set.

When only a single delimiting character is provided, `strpbrk(2)` (which
is implemented with `strcspn(2)`) has equivalent performance to
`strchr(2)`. Modern `strcspn(2)` implementations treat an empty
delimiter or the singleton delimiter as a special case and fall back to
calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)`
this way.

This change is one step to removing `strtok(2)` from the tree. Note that
`string_list_split_in_place()` is not a strict replacement for
`strtok()`, since it will happily turn sequential delimiter characters
into empty entries in the resulting string_list. For example:

    string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1)

would yield a string list of:

    ["foo", "", "", "bar", "", "", "baz"]

Callers that wish to emulate the behavior of strtok(2) more directly
should call `string_list_remove_empty_items()` after splitting.

To avoid regressions for the new multi-character delimter cases, update
t0063 in this patch as well.

[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35
[2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
603d0fdce2 blame: use different author name for fake commit generated by --contents
When the --contents option is used with git blame, and the contents of
the file have lines which can't be annotated by the history being
blamed, the user will see an author of "Not Committed Yet". This is
similar to the way blame handles working tree contents when blaming
without a revision.

This is slightly confusing since this data isn't the working copy and
while it is technically "not committed yet", its also coming from an
external file. Replace this author name with "External file
(--contents)" to better differentiate such lines from actual working
copy lines.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 15:16:31 -07:00
3d77fbb664 t1300: add tests for missing keys
There are several tests in t1300-config.sh that validate failing
invocations of "git config".  However, there are no tests that check
what happens when "git config" is asked to retrieve a value for a
missing key.

Add tests that check this for various combinations of "<section>.<key>"
and "<section>.<subsection>.<key>".

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 15:10:50 -07:00
93f86046c9 t1300: check stderr for "ignores pairs" tests
Tests "git config ignores pairs ..." in t1300-config.sh validate that
"git config" ignores various kinds of supplied pairs of environment
variables GIT_CONFIG_KEY_* GIT_CONFIG_VALUE_* depending on
GIT_CONFIG_COUNT.  By "ignores" here we mean that "git config" abides by
the value of environment variable GIT_CONFIG_COUNT and doesn't use
key-value pairs outside of the supplied GIT_CONFIG_COUNT when trying to
produce a value for config key "pair.one".

These tests also validate that "git config" doesn't complain about
mismatched environment variables to standard error.  This is validated
by redirecting the standard error to a file called "error" and asserting
that it is empty.  However, two of these tests incorrectly redirect to
standard output while calling the file "error", and test 'git config
ignores pairs exceeding count' doesn't validate standard error at all.

Fix these tests by redirecting standard error to file "error" and
asserting its emptiness.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 15:10:50 -07:00
f7f9a836e2 t1300: drop duplicate test
There are two almost identical tests called 'git config ignores pairs
with zero count' in file t1300-config.sh.  Drop the first of these and
keep the one that contains more assertions.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 15:10:50 -07:00
000c4ceca7 merge-ort: fix calling merge_finalize() with no intermediate merge
If some code sets up the data structures for a merge, but then never
actually performs one before calling merge_finalize(), then
merge_finalize() wouldn't notice that result->priv was NULL and
return early, resulting in following that NULL pointer and getting
a segfault.  There is currently no code in the git codebase that does
this, but this issue was found during testing of some proposed patches
that had the following structure:

    struct merge_options merge_opt;
    struct merge_result result;

    init_merge_options(&merge_opt, the_repository);
    memset(&result, 0, sizeof(result));

    <do N merges, for some value of N>

    merge_finalize(&merge_opt, &result);

where some flags could cause the code to have N=0, i.e. doing no merges.
Add a check for result->priv being NULL and return early to avoid a
segfault in these kinds of cases.

While at it, ensure the FREE_AND_NULL() in the function does something
useful with the nulling aspect, namely sets result->priv to NULL rather
than a mere temporary.

Reported-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 14:04:07 -07:00
e3a3f5edf5 reftable: ensure git-compat-util.h is the first (indirect) include
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
0e312eaa12 diff.h: reduce unnecessary includes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
e3d2f20e6f object-store.h: reduce unnecessary includes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
d4a4f9291d commit.h: reduce unnecessary includes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
e1c382141d fsmonitor: reduce includes of cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
4c98cb8e35 cache.h: remove unnecessary headers
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
5e3f94dfe3 treewide: remove cache.h inclusion due to previous changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
53dca334d6 cache,tree: move basic name compare functions from read-cache to tree
None of base_name_compare(), df_name_compare(), or name_compare()
depended upon a cache_entry or index_state in any way.  By moving these
functions to tree.h, half a dozen other files can stop depending upon
cache.h (though that change will be made in a later commit).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
aabc5617cd cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
Since cmp_cache_name_compare() was comparing cache_entry structs, it
was associated with the cache rather than with trees.  Move the
function.  As a side effect, we can make cache_name_stage_compare()
static as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
d1cbe1e6d8 hash-ll.h: split out of hash.h to remove dependency on repository.h
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository->hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
23a517e415 tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
S_DIFFTREE_IFXMIN_NEQ is *only* used in tree-diff.c, so there is no
point exposing it in cache.h.  Move it to tree-diff.c.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
592fc5b349 dir.h: move DTYPE defines from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
3467663d47 versioncmp.h: move declarations for versioncmp.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
641223137b ws.h: move declarations for ws.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
d4ff2072ab match-trees.h: move declarations for match-trees.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
b388633c5c pkt-line.h: move declarations for pkt-line.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
9b5041f647 base85.h: move declarations for base85.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
d5fff46f40 copy.h: move declarations for copy.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
623b80bef2 server-info.h: move declarations for server-info.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
0ff73d742b packfile.h: move pack_window and pack_entry from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
cb2a51356d symlinks.h: move declarations for symlinks.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
69a63fe663 treewide: be explicit about dependence on strbuf.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
0b1a95ef79 fetch_bundle_uri(): drop pointless NULL check
We check if "uri" is NULL, but it cannot be since we'd have segfaulted
earlier in the function when we unconditionally called xstrdup() on it.

In theory we might want to soften that xstrdup() to handle this case,
but even before the code which added it via c23f592117 (bundle-uri:
fetch a list of bundles, 2022-10-12), we'd have fed NULL to
fetch_bundle_uri_internal(), which would also segfault.

The extra check isn't hurting anything, but it does cause Coverity to
complain, and it may mislead somebody reading the code into thinking
that a NULL uri is something we're prepared to handle.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 11:09:16 -07:00
ae6f064fd7 notes: clean up confusing NULL checks in init_notes()
Coverity complains that we check whether "notes_ref" is NULL, but it was
already implied to be non-NULL earlier in the function. And this is
true; since b9342b3fd6 (refs: add array of ref namespaces, 2022-08-05),
we call xstrdup(notes_ref) unconditionally, which would segfault if it
was NULL.

But that commit is actually doing the right thing. Even if NULL is
passed into the function, we'll use default_notes_ref() as a fallback,
which will never return NULL (it tries a few options, but its last
resort is a string literal). Ironically, the "!notes_ref" check was
added by the same commit that added the fallback: 709f79b089 (Notes
API: init_notes(): Initialize the notes tree from the given notes ref,
2010-02-13). So this check never did anything.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 11:09:13 -07:00
7580f92ffa The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-21 15:35:09 -07:00
b64894c206 Merge branch 'ow/ref-filter-omit-empty'
"git branch --format=..." and "git format-patch --format=..."
learns "--omit-empty" to hide refs that whose formatting result
becomes an empty string from the output.

* ow/ref-filter-omit-empty:
  branch, for-each-ref, tag: add option to omit empty lines
2023-04-21 15:35:05 -07:00
9e0d1aa495 Merge branch 'ah/format-patch-thread-doc'
Doc update.

* ah/format-patch-thread-doc:
  format-patch: correct documentation of --thread without an argument
2023-04-21 15:35:05 -07:00
7ac228c994 Merge branch 'rn/sparse-describe'
"git describe --dirty" learns to work better with sparse-index.

* rn/sparse-describe:
  describe: enable sparse index for describe
2023-04-21 15:35:04 -07:00
de73a20756 Merge branch 'rs/archive-from-subdirectory-fixes'
"git archive" run from a subdirectory mishandled attributes and
paths outside the current directory.

* rs/archive-from-subdirectory-fixes:
  archive: improve support for running in subdirectory
2023-04-21 15:35:04 -07:00
09a7b61c1d Merge branch 'fc/doc-stop-using-manversion'
Doc build simplification.

* fc/doc-stop-using-manversion:
  doc: simplify man version
2023-04-21 15:35:04 -07:00
a5c76569e7 credential: new attribute oauth_refresh_token
Git authentication with OAuth access token is supported by every popular
Git host including GitHub, GitLab and BitBucket [1][2][3]. Credential
helpers Git Credential Manager (GCM) and git-credential-oauth generate
OAuth credentials [4][5]. Following RFC 6749, the application prints a
link for the user to authorize access in browser. A loopback redirect
communicates the response including access token to the application.

For security, RFC 6749 recommends that OAuth response also includes
expiry date and refresh token [6]. After expiry, applications can use
the refresh token to generate a new access token without user
reauthorization in browser. GitLab and BitBucket set the expiry at two
hours [2][3]. (GitHub doesn't populate expiry or refresh token.)

However the Git credential protocol has no attribute to store the OAuth
refresh token (unrecognised attributes are silently discarded). This
means that the user has to regularly reauthorize the helper in browser.
On a browserless system, this is particularly intrusive, requiring a
second device.

Introduce a new attribute oauth_refresh_token. This is especially
useful when a storage helper and a read-only OAuth helper are configured
together. Recall that `credential fill` calls each helper until it has a
non-expired password.

```
[credential]
	helper = storage  # eg. cache or osxkeychain
	helper = oauth
```

The OAuth helper can use the stored refresh token forwarded by
`credential fill` to generate a fresh access token without opening the
browser. See
https://github.com/hickford/git-credential-oauth/pull/3/files
for an implementation tested with this patch.

Add support for the new attribute to credential-cache. Eventually, I
hope to see support in other popular storage helpers.

Alternatives considered: ask helpers to store all unrecognised
attributes. This seems excessively complex for no obvious gain.
Helpers would also need extra information to distinguish between
confidential and non-confidential attributes.

Workarounds: GCM abuses the helper get/store/erase contract to store the
refresh token during credential *get* as the password for a fictitious
host [7] (I wrote this hack). This workaround is only feasible for a
monolithic helper with its own storage.

[1] https://github.blog/2012-09-21-easier-builds-and-deployments-using-git-over-https-and-oauth/
[2] https://docs.gitlab.com/ee/api/oauth2.html#access-git-over-https-with-access-token
[3] https://support.atlassian.com/bitbucket-cloud/docs/use-oauth-on-bitbucket-cloud/#Cloning-a-repository-with-an-access-token
[4] https://github.com/GitCredentialManager/git-credential-manager
[5] https://github.com/hickford/git-credential-oauth
[6] https://datatracker.ietf.org/doc/html/rfc6749#section-5.1
[7] 66b94e489a/src/shared/GitLab/GitLabHostProvider.cs (L207)

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-21 09:38:30 -07:00
197152098a completion: suppress unwanted unescaping of read
The function `__git_eread`, which reads the first line from the file,
calls the `read` builtin without passing the flag option `-r`.  When
the `read` builtin is called without the flag `-r`, it processes the
backslash escaping in the text that it reads.  For this reason, it is
generally considered the best practice to always use the `read`
builtin with flag `-r` unless one intensionally processes the
backslash escaping.  For the present case in git-prompt.sh, in fact,
all the occurrences of the calls of `__git_eread` intend to read the
literal content of the first lines.

To make it read the first line literally, pass the flag `-r` to the
`read` builtin in the function `__git_eread`.

Signed-off-by: Edwin Kofler <edwin@kofler.dev>
Signed-off-by: Koichi Murase <myoga.murase@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-20 15:47:38 -07:00
138ef8068c cocci: remove 'unused.cocci'
When 'unused.cocci' was added in 4f40f6cb73 (cocci: add and apply a
rule to find "unused" strbufs, 2022-07-05) it found three unused
strbufs, and when it was generalized in the next commit it managed to
find an unused string_list as well.  That's four unused variables in
over 17 years, so apparently we rarely make this mistake.

Unfortunately, applying 'unused.cocci' is quite expensive, e.g. it
increases the from-scratch runtime of 'make coccicheck' by over 5:30
minutes or over 160%:

  $ make -s cocciclean
  $ time make -s coccicheck
      * new spatch flags

  real    8m56.201s
  user    0m0.420s
  sys     0m0.406s
  $ rm contrib/coccinelle/unused.cocci contrib/coccinelle/tests/unused.*
  $ make -s cocciclean
  $ time make -s coccicheck
      * new spatch flags

  real    3m23.893s
  user    0m0.228s
  sys     0m0.247s

That's a lot of runtime spent for not much in return, and arguably an
unused struct instance sneaking in is not that big of a deal to
justify the significantly increased runtime.

Remove 'unused.cocci', because we are not getting our CPU cycles'
worth.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-20 14:53:00 -07:00
ad353d7e77 gittutorial: wrap literal examples in backticks
Our coding guidelines prefer literal examples to be wrapped in
`backticks` to typeset them in monospace.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-20 14:34:08 -07:00
67ceed1f82 gittutorial: drop early mention of origin
We don't have an origin at this point in the tutorial, so "Your branch
is up to date" won't actually show up in the output of `git status`.

This line was introduced in 8942821ec0 ("gittutorial: fix output of 'git
status'", 2014-11-13) in what looks like a mistake -- that commit mostly
just wanted to remove leading '#' characters.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-20 14:34:07 -07:00
9c6990cca2 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-20 14:33:36 -07:00
a4a4db8cf7 Merge branch 'gc/better-error-when-local-clone-fails-with-symlink'
"git clone --local" stops copying from an original repository that
has symbolic links inside its $GIT_DIR; an error message when that
happens has been updated.

* gc/better-error-when-local-clone-fails-with-symlink:
  clone: error specifically with --local and symlinked objects
2023-04-20 14:33:36 -07:00
98c496fcd0 Merge branch 'ar/t2024-checkout-output-fix'
Test fix.

* ar/t2024-checkout-output-fix:
  t2024: fix loose/strict local base branch DWIM test
2023-04-20 14:33:36 -07:00
08bd076ce4 Merge branch 'rs/get-tar-commit-id-use-defined-const'
Code clean-up to replace a hardcoded constant with a CPP macro.

* rs/get-tar-commit-id-use-defined-const:
  get-tar-commit-id: use TYPEFLAG_GLOBAL_HEADER instead of magic value
2023-04-20 14:33:36 -07:00
fa9172c70a Merge branch 'rs/remove-approxidate-relative'
The approxidate() API has been simplified by losing an extra
function that did the same thing as another one.

* rs/remove-approxidate-relative:
  date: remove approxidate_relative()
2023-04-20 14:33:35 -07:00
cbfe844aa1 Merge branch 'rs/userdiff-multibyte-regex'
The userdiff regexp patterns for various filetypes that are built
into the system have been updated to avoid triggering regexp errors
from UTF-8 aware regex engines.

* rs/userdiff-multibyte-regex:
  userdiff: support regexec(3) with multi-byte support
2023-04-20 14:33:35 -07:00
a8022c5f7b send-email: expose header information to git-send-email's sendemail-validate hook
To allow further flexibility in the Git hook, the SMTP header
information of the email which git-send-email intends to send, is now
passed as the 2nd argument to the sendemail-validate hook.

As an example, this can be useful for acting upon keywords in the
subject or specific email addresses.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19 14:19:09 -07:00
56adddaa06 send-email: refactor header generation functions
Split process_file and send_message into easier to use functions.
Making SMTP header information widely available.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19 14:19:09 -07:00
7891e46585 gpg-interface: set trust level of missing key to "undefined"
In check_signature(), we initialize the trust_level field to "-1", with
the idea that if gpg does not return a trust level at all (if there is
no signature, or if the signature is made by an unknown key), we'll
use that value. But this has two problems:

  1. Since the field is an enum, it's up to the compiler to decide what
     underlying storage to use, and it only has to fit the values we've
     declared. So we may not be able to store "-1" at all. And indeed,
     on my system (linux with gcc), the resulting enum is an unsigned
     32-bit value, and -1 becomes 4294967295.

     The difference may seem academic (and you even get "-1" if you pass
     it to printf("%d")), but it means that code like this:

       status |= sigc->trust_level < configured_min_trust_level;

     does not necessarily behave as expected. This turns out not to be a
     bug in practice, though, because we keep the "-1" only when gpg did
     not report a signature from a known key, in which case the line
     above:

       status |= sigc->result != 'G';

     would always set status to non-zero anyway. So only a 'G' signature
     with no parsed trust level would cause a problem, which doesn't
     seem likely to trigger (outside of unexpected gpg behavior).

  2. When using the "%GT" format placeholder, we pass the value to
     gpg_trust_level_to_str(), which complains that the value is out of
     range with a BUG(). This behavior was introduced by 803978da49
     (gpg-interface: add function for converting trust level to string,
     2022-07-11). Before that, we just did a switch() on the enum, and
     anything that wasn't matched would end up as the empty string.

     Curiously, solving this by naively doing:

       if (level < 0)
               return "";

     in that function isn't sufficient. Because of (1) above, the
     compiler can (and does in my case) actually remove that conditional
     as dead code!

We can solve both by representing this state as an enum value. We could
do this by adding a new "unknown" value. But this really seems to match
the existing "undefined" level well. GPG describes this as "Not enough
information for calculation".

We have tests in t7510 that trigger this case (verifying a signature
from a key that we don't have, and then checking various %G
placeholders), but they didn't notice the BUG() because we didn't look
at %GT for that case! Let's make sure we check all %G placeholders for
each case in the formatting tests.

The interesting ones here are "show unknown signature with custom
format" and "show lack of signature with custom format", both of which
would BUG() before, and now turn %GT into "undefined". Prior to
803978da49 they would have turned it into the empty string, but I think
saying "undefined" consistently is a reasonable outcome, and probably
makes life easier for anyone parsing the output (and any such parser had
to be ready to see "undefined" already).

The other modified tests produce the same output before and after this
patch, but now we're consistently checking both %G? and %GT in all of
them.

Signed-off-by: Jeff King <peff@peff.net>
Reported-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19 08:30:54 -07:00
8dda6c3de2 doc: git-checkout: reorganize examples
The examples are an ordered list, however, they are complex enough that
a callout is inside example 1, and that confuses the parsers as the list
continuation (`+`) is unclear (are we continuing the previous list item,
or the previous callout?).

We could use an open block as the asciidoctor documentation suggests,
but that has a tiny formatting issue (a newline is missing).

To simplify things for everyone (the reader, the writer, and the parser)
let's use subsections.

After this change, the HTML documentation generated with asciidoc has
the right indentation.

Cc: Jeff King <peff@peff.net>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 15:47:13 -07:00
f8bc75a55e doc: git-checkout: trivial callout cleanup
The callouts are directly tied to the listing above, remove spaces to
make it clear they are one and the same.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 15:36:36 -07:00
029a632c35 repository.h: drop unused gc_cruft_packs
As of the previous commit, all callers that need to read the value of
`gc.cruftPacks` do so outside without using the `repo_settings` struct,
making its `gc_cruft_packs` unused. Drop it accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:48 -07:00
e3e24de1bf builtin/gc.c: make gc.cruftPacks enabled by default
Back in 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects
via loose, 2022-05-20), `git gc` learned the `--cruft` option and
`gc.cruftPacks` configuration to opt-in to writing cruft packs when
collecting or pruning unreachable objects.

Cruft packs were introduced with the merge in a50036da1a (Merge branch
'tb/cruft-packs', 2022-06-03). They address the problem of "loose object
explosions", where Git will write out many individual loose objects when
there is a large number of unreachable objects that have not yet aged
past `--prune=<date>`.

Instead of keeping track of those unreachable yet recent objects via
their loose object file's mtime, cruft packs collect all unreachable
objects into a single pack with a corresponding `*.mtimes` file that
acts as a table to store the mtimes of all unreachable objects. This
prevents the need to store unreachable objects as loose as they age out
of the repository, and avoids the problem of loose object explosions.

Beyond avoiding loose object explosions, cruft packs also act as a more
efficient mechanism to store unreachable objects as they age out of a
repository. This is because pairs of similar unreachable objects serve
as delta bases for one another.

In 5b92477f89, the feature was introduced as experimental. Since then,
GitHub has been running these patches in every repository generating
hundreds of millions of cruft packs along the way. The feature is
battle-tested, and avoids many pathological cases such as above. Users
who either run `git gc` manually, or via `git maintenance` can benefit
from having cruft packs.

As such, enable cruft pack generation to take place by default (by
making `gc.cruftPacks` have the default of "true" rather than "false).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:48 -07:00
c58100ab5d t/t9300-fast-import.sh: prepare for gc --cruft by default
In a similar fashion as previous commits, adjust the fast-import tests
to prepare for "git gc" generating a cruft pack by default.

This adjustment is slightly different, however. Instead of relying on us
writing out the objects loose, and then calling `git prune` to remove
them, t9300 needs to be prepared to drop objects that would be moved
into cruft packs.

To do this, we can combine the `git gc` invocation with `git prune` into
one `git gc --prune`, which handles pruning both loose objects, and
objects that would otherwise be written to a cruft pack.

Likely this pattern of "git gc && git prune" started all the way back in
03db4525d3 (Support gitlinks in fast-import., 2008-07-19), which
happened after deprecating `git gc --prune` in 9e7d501990 (builtin-gc.c:
deprecate --prune, it now really has no effect, 2008-05-09).

After `--prune` was un-deprecated in 58e9d9d472 (gc: make --prune useful
again by accepting an optional parameter, 2009-02-14), this script got a
handful of new "git gc && git prune" instances via via 4cedb78cb5
(fast-import: add input format tests, 2011-08-11). These could have been
`git gc --prune`, but weren't (likely taking after 03db4525d3).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:48 -07:00
b9061bc628 t/t6500-gc.sh: add additional test cases
In the last commit, we refactored some of the tests in t6500 to make
clearer when cruft packs will and won't be generated by `git gc`.

Add the remaining cases not covered by the previous patch into this one,
which enumerates all possible combinations of arguments that will
produce (or not produce) a cruft pack.

This prepares us for a future commit which will change the default value
of `gc.cruftPacks` by ensuring that we understand which invocations do
and do not change as a result.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:48 -07:00
50685e0e0b t/t6500-gc.sh: refactor cruft pack tests
In 12253ab6d0 (gc: add tests for --cruft and friends, 2022-10-26), we
added a handful of tests to t6500 to ensure that `git gc` respected the
value of `--cruft` and `gc.cruftPacks`.

Then, in c695592850 (config: let feature.experimental imply
gc.cruftPacks=true, 2022-10-26), another set of similar tests was added
to ensure that `feature.experimental` correctly implied enabling cruft
pack generation (or not).

These tests are similar and could be consolidated. Do so in this patch
to prepare for expanding the set of command-line invocations that enable
or disable writing cruft packs. This makes it possible to easily test
more combinations of arguments without being overly repetitive.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:48 -07:00
b31d45b831 t/t6501-freshen-objects.sh: prepare for gc --cruft by default
In a similar spirit as previous commits, prepare for `gc --cruft`
becoming the default by ensuring that the tests in t6501 explicitly
cover the case of freshening loose objects not using cruft packs.

We could run this test twice, once with `--cruft` and once with
`--no-cruft`, but doing so is unnecessary, since we already test object
rescuing, freshening, and dealing with corrupt parts of the unreachable
object graph extensively via t5329.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
b934207a22 t/t5304-prune.sh: prepare for gc --cruft by default
Many of the tests in t5304 run `git gc`, and rely on its behavior that
unreachable-but-recent objects are written out loose. This is sensible,
since t5304 deals specifically with this kind of pruning.

If left unattended, however, this test would break when the default
behavior of a bare "git gc" is adjusted to generate a cruft pack by
default.

Ensure that these tests continue to work as-is (and continue to provide
coverage of loose object pruning) by passing `--no-cruft` explicitly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
05b9013b71 builtin/gc.c: ignore cruft packs with --keep-largest-pack
When cruft packs were implemented, we never adjusted the code for `git
gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft
packs. This option and configuration option share a common
implementation, but including cruft packs is wrong in both cases:

  - Running `git gc --keep-largest-pack` in a repository where the
    largest pack is the cruft pack itself will make it impossible for
    `git gc` to prune objects, since the cruft pack itself is kept.

  - The same is true for `gc.bigPackThreshold`, if the size of the cruft
    pack exceeds the limit set by the caller.

In the future, it is possible that `gc.bigPackThreshold` could be used
to write a separate cruft pack containing any new unreachable objects
that entered the repository since the last time a cruft pack was
written.

There are some complexities to doing so, mainly around handling
pruning objects that are in an existing cruft pack that is above the
threshold (which would either need to be rewritten, or else delay
pruning). Rewriting a substantially similar cruft pack isn't ideal, but
it is significantly better than the status-quo.

If users have large cruft packs that they don't want to rewrite, they
can mark them as `*.keep` packs. But in general, if a repository has a
cruft pack that is so large it is slowing down GC's, it should probably
be pruned anyway.

In the meantime, ignore cruft packs in the common implementation for
both of these options, and add a pair of tests to prevent any future
regressions here.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
c512f31109 builtin/repack.c: fix incorrect reference to '-C'
When cruft packs were originally being developed, `-C` was designated as
the short-form for `--cruft` (as in `git repack -C`).

This was dropped due to confusion with Git's top-level `-C` option
before submitting to the list. But the reference to it in
`--cruft-expiration`'s help text was never updated. Fix that dangling
reference in this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
c41258359e pack-write.c: plug a leak in stage_tmp_packfiles()
The function `stage_tmp_packfiles()` generates a filename to use for
staging the contents of what will become the pack's ".mtimes" file.

The name is generated in `write_mtimes_file()` and the result is
returned back to `stage_tmp_packfiles()` which uses it to rename the
temporary file into place via `rename_tmp_packfiles()`.

`write_mtimes_file()` returns a `const char *`, indicating that callers
are not expected to free its result (similar to, e.g., `oid_to_hex()`).
But callers are expected to free its result, so this return type is
incorrect.

Change the function's signature to return a non-const `char *`, and free
it at the end of `stage_tmp_packfiles()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
331b094eec protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
Michael J Gruber noticed that connection via the git:// protocol no
longer worked after a recent header clean-up.  This was caused by
funny interaction of few gotchas.  First, a necessary definition

	#define DEFAULT_GIT_PORT 9418

was made invisible to a place where

	const char *port = STR(DEFAULT_GIT_PORT);

was expecting to turn the integer into "9418" with a clever STR()
macro, and ended up stringifying it to

	const char *port = "DEFAULT_GIT_PORT";

without giving any chance to compilers to notice such a mistake.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:01:04 -07:00
667fcf4e15 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 18:05:13 -07:00
3c957e6d39 Merge branch 'pw/rebase-cleanup-merge-strategy-option-handling'
Clean-up of the code path that deals with merge strategy option
handling in "git rebase".

* pw/rebase-cleanup-merge-strategy-option-handling:
  rebase: remove a couple of redundant strategy tests
  rebase -m: fix serialization of strategy options
  rebase -m: cleanup --strategy-option handling
  sequencer: use struct strvec to store merge strategy options
  rebase: stop reading and writing unnecessary strategy state
2023-04-17 18:05:13 -07:00
66bf8f1943 Merge branch 'cm/branch-delete-error-message-update'
"git branch -d origin/master" would say "no such branch", but it is
likely a missed "-r" if refs/remotes/origin/master exists.  The
command has been taught to give such a hint in its error message.

* cm/branch-delete-error-message-update:
  branch: improve error log on branch not found by checking remotes refs
2023-04-17 18:05:12 -07:00
c232ebacb2 Merge branch 'fc/remove-header-workarounds-for-asciidoc'
Doc toolchain update to remove old workaround for AsciiDoc.

* fc/remove-header-workarounds-for-asciidoc:
  doc: asciidoc: remove custom header macro
2023-04-17 18:05:12 -07:00
953823fcbf Merge branch 'la/mfc-markup-fix'
Documentation mark-up fix.

* la/mfc-markup-fix:
  MyFirstContribution: render literal *
2023-04-17 18:05:12 -07:00
9d8370d445 Merge branch 'tk/mergetool-gui-default-config'
"git mergetool" and "git difftool" learns a new configuration
guiDefault to optionally favor configured guitool over non-gui-tool
automatically when $DISPLAY is set.

* tk/mergetool-gui-default-config:
  mergetool: new config guiDefault supports auto-toggling gui by DISPLAY
2023-04-17 18:05:11 -07:00
d47ee0a565 Merge branch 'sl/sparse-write-tree'
"git write-tree" learns to work better with sparse-index.

* sl/sparse-write-tree:
  write-tree: integrate with sparse index
2023-04-17 18:05:11 -07:00
5a6072f631 fsck: validate .rev file header
While parsing a .rev file, we check the header information to be sure it
makes sense. This happens before doing any additional validation such as
a checksum or value check. In order to differentiate between a bad
header and a non-existent file, we need to update the API for loading a
reverse index.

Make load_pack_revindex_from_disk() non-static and specify that a
positive value means "the file does not exist" while other errors during
parsing are negative values. Since an invalid header prevents setting up
the structures we would use for further validations, we can stop at that
point.

The place where we can distinguish between a missing file and a corrupt
file is inside load_revindex_from_disk(), which is used both by pack
rev-indexes and multi-pack-index rev-indexes. Some tests in t5326
demonstrate that it is critical to take some conditions to allow
positive error signals.

Add tests that check the three header values.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 14:39:05 -07:00
5f658d1b57 fsck: check rev-index position values
When checking a rev-index file, it may be helpful to identify exactly
which positions are incorrect. Compare the rev-index to a
freshly-computed in-memory rev-index and report the comparison failures.

This additional check (on top of the checksum validation) can help find
files that were corrupt by a single bit flip on-disk or perhaps were
written incorrectly due to a bug in Git.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 14:39:04 -07:00
d975fe1fa5 fsck: check rev-index checksums
The previous change added calls to verify_pack_revindex() in
builtin/fsck.c, but the implementation of the method was left empty. Add
the first and most-obvious check to this method: checksum verification.

While here, create a helper method in the test script that makes it easy
to adjust the .rev file and check that 'git fsck' reports the correct
error message.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 14:39:04 -07:00
0d30feef3c fsck: create scaffolding for rev-index checks
The 'fsck' builtin checks many of Git's on-disk data structures, but
does not currently validate the pack rev-index files (a .rev file to
pair with a .pack and .idx file).

Before doing a more-involved check process, create the scaffolding
within builtin/fsck.c to have a new error type and add that error type
when the API method verify_pack_revindex() returns an error. That method
does nothing currently, but we will add checks to it in later changes.

For now, check that 'git fsck' succeeds without any errors in the normal
case. Future checks will be paired with tests that corrupt the .rev file
appropriately.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 14:39:04 -07:00
3c63503759 Merge branch 'tb/pack-revindex-on-disk' into ds/fsck-pack-revindex
* tb/pack-revindex-on-disk:
  t: invert `GIT_TEST_WRITE_REV_INDEX`
  config: enable `pack.writeReverseIndex` by default
  pack-revindex: introduce `pack.readReverseIndex`
  pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK
  pack-revindex: make `load_pack_revindex` take a repository
  t5325: mark as leak-free
  pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-17 14:38:59 -07:00
0d1bd1dfb3 Git 2.40.1
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:16:10 +02:00
3d3c11852c Sync with 2.39.3
* maint-2.39: (34 commits)
  Git 2.39.3
  Git 2.38.5
  Git 2.37.7
  Git 2.36.6
  Git 2.35.8
  Makefile: force -O0 when compiling with SANITIZE=leak
  Git 2.34.8
  Git 2.33.8
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  range-diff: use ssize_t for parsed "len" in read_patches()
  ...
2023-04-17 21:16:10 +02:00
9bbde12fee Git 2.39.3
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:16:08 +02:00
15628975cf Sync with 2.38.5
* maint-2.38: (32 commits)
  Git 2.38.5
  Git 2.37.7
  Git 2.36.6
  Git 2.35.8
  Git 2.34.8
  Git 2.33.8
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  ...
2023-04-17 21:16:08 +02:00
ec58344906 Git 2.38.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:16:07 +02:00
c96ecfe6a5 Sync with 2.37.7
* maint-2.37: (31 commits)
  Git 2.37.7
  Git 2.36.6
  Git 2.35.8
  Git 2.34.8
  Git 2.33.8
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  ...
2023-04-17 21:16:06 +02:00
d27ae36bbb Git 2.37.7
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:16:05 +02:00
1df551ce5c Sync with 2.36.6
* maint-2.36: (30 commits)
  Git 2.36.6
  Git 2.35.8
  Git 2.34.8
  Git 2.33.8
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  t0033: GETTEXT_POISON fix
  ...
2023-04-17 21:16:04 +02:00
ecaa3db171 Git 2.36.6
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:16:03 +02:00
62298def14 Sync with 2.35.8
* maint-2.35: (29 commits)
  Git 2.35.8
  Git 2.34.8
  Git 2.33.8
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  t0033: GETTEXT_POISON fix
  http: support CURLOPT_PROTOCOLS_STR
  ...
2023-04-17 21:16:02 +02:00
7380a72f6b Git 2.35.8
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:16:00 +02:00
8cd052ea53 Sync with 2.34.8
* maint-2.34: (28 commits)
  Git 2.34.8
  Git 2.33.8
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  t0033: GETTEXT_POISON fix
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  ...
2023-04-17 21:15:59 +02:00
abcb63fb70 Git 2.34.8
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:57 +02:00
d6e9f67a8e Sync with 2.33.8
* maint-2.33: (27 commits)
  Git 2.33.8
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  t0033: GETTEXT_POISON fix
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  ...
2023-04-17 21:15:56 +02:00
3a19048ce4 Git 2.33.8
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:54 +02:00
bcd874d50f Sync with 2.32.7
* maint-2.32: (26 commits)
  Git 2.32.7
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  t0033: GETTEXT_POISON fix
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  ci: install python on ubuntu
  ...
2023-04-17 21:15:52 +02:00
b8787a98db Git 2.32.7
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:51 +02:00
31f7fe5e34 Sync with 2.31.8
* maint-2.31: (25 commits)
  Git 2.31.8
  tests: avoid using `test_i18ncmp`
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  t0033: GETTEXT_POISON fix
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  ci: install python on ubuntu
  ci: use the same version of p4 on both Linux and macOS
  ...
2023-04-17 21:15:49 +02:00
ea56f91275 Git 2.31.8
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:47 +02:00
92957d8427 tests: avoid using test_i18ncmp
Since `test_i18ncmp` was deprecated in v2.31.*, the instances added in
v2.30.9 needed to be converted to `test_cmp` calls.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:45 +02:00
b524e896b6 Sync with 2.30.9
* maint-2.30: (23 commits)
  Git 2.30.9
  gettext: avoid using gettext if the locale dir is not present
  apply --reject: overwrite existing `.rej` symlink if it exists
  http.c: clear the 'finished' member once we are done with it
  clone.c: avoid "exceeds maximum object size" error with GCC v12.x
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
  t5604: GETTEXT_POISON fix, conclusion
  t5604: GETTEXT_POISON fix, part 1
  t5619: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, conclusion
  t0003: GETTEXT_POISON fix, part 1
  t0033: GETTEXT_POISON fix
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  ci: install python on ubuntu
  ci: use the same version of p4 on both Linux and macOS
  ci: remove the pipe after "p4 -V" to catch errors
  github-actions: run gcc-8 on ubuntu-20.04 image
  ...
2023-04-17 21:15:44 +02:00
668f2d5361 Git 2.30.9
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:43 +02:00
528290f8c6 Merge branch 'tb/config-copy-or-rename-in-file-injection'
Avoids issues with renaming or deleting sections with long lines, where
configuration values may be interpreted as sections, leading to
configuration injection. Addresses CVE-2023-29007.

* tb/config-copy-or-rename-in-file-injection:
  config.c: disallow overly-long lines in `copy_or_rename_section_in_file()`
  config.c: avoid integer truncation in `copy_or_rename_section_in_file()`
  config: avoid fixed-sized buffer when renaming/deleting a section
  t1300: demonstrate failure when renaming sections with long lines

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:42 +02:00
4fe5d0b10a Merge branch 'avoid-using-uninitialized-gettext'
Avoids the overhead of calling `gettext` when initialization of the
translated messages was skipped. Addresses CVE-2023-25815.

* avoid-using-uninitialized-gettext: (1 commit)
  gettext: avoid using gettext if the locale dir is not present
2023-04-17 21:15:42 +02:00
18e2b1cfc8 Merge branch 'js/apply-overwrite-rej-symlink-if-exists' into maint-2.30
Address CVE-2023-25652 by deleting any existing `.rej` symbolic links
instead of following them.

* js/apply-overwrite-rej-symlink-if-exists:
  apply --reject: overwrite existing `.rej` symlink if it exists

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2023-04-17 21:15:41 +02:00
3bb3d6bac5 config.c: disallow overly-long lines in copy_or_rename_section_in_file()
As a defense-in-depth measure to guard against any potentially-unknown
buffer overflows in `copy_or_rename_section_in_file()`, refuse to work
with overly-long lines in a gitconfig.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2023-04-17 21:15:40 +02:00
e91cfe6085 config.c: avoid integer truncation in copy_or_rename_section_in_file()
There are a couple of spots within `copy_or_rename_section_in_file()`
that incorrectly use an `int` to track an offset within a string, which
may truncate or wrap around to a negative value.

Historically it was impossible to have a line longer than 1024 bytes
anyway, since we used fgets() with a fixed-size buffer of exactly that
length. But the recent change to use a strbuf permits us to read lines
of arbitrary length, so it's possible for a malicious input to cause us
to overflow past INT_MAX and do an out-of-bounds array read.

Practically speaking, however, this should never happen, since it
requires 2GB section names or values, which are unrealistic in
non-malicious circumstances.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:40 +02:00
a5bb10fd5e config: avoid fixed-sized buffer when renaming/deleting a section
When renaming (or deleting) a section of configuration, Git uses the
function `git_config_copy_or_rename_section_in_file()` to rewrite the
configuration file after applying the rename or deletion to the given
section.

To do this, Git repeatedly calls `fgets()` to read the existing
configuration data into a fixed size buffer.

When the configuration value under `old_name` exceeds the size of the
buffer, we will call `fgets()` an additional time even if there is no
newline in the configuration file, since our read length is capped at
`sizeof(buf)`.

If the first character of the buffer (after zero or more characters
satisfying `isspace()`) is a '[', Git will incorrectly treat it as
beginning a new section when the original section is being removed. In
other words, a configuration value satisfying this criteria can
incorrectly be considered as a new secftion instead of a variable in the
original section.

Avoid this issue by using a variable-width buffer in the form of a
strbuf rather than a fixed-with region on the stack. A couple of small
points worth noting:

  - Using a strbuf will cause us to allocate arbitrary sizes to match
    the length of each line.  In practice, we don't expect any
    reasonable configuration files to have lines that long, and a
    bandaid will be introduced in a later patch to ensure that this is
    the case.

  - We are using strbuf_getwholeline() here instead of strbuf_getline()
    in order to match `fgets()`'s behavior of leaving the trailing LF
    character on the buffer (as well as a trailing NUL).

    This could be changed later, but using strbuf_getwholeline() changes
    the least about this function's implementation, so it is picked as
    the safest path.

  - It is temping to want to replace the loop to skip over characters
    matching isspace() at the beginning of the buffer with a convenience
    function like `strbuf_ltrim()`. But this is the wrong approach for a
    couple of reasons:

    First, it involves a potentially large and expensive `memmove()`
    which we would like to avoid. Second, and more importantly, we also
    *do* want to preserve those spaces to avoid changing the output of
    other sections.

In all, this patch is a minimal replacement of the fixed-width buffer in
`git_config_copy_or_rename_section_in_file()` to instead use a `struct
strbuf`.

Reported-by: André Baptista <andre@ethiack.com>
Reported-by: Vítor Pinho <vitor@ethiack.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:40 +02:00
c4137be0f5 gettext: avoid using gettext if the locale dir is not present
In cc5e1bf992 (gettext: avoid initialization if the locale dir is not
present, 2018-04-21) Git was taught to avoid a costly gettext start-up
when there are not even any localized messages to work with.

But we still called `gettext()` and `ngettext()` functions.

Which caused a problem in Git for Windows when the libgettext that is
consumed from the MSYS2 project stopped using a runtime prefix in
https://github.com/msys2/MINGW-packages/pull/10461

Due to that change, we now use an unintialized gettext machinery that
might get auto-initialized _using an unintended locale directory_:
`C:\mingw64\share\locale`.

Let's record the fact when the gettext initialization was skipped, and
skip calling the gettext functions accordingly.

This addresses CVE-2023-25815.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:39 +02:00
29198213c9 t1300: demonstrate failure when renaming sections with long lines
When renaming a configuration section which has an entry whose length
exceeds the size of our buffer in config.c's implementation of
`git_config_copy_or_rename_section_in_file()`, Git will incorrectly
form a new configuration section with part of the data in the section
being removed.

In this instance, our first configuration file looks something like:

    [b]
      c = d <spaces> [a] e = f
    [a]
      g = h

Here, we have two configuration values, "b.c", and "a.g". The value "[a]
e = f" belongs to the configuration value "b.c", and does not form its
own section.

However, when renaming the section 'a' to 'xyz', Git will write back
"[xyz]\ne = f", but "[xyz]" is still attached to the value of "b.c",
which is why "e = f" on its own line becomes a new entry called "b.e".

A slightly different example embeds the section being renamed within
another section.

Demonstrate this failure in a test in t1300, which we will fix in the
following commit.

Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:39 +02:00
9db05711c9 apply --reject: overwrite existing .rej symlink if it exists
The `git apply --reject` is expected to write out `.rej` files in case
one or more hunks fail to apply cleanly. Historically, the command
overwrites any existing `.rej` files. The idea being that
apply/reject/edit cycles are relatively common, and the generated `.rej`
files are not considered precious.

But the command does not overwrite existing `.rej` symbolic links, and
instead follows them. This is unsafe because the same patch could
potentially create such a symbolic link and point at arbitrary paths
outside the current worktree, and `git apply` would write the contents
of the `.rej` file into that location.

Therefore, let's make sure that any existing `.rej` file or symbolic
link is removed before writing it.

Reported-by: RyotaK <ryotak.mail@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:38 +02:00
2f3b28f272 Merge branch 'js/gettext-poison-fixes'
The `maint-2.30` branch accumulated quite a few fixes over the past two
years. Most of those fixes were originally based on newer versions, and
while the patches cherry-picked cleanly, we weren't diligent enough to
pay attention to the CI builds and the GETTEXT_POISON job regressed.
This topic branch fixes that.

* js/gettext-poison-fixes
  t0033: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, part 1
  t0003: GETTEXT_POISON fix, conclusion
  t5619: GETTEXT_POISON fix
  t5604: GETTEXT_POISON fix, part 1
  t5604: GETTEXT_POISON fix, conclusion
2023-04-17 21:15:37 +02:00
4989c35688 Merge branch 'ds/github-actions-use-newer-ubuntu'
Update the version of Ubuntu used for GitHub Actions CI from 18.04
to 22.04.

* ds/github-actions-use-newer-ubuntu:
  ci: update 'static-analysis' to Ubuntu 22.04
2023-04-17 21:15:36 +02:00
fef08dd32e ci: update 'static-analysis' to Ubuntu 22.04
GitHub Actions scheduled a brownout of Ubuntu 18.04, which canceled all
runs of the 'static-analysis' job in our CI runs. Update to 22.04 to
avoid this as the brownout later turns into a complete deprecation.

The use of 18.04 was set in d051ed77ee (.github/workflows/main.yml: run
static-analysis on bionic, 2021-02-08) due to the lack of Coccinelle
being available on 20.04 (which continues today).

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 18:17:53 +02:00
7ce4c8f752 v0 protocol: use size_t for capability length/offset
When parsing server capabilities, we use "int" to store lengths and
offsets. At first glance this seems like a spot where our parser may be
confused by integer overflow if somebody sent us a malicious response.

In practice these strings are all bounded by the 64k limit of a
pkt-line, so using "int" is OK. However, it makes the code simpler to
audit if they just use size_t everywhere. Note that because we take
these parameters as pointers, this also forces many callers to update
their declared types.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:13 -07:00
c4716236f2 t5512: test "ls-remote --heads --symref" filtering with v0 and v2
We have two overlapping tests for checking the behavior of "ls-remote
--symref" when filtering output. The first test checks that using
"--heads" will omit the symref for HEAD (since we don't print anything
about HEAD at all), but still prints other symrefs.

This has been marked as expecting failure since it was added in
99c08d4eb2 (ls-remote: add support for showing symrefs, 2016-01-19).
That's because back then, we only had the v0 protocol, and it only
reported on the HEAD symref, not others. But these days we have v2,
which does exactly what the test wants. It would even have started
unexpectedly passing when we switched to v2 by default, except that
b2f73b70b2 (t5512: compensate for v0 only sending HEAD symrefs,
2019-02-25) over-zealously marked it to run only in v0 mode.

So let's run it with both protocol versions, and adjust the expected
output for each. It passes in v2 without modification. In v0 mode, we'll
drop the extra symref, but this is still testing something useful: it
ensures that we do omit HEAD.

The test after this checks "--heads" again, this time using the expected
v0 output. That's now redundant. It also checks that limiting with a
pattern like "refs/heads/*" works similarly, but that's redundant with a
test earlier in the script which limits by HEAD (again, back then the
"HEAD" test was less interesting because there were no other symrefs to
omit, but in a modern v2 world, there are). So we can just delete that
second test entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:13 -07:00
d6747adfa8 t5512: allow any protocol version for filtered symref test
We have a test that checks that ls-remote, when asked only about HEAD,
will report the HEAD symref, and not others. This was marked to always
run with the v0 protocol by b2f73b70b2 (t5512: compensate for v0 only
sending HEAD symrefs, 2019-02-25).

But in v0 this test is doing nothing! For v0, upload-pack only reports
the HEAD symref anyway, so we'd never have any other symref to report.

For v2, it is useful; we learn about all symrefs (and the test repo has
multiple), so this demonstrates that we correctly avoid showing them.

We could perhaps mark this to test explicitly with v2, but since that is
the default these days, it's sufficient to just run ls-remote without
any protocol specification. It still passes if somebody does an explicit
GIT_TEST_PROTOCOL_VERSION=0; it's just testing less.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:12 -07:00
20272ee8cf t5512: add v2 support for "ls-remote --symref" test
Commit b2f73b70b2 (t5512: compensate for v0 only sending HEAD symrefs,
2019-02-25) configured this test to always run with protocol v0, since
the output is different for v2.

But that means we are not getting any test coverage of the feature with
v2 at all. We could obviously switch to using and expecting v2, but then
that leaves v0 behind (and while we don't use it by default, it's still
important for testing interoperability with older servers). Likewise, we
could switch the expected output based on $GIT_TEST_PROTOCOL_VERSION,
but hardly anybody runs the tests for v0 these days.

Instead, let's explicitly run it for both protocol versions to make sure
they're well behaved. This matches other similar tests added later in
6a139cdd74 (ls-remote: pass heads/tags prefixes to transport,
2018-10-31), etc.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:12 -07:00
13e67aa39b v0 protocol: fix sha1/sha256 confusion for capabilities^{}
Commit eb398797cd (connect: advertized capability is not a ref,
2016-09-09) added support for an upload-pack server responding with:

  0000000000000000000000000000000000000000        capabilities^{}

followed by a NUL and the actual capabilities. We correctly parse the
oid using the packet_reader's hash_algo field, but then we compare it to
null_oid(), which will instead use our current repo's default algorithm.
If we're defaulting to sha256 locally but the other side is sha1, they
won't match and we'll fail to parse the line (and thus die()).

This can cause a test failure when the suite is run with
GIT_TEST_DEFAULT_HASH=sha256, and we even do so regularly via the
linux-sha256 CI job. But since the test requires JGit to run, it's
usually just skipped, and nobody noticed the problem.

The reason the original patch used JGit is that Git itself does not ever
produce such a line via upload-pack; the feature was added to fix a
real-world problem when interacting with JGit. That was good for
verifying that the incompatibility was fixed, but it's not a good
regression test:

  - hardly anybody runs it, because you have to have jgit installed;
    hence this bug going unnoticed

  - we're depending on jgit's behavior for the test to do anything
    useful. In particular, this behavior is only relevant to the v0
    protocol, but these days we ask for the v2 protocol by default. So
    for modern jgit, this is probably testing nothing.

  - it's complicated and slow. We had to do some fifo trickery to handle
    races, and this one test makes up 40% of the runtime of the total
    script.

Instead, let's just hard-code the response that's of interest to us.
That will test exactly what we want for every run, and reveals the bug
when run in sha256 mode. And of course we'll fix the actual bug by using
the correct hash_algo struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:12 -07:00
e6c4309748 t5512: stop referring to "v1" protocol
There really isn't a "v1" Git protocol. It's just v0 with an extra probe
which we used to test compatibility in preparation for v2. Any tests
that are looking for before/after behavior for v2 really care about
"v0". Mentioning "v1" in these tests is just making things more
confusing, because we don't care about that probe; we're really testing
v0. So let's say so.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:12 -07:00
aa962fef27 v0 protocol: fix infinite loop when parsing multi-valued capabilities
If Git's client-side parsing of an upload-pack response (so git-fetch or
ls-remote) sees multiple instances of a single capability, it can enter
an infinite loop due to a bug in advancing the "offset" parameter in the
parser.

This bug can't happen between a client and server of the same Git
version. The client bug is in parse_feature_value() when the caller
passes in an offset parameter. And that only happens when the v0
protocol is parsing "symref" and "object-format" capabilities, via
next_server_feature_value(). But Git has never produced multiple
object-format capabilities, and it stopped producing multiple symref
values in d007dbf7d6 (Revert "upload-pack: send non-HEAD symbolic refs",
2013-11-18).

However, upload-pack did produce multiple symref entries for a while,
and they are valid. Plus other implementations, such as Dulwich will
still do so. So we should handle them. And even if we do not expect it,
it is obviously a bug for the parser to enter an infinite loop.

The bug itself is pretty simple. Commit 2c6a403d96 (connect: add
function to parse multiple v1 capability values, 2020-05-25) added the
"offset" parameter, which is used as both an in- and out-parameter. When
parsing the first "symref" capability, *offset will be 0 on input, and
after parsing the capability, we set *offset to an index just past the
value by taking a pointer difference "(value + end) - feature_list".

But on the second call, now *offset is set to that larger index, which
lets us skip past the first "symref" capability. However, we do so by
incrementing feature_list. That means our pointer difference is now too
small; it is counting from where we resumed parsing, not from the start
of the original feature_list pointer. And because we incremented
feature_list only inside our function, and not the caller, that
increment is lost next time the function is called.

One solution would be to account for those skipped bytes by incrementing
*offset, rather than assigning to it. But wait, there's more!

We also increment feature_list if we have a near-miss. Say we are
looking for "symref" and find "almost-symref". In that case we'll point
feature_list to the "y" in "almost-symref" and restart our search. But
that again means our offset won't be correct, as it won't account for
the bytes between the start of the string and that "y".

So instead, let's just record the beginning of the feature_list string
in a separate pointer that we never touch. That offset we take in and
return is meant to be using that point as a base, and now we'll do so
consistently.

Since the bug can't be reproduced using the current version of
git-upload-pack, we'll instead hard-code an input which triggers the
problem. Before this patch it loops forever re-parsing the second symref
entry. Now we check both that it finishes, and that it parses both
entries correctly (a case we could not test at all before).

We don't need to worry about testing v2 here; it communicates the
capabilities in a completely different way, and doesn't use this code at
all. There are tests earlier in t5512 that are meant to cover this (they
don't, but we'll address that in a future patch).

Reported-by: Jonas Haag <jonas@lophus.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:12 -07:00
3c8d3adeae send-email: export patch counters in validate environment
When sending patch series (with a cover-letter or not)
sendemail-validate is called with every email/patch file independently
from the others. When one of the patches depends on a previous one, it
may not be possible to use this hook in a meaningful way. A hook that
wants to check some property of the whole series needs to know which
patch is the final one.

Expose the current and total number of patches to the hook via the
GIT_SENDEMAIL_PATCH_COUNTER and GIT_SENDEMAIL_PATCH_TOTAL environment
variables so that both incremental and global validation is possible.

Sharing any other state between successive invocations of the validate
hook must be done via external means. For example, by storing it in
a git config sendemail.validateWorktree entry.

Add a sample script with placeholder validations and update tests to
check that the counters are properly exported.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Robin Jarry <robin@jarry.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:41:15 -07:00
28fde3a1f4 doc: set actual revdate for manpages
manpages expect the date of the last revision, if that is not found
DocBook Stylesheets go through a series of hacks to generate one with
the format `%d/%d/%Y` which is not ideal.

In addition to this format not being standard, different tools generate
dates with different formats.

There's no need for any confusion if we specify the revision date, so
let's do so.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:37:41 -07:00
df113b5560 Merge branch 'fc/doc-stop-using-manversion' into fc/doc-use-datestamp-in-commit
* fc/doc-stop-using-manversion:
  doc: simplify man version
2023-04-14 10:33:32 -07:00
276699360d Merge branch 'fc/remove-header-workarounds-for-asciidoc' into fc/doc-use-datestamp-in-commit
* fc/remove-header-workarounds-for-asciidoc:
  doc: asciidoc: remove custom header macro
2023-04-14 10:33:15 -07:00
d85cd18777 repack: disable writing bitmaps when doing a local repack
In order to write a bitmap, we need to have full coverage of all objects
that are about to be packed. In the traditional non-multi-pack-index
world this meant we need to do a full repack of all objects into a
single packfile. But in the new multi-pack-index world we can get away
with writing bitmaps when we have multiple packfiles as long as the
multi-pack-index covers all objects.

This is not always the case though. When asked to perform a repack of
local objects, only, then we cannot guarantee to have full coverage of
all objects regardless of whether we do a full repack or a repack with a
multi-pack-index. The end result is that writing the bitmap will fail in
both worlds:

    $ git multi-pack-index write --stdin-packs --bitmap <packfiles
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object 1529341d78cf45377407369acb0f4ff2b5cdae42 is missing)
    error: could not write multi-pack bitmap

Now there are two different ways to fix this. The first one would be to
amend git-multi-pack-index(1) to disable writing bitmaps when we notice
that we don't have full object coverage.

    - We don't have enough information in git-multi-pack-index(1) in
      order to tell whether the local repository _should_ have full
      coverage. Because even when connected to an alternate object
      directory, it may be the case that we still have all objects
      around in the main object database.

    - git-multi-pack-index(1) is quite a low-level tool. Automatically
      disabling functionality that it was asked to provide does not feel
      like the right thing to do.

We can easily fix it at a higher level in git-repack(1) though. When
asked to only include local objects via `-l` and when connected to an
alternate object directory then we will override the user's ask and
disable writing bitmaps with a warning. This is similar to what we do in
git-pack-objects(1), where we also disable writing bitmaps in case we
omit an object from the pack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:52 -07:00
932c16c04b repack: honor -l when calculating pack geometry
When the user passes `-l` to git-repack(1), then they essentially ask us
to only repack objects part of the local object database while ignoring
any packfiles part of an alternate object database. And we in fact honor
this bit when doing a geometric repack as the resulting packfile will
only ever contain local objects.

What we're missing though is that we don't take locality of packfiles
into account when computing whether the geometric sequence is intact or
not. So even though we would only ever roll up local packfiles anyway,
we could end up trying to repack because of non-local packfiles. This
does not make much sense, and in the worst case it can cause us to try
and do the geometric repack over and over again because we're never able
to restore the geometric sequence.

Fix this bug by honoring whether the user has passed `-l`. If so, we
skip adding any non-local packfiles to the pack geometry.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:52 -07:00
19a3a7bde9 t/helper: allow chmtime to print verbosely without modifying mtime
The `test-tool chmtime` helper allows us to both read and modify the
modification time of files. But while it is possible to only read the
mtimes of a file via `--get`, it is not possible to read the mtimes
and report them together with their respective file paths via the
`--verbose` flag without also modifying the mtime at the same time.

Fix this so that it is possible to call `test-tool chmtime --verbose
<files>...` without modifying any mtimes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:52 -07:00
f3028418c3 pack-objects: extend test coverage of --stdin-packs with alternates
We don't have any tests that verify that git-pack-objects(1) works with
`--stdin-packs` when combined with alternate object directories. Add
some to make sure that the basic functionality works as expected.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:52 -07:00
752b465c3c pack-objects: fix error when same packfile is included and excluded
When passing the same packfile both as included and excluded via the
`--stdin-packs` option, then we will return an error because the
excluded packfile cannot be found. This is because we will only set the
`util` pointer for the included packfile list if it was found, so that
we later die when we notice that it's in fact not set for the excluded
packfile list.

Fix this bug by always setting the `util` pointer for both the included
and excluded list entries.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
732194b5f2 pack-objects: fix error when packing same pack twice
When passed the same packfile twice via `--stdin-packs` we return an
error that the packfile supposedly was not found. This is because when
reading packs into the list of included or excluded packfiles, we will
happily re-add packfiles even if they are part of the lists already. And
while the list can now contain duplicates, we will only set the `util`
pointer of the first list entry to the `packed_git` structure. We notice
that at a later point when checking that all list entries have their
`util` pointer set and die with an error.

While this is kind of a nonsensical request, this scenario can be hit
when doing geometric repacks. When a repository is connected to an
alternate object directory and both have the exact same packfile then
both would get added to the geometric sequence. And when we then decide
to perform the repack, we will invoke git-pack-objects(1) with the same
packfile twice.

Fix this bug by removing any duplicates from both the included and
excluded packs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
b7b8f048f5 pack-objects: split out --stdin-packs tests into separate file
The test suite for git-pack-objects(1) is quite huge, and we're about to
add more tests that relate to the `--stdin-packs` option. Split out all
tests related to this option into a standalone file so that it becomes
easier to test the feature in isolation.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
51861340f8 repack: fix generating multi-pack-index with only non-local packs
When writing the multi-pack-index with geometric repacking we will add
all packfiles to the index that are part of the geometric sequence. This
can potentially also include packfiles borrowed from an alternate object
directory. But given that a multi-pack-index can only ever include packs
that are part of the main object database this does not make much sense
whatsoever.

In the edge case where all packfiles are contained in the alternate
object database and the local repository has none itself this bug can
cause us to invoke git-multi-pack-index(1) with only non-local packfiles
that it ultimately cannot find. This causes it to return an error and
thus causes the geometric repack to fail.

Fix the code to skip non-local packfiles.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
3d74a2337c repack: fix trying to use preferred pack in alternates
When doing a geometric repack with multi-pack-indices, then we ask
git-multi-pack-index(1) to use the largest packfile as the preferred
pack. It can happen though that the largest packfile is not part of the
main object database, but instead part of an alternate object database.
The result is that git-multi-pack-index(1) will not be able to find the
preferred pack and print a warning. It then falls back to use the first
packfile that the multi-pack-index shall reference.

Fix this bug by only considering packfiles as preferred pack that are
local. This is the right thing to do given that a multi-pack-index
should never reference packfiles borrowed from an alternate.

While at it, rename the function `get_largest_active_packfile()` to
`get_preferred_pack()` to better document its intent.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
ceb96a160b midx: fix segfault with no packs and invalid preferred pack
When asked to write a multi-pack-index the user can specify a preferred
pack that is used as a tie breaker when multiple packs contain the same
objects. When this packfile cannot be found, we just pick the first pack
that is getting tracked by the newly written multi-pack-index as a
fallback.

Picking the fallback can fail in the case where we're asked to write a
multi-pack-index with no packfiles at all: picking the fallback value
will cause a segfault as we blindly index into the array of packfiles,
which would be empty.

Fix this bug by resetting the preferred packfile index to `-1` before
searching for the preferred pack. This fixes the segfault as we already
check for whether the index is `> - 1`. If it is not, we simply don't
pick a preferred packfile at all.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
aabfdc9514 branch, for-each-ref, tag: add option to omit empty lines
If the given format string expands to the empty string, a newline is
still printed. This makes using the output linewise more tedious. For
example, git update-ref --stdin does not accept empty lines.

Add options to "git branch", "git for-each-ref", and "git tag" to
not print these empty lines.  The default behavior remains the same.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 08:07:45 -07:00
9f7f10a282 t: invert GIT_TEST_WRITE_REV_INDEX
Back in e8c58f894b (t: support GIT_TEST_WRITE_REV_INDEX, 2021-01-25), we
added a test knob to conditionally enable writing a ".rev" file when
indexing a pack. At the time, this was used to ensure that the test
suite worked even when ".rev" files were written, which served as a
stress-test for the on-disk reverse index implementation.

Now that reading from on-disk ".rev" files is enabled by default, the
test knob `GIT_TEST_WRITE_REV_INDEX` no longer has any meaning.

We could get rid of the option entirely, but there would be no
convenient way to test Git when ".rev" files *aren't* in place.

Instead of getting rid of the option, invert its meaning to instead
disable writing ".rev" files, thereby running the test suite in a mode
where the reverse index is generated from scratch.

This ensures that, when GIT_TEST_NO_WRITE_REV_INDEX is set to some
spelling of "true", we are still running and exercising Git's behavior
when forced to generate reverse indexes from scratch. Do so by setting
it in the linux-TEST-vars CI run to ensure that we are maintaining good
coverage of this now-legacy code.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:46 -07:00
a8dd7e05b1 config: enable pack.writeReverseIndex by default
Back in e37d0b8730 (builtin/index-pack.c: write reverse indexes,
2021-01-25), Git learned how to read and write a pack's reverse index
from a file instead of in-memory.

A pack's reverse index is a mapping from pack position (that is, the
order that objects appear together in a ".pack")  to their position in
lexical order (that is, the order that objects are listed in an ".idx"
file).

Reverse indexes are consulted often during pack-objects, as well as
during auxiliary operations that require mapping between pack offsets,
pack order, and index index.

They are useful in GitHub's infrastructure, where we have seen a
dramatic increase in performance when writing ".rev" files[1]. In
particular:

  - an ~80% reduction in the time it takes to serve fetches on a popular
    repository, Homebrew/homebrew-core.

  - a ~60% reduction in the peak memory usage to serve fetches on that
    same repository.

  - a collective savings of ~35% in CPU time across all pack-objects
    invocations serving fetches across all repositories in a single
    datacenter.

Reverse indexes are also beneficial to end-users as well as forges. For
example, the time it takes to generate a pack containing the objects for
the 10 most recent commits in linux.git (representing a typical push) is
significantly faster when on-disk reverse indexes are available:

    $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~10 } >in
    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null'
    Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     543.0 ms ±  20.3 ms    [User: 616.2 ms, System: 58.8 ms]
      Range (min … max):   521.0 ms … 577.9 ms    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     245.0 ms ±  11.4 ms    [User: 335.6 ms, System: 31.3 ms]
      Range (min … max):   226.0 ms … 259.6 ms    13 runs

    Summary
      'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran
	2.22 ± 0.13 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null'

The same is true of writing a pack containing the objects for the 30
most-recent commits:

    $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~30 } >in
    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null'
    Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     866.5 ms ±  16.2 ms    [User: 1414.5 ms, System: 97.0 ms]
      Range (min … max):   839.3 ms … 886.9 ms    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     581.6 ms ±  10.2 ms    [User: 1181.7 ms, System: 62.6 ms]
      Range (min … max):   567.5 ms … 599.3 ms    10 runs

    Summary
      'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran
	1.49 ± 0.04 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null'

...and savings on trivial operations like computing the on-disk size of
a single (packed) object are even more dramatic:

    $ git rev-parse HEAD >in
    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in'
    Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in
      Time (mean ± σ):     305.8 ms ±  11.4 ms    [User: 264.2 ms, System: 41.4 ms]
      Range (min … max):   290.3 ms … 331.1 ms    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in
      Time (mean ± σ):       4.0 ms ±   0.3 ms    [User: 1.7 ms, System: 2.3 ms]
      Range (min … max):     1.6 ms …   4.6 ms    1155 runs

    Summary
      'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran
       76.96 ± 6.25 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in'

In the more than two years since e37d0b8730 was merged, Git's
implementation of on-disk reverse indexes has been thoroughly tested,
both from users enabling `pack.writeReverseIndexes`, and from GitHub's
deployment of the feature. The latter has been running without incident
for more than two years.

This patch changes Git's behavior to write on-disk reverse indexes by
default when indexing a pack, which should make the above operations
faster for everybody's Git installation after a repack.

(The previous commit explains some potential drawbacks of using on-disk
reverse indexes in certain limited circumstances, that essentially boil
down to a trade-off between time to generate, and time to access. For
those limited cases, the `pack.readReverseIndex` escape hatch can be
used).

[1]: https://github.blog/2021-04-29-scaling-monorepo-maintenance/#reverse-indexes

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:46 -07:00
dbcf611617 pack-revindex: introduce pack.readReverseIndex
Since 1615c567b8 (Documentation/config/pack.txt: advertise
'pack.writeReverseIndex', 2021-01-25), we have had the
`pack.writeReverseIndex` configuration option, which tells Git whether
or not it is allowed to write a ".rev" file when indexing a pack.

Introduce a complementary configuration knob, `pack.readReverseIndex` to
control whether or not Git will read any ".rev" file(s) that may be
available on disk.

This option is useful for debugging, as well as disabling the effect of
".rev" files in certain instances.

This is useful because of the trade-off[^1] between the time it takes to
generate a reverse index (slow from scratch, fast when reading an
existing ".rev" file), and the time it takes to access a record (the
opposite).

For example, even though it is faster to use the on-disk reverse index
when computing the on-disk size of a packed object, it is slower to
enumerate the same value for all objects.

Here are a couple of examples from linux.git. When computing the above
for a single object, using the on-disk reverse index is significantly
faster:

    $ git rev-parse HEAD >in
    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in'
    Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in
      Time (mean ± σ):     302.5 ms ±  12.5 ms    [User: 258.7 ms, System: 43.6 ms]
      Range (min … max):   291.1 ms … 328.1 ms    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in
      Time (mean ± σ):       3.9 ms ±   0.3 ms    [User: 1.6 ms, System: 2.4 ms]
      Range (min … max):     2.0 ms …   4.4 ms    801 runs

    Summary
      'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran
       77.29 ± 7.14 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in'

, but when instead trying to compute the on-disk object size for all
objects in the repository, using the ".rev" file is a disadvantage over
creating the reverse index from scratch:

    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" --batch-all-objects'
    Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" --batch-all-objects
      Time (mean ± σ):      8.258 s ±  0.035 s    [User: 7.949 s, System: 0.308 s]
      Range (min … max):    8.199 s …  8.293 s    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" --batch-all-objects
      Time (mean ± σ):     16.976 s ±  0.107 s    [User: 16.706 s, System: 0.268 s]
      Range (min … max):   16.839 s … 17.105 s    10 runs

    Summary
      'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" --batch-all-objects' ran
	2.06 ± 0.02 times faster than 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" --batch-all-objects'

Luckily, the results when running `git cat-file` with `--unordered` are
closer together:

    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects'
    Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects
      Time (mean ± σ):      5.066 s ±  0.105 s    [User: 4.792 s, System: 0.274 s]
      Range (min … max):    4.943 s …  5.220 s    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects
      Time (mean ± σ):      6.193 s ±  0.069 s    [User: 5.937 s, System: 0.255 s]
      Range (min … max):    6.145 s …  6.356 s    10 runs

    Summary
      'git.compile -c pack.readReverseIndex=false cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects' ran
        1.22 ± 0.03 times faster than 'git.compile -c pack.readReverseIndex=true cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects'

Because the equilibrium point between these two is highly machine- and
repository-dependent, allow users to configure whether or not they will
read any ".rev" file(s) with this configuration knob.

[^1]: Generating a reverse index in memory takes O(N) time (where N is
  the number of objects in the repository), since we use a radix sort.
  Reading an entry from an on-disk ".rev" file is slower since each
  operation is bound by disk I/O instead of memory I/O.

  In order to compute the on-disk size of a packed object, we need to
  find the offset of our object, and the adjacent object (the on-disk
  size difference of these two). Finding the first offset requires a
  binary search. Finding the latter involves a single .rev lookup.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:46 -07:00
2a250d6165 pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK
In ec8e7760ac (pack-revindex: ensure that on-disk reverse indexes are
given precedence, 2021-01-25), we introduced
GIT_TEST_REV_INDEX_DIE_IN_MEMORY to abort the process when Git generated
a reverse index from scratch.

ec8e7760ac was about ensuring that Git prefers a .rev file when
available over generating the same information in memory from scratch.

In a subsequent patch, we'll introduce `pack.readReverseIndex`, which
may be used to disable reading ".rev" files when available. In order to
ensure that those files are indeed being ignored, introduce an analogous
option to abort the process when Git reads a ".rev" file from disk.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:46 -07:00
65308ad8f7 pack-revindex: make load_pack_revindex take a repository
In a future commit, we will introduce a `pack.readReverseIndex`
configuration, which forces Git to generate the reverse index from
scratch instead of loading it from disk.

In order to avoid reading this configuration value more than once, we'll
use the `repo_settings` struct to lazily load this value.

In order to access the `struct repo_settings`, add a repository argument
to `load_pack_revindex`, and update all callers to pass the correct
instance (in all cases, `the_repository`).

In certain instances, a new function-local variable is introduced to
take the place of a `struct repository *` argument to the function
itself to avoid propagating the new parameter even further throughout
the tree.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:45 -07:00
b77919ed6e t5325: mark as leak-free
This test is leak-free as of the previous commit, so let's mark it as
such to ensure we don't regress and introduce a leak in the future.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:45 -07:00
3969e6c5a4 pack-write.c: plug a leak in stage_tmp_packfiles()
The function `stage_tmp_packfiles()` generates a filename to use for
staging the contents of what will become the pack's ".rev" file.

The name is generated in `write_rev_file_order()` (via its caller
`write_rev_file()`) in a string buffer, and the result is returned back
to `stage_tmp_packfiles()` which uses it to rename the temporary file
into place via `rename_tmp_packfiles()`.

That name is not visible outside of `stage_tmp_packfiles()`, so it can
(and should) be `free()`'d at the end of that function. We can't free it
in `rename_tmp_packfile()` since not all of its `source` arguments are
unreachable after calling it.

Instead, simply free() `rev_tmp_name` at the end of
`stage_tmp_packfiles()`.

(Note that the same leak exists for `mtimes_tmp_name`, but we do not
address it in this commit).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:45 -07:00
44c0024bdc l10n: uk: remove stale lines
Signed-off-by: Arkadii Yakovets <ark@cho.red>
2023-04-11 19:58:37 -07:00
3ff010d8c7 l10n: uk: add initial translation
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
Signed-off-by: Arkadii Yakovets <ark@cho.red>
Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2023-04-11 18:24:42 -07:00
9857273be0 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 13:49:13 -07:00
063cd850f2 Merge branch 'jk/use-perl-path-consistently'
Tests had a few places where we ignored PERL_PATH and blindly used
/usr/bin/perl, which have been corrected.

* jk/use-perl-path-consistently:
  t/lib-httpd: pass PERL_PATH to CGI scripts
2023-04-11 13:49:13 -07:00
96f4113ac0 Merge branch 'jc/clone-object-format-from-void'
"git clone" from an empty repository learned to propagate the
choice of the hash algorithm from the source repository to the
newly created repository.

* jc/clone-object-format-from-void:
  clone: propagate object-format when cloning from void
2023-04-11 13:49:13 -07:00
a86083e25f Merge branch 'fc/doc-manpage-base-url-fix'
Modernize manpage generation toolchain.

* fc/doc-manpage-base-url-fix:
  doc: remove manpage-base-url workaround
2023-04-11 13:49:13 -07:00
95e6111e7c Merge branch 'dw/doc-submittingpatches-grammofix'
Grammofix.

* dw/doc-submittingpatches-grammofix:
  SubmittingPatches: clarify MUA discussion with "the"
2023-04-11 13:49:13 -07:00
714be4c3ac Merge branch 'jx/cap-object-info-uninitialized-fix'
Correct use of an uninitialized structure member.

* jx/cap-object-info-uninitialized-fix:
  object-info: init request_info before reading arg
2023-04-11 13:49:13 -07:00
30e04bcfa8 Merge branch 'ar/adjust-tests-for-the-index-fallout'
Comment updates.

* ar/adjust-tests-for-the-index-fallout:
  t2107: fix mention of the_index.cache_changed
  t3060: fix mention of function prune_index
2023-04-11 13:49:12 -07:00
647a2bb3ff Merge branch 'jc/spell-id-in-both-caps-in-message-id'
Consistently spell "Message-ID" as such, not "Message-Id".

* jc/spell-id-in-both-caps-in-message-id:
  e-mail workflow: Message-ID is spelled with ID in both capital letters
2023-04-11 13:49:12 -07:00
d02343b599 Merge branch 'ws/sparse-check-rules'
"git sparse-checkout" command learns a debugging aid for the sparse
rule definitions.

* ws/sparse-check-rules:
  builtin/sparse-checkout: add check-rules command
  builtin/sparse-checkout: remove NEED_WORK_TREE flag
2023-04-11 13:49:12 -07:00
4711556905 mailmap, quote: move declarations of global vars to correct unit
Since earlier commits removed the inclusion of cache.h from mailmap.c
and quote.c, it feels odd to have the extern declarations of
global variables in cache.h rather than the actual header included
by the source file.  Move these global variable extern declarations
from cache.h to mailmap.c and quote.c.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:11 -07:00
b7b189cd5a treewide: reduce includes of cache.h in other headers
We had a handful of headers including cache.h that didn't need to
anymore.  Drop those includes and replace them with includes of
smaller files, or forward declarations.  However, note that two .c
files now need to directly include cache.h, though they should have
been including it all along given they are directly using structs
defined in it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:11 -07:00
65156bb7ec treewide: remove double forward declaration of read_in_full
cache.h's nature of a dumping ground of includes prevented it from
being included in some compat/ files, forcing us into a workaround
of having a double forward declaration of the read_in_full() function
(see commit 14086b0a13 ("compat/pread.c: Add a forward declaration to
fix a warning", 2007-11-17)).  Now that we have moved functions like
read_in_full() from cache.h to wrapper.h, and wrapper.h isn't littered
with unrelated and scary #defines, get rid of the extra forward
declaration and just have compat/pread.c include wrapper.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:11 -07:00
31dfa17b3b cache.h: remove unnecessary includes
cache.h did not need any of these headers, and nothing that depended
upon cache.h needed them either.  Simply expunge these includes.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:11 -07:00
77f091ed9f treewide: remove cache.h inclusion due to pager.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:11 -07:00
ca4eed708d pager.h: move declarations for pager.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
0e8d4b9db7 treewide: remove cache.h inclusion due to editor.h changes
This actually only affects sideband.c, but helps towards removing
cache.h inclusion in conjunction with some of the upcoming patches
that will be applied.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
4e120823a3 editor: move editor-related functions and declarations into common file
cache.h and strbuf.[ch] had editor-related functions.  Move these into
editor.[ch].

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
d812c3b6a0 treewide: remove cache.h inclusion due to object.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
8876ea83a7 object.h: move some inline functions and defines from cache.h
The object_type() inline function is very tied to the enum object_type
declaration within object.h, and just seemed to make more sense to live
there.  That makes S_ISGITLINK and some other defines make sense to go
with it, as well as the create_ce_mode() and canon_mode() inline
functions.  Move all these inline functions and defines from cache.h to
object.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
b6fdc44c84 treewide: remove cache.h inclusion due to object-file.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
87bed17907 object-file.h: move declarations for object-file.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
d530c04e2c treewide: remove cache.h inclusion due to git-zlib changes
This actually only affects http-backend.c, but the git-zlib changes
are going to be instrumental in pulling out an object-file.h which
will help with several more files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
d88dbaa718 git-zlib: move declarations for git-zlib functions from cache.h
Move functions from cache.h for zlib.c into a new header file.  Since
adding a "zlib.h" would cause issues with the real zlib, rename zlib.c
to git-zlib.c while we are at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
e93fc5d721 treewide: remove cache.h inclusion due to object-name.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
5579f44d2f treewide: remove unnecessary cache.h inclusion
Several files were including cache.h solely to get other headers, such
as trace.h and trace2.h.  Since the last few commits have modified
files to make these dependencies more explicit, the inclusion of cache.h
is no longer needed in several cases.  Remove it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
5bc07225e5 treewide: be explicit about dependence on mem-pool.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
6f2d743043 treewide: be explicit about dependence on oid-array.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
75f273d9b7 treewide: be explicit about dependence on pack-revindex.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
73359a9b43 treewide: be explicit about dependence on convert.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
6c6ddf92d5 treewide: be explicit about dependence on advice.h
Dozens of files made use of advice functions, without explicitly
including advice.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
advice.h if they are using it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
74ea5c9574 treewide: be explicit about dependence on trace.h & trace2.h
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h.  This made it more difficult
to find which files could remove a dependence on cache.h.  Make C files
explicitly include trace.h or trace2.h if they are using them.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:08 -07:00
4e33535ea9 clone: error specifically with --local and symlinked objects
6f054f9fb3 (builtin/clone.c: disallow --local clones with
symlinks, 2022-07-28) gives a good error message when "git clone
--local" fails when the repo to clone has symlinks in
"$GIT_DIR/objects". In bffc762f87 (dir-iterator: prevent top-level
symlinks without FOLLOW_SYMLINKS, 2023-01-24), we later extended this
restriction to the case where "$GIT_DIR/objects" is itself a symlink,
but we didn't update the error message then - bffc762f87's tests show
that we print a generic "failed to start iterator over" message.

This is exacerbated by the fact that Documentation/git-clone.txt
mentions neither restriction, so users are left wondering if this is
intentional behavior or not.

Fix this by adding a check to builtin/clone.c: when doing a local clone,
perform an extra check to see if "$GIT_DIR/objects" is a symlink, and if
so, assume that that was the reason for the failure and report the
relevant information. Ideally, dir_iterator_begin() would tell us that
the real failure reason is the presence of the symlink, but (as far as I
can tell) there isn't an appropriate errno value for that.

Also, update Documentation/git-clone.txt to reflect that this
restriction exists.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:46:09 -07:00
fd72637423 t2024: fix loose/strict local base branch DWIM test
Test 'loosely defined local base branch is reported correctly' in
t2024-checkout-dwim.sh, which was introduced in [1] compares output of
two invocations of "git checkout", invoked with two different branches
named "strict" and "loose".  As per description in [1], the test is
validating that output of tracking information for these two branches.
This tracking information is printed to standard output:

    Your branch is behind 'main' by 1 commit, and can be fast-forwarded.
      (use "git pull" to update your local branch)

The test assumes that the names of the two branches (strict and loose)
are in that output, and pipes the output through sed to replace names of
the branches with "BRANCHNAME".  Command "git checkout", however,
outputs the branch name to standard error, not standard output -- see
message "Switched to branch '%s'\n" in function "update_refs_for_switch"
in "builtin/checkout.c".  This means that the two invocations of sed do
nothing.

Redirect both the standard output and the standard error of "git
checkout" for these assertions.  Ensure that compared files have the
string "BRANCHNAME".

In a series of piped commands, only the return code of the last command
is used.  Thus, all other commands will have their return codes masked.
Avoid piping of output of git directly into sed to preserve the exit
status code of "git checkout", while we're here.

[1] 05e73682cd (checkout: report upstream correctly even with loosely
    defined branch.*.merge, 2014-10-14)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 10:11:23 -07:00
05106aa198 rebase: remove a couple of redundant strategy tests
Remove a test in t3402 that has been redundant ever since 80ff47957b
(rebase: remember strategy and strategy options, 2011-02-06).  That
commit added a new test, the first part of which (as noted in the old
commit message) duplicated an existing test.

Also remove a test t3418 that has been redundant since the merge backend
was removed in 68aa495b59 (rebase: implement --merge via the interactive
machinery, 2018-12-11), since it now tests the same code paths as the
preceding test.

Helped-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
4960e5c7bd rebase -m: fix serialization of strategy options
To store the strategy options rebase prepends " --" to each one and
writes them to a file. To load them it reads the file and passes the
contents to split_cmdline(). This roughly mimics the behavior of the
scripted rebase but has a couple of limitations, (1) options containing
whitespace are not properly preserved (this is true of the scripted
rebase as well) and (2) options containing '"' or '\' are incorrectly
parsed and may cause the parser to return an error.

Fix these limitations by quoting each option when they are stored so
that they can be parsed correctly. Now that "--preserve-merges" no
longer exist this change also stops prepending "--" to the options when
they are stored as that was an artifact of the scripted rebase.

These changes are backwards compatible so the files written by an older
version of git can still be read. They are also forwards compatible,
the file can still be parsed by recent versions of git as they treat the
"--" prefix as optional.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
4a8bc9860a rebase -m: cleanup --strategy-option handling
When handling "--strategy-option" rebase collects the commands into a
struct string_list, then concatenates them into a string, prepending "--"
to each one before splitting the string and removing the "--" prefix.
This is an artifact of the scripted rebase and the need to support
"rebase --preserve-merges". Now that "--preserve-merges" no-longer
exists we can cleanup the way the argument is handled.

The tests for a bad strategy option are adjusted now that
parse_strategy_opts() is no-longer called when starting a rebase. The
fact that it only errors out when running "git rebase --continue" is a
mixed blessing but the next commit will fix the root cause of the
parsing problem so lets not worry about that here.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
fb60b9f37f sequencer: use struct strvec to store merge strategy options
The sequencer stores the merge strategy options in an array of strings
which allocated with ALLOC_GROW(). Using "struct strvec" avoids manually
managing the memory of that array and simplifies the code.

Aside from memory allocation the changes to the sequencer are largely
mechanical, changing xopts_nr to xopts.nr and xopts[i] to xopts.v[i]. A
new option parsing macro OPT_STRVEC() is also added to collect the
strategy options.  Hopefully this can be used to simplify the code in
builtin/merge.c in the future.

Note that there is a change of behavior to "git cherry-pick" and "git
revert" as passing “--no-strategy-option” will now clear any previous
strategy options whereas before this change it did nothing.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
461434a013 rebase: stop reading and writing unnecessary strategy state
The state files for "--strategy" and "--strategy-option" are written and
read twice, once by builtin/rebase.c and then by sequencer.c. This is an
artifact of the scripted rebase and the need to support "rebase
--preserve-merges". Now that "--preserve-merges" no-longer exists we
only need to read and write these files in sequencer.c. This enables us
to remove a call to free() in read_strategy_opts() that was added by
f1f4ebf432 (sequencer.c: fix "opts->strategy" leak in
read_strategy_opts(), 2022-11-08) as this commit fixes the root cause of
that leak.

There is further scope for removing duplication in the reading and
writing of state files between builtin/rebase.c and sequencer.c but that
is left for a follow up series.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
c870de6502 get-tar-commit-id: use TYPEFLAG_GLOBAL_HEADER instead of magic value
Use the same macro in the archive reader code as on the writer side in
archive-tar.c to document the connection.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:22:34 -07:00
8a7f0b666f date: remove approxidate_relative()
When 29f4332e66 (Quit passing 'now' to date code, 2019-09-11) removed
its timeval parameter, approxidate_relative() became equivalent to
approxidate().  Convert its last two call sites and remove the redundant
function.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 08:46:40 -07:00
9a09ed3229 doc: simplify man version
The hacks to add version information to the man pages comes from 2007
7ef195ba3e (Documentation: Add version information to man pages,
2007-03-25). In that code we passed three fields to DocBook Stylesheets:
`source`, `version`, and `manual`, however, all the stylesheets do is
join the strings `source` and `version` [1].

Their own documentation explains that in pracice the source is just a
combination of two fields [2]:

  In practice, there are many pages that simply have a version number in
  the "source" field.

Splitting that information might have seemed more proper in 2007, but it
not achieve anything in practice.

Asciidoctor had support for this information in their manpage backend
since day 1: v1.5.3 (2015), but it didn't include the version. In the
docbook5 backend they did in v1.5.7 (2018), but again: no version.

There is no need for us to demand that that they add support for the
version field when in reality all that is going to happen is that both
fields are going to be joined.

Let's do that ourselves so we can forget about all our hacks for this
and so it works for both asciidoc.py, and docbook5 and manpage backends
of asciidoctor.

[1] https://github.com/docbook/xslt10-stylesheets/blob/master/xsl/common/refentry.xsl#L545
[2] https://docbook.sourceforge.net/release/xsl/current/doc/common/template.get.refentry.source.html

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 08:39:26 -07:00
f285f68a13 mailmap: change primary address for Emily Shaffer
Emily finally figured out how to set up their alias at DayJob, and would
prefer to use nasamuffin@google.com, partially to reduce confusion
between IRC and list, and partially because they just like the alias a
lot more.

Signed-off-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-07 14:33:52 -07:00
be39144954 userdiff: support regexec(3) with multi-byte support
Since 1819ad327b (grep: fix multibyte regex handling under macOS,
2022-08-26) we use the system library for all regular expression
matching on macOS, not just for git grep.  It supports multi-byte
strings and rejects invalid multi-byte characters.

This broke all built-in userdiff word regexes in UTF-8 locales because
they all include such invalid bytes in expressions that are intended to
match multi-byte characters without explicit support for that from the
regex engine.

"|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" is added to all built-in word
regexes to match a single non-space or multi-byte character.  The \xNN
characters are invalid if interpreted as UTF-8 because they have their
high bit set, which indicates they are part of a multi-byte character,
but they are surrounded by single-byte characters.

Replace that expression with "|[^[:space:]]" if the regex engine
supports multi-byte matching, as there is no need to have an explicit
range for multi-byte characters then.  Check for that capability at
runtime, because it depends on the locale and thus on environment
variables.  Construct the full replacement expression at build time
and just switch it in if necessary to avoid string manipulation and
allocations at runtime.

Additionally the word regex for tex contains the expression
"[a-zA-Z0-9\x80-\xff]+" with a similarly invalid range.  The best
replacement with only valid characters that I can come up with is
"([a-zA-Z0-9]|[^\x01-\x7f])+".  Unlike the original it matches NUL
characters, though.  Assuming that tex files usually don't contain NUL
this should be acceptable.

Reported-by: D. Ben Knoble <ben.knoble@gmail.com>
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-07 07:38:09 -07:00
78b6369e67 MyFirstContribution: render literal *
The HTML version of MyFirstContribution [1] does not render the
asterisks (*) meant to be typed in as glob patterns by the user, because
they are being interpreted as bold text delimiters.

[1]: Search for "pattern" in
https://git-scm.com/docs/MyFirstContribution#v2-git-send-email

Signed-off-by: Linus Arver <linusa@google.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06 15:03:18 -07:00
0607f793cb The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06 13:38:32 -07:00
89833fc249 Merge branch 'ds/fetch-bundle-uri-with-all'
"git fetch --all" does not have to download and handle the same
bundleURI over and over, which has been corrected.

* ds/fetch-bundle-uri-with-all:
  fetch: download bundles once, even with --all
2023-04-06 13:38:32 -07:00
c5305bbe32 Merge branch 'ow/ref-format-remove-unused-member'
Code clean-up.

* ow/ref-format-remove-unused-member:
  ref-filter: remove unused ref_format member
2023-04-06 13:38:32 -07:00
0b94009649 Merge branch 'jk/chainlint-fixes'
Test framework fix.

* jk/chainlint-fixes:
  tests: skip test_eval_ in internal chain-lint
  tests: drop here-doc check from internal chain-linter
  tests: diagnose unclosed here-doc in chainlint.pl
  tests: replace chainlint subshell with a function
  tests: run internal chain-linter under "make test"
2023-04-06 13:38:31 -07:00
6047b28eb7 Merge branch 'en/header-split-cleanup'
Split key function and data structure definitions out of cache.h to
new header files and adjust the users.

* en/header-split-cleanup:
  csum-file.h: remove unnecessary inclusion of cache.h
  write-or-die.h: move declarations for write-or-die.c functions from cache.h
  treewide: remove cache.h inclusion due to setup.h changes
  setup.h: move declarations for setup.c functions from cache.h
  treewide: remove cache.h inclusion due to environment.h changes
  environment.h: move declarations for environment.c functions from cache.h
  treewide: remove unnecessary includes of cache.h
  wrapper.h: move declarations for wrapper.c functions from cache.h
  path.h: move function declarations for path.c functions from cache.h
  cache.h: remove expand_user_path()
  abspath.h: move absolute path functions from cache.h
  environment: move comment_line_char from cache.h
  treewide: remove unnecessary cache.h inclusion from several sources
  treewide: remove unnecessary inclusion of gettext.h
  treewide: be explicit about dependence on gettext.h
  treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06 13:38:31 -07:00
72871b198f Merge branch 'ab/remove-implicit-use-of-the-repository'
Code clean-up around the use of the_repository.

* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-06 13:38:30 -07:00
06e9e726d4 Merge branch 'gc/config-parsing-cleanup'
Config API clean-up to reduce its dependence on static variables

* gc/config-parsing-cleanup:
  config.c: rename "struct config_source cf"
  config: report cached filenames in die_bad_number()
  config.c: remove current_parsing_scope
  config.c: remove current_config_kvi
  config.c: plumb the_reader through callbacks
  config.c: create config_reader and the_reader
  config.c: don't assign to "cf_global" directly
  config.c: plumb config_source through static fns
2023-04-06 13:38:29 -07:00
0a8c337394 Merge branch 'sm/ssl-key-type-config'
Add a few configuration variables to tell the cURL library that
different types of ssl-cert and ssl-key are in use.

* sm/ssl-key-type-config:
  http: add support for different sslcert and sslkey types.
2023-04-06 13:38:29 -07:00
87daf40750 Merge branch 'ab/config-multi-and-nonbool'
Assorted config API updates.

* ab/config-multi-and-nonbool:
  for-each-repo: with bad config, don't conflate <path> and <cmd>
  config API: add "string" version of *_value_multi(), fix segfaults
  config API users: test for *_get_value_multi() segfaults
  for-each-repo: error on bad --config
  config API: have *_multi() return an "int" and take a "dest"
  versioncmp.c: refactor config reading next commit
  config API: add and use a "git_config_get()" family of functions
  config tests: add "NULL" tests for *_get_value_multi()
  config tests: cover blind spots in git_die_config() tests
2023-04-06 13:38:29 -07:00
e9dffbc7f1 Merge branch 'ps/fetch-ref-update-reporting'
Clean-up of the code path that reports what "git fetch" did to each
ref.

* ps/fetch-ref-update-reporting:
  fetch: centralize printing of reference updates
  fetch: centralize logic to print remote URL
  fetch: centralize handling of per-reference format
  fetch: pass the full local reference name to `format_display`
  fetch: move output format into `display_state`
  fetch: move reference width calculation into `display_state`
2023-04-06 13:38:28 -07:00
955abf5f72 Merge branch 'jk/unused-post-2.40-part2'
Code clean-up for "-Wunused-parameter" build.

* jk/unused-post-2.40-part2:
  parse-options: drop parse_opt_unknown_cb()
  t/helper: mark unused argv/argc arguments
  mark "argv" as unused when we check argc
  builtins: mark unused prefix parameters
  builtins: annotate always-empty prefix parameters
  builtins: always pass prefix to parse_options()
  fast-import: fix file access when run from subdir
2023-04-06 13:38:28 -07:00
9bc647a2d1 Merge branch 'jk/unused-post-2.40'
More "-Wunused-parameters" code clean-up.

* jk/unused-post-2.40:
  transport: mark unused parameters in fetch_refs_from_bundle()
  http: mark unused parameter in fill_active_slot() callbacks
  http: drop unused parameter from start_object_request()
  mailmap: drop debugging code
2023-04-06 13:38:28 -07:00
ae61aecb9e Merge branch 'jk/document-pack-redundant-deprecation'
Document that we have marked "pack-redundant" as deprecated.

* jk/document-pack-redundant-deprecation:
  pack-redundant: document deprecation
2023-04-06 13:38:28 -07:00
119e82a515 Merge branch 'ps/ahead-behind-truncation-fix'
Fix unnecessary truncation of generation numbers used in-core.

* ps/ahead-behind-truncation-fix:
  commit-graph: fix truncated generation numbers
2023-04-06 13:38:27 -07:00
7727da99df Merge branch 'ds/ahead-behind'
"git for-each-ref" learns '%(ahead-behind:<base>)' that computes the
distances from a single reference point in the history with bunch
of commits in bulk.

* ds/ahead-behind:
  commit-reach: add tips_reachable_from_bases()
  for-each-ref: add ahead-behind format atom
  commit-reach: implement ahead_behind() logic
  commit-graph: introduce `ensure_generations_valid()`
  commit-graph: return generation from memory
  commit-graph: simplify compute_generation_numbers()
  commit-graph: refactor compute_topological_levels()
  for-each-ref: explicitly test no matches
  for-each-ref: add --stdin option
2023-04-06 13:38:21 -07:00
4c643fb321 branch: improve error log on branch not found by checking remotes refs
New git users may want to locally delete remote-tracking branches but
don't really understand how they are distinguished from branches by git.
Then one may naively try:
`git branch -d foo/bar` and get a correct error `branch foo/bar not
found` but hard to understand for a newbie, this patch aims to guide one
in such case.

when failing to delete a branch with `git branch -d <branch>` because
of branch not found, try to find a **remote refs** matching `<branch>`
and if so, add an hint:
`Did you forget --remote?` to the error message

Signed-off-by: Clement Mabileau <mabileau.clement@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06 13:11:26 -07:00
c1917156a0 t/lib-httpd: pass PERL_PATH to CGI scripts
As discussed in t/README, tests should aim to use PERL_PATH rather than
straight "perl". We usually do this automatically with a "perl" function
in test-lib.sh, but a few cases need to be handled specially.

One such case is the apply-one-time-perl.sh CGI, which invokes plain
"perl". It should be using $PERL_PATH, but to make that work, we must
also instruct Apache to pass through the variable.

Prior to this patch, doing:

  mv /usr/bin/perl /usr/bin/my-perl
  make PERL_PATH=/usr/bin/my-perl test

would fail t5702, t5703, and t5616. After this it passes. This is a
pretty extreme case, as even if you install perl elsewhere, you'd likely
still have it in your $PATH. A more realistic case is that you don't
want to use the perl in your $PATH (because it's older, broken, etc) and
expect PERL_PATH to consistently override that (since that's what it's
documented to do). Removing it completely is just a convenient way of
completely breaking it for testing purposes.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06 09:29:43 -07:00
8806120de6 doc: asciidoc: remove custom header macro
In 2007 we added a custom header macro to provide version information
7ef195ba3e (Documentation: Add version information to man pages,
2007-03-25),

However, in 2008 asciidoc added the attributes to do this properly [1].

This was not implemented in Git until 2019: 226daba280 (Doc/Makefile:
give mansource/-version/-manual attributes, 2019-09-16).

But in 2023 we are doing it properly, so there's no need for the custom
macro.

[1] https://github.com/asciidoc-py/asciidoc-py/commit/ad78a3c

Cc: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-05 21:37:45 -07:00
42943b950e mergetool: new config guiDefault supports auto-toggling gui by DISPLAY
When no merge.tool or diff.tool is configured or manually selected, the
selection of a default tool is sensitive to the DISPLAY variable; in a
GUI session a gui-specific tool will be proposed if found, and
otherwise a terminal-based one. This "GUI-optimizing" behavior is
important because a GUI can make a huge difference to a user's ability
to understand and correctly complete a non-trivial conflicting merge.

Some time ago the merge.guitool and diff.guitool config options were
introduced to enable users to configure both a GUI tool, and a non-GUI
tool (with fallback if no GUI tool configured), in the same environment.

Unfortunately, the --gui argument introduced to support the selection of
the guitool is still explicit. When using configured tools, there is no
equivalent of the no-tool-configured "propose a GUI tool if we are in a GUI
environment" behavior.

As proposed in <xmqqmtb8jsej.fsf@gitster.g>, introduce new configuration
options, difftool.guiDefault and mergetool.guiDefault, supporting a special
value "auto" which causes the corresponding tool or guitool to be selected
depending on the presence of a non-empty DISPLAY value. Also support "true"
to say "default to the guitool (unless --no-gui is passed on the
commandline)", and "false" as the previous default behavior when these new
configuration options are not specified.

Signed-off-by: Tao Klerks <tao@klerks.biz>
Acked-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-05 21:03:29 -07:00
d0ea2ca1cf SubmittingPatches: clarify MUA discussion with "the"
Without the word "the", the sentence is a little harder to read. The
word "the" makes it clearer that the comment refers to discrete patches,
and not portions of individual patches.

Signed-off-by: Daniel Watson <ozzloy@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-05 14:50:25 -07:00
092df21dfc doc: remove manpage-base-url workaround
Commit 50d9bbba92 (Documentation: Avoid use of xmlto --stringparam,
2009-12-04) introduced manpage-base-url.xsl because ancient versions of
xmlto did not have --stringparam.

However, that was more than ten years ago, no need for that complexity
anymore, we can just use --stringparam.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-05 14:18:53 -07:00
8b214c2e9d clone: propagate object-format when cloning from void
A user could prepare an empty repository and set it to use SHA256 as
the object format.  The new repository created by "git clone" from
such a repository however would not record that it is expecting
objects in the same SHA256 format.  This works as expected if the
source repository is not empty.

Just like we started copying the name of the primary branch from the
remote repository even if it is unborn in 3d8314f8 (clone: propagate
empty remote HEAD even with other branches, 2022-07-07), lift the
code that records the object format out of the block executed only
when cloning from an instantiated repository, so that it works also
when cloning from an empty repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-05 14:17:00 -07:00
ae73b2c8f1 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-04 14:28:29 -07:00
5e4070e128 Merge branch 'jk/really-deprecate-pack-redundant'
"git pack-redundant" gave a warning when run, as the command has
outlived its usefulness long ago and is nominated for future
removal.  Now we escalate to give an error.

* jk/really-deprecate-pack-redundant:
  pack-redundant: escalate deprecation warning to an error
2023-04-04 14:28:29 -07:00
abb3b692a4 Merge branch 'jk/document-rev-list-object-name'
Document what the pathname-looking strings in "rev-list --object"
output are for and what they mean.

* jk/document-rev-list-object-name:
  docs: document caveats of rev-list's object-name output
2023-04-04 14:28:29 -07:00
45602dd029 Merge branch 'ar/test-cleanup-unused-file-creation'
Test clean-up.

* ar/test-cleanup-unused-file-creation:
  t1507: assert output of rev-parse
  t1404: don't create unused file
  t1400: assert output of update-ref
  t1302: don't create unused file
  t1010: don't create unused files
  t1006: assert error output of cat-file
  t1005: assert output of ls-files
2023-04-04 14:28:29 -07:00
054ae834a8 Merge branch 'ob/sequencer-save-head-simplify'
Code clean-up.

* ob/sequencer-save-head-simplify:
  sequencer: rewrite save_head() in terms of write_message()
2023-04-04 14:28:29 -07:00
0ee87cde28 Merge branch 'ob/rollback-after-commit-lock-failure'
Code clean-up.

* ob/rollback-after-commit-lock-failure:
  sequencer: remove pointless rollback_lock_file()
2023-04-04 14:28:28 -07:00
62df03c277 Merge branch 'jk/blame-contents-with-arbitrary-commit'
"git blame --contents=<file> <rev> -- <path>" used to be forbidden,
but now it finds the origins of lines starting at <file> contents
through the history that leads to <rev>.

* jk/blame-contents-with-arbitrary-commit:
  blame: allow --contents to work with non-HEAD commit
2023-04-04 14:28:28 -07:00
6dd9d96129 Merge branch 'rs/archive-mtime'
Test update.

* rs/archive-mtime:
  t5000: use check_mtime()
2023-04-04 14:28:28 -07:00
9142fce9b0 Merge branch 'ah/rebase-merges-config'
Streamline --rebase-merges command line option handling and
introduce rebase.merges configuration variable.

* ah/rebase-merges-config:
  rebase: add a config option for --rebase-merges
  rebase: deprecate --rebase-merges=""
  rebase: add documentation and test for --no-rebase-merges
2023-04-04 14:28:28 -07:00
7e13d654c2 Merge branch 'jk/fast-export-cleanup'
Code clean-up.

* jk/fast-export-cleanup:
  fast-export: drop unused parameter from anonymize_commit_message()
  fast-export: drop data parameter from anonymous generators
  fast-export: de-obfuscate --anonymize-map handling
  fast-export: factor out anonymized_entry creation
  fast-export: simplify initialization of anonymized hashmaps
  fast-export: drop const when storing anonymized values
2023-04-04 14:28:27 -07:00
f315a8b609 Merge branch 'js/split-index-fixes'
The index files can become corrupt under certain conditions when
the split-index feature is in use, especially together with
fsmonitor, which have been corrected.

* js/split-index-fixes:
  unpack-trees: take care to propagate the split-index flag
  fsmonitor: avoid overriding `cache_changed` bits
  split-index; stop abusing the `base_oid` to strip the "link" extension
  split-index & fsmonitor: demonstrate a bug
2023-04-04 14:28:27 -07:00
f834089925 Merge branch 'pw/wildmatch-fixes'
The wildmatch library code unlearns exponential behaviour it
acquired some time ago since it was borrowed from rsync.

* pw/wildmatch-fixes:
  t3070: make chain lint tester happy
  wildmatch: hide internal return values
  wildmatch: avoid undefined behavior
  wildmatch: fix exponential behavior
2023-04-04 14:28:27 -07:00
1a65b41b38 write-tree: integrate with sparse index
Update 'git write-tree' to allow using the sparse-index in memory
without expanding to a full one.

The recursive algorithm for update_one() was already updated in 2de37c5
(cache-tree: integrate with sparse directory entries, 2021-03-03) to
handle sparse directory entries in the index. Hence we can just set the
requires-full-index to false for "write-tree".

The `p2000` tests demonstrate a ~96% execution time reduction for 'git
write-tree' using a sparse index:

Test                                           before  after
-----------------------------------------------------------------
2000.78: git write-tree (full-v3)              0.34    0.33 -2.9%
2000.79: git write-tree (full-v4)              0.32    0.30 -6.3%
2000.80: git write-tree (sparse-v3)            0.47    0.02 -95.8%
2000.81: git write-tree (sparse-v4)            0.45    0.02 -95.6%

Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-04 12:50:54 -07:00
e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
748b8d669a describe: enable sparse index for describe
git describe compares the index with the working tree when (and only
when) it is run with the "--dirty" flag. This is done by the
run_diff_index() function. The function has been made aware of the
sparse-index in the series that led to 8d2c3732 (Merge branch
'ld/sparse-diff-blame', 2021-12-21). Hence we can just set the
requires-full-index to false for "describe".

Performance metrics

  Test                                                     HEAD~1            HEAD
  -------------------------------------------------------------------------------------------------
  2000.2: git describe --dirty (full-v3)                   0.08(0.09+0.01)   0.08(0.06+0.03) +0.0%
  2000.3: git describe --dirty (full-v4)                   0.09(0.07+0.03)   0.08(0.05+0.04) -11.1%
  2000.4: git describe --dirty (sparse-v3)                 0.88(0.82+0.06)   0.02(0.01+0.05) -97.7%
  2000.5: git describe --dirty (sparse-v4)                 0.68(0.60+0.08)   0.02(0.02+0.04) -97.1%
  2000.6: echo >>new && git describe --dirty (full-v3)     0.08(0.04+0.05)   0.08(0.05+0.04) +0.0%
  2000.7: echo >>new && git describe --dirty (full-v4)     0.08(0.07+0.03)   0.08(0.05+0.04) +0.0%
  2000.8: echo >>new && git describe --dirty (sparse-v3)   0.75(0.69+0.07)   0.02(0.03+0.03) -97.3%
  2000.9: echo >>new && git describe --dirty (sparse-v4)   0.81(0.73+0.09)   0.02(0.01+0.05) -97.5%

Signed-off-by: Raghul Nanth A <nanth.raghul@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-03 11:30:23 -07:00
488d9d52be credential/wincred: store password_expiry_utc
This attribute is important when storing OAuth credentials which may
expire after as little as one hour. d208bfdf (credential: new attribute
password_expiry_utc, 2023-02-18) added support for this attribute in
general so that individual credential backend like wincred can use it.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-03 09:59:52 -07:00
f024913164 format-patch: correct documentation of --thread without an argument
In Git, almost all command line flags unconditionally override the
corresponding config option.[1] Add a test to confirm that this is the
case for `git format-patch --thread`.

[1] https://lore.kernel.org/git/CAMMLpeS3+NUQa2oqpHKVo3yWQNVMgkEXrs4U5_ggvk31yQbezQ@mail.gmail.com/

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-03 09:59:20 -07:00
dc12ee77ab object-info: init request_info before reading arg
When retrieving object info via capability "object-info", we store the
command args into a requested_info variable, but forget to initialize
it. Initialize the variable before use to prevent unexpected output.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-03 09:32:02 -07:00
ba4324c4e1 e-mail workflow: Message-ID is spelled with ID in both capital letters
We used to write "Message-Id:" and "Message-ID:" pretty much
interchangeably, and the header name is defined to be case
insensitive by the RFCs, but the canonical form "Message-ID:" is
used throughout the RFC documents, so let's imitate it ourselves.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
2023-04-03 08:55:43 -07:00
f66ad35508 l10n: TEAMS: Update pt_PT repo link
Signed-off-by: Daniel Santos <dacs.git@brilhante.top>
2023-04-03 15:53:56 +01:00
140b9478da The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-31 17:50:32 -07:00
e5b6fc627e Merge branch 'ss/hashmap-typofix'
Typofix.

* ss/hashmap-typofix:
  hashmap.h: fix minor typo
2023-03-31 17:50:24 -07:00
290a973bb9 Merge branch 'ds/p2000-fix-grep-sparse'
Fix perf test.

* ds/p2000-fix-grep-sparse:
  p2000: remove stray '--sparse' flag from test
2023-03-31 17:50:23 -07:00
5c93cfdafd Merge branch 'kh/commentchar-config-error-message'
Doc update.

* kh/commentchar-config-error-message:
  config: tell the user that we expect an ASCII character
2023-03-31 17:50:23 -07:00
0d865049f7 Merge branch 'ab/retire-scripted-add-p'
Test fix.

* ab/retire-scripted-add-p:
  t3701: we don't need no Perl for `add -i` anymore
2023-03-31 17:50:23 -07:00
dd88a1af1a Merge branch 'js/t5563-portability-fix'
Test portability fix.

* js/t5563-portability-fix:
  t5563: prevent "ambiguous redirect"
2023-03-31 17:50:23 -07:00
5ae4bd14be Merge branch 'bb/unicode-width-table-15'
Update width table for the latest edition of Unicode.

* bb/unicode-width-table-15:
  unicode: update the width tables to Unicode 15
2023-03-31 17:50:23 -07:00
1ec40a83a5 t2107: fix mention of the_index.cache_changed
Commit [1] added a test to t2107-update-index-basic.sh with a comment
that mentions macro "active_cache_changed".  Later in [2], the macro was
removed and its usage in function cmd_update_index in file
builtin/update-index.c was replaced with "the_index.cache_changed".

Fix the outdated comment in file t2107-update-index-basic.sh.

[1] fa137f67a4 (lockfile.c: store absolute path, 2014-11-02)
[2] dc594180d9 (cocci & cache.h: apply variable section of "pending"
    index-compatibility, 2022-11-19)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-31 16:57:04 -07:00
993d7085be t3060: fix mention of function prune_index
Commit [1] added tests which trigger function prune_cache.  The comments
in these tests, however, incorrectly call it "prune_path".  Since then,
function "prune_cache" has been renamed to "prune_index" in commit [2].
Later still in commit [3], the_index singleton, which is also mentioned
in a comment, stopped being used directly with function "prune_index".

Fix mentions of function "prune_index" and the struct it changes in
comments in file "t3060-ls-files-with-tree.sh".

[1] 54e1abce90 (Add test case for ls-files --with-tree, 2007-10-03)
[2] 6510ae173a (ls-files: convert prune_cache to take an index,
    2017-06-12)
[3] 188dce131f (ls-files: use repository object, 2017-06-22)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-31 16:57:03 -07:00
25bccb4b79 fetch: download bundles once, even with --all
When fetch.bundleURI is set, 'git fetch' downloads bundles from the
given bundle URI before fetching from the specified remote. However,
when using non-file remotes, 'git fetch --all' will launch 'git fetch'
subprocesses which then read fetch.bundleURI and fetch the bundle list
again. We do not expect the bundle list to have new information during
these multiple runs, so avoid these extra calls by un-setting
fetch.bundleURI in the subprocess arguments.

Be careful to skip fetching bundles for the empty bundle string.
Fetching bundles from the empty list presents some interesting test
failures.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-31 10:07:33 -07:00
92c7b3d473 t5563: prevent "ambiguous redirect"
When I ran this test using `TEST_SHELL_PATH=/bin/bash` in my Ubuntu
setup (where Bash is at version 5.0.17(1)-release), I was greeted with
this error message:

	./test-lib.sh: line 1072: $CHALLENGE: ambiguous redirect

This commit fixes that error by quoting the `CHALLENGE` variable (which
has as value a path containing spaces), and by avoiding to cuddle the
empty string parameter in the `printf` call with the redirect character
(in fact, the `printf ''>$CHALLENGE` is removed because the next line
overwrites the file anyway because it _also_ uses a single `>` to
redirect the output).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-31 08:50:30 -07:00
6369acd968 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 13:47:19 -07:00
d35cd54a23 Merge branch 'mk/workaround-pcre-jit-ucp-bug'
A recent-ish change to allow unicode character classes to be used
with "grep -P" triggered a JIT bug in older pcre2 libraries.
The problematic change in Git built with these older libraries has
been disabled to work around the bug.

* mk/workaround-pcre-jit-ucp-bug:
  grep: work around UTF-8 related JIT bug in PCRE2 <= 10.34
2023-03-30 13:47:12 -07:00
a15b8451f2 Merge branch 'jc/am-doc-refer-to-format-patch'
Doc update.

* jc/am-doc-refer-to-format-patch:
  am: refer to format-patch in the documentation
2023-03-30 13:47:12 -07:00
5f6f7a48da Merge branch 'sg/parse-options-h-initializers'
Code clean-up to use designated initializers in parse-options API.

* sg/parse-options-h-initializers:
  parse-options.h: use designated initializers in OPT_* macros
  parse-options.h: rename _OPT_CONTAINS_OR_WITH()'s parameters
  parse-options.h: use consistent name for the callback parameters
2023-03-30 13:47:12 -07:00
dbb4102f7b Merge branch 'sg/parse-options-h-users'
Code clean-up to include and/or uninclude parse-options.h file as
needed.

* sg/parse-options-h-users:
  treewide: remove unnecessary inclusions of parse-options.h from headers
  treewide: include parse-options.h in source files
2023-03-30 13:47:11 -07:00
cc48ddd937 tests: skip test_eval_ in internal chain-lint
To check for broken &&-chains, we run "fail_117 && $1" as a test
snippet, and check the exit code. We use test_eval_ to do so, because
that's the way we run the actual test.

But we don't need any of its niceties, like "set -x" tracing. In fact,
they hinder us, because we have to explicitly disable them. So let's
skip that and use "eval" more directly, which is simpler. I had hoped it
would also be faster, but it doesn't seem to produce a measurable
improvement (probably because it's just running internal shell commands,
with no subshells or forks).

Note that there is one gotcha: even though we don't intend to run any of
the commands if the &&-chain is intact, an error like this:

   test_expect_success 'broken' '
	# this next line breaks the &&-chain
	true
	# and then this one is executed even by the linter
	return 1
   '

means we'll "return 1" from the eval, and thus from test_run_(). We
actually do notice this in test_expect_success, but only by saying "hey,
this test didn't say it was OK, so it must have failed", which is not
right (it should say "broken &&-chain").

We can handle this by calling test_eval_inner_() instead, which is our
trick for wrapping "return" in a test snippet. But to do that, we have
to push the trace code out of that inner function and into test_eval_().
This is arguably where it belonged in the first place, but it never
mattered because the "inner_" function had only one caller.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 13:07:29 -07:00
750b260411 tests: drop here-doc check from internal chain-linter
Commit 99a64e4b73 (tests: lint for run-away here-doc, 2017-03-22)
tweaked the chain-lint test to catch unclosed here-docs. It works by
adding an extra "echo" command after the test snippet, and checking that
it is run (if it gets swallowed by a here-doc, naturally it is not run).

The downside here is that we introduced an extra $() substitution, which
happens in a subshell. This has a measurable performance impact when
run for many tests.

The tradeoff in safety was undoubtedly worth it when 99a64e4b73 was
written. But since the external chainlint.pl learned to find these
recently, we can just rely on it. By switching back to a simpler
chain-lint, hyperfine reports a measurable speedup on t3070 (which has
1800 tests):

  'HEAD' ran
    1.12 ± 0.01 times faster than 'HEAD~1'

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 13:07:29 -07:00
2b61c8dc88 tests: diagnose unclosed here-doc in chainlint.pl
An unclosed here-doc in a test is a problem, because it silently gobbles
up any remaining commands. Since 99a64e4b73 (tests: lint for run-away
here-doc, 2017-03-22) we detect this by piggy-backing on the internal
chainlint checker in test-lib.sh.

However, it would be nice to detect it in chainlint.pl, for a few
reasons:

  - the output from chainlint.pl is much nicer; it can show the exact
    spot of the error, rather than a vague "somewhere in this test you
    broke the &&-chain or had a bad here-doc" message.

  - the implementation in test-lib.sh runs for each test snippet. And
    since it requires a subshell, the extra cost is small but not zero.
    If chainlint.pl can reliably find the problem, we can optimize the
    test-lib.sh code.

The chainlint.pl code never intended to find here-doc problems. But
since it has to parse them anyway (to avoid reporting problems inside
here-docs), most of what we need is already there. We can detect the
problem when we fail to find the missing end-tag in swallow_heredocs().
The extra change in scan_heredoc_tag() stores the location of the start
of the here-doc, which lets us mark it as the source of the error in the
output (see the new tests for examples).

[jk: added commit message and tests]

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 13:07:29 -07:00
1686de55fa tests: replace chainlint subshell with a function
To test that we don't break the &&-chain, test-lib.sh does something
like:

   (exit 117) && $test_commands

and checks that the result is exit code 117. We don't care what that
initial command is, as long as it exits with a unique code. Using "exit"
works and is simple, but is a bit expensive since it requires a subshell
(to avoid exiting the whole script!). This isn't usually very
noticeable, but it can add up for scripts which have a large number of
tests.

Using "return" naively won't work here, because we'd return from the
function eval-ing the snippet (and it wouldn't find &&-chain breakages).
But if we further push that into its own function, it does exactly what
we want, without extra subshell overhead.

According to hyperfine, this produces a measurable improvement when
running t3070 (which has 1800 tests, all of them quite short):

  'HEAD' ran
    1.09 ± 0.01 times faster than 'HEAD~1'

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 13:07:29 -07:00
7b6555ab8d tests: run internal chain-linter under "make test"
Since 69b9924b87 (t/Makefile: teach `make test` and `make prove` to run
chainlint.pl, 2022-09-01), we run a single chainlint.pl process for all
scripts, and then instruct each individual script to run with the
equivalent of --no-chain-lint, which tells them not to redundantly run
the chainlint script themselves.

However, this also disables the internal linter run within the shell by
eval-ing "(exit 117) && $1" and confirming we get code 117. In theory
the external linter produces a superset of complaints, and we don't need
the internal one anymore. However, we know there is at least one case
where they differ. A test like:

	test_expect_success 'should fail linter' '
		false &&
		sleep 2 &
		pid=$! &&
		kill $pid
	'

is buggy (it ignores the failure from "false", because it is
backgrounded along with the sleep). The internal linter catches this,
but the external one doesn't (and teaching it to do so is complicated[1]).
So not only does "make test" miss this problem, but it's doubly
confusing because running the script standalone does complain.

Let's teach the suppression in the Makefile to only turn off the
external linter (which we know is redundant, as it was already run) and
leave the internal one intact.

I've used a new environment variable to do this here, and intentionally
did not add a "--no-ext-chain-lint" option. This is an internal
optimization used by the Makefile, and not something that ordinary users
would need to tweak.

[1] For discussion of chainlint.pl and this case, see:
    https://lore.kernel.org/git/CAPig+cQtLFX4PgXyyK_AAkCvg4Aw2RAC5MmLbib-aHHgTBcDuw@mail.gmail.com/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 13:07:29 -07:00
b10cbdac4c unicode: update the width tables to Unicode 15
Unicode version 15 was released in September 2022 [1], and we have so
far neglected to update our width tables. Do this now.

[1] http://blog.unicode.org/2022/09/announcing-unicode-standard-version-150.html

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 13:06:12 -07:00
ec063d2591 hashmap.h: fix minor typo
The word "no" should be "not".

Signed-off-by: Siddharth Singh <siddhartth@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 10:18:39 -07:00
4833b08426 ref-filter: remove unused ref_format member
use_rest was added in b9dee075eb (ref-filter: add %(rest) atom,
2021-07-26) but was never used. As far as I can tell it was used in a
later patch that was submitted to the mailing list but never applied.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 10:17:49 -07:00
fcf31daae4 pack-redundant: document deprecation
Running the command itself has generated a warning for several versions,
which has recently been upgraded to an error. Let's also make sure the
documentation mentions what is going on. This also gives us a good spot
to explain the reasoning and recommend alternatives.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-30 07:50:43 -07:00
4a4d9706ad parse-options: drop parse_opt_unknown_cb()
This low-level callback was introduced in ce564eb1bd (parse-options:
add parse_opt_unknown_cb(), 2016-09-05) so that we could advertise
--indent-heuristic in git-blame's "-h" output, even though the option is
actually handled in parse_revision_opt(). We later stopped doing so in
44ae131e38 (builtin/blame.c: remove '--indent-heuristic' from usage
string, 2019-10-28).

This is a weird thing to do, and in the intervening years, we've never
used it again. Let's drop the helper in the name of simplicity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
126e3b3d2a t/helper: mark unused argv/argc arguments
Many test helper programs do not bother to look at argc or argv, because
they don't take any options. In a user-facing program, it's a good idea
to check for unexpected arguments and complain. But for a test helper,
it's not worth the trouble to enforce this.

But we do want to tell the compiler we're OK with ignoring them, to
silence -Wunused-parameter (and obviously we can't get rid of them,
since we have to conform to the usual cmd__foo() interface).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
6ba21fa65c mark "argv" as unused when we check argc
A few commands don't take any options at all, and confirm this by
checking argc. After that they have no need to look at argv, but we're
still stuck with it by convention. Let's annotate these cases so that
the compiler doesn't complain with -Wunused-parameter.

Note that in scalar and get-tar-commit-id, we're forced to keep argv by
calling convention (the functions must match cmd_main() and builtin
cmd_foo() conventions, respectively). In diff, these are subcommand
modes that we call individually, so we _could_ just drop the argv
parameters entirely. But it's weird to pass argc without argv, and it
implies that the caller knows that the subcommands aren't interested in
further arguments. It's less confusing to just keep them and silence the
compiler warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
5247b762d0 builtins: mark unused prefix parameters
All builtins receive a "prefix" parameter, but it is only useful if they
need to adjust filenames given by the user on the command line. For
builtins that do not even call parse_options(), they often don't look at
the prefix at all, and -Wunused-parameter complains.

Let's annotate those to silence the compiler warning. I gave a quick
scan of each of these cases, and it seems like they don't have anything
they _should_ be using the prefix for (i.e., there is no hidden bug that
we are missing). The only questionable cases I saw were:

  - in git-unpack-file, we create a tempfile which will always be at the
    root of the repository, even if the command is run from a subdir.
    Arguably this should be created in the subdir from which we're run
    (as we report the path only as a relative name). However, nobody has
    complained, and I'm hesitant to change something that is deep
    plumbing going back to April 2005 (though I think within our
    scripts, the sole caller in git-merge-one-file would be OK, as it
    moves to the toplevel itself).

  - in fetch-pack, local-filesystem remotes are taken as relative to the
    project root, not the current directory. So:

       git init server.git
       [...put stuff in server.git...]
       git init client.git
       cd client.git
       mkdir subdir
       cd subdir
       git fetch-pack ../../server.git ...

    won't work, as we quietly move to the top of the repository before
    interpreting the path (so "../server.git" would work). This is
    weird, but again, nobody has complained and this is how it has
    always worked. And this is how "git fetch" works, too. Plus it
    raises questions about how a configured remote like:

      git config remote.origin.url ../server.git

    should behave. I can certainly come up with a reasonable set of
    behavior, but it may not be worth stirring up complications in a
    plumbing tool.

So I've left the behavior untouched in both of those cases. If anybody
really wants to revisit them, it's easy enough to drop the UNUSED
marker. This commit is just about removing them as obstacles to turning
on -Wunused-parameter all the time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
7915691377 builtins: annotate always-empty prefix parameters
It's usually a bad idea for a builtin's cmd_foo() to ignore the "prefix"
argument it gets, as it needs to prepend that string when accessing any
paths given by the user.

But if a builtin does not ask for the git wrapper to run repository
setup (via the RUN_SETUP or RUN_SETUP_GENTLY flags), then we know the
prefix will always be NULL (it is adjusting for the chdir() done during
repo setup, but there cannot be one if we did not set up the repo). In
those cases it's OK to ignore "prefix", but it's worth annotating for a
few reasons:

  1. It serves as documentation to somebody reading the code about what
     we expect.

  2. If the flags in git.c ever change, the run-time assertion may help
     detect the problem (though only if the command is run from a
     subdirectory of the repository).

  3. It notes to the compiler that we are OK ignoring "prefix". In
     particular, this silences -Wunused-parameter. It _could_ also help
     the compiler generate better code (because it will know the prefix
     is NULL), but in practice this is quite unlikely to matter.

Note that I've only added this annotation to commands which triggered
-Wunused-parameter. It would be correct to add it to any builtin which
doesn't ask for RUN_SETUP, but most of the rest of them do the sensible
thing with "prefix" by passing it to parse_options(). So they're much
more likely to just work if they ever switched to RUN_SETUP, and aren't
worth annotating.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
836c8ceb7a builtins: always pass prefix to parse_options()
Our builtins receive a "prefix" argument as part of their cmd_foo()
function. We should always pass this to parse_options() if we're calling
it, as it may be used for OPT_FILENAME() options.

In the cases here, there's no option that would use it, so we're not
fixing any bug. This is just future-proofing and setting a good example
(plus quelling some -Wunused-parameter warnings).

Note in the case of revert/cherry-pick, that we plumb the prefix through
to run_sequencer(), as those builtins are just thin wrappers around it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
9dc607f1c2 fast-import: fix file access when run from subdir
In cmd_fast_import(), we ignore the "prefix" argument entirely, even
though it tells us how we may have changed directory to the root of the
repository earlier in the process. Which means that if you run it from a
subdir and point to paths in the filesystem, like:

  cd subdir
  git fast-import --import-marks=foo <dump

then it will look for "foo" in the root of the repository, not the
current directory ("subdir/") which the user would have expected.

We can fix this by recording the prefix and using it as appropriate
whenever we open a file for reading or writing. I found each of these by
looking for cases where we call fopen() within fast-import.c, so this
should cover all cases. The new test triggers each one, as well as
making sure we don't accidentally apply the prefix when --relative-marks
is in use (since that option interprets some paths as relative to a
specific directory).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
d52fcf493b p2000: remove stray '--sparse' flag from test
This argument was added in 7cae7627c4 (builtin/grep.c: integrate with
sparse index, 2022-09-22), but it was a carry-over from an earlier
version where the --sparse flag was added to the 'git grep' builtin.
This argument does not exist, so currently the
p2000-sparse-operations.sh performance test script fails when reaching
this step.

With this fix, the script works with these numbers for my copy of the
Git source code repository:

Test                                         HEAD
------------------------------------------------------------
2000.30: git grep --cached ... (full-v3)     0.34(1.20+0.14)
2000.31: git grep --cached ... (full-v4)     0.31(1.15+0.13)
2000.32: git grep --cached ... (sparse-v3)   0.26(1.13+0.12)
2000.33: git grep --cached ... (sparse-v4)   0.27(1.13+0.12)

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:25:52 -07:00
9b4a655302 config.c: rename "struct config_source cf"
The "cf" name is a holdover from before 4d8dd1494e (config: make parsing
stack struct independent from actual data source, 2013-07-12), when the
struct was named config_file. Since that acronym no longer makes sense,
rename "cf" to "cs". In some places, we have "struct config_set cs", so
to avoid conflict, rename those "cs" to "set" ("config_set" would be
more descriptive, but it's much longer and would require us to rewrap
several lines).

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:27 -07:00
e2016508e7 config: report cached filenames in die_bad_number()
If, when parsing numbers from config, die_bad_number() is called, it
reports the filename and config source type if we were parsing a config
file, but not if we were iterating a config_set (it defaults to a less
specific error message). Most call sites don't parse config files
because config is typically read once and cached, so we only report
filename and config source type in "git config --type" (since "git
config" always parses config files).

This could have been fixed when we taught the current_config_*
functions to respect config_set values (0d44a2dacc (config: return
configset value for current_config_ functions, 2016-05-26), but it was
hard to spot then and we might have just missed it (I didn't find
mention of die_bad_number() in the original ML discussion [1].)

Fix this by refactoring the current_config_* functions into variants
that don't BUG() when we aren't reading config, and using the resulting
functions in die_bad_number(). "git config --get[-regexp] --type=int"
cannot use the non-refactored version because it parses the int value
_after_ parsing the config file, which would run into the BUG().

Since the refactored functions aren't public, they use "struct
config_reader".

1. https://lore.kernel.org/git/20160518223712.GA18317@sigill.intra.peff.net/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:27 -07:00
5cdf18e7cd config.c: remove current_parsing_scope
Add ".parsing_scope" to "struct config_reader" and replace
"current_parsing_scope" with "the_reader.parsing_scope. Adjust the
comment slightly to make it clearer that the scope applies to the config
source (not the current value), and should only be set when parsing a
config source.

As such, ".parsing_scope" (only set when parsing config sources) and
".config_kvi" (only set when iterating a config set) should not be
set together, so enforce this with a setter function.

Unlike previous commits, "populate_remote_urls()" still needs to store
and restore the 'scope' value because it could have touched
"current_parsing_scope" ("config_with_options()" can set the scope).

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:27 -07:00
9828453ff0 config.c: remove current_config_kvi
Add ".config_kvi" to "struct config_reader" and replace
"current_config_kvi" with "the_reader.config_kvi", plumbing "struct
config_reader" where necesssary.

Also, introduce a setter function for ".config_kvi", which allows us to
enforce the contraint that only one of ".source" and ".config_kvi" can
be set at a time (as documented in the comments). Because of this
constraint, we know that "populate_remote_urls()" was never touching
"current_config_kvi" when iterating through config files, so it doesn't
need to store and restore that value.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:27 -07:00
a798a56c8a config.c: plumb the_reader through callbacks
The remaining references to "cf_global" are in config callback
functions. Remove them by plumbing "struct config_reader" via the
"*data" arg.

In both of the callbacks here, we are only reading from
"reader->source". So in the long run, if we had a way to expose readonly
information from "reader->source" (probably in the form of "struct
key_value_info"), we could undo this patch (i.e. remove "struct
config_reader" fom "*data").

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:27 -07:00
0c60285147 config.c: create config_reader and the_reader
Create "struct config_reader" to hold the state of the config source
currently being read. Then, create a static instance of it,
"the_reader", and use "the_reader.source" to replace references to
"cf_global" in public functions.

This doesn't create much immediate benefit (since we're mostly replacing
static variables with a bigger static variable), but it prepares us for
a future where this state doesn't have to be global; "struct
config_reader" (or a similar struct) could be provided by the caller, or
constructed internally by a function like "do_config_from()".

A more typical approach would be to put this struct on "the_repository",
but that's a worse fit for this use case since config reading is not
scoped to a repository. E.g. we can read config before the repository is
known ("read_very_early_config()"), blatantly ignore the repo
("read_protected_config()"), or read only from a file
("git_config_from_file()"). This is especially evident in t5318 and
t9210, where test-tool and scalar parse config but don't fully
initialize "the_repository".

We could have also replaced the references to "cf_global" in callback
functions (which are the only ones left), but we'll eventually plumb
"the_reader" through the callback "*data" arg, so that would be
unnecessary churn. Until we remove "cf_global" altogether, add logic to
"config_reader_*_source()" to keep "cf_global" and "the_reader.source"
in sync.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:26 -07:00
c009bc898b config.c: don't assign to "cf_global" directly
To make "cf_global" easier to remove, replace all direct assignments to
it with function calls. This refactor has an additional maintainability
benefit: all of these functions were manually implementing stack
pop/push semantics on "struct config_source", so replacing them with
function calls allows us to only implement this logic once.

In this process, perform some now-obvious clean ups:

- Drop some unnecessary "cf_global" assignments in
  populate_remote_urls(). Since it was introduced in 399b198489 (config:
  include file if remote URL matches a glob, 2022-01-18), it has stored
  and restored the value of "cf_global" to ensure that it doesn't get
  accidentally mutated. However, this was never necessary since
  "do_config_from()" already pushes/pops "cf_global" further down the
  call chain.

- Zero out every "struct config_source" with a dedicated initializer.
  This matters because the "struct config_source" is assigned to
  "cf_global" and we later 'pop the stack' by assigning "cf_global =
  cf_global->prev", but "cf_global->prev" could be pointing to
  uninitialized garbage.

  Fortunately, this has never bothered us since we never try to read
  "cf_global" except while iterating through config, in which case,
  "cf_global" is either set to a sensible value (when parsing a file),
  or it is ignored (when iterating a configset). Later in the series,
  zero-ing out memory will also let us enforce the constraint that
  "cf_global" and "current_config_kvi" are never non-NULL together.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:26 -07:00
c97f3ed256 config.c: plumb config_source through static fns
This reduces the direct dependence on the global "struct config_source",
which will make it easier to remove in a later commit.

To minimize the changes we need to make, we rename the current variable
from "cf" to "cf_global", and the plumbed arg uses the old name "cf".
This is a little unfortunate, since we now have the confusingly named
"struct config_source cf" everywhere (which is a holdover from before
4d8dd1494e (config: make parsing stack struct independent from actual
data source, 2013-07-12), when the struct used to be called
"config_file"), but we will rename "cf" to "cs" by the end of the
series.

In some cases (public functions and config callback functions), there
isn't an obvious way to plumb "struct config_source" through function
args. As a workaround, add references to "cf_global" that we'll address
in later commits.

The remaining references to "cf_global" are direct assignments to
"cf_global", which we'll also address in a later commit.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 13:03:26 -07:00
15364d2a3c docs: document caveats of rev-list's object-name output
At first glance, the names given by "rev-list --objects" seem like a
good way to see which paths are present in a set of commits. But there
are some subtle gotchas there. We do not document the format of the
names at all, so let's do so, along with warning of these problems.

I intentionally did not document the exact format of the names here, as
I don't think it's something we want people to rely on (though I doubt
in practice that we'd change it at this point).

Though all of this is historically tied to "--objects", these days we
have a separate "--object-names" flag which can turn the names off or
on. So I put the detailed documentation there, but added a note from
--objects (which did not otherwise mention the names at all, even though
they are on by default).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 12:55:00 -07:00
8d90352acc The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 10:51:53 -07:00
8766bcc8e4 Merge branch 'fc/docbook-remove-groff-workaround'
Remove workaround for ancient versions of DocBook to make it work
correctly with groff, which has not been necessary since docbook
1.76 from 2010.

* fc/docbook-remove-groff-workaround:
  doc: remove GNU troff workaround
2023-03-28 10:51:53 -07:00
cdb1ef07d2 Merge branch 'pe/time-use-gettimeofday'
time(2) on glib 2.31+, especially on Linux, goes out of sync with
higher resolution timers used for gettimeofday(2) and by the
filesystem.  Replace all calls to it with a git_time() wrapper and
use gettimeofday(2) in its implementation.

* pe/time-use-gettimeofday:
  git-compat-util: use gettimeofday(2) for time(2)
2023-03-28 10:51:52 -07:00
f879501ad0 Merge branch 'jk/fix-proto-downgrade-to-v0'
Transports that do not support protocol v2 did not correctly fall
back to protocol v0 under certain conditions, which has been
corrected.

* jk/fix-proto-downgrade-to-v0:
  git_connect(): fix corner cases in downgrading v2 to v0
2023-03-28 10:51:52 -07:00
8069aa01cd Merge branch 'fc/oid-quietly-parse-upstream'
"git rev-parse --quiet foo@{u}", or anything that asks @{u} to be
parsed with GET_OID_QUIETLY option, did not quietly fail, which has
been corrected.

* fc/oid-quietly-parse-upstream:
  object-name: fix quiet @{u} parsing
2023-03-28 10:51:52 -07:00
6041a13ec2 Merge branch 'fc/completion-colors-do-not-need-prompt-command'
Lift the limitation that colored prompts can only be used with
PROMPT_COMMAND mode.

* fc/completion-colors-do-not-need-prompt-command:
  completion: prompt: use generic colors
2023-03-28 10:51:52 -07:00
3611f7467f for-each-repo: with bad config, don't conflate <path> and <cmd>
Fix a logic error in 4950b2a2b5 (for-each-repo: run subcommands on
configured repos, 2020-09-11). Due to assuming that elements returned
from the repo_config_get_value_multi() call wouldn't be "NULL" we'd
conflate the <path> and <command> part of the argument list when
running commands.

As noted in the preceding commit the fix is to move to a safer
"*_string_multi()" version of the *_multi() API. This change is
separated from the rest because those all segfaulted. In this change
we ended up with different behavior.

When using the "--config=<config>" form we take each element of the
list as a path to a repository. E.g. with a configuration like:

	[repo] list = /some/repo

We would, with this command:

	git for-each-repo --config=repo.list status builtin

Run a "git status" in /some/repo, as:

	git -C /some/repo status builtin

I.e. ask "status" to report on the "builtin" directory. But since a
configuration such as this would result in a "struct string_list *"
with one element, whose "string" member is "NULL":

	[repo] list

We would, when constructing our command-line in
"builtin/for-each-repo.c"...

	strvec_pushl(&child.args, "-C", path, NULL);
	for (i = 0; i < argc; i++)
		strvec_push(&child.args, argv[i]);

...have that "path" be "NULL", and as strvec_pushl() stops when it
sees NULL we'd end with the first "argv" element as the argument to
the "-C" option, e.g.:

	git -C status builtin

I.e. we'd run the command "builtin" in the "status" directory.

In another context this might be an interesting security
vulnerability, but I think that this amounts to a nothingburger on
that front.

A hypothetical attacker would need to be able to write config for the
victim to run, if they're able to do that there's more interesting
attack vectors. See the "safe.directory" facility added in
8d1a744820 (setup.c: create `safe.bareRepository`, 2022-07-14).

An even more unlikely possibility would be an attacker able to
generate the config used for "for-each-repo --config=<key>", but
nothing else (e.g. an automated system producing that list).

Even in that case the attack vector is limited to the user running
commands whose name matches a directory that's interesting to the
attacker (e.g. a "log" directory in a repository). The second
argument (if any) of the command is likely to make git die without
doing anything interesting (e.g. "-p" to "log", there being no "-p"
built-in command to run).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
9e2d884d0f config API: add "string" version of *_value_multi(), fix segfaults
Fix numerous and mostly long-standing segfaults in consumers of
the *_config_*value_multi() API. As discussed in the preceding commit
an empty key in the config syntax yields a "NULL" string, which these
users would give to strcmp() (or similar), resulting in segfaults.

As this change shows, most users users of the *_config_*value_multi()
API didn't really want such an an unsafe and low-level API, let's give
them something with the safety of git_config_get_string() instead.

This fix is similar to what the *_string() functions and others
acquired in[1] and [2]. Namely introducing and using a safer
"*_get_string_multi()" variant of the low-level "_*value_multi()"
function.

This fixes segfaults in code introduced in:

  - d811c8e17c (versionsort: support reorder prerelease suffixes, 2015-02-26)
  - c026557a37 (versioncmp: generalize version sort suffix reordering, 2016-12-08)
  - a086f921a7 (submodule: decouple url and submodule interest, 2017-03-17)
  - a6be5e6764 (log: add log.excludeDecoration config option, 2020-04-16)
  - 92156291ca (log: add default decoration filter, 2022-08-05)
  - 50a044f1e4 (gc: replace config subprocesses with API calls, 2022-09-27)

There are now two users ofthe low-level API:

- One in "builtin/for-each-repo.c", which we'll convert in a
  subsequent commit.

- The "t/helper/test-config.c" code added in [3].

As seen in the preceding commit we need to give the
"t/helper/test-config.c" caller these "NULL" entries.

We could also alter the underlying git_configset_get_value_multi()
function to be "string safe", but doing so would leave no room for
other variants of "*_get_value_multi()" that coerce to other types.

Such coercion can't be built on the string version, since as we've
established "NULL" is a true value in the boolean context, but if we
coerced it to "" for use in a list of strings it'll be subsequently
coerced to "false" as a boolean.

The callback pattern being used here will make it easy to introduce
e.g. a "multi" variant which coerces its values to "bool", "int",
"path" etc.

1. 40ea4ed903 (Add config_error_nonbool() helper function,
   2008-02-11)
2. 6c47d0e8f3 (config.c: guard config parser from value=NULL,
   2008-02-11).
3. 4c715ebb96 (test-config: add tests for the config_set API,
   2014-07-28)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
1c7e239bd0 config API users: test for *_get_value_multi() segfaults
As we'll discuss in the subsequent commit these tests all
show *_get_value_multi() API users unable to handle there being a
value-less key in the config, which is represented with a "NULL" for
that entry in the "string" member of the returned "struct
string_list", causing a segfault.

These added tests exhaustively test for that issue, as we'll see in a
subsequent commit we'll need to change all of the API users
of *_get_value_multi(). These cases were discovered by triggering each
one individually, and then adding these tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
f7b2ff9516 for-each-repo: error on bad --config
As noted in 6c62f01552 (for-each-repo: do nothing on empty config,
2021-01-08) this command wants to ignore a non-existing config key,
but let's not conflate that with bad config.

Before this, all these added tests would pass with an exit code of 0.

We could preserve the comment added in 6c62f01552, but now that we're
directly using the documented repo_config_get_value_multi() value it's
just narrating something that should be obvious from the API use, so
let's drop it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
a428619309 config API: have *_multi() return an "int" and take a "dest"
Have the "git_configset_get_value_multi()" function and its siblings
return an "int" and populate a "**dest" parameter like every other
git_configset_get_*()" in the API.

As we'll take advantage of in subsequent commits, this fixes a blind
spot in the API where it wasn't possible to tell whether a list was
empty from whether a config key existed. For now we don't make use of
those new return values, but faithfully convert existing API users.

Most of this is straightforward, commentary on cases that stand out:

- To ensure that we'll properly use the return values of this function
  in the future we're using the "RESULT_MUST_BE_USED" macro introduced
  in [1].

  As git_die_config() now has to handle this return value let's have
  it BUG() if it can't find the config entry. As tested for in a
  preceding commit we can rely on getting the config list in
  git_die_config().

- The loops after getting the "list" value in "builtin/gc.c" could
  also make use of "unsorted_string_list_has_string()" instead of using
  that loop, but let's leave that for now.

- In "versioncmp.c" we now use the return value of the functions,
  instead of checking if the lists are still non-NULL.

1. 1e8697b5c4 (submodule--helper: check repo{_submodule,}_init()
   return values, 2022-09-01),

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
f6f348a6d5 versioncmp.c: refactor config reading next commit
Refactor the reading of the versionSort.suffix and
versionSort.prereleaseSuffix configuration variables to stay within
the bounds of our CodingGuidelines when it comes to line length, and
to avoid repeating ourselves.

Renaming "deprecated_prereleases" to "oldl" doesn't help us to avoid
line wrapping now, but it will in a subsequent commit.

Let's also split out the names of the config variables into variables
of our own, and refactor the nested if/else to avoid indenting it, and
the existing bracing style issue.

This all helps with the subsequent commit, where we'll need to start
checking different git_config_get_value_multi() return value. See
c026557a37 (versioncmp: generalize version sort suffix reordering,
2016-12-08) for the original implementation of most of this.

Moving the "initialized = 1" assignment allows us to move some of this
to the variable declarations in the subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
b83efcecaf config API: add and use a "git_config_get()" family of functions
We already have the basic "git_config_get_value()" function and its
"repo_*" and "configset" siblings to get a given "key" and assign the
last key found to a provided "value".

But some callers don't care about that value, but just want to use the
return value of the "get_value()" function to check whether the key
exist (or another non-zero return value).

The immediate motivation for this is that a subsequent commit will
need to change all callers of the "*_get_value_multi()" family of
functions. In two cases here we (ab)used it to check whether we had
any values for the given key, but didn't care about the return value.

The rest of the callers here used various other config API functions
to do the same, all of which resolved to the same underlying functions
to provide the answer.

Some of these were using either git_config_get_string() or
git_config_get_string_tmp(), see fe4c750fb1 (submodule--helper: fix a
configure_added_submodule() leak, 2022-09-01) for a recent example. We
can now use a helper function that doesn't require a throwaway
variable.

We could have changed git_configset_get_value_multi() (and then
git_config_get_value() etc.) to accept a "NULL" as a "dest" for all
callers, but let's avoid changing the behavior of existing API
users. Having an "unused" value that we throw away internal to
config.c is cheap.

A "NULL as optional dest" pattern is also more fragile, as the intent
of the caller might be misinterpreted if he were to accidentally pass
"NULL", e.g. when "dest" is passed in from another function.

Another name for this function could have been
"*_config_key_exists()", as suggested in [1]. That would work for all
of these callers, and would currently be equivalent to this function,
as the git_configset_get_value() API normalizes all non-zero return
values to a "1".

But adding that API would set us up to lose information, as e.g. if
git_config_parse_key() in the underlying configset_find_element()
fails we'd like to return -1, not 1.

Let's change the underlying configset_find_element() function to
support this use-case, we'll make further use of it in a subsequent
commit where the git_configset_get_value_multi() function itself will
expose this new return value.

This still leaves various inconsistencies and clobbering or ignoring
of the return value in place. E.g here we're modifying
configset_add_value(), but ever since it was added in [2] we've been
ignoring its "int" return value, but as we're changing the
configset_find_element() it uses, let's have it faithfully ferry that
"ret" along.

Let's also use the "RESULT_MUST_BE_USED" macro introduced in [3] to
assert that we're checking the return value of
configset_find_element().

We're leaving the same change to configset_add_value() for some future
series. Once we start paying attention to its return value we'd need
to ferry it up as deep as do_config_from(), and would need to make
least read_{,very_}early_config() and git_protected_config() return an
"int" instead of "void". Let's leave that for now, and focus on
the *_get_*() functions.

1. 3c8687a73e (add `config_set` API for caching config-like files, 2014-07-28)
2. https://lore.kernel.org/git/xmqqczadkq9f.fsf@gitster.g/
3. 1e8697b5c4 (submodule--helper: check repo{_submodule,}_init()
   return values, 2022-09-01),

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:52 -07:00
e7587a8f53 config tests: add "NULL" tests for *_get_value_multi()
A less well known edge case in the config format is that keys can be
value-less, a shorthand syntax for "true" boolean keys. I.e. these two
are equivalent as far as "--type=bool" is concerned:

	[a]key
	[a]key = true

But as far as our parser is concerned the values for these two are
NULL, and "true". I.e. for a sequence like:

	[a]key=x
	[a]key
	[a]key=y

We get a "struct string_list" with "string" members with ".string"
values of:

	{ "x", NULL, "y" }

This behavior goes back to the initial implementation of
git_config_bool() in 17712991a5 (Add ".git/config" file parser,
2005-10-10).

When parts of the config_set API were tested for in [1] they didn't
add coverage for 3/4 of the "(NULL)" cases handled in
"t/helper/test-config.c". We'd test that case for "get_value", but not
"get_value_multi", "configset_get_value" and
"configset_get_value_multi".

We now cover all of those cases, which in turn expose the details of
how this part of the config API works.

1. 4c715ebb96 (test-config: add tests for the config_set API,
   2014-07-28)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:52 -07:00
258902ce07 config tests: cover blind spots in git_die_config() tests
There were no tests checking for the output of the git_die_config()
function in the config API, added in 5a80e97c82 (config: add
`git_die_config()` to the config-set API, 2014-08-07). We only tested
"test_must_fail", but didn't assert the output.

We need tests for this because a subsequent commit will alter the
return value of git_config_get_value_multi(), which is used to get the
config values in the git_die_config() function. This test coverage
helps to build confidence in that subsequent change.

These tests cover different interactions with git_die_config():

- The "notes.mergeStrategy" test in
  "t/t3309-notes-merge-auto-resolve.sh" is a case where a function
  outside of config.c (git_config_get_notes_strategy()) calls
  git_die_config().

- The "gc.pruneExpire" test in "t5304-prune.sh" is a case where
  git_config_get_expiry() calls git_die_config(), covering a different
  "type" than the "string" test for "notes.mergeStrategy".

- The "fetch.negotiationAlgorithm" test in
  "t/t5552-skipping-fetch-negotiator.sh" is a case where
  git_config_get_string*() calls git_die_config().

We also cover both the "from command-line config" and "in file..at
line" cases here.

The clobbering of existing ".git/config" files here is so that we're
not implicitly testing the line count of the default config.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:52 -07:00
4a93b899c1 libs: use "struct repository *" argument, not "the_repository"
As can easily be seen from grepping in our sources, we had these uses
of "the_repository" in various library code in cases where the
function in question was already getting a "struct repository *"
argument. Let's use that argument instead.

Out of these changes only the changes to "cache-tree.c",
"commit-reach.c", "shallow.c" and "upload-pack.c" would have cleanly
applied before the migration away from the "repo_*()" wrapper macros
in the preceding commits.

The rest aren't new, as we'd previously implicitly refer to
"the_repository", but it's now more obvious that we were doing the
wrong thing all along, and should have used the parameter instead.

The change to change "get_index_format_default(the_repository)" in
"read-cache.c" to use the "r" variable instead should arguably have
been part of [1], or in the subsequent cleanup in [2]. Let's do it
here, as can be seen from the initial code in [3] it's not important
that we use "the_repository" there, but would prefer to always use the
current repository.

This change excludes the "the_repository" use in "upload-pack.c"'s
upload_pack_advertise(), as the in-flight [4] makes that change.

1. ee1f0c242e (read-cache: add index.skipHash config option,
   2023-01-06)
2. 6269f8eaad (treewide: always have a valid "index_state.repo"
   member, 2023-01-17)
3. 7211b9e753 (repo-settings: consolidate some config settings,
   2019-08-13)
4. <Y/hbUsGPVNAxTdmS@coredump.intra.peff.net>

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
c7c33f50bd post-cocci: adjust comments for recent repo_* migration
In preceding commits we changed many calls to macros that were
providing a "the_repository" argument to invoke corresponding repo_*()
function instead. Let's follow-up and adjust references to those in
comments, which coccinelle didn't (and inherently can't) catch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
035c7de9e9 cocci: apply the "revision.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"revision.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
b26a71b1be cocci: apply the "rerere.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"rerere.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
12cb1c10a6 cocci: apply the "refs.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"refs.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
a5183d7696 cocci: apply the "promisor-remote.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"promisor-remote.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
afe27c8894 cocci: apply the "packfile.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"packfile.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
bab821646a cocci: apply the "pretty.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"pretty.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
bc726bd075 cocci: apply the "object-store.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"object-store.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
085390328f cocci: apply the "diff.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"diff.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
ecb5091fd4 cocci: apply the "commit.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"commit.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
cb338c23d6 cocci: apply the "commit-reach.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"commit-reach.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
d850b7a545 cocci: apply the "cache.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"cache.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
7258e892d2 cocci: add missing "the_repository" macros to "pending"
In the case of diff.h, rerere.h and revision.h the macros were added
in [1], [2] and [3] when "the_repository.pending.cocci" didn't
exist. None of the subsequently added migration rules covered
them. Let's add those missing rules.

In the case of macros in "cache.h", "commit.h", "packfile.h",
"promisor-remote.h" and "refs.h" those aren't guarded by
"NO_THE_REPOSITORY_COMPATIBILITY_MACROS", but they're also macros that
add "the_repository" as the first argument, so we should migrate away
from them.

1. 2abf350385 (revision.c: remove implicit dependency on the_index,
   2018-09-21)
2. e675765235 (diff.c: remove implicit dependency on the_index,
   2018-09-21)
3. 35843b1123 (rerere.c: remove implicit dependency on the_index,
   2018-09-21)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
5978de2031 cocci: sort "the_repository" rules by header
Sort the "the_repository.pending.cocci" file by which header the
macros are in, and add a comment to that effect in front of the
rules. This will make subsequent commits easier to follow, as we'll be
applying these rules on a header-by-header basis.

Once we've fully applied "the_repository.pending.cocci" we'll keep
this rules around for a while in "the_repository.cocci", to help any
outstanding topics and out-of-tree code to resolve textual or semantic
conflicts with these changes, but eventually we'll remove the
"the_repository.cocci" as a follow-up.

So even if some of these functions are subsequently moved and/or split
into other or new headers there's no risk of this becoming stale, if
and when that happens the we should be removing these rules anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
6f1436ba2a cocci: fix incorrect & verbose "the_repository" rules
When these rules started being added in [1] they didn't use a ";"
after the ")", and would thus catch uses of these macros within
expressions. But as of [2] the new additions were broken in that
they'd only match a subset of the users of these macros.

Rather than narrowly fixing that, let's have these use the much less
verbose pattern introduced in my recent [3]: There's no need to
exhaustively enumerate arguments if we use the "..." syntax. This
means that we can fold all of these different rules into one.

1. afd69dcc21 (object-store: prepare read_object_file to deal with
   any repo, 2018-11-13)
2. 21a9651ba3 (commit-reach: prepare get_merge_bases to handle any
   repo, 2018-11-13)
3. 0e6550a2c6 (cocci: add a index-compatibility.pending.cocci,
   2022-11-19)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
49c2d93ecf cocci: remove dead rule from "the_repository.pending.cocci"
The "parse_commit_gently" macro went away in [1], so we don't need to
carry this for its migration.

1. ea3f7e598c (revision: use repository from rev_info when parsing
   commits, 2020-06-23)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
3dc0b7f0dc t3070: make chain lint tester happy
1f2e05f0b7 ("wildmatch: fix exponential behavior", 2023-03-20)
introduced a new test with a background process. Backgrounding
necessarily gives a result of 0, so that a seemingly broken && chain is
not really broken.

Adjust t3070 slightly so that our chain lint test recognizes the
construct for what it is and does not raise a false positive.

Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 17:02:38 -07:00
818b4f823f credential/wincred: include wincred.h
Delete redundant definitions. Mingw-w64 has wincred.h since 2007 [1].

[1] 9d937a7f4f/mingw-w64-headers/include/wincred.h

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 15:21:13 -07:00
d3b3419f8f config: tell the user that we expect an ASCII character
Commit 50b54fd72a (config: be strict on core.commentChar, 2014-05-17)
notes that “multi-byte character encoding could also be misinterpreted”,
and indeed a multi-byte codepoint (non-ASCII) is not accepted as a valid
`core.commentChar`.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 13:09:38 -07:00
d3af1c193d commit-graph: fix truncated generation numbers
In 80c928d947 (commit-graph: simplify compute_generation_numbers(),
2023-03-20), the code to compute generation numbers was simplified to
use the same infrastructure as is used to compute topological levels.
This refactoring introduced a bug where the generation numbers are
truncated when they exceed UINT32_MAX because we explicitly cast the
computed generation number to `uint32_t`. This is not required though:
both the computed value and the field of `struct commit_graph_data` are
of the same type `timestamp_t` already, so casting to `uint32_t` will
cause truncation.

This cast can cause us to miscompute generation data overflows:

    1. Given a commit with no parents and committer date
       `UINT32_MAX + 1`.

    2. We compute its generation number as `UINT32_MAX + 1`, but
       truncate it to `1`.

    3. We calculate the generation offset via `$generation - $date`,
       which is thus `1 - (UINT32_MAX + 1)`. The computation underflows
       and we thus end up with an offset that is bigger than the maximum
       allowed offset.

As a result, we'd be writing generation data overflow information into
the commit-graph that is bogus and ultimately not even required.

Fix this bug by removing the needless cast.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:52:06 -07:00
00408adeac builtin/sparse-checkout: add check-rules command
There exists no direct way to interrogate git about which paths are
matched by a given set of sparsity rules. It is possible to get this
information from git, but it includes checking out the commit that
contains the paths, applying the sparse checkout patterns and then using
something like 'git ls-files -t' to check if the skip worktree bit is
set. This works in some case, but there are cases where it is awkward or
infeasible to generate a checkout for this purpose.

Exposing the pattern matching of sparse checkout enables more tooling to
be built and avoids a situation where tools that want to reason about
sparse checkouts start containing parallel implementation of the rules.
To accommodate this, add a 'check-rules' subcommand to the
'sparse-checkout' builtin along the lines of the 'git check-ignore' and
'git check-attr' commands. The new command accepts a list of paths on
stdin and outputs just the ones the match the sparse checkout.

To allow for use in a bare repository and to allow for interrogating
about other patterns than the current ones, include a '--rules-file'
option which allows the caller to explicitly pass sparse checkout rules
in the format accepted by 'sparse-checkout set --stdin'.

To allow for reuse of the handling of input patterns for the
'--rules-file' flag, modify 'add_patterns_from_input()' to be able to
read from a 'FILE' instead of just stdin.

To allow for reuse of the logic which decides whether or not rules
should be interpreted as cone-mode patterns, split that part out of
'update_modes()' such that can be called without modifying the config.

An alternative could have been to create a new 'check-sparsity' command.
However, placing it under 'sparse-checkout' allows for a) more easily
re-using the sparse checkout pattern matching and cone/non-code mode
handling, and b) keeps the documentation for the command next to the
experimental warning and the cone-mode discussion.

Signed-off-by: William Sprent <williams@unity3d.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:51:12 -07:00
24fc2cde64 builtin/sparse-checkout: remove NEED_WORK_TREE flag
In preparation for adding a sub-command to 'sparse-checkout' that can be
run in a bare repository, remove the 'NEED_WORK_TREE' flag from its
entry in the 'commands' array of 'git.c'.

To avoid that this changes any behaviour, add calls to
'setup_work_tree()' to all of the 'sparse-checkout' sub-commands and add
tests that verify that 'sparse-checkout <cmd>' still fail with a clear
error message telling the user that the command needs a work tree.

Signed-off-by: William Sprent <williams@unity3d.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:43:51 -07:00
3457b50e8c t3701: we don't need no Perl for add -i anymore
This should have been removed in `ab/retire-scripted-add-p` but wasn't.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:40:12 -07:00
061dd722dc unpack-trees: take care to propagate the split-index flag
When copying the `split_index` structure from one index structure to
another, we need to propagate the `SPLIT_INDEX_ORDERED` flag, too, if it
is set, otherwise Git might forget to write the shared index when that
is actually needed.

It just so _happens_ that in many instances when `unpack_trees()` is
called, the result causes the shared index to be written anyway, but
there are edge cases when that is not so.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:40 -07:00
be6b65b91b fsmonitor: avoid overriding cache_changed bits
As of e636a7b4d0 (read-cache: be specific what part of the index has
changed, 2014-06-13), the paradigm `cache_changed = 1` fell out of
fashion and it became a bit field instead.

This is important because some bits have specific meaning and should not
be unset without care, e.g. `SPLIT_INDEX_ORDERED`.

However, b5a8169752 (mark_fsmonitor_valid(): mark the index as changed
if needed, 2019-05-24) did use the `cache_changed` attribute as if it
were a Boolean instead of a bit field.

That not only would override the `SPLIT_INDEX_ORDERED` bit when marking
index entries as valid via the FSMonitor, but worse: it would set the
`SOMETHING_OTHER` bit (whose value is 1). This means that Git would
unnecessarily force a full index to be written out when a split index
was asked for.

Let's instead use the bit that is specifically intended to indicate
FSMonitor-triggered changes, allowing the split-index feature to work as
designed.

Noticed-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:39 -07:00
3b7a4475b0 split-index; stop abusing the base_oid to strip the "link" extension
When a split-index is in effect, the `$GIT_DIR/index` file needs to
contain a "link" extension that contains all the information about the
split-index, including the information about the shared index.

However, in some cases Git needs to suppress writing that "link"
extension (i.e. to fall back to writing a full index) even if the
in-memory index structure _has_ a `split_index` configured. This is the
case e.g. when "too many not shared" index entries exist.

In such instances, the current code sets the `base_oid` field of said
`split_index` structure to all-zero to indicate that `do_write_index()`
should skip writing the "link" extension.

This can lead to problems later on, when the in-memory index is still
used to perform other operations and eventually wants to write a
split-index, detects the presence of the `split_index` and reuses that,
too (under the assumption that it has been initialized correctly and
still has a non-null `base_oid`).

Let's stop zeroing out the `base_oid` to indicate that the "link"
extension should not be written.

One might be tempted to simply call `discard_split_index()` instead,
under the assumption that Git decided to write a non-split index and
therefore the `split_index` structure might no longer be wanted.
However, that is not possible because that would release index entries
in `split_index->base` that are likely to still be in use. Therefore we
cannot do that.

The next best thing we _can_ do is to introduce a bit field to indicate
specifically which index extensions (not) to write. So that's what we do
here.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:39 -07:00
3704fed5ea split-index & fsmonitor: demonstrate a bug
This commit adds a new test case that demonstrates a bug in the
split-index code that is triggered under certain circumstances when the
FSMonitor is enabled, and its symptom manifests in the form of one of
the following error messages:

    BUG: fsmonitor.c:20: fsmonitor_dirty has more entries than the index (2 > 1)

    BUG: unpack-trees.c:776: pos <n> doesn't point to the first entry of <dir>/ in index

    error: invalid path ''
    error: The following untracked working tree files would be overwritten by reset:
            initial.t

Which of these error messages appears depends on timing-dependent
conditions.

Technically the root cause lies with a bug in the split-index code that
has nothing to do with FSMonitor, but for the sake of this new test case
it was the easiest way to trigger the bug.

The bug is this: Under specific conditions, Git needs to skip writing
the "link" extension (which is the index extension containing the
information pertaining to the split-index). To do that, the `base_oid`
attribute of the `split_index` structure in the in-memory index is
zeroed out, and `do_write_index()` specifically checks for a "null"
`base_oid` to understand that the "link" extension should not be
written. However, this violates the consistency of the in-memory index
structure, but that does not cause problems in most cases because the
process exits without using the in-memory index structure anymore,
anyway.

But: _When_ the in-memory index is still used (which is the case e.g. in
`git rebase`), subsequent writes of `the_index` are at risk of writing
out a bogus index file, one that _should_ have a "link" extension but
does not. In many cases, the `SPLIT_INDEX_ORDERED` flag _happens_ to be
set for subsequent writes, forcing the shared index to be written, which
re-initializes `base_oid` to a non-bogus state, and all is good.

When it is _not_ set, however, all kinds of mayhem ensue, resulting in
above-mentioned error messages, and often enough putting worktrees in a
totally broken state where the only recourse is to manually delete the
`index` and the `index.lock` files and then call `git reset` manually.
Not something to ask users to do.

The reason why it is comparatively easy to trigger the bug with
FSMonitor is that there is _another_ bug in the FSMonitor code:
`mark_fsmonitor_valid()` sets `cache_changed` to 1, i.e. treating that
variable as a Boolean. But it is a bit field, and 1 happens to be the
`SOMETHING_CHANGED` bit that forces the "link" extension to be skipped
when writing the index, among other things.

"Comparatively easy" is a relative term in this context, for sure. The
essence of how the new test case triggers the bug is as following:

1. The `git rebase` invocation will first reset the worktree to
   a commit that contains only the `one.t` file, and then execute a
   rebase script that starts with the following commands (commit hashes
   skipped):

   label onto

   reset initial
   pick two
   label two

   reset two
   pick three
   [...]

2. Before executing the `label` command, a split index is written, as
   well as the shared index.

3. The `reset initial` command in the rebase script writes out a new
   split index but skips writing the shared index, as intended.

4. The `pick two` command updates the worktree and refreshes the index,
   marking the `two.t` entry as valid via the FSMonitor, which sets the
   `SOMETHING_CHANGED` bit in `cache_changed`, which in turn causes the
   `base_oid` attribute to be zeroed out and a full (non-split) index
   to be written (making sure _not_ to write the "link" extension).

5. Now, the `reset two` command will leave the worktree alone, but
   still write out a new split index, not writing the shared index
   (because `base_oid` is still zeroed out, and there is no index entry
   update requiring it to be written, either).

6. When it is turn to run `pick three`, the index is read, but it is
   too short: It only contains a single entry when there should be two,
   because the "link" extension is missing from the written-out index
   file.

There are three bugs at play, actually, which will be fixed over the
course of the next commits:

- The `base_oid` attribute should not be zeroed out to indicate when
  the "link" extension should not be written, as it puts the in-memory
  index structure into an inconsistent state.

- The FSMonitor should not overwrite bits in `cache_changed`.

- The `unpack_trees()` function tries to reuse the `split_index`
  structure from the source index, if any, but does not propagate the
  `SPLIT_INDEX_ORDERED` flag.

While a fix for the second bug would let this test case pass, there are
other conditions where the `SOMETHING_CHANGED` bit is set. Therefore,
the bug that most crucially needs to be fixed is the first one.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:39 -07:00
3521c63213 branch: avoid unnecessary worktrees traversals
When we rename a branch ref, we need to update any worktree that have
its HEAD pointing to the branch ref being renamed, so to make it use the
new ref name.

If we know in advance that we're renaming a branch that is not currently
checked out in any worktree, we can skip this step entirely.  Let's do
it so.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:15 -07:00
a675ad1708 branch: rename orphan branches in any worktree
In cfaff3aac (branch -m: allow renaming a yet-unborn branch, 2020-12-13)
we added support for renaming an orphan branch when that branch is
checked out in the current worktree.

Let's also allow renaming an orphan branch checked out in a worktree
different than the current one.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:15 -07:00
7a6ccdfb4e branch: description for orphan branch errors
In bcfc82bd48 (branch: description for non-existent branch errors,
2022-10-08) we checked the HEAD in the current worktree to detect if the
branch to operate with is an orphan branch, so as to avoid the confusing
error: "No branch named...".

If we are asked to operate with an orphan branch in a different working
tree than the current one, we need to check the HEAD in that different
working tree.

Let's extend the check we did in bcfc82bd48, to check the HEADs in all
worktrees linked to the current repository, using the helper introduced
in 31ad6b61bd (branch: add branch_checked_out() helper, 2022-06-15).

The helper, branch_checked_out(), does its work obtaining internally a
list of worktrees linked to the current repository.  Obtaining that list
is not a lightweight work because it implies disk access.

In copy_or_rename_branch() we already have a list of worktrees.  Let's
use that already obtained list, and avoid using here the helper.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:14 -07:00
d7f4ca61b5 branch: use get_worktrees() in copy_or_rename_branch()
Obtaining the list of worktrees, using get_worktrees(), is not a
lightweight operation, because it involves reading from disk.

Let's stop calling get_worktrees() in reject_rebase_or_bisect_branch()
and in replace_each_worktree_head_symref().  Make them receive the list
of worktrees from their only caller, copy_or_rename_branch().

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:14 -07:00
2e8af499ff branch: test for failures while renaming branches
When we introduced replace_each_worktree_head_symref() in 70999e9cec
(branch -m: update all per-worktree HEADs, 2016-03-27), we implemented a
best effort approach.

If we are asked to rename a branch that is simultaneously checked out in
multiple worktrees, we try to update all of those worktrees.  If we fail
updating any of them, we die() as a signal that something has gone
wrong.  However, at this point, the branch ref has already been renamed
and also updated the HEADs of the successfully updated worktrees.
Despite returning an error, we do not try to rollback those changes.

Let's add a test to notice if we change this behavior in the future.

In next commits we will change replace_each_worktree_head_symref() to
work more closely with its only caller, copy_or_rename_branch().  Let's
move the former closer to its caller, to facilitate those changes.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:14 -07:00
6605fb70cb rebase: add a config option for --rebase-merges
The purpose of the new option is to accommodate users who would like
--rebase-merges to be on by default and to facilitate turning on
--rebase-merges by default without configuration in a future version of
Git.

Name the new option rebase.rebaseMerges, even though it is a little
redundant, for consistency with the name of the command line option and
to be clear when scrolling through values in the [rebase] section of
.gitconfig.

Support setting rebase.rebaseMerges to the nonspecific value "true" for
users who don't need to or don't want to learn about the difference
between rebase-cousins and no-rebase-cousins.

Make --rebase-merges without an argument on the command line override
any value of rebase.rebaseMerges in the configuration, for consistency
with other command line flags with optional arguments that have an
associated config option.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:32:49 -07:00
33561f5170 rebase: deprecate --rebase-merges=""
The unusual syntax --rebase-merges="" (that is, --rebase-merges with an
empty string argument) has been an undocumented synonym of
--rebase-merges without an argument. Deprecate that syntax to avoid
confusion when a rebase.rebaseMerges config option is introduced, where
rebase.rebaseMerges="" will be equivalent to --no-rebase-merges.

It is not likely that anyone is actually using this syntax, but just in
case, deprecate the empty string argument instead of dropping support
for it immediately.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:32:49 -07:00
7e5dcec3ca rebase: add documentation and test for --no-rebase-merges
As far as I can tell, --no-rebase-merges has always worked, but has
never been documented. It is especially important to document it before
a rebase.rebaseMerges option is introduced so that users know how to
override the config option on the command line. It's also important to
clarify that --rebase-merges without an argument is not the same as
--no-rebase-merges and not passing --rebase-merges is not the same as
passing --rebase-merges=no-rebase-cousins.

A test case is necessary to make sure that --no-rebase-merges keeps
working after its code is refactored in the following patches of this
series. The test case is a little contrived: It's unlikely that a user
would type both --rebase-merges and --no-rebase-merges at the same time.
However, if an alias is defined which includes --rebase-merges, the user
might decide to add --no-rebase-merges to countermand that part of the
alias but leave alone other flags set by the alias.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:32:49 -07:00
1aaed69d11 t5000: use check_mtime()
fd2da4b1ea (archive: add --mtime, 2023-02-18) added a helper function
for checking the file modification time of an extracted entry.  Use it
for the older mtime test as well to shorten the code and piggyback on
the archive extraction done to validate file contents.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:13:30 -07:00
92b1dd1b9e archive: improve support for running in subdirectory
When git archive is started in a subdirectory, it archives its
corresponding tree and its child objects, only.  That is intended.  It
does that by effectively cd'ing into that tree and setting "prefix" to
the empty string.

This has unfortunate consequences, though: Attributes are anchored at
the root of the repository and git archive still applies them to
subtrees, causing mismatches.  And when checking pathspecs it cannot
tell the difference between one that doesn't match anthing or one that
matches some actual blob outside of the subdirectory, leading to a
confusing error message.

Fix that by keeping the "prefix" value and passing it to pathspec and
attribute functions, and shortening it using relative_path() for paths
written to the archive and (if --verbose is given) to stdout.

Still reject attempts to archive files outside the current directory,
but print a more specific error in that case.  Recognizing it requires a
full traversal of the subtree for each pathspec, however.  Allowing them
would be easier, but archive entry paths starting with "../" can be
problematic to extract -- e.g. bsdtar skips them by default.

Reported-by: Cristian Le <cristian.le@mpsd.mpg.de>
Reported-by: Matthias Görgens <matthias.goergens@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-24 15:51:25 -07:00
1a3119ed06 blame: allow --contents to work with non-HEAD commit
The --contents option can be used with git blame to blame the file as if
it had the contents from the specified file. This is akin to copying the
contents into the working tree and then running git blame. This option
has been supported since 1cfe77333f ("git-blame: no rev means start
from the working tree file.")

The --contents option always blames the file as if it was based on the
current HEAD commit. If you try to pass a revision while using
--contents, you get the following error:

  fatal: cannot use --contents with final commit object name

This is because the blame process generates a fake working tree commit
which always uses the HEAD object as its sole parent.

Enhance fake_working_tree_commit to take the object ID to use for the
parent instead of always using the HEAD object. Then, always generate a
fake commit when we have contents provided, even if we have a final
object. Remove the check to disallow --contents and a final revision.

Note that the behavior of generating a fake working commit is still
skipped when a revision is provided but --contents is not provided.
Generating such a commit in that case would combine the currently
checked out file contents with the provided revision, which breaks
normal blame behavior and produces unexpected results.

This enables use of --contents with an arbitrary revision, rather than
forcing the use of the local HEAD commit. This makes the --contents
option significantly more flexible, as it is no longer required to check
out the working tree to the desired commit before using --contents.

Reword the documentation so that its clear that --contents can be used
with <rev>.

Add tests for the --contents option to the annotate-tests.sh test
script.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-24 12:05:22 -07:00
54dbd0933b sequencer: rewrite save_head() in terms of write_message()
Saves some code duplication.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-24 08:02:05 -07:00
2da2cc9b28 sequencer: remove pointless rollback_lock_file()
The file is gone even if commit_lock_file() fails.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-24 07:52:16 -07:00
4406522b76 pack-redundant: escalate deprecation warning to an error
In c3b58472be (pack-redundant: gauge the usage before proposing its
removal, 2020-08-25), we added a big, ugly warning when pack-redundant
is run. The plan there indicated that we would ratchet that up to an
error before finally removing it. Since it has been 2.5 years (and 9
releases) since then, let's continue with the plan.

Note that we did get one bite on the warning, which was somebody asking
about alternatives:

  https://lore.kernel.org/git/CAKvOHKAFXQwt4D8yUCCkf_TQL79mYaJ=KAKhtpDNTvHJFuX1NA@mail.gmail.com/

but we didn't undo the ugly warning (and the advice continues to be "use
repack -d" instead).

There was also some discussion around the time of the deprecation that
pack-redundant was invoked by the bitbake tool, and it still seems to do
so now:

  https://git.openembedded.org/bitbake

That use should probably just go away in favor of an occasional repack
(which probably even happens via auto-gc after fetch these days).

But since neither of those data points caused us to cancel the
deprecation plan by dropping the warning, it seems like we should
proceed with the next step.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-23 13:56:02 -07:00
0a01d41ee4 http: add support for different sslcert and sslkey types.
Basically git work with default curl ssl type - PEM. But for support
eTokens like SafeNet tokens via pksc11 need setup 'ENG' as sslcert type
and as sslkey type. So there added additional options for http to make
that possible.

Signed-off-by: Stanislav Malishevskiy <stanislav.malishevskiy@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-23 11:25:10 -07:00
14b9a04479 grep: work around UTF-8 related JIT bug in PCRE2 <= 10.34
Stephane is reporting[1] a regression introduced in git v2.40.0 that leads
to 'git grep' segfaulting in his CI pipeline. It turns out, he's using an
older version of libpcre2 that triggers a wild pointer dereference in
the generated JIT code that was fixed in PCRE2 10.35.

Instead of completely disabling the JIT compiler for the buggy version,
just mask out the Unicode property handling as we used to do prior to
commit acabd2048e ("grep: correctly identify utf-8 characters with
\{b,w} in -P").

[1] https://lore.kernel.org/git/7E83DAA1-F9A9-4151-8D07-D80EA6D59EEA@clumio.com/

Reported-by: Stephane Odul <stephane@clumio.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-23 11:19:34 -07:00
8453685d04 Makefile: force -O0 when compiling with SANITIZE=leak
Cherry pick commit d3775de0 (Makefile: force -O0 when compiling with
SANITIZE=leak, 2022-10-18), as otherwise the leak checker at GitHub
Actions CI seems to fail with a false positive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-23 09:17:23 +01:00
d051f1718e fast-export: drop unused parameter from anonymize_commit_message()
As the comment above the function indicates, we do not bother actually
storing commit messages in our anonymization map. But we still take the
message as a parameter, and just ignore it. Let's stop doing that, which
will make -Wunused-parameter happier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 15:37:09 -07:00
65c756fff0 fast-export: drop data parameter from anonymous generators
The anonymization code has a specific generator callback for each type
of data (e.g., one for paths, one for oids, and so on). These all take a
"data" parameter, but none of them use it for anything. Which is not
surprising, as the point is to generate a new name independent of any
input, and each function keeps its own static counter.

We added the extra pointer in d5bf91fde4 (fast-export: add a "data"
callback parameter to anonymize_str(), 2020-06-23) to handle
--anonymize-map parsing, but that turned out to be awkward itself, and
was recently dropped.

So let's get rid of this "data" parameter that nobody is using, both
from the generators and from anonymize_str() which plumbed it through.
This simplifies the code, and makes -Wunused-parameter happier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 15:37:09 -07:00
aa548459a0 fast-export: de-obfuscate --anonymize-map handling
When we handle an --anonymize-map option, we parse the orig/anon pair,
and then feed the "orig" string to anonymize_str(), along with a
generator function that duplicates the "anon" string to be cached in the
map.

This works, because anonymize_str() says "ah, there is no mapping yet
for orig; I'll add one from the generator". But there are some
downsides:

  1. It's a bit too clever, as it's not obvious what the code is trying
     to do or why it works.

  2. It requires allowing generator functions to take an extra void
     pointer, which is not something any of the normal callers of
     anonymize_str() want.

  3. It does the wrong thing if the same token is provided twice.
     When there are conflicting options, like:

       git fast-export --anonymize \
         --anonymize-map=foo:one \
	 --anonymize-map=foo:two

     we usually let the second one override the first. But by using
     anonymize_str(), which has first-one-wins logic, we do the
     opposite.

So instead of relying on anonymize_str(), let's directly add the entry
ourselves. We can tweak the tests to show that we handle overridden
options correctly now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 15:37:09 -07:00
dcc4e134aa fast-export: factor out anonymized_entry creation
When anonymizing output, there's only one spot where we generate new
entries to add to our hashmap: when anonymize_str() doesn't find an
entry, we use the generate() callback to make one and add it. Let's pull
that into its own function in preparation for another caller.

Note that we'll add one extra feature. In anonymize_str(), we know that
we won't find an existing entry in the hashmap (since it will only try
to add after failing to find one). But other callers won't have the same
behavior, so we should catch this case and free the now-dangling entry.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 15:37:09 -07:00
d6484e9fab fast-export: simplify initialization of anonymized hashmaps
We take pains to avoid doing a lookup on a hashmap which has not been
initialized with hashmap_init(). That was necessary back when this code
was written. But hashmap_get() became safer in b7879b0ba6 (hashmap:
allow re-use after hashmap_free(), 2020-11-02). Since then it's OK to
call functions on a zero-initialized table; it will just correctly
return NULL, since there is no match.

This simplifies the code a little, and also lets us keep the
initialization line closer to when we add an entry (which is when the
hashmap really does need to be totally initialized). That will help
later refactoring.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 15:37:08 -07:00
76e50f7fbc fast-export: drop const when storing anonymized values
We store anonymized values as pointers to "const char *", since they are
conceptually const to callers who use them. But they are actually
allocated strings whose memory is owned by the struct.

The ownership mismatch hasn't been a big deal since we never free() them
(they are held until the program ends), but let's switch them to "char *"
in preparation for changing that.

Since most code only accesses them via anonymize_str(), it can continue
to narrow them to "const char *" in its return value.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 15:37:08 -07:00
e4cb3693a4 Merge branch 'backport/jk/range-diff-fixes'
"git range-diff" code clean-up. Needed to pacify modern GCC versions.

* jk/range-diff-fixes:
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
2023-03-22 18:00:36 +01:00
3c7896e362 Merge branch 'backport/jk/curl-avoid-deprecated-api' into maint-2.30
Deal with a few deprecation warning from cURL library.

* jk/curl-avoid-deprecated-api:
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
2023-03-22 18:00:36 +01:00
6f5ff3aa31 Merge branch 'backport/jx/ci-ubuntu-fix' into maint-2.30
Adjust the GitHub CI to newer ubuntu release.

* jx/ci-ubuntu-fix:
  github-actions: run gcc-8 on ubuntu-20.04 image
  ci: install python on ubuntu
  ci: use the same version of p4 on both Linux and macOS
  ci: remove the pipe after "p4 -V" to catch errors
2023-03-22 18:00:35 +01:00
0737200a06 Merge branch 'backport/jc/http-clear-finished-pointer' into maint-2.30
Meant to go with js/ci-gcc-12-fixes.
source: <xmqq7d68ytj8.fsf_-_@gitster.g>

* jc/http-clear-finished-pointer:
  http.c: clear the 'finished' member once we are done with it
2023-03-22 18:00:34 +01:00
0a1dc55c40 Merge branch 'backport/js/ci-gcc-12-fixes'
Fixes real problems noticed by gcc 12 and works around false
positives.

* js/ci-gcc-12-fixes:
  nedmalloc: avoid new compile error
  compat/win32/syslog: fix use-after-realloc
2023-03-22 18:00:34 +01:00
5843080c85 http.c: clear the 'finished' member once we are done with it
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.

Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request).  The loop terminating condition mistakenly
thought that the original request has yet to be completed.

Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed.  It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.

One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns.  Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 17:58:29 +01:00
321854ac46 clone.c: avoid "exceeds maximum object size" error with GCC v12.x
Technically, the pointer difference `end - start` _could_ be negative,
and when cast to an (unsigned) `size_t` that would cause problems. In
this instance, the symptom is:

dir.c: In function 'git_url_basename':
dir.c:3087:13: error: 'memchr' specified bound [9223372036854775808, 0]
       exceeds maximum object size 9223372036854775807
       [-Werror=stringop-overread]
    CC ewah/bitmap.o
 3087 |         if (memchr(start, '/', end - start) == NULL
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While it is a bit far-fetched to think that `end` (which is defined as
`repo + strlen(repo)`) and `start` (which starts at `repo` and never
steps beyond the NUL terminator) could result in such a negative
difference, GCC has no way of knowing that.

See also https://gcc.gnu.org/bugzilla//show_bug.cgi?id=85783.

Let's just add a safety check, primarily for GCC's benefit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 17:53:32 +01:00
27d43aaaf5 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 14:19:03 -07:00
ba235249c0 Merge branch 'fc/test-aggregation-clean-up'
Code clean-up for test framework.

* fc/test-aggregation-clean-up:
  test: don't print aggregate-results command
  test: simplify counts aggregation
2023-03-21 14:18:56 -07:00
ea09dff59a Merge branch 'ps/receive-pack-unlock-before-die'
"git receive-pack" that responds to "git push" requests failed to
clean a stale lockfile when killed in the middle, which has been
corrected.

* ps/receive-pack-unlock-before-die:
  receive-pack: fix stale packfile locks when dying
2023-03-21 14:18:55 -07:00
1071deae00 Merge branch 'aj/ls-files-format-fix'
Fix for a "ls-files --format="%(path)" that produced nonsense
output, which was a bug in 2.38.

* aj/ls-files-format-fix:
  ls-files: fix "--format" output of relative paths
2023-03-21 14:18:55 -07:00
15108de2fa Merge branch 'jk/format-patch-ignore-noprefix'
"git format-patch" honors the src/dst prefixes set to nonstandard
values with configuration variables like "diff.noprefix", causing
receiving end of the patch that expects the standard -p1 format to
break.  Teach "format-patch" to ignore end-user configuration and
always use the standard prefixes.

This is a backward compatibility breaking change.

* jk/format-patch-ignore-noprefix:
  rebase: prefer --default-prefix to --{src,dst}-prefix for format-patch
  format-patch: add format.noprefix option
  format-patch: do not respect diff.noprefix
  diff: add --default-prefix option
  t4013: add tests for diff prefix options
  diff: factor out src/dst prefix setup
2023-03-21 14:18:55 -07:00
9b0c7f308a am: refer to format-patch in the documentation
There were two reasons we didn't do this.  As "git am" is designed
to grok e-mailed patches, not necessarily taken out of a Git
repostiory or even if it came from a Git repository not necessarily
produced with format-patch, we didn't want to single it out as the
"blessed" input producer to the command.  Also, in the original
workflow that "git am" was invented for, the user of "am" was
expected to be a different person than the users of "format-patch".

But this is a very safe change to make in 2023.  Thanks to the
effort by many contributors, Git ended up becoming a bit more
popular than we initially thought it would be, and "format-patch",
which took me a few weeks to pursuade Linus to take in 2005, seems
to have become the de-facto standard tool to produce patch e-mails.

Interestingly, the documentation for "git apply", which is listed in
SEE ALSO section of "git am" documentation, does mention "am" and
"format-patch" as two things that are related but different from
"apply" in an early part.

Suggested-by: Kai Grossjohann <kai.grossjohann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 13:18:45 -07:00
ee6ad78260 doc: remove GNU troff workaround
In 2007 the docbook project made the mistake of converting ' to \' for
man pages [1]. It's a problem because groff interprets \' as acute
accent which is rendered as ' in ASCII, but as ´ in utf-8.

This started a cascade of bug reports in git [2], debian [3], Arch Linux
[4], docbook itself [5], and probably many others.

A solution was to use the correct groff character: \(aq, which is always
rendered as ', but the problem is that such character doesn't work in
other troff programs.

A portable solution required the use of a conditional character that is
\(aq in groff, but ' in all others:

  .ie \n(.g .ds Aq \(aq
  .el .ds Aq '

The proper solution took time to be implemented in docbook, but in 2010
they did it [6]. So the docbook man page stylesheets were broken from
1.73 to 1.76.

Unfortunately by that point many workarounds already existed. In the
case of git, GNU_ROFF was introduced, and in the case of Arch Linux
a mapping from \' to ' was added to groff's man.local. Other
distributions might have done the same, or similar workarounds.

Since 2010 there is no need for this workaround, which is fixed
elsewhere, not just in docbook, but other layers as well.

Let's remove it.

[1] ea2a0bac56
[2] https://lore.kernel.org/git/20091012102926.GA3937@debian.b2j/
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=507673#65
[4] https://bugs.archlinux.org/task/9643
[5] https://sourceforge.net/p/docbook/bugs/1022/
[6] fb55343426

Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 13:16:46 -07:00
370ddcbc89 git-compat-util: use gettimeofday(2) for time(2)
Use gettimeofday instead of time(NULL) to get current time.
This avoids clock skew on glibc 2.31+ on Linux, where in the
first 1 to 2.5 ms of every second, time(NULL) returns a
value that is one less than the tv_sec part of
higher-resolution timestamps such as those returned by
gettimeofday or timespec_get, or those in the file system.
There are similar clock skew problems on AIX and MS-Windows,
which have problems in the first 5 ms of every second.

Without this patch, users can observe Git issuing a
timestamp T+1 before it issues timestamp T, because Git
sometimes uses time(NULL) or time(&t) and sometimes uses
higher-res methods like gettimeofday.  Although strictly
speaking users should tolerate this behavior because a
superuser can always change the clock back, this is a
quality of implementation issue and users naturally expect
Git to issue timestamps in increasing order unless the
superuser has fiddled with the system clock.

This patch always uses gettimeofday(...) instead of time(...),
and I have verified that the resulting .o files never refer
to the name 'time'.  A trickier patch would change only
those calls for which timestamp monotonicity is user-visible.
Such a patch would require more expertise about Git internals,
though, and would be harder to maintain later.

Another possibility would be to change Git's documentation
to warn users that Git does not always issue timestamps in
increasing order.  However, Git users would likely be either
dismayed by this possibility, or confused by the level of
detail that any such documentation would require.

Yet another possibility would be to fix the Linux kernel so
that the time syscall is consistent with the other timestamp
syscalls.  I suppose this has not been done due to
performance implications.  (Git's use of timestamps is rare
enough that performance is not a significant consideration
for git.)  However, this wouldn't fix Git's problem on older
Linux kernels, or on AIX or MS-Windows.

Signed-off-by: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 13:11:42 -07:00
ec2f026961 csum-file.h: remove unnecessary inclusion of cache.h
With the change in the last commit to move several functions to
write-or-die.h, csum-file.h no longer needs to include cache.h.
However, removing that include forces several other C files, which
directly or indirectly dependend upon csum-file.h's inclusion of
cache.h, to now be more explicit about their dependencies.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:55 -07:00
d48be35ca6 write-or-die.h: move declarations for write-or-die.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
61a7b98264 treewide: remove cache.h inclusion due to setup.h changes
By moving several declarations to setup.h, the previous patch made it
possible to remove the include of cache.h in several source files.  Do
so.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
e38da487cc setup.h: move declarations for setup.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
9875058870 treewide: remove cache.h inclusion due to environment.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
a64acf7298 treewide: remove unnecessary includes of cache.h
The last several commits were geared at replacing the include of cache.h
in strbuf.c with an include of git-compat-util.h.  Unfortunately, I had
to drop a patch moving some functions from cache.h to object-name.h, due
to excessive conflicts with other in-flight topics.

However, even without that patch, the series of patches so far allows us
to modify a number of C files to replace an include of cache.h with
git-compat-util.h.  Do that to reduce our dependencies.

(If we could have kept our object-name.h patch in this series, it would
have also let us reduce the includes in checkout.c and fmt-merge-msg.c
in addition to strbuf.c).

Just to ensure that nothing else was bringing in cache.h, all of the
affected files have been checked to ensure that
    gcc -E -I. $SOURCE_FILE | grep '"cache.h"'
found no hits and that
    make DEVELOPER=1 ${OBJECT_FILE_FOR_SOURCE_FILE}
successfully compiles without warnings.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
d5ebb50dcb wrapper.h: move declarations for wrapper.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
905f96939b path.h: move function declarations for path.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:52 -07:00
f7e552d7ca cache.h: remove expand_user_path()
expand_user_path() was renamed to interpolate_path() back in mid-2021,
but reinstated with a #define and a NEEDSWORK comment that we would
eventually want to get rid of it.  Do so now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:52 -07:00
0b027f6ca7 abspath.h: move absolute path functions from cache.h
This is another step towards letting us remove the include of cache.h in
strbuf.c.  It does mean that we also need to add includes of abspath.h
in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:52 -07:00
7ee24e18e5 environment: move comment_line_char from cache.h
This is one step towards making strbuf.c not depend upon cache.h.
Additional steps will follow in subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:52 -07:00
4f6728d52d treewide: remove unnecessary cache.h inclusion from several sources
A number of files were apparently including cache.h solely to get
gettext.h.  By making those files explicitly include gettext.h, we can
already drop the include of cache.h in these files.  On top of that,
there were some files using cache.h that didn't need to for any reason.
Remove these unnecessary includes.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
553d4d70d1 treewide: remove unnecessary inclusion of gettext.h
Looking at things from the opposite angle of the last patch, we had a
few files that were including gettext.h and perhaps needed it at some
point in history, but no longer require it.  Remove the include.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
a6dc3d364c treewide: remove unnecessary cache.h inclusion from a few headers
Ever since a64215b6cd ("object.h: stop depending on cache.h; make
cache.h depend on object.h", 2023-02-24), we have a few headers that
could have replaced their include of cache.h with an include of
object.h.  Make that change now.

Some C files had to start including cache.h after this change (or some
smaller header it had brought in), because the C files were depending
on things from cache.h but were only formerly implicitly getting
cache.h through one of these headers being modified in this patch.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:50 -07:00
cbfe360b14 commit-reach: add tips_reachable_from_bases()
Both 'git for-each-ref --merged=<X>' and 'git branch --merged=<X>' use
the ref-filter machinery to select references or branches (respectively)
that are reachable from a set of commits presented by one or more
--merged arguments. This happens within reach_filter(), which uses the
revision-walk machinery to walk history in a standard way.

However, the commit-reach.c file is full of custom searches that are
more efficient, especially for reachability queries that can terminate
early when reachability is discovered. Add a new
tips_reachable_from_bases() method to commit-reach.c and call it from
within reach_filter() in ref-filter.c. This affects both 'git branch'
and 'git for-each-ref' as tested in p1500-graph-walks.sh.

For the Linux kernel repository, we take an already-fast algorithm and
make it even faster:

Test                                            HEAD~1  HEAD
-------------------------------------------------------------------
1500.5: contains: git for-each-ref --merged     0.13    0.02 -84.6%
1500.6: contains: git branch --merged           0.14    0.02 -85.7%
1500.7: contains: git tag --merged              0.15    0.03 -80.0%

(Note that we remove the iterative 'git rev-list' test from p1500
because it no longer makes sense as a comparison to 'git for-each-ref'
and would just waste time running it for these comparisons.)

The algorithm is implemented in commit-reach.c in the method
tips_reachable_from_base(). This method takes a string_list of tips and
assigns the 'util' for each item with the value 1 if the base commit can
reach those tips.

Like other reachability queries in commit-reach.c, the fastest way to
search for "can A reach B?" is to do a depth-first search up to the
generation number of B, preferring to explore first parents before later
parents. While we must walk all reachable commits up to that generation
number when the answer is "no", the depth-first search can answer "yes"
much faster than other approaches in most cases.

This search becomes trickier when there are multiple targets for the
depth-first search. The commits with lower generation number are more
likely to be within the history of the start commit, but we don't want
to waste time searching commits of low generation number if the commit
target with lowest generation number has already been found.

The trick here is to take the input commits and sort them by generation
number in ascending order. Track the index within this order as
min_generation_index. When we find a commit, if its index in the list is
equal to min_generation_index, then we can increase the generation
number boundary of our search to the next-lowest value in the list.

With this mechanism, the number of commits to search is minimized with
respect to the depth-first search heuristic. We will walk all commits up
to the minimum generation number of a commit that is _not_ reachable
from the start, but we will walk only the necessary portion of the
depth-first search for the reachable commits of lower generation.

Add extra tests for this behavior in t6600-test-reach.sh as the
interesting data shape of that repository can sometimes demonstrate
corner case bugs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
49abcd21da for-each-ref: add ahead-behind format atom
The previous change implemented the ahead_behind() method, including an
algorithm to compute the ahead/behind values for a number of commit tips
relative to a number of commit bases. Now, integrate that algorithm as
part of 'git for-each-ref' hidden behind a new format atom,
ahead-behind. This naturally extends to 'git branch' and 'git tag'
builtins, as well.

This format allows specifying multiple bases, if so desired, and all
matching references are compared against all of those bases. For this
reason, failing to read a reference provided from these atoms results in
an error.

In order to translate the ahead_behind() method information to the
format output code in ref-filter.c, we must populate arrays of
ahead_behind_count structs. In struct ref_array, we store the full array
that will be passed to ahead_behind(). In struct ref_array_item, we
store an array of pointers that point to the relvant items within the
full array. In this way, we can pull all relevant ahead/behind values
directly when formatting output for a specific item. It also ensures the
lifetime of the ahead_behind_count structs matches the time that the
array is being used.

Add specific tests of the ahead/behind counts in t6600-test-reach.sh, as
it has an interesting repository shape. In particular, its merging
strategy and its use of different commit-graphs would demonstrate over-
counting if the ahead_behind() method did not already account for that
possibility.

Also add tests for the specific for-each-ref, branch, and tag builtins.
In the case of 'git tag', there are intersting cases that happen when
some of the selected tips are not commits. This requires careful logic
around commits_nr in the second loop of filter_ahead_behind(). Also, the
test in t7004 is carefully located to avoid being dependent on the GPG
prereq. It also avoids using the test_commit helper, as that will add
ticks to the time and disrupt the expected timestamps in later tag
tests.

Also add performance tests in a new p1300-graph-walks.sh script. This
will be useful for more uses in the future, but for now compare the
ahead-behind counting algorithm in 'git for-each-ref' to the naive
implementation by running 'git rev-list --count' processes for each
input.

For the Git source code repository, the improvement is already obvious:

Test                                            this tree
---------------------------------------------------------------
1500.2: ahead-behind counts: git for-each-ref   0.07(0.07+0.00)
1500.3: ahead-behind counts: git branch         0.07(0.06+0.00)
1500.4: ahead-behind counts: git tag            0.07(0.06+0.00)
1500.5: ahead-behind counts: git rev-list       1.32(1.04+0.27)

But the standard performance benchmark is the Linux kernel repository,
which demosntrates a significant improvement:

Test                                            this tree
---------------------------------------------------------------
1500.2: ahead-behind counts: git for-each-ref   0.27(0.24+0.02)
1500.3: ahead-behind counts: git branch         0.27(0.24+0.03)
1500.4: ahead-behind counts: git tag            0.28(0.27+0.01)
1500.5: ahead-behind counts: git rev-list       4.57(4.03+0.54)

The 'git rev-list' test exists in this change as a demonstration, but it
will be removed in the next change to avoid wasting time on this
comparison.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
fd67d149bd commit-reach: implement ahead_behind() logic
Fully implement the commit-counting logic required to determine
ahead/behind counts for a batch of commit pairs. This is a new library
method within commit-reach.h. This method will be linked to the
for-each-ref builtin in the next change.

The interface for ahead_behind() uses two arrays. The first array of
commits contains the list of all starting points for the walk. This
includes all tip commits _and_ base commits. The second array specifies
base/tip pairs by pointing to commits within the first array, by index.
The second array also stores the resulting ahead/behind counts for each
of these pairs.

This implementation of ahead_behind() allows multiple bases, if desired.
Even with multiple bases, there is only one commit walk used for
counting the ahead/behind values, saving time when the base/tip ranges
overlap significantly.

This interface for ahead_behind() also makes it very easy to call
ensure_generations_valid() on the entire array of bases and tips. This
call is necessary because it is critical that the walk that counts
ahead/behind values never walks a commit more than once. Without
generation numbers on every commit, there is a possibility that a
commit date skew could cause the walk to revisit a commit and then
double-count it. For this reason, it is strongly recommended that 'git
ahead-behind' is only run in a repository with a commit-graph file that
covers most of the reachable commits, storing precomputed generation
numbers. If no commit-graph exists, this walk will be much slower as it
must walk all reachable commits in ensure_generations_valid() before
performing the counting logic.

It is possible to detect if generation numbers are available at run time
and redirect the implementation to another algorithm that does not
require this property. However, that implementation requires a commit
walk per base/tip pair _and_ can be slower due to the commit date
heuristics required. Such an implementation could be considered in the
future if there is a reason to include it, but most Git hosts should
already be generating a commit-graph file as part of repository
maintenance. Most Git clients should also be generating commit-graph
files as part of background maintenance or automatic GCs.

Now, let's discuss the ahead/behind counting algorithm.

The first array of commits are considered the starting commits. The
index within that array will play a critical role.

We create a new commit slab that maps commits to a bitmap. For a given
commit (anywhere in the history), its bitmap stores information relative
to which of the input commits can reach that commit. The ith bit will be
on if the ith commit from the starting list can reach that commit. It is
important to notice that these bitmaps are not the typical "reachability
bitmaps" that are stored in .bitmap files. Instead of signalling which
objects are reachable from the current commit, they instead signal
"which starting commits can reach me?" It is also important to know that
the bitmap is not necessarily "complete" until we walk that commit. We
will perform a commit walk by generation number in such a way that we
can guarantee the bitmap is correct when we visit that commit.

At the beginning of the ahead_behind() method, we initialize the bitmaps
for each of the starting commits. By enabling the ith bit for the ith
starting commit, we signal "the ith commit can reach itself."

We walk commits by popping the commit with maximum generation number out
of the queue, guaranteeing that we will never walk a child of that
commit in any future steps.

As we walk, we load the bitmap for the current commit and perform two
main steps. The _second_ step examines each parent of the current commit
and adds the current commit's bitmap bits to each parent's bitmap. (We
create a new bitmap for the parent if this is our first time seeing that
parent.) After adding the bits to the parent's bitmap, the parent is
added to the walk queue. Due to this passing of bits to parents, the
current commit has a guarantee that the ith bit is enabled on its bitmap
if and only if the ith commit can reach the current commit.

The first step of the walk is to examine the bitmask on the current
commit and decide which ranges the commit is in or not. Due to the "bit
pushing" in the second step, we have a guarantee that the ith bit of the
current commit's bitmap is on if and only if the ith starting commit can
reach it. For each ahead_behind_count struct, check the base_index and
tip_index to see if those bits are enabled on the current bitmap. If
exactly one bit is enabled, then increment the corresponding 'ahead' or
'behind' count.  This increment is the reason we _absolutely need_ to
walk commits at most once.

The only subtle thing to do with this walk is to check to see if a
parent has all bits on in its bitmap, in which case it becomes "stale"
and is marked with the STALE bit. This allows queue_has_nonstale() to be
the terminating condition of the walk, which greatly reduces the number
of commits walked if all of the commits are nearby in history. It avoids
walking a large number of common commits when there is a deep history.
We also use the helper method insert_no_dup() to add commits to the
priority queue without adding them multiple times. This uses the PARENT2
flag. Thus, we must clear both the STALE and PARENT2 bits of all
commits, in case ahead_behind() is called multiple times in the same
process.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
c08645b353 commit-graph: introduce ensure_generations_valid()
Use the just-introduced compute_reachable_generation_numbers_1() to
implement a function which dynamically computes topological levels (or
corrected commit dates) for out-of-graph commits.

This will be useful for the ahead-behind algorithm we are about to
introduce, which needs accurate topological levels on _all_ commits
reachable from the tips in order to avoid over-counting.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
2ee11f7261 commit-graph: return generation from memory
The commit_graph_generation() method used to report a value of
GENERATION_NUMBER_INFINITY if the commit_graph_data_slab had an instance
for the given commit but the graph_pos indicated the commit was not in
the commit-graph file.

However, an upcoming change will introduce the ability to set generation
values in-memory without writing the commit-graph file. Thus, we can no
longer trust 'graph_pos' to indicate whether or not the generation
member can be trusted.

Instead, trust the 'generation' member if the commit has a value in the
slab _and_ the 'generation' member is non-zero. Otherwise, treat it as
GENERATION_NUMBER_INFINITY.

This only makes a difference for a very old case for the commit-graph:
the very first Git release to write commit-graph files wrote zeroes in
the topological level positions. If we are parsing a commit-graph with
all zeroes, those commits will now appear to have
GENERATION_NUMBER_INFINITY (as if they were not parsed from the
commit-graph).

I attempted several variations to work around the need for providing an
uninitialized 'generation' member, but this was the best one I found. It
does require a change to a verification test in t5318 because it reports
a different error than the one about non-zero generation numbers.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
80c928d947 commit-graph: simplify compute_generation_numbers()
The previous change introduced the generic algorithm
compute_reachable_generation_numbers() and used it as the core
functionality of compute_topological_levels(). Now, use it as the core
functionality of compute_generation_numbers().

The main difference here is that we use generation version 2, which is
used in to toggle the logic in compute_generation_from_max() for
computing the corrected commit date based on the corrected commit dates
of the parent commits (and the commit date of the current commit). It
also uses different methods for (get|set)_generation in the vtable in
order to store and access the value in the correct places.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
368d19b0b7 commit-graph: refactor compute_topological_levels()
This patch extracts the common code used to compute topological levels
and corrected committer dates into a common routine,
compute_reachable_generation_numbers(). For ease of reading, it only
modifies compute_topological_levels() to use this new routine, leaving
compute_generation_numbers() to be modified in the next change.

This new routine dispatches to call the necessary functions to get and
set the generation number for a given commit through a vtable (the
compute_generation_info struct).

Computing the generation number itself is done in
compute_generation_from_max(), which dispatches its implementation based
on the generation version requested, or issuing a BUG() for unrecognized
generation versions. This does not use a vtable because the logic
depends only on the generation number version, not where the data is
being loaded from or being stored to. This is a subtle point that will
make more sense in a future change that modifies the in-memory
generation values instead of just preparing values for writing to a
commit-graph file.

This change looks like it adds a lot of new code. However, two upcoming
changes will be quite small due to the work being done in this change.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
b2c51b7590 for-each-ref: explicitly test no matches
The for-each-ref builtin can take a list of ref patterns, but if none
match, it still succeeds (but with no output). Add an explicit test that
demonstrates that behavior.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:32 -07:00
b73dec5530 for-each-ref: add --stdin option
When a user wishes to input a large list of patterns to 'git
for-each-ref' (likely a long list of exact refs) there are frequently
system limits on the number of command-line arguments.

Add a new --stdin option to instead read the patterns from standard
input. Add tests that check that any unrecognized arguments are
considered an error when --stdin is provided. Also, an empty pattern
list is interpreted as the complete ref set.

When reading from stdin, we populate the filter.name_patterns array
dynamically as opposed to pointing to the 'argv' array directly. This is
simple when using a strvec, as it is NULL-terminated in the same way. We
then free the memory directly from the strvec.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:32 -07:00
353e6d4554 parse-options.h: use designated initializers in OPT_* macros
Use designated initializers in the expansions of the OPT_* macros to
make it more readable which one-letter macro parameter initializes
which field in the resulting 'struct option'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:04:07 -07:00
aa0275a2c0 parse-options.h: rename _OPT_CONTAINS_OR_WITH()'s parameters
Rename the 'help' parameter as it matches one of the fields in 'struct
option', and, while at it, rename all other parameters to the usual
one-letter name used in similar macro definitions.

Furthermore, put all parameters in the replacement list between
parentheses, like all other OPT_* macros do.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:04:06 -07:00
ab0845b382 parse-options.h: use consistent name for the callback parameters
In the various OPT_* macros the 'f' parameter is usually used to
specify flags, while the 'cb' parameter is used to specify a callback
function.  OPT_CALLBACK and OPT_NUMBER_CALLBACKS, however, are
inconsistent with the rest, as they use 'f' to specify their callback
function.

Rename their callback macro parameters to 'cb' to avoid the
inconsistency.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:04:06 -07:00
c4d9c79378 treewide: remove unnecessary inclusions of parse-options.h from headers
The headers 'diagnose.h', 'list-objects-filter-options.h',
'ref-filter.h' and 'remote.h' declare option parsing callback
functions with a 'struct option*' parameter, and 'revision.h' declares
an option parsing helper function taking 'struct parse_opt_ctx_t*' and
'struct option*' parameters.  These headers all include
'parse-options.h', although they don't need any of the type
definitions from that header file.  Furthermore,
'list-objects-filter-options.h' and 'ref-filter.h' also define some
OPT_* macros to initialize a 'struct option', but these don't
necessitate the inclusion of parse-options.h in these headers either,
because these macros are only expanded in source files.

Remove these unnecessary inclusions of parse-options.h and use forward
declarations to declare the necessary types.

After this patch none of the header files include parse-options.h
anymore.

With these changes, the build time after modifying only
parse-options.h is reduced by about 30%, and the number of targets
built is almost 20% less:

  Before:

    $ touch parse-options.h && time make -j4 |wc -l
    353

    real    1m1.527s
    user    3m32.205s
    sys	    0m15.903s

  After:

    289

    real    0m39.285s
    user    2m12.540s
    sys     0m11.164s

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:55:18 -07:00
49fd551194 treewide: include parse-options.h in source files
The builtins 'ls-remote', 'pack-objects', 'receive-pack', 'reflog' and
'send-pack' use parse_options(), but their source files don't directly
include 'parse-options.h'.  Furthermore, the source files
'diagnose.c', 'list-objects-filter-options.c', 'remote.c' and
'send-pack.c' define option parsing callback functions, while
'revision.c' defines an option parsing helper function, and thus need
access to various fields in 'struct option' and 'struct
parse_opt_ctx_t', but they don't directly include 'parse-options.h'
either.  They all can still be built, of course, because they include
one of the header files that does include 'parse-options.h' (though
unnecessarily, see the next commit).

Add those missing includes to these files, as our general rule is that
"a C file must directly include the header files that declare the
functions and the types it uses".

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:26:47 -07:00
d6606e02aa fetch: centralize printing of reference updates
In order to print updated references during a fetch, the two different
call sites that do this will first call `format_display()` followed by a
call to `fputs()`. This is needlessly roundabout now that we have the
`display_state` structure that encapsulates all of the printing logic
for references.

Move displaying the reference updates into `format_display()` and rename
it to `display_ref_update()` to better match its new purpose, which
finalizes the conversion to make both the formatting and printing logic
of reference updates self-contained. This will make it easier to add new
output formats and printing to a different file descriptor than stderr.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:02:43 -07:00
c4ef5edbc9 fetch: centralize logic to print remote URL
When fetching from a remote, we not only print the actual references
that have changed, but will also print the URL from which we have
fetched them to standard output. The logic to handle this is duplicated
across two different callsites with some non-trivial logic to compute
the anonymized URL. Furthermore, we're using global state to track
whether we have already shown the URL to the user or not.

Refactor the code by moving it into `format_display()`. Like this, we
can convert the global variable into a member of `display_state`. And
second, we can deduplicate the logic to compute the anonymized URL.

This also works as expected when fetching from multiple remotes, for
example via a group of remotes, as we do this by forking a standalone
git-fetch(1) process per remote that is to be fetched.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:02:43 -07:00
331b7d29f0 fetch: centralize handling of per-reference format
The function `format_display()` is used to print a single reference
update to a buffer which will then ultimately be printed by the caller.
This architecture causes us to duplicate some logic across the different
callsites of this function. This makes it hard to follow the code as
some parts of the logic are located in one place, while other parts of
the logic are located in a different place. Furthermore, by having the
logic scattered around it becomes quite hard to implement a new output
format for the reference updates.

We can make the logic a whole lot easier to understand by making the
`format_display()` function self-contained so that it handles formatting
and printing of the references. This will eventually allow us to easily
implement a completely different output format, but also opens the door
to conditionally print to either stdout or stderr depending on the
output format.

As a first step towards that goal we move the formatting directive used
by both callers to print a single reference update into this function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:02:43 -07:00
7c978db889 fetch: pass the full local reference name to format_display
Before printing the name of the local references that would be updated
by a fetch we first prettify the reference name. This is done at the
calling side so that `format_display()` never sees the full name of the
local reference. This restricts our ability to introduce new output
formats that might want to print the full reference name.

Right now, all callsites except one are prettifying the reference name
anyway. And the only callsite that doesn't passes `FETCH_HEAD` as the
hardcoded reference name to `format_display()`, which would never be
changed by a call to `prettify_refname()` anyway. So let's refactor the
code to pass in the full local reference name and then prettify it in
the formatting code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:02:43 -07:00
5cab51ff71 fetch: move output format into display_state
The git-fetch(1) command supports printing references either in "full"
or "compact" format depending on the `fetch.ouput` config key. The
format that is to be used is tracked in a global variable.

De-globalize the variable by moving it into the `display_state`
structure.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:02:43 -07:00
ce9636d645 fetch: move reference width calculation into display_state
In order to print references in proper columns we need to calculate the
width of the reference column before starting to print the references.
This is done with the help of a global variable `refcol_width`.

Refactor the code to instead use a new structure `display_state` that
contains the computed width and plumb it through the stack as required.
This is only the first step towards de-globalizing the state required to
print references.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 11:02:43 -07:00
91b81b64e3 wildmatch: hide internal return values
WM_ABORT_ALL and WM_ABORT_TO_STARSTAR are used internally to limit
backtracking when a match fails, they are not of interest to the caller
and so should not be public.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 10:58:53 -07:00
81b26f8f28 wildmatch: avoid undefined behavior
The code changed in this commit is designed to check if the pattern
starts with "**/" or contains "/**/" (see 3a078dec33 (wildmatch: fix
"**" special case, 2013-01-01)). Unfortunately when the pattern begins
with "**/" `prev_p = p - 2` is evaluated when `p` points to the second
"*" and so the subtraction is undefined according to section 6.5.6 of
the C standard because the result does not point within the same object
as `p`. Fix this by avoiding the subtraction unless it is well defined.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 10:58:53 -07:00
1f2e05f0b7 wildmatch: fix exponential behavior
When dowild() cannot match a '*' or '/**/' wildcard then it must return
WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe
this results in unnecessary backtracking and the time taken for a failed
match increases exponentially with the number of wildcards in the
pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH
for a failed match resulting in long match times for patterns containing
multiple wildcards as can be seen in the following benchmark.
(Note that the timings in the Benchmark 1 are really measuring the time
to execute test-tool rather than the time to match the pattern)

Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"
  Time (mean ± σ):      22.8 ms ±   1.7 ms    [User: 12.1 ms, System: 10.6 ms]
  Range (min … max):    19.4 ms …  26.9 ms    113 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"
  Time (mean ± σ):      5.244 s ±  0.228 s    [User: 5.229 s, System: 0.010 s]
  Range (min … max):    4.969 s …  5.707 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"' ran
  230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"'

The security implications are limited as it only affects operations that
are potentially DoS vectors. For example by creating a blob containing
such a pattern a malicious user can exploit this behavior to use large
amounts of CPU time on a remote server by pushing the blob and then
creating a new clone with --filter=sparse:oid. However this filter type
is usually disabled as it is known to consume large amounts of CPU time
even without this bug.

The WM_MATCH changed in the first hunk of this patch comes from the
original implementation imported from rsync in 5230f605e1 (Import
wildmatch from rsync, 2012-10-15). Compared to the others converted here
it is fairly harmless as it only triggers at the end of the pattern and
so will only cause a single unnecessary backtrack. The others introduced
by 6f1a31f0aa (wildmatch: advance faster in <asterisk> + <literal>
patterns, 2013-01-01) and 46983441ae (wildmatch: make a special case for
"*/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause
exponential behavior.

A new test is added to protect against future regressions.

[1] https://research.swtch.com/glob

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 10:58:53 -07:00
a93cbe8d78 t1507: assert output of rev-parse
Tests in t1507-rev-parse-upstream.sh compare files "expect" and "actual"
to assert the output of "git rev-parse", "git show", and "git log".
However, two of the tests '@{reflog}-parsing does not look beyond colon'
and '@{upstream}-parsing does not look beyond colon' don't inspect the
contents of the created files.

Assert output of "git rev-parse" in tests in t1507-rev-parse-upstream.sh
to improve test coverage.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:42 -07:00
7deec9442f t1404: don't create unused file
Some tests in file t1404-update-ref-errors.sh create file "unchanged" as
the expected side for a test_cmp assertion at the end of the test for
output of "git for-each-ref".  Test 'no bogus intermediate values during
delete' also creates a file named "unchanged" using "git for-each-ref".
However, the file isn't used for any assertions in the test.  Instead,
"git rev-parse" is used to compare the reference with variable $D.

Don't create unused file "unchanged" in test 'no bogus intermediate
values during delete' of t1404-update-ref-errors.sh.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:42 -07:00
94f07b5544 t1400: assert output of update-ref
In t1400-update-ref.sh test 'transaction can create and delete' creates
files "expect" and "actual", but doesn't compare them.  Similarly, test
'transaction cannot restart ongoing transaction' redirects output of
"git update-ref" to file "actual", but doesn't check its contents with
any assertions.

Assert output of "git update-ref" in tests to improve test coverage in
t1400-update-ref.sh.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:42 -07:00
17ae7f758e t1302: don't create unused file
Test 'gitdir selection on unsupported repo' in t1302-repo-version.sh
writes output of a "git config" invocation to file "actual".  However,
the test doesn't have any assertions for the file.  The file was used by
this test until commit b9605bc4f2 (config: only read .git/config from
configured repos, 2016-09-12), before which "git config" was expected to
print the bogus value of "core.repositoryformatversion" to standard
output.

Don't redirect output of "git config" to file "actual" in test 'gitdir
selection on unsupported repo'.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
f4b98e17cf t1010: don't create unused files
Builtin "git mktree" writes the the object name of the tree object built
to the standard output.  Tests 'mktree refuses to read ls-tree -r output
(1)' and 'mktree refuses to read ls-tree -r output (2)' in
"t1010-mktree.sh" redirect output of "git mktree" to a file, but don't
use its contents in assertions.

Don't redirect output of "git mktree" to file "actual" in tests that
assert that an invocation of "git mktree" must fail.

Output of "git mktree" is empty when it refuses to build a tree object.
So, alternatively, the test could assert that the output is empty.
However, there isn't a good reason for the user to expect the command to
be silent in such cases, so we shouldn't enforce it.  The user shouldn't
use the output of a failing command anyway.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
4e273368ce t1006: assert error output of cat-file
Test "cat-file $arg1 $arg2 error on missing full OID" in
t1006-cat-file.sh compares files "expect.err" and "err.actual" to assert
the expected error output of "git cat-file".  A similar test in the same
file named "cat-file $arg1 $arg2 error on missing short OID" also
creates these two files, but doesn't use them in assertions.

Assert error output of "git cat-file" in test "cat-file $arg1 $arg2
error on missing short OID" of t1006-cat-file.sh to improve test
coverage.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
8fc184c0eb t1005: assert output of ls-files
Test 'reset should work' in t1005-read-tree-reset.sh compares two files
"expect" and "actual" to assert the expected output of "git ls-files".
Several other tests in the same file also create files "expect" and
"actual", but don't use them in assertions.

Assert output of "git ls-files" in t1005-read-tree-reset.sh to improve
test coverage.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
e25cabbf6b The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-19 15:03:22 -07:00
a9f4a01760 Merge branch 'jk/add-p-unmerged-fix'
"git add -p" while the index is unmerged sometimes failed to parse
the diff output it internally produces and died, which has been
corrected.

* jk/add-p-unmerged-fix:
  add-patch: handle "* Unmerged path" lines
2023-03-19 15:03:13 -07:00
947604ddb7 Merge branch 'ew/fetch-no-write-fetch-head-fix'
* ew/fetch-no-write-fetch-head-fix:
  fetch: pass --no-write-fetch-head to subprocesses
2023-03-19 15:03:13 -07:00
9de14c71f7 Merge branch 'fc/advice-diverged-history'
After "git pull" that is configured with pull.rebase=false
merge.ff=only fails due to our end having our own development, give
advice messages to get out of the "Not possible to fast-forward"
state.

* fc/advice-diverged-history:
  advice: add diverging advice for novices
2023-03-19 15:03:13 -07:00
fc1a4ce043 Merge branch 'ab/fix-strategy-opts-parsing'
The code to parse "git rebase -X<opt>" was not prepared to see an
unparsable option string, which has been corrected.

* ab/fix-strategy-opts-parsing:
  sequencer.c: fix overflow & segfault in parse_strategy_opts()
2023-03-19 15:03:12 -07:00
0717a424a7 Merge branch 'ds/reprepare-alternates-when-repreparing-packfiles'
Once we start running, we assumed that the list of alternate object
databases would never change.  Hook into the machinery used to
update the list of packfiles during runtime to update this list as
well.

* ds/reprepare-alternates-when-repreparing-packfiles:
  object-file: reprepare alternates when necessary
2023-03-19 15:03:12 -07:00
5c92a451be Merge branch 'jk/format-patch-change-format-for-empty-commits'
"git format-patch" learned to write a log-message only output file
for empty commits.

* jk/format-patch-change-format-for-empty-commits:
  format-patch: output header for empty commits
2023-03-19 15:03:12 -07:00
95de376349 Merge branch 'jk/bundle-use-dash-for-stdfiles'
"git bundle" learned that "-" is a common way to say that the input
comes from the standard input and/or the output goes to the
standard output.  It used to work only for output and only from the
root level of the working tree.

* jk/bundle-use-dash-for-stdfiles:
  parse-options: use prefix_filename_except_for_dash() helper
  parse-options: consistently allocate memory in fix_filename()
  bundle: don't blindly apply prefix_filename() to "-"
  bundle: document handling of "-" as stdin
  bundle: let "-" mean stdin for reading operations
2023-03-19 15:03:12 -07:00
12201fd756 Merge branch 'jk/bundle-progress'
Simplify UI to control progress meter given by "git bundle" command.

* jk/bundle-progress:
  bundle: turn on --all-progress-implied by default
2023-03-19 15:03:11 -07:00
3f3bb90c8f Merge branch 'as/doc-markup-fix'
Fix for a mis-mark-up in doc made in Git 2.39 days.

* as/doc-markup-fix:
  git-merge-tree.txt: replace spurious HTML entity
2023-03-19 15:03:11 -07:00
96a806f87a Merge branch 'rj/avoid-switching-to-already-used-branch'
A few subcommands have been taught to stop users from working on a
branch that is being used in another worktree linked to the same
repository.

* rj/avoid-switching-to-already-used-branch:
  switch: reject if the branch is already checked out elsewhere (test)
  rebase: refuse to switch to a branch already checked out elsewhere (test)
  branch: fix die_if_checked_out() when ignore_current_worktree
  worktree: introduce is_shared_symref()
2023-03-19 15:03:11 -07:00
c79786c486 Merge branch 'rj/bisect-already-used-branch'
Allow "git bisect reset" to check out the original branch when the
branch is already checked out in a different worktree linked to the
same repository.

* rj/bisect-already-used-branch:
  bisect: fix "reset" when branch is checked out elsewhere
2023-03-19 15:03:11 -07:00
4a25b911cd Merge branch 'zh/push-to-delete-onelevel-ref'
"git push" has been taught to allow deletion of refs with one-level
names to help repairing a repository who acquired such a ref by
mistake.  In general, we don't encourage use of such a ref, and
creation or update to such a ref is rejected as before.

* zh/push-to-delete-onelevel-ref:
  push: allow delete single-level ref
  receive-pack: fix funny ref error messsage
2023-03-19 15:03:10 -07:00
67076b85b8 Merge branch 'ak/restore-both-incompatible-with-conflicts'
"git restore" supports options like "--ours" that are only
meaningful during a conflicted merge, but these options are only
meaningful when updating the working tree files.  These options are
marked to be incompatible when both "--staged" and "--worktree" are
in effect.

* ak/restore-both-incompatible-with-conflicts:
  restore: fault --staged --worktree with merge opts
2023-03-19 15:03:10 -07:00
b0d2440442 Merge branch 'ew/commit-reach-clean-up-flags-fix'
Fix a segfaulting loop.  The function and its caller may need
further clean-up.

* ew/commit-reach-clean-up-flags-fix:
  commit-reach: avoid NULL dereference
2023-03-19 15:03:10 -07:00
6f54213718 Merge branch 'ab/avoid-losing-exit-codes-in-tests'
Test clean-up.

* ab/avoid-losing-exit-codes-in-tests:
  tests: don't lose misc "git" exit codes
  tests: don't lose exit status with "test <op> $(git ...)"
  tests: don't lose "git" exit codes in "! ( git ... | grep )"
  tests: don't lose exit status with "(cd ...; test <op> $(git ...))"
  t/lib-patch-mode.sh: fix ignored exit codes
  auto-crlf tests: don't lose exit code in loops and outside tests
2023-03-19 15:03:10 -07:00
eaa0fd6584 git_connect(): fix corner cases in downgrading v2 to v0
There's code in git_connect() that checks whether we are doing a push
with protocol_v2, and if so, drops us to protocol_v0 (since we know
how to do v2 only for fetches). But it misses some corner cases:

  1. it checks the "prog" variable, which is actually the path to
     receive-pack on the remote side. By default this is just
     "git-receive-pack", but it could be an arbitrary string (like
     "/path/to/git receive-pack", etc). We'd accidentally stay in v2
     mode in this case.

  2. besides "receive-pack" and "upload-pack", there's one other value
     we'd expect: "upload-archive" for handling "git archive --remote".
     Like receive-pack, this doesn't understand v2, and should use the
     v0 protocol.

In practice, neither of these causes bugs in the real world so far. We
do send a "we understand v2" probe to the server, but since no server
implements v2 for anything but upload-pack, it's simply ignored. But
this would eventually become a problem if we do implement v2 for those
endpoints, as older clients would falsely claim to understand it,
leading to a server response they can't parse.

We can fix (1) by passing in both the program path and the "name" of the
operation. I treat the name as a string here, because that's the pattern
set in transport_connect(), which is one of our callers (we were simply
throwing away the "name" value there before).

We can fix (2) by allowing only known-v2 protocols ("upload-pack"),
rather than blocking unknown ones ("receive-pack" and "upload-archive").
That will mean whoever eventually implements v2 push will have to adjust
this list, but that's reasonable. We'll do the safe, conservative thing
(sticking to v0) by default, and anybody working on v2 will quickly
realize this spot needs to be updated.

The new tests cover the receive-pack and upload-archive cases above, and
re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that
already worked before this patch, of course, but it would be an easy
thing to break if we flipped the allow/block logic without also handling
"name" separately).

Here are a few miscellaneous implementation notes, since I had to do a
little head-scratching to understand who calls what:

  - transport_connect() is called only for git-upload-archive. For
    non-http git remotes, that resolves to the virtual connect_git()
    function (which then calls git_connect(); confused yet?). So
    plumbing through "name" in connect_git() covers that.

  - for regular fetches and pushes, callers use higher-level functions
    like transport_fetch_refs(). For non-http git remotes, that means
    calling git_connect() under the hood via connect_setup(). And that
    uses the "for_push" flag to decide which name to use.

  - likewise, plumbing like fetch-pack and send-pack may call
    git_connect() directly; they each know which name to use.

  - for remote helpers (including http), we already have separate
    parameters for "name" and "exec" (another name for "prog"). In
    process_connect_service(), we feed the "name" to the helper via
    "connect" or "stateless-connect" directives.

    There's also a "servpath" option, which can be used to tell the
    helper about the "exec" path. But no helpers we implement support
    it! For http it would be useless anyway (no reasonable server
    implementation will allow you to send a shell command to run the
    server). In theory it would be useful for more obscure helpers like
    remote-ext, but even there it is not implemented.

    It's tempting to get rid of it simply to reduce confusion, but we
    have publicly documented it since it was added in fa8c097cc9
    (Support remote helpers implementing smart transports, 2009-12-09),
    so it's possible some helper in the wild is using it.

  - So for v2, helpers (again, including http) are mainly used via
    stateless-connect, driven by the main program. But they do still
    need to decide whether to do a v2 probe. And so there's similar
    logic in remote-curl.c's discover_refs() that looks for
    "git-receive-pack". But it's not buggy in the same way. Since it
    doesn't support servpath, it is always dealing with a "service"
    string like "git-receive-pack". And since it doesn't support
    straight "connect", it can't be used for "upload-archive".

    So we could leave that spot alone. But I've updated it here to match
    the logic we're changing in connect_git(). That seems like the least
    confusing thing for somebody who has to touch both of these spots
    later (say, to add v2 push support). I didn't add a new test to make
    sure this doesn't break anything; we already have several tests (in
    t5551 and elsewhere) that make sure we are using v2 over http.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 15:15:59 -07:00
b3edf335df transport: mark unused parameters in fetch_refs_from_bundle()
We don't look at the "to_fetch" or "nr_heads" parameters at all. At
first glance this seems like a bug (or at least pessimisation), because
it means we fetch more objects from the bundle than we actually need.
But the bundle does not have any way of computing the set of reachable
objects itself (we'd have to pull all of the objects out to walk them).
And anyway, we've probably already paid most of the cost of grabbing the
objects, since we must copy the bundle locally before accessing it.

So it's perfectly reasonable for the bundle code to just pull everything
into the local object store. Unneeded objects can be dropped later via
gc, etc.

But we should mark these unused parameters as such to avoid the wrath of
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 14:17:48 -07:00
1e5e097496 http: mark unused parameter in fill_active_slot() callbacks
We have a generic "fill" function that is used by both the dumb http
push and fetch code paths. It takes a void parameter in case the caller
wants to pass along extra data, but (since the previous commit) neither
does so.

So we could simply drop the extra parameter. But since it's good
practice to provide a void pointer for in callback functions, we'll
leave it here for the future, and just annotate it as unused (to appease
-Wunused-parameter).

While we're marking it, let's also fix the type in http-walker's
function to have the correct "void" type. The original had to cast the
function pointer and was technically undefined behavior (though
generally OK in practice).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 14:17:48 -07:00
647edf79d6 http: drop unused parameter from start_object_request()
We take a "walker" parameter for the request, but don't actually look at
it. This is due to 5424bc557f (http*: add helper methods for fetching
objects (loose), 2009-06-06). Before then, we consulted the "walker"
struct to tell us if we should be verbose, but now those messages are
printed elsewhere.

Let's drop the unused parameter to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 14:17:48 -07:00
910d07a861 mailmap: drop debugging code
There's some debugging code in mailmap.c which is only compiled if you
manually tweak the source to set DEBUG_MAILMAP. When it's not set, the
fallback noop uses static inline functions; we couldn't use macros here
because one of the functions is variadic (and variadic macros were
forbidden back then, but aren't now). As a result, this triggers
a -Wunused-parameter warning.

We have a few options here:

  1. Leave it be. Just mark it as UNUSED, or switch to a variadic macro.

  2. Assume the debugging code is useful, compile it always, and trigger
     it with a run-time flag (e.g., with a trace key). This is pretty
     easy to do, and carries a pretty small runtime cost.

  3. Assume the debugging is not very useful, and just rip it out. This
     matches what we did with a similar case in 69c5f17f11 (attr: drop
     DEBUG_ATTR code, 2022-10-06).

The debugging flag has been mentioned only three times on the list.
Once, when it was added in 2009:

  https://lore.kernel.org/git/cover.1234102794.git.marius@trolltech.com/

In 2013, when somebody fixed some compilation errors in the conditional
code (presumably because they used it while making other changes):

  https://lore.kernel.org/git/1373871253-96480-1-git-send-email-sunshine@sunshineco.com/

And finally it seemed to have been useful to somebody in 2020:

  https://lore.kernel.org/git/87eejswql6.fsf@evledraar.gmail.com/

So it's not totally without value. On the other hand, it's not likely to
be useful to non-developers (and certainly isn't if you have to
recompile). And using a debugger or adding your own inspection code is
likely to be as useful. So I've just dropped the code entirely here.

Note that we do still have to mark a few parameters unused in callback
functions which are passed to string_list_clear_func(). Those get an
extra pointer with the string being cleared, which we previously fed to
the debugging code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 14:17:19 -07:00
950264636c Start the 2.41 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 14:03:20 -07:00
5009dd4a1c Merge branch 'fz/rebase-msg-update'
Message update.

* fz/rebase-msg-update:
  rebase: fix capitalisation autoSquash in i18n string
2023-03-17 14:03:10 -07:00
4d87411ffe Merge branch 'ew/fetch-hiderefs'
A new "fetch.hideRefs" option can be used to exclude specified refs
from "rev-list --objects --stdin --not --all" traversal for
checking object connectivity, most useful when there are many
unrelated histories in a single repository.

* ew/fetch-hiderefs:
  fetch: support hideRefs to speed up connectivity checks
2023-03-17 14:03:10 -07:00
92c56da096 Merge branch 'mc/credential-helper-www-authenticate'
Allow information carried on the WWW-AUthenticate header to be
passed to the credential helpers.

* mc/credential-helper-www-authenticate:
  credential: add WWW-Authenticate header to cred requests
  http: read HTTP WWW-Authenticate response headers
  t5563: add tests for basic and anoymous HTTP access
2023-03-17 14:03:10 -07:00
af5388d2dd Merge branch 'jc/gpg-lazy-init'
Instead of forcing each command to choose to honor GPG related
configuration variables, make the subsystem lazily initialize
itself.

* jc/gpg-lazy-init:
  drop pure pass-through config callbacks
  gpg-interface: lazily initialize and read the configuration
2023-03-17 14:03:10 -07:00
88cc8ed8bc Merge branch 'en/header-cleanup'
Code clean-up to clarify the rule that "git-compat-util.h" must be
the first to be included.

* en/header-cleanup:
  diff.h: remove unnecessary include of object.h
  Remove unnecessary includes of builtin.h
  treewide: replace cache.h with more direct headers, where possible
  replace-object.h: move read_replace_refs declaration from cache.h to here
  object-store.h: move struct object_info from cache.h
  dir.h: refactor to no longer need to include cache.h
  object.h: stop depending on cache.h; make cache.h depend on object.h
  ident.h: move ident-related declarations out of cache.h
  pretty.h: move has_non_ascii() declaration from commit.h
  cache.h: remove dependence on hex.h; make other files include it explicitly
  hex.h: move some hex-related declarations from cache.h
  hash.h: move some oid-related declarations from cache.h
  alloc.h: move ALLOC_GROW() functions from cache.h
  treewide: remove unnecessary cache.h includes in source files
  treewide: remove unnecessary cache.h includes
  treewide: remove unnecessary git-compat-util.h includes in headers
  treewide: ensure one of the appropriate headers is sourced first
2023-03-17 14:03:09 -07:00
d0732a8120 Merge branch 'jk/unused-post-2.39-part2'
More work towards -Wunused.

* jk/unused-post-2.39-part2: (21 commits)
  help: mark unused parameter in git_unknown_cmd_config()
  run_processes_parallel: mark unused callback parameters
  userformat_want_item(): mark unused parameter
  for_each_commit_graft(): mark unused callback parameter
  rewrite_parents(): mark unused callback parameter
  fetch-pack: mark unused parameter in callback function
  notes: mark unused callback parameters
  prio-queue: mark unused parameters in comparison functions
  for_each_object: mark unused callback parameters
  list-objects: mark unused callback parameters
  mark unused parameters in signal handlers
  run-command: mark error routine parameters as unused
  mark "pointless" data pointers in callbacks
  ref-filter: mark unused callback parameters
  http-backend: mark unused parameters in virtual functions
  http-backend: mark argc/argv unused
  object-name: mark unused parameters in disambiguate callbacks
  serve: mark unused parameters in virtual functions
  serve: use repository pointer to get config
  ls-refs: drop config caching
  ...
2023-03-17 14:03:09 -07:00
f17d232f14 Merge branch 'en/dir-api-cleanup'
Code clean-up to clarify directory traversal API.

* en/dir-api-cleanup:
  unpack-trees: add usage notices around df_conflict_entry
  unpack-trees: special case read-tree debugging as internal usage
  unpack-trees: rewrap a few overlong lines from previous patch
  unpack-trees: mark fields only used internally as internal
  unpack_trees: start splitting internal fields from public API
  sparse-checkout: avoid using internal API of unpack-trees, take 2
  sparse-checkout: avoid using internal API of unpack-trees
  unpack-trees: clean up some flow control
  dir: mark output only fields of dir_struct as such
  dir: add a usage note to exclude_per_dir
  dir: separate public from internal portion of dir_struct
  unpack-trees: heed requests to overwrite ignored files
  t2021: fix platform-specific leftover cruft
2023-03-17 14:03:08 -07:00
2d019f46b0 Merge branch 'jk/fsck-indices-in-worktrees'
"git fsck" learned to check the index files in other worktrees,
just like "git gc" honors them as anchoring points.

* jk/fsck-indices-in-worktrees:
  fsck: check even zero-entry index files
  fsck: mention file path for index errors
  fsck: check index files in all worktrees
  fsck: factor out index fsck
2023-03-17 14:03:08 -07:00
7ee1af8cb8 completion: prompt: use generic colors
When the prompt command mode was introduced in 1bfc51ac81 (Allow
__git_ps1 to be used in PROMPT_COMMAND, 2012-10-10), the assumption was
that it was necessary in order to properly add colors to PS1 in bash,
but this wasn't true.

It's true that the \[ \] markers add the information needed to properly
calculate the width of the prompt, and they have to be added directly to
PS1, a function returning them doesn't work.

But that is because bash coverts the \[ \] markers in PS1 to \001 \002,
which is what readline ultimately needs in order to calculate the width.

We don't need bash to do this conversion, we can use \001 \002
ourselves, and then the prompt command mode is not necessary to display
colors.

This is what functions returning colors are supposed to do [1].

[1] http://mywiki.wooledge.org/BashFAQ/053

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Tested-by: Joakim Petersen <joak-pet@online.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-16 15:58:22 -07:00
dfbfdc521d object-name: fix quiet @{u} parsing
Currently `git rev-parse --quiet @{u}` is not actually quiet when
upstream isn't configured:

  fatal: no upstream configured for branch 'foo'

Make it so.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-16 10:44:56 -07:00
ab89575387 rebase: prefer --default-prefix to --{src,dst}-prefix for format-patch
When git-rebase invokes format-patch, it wants to make sure we use the
normal prefixes, and are not confused by diff.noprefix or similar. When
this was added in 5b220a6876 (Add --src/dst-prefix to git-formt-patch
in git-rebase.sh, 2010-09-09), we only had --src-prefix and --dst-prefix
to do so, which requires re-specifying the prefixes we expect to see.
These days we can say what we want more directly: just use the defaults.

This is a minor cleanup that should have no behavior change, but
hopefully the result expresses more clearly what the code is trying to
accomplish.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-13 14:57:31 -07:00
73876f4861 Git 2.40
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 14:34:41 -07:00
5db135ced5 Merge tag 'l10n-2.40.0-rnd1' of https://github.com/git-l10n/git-po
l10n-2.40.0-rnd1

* tag 'l10n-2.40.0-rnd1' of https://github.com/git-l10n/git-po:
  l10n: zh_CN v2.40.0 round 1
  l10n: update German translation
  l10n: tr: Update Turkish translations for v.2.40.0
  l10n: fr: v2.40.0 rnd 2
  l10n: fr: v2.40.0 rnd 1
  l10n: fr: fix some typos
  l10n: po-id for 2.40 (round 1)
  l10n: sv.po: Update Swedish translation (5490t0f0u)
  l10n: bg.po: Updated Bulgarian translation (5490t)
  l10n: Update Catalan translation
2023-03-12 14:33:14 -07:00
0c8d22abaf t5604: GETTEXT_POISON fix, conclusion
In fade728df1 (apply: fix writing behind newly created symbolic links,
2023-02-02), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:56 +01:00
7c811ed5e5 t5604: GETTEXT_POISON fix, part 1
In bffc762f87 (dir-iterator: prevent top-level symlinks without
FOLLOW_SYMLINKS, 2023-01-24), we backported a patch onto v2.30.* that
was originally based on a much newer version. The v2.30.* release train
still has the GETTEXT_POISON CI job, though, and hence needs
`test_i18n*` in its tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:56 +01:00
a2b2173cfe t5619: GETTEXT_POISON fix
In cf8f6ce02a (clone: delay picking a transport until after
get_repo_path(), 2023-01-24), we backported a patch onto v2.30.* that
was originally based on a much newer version. The v2.30.* release train
still has the GETTEXT_POISON CI job, though, and hence needs
`test_i18n*` in its tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:56 +01:00
c025b4b2f1 range-diff: use ssize_t for parsed "len" in read_patches()
As we iterate through the buffer containing git-log output, parsing
lines, we use an "int" to store the size of an individual line. This
should be a size_t, as we have no guarantee that there is not a
malicious 2GB+ commit-message line in the output.

Overflowing this integer probably doesn't do anything _too_ terrible. We
are not using the value to size a buffer, so the worst case is probably
an out-of-bounds read from before the array. But it's easy enough to
fix.

Note that we have to use ssize_t here, since we also store the length
result from parse_git_diff_header(), which may return a negative value
for error. That function actually returns an int itself, which has a
similar overflow problem, but I'll leave that for another day. Much
of the apply.c code uses ints and should be converted as a whole; in the
meantime, a negative return from parse_git_diff_header() will be
interpreted as an error, and we'll bail (so we can't handle such a case,
but given that it's likely to be malicious anyway, the important thing
is we don't have any memory errors).

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:55 +01:00
d99728b2ca t0003: GETTEXT_POISON fix, conclusion
In 3c50032ff5 (attr: ignore overly large gitattributes files,
2022-12-01), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:55 +01:00
a36df79a37 range-diff: handle unterminated lines in read_patches()
When parsing our buffer of output from git-log, we have a
find_end_of_line() helper that finds the next newline, and gives us the
number of bytes to move past it, or the size of the whole remaining
buffer if there is no newline.

But trying to handle both those cases leads to some oddities:

  - we try to overwrite the newline with NUL in the caller, by writing
    over line[len-1]. This is at best redundant, since the helper will
    already have done so if it saw a newline. But if it didn't see a
    newline, it's actively wrong; we'll overwrite the byte at the end of
    the (unterminated) line.

    We could solve this just dropping the extra NUL assignment in the
    caller and just letting the helper do the right thing. But...

  - if we see a "diff --git" line, we'll restore the newline on top of
    the NUL byte, so we can pass the string to parse_git_diff_header().
    But if there was no newline in the first place, we can't do this.
    There's no place to put it (the current code writes a newline
    over whatever byte we obliterated earlier). The best we can do is
    feed the complete remainder of the buffer to the function (which is,
    in fact, a string, by virtue of being a strbuf).

To solve this, the caller needs to know whether we actually found a
newline or not. We could modify find_end_of_line() to return that
information, but we can further observe that it has only one caller.
So let's just inline it in that caller.

Nobody seems to have noticed this case, probably because git-log would
never produce input that doesn't end with a newline. Arguably we could
just return an error as soon as we see that the output does not end in a
newline. But the code to do so actually ends up _longer_, mostly because
of the cleanup we have to do in handling the error.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:55 +01:00
e4298ccd7f t0003: GETTEXT_POISON fix, part 1
In dfa6b32b5e (attr: ignore attribute lines exceeding 2048 bytes,
2022-12-01), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:55 +01:00
8516dac1e1 t0033: GETTEXT_POISON fix
In e47363e5a8 (t0033: add tests for safe.directory, 2022-04-13), we
backported a patch onto v2.30.* that was originally based on a much
newer version. The v2.30.* release train still has the GETTEXT_POISON
CI job, though, and hence needs `test_i18n*` in its tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:55 +01:00
07f91e5e79 http: support CURLOPT_PROTOCOLS_STR
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.

Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:

  - we don't have to worry that silencing curl's deprecation warnings
    might cause us to miss other more useful ones

  - we'd eventually want to move to the new variant anyway, so this gets
    us set up (albeit with some extra ugly boilerplate for the
    conditional)

There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:

  GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
  if (...http is allowed...)
	GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);

and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.

On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).

This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:54 +01:00
a69043d510 ci: install python on ubuntu
Python is missing from the default ubuntu-22.04 runner image, which
prevents git-p4 from working. To install python on ubuntu, we need
to provide the correct package names:

 * On Ubuntu 18.04 (bionic), "/usr/bin/python2" is provided by the
   "python" package, and "/usr/bin/python3" is provided by the "python3"
   package.

 * On Ubuntu 20.04 (focal) and above, "/usr/bin/python2" is provided by
   the "python2" package which has a different name from bionic, and
   "/usr/bin/python3" is provided by "python3".

Since the "ubuntu-latest" runner image has a higher version, its
safe to use "python2" or "python3" package name.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:54 +01:00
18bc8eb7b5 range-diff: drop useless "offset" variable from read_patches()
The "offset" variable was was introduced in 44b67cb62b (range-diff:
split lines manually, 2019-07-11), but it has never done anything
useful. We use it to count up the number of bytes we've consumed, but we
never look at the result. It was probably copied accidentally from an
almost-identical loop in apply.c:find_header() (and the point of that
commit was to make use of the parse_git_diff_header() function which
underlies both).

Because the variable was set but not used, most compilers didn't seem to
notice, but the upcoming clang-14 does complain about it, via its
-Wunused-but-set-variable warning.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:54 +01:00
b0e3e2d06b http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.

But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros).  But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.

Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).

Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.

Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:54 +01:00
fda237cb64 http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:54 +01:00
86f6f4fa91 nedmalloc: avoid new compile error
GCC v12.x complains thusly:

compat/nedmalloc/nedmalloc.c: In function 'DestroyCaches':
compat/nedmalloc/nedmalloc.c:326:12: error: the comparison will always
                              evaluate as 'true' for the address of 'caches'
                              will never be NULL [-Werror=address]
  326 |         if(p->caches)
      |            ^
compat/nedmalloc/nedmalloc.c:196:22: note: 'caches' declared here
  196 |         threadcache *caches[THREADCACHEMAXCACHES];
      |                      ^~~~~~

... and it is correct, of course.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
79e0626b39 ci: use the same version of p4 on both Linux and macOS
There would be a segmentation fault when running p4 v16.2 on ubuntu
22.04 which is the latest version of ubuntu runner image for github
actions.

By checking each version from [1], p4d version 21.1 and above can work
properly on ubuntu 22.04. But version 22.x will break some p4 test
cases. So p4 version 21.x is exactly the version we can use.

With this update, the versions of p4 for Linux and macOS happen to be
the same. So we can add the version number directly into the "P4WHENCE"
variable, and reuse it in p4 installation for macOS.

By removing the "LINUX_P4_VERSION" variable from "ci/lib.sh", the
comment left above has nothing to do with p4, but still applies to
git-lfs. Since we have a fixed version of git-lfs installed on Linux,
we may have a different version on macOS.

[1]: https://cdist2.perforce.com/perforce/

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
20854bc47a ci: remove the pipe after "p4 -V" to catch errors
When installing p4 as a dependency, we used to pipe output of "p4 -V"
and "p4d -V" to validate the installation and output a condensed version
information. But this would hide potential errors of p4 and would stop
with an empty output. E.g.: p4d version 16.2 running on ubuntu 22.04
causes sigfaults, even before it produces any output.

By removing the pipe after "p4 -V" and "p4d -V", we may get a
verbose output, and stop immediately on errors because we have "set
-e" in "ci/lib.sh". Since we won't look at these trace logs unless
something fails, just including the raw output seems most sensible.

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
c03ffcff4e github-actions: run gcc-8 on ubuntu-20.04 image
GitHub starts to upgrade its runner image "ubuntu-latest" from version
"ubuntu-20.04" to version "ubuntu-22.04". It will fail to find and
install "gcc-8" package on the new runner image.

Change the runner image of the `linux-gcc` job from "ubuntu-latest" to
"ubuntu-20.04" in order to install "gcc-8" as a dependency.

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
417fb91b5d compat/win32/syslog: fix use-after-realloc
Git for Windows' SDK recently upgraded to GCC v12.x which points out
that the `pos` variable might be used even after the corresponding
memory was `realloc()`ed and therefore potentially no longer valid.

Since a subset of this SDK is used in Git's CI/PR builds, we need to fix
this to continue to be able to benefit from the CI/PR runs.

Note: This bug has been with us since 2a6b149c64 (mingw: avoid using
strbuf in syslog, 2011-10-06), and while it looks tempting to replace
the hand-rolled string manipulation with a `strbuf`-based one, that
commit's message explains why we cannot do that: The `syslog()` function
is called as part of the function in `daemon.c` which is set as the
`die()` routine, and since `strbuf_grow()` can call that function if it
runs out of memory, this would cause a nasty infinite loop that we do
not want to re-introduce.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:52 +01:00
cfb62dd006 ls-files: fix "--format" output of relative paths
Fix a bug introduced with the "--format" option in
ce74de93 (ls-files: introduce "--format" option, 2022-07-23),
where relative paths were computed using the output buffer,
which could lead to random garbage data in the output.

Signed-off-by: Adam Johnson <me@adamj.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-10 09:16:16 -08:00
c55c30669c receive-pack: fix stale packfile locks when dying
When accepting a packfile in git-receive-pack(1), we feed that packfile
into git-index-pack(1) to generate the packfile index. As the packfile
would often only contain unreachable objects until the references have
been updated, concurrently running garbage collection might be tempted
to delete the packfile right away and thus cause corruption. To fix
this, we ask git-index-pack(1) to create a `.keep` file before moving
the packfile into place, which is getting deleted again once all of the
reference updates have been processed.

Now in production systems we have observed that those `.keep` files are
sometimes not getting deleted as expected, where the result is that
repositories tend to grow packfiles that are never deleted over time.
This seems to be caused by a race when git-receive-pack(1) is killed
after we have migrated the kept packfile from the quarantine directory
into the main object database. While this race window is typically small
it can be extended for example by installing a `proc-receive` hook.

Fix this race by registering the lockfile as a tempfile so that it will
automatically be removed at exit or when receiving a signal.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-10 08:40:13 -08:00
3dbb0ff340 Merge branch 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po
* 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po:
  l10n: zh_CN v2.40.0 round 1
2023-03-10 22:50:14 +08:00
90ff7c9898 test: don't print aggregate-results command
There's no value in it.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 14:57:57 -08:00
5d1d62e875 test: simplify counts aggregation
When the list of files as input was implemented in 6508eedf67
(t/aggregate-results: accomodate systems with small max argument list
length, 2010-06-01), a much simpler solution wasn't considered.

Let's just pass the directory as an argument.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 14:57:55 -08:00
e2d003dbed object-file: reprepare alternates when necessary
When an object is not found in a repository's object store, we sometimes
call reprepare_packed_git() to see if the object was temporarily moved
into a new pack-file (and its old pack-file or loose object was
deleted). This process does a scan of each pack directory within each
odb, but does not reevaluate if the odb list needs updating.

Extend reprepare_packed_git() to also reprepare the alternate odb list
by setting loaded_alternates to zero and calling prepare_alt_odb(). This
will add newly-discoverd odbs to the linked list, but will not duplicate
existing ones nor will it remove existing ones that are no longer listed
in the alternates file. Do this under the object read lock to avoid
readers from interacting with a potentially incomplete odb being added
to the odb list.

If the alternates file was edited to _remove_ some alternates during the
course of the Git process, Git will continue to see alternates that were
ever valid for that repository. ODBs are not removed from the list, the
same as the existing behavior before this change. Git already has
protections against an alternate directory disappearing from the
filesystem during the lifetime of a process, and those are still in
effect.

This change is specifically for concurrent changes to the repository, so
it is difficult to create a test that guarantees this behavior is
correct. I manually verified by introducing a reprepare_packed_git() call
into get_revision() and stepped into that call in a debugger with a
parent 'git log' process. Multiple runs of prepare_alt_odb() kept
the_repository->objects->odb as a single-item chain until I added a
.git/objects/info/alternates file in a different process. The next run
added the new odb to the chain and subsequent runs did not add to the
chain.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 11:44:57 -08:00
15184ae9da fetch: pass --no-write-fetch-head to subprocesses
It seems a user would expect this option would work regardless
of whether it's fetching from a single remote, many remotes,
or recursing into submodules.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 11:06:39 -08:00
28d1122f9c add-patch: handle "* Unmerged path" lines
When we generate a diff with --cached, unmerged entries have no oid for
their index entry:

  $ git diff-index --abbrev --cached HEAD
  :100644 000000 f719efd 0000000 U	my-conflict

So when we are asked to produce a patch, since we only have one side, we
just emit a special message:

  $ git diff-index --cached -p HEAD
  * Unmerged path my-conflict

This confuses interactive-patch modes that look at cached diffs. For
example:

  $ git reset -p
  BUG: add-patch.c:498: diff starts with unexpected line:
  * Unmerged path my-conflict

Making things even more confusing, you'll get that error only if the
unmerged entry is alphabetically the first changed file. Otherwise, we
simply stick the unrecognized line to the end of the previous hunk.
There it's mostly harmless, as it eventually gets fed back to "git
apply", which happily ignores it. But it's still shown to the user
attached to the hunk, which is wrong.

So let's handle these lines as a noop. There's not really anything
useful to do with a conflicted merge in this case, and that's what we do
for other cases like "add -p". There we get a "diff --cc" line, which we
accept as starting a new file, but we refuse to use any of its hunks
(their headers start with "@@@" and not "@@ ", so we silently ignore
them).

It seems like simply recognizing the line and continuing in our parsing
loop would work. But we actually need to run the rest of the loop body
to handle matching up our colored/filtered output. But that code assumes
that we have some active file_diff we're working on. So instead, we'll
just insert a dummy entry into our array. This ends up the same as if we
saw a "diff --cc" line (a file with no hunks).

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 10:06:18 -08:00
8d5213decf format-patch: add format.noprefix option
The previous commit dropped support for diff.noprefix in format-patch.
While this will do the right thing in most cases (where sending patches
without a prefix was an accidental side effect of the sender preferring
to see their local patches without prefixes), it left no good option for
a project or workflow where you really do want to send patches without
prefixes. You'd be stuck using "--no-prefix" for every invocation.

So let's add a config option specific to format-patch that enables this
behavior. That gives people who have such a workflow a way to get what
they want, but makes it hard to accidentally trigger it.

A more backwards-compatible way of doing the transition would be to have
format.noprefix default to diff.noprefix when it's not set. But that
doesn't really help the "accidental" problem; people would have to
manually set format.noprefix=false. And it's unlikely that anybody
really wants format.noprefix=true in the first place. I'm adding it here
mostly as an escape hatch, not because anybody has expressed any
interest in it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:37:27 -08:00
c169af8f7a format-patch: do not respect diff.noprefix
The output of format-patch respects diff.noprefix, but this usually ends
up being a hassle for people receiving the patch, as they have to
manually specify "-p0" in order to apply it.

I don't think there was any specific intention for it to behave this
way. The noprefix option is handled by git_diff_ui_config(), and
format-patch exists in a gray area between plumbing and porcelain.
People do look at the output, and we'd expect it to colorize things,
respect their choice of algorithm, and so on. But this particular option
creates problems for the receiver (in theory so does diff.mnemonicprefix,
but since we are always formatting commits, the mnemonic prefixes will
always be "a/" and "b/").

So let's disable it. The slight downsides are:

  - people who have set diff.noprefix presumably like to see their
    patches without prefixes. If they use format-patch to review their
    series, they'll see prefixes. On the other hand, it is probably a
    good idea for them to look at what will actually get sent out.

    We could try to play games here with "is stdout a tty", as we do for
    color. But that's not a completely reliable signal, and it's
    probably not worth the trouble. If you want to see the patch with
    the usual bells and whistles, then you are better off using "git
    log" or "git show".

  - if a project really does have a workflow that likes prefix-less
    patches, and the receiver is prepared to use "-p0", then the sender
    now has to manually say "--no-prefix" for each format-patch
    invocation. That doesn't seem _too_ terrible given that the receiver
    has to manually say "-p0" for each git-am invocation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:32:23 -08:00
b39a569729 diff: add --default-prefix option
You can change the output of prefixes with diff.noprefix and
diff.mnemonicprefix, but there's no easy way to override them from the
command-line. We do have "--no-prefix", but there's no way to get back
to the default prefix. So let's add an option to do that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:32:21 -08:00
7c03d0db88 t4013: add tests for diff prefix options
We don't have any specific test coverage of diff's various prefix
options. We do incidentally invoke them in a few places, but it's worth
having a more thorough set of tests that covers all of the effects we
expect to see, and that the options kick in at the appropriate times.

This will be especially useful as the next patch adds more options.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:32:19 -08:00
6799aadfdf diff: factor out src/dst prefix setup
We directly manipulate diffopt's a_prefix and b_prefix to set up either
the default "a/foo" prefix or the "--no-prefix" variant. Although this
is only a few lines, it's worth pulling these into their own functions.
That lets us avoid one repetition already in this patch, but will also
give us a cleaner interface for callers which want to tweak this
setting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:32:17 -08:00
15a4cc912e sequencer.c: fix overflow & segfault in parse_strategy_opts()
The split_cmdline() function introduced in [1] returns an "int". If
it's negative it signifies an error. The option parsing in [2] didn't
account for this, and assigned the value directly to the "size_t
xopts_nr". We'd then attempt to loop over all of these elements, and
access uninitialized memory.

There's a few things that use this for option parsing, but one way to
trigger it is with a bad value to "-X <strategy-option>", e.g:

	git rebase -X"bad argument\""

In another context this might be a security issue, but in this case
someone who's already able to inject arguments directly to our
commands would be past other defenses, making this potential
escalation a moot point.

As the example above & test case shows the error reporting leaves
something to be desired. The function will loop over the
whitespace-split values, but when it encounters an error we'll only
report the first element, which is OK, not the second "argument\""
whose quote is unbalanced.

This is an inherent limitation of the current API, and the issue
affects other API users. Let's not attempt to fix that now. If and
when that happens these tests will need to be adjusted to assert the
new output.

1. 2b11e3170e (If you have a config containing something like this:,
   2006-06-05)
2. ca6c6b45dd (sequencer (rebase -i): respect strategy/strategy_opts
   settings, 2017-01-02)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-08 14:14:42 -08:00
765071a8f2 advice: add diverging advice for novices
The user might not necessarily know why ff only was configured, maybe an
admin did it, or the installer (Git for Windows), or perhaps they just
followed some online advice.

This can happen not only on pull.ff=only, but merge.ff=only too.

Even worse if the user has configured pull.rebase=false and
merge.ff=only, because in those cases a diverging merge will constantly
keep failing. There's no trivial way to get out of this other than
`git merge --no-ff`.

Let's not assume our users are experts in git who completely understand
all their configurations.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-08 09:28:42 -08:00
c35e313af8 Merge branch 'l10n-de-2.40' of github.com:ralfth/git
* 'l10n-de-2.40' of github.com:ralfth/git:
  l10n: update German translation
2023-03-08 09:10:20 +08:00
680f605e3c Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.40 (round 1)
2023-03-08 08:28:02 +08:00
62931b5929 Merge branch 'catalan' of github.com:Softcatala/git-po
* 'catalan' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2023-03-08 08:27:07 +08:00
2deb48aa37 Merge branch 'fr_2.40.0_rnd1' of github.com:jnavila/git
* 'fr_2.40.0_rnd1' of github.com:jnavila/git:
  l10n: fr: v2.40.0 rnd 2
  l10n: fr: v2.40.0 rnd 1
  l10n: fr: fix some typos
2023-03-08 08:26:00 +08:00
ae9b8c4926 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5490t0f0u)
2023-03-08 08:25:07 +08:00
462366874a Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5490t)
2023-03-08 08:23:16 +08:00
93a05aa02c Merge branch 'turkish' of github.com:bitigchi/git-po
* 'turkish' of github.com:bitigchi/git-po:
  l10n: tr: Update Turkish translations for v.2.40.0
2023-03-08 08:22:01 +08:00
cec74d09d8 l10n: zh_CN v2.40.0 round 1
Reviewed-by: 依云 <lilydjwg@gmail.com>
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2023-03-07 23:42:30 +00:00
725f57037d Git 2.40-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 21:53:11 -08:00
9a4e18b701 Merge branch 'gm/signature-format-doc'
Doc update.

* gm/signature-format-doc:
  signature-format.txt: note SSH and X.509 signature delimiters
2023-03-06 21:51:56 -08:00
0bbe10313e parse-options: use prefix_filename_except_for_dash() helper
Since our fix_filename()'s only remaining special case is handling "-",
we can use the newly-minted helper function that handles this already.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:14:53 -08:00
7ce4088ab7 parse-options: consistently allocate memory in fix_filename()
When handling OPT_FILENAME(), we have to stick the "prefix" (if any) in
front of the filename to make up for the fact that Git has chdir()'d to
the top of the repository. We can do this with prefix_filename(), but
there are a few special cases we handle ourselves.

Unfortunately the memory allocation is inconsistent here; if we do make
it to prefix_filename(), we'll allocate a string which the caller must
free to avoid a leak. But if we hit our special cases, we'll return the
string as-is, and a caller which tries to free it will crash. So there's
no way to win.

Let's consistently allocate, so that callers can do the right thing.

There are now three cases to care about in the function (and hence a
three-armed if/else):

  1. we got a NULL input (and should leave it as NULL, though arguably
     this is the sign of a bug; let's keep the status quo for now and we
     can pick at that scab later)

  2. we hit a special case that means we leave the name intact; we
     should duplicate the string. This includes our special "-"
     matching. Prior to this patch, it also included empty prefixes and
     absolute filenames. But we can observe that prefix_filename()
     already handles these, so we don't need to detect them.

  3. everything else goes to prefix_filename()

I've dropped the "const" from the "char **file" parameter to indicate
that we're allocating, though in practice it's not really important.
This is all being shuffled through a void pointer via opt->value before
it hits code which ever looks at the string. And it's even a bit weird,
because we are really taking _in_ a const string and using the same
out-parameter for a non-const string. A better function signature would
be:

  static char *fix_filename(const char *prefix, const char *file);

but that would mean the caller dereferences the double-pointer (and the
NULL check is currently handled inside this function). So I took the
path of least-change here.

Note that we have to fix several callers in this commit, too, or we'll
break the leak-checking tests. These are "new" leaks in the sense that
they are now triggered by the test suite, but these spots have always
been leaky when Git is run in a subdirectory of the repository. I fixed
all of the cases that trigger with GIT_TEST_PASSING_SANITIZE_LEAK. There
may be others in scripts that have other leaks, but we can fix them
later along with those other leaks (and again, you _couldn't_ fix them
before this patch, so this is the necessary first step).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:14:45 -08:00
a8bfa99d44 bundle: don't blindly apply prefix_filename() to "-"
A user can specify a filename to a command from the command line,
either as the value given to a command line option, or a command
line argument.  When it is given as a relative filename, in the
user's mind, it is relative to the directory "git" was started from,
but by the time the filename is used, "git" would almost always have
chdir()'ed up to the root level of the working tree.

The given filename, if it is relative, needs to be prefixed with the
path to the current directory, and it typically is done by calling
prefix_filename() helper function.  For commands that can also take
"-" to use the standard input or the standard output, however, this
needs to be done with care.

"git bundle create" uses the next word on the command line as the
output filename, and can take "-" to mean "write to the standard
output".  It blindly called prefix_filename(), so running it in a
subdirectory did not quite work as expected.

Introduce a new helper, prefix_filename_except_for_dash(), and use
it to help "git bundle create" codepath.

Reported-by: Michael Henry
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:12:56 -08:00
ef3b291a5f bundle: document handling of "-" as stdin
We have always allowed "bundle create -" to write to stdout, but it was
never documented. And a recent patch let reading operations like "bundle
list-heads -" read from stdin.

Let's document all of these cases.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:12:56 -08:00
bf8b1e04ff bundle: let "-" mean stdin for reading operations
For writing, "bundle create -" indicates that the bundle should be
written to stdout. But there's no matching handling of "-" for reading
operations. This is inconsistent, and a little inflexible (though one
can always use "/dev/stdin" on systems that support it).

However, it's easy to change. Once upon a time, the bundle-reading code
required a seekable descriptor, but that was fixed long ago in
e9ee84cf28 (bundle: allowing to read from an unseekable fd,
2011-10-13). So we just need to handle "-" explicitly when opening the
file.

We _could_ do this by handling "-" in read_bundle_header(), which the
reading functions all call already. But that is probably a bad idea.
It's also used by low-level code like the transport functions, and we
may want to be more careful there. We do not know that stdin is even
available to us, and certainly we would not want to get confused by a
configured URL that happens to point to "-".

So instead, let's add a helper to builtin/bundle.c. Since both the
bundle code and some of the callers refer to the bundle by name for
error messages, let's use the string "<stdin>" to make the output a bit
nicer to read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:12:55 -08:00
f7111175df git-merge-tree.txt: replace spurious HTML entity
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 11:29:25 -08:00
8b95521edb bundle: turn on --all-progress-implied by default
In 79862b6b77 (bundle-create: progress output control, 2019-11-10),
"bundle create" learned about the --all-progress and
--all-progress-implied options, which were copied from pack-objects.
I think these were a mistake.

In pack-objects, "all-progress-implied" is about switching the behavior
between a regular on-disk "git repack" and the use of pack-objects for
push/fetch (where a fetch does not want progress from the server during
the write stage; the client will print progress as it receives the
data). But there's no such distinction for bundles. Prior to
79862b6b77, we always printed the write stage. Afterwards, a vanilla:

  git bundle create foo.bundle

omits the write progress, appearing to hang (especially if your
repository is large or your disk is slow). That seems like a regression.

It's possible that the flexibility to disable the write-phase progress
_could_ be useful for bundle. E.g., if you did something like:

  ssh some-host git bundle create foo.bundle |
  git bundle unbundle

But if you are running both in real-time, why are you using bundles in
the first place? You're better off doing a real fetch.

But even if we did want to support that, it should be the exception, and
vanilla "bundle create" should display the full progress. So we'd want
to name the option "--no-write-progress" or something.

The "--all-progress" option itself is even worse. It exists in
pack-objects only for historical reasons. It's a mistake because it
implies "--progress", and we added "--all-progress-implied" to fix that.
There is no reason to propagate that mistake to new commands.

Likewise, the documentation for these options was pulled from
pack-objects. But it doesn't make any sense in this context. It talks
about "--stdout", but that is not even an option that git-bundle
supports.

This patch flips the default for "--all-progress-implied" back to
"true", fixing the regression in 79862b6b77. This turns that option
into a noop, and means that "--all-progress" is really the same as
"--progress". We _could_ drop them completely, but since they've been
shipped with Git since v2.25.0, it's polite to continue accepting them.

I didn't implement any sort of "--no-write-progress" here. I'm not at
all convinced it's necessary, and the discussion from the original
thread:

  https://lore.kernel.org/git/20191110204126.30553-2-robbat2@gentoo.org/

shows that that the main focus was on getting --progress and --quiet
support, and not any kind of clever "real-time bundle over the network"
feature. But technically this patch is making it impossible to do
something that you _could_ do post-79862b6b77c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 09:51:06 -08:00
5e104568ad l10n: update German translation
Reviewed-by: Matthias Rüster <matthias.ruester@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2023-03-06 17:33:21 +01:00
94c4289435 format-patch: output header for empty commits
When formatting an empty commit, it is surprising that a totally empty
file is generated.  Set the flag to always print the header, matching
the behaviour of git-log.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-03 09:13:52 -08:00
8790c93ce6 l10n: tr: Update Turkish translations for v.2.40.0
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2023-03-03 11:34:51 +03:00
81fba8e54c l10n: fr: v2.40.0 rnd 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2023-03-02 18:49:13 +01:00
1f7012f4ac l10n: fr: v2.40.0 rnd 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2023-03-02 18:41:06 +01:00
90c6ff566e l10n: fr: fix some typos
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Reported-by: Andrei Rybak <rybak.a.v@gmail.com>
2023-03-02 18:41:06 +01:00
2e6b49d732 l10n: po-id for 2.40 (round 1)
Update following components:

  * archive.c
  * attr.c
  * builtin/add.c
  * builtin/rebase.c
  * bundle.c
  * connect.c
  * sequencer.c
  * t/helper/test-bundle-uri.c
  * transport.c
  * wt-status.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2023-03-02 19:48:13 +07:00
8cb7de6f78 l10n: sv.po: Update Swedish translation (5490t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2023-03-02 09:35:41 +01:00
b0c48e4e95 l10n: bg.po: Updated Bulgarian translation (5490t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2023-03-02 08:56:33 +02:00
d15644fe02 Merge branch 'rs/range-diff-custom-abbrev-fix'
Hotfix for a topic that is already in 'master'.

* rs/range-diff-custom-abbrev-fix:
  range-diff: avoid compiler warning when char is unsigned
2023-03-01 13:25:24 -08:00
cdda1199e0 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2023-03-01 22:07:24 +01:00
ef7d4f53c2 Git 2.40-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-01 08:13:35 -08:00
7c3c55026c push: allow delete single-level ref
We discourage the creation/update of single-level refs
because some upper-layer applications only work in specified
reference namespaces, such as "refs/heads/*" or "refs/tags/*",
these single-level refnames may not be recognized. However,
we still hope users can delete them which have been created
by mistake.

Therefore, when updating branches on the server with
"git receive-pack", by checking whether it is a branch deletion
operation, it will determine whether to allow the update of
a single-level refs. This avoids creating/updating such
single-level refs, but allows them to be deleted.

On the client side, "git push" also does not properly fill in
the old-oid of single-level refs, which causes the server-side
"git receive-pack" to think that the ref's old-oid has changed
when deleting single-level refs, this causes the push to be
rejected. So the solution is to fix the client to be able to
delete single-level refs by properly filling old-oid.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-01 08:08:10 -08:00
d81ba50a9b receive-pack: fix funny ref error messsage
When the user deletes the remote one level branch through
"git push origin -d refs/foo", remote will return an error:
"refusing to create funny ref 'refs/foo' remotely", here we
are not creating "refs/foo" instead wants to delete it, so a
better error description here would be: "refusing to update
funny ref 'refs/foo' remotely".

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-01 08:08:09 -08:00
454dfcbddf A bit more before 2.40-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-28 16:38:47 -08:00
4240e0f6c0 Merge branch 'ar/test-lib-remove-stale-comment'
Test library clean-up.

* ar/test-lib-remove-stale-comment:
  test-lib: drop comment about test_description
2023-02-28 16:38:47 -08:00
8760a2b3c6 Merge branch 'zy/t9700-style'
Test style fixes.

* zy/t9700-style:
  t9700: modernize test scripts
2023-02-28 16:38:47 -08:00
a2d2b5229e Merge branch 'pw/rebase-i-parse-fix'
Fixes to code that parses the todo file used in "rebase -i".

* pw/rebase-i-parse-fix:
  rebase -i: fix parsing of "fixup -C<commit>"
  rebase -i: match whole word in is_command()
2023-02-28 16:38:47 -08:00
b2893ea403 Merge branch 'jk/http-test-fixes'
Various fix-ups on HTTP tests.

* jk/http-test-fixes:
  t5559: make SSL/TLS the default
  t5559: fix test failures with LIB_HTTPD_SSL
  t/lib-httpd: enable HTTP/2 "h2" protocol, not just h2c
  t/lib-httpd: respect $HTTPD_PROTO in expect_askpass()
  t5551: drop curl trace lines without headers
  t5551: handle v2 protocol in cookie test
  t5551: simplify expected cookie file
  t5551: handle v2 protocol in upload-pack service test
  t5551: handle v2 protocol when checking curl trace
  t5551: stop forcing clone to run with v0 protocol
  t5551: handle HTTP/2 when checking curl trace
  t5551: lower-case headers in expected curl trace
  t5551: drop redundant grep for Accept-Language
  t5541: simplify and move "no empty path components" test
  t5541: stop marking "used receive-pack service" test as v0 only
  t5541: run "used receive-pack service" test earlier
2023-02-28 16:38:47 -08:00
d9165bef58 range-diff: avoid compiler warning when char is unsigned
Since 2b15969f61 (range-diff: let '--abbrev' option takes effect,
2023-02-20), GCC 11.3 on Ubuntu 22.04 on aarch64 warns (and errors
out if the make variable DEVELOPER is set):

range-diff.c: In function ‘output_pair_header’:
range-diff.c:388:20: error: comparison is always false due to limited range of data type [-Werror=type-limits]
  388 |         if (abbrev < 0)
      |                    ^
cc1: all warnings being treated as errors

That's because char is unsigned on that platform.  Use int instead, just
like in struct diff_options, to copy the value faithfully.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-28 14:43:05 -08:00
31a431b18b signature-format.txt: note SSH and X.509 signature delimiters
This document only explains PGP signatures, but Git now supports X.509
signatures as of 1e7adb9756 (gpg-interface: introduce new signature
format "x509" using gpgsm, 2018-07-17), and SSH signatures as of
29b315778e (ssh signing: add ssh key format and signing code,
2021-09-10).

Additionally, explain that these signature formats are controlled
`gpg.format`, linking to its documentation, and explain in said
`gpg.format` documentation that the underlying signature format is
documented in signature-format.txt.

Signed-off-by: Gwyneth Morgan <gwymor@tilde.club>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 13:42:43 -08:00
f17a1542b2 rebase: fix capitalisation autoSquash in i18n string
The config option (as documented) for rebase.autoSquash has a capital S,
whereas the command line option has a small case s.

Cf. <20220617100309.3224-1-worldhello.net@gmail.com>

Signed-off-by: Fangyi Zhou <me@fangyi.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 12:10:29 -08:00
5f2117b24f credential: add WWW-Authenticate header to cred requests
Add the value of the WWW-Authenticate response header to credential
requests. Credential helpers that understand and support HTTP
authentication and authorization can use this standard header (RFC 2616
Section 14.47 [1]) to generate valid credentials.

WWW-Authenticate headers can contain information pertaining to the
authority, authentication mechanism, or extra parameters/scopes that are
required.

The current I/O format for credential helpers only allows for unique
names for properties/attributes, so in order to transmit multiple header
values (with a specific order) we introduce a new convention whereby a
C-style array syntax is used in the property name to denote multiple
ordered values for the same property.

In this case we send multiple `wwwauth[]` properties where the order
that the repeated attributes appear in the conversation reflects the
order that the WWW-Authenticate headers appeared in the HTTP response.

Add a set of tests to exercise the HTTP authentication header parsing
and the interop with credential helpers. Credential helpers will receive
WWW-Authenticate information in credential requests.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 10:40:40 -08:00
6b8dda9a4f http: read HTTP WWW-Authenticate response headers
Read and store the HTTP WWW-Authenticate response headers made for
a particular request.

This will allow us to pass important authentication challenge
information to credential helpers or others that would otherwise have
been lost.

libcurl only provides us with the ability to read all headers recieved
for a particular request, including any intermediate redirect requests
or proxies. The lines returned by libcurl include HTTP status lines
delinating any intermediate requests such as "HTTP/1.1 200". We use
these lines to reset the strvec of WWW-Authenticate header values as
we encounter them in order to only capture the final response headers.

The collection of all header values matching the WWW-Authenticate
header is complicated by the fact that it is legal for header fields to
be continued over multiple lines, but libcurl only gives us each
physical line a time, not each logical header. This line folding feature
is deprecated in RFC 7230 [1] but older servers may still emit them, so
we need to handle them.

In the future [2] we may be able to leverage functions to read headers
from libcurl itself, but as of today we must do this ourselves.

[1] https://www.rfc-editor.org/rfc/rfc7230#section-3.2
[2] https://daniel.haxx.se/blog/2022/03/22/a-headers-api-for-libcurl/

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 10:40:40 -08:00
988aad99b4 t5563: add tests for basic and anoymous HTTP access
Add a test showing simple anoymous HTTP access to an unprotected
repository, that results in no credential helper invocations.
Also add a test demonstrating simple basic authentication with
simple credential helper support.

Leverage a no-parsed headers (NPH) CGI script so that we can directly
control the HTTP responses to simulate a multitude of good, bad and ugly
remote server implementations around auth.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 10:40:40 -08:00
a0f05f6840 A bit more before 2.40-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 10:08:58 -08:00
506bd0ec82 Merge branch 'if/simplify-trace-setup'
Code clean-up.

* if/simplify-trace-setup:
  trace.c, git.c: remove unnecessary parameter to trace_repo_setup()
2023-02-27 10:08:58 -08:00
630501ceef Merge branch 'jc/countermand-format-attach'
The format.attach configuration variable lacked a way to override a
value defined in a lower-priority configuration file (e.g. the
system one) by redefining it in a higher-priority configuration
file.  Now, setting format.attach to an empty string means show the
patch inline in the e-mail message, without using MIME attachment.

This is a backward incompatible change.

* jc/countermand-format-attach:
  format.attach: allow empty value to disable multi-part messages
2023-02-27 10:08:57 -08:00
dda83e69d0 Merge branch 'jk/shorten-unambiguous-ref-wo-sscanf'
sscanf(3) used in "git symbolic-ref --short" implementation found
to be not working reliably on macOS in UTF-8 locales.  Rewrite the
code to avoid sscanf() altogether to work it around.

* jk/shorten-unambiguous-ref-wo-sscanf:
  shorten_unambiguous_ref(): avoid sscanf()
  shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
  shorten_unambiguous_ref(): avoid integer truncation
2023-02-27 10:08:57 -08:00
7dc55a04d8 Merge branch 'mh/credential-password-expiry'
The credential subsystem learned that a password may have an
explicit expiration.

* mh/credential-password-expiry:
  credential: new attribute password_expiry_utc
2023-02-27 10:08:57 -08:00
5e572aaa5d Merge branch 'rs/archive-mtime'
"git archive HEAD^{tree}" records the paths with the current
timestamp in the archive, making it harder to obtain a stable
output.  The command learned the --mtime option to specify an
arbitrary timestamp (e.g. --mtime="@0 +0000" for the epoch).

* rs/archive-mtime:
  archive: add --mtime
2023-02-27 10:08:57 -08:00
b8840a72e2 Merge branch 'tb/drop-dir-iterator-follow-symlink-bit'
Remove leftover and unused code.

* tb/drop-dir-iterator-follow-symlink-bit:
  t0066: drop setup of "dir5"
  dir-iterator: drop unused `DIR_ITERATOR_FOLLOW_SYMLINKS`
2023-02-27 10:08:57 -08:00
63f74cfbcc Merge branch 'tl/range-diff-custom-abbrev'
"git range-diff" learned --abbrev=<num> option.

* tl/range-diff-custom-abbrev:
  range-diff: let '--abbrev' option takes effect
2023-02-27 10:08:56 -08:00
93c12724f1 Merge branch 'ap/t2015-style-update'
Test clean-up.

* ap/t2015-style-update:
  t2015-checkout-unborn.sh: changes the style for cd
2023-02-27 10:08:56 -08:00
ece8dc97ae Merge branch 'jc/diff-algo-attribute'
The "diff" drivers specified by the "diff" attribute attached to
paths can now specify which algorithm (e.g. histogram) to use.

* jc/diff-algo-attribute:
  diff: teach diff to read algorithm from diff driver
  diff: consolidate diff algorithm option parsing
2023-02-27 10:08:56 -08:00
21522cf5d0 Merge branch 'pw/rebase-i-validate-labels-early'
An invalid label or ref in the "rebase -i" todo file used to
trigger an runtime error. SUch an error is now diagnosed while the
todo file is parsed.

* pw/rebase-i-validate-labels-early:
  rebase -i: check labels and refs when parsing todo list
2023-02-27 10:08:56 -08:00
ee8a88826a restore: fault --staged --worktree with merge opts
The 'restore' command already rejects the --merge, --conflict, --ours
and --theirs options when combined with --staged, but accepts them when
--worktree is added as well.

Unfortunately that doesn't appear to do anything useful. The --ours and
--theirs options seem to be ignored when both --staged and --worktree
are given, whereas with --merge or --conflict, the command has the same
effect as if the --staged option wasn't present.

So reject those options with '--staged --worktree' as well, using
opts->accept_ref to distinguish restore from checkout.

Add test for both '--staged' and '--staged --worktree'.

Signed-off-by: Andy Koppe <andy.koppe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 09:33:20 -08:00
c6ce27ab08 fetch: support hideRefs to speed up connectivity checks
With roughly 800 remotes all fetching into their own
refs/remotes/$REMOTE/* island, the connectivity check[1] gets
expensive for each fetch on systems which lack sufficient RAM to
cache objects.

To do a no-op fetch on one $REMOTE out of hundreds, hideRefs now
allows the no-op fetch to take ~30 seconds instead of ~20 minutes
on a noisy, RAM-constrained machine (localhost, so no network latency):

   git -c fetch.hideRefs=refs \
	-c fetch.hideRefs='!refs/remotes/$REMOTE/' \
	fetch $REMOTE

[1] `git rev-list --objects --stdin --not --all --quiet --alternate-refs'

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 09:27:03 -08:00
c600a91c94 test-lib: drop comment about test_description
When a comment describing how each test file should start was added in
commit [1], it was the second comment of t/test-lib.sh.  The comment
describes how variable "test_description" is supposed to be assigned at
the top of each test file and how "test-lib.sh" should be used by
sourcing it.  However, even in [1], the comment was ten lines away from
the usage of the variable by test-lib.sh.  Since then, the comment has
drifted away both from the top of the file and from the usage of the
variable.  The comment just sits in the middle of the initialization of
the test library, surrounded by unrelated code, almost one hundred lines
away from the usage of "test_description".

Nobody has noticed this drift during evolution of test-lib.sh, which
suggests that this comment has outlived its usefulness.  The assignment
of "test_description", sourcing of "test-lib.sh" by tests, and the
process of writing tests in general are described in detail in
"t/README".  So drop the obsolete comment.

An alternative solution could be to move the comment either to the top
of the file, or down to the usage of variable "test_description".

[1] e1970ce43a ("[PATCH 1/2] Test framework take two.", 2005-05-13)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 09:25:34 -08:00
f297424a3a unpack-trees: add usage notices around df_conflict_entry
Avoid making users believe they need to initialize df_conflict_entry
to something (as happened with other output only fields before) with
a quick comment and a small sanity check.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:51 -08:00
1ca13dd3ca unpack-trees: special case read-tree debugging as internal usage
builtin/read-tree.c has some special functionality explicitly designed
for debugging unpack-trees.[ch].  Associated with that is two fields
that no other external caller would or should use.  Mark these as
internal to unpack-trees, but allow builtin/read-tree to read or write
them for this special case.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:51 -08:00
0d680a7158 unpack-trees: rewrap a few overlong lines from previous patch
The previous patch made many lines a little longer, resulting in four
becoming a bit too long.  They were left as-is for the previous patch
to facilitate reviewers verifying that we were just adding "internal."
in a bunch of places, but rewrap them now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:51 -08:00
13e1fd6e38 unpack-trees: mark fields only used internally as internal
Continue the work from the previous patch by finding additional fields
which are only used internally but not yet explicitly marked as such,
and include them in the internal fields struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:51 -08:00
576de3d956 unpack_trees: start splitting internal fields from public API
This just splits the two fields already marked as internal-only into a
separate internal struct.  Future commits will add more fields that
were meant to be internal-only but were not explicitly marked as such
to the same struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:51 -08:00
33b1b4c768 sparse-checkout: avoid using internal API of unpack-trees, take 2
Commit 2f6b1eb794 ("cache API: add a "INDEX_STATE_INIT" macro/function,
add release_index()", 2023-01-12) mistakenly added some initialization
of a member of unpack_trees_options that was intended to be
internal-only.  This initialization should be done within
update_sparsity() instead.

Note that while o->result is mostly meant for unpack_trees() and
update_sparsity() mostly operates without o->result,
check_ok_to_remove() does consult it so we need to ensure it is properly
initialized.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:51 -08:00
1147c56ff7 sparse-checkout: avoid using internal API of unpack-trees
struct unpack_trees_options has the following field and comment:

	struct pattern_list *pl; /* for internal use */

Despite the internal-use comment, commit e091228e17 ("sparse-checkout:
update working directory in-process", 2019-11-21) starting setting this
field from an external caller.  At the time, the only way around that
would have been to modify unpack_trees() to take an extra pattern_list
argument, and there's a lot of callers of that function.  However, when
we split update_sparsity() off as a separate function, with
sparse-checkout being the sole caller, the need to update other callers
went away.  Fix this API problem by adding a pattern_list argument to
update_sparsity() and stop setting the internal o.pl field directly.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:51 -08:00
5d4f4a592e unpack-trees: clean up some flow control
The update_sparsity() function was introduced in commit 7af7a25853
("unpack-trees: add a new update_sparsity() function", 2020-03-27).
Prior to that, unpack_trees() was used, but that had a few bugs because
the needs of the caller were different, and different enough that
unpack_trees() could not easily be modified to handle both usecases.

The implementation detail that update_sparsity() was written by copying
unpack_trees() and then streamlining it, and then modifying it in the
needed ways still shows through in that there are leftover vestiges in
both functions that are no longer needed.  Clean them up.  In
particular:

  * update_sparsity() allows a pattern list to be passed in, but
    unpack_trees() never should use a different pattern list.  Add a
    check and a BUG() if this gets violated.
  * update_sparsity() has a check early on that will BUG() if
    o->skip_sparse_checkout is set; as such, there's no need to check
    for that condition again later in the code.  We can simply remove
    the check and its corresponding goto label.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:50 -08:00
d144a9d30d dir: mark output only fields of dir_struct as such
While at it, also group these fields together for convenience.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:50 -08:00
59e009bf15 dir: add a usage note to exclude_per_dir
As evidenced by the fix a couple commits ago, places in the code using
exclude_per_dir are likely buggy and should be adapted to call
setup_standard_excludes() instead.  Unfortunately, the usage of
exclude_per_dir has been hardcoded into the arguments ls-files accepts,
so we cannot actually remove it.  Add a note that it is deprecated and
no other callers should use it directly.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:50 -08:00
5fdf285e62 dir: separate public from internal portion of dir_struct
In order to make it clearer to callers what portions of dir_struct are
public API, and avoid errors from them setting fields that are meant as
internal API, split the fields used for internal implementation reasons
into a separate embedded struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:50 -08:00
b413a82712 unpack-trees: heed requests to overwrite ignored files
When a directory exists but has only ignored files within it and we are
trying to switch to a branch that has a file where that directory is,
the behavior depends upon --[no]-overwrite-ignore.  If the user wants to
--overwrite-ignore (the default), then we should delete the ignored file
and directory and switch to the new branch.

The code to handle this in verify_clean_subdirectory() in unpack-trees
tried to handle this via paying attention to the exclude_per_dir setting
of the internal dir field.  This came from commit c81935348b ("Fix
switching to a branch with D/F when current branch has file D.",
2007-03-15), which pre-dated 039bc64e88 ("core.excludesfile clean-up",
2007-11-14), and thus did not pay attention to ignore patterns from
other relevant files.  Change it to use setup_standard_excludes() so
that it is also aware of excludes specified in other locations.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:50 -08:00
24a49cf78e t2021: fix platform-specific leftover cruft
t2021.6 existed to test the status of a symlink that was left around by
previous tests.  It tried to also clean up the symlink after it was done
so that subsequent tests wouldn't be tripped up by it.  Unfortunately,
since this test had a SYMLINK prerequisite, that made the cleanup
platform dependent...and made a testcase I was trying to add to this
testsuite fail (that testcase will be included in the next patch).
Before we go and add new testcases, fix this cleanup by moving it into a
separate test.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:50 -08:00
cc5d1d32fd drop pure pass-through config callbacks
Commit fd2d4c135e (gpg-interface: lazily initialize and read the
configuration, 2023-02-09) shrunk a few custom config callbacks so that
they are just one-liners of:

  return git_default_config(...);

We can drop them entirely and replace them direct calls of
git_default_config() intead. This makes the code a little shorter and
easier to understand (with the downside being that if they do grow
custom options again later, we'll have to recreate the functions).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:00:39 -08:00
8d3e7eac52 fsck: check even zero-entry index files
In fb64ca526a (fsck: check index files in all worktrees, 2023-02-24), we
swapped out a call to vanilla repo_read_index() for a series of
read_index_from() calls, one per worktree. The code for the latter was
copied from add_index_objects_to_pending(), which checks for a positive
return value from the index reading function, and we do the same here in
fsck now.

But this is probably the wrong thing. I had interpreted the check as
"don't operate on the index struct if there was an error". But in
reality, if there is an error then the index-reading code will simply
die (which admittedly is not great for fsck, but that is not a new
problem).

The return value here is actually the number of entries read. So it
makes sense for add_index_objects_to_pending() to ignore a zero-entry
index (there is nothing to add). But for fsck, we would still want to
check any extensions, etc (though presumably it is unlikely to have them
in an empty index, I don't think it's impossible).

So we should ignore the return value from read_index_from() entirely.
This matches the behavior before fb64ca526a, when we ignored the return
value from repo_read_index().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 07:36:36 -08:00
894ea94509 switch: reject if the branch is already checked out elsewhere (test)
Since 5883034 (checkout: reject if the branch is already checked out
elsewhere) in normal use, we do not allow multiple worktrees having the
same checked out branch.

A bug has recently been fixed that caused this to not work as expected.

Let's add a test to notice if this changes in the future.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-25 13:05:23 -08:00
279f42fa27 rebase: refuse to switch to a branch already checked out elsewhere (test)
In b5cabb4a9 (rebase: refuse to switch to branch already checked out
elsewhere, 2020-02-23) we add a condition to prevent a rebase operation
involving a switch to a branch that is already checked out in another
worktree.

A bug has recently been fixed that caused this to not work as expected.

Let's add a test to notice if this changes in the future.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-25 13:05:23 -08:00
faa4d5983b branch: fix die_if_checked_out() when ignore_current_worktree
In 8d9fdd7 (worktree.c: check whether branch is rebased in another
worktree, 2016-04-22) die_if_checked_out() learned a new option
ignore_current_worktree, to modify the operation from "die() if the
branch is checked out in any worktree" to "die() if the branch is
checked out in any worktree other than the current one".

Unfortunately we implemented it by checking the flag is_current in the
worktree that find_shared_symref() returns.

When the same branch is checked out in several worktrees simultaneously,
find_shared_symref() will return the first matching worktree in the list
composed by get_worktrees().  If one of the worktrees with the checked
out branch is the current worktree, find_shared_symref() may or may not
return it, depending on the order in the list.

Instead of find_shared_symref(), let's do the search using use the
recently introduced API is_shared_symref(), and consider
ignore_current_worktree when necessary.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-25 13:05:23 -08:00
662078caac worktree: introduce is_shared_symref()
Add a new function, is_shared_symref(), which contains the heart of
find_shared_symref().  Refactor find_shared_symref() to use the new
function is_shared_symref().

Soon, we will use is_shared_symref() to search for symref beyond
the first worktree that matches.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-25 13:05:23 -08:00
509d3f5103 t9700: modernize test scripts
The style of t9700-perl-git.sh is old. There are 3 problems:
* A title is not on the same line with test_expect_success command.
* A test body is indented by whitespaces.
* There are whitespaces after redirect operators.

Modernize test scripts by:
* Combine the title with test_expect_success command.
* Replace whitespace indents with TAB.
* Delete whitespaces after redirect operators.

Signed-off-by: Zhang Yi <18994118902@163.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-25 12:20:06 -08:00
dadc8e6dac A few more topics post 2.40-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 22:54:01 -08:00
f96dd8c3b5 Merge branch 'ps/free-island-marks'
Fix on a previous fix already in 'master'.

* ps/free-island-marks:
  delta-islands: fix segfault when freeing island marks
2023-02-24 22:54:01 -08:00
6f581b6d6d Merge branch 'jk/http-proxy-tests'
Test updates.

* jk/http-proxy-tests:
  add basic http proxy tests
2023-02-24 22:54:01 -08:00
d180cc2979 Merge branch 'ma/fetch-parallel-use-online-cpus'
"git fetch --jobs=0" used to hit a BUG(), which has been corrected
to use the available CPUs.

* ma/fetch-parallel-use-online-cpus:
  fetch: choose a sensible default with --jobs=0 again
2023-02-24 22:54:00 -08:00
c5f7ef5fdc Git 2.40-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 11:32:40 -08:00
deb32d6d60 Merge branch 'jc/genzeros-avoid-raw-write'
A test helper had a single write(2) of 256kB, which was too big for
some platforms (e.g. NonStop), which has been corrected by using
xwrite() wrapper appropriately.

* jc/genzeros-avoid-raw-write:
  test-genzeros: avoid raw write(2)
2023-02-24 11:32:30 -08:00
a7981d0717 Merge branch 'rd/doc-default-date-format'
Update --date=default documentation.

* rd/doc-default-date-format:
  rev-list: clarify git-log default date format
2023-02-24 11:32:30 -08:00
38a227b796 Merge branch 'js/gpg-errors'
Error messages given upon a signature verification failure used to
discard the errors from underlying gpg program, which has been
corrected.

* js/gpg-errors:
  gpg: do show gpg's error message upon failure
  t7510: add a test case that does not need gpg
2023-02-24 11:32:29 -08:00
98619325c0 Merge branch 'rs/ctype-test'
Test safe_ctype

* rs/ctype-test:
  test-ctype: test iscntrl, ispunct, isxdigit and isprint
  test-ctype: test islower and isupper
  test-ctype: test isascii
2023-02-24 11:32:29 -08:00
592ec63b38 fsck: mention file path for index errors
If we encounter an error in an index file, we may say something like:

  error: 1234abcd: invalid sha1 pointer in resolve-undo

But if you have multiple worktrees, each with its own index, it can be
very helpful to know which file had the problem. So let's pass that path
down through the various index-fsck functions and use it where
appropriate. After this patch you should get something like:

  error: 1234abcd: invalid sha1 pointer in resolve-undo of .git/worktrees/wt/index

That's a bit verbose, but since the point is that you shouldn't see this
normally, we're better to err on the side of more details.

I've also added the index filename to the name used by "fsck
--name-objects", which will show up if we find the object to be missing,
etc. This is bending the rules a little there, as the option claims to
write names that can be fed to rev-parse. But there is no revision
syntax to access the index of another worktree, so the best we can do is
make up something that a human will probably understand.

I did take care to retain the existing ":file" syntax for the current
worktree. So the uglier output should kick in only when it's actually
necessary. See the included tests for examples of both forms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:32:23 -08:00
fb64ca526a fsck: check index files in all worktrees
We check the index file for the main worktree, but completely ignore the
index files in other worktrees. These should be checked, too, as they
are part of the repository state (and in particular, errors in those
index files may cause repo-wide operations like "git gc" to complain).

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:32:23 -08:00
8840069a37 fsck: factor out index fsck
The code to fsck an index operates directly on the_index. Let's move it
into its own function in preparation for handling the index files from
other worktrees.

Since we now have only a single reference to the_index, let's drop
our USE_THE_INDEX_VARIABLE definition and just use the_repository.index
directly. That's a minor cleanup, but also ensures that we didn't miss
any references when moving the code into fsck_index().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:30:58 -08:00
506ebaac96 help: mark unused parameter in git_unknown_cmd_config()
The extra callback parameter became unused in 0918d08887 (help.c: fix
autocorrect in work tree for bare repository, 2022-10-29), but we can't
get rid of it because we must conform to the config callback interface.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:33 -08:00
a5c76b3698 run_processes_parallel: mark unused callback parameters
Our parallel process API takes several callbacks via function pointers
in the run_process_paralell_opts struct. Not every callback needs every
parameter; let's mark the unused ones to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:33 -08:00
1bff855419 userformat_want_item(): mark unused parameter
This function is used as a callback to strbuf_expand(), so it must
conform to the correct interface. But naturally it doesn't need to touch
its "sb" parameter, since it is only examining the placeholder string,
and not actually writing any output. So mark the unused parameter to
silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:32 -08:00
43090008e3 for_each_commit_graft(): mark unused callback parameter
The for_each_commit_graft() functions takes a callback, but not every
callback uses the void data parameter. Mark the unused one to appease
the -Wunused-parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:32 -08:00
c764e28060 rewrite_parents(): mark unused callback parameter
The rewrite_parents() function takes a callback, but not every callback
needs the "rev" parameter. Mark the unused one so -Wunused-parameter
will be happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:32 -08:00
65daa9ba1c fetch-pack: mark unused parameter in callback function
The for_each_cached_alternate() interface requires a callback that takes
a negotiator parameter, but not all implementations need it. Mark the
unused one as such to appease -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:32 -08:00
3c50c88f42 notes: mark unused callback parameters
for_each_note() requires a callback, but not all callbacks need all of
the parameters. Likewise, init_notes() takes a callback to implement the
"combine" strategy, but the "ignore" variant obviously doesn't look at
its arguments at all. Mark unused parameters as appropriate to silence
compiler warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:32 -08:00
1758712248 prio-queue: mark unused parameters in comparison functions
The prio_queue_compare_fn interface has a void pointer to allow callers
to pass arbitrary data, but most comparison functions don't need it.
Mark those cases to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:31 -08:00
be252d3349 for_each_object: mark unused callback parameters
The for_each_{loose,packed}_object interface uses callback functions,
but not every callback needs all of the parameters. Mark the unused ones
to satisfy -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:31 -08:00
c50dca2a18 list-objects: mark unused callback parameters
Our graph-traversal functions take callbacks for showing commits and
objects, but not all callbacks need each parameter.  Likewise for the
similar traverse_bitmap_commit_list(), which has a different interface
but serves the same purpose. And the include_check mechanism, which
passes along a void pointer which is not always used.

Mark the unused ones to to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:31 -08:00
9ec03b59a8 mark unused parameters in signal handlers
Signal handlers receive their signal number as a parameter, but many
don't care what it is (because they only handle one signal, or because
their action is the same regardless of the signal). Mark such parameters
to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:30 -08:00
ce41759ed5 run-command: mark error routine parameters as unused
After forking but before exec-ing a command, we install special
error/warn/die handlers in the child. These ignore the error messages
they get, since the idea is that they shouldn't be called in the first
place.

Arguably they could pass along that error message _in addition_ to
saying "error() should not be called in a child", but since the whole
point is to avoid any conflicts on stdio/malloc locks, etc, we're better
to just keep these simple. Seeing them trigger is effectively a bug, and
the developer is probably better off grabbing a stack trace.

But we do want to mark the functions so that -Wunused-parameter doesn't
complain.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:30 -08:00
d3dcfa047f mark "pointless" data pointers in callbacks
Both the object_array_filter() and trie_find() functions use callback
functions that let the caller specify which elements match. These
callbacks take a void pointer in case the caller wants to pass in extra
data. But in each case, the single user of these functions just passes
NULL, and the callback ignores the extra pointer.

We could just remove these unused parameters from the callback interface
entirely. But it's good practice to provide such a pointer, as it guides
future callers of the function in the right direction (rather than
tempting them to access global data). Plus it's consistent with other
generic callback interfaces.

So let's instead annotate the unused parameters, in order to silence the
compiler's -Wunused-parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:30 -08:00
5fe9e1ce2f ref-filter: mark unused callback parameters
The ref-filter code uses virtual functions to handle specific atoms, but
many of the functions ignore some parameters:

  - most atom parsers do not need the ref_format itself, unless they are
    looking at centralized options like use_color, quote_style, etc.

  - meta-atom handlers like append_atom(), align_atom_handler(), etc,
    can't generate errors, so ignore their "err" parameter

  - likewise, the handlers for then/else/end do not even need to look at
    their atom_value, as the "if" handler put everything they need into
    the ref_formatting_state stack

Since these functions all have to conform to virtual function
interfaces, we can't just drop the unused parameters, but must mark them
as UNUSED (to appease -Wunused-parameter).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:30 -08:00
2be1506a78 http-backend: mark unused parameters in virtual functions
The http-backend dispatches requests via a table of virtual functions.
Some of the functions ignore their "arg" parameter, because it's
implicit in the function (e.g., get_info_refs knows that it is
dispatched only for a request to "/info/refs").

Mark these unused parameters to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:30 -08:00
77ef8b0e1e http-backend: mark argc/argv unused
We can't drop them because it's cmd_main(), which has a set prototype,
but the CGI interface does not do anything with such arguments.

Arguably we could detect them and complain. It's possible this could
detect misconfigurations or other mistakes, but:

  - as far as I can tell common webservers like apache do not have any
    mechanism to pass arguments to a CGI at all, so this isn't a mistake
    one could even make

  - it's possible that some obscure webserver might pass arguments, and
    we'd break that case. I have no idea if such a webserver exists; the
    CGI standard says only "The script is invoked in a system-defined
    manner".

So probably it would not hurt to detect them, but it also is unlikely to
help anyone. Let's just mark them as unused, which retains the current
behavior but silences -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:29 -08:00
07ffb954b3 object-name: mark unused parameters in disambiguate callbacks
The object-name disambiguation code triggers a callback for each
possible object id we find. This is really used for two purposes:

  - "hint" functions like disambiguate_commit_only report back on
    whether the value is usable

  - iterator functions like repo_for_each_abbrev() use it to collect
    and report matching names.

Compiling with -Wunused-parameter generates several warnings, but
they're distinct for each type. The "hint" functions never look at the
void cb_data pointer; they only care whether the oid matches our hint.
The iterator functions never look at the "struct repository" parameter;
they're just reporting back the oids they see, and always return 0.

So arguably these could be two separate interfaces:

  int (*hint)(struct repository *r, const struct object_id *oid);
  void (*iter)(const struct object_id *oid, void *cb_data);

But doing so would complicate the disambiguation code, which now has to
accept and call the two different types. Since we can easily squelch the
compiler warnings by annotating the functions, let's just do that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:29 -08:00
74595cca21 serve: mark unused parameters in virtual functions
Each v2 "serve" action has a virtual function for advertising and
implementing the command. A few of these are so trivial that they don't
need to look at their parameters, especially the "repository" parameter.
We can mark them so that -Wunused-parameter doesn't complain.

Note that upload_pack_v2() probably _should_ be using its repository
pointer. But teaching the functions it calls to do so is non-trivial.
Even using it for something as simple as reading config is tricky, both
because it shares code with the v1 upload pack, and because the
git_protected_config() mechanism it uses does not have a repo-specific
interface. So we'll just annotate it for now, and cleaning it up can be
part of the larger work to drop references to the_repository.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:29 -08:00
4b4e75dd4f serve: use repository pointer to get config
A few of the v2 "serve" callbacks ignore their repository parameter and
read config using the_repository (either directly or implicitly by
calling wrapper functions). This isn't a bug since the server code only
handles a single main repository anyway (and indeed, if you look at the
callers, these repository parameters will always be the_repository). But
in the long run we want to get rid of the_repository, so let's take a
tiny step in that direction.

As a bonus, this silences some -Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:29 -08:00
c4716086d8 ls-refs: drop config caching
The code for the v2 ls-refs command has an ensure_config_read() function
that tries to read the lsrefs.unborn config only once and caches it in
some static global variables.

There's no real need for this caching. In any given process we'd only
need the value twice (once to decide whether to advertise, and once if
somebody runs the command). And since the config code already has its
own cache, each access is only incurring a hash lookup and string
comparison anyway.

Since the values we set are going to be specific to the_repository, the
globals we set are a mild anti-pattern. In practice it's not a bug (yet)
since the server-side v2 code only handles a single repository anyway.
But it doesn't hurt to take a small step in the right direction and
model a good approach.

Note that we currently set two booleans: advertise_unborn and
allow_unborn. But we can get away with a single value, since "advertise"
naturally implies "allow". That lets us just convert this to a function
with a return value.

Note that we still always read from the_repository; we'll deal with that
in a follow-on patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:28 -08:00
fe6258c348 ref-filter: drop unused atom parameter from get_worktree_path()
The get_worktree_path() function is used to populate the %(worktreepath)
value, but it has never used its "atom" parameter since it was added in
2582083fa1 (ref-filter: add worktreepath atom, 2019-04-28).

Normally we'd use the atom struct to cache any work we do, but in this
case there's a global hashmap that does that caching already. So we can
just drop the unused parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24 09:13:28 -08:00
f524970185 diff.h: remove unnecessary include of object.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:30 -08:00
eef65c716c Remove unnecessary includes of builtin.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:30 -08:00
fc7bd51b06 treewide: replace cache.h with more direct headers, where possible
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:30 -08:00
cbeab74713 replace-object.h: move read_replace_refs declaration from cache.h to here
Adjust several files to be more explicit about their dependency on
replace-objects to accommodate this change.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:30 -08:00
1c02840008 object-store.h: move struct object_info from cache.h
Move struct object_info, and a few related #define's from cache.h to
object-store.h.

A surprising effect of this change is that replace-object.h, which
includes object-store.h, now needs to directly include cache.h since
that is where read_replace_refs is declared and that variable is used
in one of its inline functions.  The next commit will move that
declaration and fix that unfortunate new direct inclusion of cache.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
ac48adf488 dir.h: refactor to no longer need to include cache.h
Moving a few functions around allows us to make dir.h no longer need to
include cache.h.  This commit is best viewed with:
    git log -1 -p --color-moved

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
a64215b6cd object.h: stop depending on cache.h; make cache.h depend on object.h
Things should be able to depend on object.h without pulling in all of
cache.h.  Move an enum to allow this.

Note that a couple files previously depended on things brought in
through cache.h indirectly (revision.h -> commit.h -> object.h ->
cache.h).  As such, this change requires making existing dependencies
more explicit in half a dozen files.  The inclusion of strbuf.h in
some headers if of particular note: these headers directly embedded a
strbuf in some new structs, meaning they should have been including
strbuf.h all along but were indirectly getting the necessary
definitions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
b5fa608180 ident.h: move ident-related declarations out of cache.h
These functions were all defined in a separate ident.c already, so
create ident.h and move the declarations into that file.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
b6c09c03eb pretty.h: move has_non_ascii() declaration from commit.h
The function is defined in pretty.c, so this moves the declaration to
a more logical place.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
41771fa435 cache.h: remove dependence on hex.h; make other files include it explicitly
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
b73ecb4811 hex.h: move some hex-related declarations from cache.h
hex.c contains code for hex-related functions, but for some reason these
functions were declared in the catch-all cache.h.  Move the function
declarations into a hex.h header instead.

This also allows us to remove includes of cache.h from a few C files.
For now, we make cache.h include hex.h, so that it is easier to review
the direct changes being made by this patch.  In the next patch, we will
remove that, and add the necessary direct '#include "hex.h"' in the
hundreds of C files that need it.

Note that reviewing the header changes in this commit might be
simplified via
    git log --no-walk -p --color-moved $COMMIT -- '*.h'`
In particular, it highlights the simple movement of code in .h files
rather nicely.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
41227cb138 hash.h: move some oid-related declarations from cache.h
These defines and enum are all oid-related and as such seem to make
more sense being included in hash.h.  Further, moving them there
allows us to remove some includes of cache.h in other files.

The change to line-log.h might look unrelated, but line-log.h includes
diffcore.h, which previously included cache.h, which included the
kitchen sink.  Since this patch makes diffcore.h no longer include
cache.h, the compiler complains about the 'struct string_list *'
function parameter.  Add a forward declaration for struct string_list to
address this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
36bf195890 alloc.h: move ALLOC_GROW() functions from cache.h
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places.  It does mean that we also need to add
includes of alloc.h in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
15db4e7f4a treewide: remove unnecessary cache.h includes in source files
We had several C files include cache.h unnecessarily.  Replace those
with an include of "git-compat-util.h" instead.  Much like the previous
commit, these have all been verified via both ensuring that
    gcc -E $SOURCE_FILE | grep '"cache.h"'
found no hits and that
    make DEVELOPER=1 ${OBJECT_FILE_FOR_SOURCE_FILE}
successfully compiles without warnings.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
ba3d1c73da treewide: remove unnecessary cache.h includes
We had several header files include cache.h unnecessarily.  Remove
those.  These have all been verified via both ensuring that
    gcc -E $HEADER | grep '"cache.h"'
found no hits and that
    cat >temp.c <<EOF &&
    #include "git-compat-util.h"
    #include "$HEADER"
    int main() {}
    EOF
    gcc -c temp.c
successfully compiles without warnings.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
f332121e75 treewide: remove unnecessary git-compat-util.h includes in headers
For sanity, we should probably do one of the following:

(a) make C and header files both depend upon everything they need
(b) consistently exclude git-compat-util.h from headers and require it
    be the first include in C files

Currently, we have some of the headers following (a) and others
following (b), which makes things messy.  In the past I was pushed
towards (b), as per [1] and [2].  Further, during this series I
discovered that this mixture empirically will mean that we end up with C
files that do not directly include git-compat-util.h, and do include
headers that don't include git-compat-util.h, with the result that we
likely have headers included before an indirect inclusion of
git-compat-util.h.  Since git-compat-util.h has tricky platform-specific
stuff that is meant to be included before everything else, this state of
affairs is risky and may lead to things breaking in subtle ways (and
only on some platforms) as per [1] and [2].

Since including git-compat-util.h in existing header files makes it
harder for us to catch C files that are missing that include, let's
switch to (b) to make the enforcement of this rule easier.  Remove the
inclusion of git-compat-util.h from header files other than the ones
that have been approved as alternate first includes.

[1] https://lore.kernel.org/git/20180811173406.GA9119@sigill.intra.peff.net/
[2] https://lore.kernel.org/git/20180811174301.GA9287@sigill.intra.peff.net/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
8bff5ca030 treewide: ensure one of the appropriate headers is sourced first
We had several C files ignoring the rule to include one of the
appropriate headers first; fix that.

While at it, the rule in Documentation/CodingGuidelines about which
header to include has also fallen out of sync, so update the wording to
mention other allowed headers.

Unfortunately, C files in reftable/ don't actually follow the previous
or updated rule.  If you follow the #include chain in its C files,
reftable/system.h _tends_ to be first (i.e. record.c first includes
record.h, which first includes basics.h, which first includees
system.h), but not always (e.g. publicbasics.c includes another header
first that does not include system.h).  However, I'm going to punt on
making actual changes to the C files in reftable/ since I do not want to
risk bringing it out-of-sync with any version being used externally.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
666b6e1135 rebase -i: fix parsing of "fixup -C<commit>"
If the user omits the space between "-C" and the commit in a fixup
command then it is parsed as an ordinary fixup and the commit message is
not updated as it should be. Fix this by making the space between "-C"
and "<commit>" optional as it is for the "merge" command.

Note that set_replace_editor() is changed to set $GIT_SEQUENCE_EDITOR
instead of $EDITOR in order to be able to replace the todo list and
reword commits with $FAKE_COMMIT_MESSAGE. This is safe as all the
existing users are using set_replace_editor() to replace the todo list.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 14:25:50 -08:00
7aed2c0565 rebase -i: match whole word in is_command()
When matching an unabbreviated command is_command() only does a prefix
match which means it parses "pickled" as TODO_PICK. parse_insn_line()
does error out because is_command() only advances as far as the end of
"pick" so it looks like the command name is not followed by a space but
the error message is "missing arguments for pick" rather than telling
the user that the "pickled" is not a valid command.

Fix this by ensuring the match is follow by whitespace or the end of the
string as we already do for abbreviated commands. The (*bol = p) at the
end of the condition is a bit cute for my taste but I decided to leave
it be for now. Rather than add new tests the existing tests for bad
commands are adapted to use a bad command name that triggers the prefix
matching bug.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 14:25:48 -08:00
8f2146dbf1 t5559: make SSL/TLS the default
The point of t5559 is run the regular t5551 tests with HTTP/2. But it
does so with the "h2c" protocol, which uses cleartext upgrades from
HTTP/1.1 to HTTP/2 (rather than learning about HTTP/2 support during the
TLS negotiation).

This has a few problems:

 - it's not very indicative of the real world. In practice, most servers
   that support HTTP/2 will also support TLS.

 - support for upgrading does not seem as robust. In particular, we've
   run into bugs in some versions of Apache's mod_http2 that trigger
   only with the upgrade mode. See:

     https://lore.kernel.org/git/Y8ztIqYgVCPILJlO@coredump.intra.peff.net/

So the upside is that this change makes our HTTP/2 tests more robust and
more realistic. The downside is that if we can't set up SSL for any
reason, we'll skip the tests (even though you _might_ have been able to
run the HTTP/2 tests the old way). We could probably have a conditional
fallback, but it would be complicated for little gain, and it's not even
clear it would help (i.e., would any test environment even have HTTP/2
but not SSL support?).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:18 -08:00
86190028a8 t5559: fix test failures with LIB_HTTPD_SSL
One test needs to be tweaked in order for t5559 to pass with SSL/TLS set
up. When we make our initial clone, we check that the curl trace of
requests is what we expected. But we need to fix two things:

  - along with ignoring "data" lines from the trace, we need to ignore
    "SSL data" lines

  - when TLS is used, the server is able to tell the client (via ALPN)
    that it supports HTTP/2 before the first HTTP request is made. So
    rather than request an upgrade using an HTTP header, it can just
    speak HTTP/2 immediately

With this patch, running:

  LIB_HTTPD_SSL=1 ./t5559-http-fetch-smart-http2.sh

works, whereas it did not before.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:17 -08:00
3c14419c6b t/lib-httpd: enable HTTP/2 "h2" protocol, not just h2c
Commit 73c49a4474 (t: run t5551 tests with both HTTP and HTTP/2,
2022-11-11) added Apache config to enable HTTP/2. However, it only
enabled the "h2c" protocol, which allows cleartext HTTP/2 (generally
based on an upgrade header during an HTTP/1.1 request). This is what
t5559 is generally testing, since by default we don't set up SSL/TLS.

However, it should be possible to run t5559 with LIB_HTTPD_SSL set. In
that case, Apache will advertise support for HTTP/2 via ALPN during the
TLS handshake. But we need to tell it support "h2" (the non-cleartext
version) to do so. Without that, then curl does not even try to do the
HTTP/1.1 upgrade (presumably because after seeing that we did TLS but
didn't get the ALPN indicator, it assumes it would be fruitless).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:17 -08:00
9d15b1e5df t/lib-httpd: respect $HTTPD_PROTO in expect_askpass()
When the HTTP tests are run with LIB_HTTPD_SSL in the environment, then
we access the test server as https://. This causes expect_askpass to
complain, because it tries to blindly match "http://" in the prompt
shown to the user. We can adjust this to use $HTTPD_PROTO, which is set
during the setup phase.

Note that this is enough for t5551 and t5559 to pass when run with
https, but there are similar problems in other scripts that will need to
be fixed before the whole suite can run with LIB_HTTPD_SSL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:15 -08:00
b71a2bf11f t5551: drop curl trace lines without headers
We pick apart a curl trace, looking for "=> Send header:" and so on, and
matching against an expected set of requests and responses. We remove
"== Info" lines entirely. However, our parser is fooled when running the
test with LIB_HTTPD_SSL on Ubuntu 20.04 (as found in our linux-gcc CI
job), as curl hands us an "Info" buffer with a newline, and we get:

  == Info: successfully set certificate verify locations:
  == Info:   CAfile: /etc/ssl/certs/ca-certificates.crt
    CApath: /etc/ssl/certs
  => Send SSL data[...]

which results in the "CApath" line ending up in the cleaned-up output,
causing the test to fail.

Arguably the tracing code should detect this and put it on two separate
"== Info" lines. But this is actually a curl bug, fixed by their
80d73bcca (tls: provide the CApath verbose log on its own line,
2020-08-18). It's simpler to just work around it here.

Since we are using GIT_TRACE_CURL, every line should just start with one
of "<=", "==", or "=>", and we can throw away anything else. In fact, we
can just replace the pattern for deleting "*" lines. Those were from the
old GIT_CURL_VERBOSE output, but we switched over in 14e24114d9
(t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var,
2016-09-05).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:15 -08:00
93ea5bf3a8 t5551: handle v2 protocol in cookie test
After making a request, we check that it stored the expected cookies.
This depends on the protocol version, because the cookies we store
depend on the exact requests we made (and for ls-remote, v2 will always
hit /git-upload-pack to get the refs, whereas v0 is happy with the
initial ref advertisement).

As a result, hardly anybody runs this test, as you'd have to manually
set GIT_TEST_PROTOCOL_VERSION=0 to do so.

Let's teach it to handle both protocol versions. One way to do this
would be to make the expectation conditional on the protocol used. But
there's a simpler solution. The reason that v0 doesn't hit
/git-upload-pack is that ls-remote doesn't fetch any objects. If we
instead do a fetch (making sure there's an actual object to grab), then
both v0 and v2 will hit the same endpoints and set the same cookies.

Note that we do have to clean up our new tag here; otherwise it confuses
the later "clone 2,000 tags" test.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:15 -08:00
87d38afa0d t5551: simplify expected cookie file
After making an HTTP request that should store cookies, we check that
the expected values are in the cookie file. We don't want to look at the
whole file, because it has noisy comments at the top that we shouldn't
depend on. But we strip out the interesting bits using "tail -3", which
is brittle. It requires us to put an extra blank line in our expected
output, and it would fail to notice any reordering or extra content in
the cookie file.

Instead, let's just grep for non-blank lines that are not comments,
which more directly describes what we're interested in.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:15 -08:00
795d713e2c t5551: handle v2 protocol in upload-pack service test
We perform a clone and a fetch, and then check that we saw the expected
requests in Apache's access log. In the v2 protocol, there will be one
extra request to /git-upload-pack for each operation (since the initial
/info/refs probe is just used to upgrade the protocol).

As a result, this test is a noop unless the use of the v0 protocol is
forced. Which means that hardly anybody runs it, since you have to do so
manually.

Let's update it to handle v2 and run it always. We could do this by just
conditionally adding in the extra POST lines. But if we look at the
origin of the test in 7da4e2280c (test smart http fetch and push,
2009-10-30), the point is really just to make sure that the smart
git-upload-pack service was used at all. So rather than counting up the
individual requests, let's just make sure we saw each of the expected
types. This is a bit looser, but makes maintenance easier.

Since we're now matching with grep, we can also loosen the HTTP/1.1
match, which allows this test to pass when run with HTTP/2 via t5559.
That lets:

  GIT_TEST_PROTOCOL_VERSION=0 ./t5559-http-fetch-smart-http2.sh

run to completion, which previously failed (and of course it works if
you use v2, as well).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:15 -08:00
1c5a63818a t5551: handle v2 protocol when checking curl trace
After cloning an http repository, we check the curl trace to make sure
the expected requests were made. But since the expected trace was never
updated to handle v2, it is only run when you ask the test suite to run
in v0 mode (which hardly anybody does).

Let's update it to handle both protocols. This isn't too hard since v2
just sends an extra header and an extra request. So we can just annotate
those extra lines and strip them out for v0 (and drop the annotations
for v2). I didn't bother handling v1 here, as it's not really of
practical interest (it would drop the extra v2 request, but still have
the "git-protocol" lines).

There's a similar tweak needed at the end. Since we check the
"accept-encoding" value loosely, we grep for it rather than finding it
in the verbatim trace. This grep insists that there are exactly 2
matches, but of course in v2 with the extra request there are 3. We
could tweak the number, but it's simpler still to just check that we saw
at least one match. The verbatim check already confirmed how many
instances of the header we have; we're really just checking here that
"gzip" is in the value (it's possible, of course, that the headers could
have different values, but that seems like an unlikely bug).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:15 -08:00
2f87277dfa t5551: stop forcing clone to run with v0 protocol
In the "clone http repository" test, we check the curl trace to make
sure the expected requests were made. This whole script was marked to
handle only the v0 protocol in d790ee1707 (tests: fix protocol version
for overspecifications, 2019-02-25). That makes sense, since v2 requires
an extra request, so tests as specific as this would fail unless
modified.

Later, in preparation for v2 becoming the default, this was tweaked by
8a1b0978ab (test: request GIT_TEST_PROTOCOL_VERSION=0 when appropriate,
2019-12-23). There we run the trace check only if the user has
explicitly asked to test protocol version 0. But it also forced the
clone itself to run with the v0 protocol.

This makes the check for "can we expect a v0 trace" silly; it will
always be v0. But much worse, it means that the clone we are testing is
not like the one that normal users would run. They would use the
defaults, which are now v2.  And since this is supposed to be a basic
check of clone-over-http, we should do the same.

Let's fix this by dropping the extra v0 override. The test still passes
because the trace checking only kicks in if we asked to use v0
explicitly (this is the same as before; even though we were running a v0
clone, unless you specifically set GIT_TEST_PROTOCOL_VERSION=0, the
trace check was always skipped).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:15 -08:00
8dfe36b007 t5551: handle HTTP/2 when checking curl trace
We check that the curl trace of a clone has the lines we expect, but
this won't work when we run the test under t5559, because a few details
are different under HTTP/2 (but nobody noticed because it only happens
when you manually set GIT_TEST_PROTOCOL_VERSION to "0").

We can handle both HTTP protocols with a few tweaks:

  - we'll drop the HTTP "101 Switching Protocols" response, as well as
    various protocol upgrade headers. These details aren't interesting
    to us. We just want to make sure the correct protocol was used (and
    we do in the main request/response lines).

  - successful HTTP/2 responses just say "200" and not "200 OK"; we can
    normalize these

  - replace HTTP/1.1 with a variable in the request/response lines. We
    can use the existing $HTTP_PROTO for this, as it's already set to
    "HTTP/2" when appropriate. We do need to tweak the fallback value to
    "HTTP/1.1" to match what curl will write (prior to this patch, the
    fallback value didn't matter at all; we only checked if it was the
    literal string "HTTP/2").

Note that several lines still expect HTTP/1.1 unconditionally. The first
request does so because the client requests an upgrade during the
request. The POST request and response do so because you can't do an
upgrade if there is a request body. (This will all be different if we
trigger HTTP/2 via ALPN, but the tests aren't yet capable of that).

This is enough to let:

  GIT_TEST_PROTOCOL_VERSION=0 ./t5559-http-fetch-smart-http2.sh

pass the "clone http repository" test (but there are some other failures
later on).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:14 -08:00
4a21230ab0 t5551: lower-case headers in expected curl trace
There's a test in t5551 which checks the curl trace (after simplifying
it a bit). It doesn't work with HTTP/2, because in that case curl
outputs all of the headers in lower-case. Even though this test is run
with HTTP/2 by t5559, nobody has noticed because checking the trace only
happens if GIT_TEST_PROTOCOL_VERSION is manually set to "0".

Let's fix this by lower-casing all of the header names in the trace, and
then checking for those in our expected code (this is easier than making
HTTP/2 traces look like HTTP/1.1, since HTTP/1.1 uses title-casing).

Sadly, we can't quite do this in our existing sed script. This works if
you have GNU sed:

  s/^\\([><]\\) \\([A-Za-z0-9-]*:\\)/\1 \L\2\E/

but \L is a GNU-ism, and I don't think there's a portable solution. We
could just "tr A-Z a-z" on the way in, of course, but that makes the
non-header parts harder to read (e.g., lowercase "post" requests). But
to paraphrase Baron Munchausen, I have learned from experience that a
modicum of Perl can be most efficacious.

Note that this doesn't quite get the test passing with t5559; there are
more fixes needed on top.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:14 -08:00
a58f4d6328 t5551: drop redundant grep for Accept-Language
Commit b0c4adcdd7 (remote-curl: send Accept-Language header to server,
2022-07-11) added tests to make sure the header is sent via HTTP.
However, it checks in two places:

  1. In the expected trace output, we check verbatim for the header and
     its value.

  2. Afterwards, we grep for the header again in the trace file.

This (2) is probably cargo-culted from the earlier grep for
Accept-Encoding. It is needed for the encoding because we smudge the
value of that header when doing the verbatim check; see 1a53e692af
(remote-curl: accept all encodings supported by curl, 2018-05-22).

But we don't do so for the language header, so any problem that the
"grep" would catch in (2) would already have been caught by (1).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:14 -08:00
f1449a563f t5541: simplify and move "no empty path components" test
Commit 9ee6bcd398 (t5541-http-push: add test for URLs with trailing
slash, 2010-04-08) added a test that clones a URL with a trailing slash,
and confirms that we don't send a doubled slash (like "$url//info/refs")
to the server.

But this test makes no sense in t5541, which is about pushing. It should
have been added in t5551. Let's move it there.

But putting it at the end is tricky, since it checks the entire contents
of the Apache access log. We could get around this by clearing the log
before our test. But there's an even simpler solution: just make sure no
doubled slashes appear in the log (fortunately, "http://" does not
appear in the log itself).

As a bonus, this also lets us drop the check for the v0 protocol (which
is otherwise necessary since v2 makes multiple requests, and
check_access_log insists on exactly matching the number of requests,
even though we don't care about that here).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:14 -08:00
6ec90b5bf1 t5541: stop marking "used receive-pack service" test as v0 only
We have a test which checks to see if a request to git-receive-pack was
made. Originally, it was checking the entire set of requests made in the
script so far, including clones, and thus it would break when run with
the v2 protocol (since that implies an extra request for fetches).

Since the previous commit, though, we are only checking the requests
made by a single push. And since there is no v2 push protocol, the test
now passes no matter what's in GIT_TEST_PROTOCOL_VERSION. We can just
run it all the time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:14 -08:00
77fb36aa7e t5541: run "used receive-pack service" test earlier
There's a test in t5541 that confirms that "git push" makes two requests
(a GET to /info/refs, and a POST to /git-receive-pack). However, it's a
noop unless GIT_TEST_PROTOCOL_VERSION is set to "0", due to 8a1b0978ab
(test: request GIT_TEST_PROTOCOL_VERSION=0 when appropriate,
2019-12-23).

This means that almost nobody runs it. And indeed, it has been broken
since b0c4adcdd7 (remote-curl: send Accept-Language header to server,
2022-07-11). But the fault is not in that commit, but in how brittle the
test is. It runs after several operations have been performed, which
means that it expects to see the complete set of requests made so far in
the script. Commit b0c4adcdd7 added a new test, which means that the
"used receive-pack service" test must be updated, too.

Let's fix this by making the test less brittle. We'll move it higher in
the script, right after the first push has completed. And we'll clear
the access log right before doing the push, so we'll see only the
requests from that command.

This is technically testing less, in that we won't check that all of
those other requests also correctly used smart http. But there's no
particular reason think that if the first one did, the others wouldn't.

After this patch, running:

  GIT_TEST_PROTOCOL_VERSION=0 ./t5541-http-push-smart.sh

passes, whereas it did not before.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 13:01:14 -08:00
d208bfdfef credential: new attribute password_expiry_utc
Some passwords have an expiry date known at generation. This may be
years away for a personal access token or hours for an OAuth access
token.

When multiple credential helpers are configured, `credential fill` tries
each helper in turn until it has a username and password, returning
early. If Git authentication succeeds, `credential approve`
stores the successful credential in all helpers. If authentication
fails, `credential reject` erases matching credentials in all helpers.
Helpers implement corresponding operations: get, store, erase.

The credential protocol has no expiry attribute, so helpers cannot
store expiry information. Even if a helper returned an improvised
expiry attribute, git credential discards unrecognised attributes
between operations and between helpers.

This is a particular issue when a storage helper and a
credential-generating helper are configured together:

	[credential]
		helper = storage  # eg. cache or osxkeychain
		helper = generate  # eg. oauth

`credential approve` stores the generated credential in both helpers
without expiry information. Later `credential fill` may return an
expired credential from storage. There is no workaround, no matter how
clever the second helper. The user sees authentication fail (a retry
will succeed).

Introduce a password expiry attribute. In `credential fill`, ignore
expired passwords and continue to query subsequent helpers.

In the example above, `credential fill` ignores the expired password
and a fresh credential is generated. If authentication succeeds,
`credential approve` replaces the expired password in storage.
If authentication fails, the expired credential is erased by
`credential reject`. It is unnecessary but harmless for storage
helpers to self prune expired credentials.

Add support for the new attribute to credential-cache.
Eventually, I hope to see support in other popular storage helpers.

Example usage in a credential-generating helper
https://github.com/hickford/git-credential-oauth/pull/16

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Reviewed-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-22 15:18:58 -08:00
06dd2baa8d The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-22 14:55:59 -08:00
5048df67b2 Merge branch 'ab/hook-api-with-stdin'
Extend the run-hooks API to allow feeding data from the standard
input when running the hook script(s).

* ab/hook-api-with-stdin:
  hook: support a --to-stdin=<path> option
  sequencer: use the new hook API for the simpler "post-rewrite" call
  hook API: support passing stdin to hooks, convert am's 'post-rewrite'
  run-command: allow stdin for run_processes_parallel
  run-command.c: remove dead assignment in while-loop
2023-02-22 14:55:45 -08:00
72972ea0b9 Merge branch 'ab/various-leak-fixes'
Leak fixes.

* ab/various-leak-fixes:
  push: free_refs() the "local_refs" in set_refspecs()
  push: refactor refspec_append_mapped() for subsequent leak-fix
  receive-pack: release the linked "struct command *" list
  grep API: plug memory leaks by freeing "header_list"
  grep.c: refactor free_grep_patterns()
  builtin/merge.c: free "&buf" on "Your local changes..." error
  builtin/merge.c: use fixed strings, not "strbuf", fix leak
  show-branch: free() allocated "head" before return
  commit-graph: fix a parse_options_concat() leak
  http-backend.c: fix cmd_main() memory leak, refactor reg{exec,free}()
  http-backend.c: fix "dir" and "cmd_arg" leaks in cmd_main()
  worktree: fix a trivial leak in prune_worktrees()
  repack: fix leaks on error with "goto cleanup"
  name-rev: don't xstrdup() an already dup'd string
  various: add missing clear_pathspec(), fix leaks
  clone: use free() instead of UNLEAK()
  commit-graph: use free_commit_graph() instead of UNLEAK()
  bundle.c: don't leak the "args" in the "struct child_process"
  tests: mark tests as passing with SANITIZE=leak
2023-02-22 14:55:45 -08:00
6aac634f81 Merge branch 'jk/doc-ls-remote-matching'
Doc update.

* jk/doc-ls-remote-matching:
  doc/ls-remote: clarify pattern format
  doc/ls-remote: cosmetic cleanups for examples
2023-02-22 14:55:45 -08:00
a42d69ee5b Merge branch 'rs/cache-tree-strbuf-growth-fix'
Remove unnecessary explicit sizing of strbuf.

* rs/cache-tree-strbuf-growth-fix:
  cache-tree: fix strbuf growth in prime_cache_tree_rec()
2023-02-22 14:55:44 -08:00
24fb150dcd Merge branch 'ab/the-index-compatibility'
Remove more remaining uses of macros that relies on the_index
singleton instance without explicitly spelling it out.

* ab/the-index-compatibility:
  cocci & cache.h: remove "USE_THE_INDEX_COMPATIBILITY_MACROS"
  cache-tree API: remove redundant update_main_cache_tree()
  cocci & cache-tree.h: migrate "write_cache_as_tree" to "*_index_*"
  cocci & cache.h: apply pending "index_cache_pos" rule
  cocci & cache.h: fully apply "active_nr" part of index-compatibility
  builtin/rm.c: use narrower "USE_THE_INDEX_VARIABLE"
2023-02-22 14:55:44 -08:00
5fc6d00b65 Merge branch 'en/name-rev-make-taggerdate-much-less-important'
"git name-rev" heuristics update.

* en/name-rev-make-taggerdate-much-less-important:
  name-rev: fix names by dropping taggerdate workaround
2023-02-22 14:55:44 -08:00
2b15969f61 range-diff: let '--abbrev' option takes effect
As mentioned in 'git-range-diff.txt': "`git range-diff` also accepts the
regular diff options (see linkgit:git-diff[1])...", but '--abbrev' is not
in the "regular" scope.

In Git, the "abbrev" of an object may not be a fixed value in different
repositories, depending on the needs of the them(Linus mentioned in
e6c587c7 in 2016: "the Linux kernel project needs 11 to 12 hexdigits"
at that time ), that's why a user may want to display abbrev according
to a specified length.

Although a similar effect can be achieved through configuration (like:
git -c core.abbrev=<abbrev>), but based on ease of use (many users may not
know that the -c option can be specified) and the description in existing
document, supporting users to directly use '--abbrev', could be a good way.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 14:02:05 -08:00
c39952b925 fetch: choose a sensible default with --jobs=0 again
prior to 51243f9 (run-command API: don't fall back on online_cpus(),
2022-10-12) `git fetch --multiple --jobs=0` would choose some default amount
of jobs, similar to `git -c fetch.parallel=0 fetch --multiple`. While our
documentation only ever promised that `fetch.parallel` would fall back to a
"sensible default", it makes sense to do the same for `--jobs`. So fall back
to online_cpus() and not BUG() out.

This fixes https://github.com/git-for-windows/git/issues/4302

Reported-by: Drew Noakes <drnoakes@microsoft.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 12:09:40 -08:00
17ab64e1b5 trace.c, git.c: remove unnecessary parameter to trace_repo_setup()
trace_repo_setup() of trace.c is called with the argument 'prefix' from
only one location, run_builtin of git.c, which sets 'prefix' to the return
value of setup_git_directory() or setup_git_directory_gently() (a wrapper
of the former).

Now that "prefix" is in startup_info there is no need for the parameter
of trace_repo_setup() because setup_git_directory() sets "startup_info->prefix"
to the same value it returns. It would be less confusing to use "prefix"
from startup_info instead of passing it as an argument.

Signed-off-by: Idriss Fekir <mcsm224@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 12:06:32 -08:00
d35d8f2e7a t2015-checkout-unborn.sh: changes the style for cd
the `cd` followed the old style which wasn't consistent with the rest of
the test suite, so this commit makes it consistent with the current
style of the test suite for `cd` in  subshell.

Signed-off-by: Ashutosh Pandey <ashutosh.pandeyhlr007@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 12:01:53 -08:00
a4cf900ee7 diff: teach diff to read algorithm from diff driver
It can be useful to specify diff algorithms per file type. For example,
one may want to use the minimal diff algorithm for .json files, another
for .c files, etc.

The diff machinery already checks attributes for a diff driver. Teach
the diff driver parser a new type "algorithm" to look for in the
config, which will be used if a driver has been specified through the
attributes.

Enforce precedence of the diff algorithm by favoring the command line
option, then looking at the driver attributes & config combination, then
finally the diff.algorithm config.

To enforce precedence order, use a new `ignore_driver_algorithm` member
during options parsing to indicate the diff algorithm was set via command
line args.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 09:29:10 -08:00
11e95e16e8 diff: consolidate diff algorithm option parsing
A subsequent commit will need the ability to tell if the diff algorithm
was set through the command line through setting a new member of
diff_options. While this logic can be added to the
diff_opt_diff_algorithm() callback, the `--minimal` and `--histogram`
options are handled via OPT_BIT without a callback.

Remedy this by consolidating the options parsing logic for --minimal and
--histogram into one callback. This way we can modify `diff_options` in
that function.

As an additional refactor, the logic that sets the diff algorithm in
diff_opt_diff_algorithm() can be refactored into a helper that will
allow multiple callsites to set the diff algorithm.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 09:29:08 -08:00
16b3880dd7 rebase -i: check labels and refs when parsing todo list
Check that the argument to the "label" and "update-ref" commands is a
valid refname when the todo list is parsed rather than waiting until the
command is executed. This means that the user can deal with any errors
at the beginning of the rebase rather than having it stop halfway
through due to a typo in a label name. The "update-ref" command is
changed to reject single level refs as it is all to easy to type
"update-ref branch" which is incorrect rather than "update-ref
refs/heads/branch"

Note that it is not straight forward to check the arguments to "reset"
and "merge" commands as they may be any revision, not just a refname and
we do not have an equivalent of check_refname_format() for revisions.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 09:18:37 -08:00
6eb095d787 delta-islands: fix segfault when freeing island marks
In 647982bb71 (delta-islands: free island_marks and bitmaps, 2023-02-03)
we have introduced logic to free `island_marks` in order to reduce heap
memory usage in git-pack-objects(1). This commit is causing segfaults in
the case where this Git command does not load delta islands at all, e.g.
when reading object IDs from standard input. One such crash can be hit
when using repacking multi-pack-indices with delta islands enabled.

The root cause of this bug is that we unconditionally dereference the
`island_marks` variable even in the case where it is a `NULL` pointer,
which is fixed by making it conditional. Note that we still leave the
logic in place to set the pointer to `-1` to detect use-after-free bugs
even when there are no loaded island marks at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 09:15:04 -08:00
fd2da4b1ea archive: add --mtime
Allow users to specify the modification time of archive entries.  The
new option --mtime uses approxidate() to parse a time specification and
overrides the default of using the current time for trees and the commit
time for tags and commits.  It can be used to create a reproducible
archive for a tree, or to use a specific mtime without creating a commit
with GIT_COMMITTER_DATE set.

This implementation doesn't support the negated form of the new option,
i.e. --no-mtime is not accepted.  It is not possible to have no mtime at
all.  We could use the Unix epoch or revert to the default behavior, but
since negation is not necessary for the intended use it's left undecided
for now.

Requested-by: Raul E Rangel <rrangel@chromium.org>
Suggested-by: demerphq <demerphq@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-18 09:29:13 -08:00
50bebf98d9 format.attach: allow empty value to disable multi-part messages
When a lower precedence configuration file (e.g. /etc/gitconfig)
defines format.attach in any way, there was no way to disable it in
a more specific configuration file (e.g. $HOME/.gitconfig).

Change the behaviour of setting it to an empty string.  It used to
mean that the result is still a multipart message with only dashes
used as a multi-part separator, but now it resets the setting to
the default (which would be to give an inline patch, unless other
command line options are in effect).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-17 15:43:09 -08:00
3b0ebb7a8d t0066: drop setup of "dir5"
The symlink setup in t0066 makes several directories with links, dir4
through dir6. But ever since dir5 was introduced in fa1da7d2ee
(dir-iterator: add flags parameter to dir_iterator_begin, 2019-07-10),
it has never actually been used. It was left over from an earlier
iteration of the patch which tried to handle recursive symlinks
specially, as seen in:

  https://lore.kernel.org/git/20190502144829.4394-7-matheus.bernardino@usp.br/

It's not hurting any of the existing tests to be there, but the extra
setup is confusing to anybody trying to read and understand the tests.
Let's drop the extra directory, and we'll rename "dir6" to "dir5" so
nobody wonders whether the gap in naming is important.

Helped-by: Matheus Tavares Bernardino <matheus.tavb@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-16 17:55:42 -08:00
29ae2c9e74 add basic http proxy tests
We do not test our http proxy functionality at all in the test suite, so
this is a pretty big blind spot. Let's at least add a basic check that
we can go through an authenticating proxy to perform a clone.

A few notes on the implementation:

  - I'm using a single apache instance to proxy to itself. This seems to
    work fine in practice, and we can check with a test that this rather
    unusual setup is doing what we expect.

  - I've put the proxy tests into their own script, and it's the only
    one which loads the apache proxy config. If any platform can't
    handle this (e.g., doesn't have the right modules), the start_httpd
    step should fail and gracefully skip the rest of the script (but all
    the other http tests in existing scripts will continue to run).

  - I used a separate passwd file to make sure we don't ever get
    confused between proxy and regular auth credentials. It's using the
    antiquated crypt() format. This is a terrible choice security-wise
    in the modern age, but it's what our existing passwd file uses, and
    should be portable. It would probably be reasonable to switch both
    of these to bcrypt, but we can do that in a separate patch.

  - On the client side, we test two situations with credentials: when
    they are present in the url, and when the username is present but we
    prompt for the password. I think we should be able to handle the
    case that _neither_ is present, but an HTTP 407 causes us to prompt
    for them. However, this doesn't seem to work. That's either a bug,
    or at the very least an opportunity for a feature, but I punted on
    it for now. The point of this patch is just getting basic coverage,
    and we can explore possible deficiencies later.

  - this doesn't work with LIB_HTTPD_SSL. This probably would be
    valuable to have, as https over an http proxy is totally different
    (it uses CONNECT to tunnel the session). But adding in
    mod_proxy_connect and some basic config didn't seem to work for me,
    so I punted for now. Much of the rest of the test suite does not
    currently work with LIB_HTTPD_SSL either, so we shouldn't be making
    anything much worse here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-16 16:24:23 -08:00
e00e56a7df dir-iterator: drop unused DIR_ITERATOR_FOLLOW_SYMLINKS
The `FOLLOW_SYMLINKS` flag was added to the dir-iterator API in
fa1da7d2ee (dir-iterator: add flags parameter to dir_iterator_begin,
2019-07-10) in order to follow symbolic links while traversing through a
directory.

`FOLLOW_SYMLINKS` gained its first caller in ff7ccc8c9a (clone: use
dir-iterator to avoid explicit dir traversal, 2019-07-10), but it was
subsequently removed in 6f054f9fb3 (builtin/clone.c: disallow `--local`
clones with symlinks, 2022-07-28).

Since then, we've held on to the code for `DIR_ITERATOR_FOLLOW_SYMLINKS`
in the name of making minimally invasive changes during a security
embargo.

In fact, we even changed the dir-iterator API in bffc762f87
(dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS,
2023-01-24) without having any non-test callers of that flag.

Now that we're past those security embargo(s), let's finalize our
cleanup of the `DIR_ITERATOR_FOLLOW_SYMLINKS` code and remove its
implementation since there are no remaining callers.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-16 16:21:56 -08:00
58eab6ff13 test-genzeros: avoid raw write(2)
This test helper feeds 256kB of data at once to a single invocation
of the write(2) system call, which may be too much for some
platforms.

Call our xwrite() wrapper that knows to honor MAX_IO_SIZE limit and
cope with short writes due to EINTR instead, and die a bit more
loudly by calling die_errno() when xwrite() indicates an error.

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-16 08:30:38 -08:00
9deef088ae rev-list: clarify git-log default date format
The documentation mistakenly said that the default format was
similar to RFC 2822 format and tried to specify it by enumerating
differences, which had two problems:

 * There are some more differences from the 2822 format that are not
   mentioned; worse yet

 * The default format is not modeled after RFC 2822 format at all.
   As can be seen in f80cd783 (date.c: add "show_date()" function.,
   2005-05-06), it is a derivative of ctime(3) format.

Stop saying that it is similar to RFC 2822, and rewrite the
description to explain the format without requiring the reader to
know any other format.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 17:34:46 -08:00
d9d677b2d8 The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 17:11:54 -08:00
59397e9b7e Merge branch 'cw/doc-pushurl-vs-url'
Doc update.

* cw/doc-pushurl-vs-url:
  Documentation: clarify multiple pushurls vs urls
2023-02-15 17:11:54 -08:00
eb11ec23ff Merge branch 'ab/config-h-remove-unused'
Code clean-up.

* ab/config-h-remove-unused:
  config.h: remove unused git_configset_add_parameters()
2023-02-15 17:11:54 -08:00
06bca9708a Merge branch 'ab/retire-scripted-add-p'
Finally retire the scripted "git add -p/-i" implementation and have
everybody use the one reimplemented in C.

* ab/retire-scripted-add-p:
  docs & comments: replace mentions of "git-add--interactive.perl"
  add API: remove run_add_interactive() wrapper function
  add: remove "add.interactive.useBuiltin" & Perl "git add--interactive"
2023-02-15 17:11:53 -08:00
c5f7b2a6fe Merge branch 'rs/size-t-fixes'
Type fixes.

* rs/size-t-fixes:
  pack-objects: use strcspn(3) in name_cmp_len()
  read-cache: use size_t for {base,df}_name_compare()
2023-02-15 17:11:53 -08:00
063ec7b3b8 Merge branch 'kf/t5000-modernise'
Test clean-up.

* kf/t5000-modernise:
  t5000: modernise archive and :(glob) test
2023-02-15 17:11:53 -08:00
aa1e73bdd8 Merge branch 'wl/new-command-doc'
Comment fix.

* wl/new-command-doc:
  new-command.txt: update reference to builtin docs
2023-02-15 17:11:53 -08:00
4a6e6b0d5b Merge branch 'ar/userdiff-java-update'
Userdiff regexp update for Java language.

* ar/userdiff-java-update:
  userdiff: support Java sealed classes
  userdiff: support Java record types
  userdiff: support Java type parameters
2023-02-15 17:11:52 -08:00
f7c208cdf5 Merge branch 'po/attributes-text'
In-tree .gitattributes update to match the way we recommend our
users to mark a file as text.

* po/attributes-text:
  .gitattributes: include `text` attribute for eol attributes
2023-02-15 17:11:52 -08:00
a232de58f2 Merge branch 'ab/sequencer-unleak'
Plug leaks in sequencer subsystem and its users.

* ab/sequencer-unleak:
  commit.c: free() revs.commit in get_fork_point()
  builtin/rebase.c: free() "options.strategy_opts"
  sequencer.c: always free() the "msgbuf" in do_pick_commit()
  builtin/rebase.c: fix "options.onto_name" leak
  builtin/revert.c: move free-ing of "revs" to replay_opts_release()
  sequencer API users: fix get_replay_opts() leaks
  sequencer.c: split up sequencer_remove_state()
  rebase: use "cleanup" pattern in do_interactive_rebase()
2023-02-15 17:11:52 -08:00
4f59836451 Merge branch 'ds/bundle-uri-5'
The bundle-URI subsystem adds support for creation-token heuristics
to help incremental fetches.

* ds/bundle-uri-5:
  bundle-uri: test missing bundles with heuristic
  bundle-uri: store fetch.bundleCreationToken
  fetch: fetch from an external bundle URI
  bundle-uri: drop bundle.flag from design doc
  clone: set fetch.bundleURI if appropriate
  bundle-uri: download in creationToken order
  bundle-uri: parse bundle.<id>.creationToken values
  bundle-uri: parse bundle.heuristic=creationToken
  t5558: add tests for creationToken heuristic
  bundle: verify using check_connected()
  bundle: test unbundling with incomplete history
2023-02-15 17:11:52 -08:00
214242a6ab Merge branch 'cb/grep-fallback-failing-jit'
In an environment where dynamically generated code is prohibited to
run (e.g. SELinux), failure to JIT pcre patterns is expected.  Fall
back to interpreted execution in such a case.

* cb/grep-fallback-failing-jit:
  grep: fall back to interpreter if JIT memory allocation fails
2023-02-15 17:11:51 -08:00
ad6b320756 gpg: do show gpg's error message upon failure
There are few things more frustrating when signing a commit fails than
reading a terse "error: gpg failed to sign the data" message followed by
the unsurprising "fatal: failed to write commit object" message.

In many cases where signing a commit or tag fails, `gpg` actually said
something helpful, on its stderr, and Git even consumed that, but then
keeps mum about it.

Teach Git to stop withholding that rather important information.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:55:24 -08:00
8300d15d5e t7510: add a test case that does not need gpg
This test case not only increases test coverage in setups without
working gpg, but also prepares for verifying that the error message of
`gpg.program` is shown upon failure.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:55:22 -08:00
613bef56b8 shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.

This has a few downsides:

  - sscanf("%s") reportedly misbehaves on macOS with some input and
    locale combinations, returning a partial or garbled string. See
    this thread:

      https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/

  - scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
    rule would never pull "origin" out of "refs/remotes/origin/HEAD".
    Instead it always produced "origin/HEAD", which is redundant with
    the "refs/remotes/%s" rule.

  - scanf in general is an error-prone interface. For example, scanning
    for "%s" will copy bytes into a destination string, which must have
    been correctly sized ahead of time to avoid a buffer overflow. In
    this case, the code is OK (the buffer is pessimistically sized to
    match the original string, which should give us a maximum). But in
    general, we do not want to encourage people to use scanf at all.

So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).

We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as can skip the
preprocessing step and its tricky memory management entirely.

The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.

There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):

  - the first covers the real-world case which misbehaved on macOS.
    Setting LC_ALL is required to trigger the problem there (since
    otherwise our tests use LC_ALL=C), and hopefully is at worst simply
    ignored on other systems (and doesn't cause libc to complain, etc,
    on systems without that locale).

  - the second covers the "origin/HEAD" case as discussed above, which
    is now fixed

  - the remainder are for "weird" cases that work both before and after
    this patch, but would be easy to get wrong with off-by-one problems
    in the parsing (and came out of discussions and earlier iterations
    of the patch that did get them wrong).

  - absent here are tests of boring, expected-to-work cases like
    "refs/heads/foo", etc. Those are covered all over the test suite
    both explicitly (for-each-ref's refname:short) and implicitly (in
    the output of git-status, etc).

Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:53:17 -08:00
8f416f65c9 shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
The ref_rev_parse_rules[] array is terminated with a NULL entry, and we
count it and store the result in the local nr_rules variable. But we
don't need to do so; since the array is a constant, we can compute its
size directly. The original code probably didn't do that because it was
written as part of for-each-ref, and saw the array only as a pointer. It
was migrated in 7c2b3029df (make get_short_ref a public function,
2009-04-07) and could have been updated then, but that subtlety was not
noticed.

We even have a constant that represents this value already, courtesy of
60650a48c0 (remote: make refspec follow the same disambiguation rule as
local refs, 2018-08-01), though again, nobody noticed at the time that
it could be used here, too.

The current count-up isn't a big deal, as we need to preprocess that
array anyway. But it will become more cumbersome as we refactor the
shortening code. So let's get rid of it and just use the constant
everywhere.

Note that there are two things here that aren't just simple text
replacements:

  1. We also use nr_rules to see if a previous call has initialized the
     static pre-processing variables. We can just use the scanf_fmts
     pointer to do the same thing, as it is non-NULL only after we've
     done that initialization.

  2. If nr_rules is zero after we've counted it up, we bail from the
     function. This code is unreachable, though, as the set of rules is
     hard-coded and non-empty. And that becomes even more apparent now
     that we are using the constant. So we can drop this conditional
     completely (and ironically, the code would have the same output if
     it _did_ trigger, as we'd simply skip the loop entirely and return
     the whole refname).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:53:17 -08:00
dd5e4d3976 shorten_unambiguous_ref(): avoid integer truncation
We parse the shortened name "foo" out of the full refname
"refs/heads/foo", and then assign the result of strlen(short_name) to an
int, which may truncate or wrap to negative.

In practice, this should never happen, as it requires a 2GB refname. And
even somebody trying to do something malicious should at worst end up
with a confused answer (we use the size only to feed back as a
placeholder length to strbuf_addf() to see if there are any collisions
in the lookup rules).

And it may even be impossible to trigger this, as we parse the string
with sscanf(), and stdio formatting functions are not known for handling
large strings well. I didn't test, but I wouldn't be surprised if
sscanf() on many platforms simply reports no match here.

But even if it is not a problem in practice so far, it is worth fixing
for two reasons:

  1. We'll shortly be replacing the sscanf() call with a real parser
     which will handle arbitrary-sized strings.

  2. Assigning strlen() to an int is an anti-pattern that requires
     people to look twice when auditing for real overflow problems.

So we'll make this a size_t. Unfortunately we still have to cast to int
eventually for the strbuf_addf() call, but at least we can localize the
cast there, and check that it will be valid. I used our new cast helper
here, which will just bail completely. That should be OK, as anybody
with a 2GB refname is up to no good, but if we really wanted to, we
could detect it manually and just refuse to shorten the refname.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:53:17 -08:00
b1485644f9 Sync with 'maint' 2023-02-14 14:17:35 -08:00
768bb238c4 Prepare for 2.39.3 just in case
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-14 14:15:57 -08:00
037db6d563 Merge branch 'sk/remove-duplicate-includes' into maint-2.39
Code clean-up.

* sk/remove-duplicate-includes:
  git: remove duplicate includes
2023-02-14 14:15:57 -08:00
ff6c740339 Merge branch 'rs/clarify-error-in-write-loose-object' into maint-2.39
Code clean-up.

* rs/clarify-error-in-write-loose-object:
  object-file: inline write_buffer()
2023-02-14 14:15:57 -08:00
651b4430d1 Merge branch 'rs/reflog-expiry-cleanup' into maint-2.39
Code clean-up.

* rs/reflog-expiry-cleanup:
  reflog: clear leftovers in reflog_expiry_cleanup()
2023-02-14 14:15:56 -08:00
dfd37b70b1 Merge branch 'rs/clear-commit-marks-cleanup' into maint-2.39
Code clean-up.

* rs/clear-commit-marks-cleanup:
  commit: skip already cleared parents in clear_commit_marks_1()
2023-02-14 14:15:56 -08:00
7ac5eca21c Merge branch 'rs/am-parse-options-cleanup' into maint-2.39
Code clean-up.

* rs/am-parse-options-cleanup:
  am: don't pass strvec to apply_parse_options()
2023-02-14 14:15:56 -08:00
b7a7af266b Merge branch 'jk/server-supports-v2-cleanup' into maint-2.39
Code clean-up.

* jk/server-supports-v2-cleanup:
  server_supports_v2(): use a separate function for die_on_error
2023-02-14 14:15:55 -08:00
8d404d0d95 Merge branch 'jk/unused-post-2.39' into maint-2.39
Code clean-up around unused function parameters.

* jk/unused-post-2.39:
  userdiff: mark unused parameter in internal callback
  list-objects-filter: mark unused parameters in virtual functions
  diff: mark unused parameters in callbacks
  xdiff: mark unused parameter in xdl_call_hunk_func()
  xdiff: drop unused parameter in def_ff()
  ws: drop unused parameter from ws_blank_line()
  list-objects: drop process_gitlink() function
  blob: drop unused parts of parse_blob_buffer()
  ls-refs: use repository parameter to iterate refs
2023-02-14 14:15:55 -08:00
2f80d1b42e Merge branch 'rj/branch-copy-and-rename' into maint-2.39
Fix a pair of bugs in 'git branch'.

* rj/branch-copy-and-rename:
  branch: force-copy a branch to itself via @{-1} is a no-op
2023-02-14 14:15:55 -08:00
8ca2b1f248 Merge branch 'rs/t3920-crlf-eating-grep-fix' into maint-2.39
Test fix.

* rs/t3920-crlf-eating-grep-fix:
  t3920: support CR-eating grep
2023-02-14 14:15:54 -08:00
763ae829a3 Merge branch 'js/t3920-shell-and-or-fix' into maint-2.39
Test fix.

* js/t3920-shell-and-or-fix:
  t3920: don't ignore errors of more than one command with `|| true`
2023-02-14 14:15:54 -08:00
81b216e4f7 Merge branch 'ab/t4023-avoid-losing-exit-status-of-diff' into maint-2.39
Test fix.

* ab/t4023-avoid-losing-exit-status-of-diff:
  t4023: fix ignored exit codes of git
2023-02-14 14:15:54 -08:00
54941a5316 Merge branch 'ab/t7600-avoid-losing-exit-status-of-git' into maint-2.39
Test fix.

* ab/t7600-avoid-losing-exit-status-of-git:
  t7600: don't ignore "rev-parse" exit code in helper
2023-02-14 14:15:54 -08:00
2509d0198c Merge branch 'ab/t5314-avoid-losing-exit-status' into maint-2.39
Test fix.

* ab/t5314-avoid-losing-exit-status:
  t5314: check exit code of "git"
2023-02-14 14:15:53 -08:00
5a8f4c8adc Merge branch 'rs/plug-pattern-list-leak-in-lof' into maint-2.39
Leak fix.

* rs/plug-pattern-list-leak-in-lof:
  list-objects-filter: plug pattern_list leak
2023-02-14 14:15:53 -08:00
db2a91ba36 Merge branch 'rs/t4205-do-not-exit-in-test-script' into maint-2.39
Test fix.

* rs/t4205-do-not-exit-in-test-script:
  t4205: don't exit test script on failure
2023-02-14 14:15:53 -08:00
e34fd1334c Merge branch 'jc/doc-checkout-b' into maint-2.39
Clarify how "checkout -b/-B" and "git branch [-f]" are similar but
different in the documentation.

* jc/doc-checkout-b:
  checkout: document -b/-B to highlight the differences from "git branch"
2023-02-14 14:15:52 -08:00
26fc326044 Merge branch 'jc/doc-branch-update-checked-out-branch' into maint-2.39
Document that "branch -f <branch>" disables only the safety to
avoid recreating an existing branch.

* jc/doc-branch-update-checked-out-branch:
  branch: document `-f` and linked worktree behaviour
2023-02-14 14:15:52 -08:00
1f071460d3 Merge branch 'rs/ls-tree-path-expansion-fix' into maint-2.39
"git ls-tree --format='%(path) %(path)' $tree $path" showed the
path three times, which has been corrected.

* rs/ls-tree-path-expansion-fix:
  ls-tree: remove dead store and strbuf for quote_c_style()
  ls-tree: fix expansion of repeated %(path)
2023-02-14 14:15:52 -08:00
fa5958f4d6 Merge branch 'pb/doc-orig-head' into maint-2.39
Document ORIG_HEAD a bit more.

* pb/doc-orig-head:
  git-rebase.txt: add a note about 'ORIG_HEAD' being overwritten
  revisions.txt: be explicit about commands writing 'ORIG_HEAD'
  git-merge.txt: mention 'ORIG_HEAD' in the Description
  git-reset.txt: mention 'ORIG_HEAD' in the Description
  git-cherry-pick.txt: do not use 'ORIG_HEAD' in example
2023-02-14 14:15:51 -08:00
4f8ab59838 Merge branch 'es/hooks-and-local-env' into maint-2.39
Doc update for environment variables set when hooks are invoked.

* es/hooks-and-local-env:
  githooks: discuss Git operations in foreign repositories
2023-02-14 14:15:51 -08:00
4950677b48 Merge branch 'ws/single-file-cone' into maint-2.39
The logic to see if we are using the "cone" mode by checking the
sparsity patterns has been tightened to avoid mistaking a pattern
that names a single file as specifying a cone.

* ws/single-file-cone:
  dir: check for single file cone patterns
2023-02-14 14:15:51 -08:00
f8382a6396 Merge branch 'jk/ext-diff-with-relative' into maint-2.39
"git diff --relative" did not mix well with "git diff --ext-diff",
which has been corrected.

* jk/ext-diff-with-relative:
  diff: drop "name" parameter from prepare_temp_file()
  diff: clean up external-diff argv setup
  diff: use filespec path to set up tempfiles for ext-diff
2023-02-14 14:15:51 -08:00
7cbfd0e572 Merge branch 'ab/bundle-wo-args' into maint-2.39
Fix to a small regression in 2.38 days.

* ab/bundle-wo-args:
  bundle <cmd>: have usage_msg_opt() note the missing "<file>"
  builtin/bundle.c: remove superfluous "newargc" variable
  bundle: don't segfault on "git bundle <subcmd>"
2023-02-14 14:15:50 -08:00
259988af42 Merge branch 'ps/fsync-refs-fix' into maint-2.39
Fix the sequence to fsync $GIT_DIR/packed-refs file that forgot to
flush its output to the disk..

* ps/fsync-refs-fix:
  refs: fix corruption by not correctly syncing packed-refs to disk
2023-02-14 14:15:50 -08:00
725f293355 Merge branch 'lk/line-range-parsing-fix' into maint-2.39
When given a pattern that matches an empty string at the end of a
line, the code to parse the "git diff" line-ranges fell into an
infinite loop, which has been corrected.

* lk/line-range-parsing-fix:
  line-range: fix infinite loop bug with '$' regex
2023-02-14 14:15:49 -08:00
a67610f4ab Merge branch 'rs/use-enhanced-bre-on-macos' into maint-2.39
Newer regex library macOS stopped enabling GNU-like enhanced BRE,
where '\(A\|B\)' works as alternation, unless explicitly asked with
the REG_ENHANCED flag.  "git grep" now can be compiled to do so, to
retain the old behaviour.

* rs/use-enhanced-bre-on-macos:
  use enhanced basic regular expressions on macOS
2023-02-14 14:15:49 -08:00
11b53f8e52 Merge branch 'jk/curl-avoid-deprecated-api' into maint-2.39
Deal with a few deprecation warning from cURL library.

* jk/curl-avoid-deprecated-api:
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
2023-02-14 14:15:49 -08:00
6cdb8cd693 Merge branch 'jk/avoid-redef-system-functions' into maint-2.39
The jk/avoid-redef-system-functions-2.30 topic pre-merged for more
recent codebase.

* jk/avoid-redef-system-functions:
2023-02-14 14:15:49 -08:00
f3a28c2e09 Merge branch 'jk/avoid-redef-system-functions-2.30' into maint-2.39
Redefining system functions for a few functions did not follow our
usual "implement git_foo() and #define foo(args) git_foo(args)"
pattern, which has broken build for some folks.

* jk/avoid-redef-system-functions-2.30:
  git-compat-util: undefine system names before redeclaring them
  git-compat-util: avoid redefining system function names
2023-02-14 14:15:47 -08:00
83d585a5b9 Merge branch 'tb/ci-concurrency' into maint-2.39
Avoid unnecessary builds in CI, with settings configured in
ci-config.

* tb/ci-concurrency:
  ci: avoid unnecessary builds
2023-02-14 14:15:46 -08:00
f66b749c66 Merge branch 'cw/ci-whitespace' into maint-2.39
CI updates.  We probably want a clean-up to move the long shell
script embedded in yaml file into a separate file, but that can
come later.

* cw/ci-whitespace:
  ci (check-whitespace): move to actions/checkout@v3
  ci (check-whitespace): add links to job output
  ci (check-whitespace): suggest fixes for errors
2023-02-14 14:15:45 -08:00
a9405a8d7d Merge branch 'js/ci-disable-cmake-by-default' into maint-2.39
Stop running win+VS build by default.

* js/ci-disable-cmake-by-default:
  ci: only run win+VS build & tests in Git for Windows' fork
2023-02-14 14:15:45 -08:00
c867e4fa18 Sync with Git 2.39.2 2023-02-13 17:03:55 -08:00
567342fc77 test-ctype: test iscntrl, ispunct, isxdigit and isprint
Test the character classifiers added by 1c149ab2dd (ctype: support
iscntrl, ispunct, isxdigit and isprint, 2012-10-15) and 0fcec2ce54
(format-patch: make rfc2047 encoding more strict, 2012-10-18).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-13 13:36:05 -08:00
2c17de8b37 test-ctype: test islower and isupper
Test the character classifiers added by 43ccdf56ec (ctype: implement
islower/isupper macro, 2012-02-10).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-13 13:36:05 -08:00
d5071be5ed test-ctype: test isascii
Test the character classifier added by c2e9364a06 (cleanup: add
isascii(), 2009-03-07).  It returns 1 for NUL as well, which requires
special treatment, as our string-based tester can't find it with
strcmp(3).  Allow NUL to be given as the first character in a class
specification string.  This has the downside of no longer supporting
the empty string, but that's OK since we are not interested in testing
character classes with no members.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-13 13:36:05 -08:00
c5773dc078 commit-reach: avoid NULL dereference
The loop at the top of can_all_from_reach_with_flag() already
accounts for `from->objects[i].item' being NULL, so it follows
the cleanup loop should also account for a NULL `from_one'.

I managed to segfault here on one of my giant, many-remote repos
using `git fetch --negotiation-tip=...  --negotiation-only'
where the --negotiation-tip= argument was a glob which (inadvertently)
captured more refs than I wanted.  I have not reproduced this
in a standalone test case.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-11 11:36:24 -08:00
d9ec3b0dc0 doc/ls-remote: clarify pattern format
We document that you can specify "refs" to ls-remote, but we don't
explain any further than that they are "matched" as patterns. Since this
can be interpreted in a lot of ways, let's clarify that they are
tail-matched globs.

Likewise, let's use the word "patterns" to refer to them consistently,
rather than "refs" (both here and in the quick "-h" help), and mention
more explicitly that only one pattern needs to be matched (though there
is also an example already that shows this in action).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 21:57:51 -08:00
baebde7d19 doc/ls-remote: cosmetic cleanups for examples
There are effectively three example commands and their output, but
they're smushed together with no extra whitespace. Let's add some blank
lines to make them more readable.

Likewise, the first example uses "./." to refer to the path of the
current repository, which is somewhat distracting. That may have been
necessary back in 2005 when it was added, but we can just say "." these
days.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 18:54:58 -08:00
93ea118bed cache-tree: fix strbuf growth in prime_cache_tree_rec()
Use size_t to store the original length of the strbuf tree_len, as
that's the correct type.

Don't double the allocated size of the strbuf when adding a subdirectory
name.  And the chance of the trailing slash fitting in the slack left by
strbuf_add() is very high, so stop pre-growing the strbuf at all.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 12:24:12 -08:00
dfd0a89374 cocci & cache.h: remove "USE_THE_INDEX_COMPATIBILITY_MACROS"
Have the last users of "USE_THE_INDEX_COMPATIBILITY_MACROS" use the
underlying *_index() variants instead. Now all previous users of
"USE_THE_INDEX_COMPATIBILITY_MACROS" have been migrated away from the
wrapper macros, and if applicable to use the "USE_THE_INDEX_VARIABLE"
added in [1].

Let's leave the "index-compatibility.cocci" in place, even though it
won't be doing anything on "master". It will benefit any out-of-tree
code that need to use these compatibility macros. We can eventually
remove it.

1. bdafeae0b9 (cache.h & test-tool.h: add & use
   "USE_THE_INDEX_VARIABLE", 2022-11-19)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 11:38:40 -08:00
fcb864bce7 cache-tree API: remove redundant update_main_cache_tree()
Remove the redundant update_main_cache_tree() function, and make its
users use cache_tree_update() instead.

The behavior of populating the "the_index.cache_tree" if it wasn't
present already was needed when this function was introduced in [1],
but it hasn't been needed since [2]; The "cache_tree_update()" will
now lazy-allocate, so there's no need for the wrapper.

1. 996277c520 (Refactor cache_tree_update idiom from commit,
   2011-12-06)
2. fb0882648e (cache-tree: clean up cache_tree_update(), 2021-01-23)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 11:38:14 -08:00
99370863e2 cocci & cache-tree.h: migrate "write_cache_as_tree" to "*_index_*"
Add a trivial rule for "write_cache_as_tree" to
"index-compatibility.cocci", and apply it. This was left out of the
rules added in 0e6550a2c6 (cocci: add a
index-compatibility.pending.cocci, 2022-11-19) because this
compatibility wrapper lived in "cache-tree.h", not "cache.h"

But it's like the other "USE_THE_INDEX_COMPATIBILITY_MACROS", so let's
migrate it too.

The replacement of "USE_THE_INDEX_COMPATIBILITY_MACROS" here with
"USE_THE_INDEX_VARIABLE" is a manual change on top, now that these
files only use "&the_index", and don't need any compatibility
macros (or functions).

The wrapping of some argument lists is likewise manual, as coccinelle
would otherwise give us overly long argument lists.

The reason for putting the "O" in the cocci rule on the "-" and "+"
lines is because I couldn't get correct whitespacing otherwise,
i.e. I'd end up with "oid,&the_index", not "oid, &the_index".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 11:37:49 -08:00
babed893f5 cocci & cache.h: apply pending "index_cache_pos" rule
Apply the rule added in [1] to change "cache_name_pos" to
"index_name_pos", which allows us to get rid of another
"USE_THE_INDEX_COMPATIBILITY_MACROS" macro.

The replacement of "USE_THE_INDEX_COMPATIBILITY_MACROS" here with
"USE_THE_INDEX_VARIABLE" is a manual change on top, now that these
files only use "&the_index", and don't need any compatibility
macros (or functions).

1. 0e6550a2c6 (cocci: add a index-compatibility.pending.cocci,
   2022-11-19)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 11:37:27 -08:00
cec13b9514 cocci & cache.h: fully apply "active_nr" part of index-compatibility
Apply the "active_nr" part of "index-compatibility.pending.cocci",
which was left out in [1] due to an in-flight conflict. As of [2] the
topic we conflicted with has been merged to "master", so we can fully
apply this rule.

1. dc594180d9 (cocci & cache.h: apply variable section of "pending"
   index-compatibility, 2022-11-19)
2. 9ea1378d04 (Merge branch 'ab/various-leak-fixes', 2022-12-14)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 11:31:18 -08:00
6193aaa9f9 builtin/rm.c: use narrower "USE_THE_INDEX_VARIABLE"
Replace the "USE_THE_INDEX_COMPATIBILITY_MACROS" define with the
narrower "USE_THE_INDEX_VARIABLE". This could have been done in
07047d6829 (cocci: apply "pending" index-compatibility to some
"builtin/*.c", 2022-11-19), but I missed it at the time.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-10 11:31:16 -08:00
fd2d4c135e gpg-interface: lazily initialize and read the configuration
Instead of forcing the porcelain commands to always read the
configuration variables related to the signing and verifying
signatures, lazily initialize the necessary subsystem on demand upon
the first use.

This hopefully would make it more future-proof as we do not have to
think and decide whether we should call git_gpg_config() in the
git_config() callback for each command.

A few git_config() callback functions that used to be custom
callbacks are now just a thin wrapper around git_default_config().
We could further remove, git_FOO_config and replace calls to
git_config(git_FOO_config) with git_config(git_default_config), but
to make it clear which ones are affected and the effect is only the
removal of git_gpg_config(), it is vastly preferred not to do such a
change in this step (they can be done on top once the dust settled).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-09 17:01:27 -08:00
23c56f7bd5 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-09 14:40:47 -08:00
6d1b2e48fe Merge branch 'ew/free-island-marks'
"git pack-objects" learned to release delta-island bitmap data when
it is done using it, saving peak heap memory usage.

* ew/free-island-marks:
  delta-islands: free island_marks and bitmaps
2023-02-09 14:40:47 -08:00
8a1d607877 Merge branch 'sk/winansi-createthread-fix'
Fix use of CreateThread() API call made early in the windows
start-up code.

* sk/winansi-createthread-fix:
  compat/winansi: check for errors of CreateThread() correctly
2023-02-09 14:40:47 -08:00
4158b92f16 Merge branch 'hj/remove-msys-support'
Remove support for MSys, which now lags way behind MSys2.

* hj/remove-msys-support:
  mingw: remove msysGit/MSYS1 support
  mingw: remove duplicate `USE_NED_ALLOCATOR` directive
2023-02-09 14:40:47 -08:00
a674c7edcf Merge branch 'jk/httpd-test-updates'
Test update.

* jk/httpd-test-updates:
  t/lib-httpd: increase ssl key size to 2048 bits
  t/lib-httpd: drop SSLMutex config
  t/lib-httpd: bump required apache version to 2.4
  t/lib-httpd: bump required apache version to 2.2
2023-02-09 14:40:47 -08:00
2c91b13751 Merge branch 'gc/index-format-doc'
Doc update.

* gc/index-format-doc:
  docs: document zero bits in index "mode"
2023-02-09 14:40:46 -08:00
b2182a8730 name-rev: fix names by dropping taggerdate workaround
Commit 7550424804 ("name-rev: include taggerdate in considering the best
name", 2016-04-22) introduced the idea of using taggerdate in the
criteria for selecting the best name.  At the time, a certain commit in
linux.git -- namely, aed06b9cfcab -- was being named by name-rev as
    v4.6-rc1~9^2~792
which, while correct, was very suboptimal.  Some investigation found
that tweaking the MERGE_TRAVERSAL_WEIGHT to lower it could give
alternate answers such as
    v3.13-rc7~9^2~14^2~42
or
    v3.13~5^2~4^2~2^2~1^2~42
A manual solution involving looking at tagger dates came up with
    v3.13-rc1~65^2^2~42
which is much nicer.  That workaround was then implemented in name-rev.

Unfortunately, the taggerdate heuristic is causing bugs.  I was pointed
to a case in a private repository where name-rev reports a name of the
form
    v2022.10.02~86
when users expected to see one of the form
    v2022.10.01~2
(I've modified the names and numbers a bit from the real testcase.)  As
you can probably guess, v2022.10.01 was created after v2022.10.02 (by a
few hours), even though it pointed to an older commit.  While the
condition is unusual even in the repository in question, it is not the
only problematic set of tags in that repository.  The taggerdate logic
is causing problems.

Further, it turns out that this taggerdate heuristic isn't even helping
anymore.  Due to the fix to naming logic in 3656f84278 ("name-rev:
prefer shorter names over following merges", 2021-12-04), we get
improved names without the taggerdate heuristic.  For the original
commit of interest in linux.git, a modern git without the taggerdate
heuristic still provides the same optimal answer of interest, namely:
    v3.13-rc1~65^2^2~42

So, the taggerdate is no longer providing benefit, and it is causing
problems.  Simply get rid of it.

However, note that "taggerdate" as a variable is used to store things
besides a taggerdate these days.  Ever since commit ef1e74065c
("name-rev: favor describing with tags and use committer date to
tiebreak", 2017-03-29), this has been used to store committer dates and
there it is used as a fallback tiebreaker (as opposed to a primary
criteria overriding effective distance calculations).  We do not want to
remove that fallback tiebreaker, so not all instances of "taggerdate"
are removed in this change.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-09 09:01:36 -08:00
93d52ed050 userdiff: support Java sealed classes
A new kind of class was added in Java 17 -- sealed classes.[1]  This
feature includes several new keywords that may appear in a declaration
of a class.  New modifiers before name of the class: "sealed" and
"non-sealed", and a clause after name of the class marked by keyword
"permits".

The current set of regular expressions in userdiff.c already allows the
modifier "sealed" and the "permits" clause, but not the modifier
"non-sealed", which is the first hyphenated keyword in Java.[2]  Allow
hyphen in the words that precede the name of type to match the
"non-sealed" modifier.

In new input file "java-sealed" for the test t4018-diff-funcname.sh, use
a Java code comment for the marker "RIGHT".  This workaround is needed,
because the name of the sealed class appears on the line of code that
has the "ChangeMe" marker.

[1] Detailed description in "JEP 409: Sealed Classes"
    https://openjdk.org/jeps/409
[2] "JEP draft: Keyword Management for the Java Language"
    https://openjdk.org/jeps/8223002

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:57:13 -08:00
575e6fcfcc userdiff: support Java record types
A new kind of class was added in Java 16 -- records.[1]  The syntax of
records is similar to regular classes with one important distinction:
the name of the record class is followed by a mandatory list of
components.  The list is enclosed in parentheses, it may be empty, and
it may immediately follow the name of the class or type parameters, if
any, with or without separating whitespace.  For example:

    public record Example(int i, String s) {
    }

    public record WithTypeParameters<A, B>(A a, B b, String s) {
    }

    record SpaceBeforeComponents (String comp1, int comp2) {
    }

Support records in the builtin userdiff pattern for Java.  Add "record"
to the alternatives of keywords for kinds of class.

Allowing matching various possibilities for the type parameters and/or
list of the components of a record has already been covered by the
preceding patch.

[1] detailed description is available in "JEP 395: Records"
    https://openjdk.org/jeps/395

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:57:11 -08:00
39226a8dac userdiff: support Java type parameters
A class or interface in Java can have type parameters following the name
in the declared type, surrounded by angle brackets (paired less than and
greater than signs).[2]   The type parameters -- `A` and `B` in the
examples -- may follow the class name immediately:

    public class ParameterizedClass<A, B> {
    }

or may be separated by whitespace:

    public class SpaceBeforeTypeParameters <A, B> {
    }

A part of the builtin userdiff pattern for Java matches declarations of
classes, enums, and interfaces.  The regular expression requires at
least one whitespace character after the name of the declared type.
This disallows matching for opening angle bracket of type parameters
immediately after the name of the type.  Mandatory whitespace after the
name of the type also disallows using the pattern in repositories with a
fairly common code style that puts braces for the body of a class on
separate lines:

    class WithLineBreakBeforeOpeningBrace
    {
    }

Support matching Java code in more diverse code styles and declarations
of classes and interfaces with type parameters immediately following the
name of the type in the builtin userdiff pattern for Java.  Do so by
just matching anything until the end of the line after the keywords for
the kind of type being declared.

[1] Since Java 5 released in 2004.
[2] Detailed description is available in the Java Language
    Specification, sections "Type Variables" and "Parameterized Types":
    https://docs.oracle.com/javase/specs/jls/se17/html/jls-4.html#jls-4.4

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:56:57 -08:00
0414b3891c hook: support a --to-stdin=<path> option
Expose the "path_to_stdin" API added in the preceding commit in the
"git hook run" command.

For now we won't be using this command interface outside of the tests,
but exposing this functionality makes it easier to test the hook
API. The plan is to use this to extend the "sendemail-validate"
hook[1][2].

1. https://lore.kernel.org/git/ad152e25-4061-9955-d3e6-a2c8b1bd24e7@amd.com
2. https://lore.kernel.org/git/20230120012459.920932-1-michael.strawbridge@amd.com

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:50:03 -08:00
96af564d27 sequencer: use the new hook API for the simpler "post-rewrite" call
Change the invocation of the "post-rewrite" hook added in
795160457d (sequencer (rebase -i): run the post-rewrite hook, if
needed, 2017-01-02) to use the new hook API.

This leaves the more complex "post-rewrite" invocation added in
a87a6f3c98 (commit: move post-rewrite code to libgit, 2017-11-17)
here in sequencer.c unconverted.

Here we can pass in a file's via the "in" file descriptor, in that
case we don't have a file, but will need to write_in_full() to an "in"
provide by the API. Support for that will be added to the hook API in
the future, but we're not there yet.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:50:03 -08:00
917e080249 hook API: support passing stdin to hooks, convert am's 'post-rewrite'
Convert the invocation of the 'post-rewrite' hook run by 'git am' to
use the hook.h library. To do this we need to add a "path_to_stdin"
member to "struct run_hooks_opt".

In our API this is supported by asking for a file path, rather
than by reading stdin. Reading directly from stdin would involve caching
the entire stdin (to memory or to disk) once the hook API is made to
support "jobs" larger than 1, along with support for executing N hooks
at a time (i.e. the upcoming config-based hooks).

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:50:03 -08:00
540267304d run-command: allow stdin for run_processes_parallel
While it makes sense not to inherit stdin from the parent process to
avoid deadlocking, it's not necessary to completely ban stdin to
children. An informed user should be able to configure stdin safely. By
setting `some_child.process.no_stdin=1` before calling `get_next_task()`
we provide a reasonable default behavior but enable users to set up
stdin streaming for themselves during the callback.

`some_child.process.stdout_to_stderr`, however, remains unmodifiable by
`get_next_task()` - the rest of the run_processes_parallel() API depends
on child output in stderr.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:50:03 -08:00
5123e6e7bd run-command.c: remove dead assignment in while-loop
Remove code that's been unused since it was added in
c553c72eed (run-command: add an asynchronous parallel child
processor, 2015-12-15).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 12:50:03 -08:00
7876265d61 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-08 09:14:51 -08:00
3fe6612d4c Merge branch 'ds/scalar-ignore-cron-error'
Allow "scalar" to warn but continue when its periodic maintenance
feature cannot be enabled.

* ds/scalar-ignore-cron-error:
  scalar: only warn when background maintenance fails
  t921*: test scalar behavior starting maintenance
  t: allow 'scalar' in test_must_fail
2023-02-08 09:14:42 -08:00
c6dea59323 Merge branch 'mh/doc-credential-cache-only-in-core'
Documentation clarification.

* mh/doc-credential-cache-only-in-core:
  Documentation: clarify that cache forgets credentials if the system restarts
2023-02-08 09:14:42 -08:00
ad7fd3cc03 Merge branch 'gm/request-pull-with-non-pgp-signed-tags'
Adjust "git request-pull" to strip embedded signature from signed
tags to notice non-PGP signatures.

* gm/request-pull-with-non-pgp-signed-tags:
  request-pull: filter out SSH/X.509 tag signatures
2023-02-08 09:14:42 -08:00
d390e08076 Documentation: clarify multiple pushurls vs urls
In a remote with multiple configured URLs, `git remote -v` shows the
correct url that fetch uses. However, `git config remote.<remote>.url`
returns the last defined url instead. This discrepancy can cause
confusion for users with a remote defined as such, since any url
defined after the first essentially acts as a pushurl.

Add documentation to clarify how fetch interacts with multiple urls
and how push interacts with multiple pushurls and urls.

Add test affirming interaction between fetch and multiple urls.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-07 11:02:27 -08:00
3eb1e1ca9a config.h: remove unused git_configset_add_parameters()
This function was removed in ecec57b3c9 (config: respect includes in
protected config, 2022-10-13), but its prototype was left here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-07 10:50:27 -08:00
0c10ed19c4 commit.c: free() revs.commit in get_fork_point()
Fix a memory leak that's been with us since d96855ff51 (merge-base:
teach "--fork-point" mode, 2013-10-23).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:53 -08:00
a535040887 builtin/rebase.c: free() "options.strategy_opts"
When the "strategy_opts" member was added in ba1905a5fe (builtin
rebase: add support for custom merge strategies, 2018-09-04) the
corresponding free() for it at the end of cmd_rebase() wasn't added,
let's do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:53 -08:00
a5792e9d09 sequencer.c: always free() the "msgbuf" in do_pick_commit()
In [1] the strbuf_release(&msgbuf) was moved into this
do_pick_commit(), but didn't take into account the case of [2], where
we'd return before the strbuf_release(&msgbuf).

Then when the "fixup" support was added in [3] this leak got worse, as
in this error case we added another place where we'd "return" before
reaching the strbuf_release().

This changes the behavior so that we'll call
update_abort_safety_file() in these cases where we'd previously
"return", but as noted in [4] "update_abort_safety_file() is a no-op
when rebasing and you're changing code that is only run when
rebasing.". Here "no-op" refers to the early return in
update_abort_safety_file() if git_path_seq_dir() doesn't exist.

1. 452202c74b (sequencer: stop releasing the strbuf in
   write_message(), 2016-10-21)
2. f241ff0d0a (prepare the builtins for a libified merge_recursive(),
   2016-07-26)
3. 6e98de72c0 (sequencer (rebase -i): add support for the 'fixup' and
   'squash' commands, 2017-01-02)
4. https://lore.kernel.org/git/bcace50b-a4c3-c468-94a3-4fe0c62b3671@dunelm.org.uk/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:52 -08:00
94ad545d47 builtin/rebase.c: fix "options.onto_name" leak
Similar to the existing "squash_onto_name" added in [1] we need to
free() the xstrdup()'d "options.onto.name" added for "--keep-base" in
[2]..

1. 9dba809a69 (builtin rebase: support --root, 2018-09-04)
2. 414d924beb (rebase: teach rebase --keep-base, 2019-08-27)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:52 -08:00
a6a700a43c builtin/revert.c: move free-ing of "revs" to replay_opts_release()
In [1] and [2] I added the code being moved here to cmd_revert() and
cmd_cherry_pick(), now that we've got a "replay_opts_release()" for
the "struct replay_opts" it should know how to free these "revs",
rather than having these users reach into the struct to free its
individual members.

1. d1ec656d68 (cherry-pick: free "struct replay_opts" members,
   2022-11-08)
2. fd74ac95ac (revert: free "struct replay_opts" members, 2022-07-01)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:52 -08:00
9ff2f06069 sequencer API users: fix get_replay_opts() leaks
Make the replay_opts_release() function added in the preceding commit
non-static, and use it for freeing the "struct replay_opts"
constructed for "rebase" and "revert".

To safely call our new replay_opts_release() we'll need to stop
calling it in sequencer_remove_state(), and instead call it where we
allocate the "struct replay_opts" itself.

This is because in e.g. do_interactive_rebase() we construct a "struct
replay_opts" with "get_replay_opts()", and then call
"complete_action()". If we get far enough in that function without
encountering errors we'll call "pick_commits()" which (indirectly)
calls sequencer_remove_state() at the end.

But if we encounter errors anywhere along the way we'd punt out early,
and not free() the memory we allocated. Remembering whether we
previously called sequencer_remove_state() would be a hassle.

Using a FREE_AND_NULL() pattern would also work, as it would be safe
to call replay_opts_release() repeatedly. But let's fix this properly
instead, by having the owner of the data free() it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:52 -08:00
6a09c3a9a6 sequencer.c: split up sequencer_remove_state()
Split off the free()-ing in sequencer_remove_state() into a utility
function, which will be adjusted and called independent of the other
code in sequencer_remove_state() in a subsequent commit.

The only functional change here is changing the "int" to a "size_t",
which is the correct type, as "xopts_nr" is a "size_t".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:52 -08:00
01fd5fb14b rebase: use "cleanup" pattern in do_interactive_rebase()
Use a "goto cleanup" pattern in do_interactive_rebase(). This
eliminates some duplicated free() code added in 53bbcfbde7 (rebase
-i: implement the main part of interactive rebase as a builtin,
2018-09-27), and sets us up for a subsequent commit which'll make
further use of the "cleanup" label.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 16:03:52 -08:00
c65d18cb52 push: free_refs() the "local_refs" in set_refspecs()
Fix a memory leak that's been with us since this code was added in
ca02465b41 (push: use remote.$name.push as a refmap, 2013-12-03).

The "remote = remote_get(...)" added in the same commit would seem to
leak based only on the context here, but that function is a wrapper
for sticking the remotes we fetch into "the_repository->remote_state".

See fd3cb0501e (remote: move static variables into per-repository
struct, 2021-11-17) for the addition of code in repository.c that
free's the "remote" allocated here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:40 -08:00
aa561208d9 push: refactor refspec_append_mapped() for subsequent leak-fix
The set_refspecs() caller of refspec_append_mapped() (added in [1])
left open the question[2] of whether the "remote" we lazily fetch
might be NULL in the "[...]uniquely name our ref?" case, as
remote_get() can return NULL.

If we got past the "[...]uniquely name our ref?" case we'd have
already segfaulted if we tried to dereference it as
"remote->push.nr". In these cases the config mechanism & previous
remote validation will have bailed out earlier.

Let's refactor this code to clarify that, we'll now BUG() out if we
can't get a "remote", and will no longer retrieve it for these common
cases where we don't need it.

1. ca02465b41 (push: use remote.$name.push as a refmap, 2013-12-03)
2. https://lore.kernel.org/git/c0c07b89-7eaf-21cd-748e-e14ea57f09fd@web.de/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:40 -08:00
1fdd31cf52 receive-pack: release the linked "struct command *" list
Fix a memory leak that's been with us since this code was introduced
in [1]. Later in [2] we started using FLEX_ALLOC_MEM() to allocate the
"struct command *".

1. 575f497456 (Add first cut at "git-receive-pack", 2005-06-29)
2. eb1af2df0b (git-receive-pack: start parsing ref update commands,
   2005-06-29)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:40 -08:00
fb2ebe72a3 grep API: plug memory leaks by freeing "header_list"
When the "header_list" struct member was added in [1], freeing this
field was neglected. Fix that now, so that commands like

	./git -P log -1 --color=always --author=A origin/master

will run leak-free.

1. 80235ba79e ("log --author=me --grep=it" should find intersection,
   not union, 2010-01-17)

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:39 -08:00
891c9965fb grep.c: refactor free_grep_patterns()
Refactor the free_grep_patterns() function to split out the freeing of
the "struct grep_pat" it contains. Right now we're only freeing the
"pattern_list", but we should be freeing another member of the same
type, which we'll do in the subsequent commit.

Let's also replace the "return" if we don't have an
"opt->pattern_expression" with a conditional call of
free_pattern_expr().

Before db84376f98 (grep.c: remove "extended" in favor of
"pattern_expression", fix segfault, 2022-10-11) the pattern here was:

	if (!x)
		return;
	free_pattern_expr(y);

While at it, instead of:

	if (!x)
		return;
	free_pattern_expr(x);

Let's instead do:

	if (x)
		free_pattern_expr(x);

This will make it easier to free additional members from
free_grep_patterns() in the future.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:39 -08:00
41211db10f builtin/merge.c: free "&buf" on "Your local changes..." error
Plug a memory leak introduced in [1], since that change didn't follow
the "goto done" pattern introduced in [2] we'd leak the "&buf" memory.

1. e4cdfe84a0 (merge: abort if index does not match HEAD for trivial
   merges, 2022-07-23)
2. d5a35c114a (Copy resolve_ref() return value for longer use,
   2011-11-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:39 -08:00
345e216f63 builtin/merge.c: use fixed strings, not "strbuf", fix leak
Follow-up 465028e0e2 (merge: add missing strbuf_release(),
2021-10-07) and address the "msg" memory leak in this block. We could
free "&msg" before the "goto done" here, but even better is to avoid
allocating it in the first place.

By repeating the "Fast-forward" string here we can avoid using a
"struct strbuf" altogether.

Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:39 -08:00
81559612a9 show-branch: free() allocated "head" before return
Stop leaking the "head" variable, which we've been leaking since it
was originally added in [1], and in its current form since [2]

1. ed378ec7e8 (Make ref resolution saner, 2006-09-11)
2. d9e557a320 (show-branch: store resolved head in heap buffer,
   2017-02-14).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:39 -08:00
9d01cfed69 commit-graph: fix a parse_options_concat() leak
When the parse_options_concat() was added to this file in
84e4484f12 (commit-graph: use parse_options_concat(), 2021-08-23) we
wouldn't free() it if we returned early in these cases.

Since "result" is 0 by default we can "goto cleanup" in both cases,
and only need to set "result" if write_commit_graph_reachable() fails.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:38 -08:00
2139bd0200 http-backend.c: fix cmd_main() memory leak, refactor reg{exec,free}()
Fix a memory leak that's been with us ever since
2f4038ab33 (Git-aware CGI to provide dumb HTTP transport,
2009-10-30). In this case we're not calling regerror() after a failed
regexec(), and don't otherwise use "re" afterwards.

We can therefore simplify this code by calling regfree() right after
the regexec(). An alternative fix would be to add a regfree() to both
the "return" and "break" path in this for-loop.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:38 -08:00
eef75d247a http-backend.c: fix "dir" and "cmd_arg" leaks in cmd_main()
Free the "dir" variable after we're done with it. Before
917adc0360 (http-backend: add GIT_PROJECT_ROOT environment var,
2009-10-30) there was no leak here, as we'd get it via getenv(), but
since 917adc0360 we've xstrdup()'d it (or the equivalent), so we need
to free() it.

We also need to free the "cmd_arg" variable, which has been leaked
ever since it was added in 2f4038ab33 (Git-aware CGI to provide dumb
HTTP transport, 2009-10-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:38 -08:00
9f24f3c719 worktree: fix a trivial leak in prune_worktrees()
We were leaking both the "struct strbuf" in prune_worktrees(), as well
as the "path" we got from should_prune_worktree(). Since these were
the only two uses of the "struct string_list" let's change it to a
"DUP" and push these to it with "string_list_append_nodup()".

For the string_list_append_nodup() we could also string_list_append()
the main_path.buf, and then strbuf_release(&main_path) right away. But
doing it this way avoids an allocation, as we already have the "struct
strbuf" prepared for appending to "kept".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:38 -08:00
90428ddccf repack: fix leaks on error with "goto cleanup"
In cmd_repack() when we hit an error, replace "return ret" with "goto
cleanup" to ensure we free the necessary data structures.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:37 -08:00
486620ae0c name-rev: don't xstrdup() an already dup'd string
When "add_to_tip_table()" is called with a non-zero
"shorten_unambiguous" we always return an xstrdup()'d string, which
we'd then xstrdup() again, leaking memory. See [1] and [2] for how
this leak came about.

We could xstrdup() only if "shorten_unambiguous" wasn't true, but
let's instead inline this code, so that information on whether we need
to xstrdup() is contained within add_to_tip_table().

1. 98c5c4ad01 (name-rev: allow to specify a subpath for --refs
   option, 2013-06-18)
2. b23e0b9353 (name-rev: allow converting the exact object name at
   the tip of a ref, 2013-07-07)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:37 -08:00
7615cf94d2 various: add missing clear_pathspec(), fix leaks
Fix memory leaks resulting from a missing clear_pathspec().

- archive.c: Plug a leak in the "struct archiver_args", and
  clear_pathspec() the "pathspec" member that the "parse_pathspec_arg()"
  call in this function populates.

- builtin/clean.c: Fix a memory leak that's been with us since
  893d839970 (clean: convert to use parse_pathspec, 2013-07-14).

- builtin/reset.c: Add clear_pathspec() calls to cmd_reset(),
  including to the codepaths where we'd return early.

- builtin/stash.c: Call clear_pathspec() on the pathspec initialized
  in push_stash().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:37 -08:00
81e5c39cf6 clone: use free() instead of UNLEAK()
Change an UNLEAK() added in 0c4542738e (clone: free or UNLEAK further
pointers when finished, 2021-03-14) to use a "to_free" pattern
instead. In this case the "repo" can be either this absolute_pathdup()
value, or in the "else if" branch seen in the context the the
"argv[0]" argument to "main()".

We can only free() the value in the former case, hence the "to_free"
pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:37 -08:00
e8ed0a8ac5 commit-graph: use free_commit_graph() instead of UNLEAK()
In 0bfb48e672 (builtin/commit-graph.c: UNLEAK variables, 2018-10-03)
this was made to UNLEAK(), but we can just as easily invoke the
free_commit_graph() function added in c3756d5b7f (commit-graph: add
free_commit_graph, 2018-07-11) instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:36 -08:00
53537c6c17 bundle.c: don't leak the "args" in the "struct child_process"
Fix a leak that's been here since 7366096de9 (bundle API: change
"flags" to be "extra_index_pack_args", 2021-09-05). If we can't verify
the bundle, we didn't call child_process_clear() to clear the "args".

But rather than adding an additional child_process_clear() call, let's
verify the bundle before we start preparing the process we're going to
spawn. If we fail to verify, we don't need to push anything to the
child_process "args".

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:36 -08:00
b2e5d75d17 tests: mark tests as passing with SANITIZE=leak
When the "ab/various-leak-fixes" topic was merged in [1] only t6021
would fail if the tests were run in the
"GIT_TEST_PASSING_SANITIZE_LEAK=check" mode, i.e. to check whether we
marked all leak-free tests with "TEST_PASSES_SANITIZE_LEAK=true".

Since then we've had various tests starting to pass under
SANITIZE=leak. Let's mark those as passing, this is when they started
to pass, narrowed down with "git bisect":

- t5317-pack-objects-filter-objects.sh: In
  faebba436e (list-objects-filter: plug pattern_list leak, 2022-12-01).

- t3210-pack-refs.sh, t5613-info-alternate.sh,
  t7403-submodule-sync.sh: In 189e97bc4b (diff: remove parseopts member
  from struct diff_options, 2022-12-01).

- t1408-packed-refs.sh: In ab91f6b7c4 (Merge branch
  'rs/diff-parseopts', 2022-12-19).

- t0023-crlf-am.sh, t4152-am-subjects.sh, t4254-am-corrupt.sh,
  t4256-am-format-flowed.sh, t4257-am-interactive.sh,
  t5403-post-checkout-hook.sh: In a658e881c1 (am: don't pass strvec to
  apply_parse_options(), 2022-12-13)

- t1301-shared-repo.sh, t1302-repo-version.sh: In b07a819c05 (reflog:
  clear leftovers in reflog_expiry_cleanup(), 2022-12-13).

- t1304-default-acl.sh, t1410-reflog.sh,
  t5330-no-lazy-fetch-with-commit-graph.sh, t5502-quickfetch.sh,
  t5604-clone-reference.sh, t6014-rev-list-all.sh,
  t7701-repack-unpack-unreachable.sh: In b0c61be320 (Merge branch
  'rs/reflog-expiry-cleanup', 2022-12-26)

- t3800-mktag.sh, t5302-pack-index.sh, t5306-pack-nobase.sh,
  t5573-pull-verify-signatures.sh, t7612-merge-verify-signatures.sh: In
  69bbbe484b (hash-object: use fsck for object checks, 2023-01-18).

- t1451-fsck-buffer.sh: In 8e4309038f (fsck: do not assume
  NUL-termination of buffers, 2023-01-19).

- t6501-freshen-objects.sh: In abf2bb895b (Merge branch
  'jk/hash-object-fsck', 2023-01-30)

1. 9ea1378d04 (Merge branch 'ab/various-leak-fixes', 2022-12-14)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:34:36 -08:00
9fdc79ecba tests: don't lose misc "git" exit codes
Fix a few miscellaneous cases where:

- We lost the "git" exit code via "git ... | grep"
- Likewise by having a $(git) argument to git itself
- Used "test -z" to check that a command emitted no output, we can use
  "test_must_be_empty" and &&-chaining instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:42 -08:00
4bd0785dc2 tests: don't lose exit status with "test <op> $(git ...)"
As with the preceding commit, rewrite tests that ran "git" inside
command substitution and lost the exit status of "git" so that we
notice the failing "git". This time around we're converting cases that
didn't involve a containing sub-shell around the command substitution.

In the case of "t0060-path-utils.sh" and
"t2005-checkout-index-symlinks.sh" convert the relevant code to using
the modern style of indentation and newline wrapping while having to
change it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:42 -08:00
c7e03b4e39 tests: don't lose "git" exit codes in "! ( git ... | grep )"
Change tests that would lose the "git" exit code via a negation
pattern to:

- In the case of "t0055-beyond-symlinks.sh" compare against the
  expected output instead.

  We could use the same pattern as in the "t3700-add.sh" below, doing
  so would have the advantage that if we added an earlier test we
  wouldn't need to adjust the "expect" output.

  But as "t0055-beyond-symlinks.sh" is a small and focused test (less
  than 40 lines in total) let's use "test_cmp" instead.

- For "t3700-add.sh" use "sed -n" to print the expected "bad" part,
  and use "test_must_be_empty" to assert that it's not there. If we used
  "grep" we'd get a non-zero exit code.

  We could use "test_expect_code 1 grep", but this is more consistent
  with existing patterns in the test suite.

  We can also remove a repeated invocation of "git ls-files" for the
  last test that's being modified in that file, and search the
  existing "files" output instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:42 -08:00
0cd1a8818d tests: don't lose exit status with "(cd ...; test <op> $(git ...))"
Rewrite tests that ran "git" inside command substitution and lost the
exit status of "git" so that we notice the failing "git".

Have them use modern patterns such as a "test_cmp" of the expected
outputs instead.

We'll fix more of these these in the subsequent commit, for now we're
only converting the cases where this loss of exit code was combined
with spawning a sub-shell.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:41 -08:00
62f3a45bb4 t/lib-patch-mode.sh: fix ignored exit codes
Fix code added in b319ef70a9 (Add a small patch-mode testing library,
2009-08-13) to use &&-chaining.

This avoids losing both the exit code of a "git" and the "cat"
processes.

This fixes cases where we'd have e.g. missed memory leaks under
SANITIZE=leak, this code doesn't leak now as far as I can tell, but I
discovered it while looking at leaks in related code.

For "verify_saved_head()" we could make use of "test_cmp_rev" with
some changes, but it uses "git rev-parse --verify", and this existing
test does not. I think it could safely use it, but let's avoid the
while-at-it change, and narrowly fix the exit code problem.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:41 -08:00
fb18dd2831 auto-crlf tests: don't lose exit code in loops and outside tests
Change the functions which are called from within
"test_expect_success" to add the "|| return 1" idiom to their
for-loops, so we won't lose the exit code of "cp", "git" etc.

Then for those setup functions that aren't called from a
"test_expect_success" we need to put the setup code in a
"test_expect_success" as well. It would not be enough to properly
&&-chain these, as the calling code is the top-level script itself. As
we don't run the tests with "set -e" we won't report failing commands
at the top-level.

The "checkout" part of this would miss memory leaks under
SANITIZE=leak, this code doesn't leak (the relevant "git checkout"
leak has been fixed), but in a past version of git we'd continue past
this failure under SANITIZE=leak when these invocations had errored
out, even under "--immediate".

For checkout_files() we could run one test_expect_success() instead of
the 5 we run now in a loop.

But as this function added in [1] is already taking pains to split up
its setup into phases (there are 5 more "test_expect_success()" at the
end of it already, see [1]), let's follow that existing convention.

1. 343151dcbd (t0027: combinations of core.autocrlf, core.eol and text, 2014-06-29)

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:41 -08:00
5a7d41d849 docs & comments: replace mentions of "git-add--interactive.perl"
Now that we've removed "git-add--interactive.perl" let's replace
mentions of it with "add-interactive.c". In the case of the "git add"
documentation we were using it as an example filename, so the mention
wasn't wrong, but using a dead file is slightly confusing.

The "borrowed" comment here likewise isn't wrong, but let's mention
the successor file instead. In the case of pathspec.c the implied TODO
item should refer to the current code (and the comment may not even be
current, I didn't check).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:03:34 -08:00
d21878f073 add API: remove run_add_interactive() wrapper function
Now that the Perl "git-add--interactive" has gone away in the
preceding commit we don't need to pass along our desire for a mode as
a string, and can instead directly use the "enum add_p_mode", see
d2a233cb8b (built-in add -p: prepare for patch modes other than
"stage", 2019-12-21) for its introduction.

As a result of that the run_add_interactive() function would become a
trivial wrapper which would only run run_add_i() if a 0 (or now,
"NULL") "patch_mode" was provided. Let's instead remove it, and have
the one callsite that wanted the "NULL" case (interactive_add())
handle it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:03:34 -08:00
20b813d7d3 add: remove "add.interactive.useBuiltin" & Perl "git add--interactive"
Since [1] first released with Git v2.37.0 the built-in version of "add
-i" has been the default. That built-in implementation was added in
[2], first released with Git v2.25.0.

At this point enough time has passed to allow for finding any
remaining bugs in this new implementation, so let's remove the
fallback code.

As with similar migrations for "stash"[3] and "rebase"[4] we're
keeping a mention of "add.interactive.useBuiltin" in the
documentation, but adding a warning() to notify any outstanding users
that the built-in is now the default. As with [5] and [6] we should
follow-up in the future and eventually remove that warning.

1. 0527ccb1b5 (add -i: default to the built-in implementation,
   2021-11-30)
2. f83dff60a7 (Start to implement a built-in version of `git add
   --interactive`, 2019-11-13)
3. 8a2cd3f512 (stash: remove the stash.useBuiltin setting,
   2020-03-03)
4. d03ebd411c (rebase: remove the rebase.useBuiltin setting,
   2019-03-18)
5. deeaf5ee07 (stash: remove documentation for `stash.useBuiltin`,
   2022-01-27)
6. 9bcde4d531 (rebase: remove transitory rebase.useBuiltin setting &
   env, 2021-03-23)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:03:34 -08:00
e65b868d07 pack-objects: use strcspn(3) in name_cmp_len()
Call strcspn(3) to find the length of a string terminated by NUL, NL or
slash instead of open-coding it.  Adopt its return type, size_t, to
support strings of arbitrary length.  Use that type in callers as well
for variables and function parameters that receive the return value.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 14:31:11 -08:00
1b4a38d741 read-cache: use size_t for {base,df}_name_compare()
Support names of any length in base_name_compare() and df_name_compare()
by using size_t for their length parameters.  They pass the length on to
memcmp(3), which also takes it as a size_t.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 14:31:03 -08:00
d912a603ed t5000: modernise archive and :(glob) test
To match present day coding guiding codelines let's:

- use <<-EOF, so we can indent all lines to the
  the same level for this test

- use <<\EOF to notify the reader that no interpolation
  is expected in the body

Signed-off-by: Kostya Farber <kostya.farber@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 14:14:20 -08:00
d85e9448dd new-command.txt: update reference to builtin docs
Commit ec14d4ecb5 (builtin.h: take over documentation from
api-builtin.txt, 2017-08-02) deleted api-builtin.txt and moved the
contents into builtin.h, but new-command.txt still references the old
file.

Signed-off-by: Wes Lord <weslord@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 14:07:33 -08:00
1f34e0cd3d .gitattributes: include text attribute for eol attributes
The standard advice for text file eol endings in the .gitattributes file
was updated in e28eae3184 (gitattributes: Document the unified "auto"
handling, 2016-08-26) with a recent clarification in 8c591dbfce (docs:
correct documentation about eol attribute, 2022-01-11), with a follow
up comment by the original author in [1] confirming the use of the eol
attribute in conjunction with the text attribute.

Update Git's .gitattributes file to reflect our own advice.

[1] https://lore.kernel.org/git/?q=%3C20220216115239.uo2ie3flaqo3nf2d%40tb-raspi4%3E.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 13:57:08 -08:00
cbf04937d5 Git 2.39.2
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:43:41 +01:00
3aef76ffd4 Sync with 2.38.4
* maint-2.38:
  Git 2.38.4
  Git 2.37.6
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:43:39 +01:00
7556e5d737 Git 2.38.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:43:30 +01:00
6487e9c459 Sync with 2.37.6
* maint-2.37:
  Git 2.37.6
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:43:28 +01:00
eb88fe1ff5 Git 2.37.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:38:32 +01:00
16004682f9 Sync with 2.36.5
* maint-2.36:
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:38:31 +01:00
673472a963 Git 2.36.5
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:37:53 +01:00
40843216c5 Sync with 2.35.7
* maint-2.35:
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:37:52 +01:00
b7a92d078b Git 2.35.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:29:45 +01:00
6a53a59bf9 Sync with 2.34.7
* maint-2.34:
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:29:44 +01:00
91da4a29e1 Git 2.34.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:29:17 +01:00
a7237f5ae9 Sync with 2.33.7
* maint-2.33:
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:29:16 +01:00
bd6d3de01f Merge branch 'jk/curl-avoid-deprecated-api'
Deal with a few deprecation warning from cURL library.

* jk/curl-avoid-deprecated-api:
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
2023-02-06 09:27:41 +01:00
f44e6a2105 http: support CURLOPT_PROTOCOLS_STR
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.

Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:

  - we don't have to worry that silencing curl's deprecation warnings
    might cause us to miss other more useful ones

  - we'd eventually want to move to the new variant anyway, so this gets
    us set up (albeit with some extra ugly boilerplate for the
    conditional)

There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:

  GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
  if (...http is allowed...)
	GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);

and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.

On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).

This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:27:09 +01:00
4bd481e0ad http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.

But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros).  But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.

Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).

Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.

Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:27:09 +01:00
4fab049258 http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:27:08 +01:00
ed4404af3c Git 2.33.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:25:58 +01:00
87248c5933 Sync with 2.32.6
* maint-2.32:
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:25:56 +01:00
2aedeff35f Git 2.32.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:25:09 +01:00
aeb93d7da2 Sync with 2.31.7
* maint-2.31:
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:25:08 +01:00
0bbcf95194 Git 2.31.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:24:07 +01:00
e14d6b8408 Sync with 2.30.8
* maint-2.30:
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:24:06 +01:00
394a759d2b Git 2.30.8
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 09:14:45 +01:00
a3033a68ac Merge branch 'ps/apply-beyond-symlink' into maint-2.30
Fix a vulnerability (CVE-2023-23946) that allows crafted input to trick
`git apply` into writing files outside of the working tree.

* ps/apply-beyond-symlink:
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:12:16 +01:00
2c9a4c7310 Merge branch 'tb/clone-local-symlinks' into maint-2.30
Resolve a security vulnerability (CVE-2023-22490) where `clone_local()`
is used in conjunction with non-local transports, leading to arbitrary
path exfiltration.

* tb/clone-local-symlinks:
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:09:14 +01:00
647982bb71 delta-islands: free island_marks and bitmaps
On my mirror of linux.git forkgroup with 780 islands, this saves
nearly 4G of heap memory in pack-objects.  This savings only
benefits delta island users of pack bitmaps, as the process
would otherwise be exiting anyways.

However, there's probably not many delta island users, but the
majority of delta island users would also be pack bitmaps users.

Signed-off-by: Eric Wong <e@80x24.org>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-03 18:01:46 -08:00
a6a323b31e The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-03 16:08:22 -08:00
3eda8302e5 Merge branch 'en/ls-files-doc-update'
Doc update to ls-files.

* en/ls-files-doc-update:
  ls-files: guide folks to --exclude-standard over other --exclude* options
  ls-files: clarify descriptions of status tags for -t
  ls-files: clarify descriptions of file selection options
  ls-files: add missing documentation for --resolve-undo option
2023-02-03 16:08:22 -08:00
2c6e5b32aa Merge branch 'en/rebase-incompatible-opts'
"git rebase" often ignored incompatible options instead of
complaining, which has been corrected.

* en/rebase-incompatible-opts:
  rebase: provide better error message for apply options vs. merge config
  rebase: put rebase_options initialization in single place
  rebase: fix formatting of rebase --reapply-cherry-picks option in docs
  rebase: clarify the OPT_CMDMODE incompatibilities
  rebase: add coverage of other incompatible options
  rebase: fix incompatiblity checks for --[no-]reapply-cherry-picks
  rebase: fix docs about incompatibilities with --root
  rebase: remove --allow-empty-message from incompatible opts
  rebase: flag --apply and --merge as incompatible
  rebase: mark --update-refs as requiring the merge backend
2023-02-03 16:08:21 -08:00
c7757b2781 Merge branch 'as/ssh-signing-improve-key-missing-error'
Improve the error message given when private key is not loaded in
the ssh agent in the codepath to sign with an ssh key.

* as/ssh-signing-improve-key-missing-error:
  ssh signing: better error message when key not in agent
2023-02-03 16:08:21 -08:00
86cca7593e Merge branch 'jc/attr-doc-fix'
Comment fix.

* jc/attr-doc-fix:
  attr: fix instructions on how to check attrs
2023-02-03 16:08:21 -08:00
fade728df1 apply: fix writing behind newly created symbolic links
When writing files git-apply(1) initially makes sure that none of the
files it is about to create are behind a symlink:

```
 $ git init repo
 Initialized empty Git repository in /tmp/repo/.git/
 $ cd repo/
 $ ln -s dir symlink
 $ git apply - <<EOF
 diff --git a/symlink/file b/symlink/file
 new file mode 100644
 index 0000000..e69de29
 EOF
 error: affected file 'symlink/file' is beyond a symbolic link
```

This safety mechanism is crucial to ensure that we don't write outside
of the repository's working directory. It can be fooled though when the
patch that is being applied creates the symbolic link in the first
place, which can lead to writing files in arbitrary locations.

Fix this by checking whether the path we're about to create is
beyond a symlink or not. Tightening these checks like this should be
fine as we already have these precautions in Git as explained
above. Ideally, we should update the check we do up-front before
starting to reflect the computed changes to the working tree so that
we catch this case as well, but as part of embargoed security work,
adding an equivalent check just before we try to write out a file
should serve us well as a reasonable first step.

Digging back into history shows that this vulnerability has existed
since at least Git v2.9.0. As Git v2.8.0 and older don't build on my
system anymore I cannot tell whether older versions are affected, as
well.

Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-03 14:41:31 -08:00
2987407f3c mingw: remove msysGit/MSYS1 support
MSys has long fallen behind MSYS2 in features like Unicode or
x86_64 support or even security bug fixes, and is therefore no
longer used by anyone in the Git developer community. The Git for
Windows project itself started switching from MSys to MSYS2 early
in 2015, i.e. about eight years ago. Let's drop supporting MSys as
a development platform.

Signed-off-by: Harshil-Jani <harshiljani2002@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-02 08:06:30 -08:00
c0b50458b9 mingw: remove duplicate USE_NED_ALLOCATOR directive
nedalloc was added to fix the slowness of memory allocator. Here
specifically for the MSys2 build there seems to be a duplication of
USE_NED_ALLOCATOR directive. So this patch intends to remove the
duplicate USE_NED_ALLOCATOR and keeping it only into the MSys2 config
section so it still uses the nedalloc.

Signed-off-by: Harshil-Jani <harshiljani2002@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-02 08:06:20 -08:00
592bcab61b compat/winansi: check for errors of CreateThread() correctly
The return value for failed thread creation is NULL,
not INVALID_HANDLE_VALUE, unlike other Windows API functions.

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 14:36:19 -08:00
b08edf709d t/lib-httpd: increase ssl key size to 2048 bits
Recent versions of openssl will refuse to work with 1024-bit RSA keys,
as they are considered insecure. I didn't track down the exact version
in which the defaults were tightened, but the Debian-package openssl 3.0
on my system yields:

  $ LIB_HTTPD_SSL=1 ./t5551-http-fetch-smart.sh -v -i
  [...]
  SSL Library Error: error:0A00018F:SSL routines::ee key too small
  1..0 # SKIP web server setup failed

This could probably be overcome with configuration, but that's likely
to be a headache (especially if it requires touching /etc/openssl).
Let's just pick a key size that's less outrageously out of date.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
d113449e26 t/lib-httpd: drop SSLMutex config
The SSL config enabled by setting LIB_HTTPD_SSL does not work with
Apache versions greater than 2.2, as more recent versions complain about
the SSLMutex directive. According to
https://httpd.apache.org/docs/current/upgrading.html:

  Directives AcceptMutex, LockFile, RewriteLock, SSLMutex,
  SSLStaplingMutex, and WatchdogMutexPath have been replaced with a
  single Mutex directive. You will need to evaluate any use of these
  removed directives in your 2.2 configuration to determine if they can
  just be deleted or will need to be replaced using Mutex.

Deleting this line will just use the system default, which seems
sensible. The original came as part of faa4bc35a0 (http-push: add
regression tests, 2008-02-27), but no specific reason is given there (or
on the mailing list) for its presence.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
edd060dc84 t/lib-httpd: bump required apache version to 2.4
Apache 2.4 has been out since early 2012, almost 11 years. And its
predecessor, 2.2, has been out of support since its last release in
2017, over 5 years ago. The last mention on the mailing list was from
around the same time, in this thread:

  https://lore.kernel.org/git/20171231023234.21215-1-tmz@pobox.com/

We can probably assume that 2.4 is available everywhere. And the stakes
are fairly low, as the worst case is that such a platform would skip the
http tests.

This lets us clean up a few minor version checks in the config file, but
also revert f1f2b45be0 (tests: adjust the configuration for Apache 2.2,
2016-05-09). Its technique isn't _too_ bad, but certainly required a bit
more explanation than the 2.4 version it replaced. I manually confirmed
that the test in t5551 still behaves as expected (if you replace
"cadabra" with "foo", the server correctly rejects the request).

It will also help future patches which will no longer have to deal with
conditional config for this old version.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
d762617079 t/lib-httpd: bump required apache version to 2.2
Apache 2.2 was released in 2005, almost 18 years ago. We can probably
assume that people are running a version at least that old (and the
stakes for removing it are fairly low, as the worst case is that they
would not run the http tests against their ancient version).

Dropping support for the older versions cleans up the config file a
little, and will also enable us to bump the required version further
(with more cleanups) in a future patch.

Note that the file actually checks for version 2.1. In apache's
versioning scheme, odd numbered versions are for development and even
numbers are for stable releases. So 2.1 and 2.2 are effectively the same
from our perspective.

Older versions would just fail to start, which would generally cause us
to skip the tests. However, we do have version detection code in
lib-httpd.sh which produces a nicer error message, so let's update that,
too. I didn't bother handling the case of "3.0", etc. Apache has been on
2.x for 21 years, with no signs of bumping the major version.  And if
they eventually do, I suspect there will be enough breaking changes that
we'd need to update more than just the numeric version check. We can
worry about that hypothetical when it happens.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
3a2ebaebc7 docs: document zero bits in index "mode"
Documentation/gitformat-index.txt describes the "mode" as 32 bits, but
only documents 16 bits. Document the missing 16 bits and specify that
'unused' bits must be zero.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 08:49:23 -08:00
50b6ad55b0 grep: fall back to interpreter if JIT memory allocation fails
Under Linux systems with SELinux's 'deny_execmem' or PaX's MPROTECT
enabled, the allocation of PCRE2's JIT rwx memory may be prohibited,
making pcre2_jit_compile() fail with PCRE2_ERROR_NOMEMORY (-48):

  [user@fedora git]$ git grep -c PCRE2_JIT
  grep.c:1

  [user@fedora git]$ # Enable SELinux's W^X policy
  [user@fedora git]$ sudo semanage boolean -m -1 deny_execmem

  [user@fedora git]$ # JIT memory allocation fails, breaking 'git grep'
  [user@fedora git]$ git grep -c PCRE2_JIT
  fatal: Couldn't JIT the PCRE2 pattern 'PCRE2_JIT', got '-48'

Instead of failing hard in this case and making 'git grep' unusable on
such systems, simply fall back to interpreter mode, leading to a much
better user experience.

As having a functional PCRE2 JIT compiler is a legitimate use case for
performance reasons, we'll only do the fallback if the supposedly
available JIT is found to be non-functional by attempting to JIT compile
a very simple pattern. If this fails, JIT is deemed to be non-functional
and we do the interpreter fallback. For all other cases, i.e. the simple
pattern can be compiled but the user provided cannot, we fail hard as we
do now as the reason for the failure must be the pattern itself. To aid
users in helping themselves change the error message to include a hint
about the '(*NO_JIT)' prefix. Also clip the pattern at 64 characters to
ensure the hint will be seen by the user and not internally truncated by
the die() function.

Cc: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 11:39:02 -08:00
026df9e047 bundle-uri: test missing bundles with heuristic
The creationToken heuristic uses a different mechanism for downloading
bundles from the "standard" approach. Specifically: it uses a concrete
order based on the creationToken values and attempts to download as few
bundles as possible. It also modifies local config to store a value for
future fetches to avoid downloading bundles, if possible.

However, if any of the individual bundles has a failed download, then
the logic for the ordering comes into question. It is important to avoid
infinite loops, assigning invalid creation token values in config, but
also to be opportunistic as possible when downloading as many bundles as
seem appropriate.

These tests were used to inform the implementation of
fetch_bundles_by_token() in bundle-uri.c, but are being added
independently here to allow focusing on faulty downloads. There may be
more cases that could be added that result in modifications to
fetch_bundles_by_token() as interesting data shapes reveal themselves in
real scenarios.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
c429bed102 bundle-uri: store fetch.bundleCreationToken
When a bundle list specifies the "creationToken" heuristic, the Git
client downloads the list and then starts downloading bundles in
descending creationToken order. This process stops as soon as all
downloaded bundles can be applied to the repository (because all
required commits are present in the repository or in the downloaded
bundles).

When checking the same bundle list twice, this strategy requires
downloading the bundle with the maximum creationToken again, which is
wasteful. The creationToken heuristic promises that the client will not
have a use for that bundle if its creationToken value is at most the
previous creationToken value.

To prevent these wasteful downloads, create a fetch.bundleCreationToken
config setting that the Git client sets after downloading bundles. This
value allows skipping that maximum bundle download when this config
value is the same value (or larger).

To test that this works correctly, we can insert some "duplicate"
fetches into existing tests and demonstrate that only the bundle list is
downloaded.

The previous logic for downloading bundles by creationToken worked even
if the bundle list was empty, but now we have logic that depends on the
first entry of the list. Terminate early in the (non-sensical) case of
an empty bundle list.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
7f0cc04f2c fetch: fetch from an external bundle URI
When a user specifies a URI via 'git clone --bundle-uri', that URI may
be a bundle list that advertises a 'bundle.heuristic' value. In that
case, the Git client stores a 'fetch.bundleURI' config value storing
that URI.

Teach 'git fetch' to check for this config value and download bundles
from that URI before fetching from the Git remote(s). Likely, the bundle
provider has configured a heuristic (such as "creationToken") that will
allow the Git client to download only a portion of the bundles before
continuing the fetch.

Since this URI is completely independent of the remote server, we want
to be sure that we connect to the bundle URI before creating a
connection to the Git remote. We do not want to hold a stateful
connection for too long if we can avoid it.

To test that this works correctly, extend the previous tests that set
'fetch.bundleURI' to do follow-up fetches. The bundle list is updated
incrementally at each phase to demonstrate that the heuristic avoids
downloading older bundles. This includes the middle fetch downloading
the objects in bundle-3.bundle from the Git remote, and therefore not
needing that bundle in the third fetch.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
0524ad3542 bundle-uri: drop bundle.flag from design doc
The Implementation Plan section lists a 'bundle.flag' option that is not
documented anywhere else. What is documented elsewhere in the document
and implemented by previous changes is the 'bundle.heuristic' config
key. For now, a heuristic is required to indicate that a bundle list is
organized for use during 'git fetch', and it is also sufficient for all
existing designs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
4074d3c7e1 clone: set fetch.bundleURI if appropriate
Bundle providers may organize their bundle lists in a way that is
intended to improve incremental fetches, not just initial clones.
However, they do need to state that they have organized with that in
mind, or else the client will not expect to save time by downloading
bundles after the initial clone. This is done by specifying a
bundle.heuristic value.

There are two types of bundle lists: those at a static URI and those
that are advertised from a Git remote over protocol v2.

The new fetch.bundleURI config value applies for static bundle URIs that
are not advertised over protocol v2. If the user specifies a static URI
via 'git clone --bundle-uri', then Git can set this config as a reminder
for future 'git fetch' operations to check the bundle list before
connecting to the remote(s).

For lists provided over protocol v2, we will want to take a different
approach and create a property of the remote itself by creating a
remote.<id>.* type config key. That is not implemented in this change.

Later changes will update 'git fetch' to consume this option.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
7903efb717 bundle-uri: download in creationToken order
The creationToken heuristic provides an ordering on the bundles
advertised by a bundle list. Teach the Git client to download bundles
differently when this heuristic is advertised.

The bundles in the list are sorted by their advertised creationToken
values, then downloaded in decreasing order. This avoids the previous
strategy of downloading bundles in an arbitrary order and attempting
to apply them (likely failing in the case of required commits) until
discovering the order through attempted unbundling.

During a fresh 'git clone', it may make sense to download the bundles in
increasing order, since that would prevent the need to attempt
unbundling a bundle with required commits that do not exist in our empty
object store. The cost of testing an unbundle is quite low, and instead
the chosen order is optimizing for a future bundle download during a
'git fetch' operation with a non-empty object store.

Since the Git client continues fetching from the Git remote after
downloading and unbundling bundles, the client's object store can be
ahead of the bundle provider's object store. The next time it attempts
to download from the bundle list, it makes most sense to download only
the most-recent bundles until all tips successfully unbundle. The
strategy implemented here provides that short-circuit where the client
downloads a minimal set of bundles.

However, we are not satisfied by the naive approach of downloading
bundles until one successfully unbundles, expecting the earlier bundles
to successfully unbundle now. The example repository in t5558
demonstrates this well:

 ---------------- bundle-4

       4
      / \
 ----|---|------- bundle-3
     |   |
     |   3
     |   |
 ----|---|------- bundle-2
     |   |
     2   |
     |   |
 ----|---|------- bundle-1
      \ /
       1
       |
 (previous commits)

In this repository, if we already have the objects for bundle-1 and then
try to fetch from this list, the naive approach will fail. bundle-4
requires both bundle-3 and bundle-2, though bundle-3 will successfully
unbundle without bundle-2. Thus, the algorithm needs to keep this in
mind.

A later implementation detail will store the maximum creationToken seen
during such a bundle download, and the client will avoid downloading a
bundle unless its creationToken is strictly greater than that stored
value. For now, if the client seeks to download from an identical
bundle list since its previous download, it will download the
most-recent bundle then stop since its required commits are already in
the object store.

Add tests that exercise this behavior, but we will expand upon these
tests when incremental downloads during 'git fetch' make use of
creationToken values.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
512fccf8a5 bundle-uri: parse bundle.<id>.creationToken values
The previous change taught Git to parse the bundle.heuristic value,
especially when its value is "creationToken". Now, teach Git to parse
the bundle.<id>.creationToken values on each bundle in a bundle list.

Before implementing any logic based on creationToken values for the
creationToken heuristic, parse and print these values for testing
purposes.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
c93c3d2fa4 bundle-uri: parse bundle.heuristic=creationToken
The bundle.heuristic value communicates that the bundle list is
organized to make use of the bundle.<id>.creationToken values that may
be provided in the bundle list. Those values will create a total order
on the bundles, allowing the Git client to download them in a specific
order and even remember previously-downloaded bundles by storing the
maximum creation token value.

Before implementing any logic that parses or uses the
bundle.<id>.creationToken values, teach Git to parse the
bundle.heuristic value from a bundle list. We can use 'test-tool
bundle-uri' to print the heuristic value and verify that the parsing
works correctly.

As an extra precaution, create the internal 'heuristics' array to be a
list of (enum, string) pairs so we can iterate through the array entries
carefully, regardless of the enum values.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
7bc73e7b61 t5558: add tests for creationToken heuristic
As documented in the bundle URI design doc in 2da14fad8f (docs:
document bundle URI standard, 2022-08-09), the 'creationToken' member of
a bundle URI allows a bundle provider to specify a total order on the
bundles.

Future changes will allow the Git client to understand these members and
modify its behavior around downloading the bundles in that order. In the
meantime, create tests that add creation tokens to the bundle list. For
now, the Git client correctly ignores these unknown keys.

Create a new test helper function, test_remote_https_urls, which filters
GIT_TRACE2_EVENT output to extract a list of URLs passed to
git-remote-https child processes. This can be used to verify the order
of these requests as we implement the creationToken heuristic. For now,
we need to sort the actual output since the current client does not have
a well-defined order that it applies to the bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:47 -08:00
d9fd674c8b bundle: verify using check_connected()
When Git verifies a bundle to see if it is safe for unbundling, it first
looks to see if the prerequisite commits are in the object store. This
is an easy way to "fail fast" but it is not a sufficient check for
updating refs that guarantee closure under reachability. There could
still be issues if those commits are not reachable from the repository's
references. The repository only has guarantees that its object store is
closed under reachability for the objects that are reachable from
references.

Thus, the code in verify_bundle() has previously had the additional
check that all prerequisite commits are reachable from repository
references. This is done via a revision walk from all references,
stopping only if all prerequisite commits are discovered or all commits
are walked. This uses a custom walk to verify_bundle().

This check is more strict than what Git applies to fetched pack-files.
In the fetch case, Git guarantees that the new references are closed
under reachability by walking from the new references until walking
commits that are reachable from repository refs. This is done through
the well-used check_connected() method.

To better align with the restrictions required by 'git fetch',
reimplement this check in verify_bundle() to use check_connected(). This
also simplifies the code significantly.

The previous change added a test that verified the behavior of 'git
bundle verify' and 'git bundle unbundle' in this case, and the error
messages looked like this:

  error: Could not read <missing-commit>
  fatal: Failed to traverse parents of commit <extant-commit>

However, by changing the revision walk slightly within check_connected()
and using its quiet mode, we can omit those messages. Instead, we get
only this message, tailored to describing the current state of the
repository:

  error: some prerequisite commits exist in the object store,
         but are not connected to the repository's history

(Line break added here for the commit message formatting, only.)

While this message does not include any object IDs, there is no
guarantee that those object IDs would help the user diagnose what is
going on, as they could be separated from the prerequisite commits by
some distance. At minimum, this situation describes the situation in a
more informative way than the previous error messages.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:47 -08:00
e72171f085 bundle: test unbundling with incomplete history
When verifying a bundle, Git checks first that all prerequisite commits
exist in the object store, then adds an additional check: those
prerequisite commits must be reachable from references in the
repository.

This check is stronger than what is checked for refs being added during
'git fetch', which simply guarantees that the new refs have a complete
history up to the point where it intersects with the current reachable
history.

However, we also do not have any tests that check the behavior under
this condition. Create a test that demonstrates its behavior.

In order to construct a broken history, perform a shallow clone of a
repository with a linear history, but whose default branch ('base') has
a single commit, so dropping the shallow markers leaves a complete
history from that reference. However, the 'tip' reference adds a
shallow commit whose parent is missing in the cloned repository. Trying
to unbundle a bundle with the 'tip' as a prerequisite will succeed past
the object store check and move into the reachability check.

The two errors that are reported are of this form:

  error: Could not read <missing-commit>
  fatal: Failed to traverse parents of commit <present-commit>

These messages are not particularly helpful for the person running the
unbundle command, but they do prevent the command from succeeding.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:47 -08:00
2fc9e9ca3c The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-30 14:24:32 -08:00
a5eaa76b30 Merge branch 'ar/markup-em-dash'
Doc mark-up updates.

* ar/markup-em-dash:
  Documentation: render dash correctly
2023-01-30 14:24:24 -08:00
777afaaa5c Merge branch 'tb/t0003-invoke-dd-more-portably'
Test portability fix.

* tb/t0003-invoke-dd-more-portably:
  t0003: call dd with portable blocksize
2023-01-30 14:24:23 -08:00
abf2bb895b Merge branch 'jk/hash-object-fsck'
"git hash-object" now checks that the resulting object is well
formed with the same code as "git fsck".

* jk/hash-object-fsck:
  fsck: do not assume NUL-termination of buffers
  hash-object: use fsck for object checks
  fsck: provide a function to fsck buffer without object struct
  t: use hash-object --literally when created malformed objects
  t7030: stop using invalid tag name
  t1006: stop using 0-padded timestamps
  t1007: modernize malformed object tests
2023-01-30 14:24:22 -08:00
4ac326f64f Merge branch 'po/pretty-format-columns-doc'
Clarify column-padding operators in the pretty format string.

* po/pretty-format-columns-doc:
  doc: pretty-formats note wide char limitations, and add tests
  doc: pretty-formats describe use of ellipsis in truncation
  doc: pretty-formats document negative column alignments
  doc: pretty-formats: delineate `%<|(` parameter values
  doc: pretty-formats: separate parameters from placeholders
2023-01-30 14:24:22 -08:00
06f2b5fb70 Merge branch 'jc/doc-checkout-b'
Clarify how "checkout -b/-B" and "git branch [-f]" are similar but
different in the documentation.

* jc/doc-checkout-b:
  checkout: document -b/-B to highlight the differences from "git branch"
2023-01-30 14:24:21 -08:00
4f542975d1 Documentation: clarify that cache forgets credentials if the system restarts
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-29 09:21:07 -08:00
dea6308892 scalar: only warn when background maintenance fails
A user reported issues with 'scalar clone' and 'scalar register' when
working in an environment that had locked down the ability to run
'crontab' or 'systemctl' in that those commands registered as _failures_
instead of opportunistically reporting a success with just a warning
about background maintenance.

As a workaround, they can use GIT_TEST_MAINT_SCHEDULER to fake a
successful background maintenance, but this is not a viable strategy for
long-term.

Update 'scalar register' and 'scalar clone' to no longer fail by
modifying register_dir() to only warn when toggle_maintenance(1) fails.

Since background maintenance is a "nice to have" and not a requirement
for a working repository, it is best to move this from hard error to
gentle warning.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-27 12:38:26 -08:00
eeea9ae165 t921*: test scalar behavior starting maintenance
A user recently reported issues with 'scalar register' and 'scalar
clone' in that they failed when the system had permissions locked down
so both 'crontab' and 'systemctl' commands failed when trying to enable
background maintenance.

This hard error is undesirable, but let's create tests that demonstrate
this behavior before modiying the behavior. We can use
GIT_TEST_MAINT_SCHEDULER to guarantee failure and check the exit code
and error message.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-27 12:38:26 -08:00
008217cb4a t: allow 'scalar' in test_must_fail
This will enable scalar tests to use the test_must_fail helper, when
necessary.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-27 12:38:26 -08:00
5cc9858f1b The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-27 08:51:41 -08:00
d26e26a3f5 Merge branch 'cw/fetch-remote-group-with-duplication'
"git fetch <group>", when "<group>" of remotes lists the same
remote twice, unnecessarily failed when parallel fetching was
enabled, which has been corrected.

* cw/fetch-remote-group-with-duplication:
  fetch: fix duplicate remote parallel fetch bug
2023-01-27 08:51:41 -08:00
8f82904caf Merge branch 'jc/doc-branch-update-checked-out-branch'
Document that "branch -f <branch>" disables only the safety to
avoid recreating an existing branch.

* jc/doc-branch-update-checked-out-branch:
  branch: document `-f` and linked worktree behaviour
2023-01-27 08:51:41 -08:00
630ae5ee65 Merge branch 'jk/hash-object-literally-fd-leak'
Leakfix.

* jk/hash-object-literally-fd-leak:
  hash-object: fix descriptor leak with --literally
2023-01-27 08:51:41 -08:00
7d4d34f843 Merge branch 'pb/branch-advice-recurse-submodules'
Improve advice message given when "git branch --recurse-submodules"
fails.

* pb/branch-advice-recurse-submodules:
  branch: improve advice when --recurse-submodules fails
2023-01-27 08:51:40 -08:00
531d13d4d2 Merge branch 'km/send-email-with-v-reroll-count'
"git send-email -v 3" used to be expanded to "git send-email
--validate 3" when the user meant to pass them down to
"format-patch", which has been corrected.

* km/send-email-with-v-reroll-count:
  send-email: relay '-v N' to format-patch
2023-01-27 08:51:40 -08:00
557d93a146 Merge branch 'cb/grep-pcre-ucp'
"grep -P" learned to use Unicode Character Property to grok
character classes when processing \b and \w etc.

* cb/grep-pcre-ucp:
  grep: correctly identify utf-8 characters with \{b,w} in -P
2023-01-27 08:51:40 -08:00
3e6417681c Merge branch 'sa/cat-file-mailmap--batch-check'
Docfix.

* sa/cat-file-mailmap--batch-check:
  git-cat-file.txt: fix list continuations rendering literally
2023-01-27 08:51:40 -08:00
ce400c9da9 Merge branch 'ab/cache-api-cleanup-users'
Updates the users of the cache API.

* ab/cache-api-cleanup-users:
  treewide: always have a valid "index_state.repo" member
2023-01-27 08:51:39 -08:00
06cc6f6a41 attr: fix instructions on how to check attrs
The instructions in attr.h describing what functions to call to check
attributes is missing the index as the first argument to
git_check_attr(), as well as tree_oid as the second argument.

When 7a400a2c (attr: remove an implicit dependency on the_index,
2018-08-13) started passing an index_state instance to git_check_attr(),
it forgot to update the API documentation in
Documentation/technical/api-gitattributes.txt. Later, 3a1b3415
(attr: move doc to attr.h, 2019-11-17) moved the API documentation to
attr.h as a comment, but still left out the index_state as an argument.

In 47cfc9b (attr: add flag `--source` to work with tree-ish 2023-01-14)
added tree_oid as an optional parameter but was not added to the docs in
attr.h

Fix this to make the documentation in the comment consistent with the
actual function signature.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-26 14:16:48 -08:00
a9cad02538 request-pull: filter out SSH/X.509 tag signatures
git request-pull filters PGP signatures out of the tag message, but not
SSH or X.509 signatures.

Signed-off-by: Gwyneth Morgan <gwymor@tilde.club>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 15:54:41 -08:00
eddfcd8ece rebase: provide better error message for apply options vs. merge config
When config which selects the merge backend (currently,
rebase.autosquash=true or rebase.updateRefs=true) conflicts with other
options on the command line (such as --whitespace=fix), make the error
message specifically call out the config option and specify how to
override that config option on the command line.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
3dc55b2087 rebase: put rebase_options initialization in single place
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
9a7d7ce9f6 rebase: fix formatting of rebase --reapply-cherry-picks option in docs
Commit ce5238a690 ("rebase --keep-base: imply --reapply-cherry-picks",
2022-10-17) accidentally added some blank lines that cause extra
paragraphs about --reapply-cherry-picks to be considered not part of
the documentation of that option.  Remove the blank lines to make it
clear we are still discussing --reapply-cherry-picks.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
925360041c rebase: clarify the OPT_CMDMODE incompatibilities
--edit-todo was documented as being incompatible with any of the options
for the apply backend.  However, it is also incompatible with any of the
options for the merge backend, and is incompatible with any options that
are not backend specific as well.  The same can be said for --continue,
--skip, --abort, --quit, etc.

This is already somewhat implicitly covered by the synopsis, but since
"[<options>]" in the first two variants are vague it might be easy to
miss this.  That might not be a big deal, but since the rebase manpage
has to spend so much verbiage about incompatibility of options, making
a separate section for these options that are incompatible with
everything else seems clearer.  Do that, and remove the needless
inclusion of --edit-todo in the explicit incompatibility list.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
796abac7e1 rebase: add coverage of other incompatible options
The git-rebase manual noted several sets of incompatible options, but
we were missing tests for a few of these.  Further, we were missing
code checks for one of these, which could result in command line
options being silently ignored.

Also, note that adding a check for autosquash means that using
--whitespace=fix together with the config setting rebase.autosquash=true
will trigger an error.  A subsequent commit will improve the error
message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
ffeaca177a rebase: fix incompatiblity checks for --[no-]reapply-cherry-picks
--[no-]reapply-cherry-picks was traditionally only supported by the
sequencer.  Support was added for the apply backend, when --keep-base is
also specified, in commit ce5238a690 ("rebase --keep-base: imply
--reapply-cherry-picks", 2022-10-17).  Make the code error out when
--[no-]reapply-cherry-picks is specified AND the apply backend is used
AND --keep-base is not specified.  Also, clarify a number of comments
surrounding the interaction of these flags.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
b8ad365640 rebase: fix docs about incompatibilities with --root
In commit 5dacd4abdd ("git-rebase.txt: document incompatible options",
2018-06-25), I added notes about incompatibilities between options for
the apply and merge backends.  Unfortunately, I inverted the condition
when --root was incompatible with the apply backend.  Fix the
documentation, and add a testcase that verifies the documentation
matches the code.

While at it, the documentation for --root also tried to cover some of
the backend differences between the apply and merge backends in relation
to reapplying cherry picks.  The information:
  * assumed that the apply backend was the default (it isn't anymore)
  * was written before --reapply-cherry-picks became an option
  * was written before the detailed information on backend differences
All of these factors make the sentence under --root about reapplying
cherry picks contradict information that is now available elsewhere in
the manual, and the other references are correct.  So just strike this
sentence.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
1a66d8c6f6 rebase: remove --allow-empty-message from incompatible opts
--allow-empty-message was turned into a no-op and even documented
as such; the flag is simply ignored.  Since the flag is ignored, it
shouldn't be documented as being incompatible with other flags.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:52 -08:00
7d718c552b rebase: flag --apply and --merge as incompatible
Previously, we flagged options which implied --apply as being
incompatible with options which implied --merge.  But if both options
were given explicitly, then we didn't flag the incompatibility.  The
same is true with --apply and --interactive.  Add the check, and add
some testcases to verify these are also caught.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:52 -08:00
1207599e83 rebase: mark --update-refs as requiring the merge backend
--update-refs is built in terms of the sequencer, which requires the
merge backend.  It was already marked as incompatible with the apply
backend in the git-rebase manual, but the code didn't check for this
incompatibility and warn the user.  Check and error now.

While at it, fix a typo in t3422...and fix some misleading wording
(most options which used to be am-specific have since been implemented
in the merge backend as well).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:52 -08:00
dce7b31126 ssh signing: better error message when key not in agent
When signing a commit with a SSH key, with the private key missing from
ssh-agent, a confusing error message is produced:

    error: Load key
    "/var/folders/t5/cscwwl_n3n1_8_5j_00x_3t40000gn/T//.git_signing_key_tmpkArSj7":
    invalid format? fatal: failed to write commit object

The temporary file .git_signing_key_tmpkArSj7 created by git contains a
valid *public* key.  The error message comes from `ssh-keygen -Y sign' and
is caused by a fallback mechanism in ssh-keygen whereby it tries to
interpret .git_signing_key_tmpkArSj7 as a *private* key if it can't find in
the agent [1].  A fix is scheduled to be released in OpenSSH 9.1. All that
needs to be done is to pass an additional backward-compatible option -U to
'ssh-keygen -Y sign' call.  With '-U', ssh-keygen always interprets the file
as public key and expects to find the private key in the agent.

As a result, when the private key is missing from the agent, a more accurate
error message gets produced:

    error: Couldn't find key in agent

[1] https://bugzilla.mindrot.org/show_bug.cgi?id=3429

Signed-off-by: Adam Szkoda <adaszko@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 08:59:51 -08:00
a5005ded43 Merge branch 'ab/makeflags'
Backport a Makefile fix from upstream git.

* ab/makeflags:
  Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
2023-01-25 11:39:35 +01:00
e69547b7b6 Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
Since GNU make 4.4 the semantics of the $(MAKEFLAGS) variable has
changed in a backward-incompatible way, as its "NEWS" file notes:

  Previously only simple (one-letter) options were added to the MAKEFLAGS
  variable that was visible while parsing makefiles.  Now, all options are
  available in MAKEFLAGS.  If you want to check MAKEFLAGS for a one-letter
  option, expanding "$(firstword -$(MAKEFLAGS))" is a reliable way to return
  the set of one-letter options which can be examined via findstring, etc.

This means that $(MAKEFLAGS) now contains long options like
"--jobserver-auth=fifo:<path>" and we have to adapt to that.

Note that the "-" in "-$(MAKEFLAGS)" is critical here, as the variable
will always contain leading whitespace if there are no short options,
but long options are present.

This is a partial backport of 67b36879fc (Makefiles: change search
through $(MAKEFLAGS) for GNU make 4.4, 2022-11-30), which had been
applied directly to git/git.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-01-25 10:54:31 +01:00
bffc762f87 dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
When using the dir_iterator API, we first stat(2) the base path, and
then use that as a starting point to enumerate the directory's contents.

If the directory contains symbolic links, we will immediately die() upon
encountering them without the `FOLLOW_SYMLINKS` flag. The same is not
true when resolving the top-level directory, though.

As explained in a previous commit, this oversight in 6f054f9fb3
(builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28)
can be used as an attack vector to include arbitrary files on a victim's
filesystem from outside of the repository.

Prevent resolving top-level symlinks unless the FOLLOW_SYMLINKS flag is
given, which will cause clones of a repository with a symlink'd
"$GIT_DIR/objects" directory to fail.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
cf8f6ce02a clone: delay picking a transport until after get_repo_path()
In the previous commit, t5619 demonstrates an issue where two calls to
`get_repo_path()` could trick Git into using its local clone mechanism
in conjunction with a non-local transport.

That sequence is:

 - the starting state is that the local path https:/example.com/foo is a
   symlink that points to ../../../.git/modules/foo. So it's dangling.

 - get_repo_path() sees that no such path exists (because it's
   dangling), and thus we do not canonicalize it into an absolute path

 - because we're using --separate-git-dir, we create .git/modules/foo.
   Now our symlink is no longer dangling!

 - we pass the url to transport_get(), which sees it as an https URL.

 - we call get_repo_path() again, on the url. This second call was
   introduced by f38aa83f9a (use local cloning if insteadOf makes a
   local URL, 2014-07-17). The idea is that we want to pull the url
   fresh from the remote.c API, because it will apply any aliases.

And of course now it sees that there is a local file, which is a
mismatch with the transport we already selected.

The issue in the above sequence is calling `transport_get()` before
deciding whether or not the repository is indeed local, and not passing
in an absolute path if it is local.

This is reminiscent of a similar bug report in [1], where it was
suggested to perform the `insteadOf` lookup earlier. Taking that
approach may not be as straightforward, since the intent is to store the
original URL in the config, but to actually fetch from the insteadOf
one, so conflating the two early on is a non-starter.

Note: we pass the path returned by `get_repo_path(remote->url[0])`,
which should be the same as `repo_name` (aside from any `insteadOf`
rewrites).

We *could* pass `absolute_pathdup()` of the same argument, which
86521acaca (Bring local clone's origin URL in line with that of a remote
clone, 2008-09-01) indicates may differ depending on the presence of
".git/" for a non-bare repo. That matters for forming relative submodule
paths, but doesn't matter for the second call, since we're just feeding
it to the transport code, which is fine either way.

[1]: https://lore.kernel.org/git/CAMoD=Bi41mB3QRn3JdZL-FGHs4w3C2jGpnJB-CqSndO7FMtfzA@mail.gmail.com/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
58325b93c5 t5619: demonstrate clone_local() with ambiguous transport
When cloning a repository, Git must determine (a) what transport
mechanism to use, and (b) whether or not the clone is local.

Since f38aa83f9a (use local cloning if insteadOf makes a local URL,
2014-07-17), the latter check happens after the remote has been
initialized, and references the remote's URL instead of the local path.
This is done to make it possible for a `url.<base>.insteadOf` rule to
convert a remote URL into a local one, in which case the `clone_local()`
mechanism should be used.

However, with a specially crafted repository, Git can be tricked into
using a non-local transport while still setting `is_local` to "1" and
using the `clone_local()` optimization. The below test case
demonstrates such an instance, and shows that it can be used to include
arbitrary (known) paths in the working copy of a cloned repository on a
victim's machine[^1], even if local file clones are forbidden by
`protocol.file.allow`.

This happens in a few parts:

 1. We first call `get_repo_path()` to see if the remote is a local
    path. If it is, we replace the repo name with its absolute path.

 2. We then call `transport_get()` on the repo name and decide how to
    access it. If it was turned into an absolute path in the previous
    step, then we should always treat it like a file.

 3. We use `get_repo_path()` again, and set `is_local` as appropriate.
    But it's already too late to rewrite the repo name as an absolute
    path, since we've already fed it to the transport code.

The attack works by including a submodule whose URL corresponds to a
path on disk. In the below example, the repository "sub" is reachable
via the dumb HTTP protocol at (something like):

    http://127.0.0.1:NNNN/dumb/sub.git

However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level
directory called "http:", then nested directories "127.0.0.1:NNNN", and
"dumb") exists within the repository, too.

To determine this, it first picks the appropriate transport, which is
dumb HTTP. It then uses the remote's URL in order to determine whether
the repository exists locally on disk. However, the malicious repository
also contains an embedded stub repository which is the target of a
symbolic link at the local path corresponding to the "sub" repository on
disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git",
pointing to the stub repository via ".git/modules/sub/../../../repo").

This stub repository fools Git into thinking that a local repository
exists at that URL and thus can be cloned locally. The affected call is
in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which
locates a valid repository at that target.

This then causes Git to set the `is_local` variable to "1", and in turn
instructs Git to clone the repository using its local clone optimization
via the `clone_local()` function.

The exploit comes into play because the stub repository's top-level
"$GIT_DIR/objects" directory is a symbolic link which can point to an
arbitrary path on the victim's machine. `clone_local()` resolves the
top-level "objects" directory through a `stat(2)` call, meaning that we
read through the symbolic link and copy or hardlink the directory
contents at the destination of the link.

In other words, we can get steps (1) and (3) to disagree by leveraging
the dangling symlink to pick a non-local transport in the first step,
and then set is_local to "1" in the third step when cloning with
`--separate-git-dir`, which makes the symlink non-dangling.

This can result in data-exfiltration on the victim's machine when
sensitive data is at a known path (e.g., "/home/$USER/.ssh").

The appropriate fix is two-fold:

 - Resolve the transport later on (to avoid using the local
   clone optimization with a non-local transport).

 - Avoid reading through the top-level "objects" directory when
   (correctly) using the clone_local() optimization.

This patch merely demonstrates the issue. The following two patches will
implement each part of the above fix, respectively.

[^1]: Provided that any target directory does not contain symbolic
  links, in which case the changes from 6f054f9fb3 (builtin/clone.c:
  disallow `--local` clones with symlinks, 2022-07-28) will abort the
  clone.

Reported-by: yvvdwf <yvvdwf@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
1e5a89c1b4 Merge branch 'js/windows-rce'
Fix a Remote Code Execution vulnerability on Windows. This is caused by
the fact that Tcl on Windows always includes the current directory when
looking for an executable. Therefore malicious repositories can ship
with an aspell.exe in their top-level directory which is executed by Git
GUI without giving the user a chance to inspect it first, i.e. running
untrusted code.

This merge fixes CVE-2022-41953.

* js/windows-rce:
  Work around Tcl's default `PATH` lookup
  Move the `_which` function (almost) to the top
  Move is_<platform> functions to the beginning
  is_Cygwin: avoid `exec`ing anything
  windows: ignore empty `PATH` elements
2023-01-24 14:14:05 +01:00
aae9560a35 Work around Tcl's default PATH lookup
As per https://www.tcl.tk/man/tcl8.6/TclCmd/exec.html#M23, Tcl's `exec`
function goes out of its way to imitate the highly dangerous path lookup
of `cmd.exe`, but _of course_ only on Windows:

	If a directory name was not specified as part of the application
	name, the following directories are automatically searched in
	order when attempting to locate the application:

	    The directory from which the Tcl executable was loaded.

	    The current directory.

	    The Windows 32-bit system directory.

	    The Windows home directory.

	    The directories listed in the path.

The dangerous part is the second item, of course: `exec` _prefers_
executables in the current directory to those that are actually in the
`PATH`.

It is almost as if people wanted to Windows users vulnerable,
specifically.

To avoid that, Git GUI already has the `_which` function that does not
imitate that dangerous practice when looking up executables in the
search path.

However, Git GUI currently fails to use that function e.g. when trying to
execute `aspell` for spell checking.

That is not only dangerous but combined with Tcl's unfortunate default
behavior and with the fact that Git GUI tries to spell-check a
repository just after cloning, leads to a critical Remote Code Execution
vulnerability.

Let's override both `exec` and `open` to always use `_which` instead of
letting Tcl perform the path lookup, to prevent this attack vector.

This addresses CVE-2022-41953.

For more details, see
https://github.com/git-for-windows/git/security/advisories/GHSA-v4px-mx59-w99c

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-01-24 14:10:40 +01:00
fd477a1d3b Move the _which function (almost) to the top
We are about to make use of the `_which` function to address
CVE-2022-41953 by overriding Tcl/Tk's unsafe PATH lookup on Windows.

In preparation for that, let's move it close to the top of the file to
make sure that even early `exec` calls that happen during the start-up
of Git GUI benefit from the fix.

This commit is best viewed with `--color-moved`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-01-24 14:10:40 +01:00
e0539b4b25 Move is_<platform> functions to the beginning
We need these in `_which` and they should be defined before that
function's definition.

This commit is best viewed with `--color-moved`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-01-24 14:10:40 +01:00
c5766eae6f is_Cygwin: avoid execing anything
The `is_Cygwin` function is used, among other things, to determine
how executables are discovered in the `PATH` list by the `_which` function.

We are about to change the behavior of the `_which` function on Windows
(but not Cygwin): On Windows, we want it to ignore empty elements of the
`PATH` instead of treating them as referring to the current directory
(which is a "legacy feature" according to
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03,
but apparently not explicitly deprecated, the POSIX documentation is
quite unclear on that even if the Cygwin project itself considers it to
be deprecated: https://github.com/cygwin/cygwin/commit/fc74dbf22f5c).

This is important because on Windows, `exec` does something very unsafe
by default (unless we're running a Cygwin version of Tcl, which follows
Unix semantics).

However, we try to `exec` something _inside_ `is_Cygwin` to determine
whether we're running within Cygwin or not, i.e. before we determined
whether we need to handle `PATH` specially or not. That's a Catch-22.

Therefore, and because it is much cleaner anyway, use the
`$::tcl_platform(os)` value which is guaranteed to start with `CYGWIN_`
when running a Cygwin variant of Tcl/Tk, instead of executing `cygpath
--windir`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-01-24 14:10:40 +01:00
8f23432b38 windows: ignore empty PATH elements
When looking up an executable via the `_which` function, Git GUI
imitates the `execlp()` strategy where the environment variable `PATH`
is interpreted as a list of paths in which to search.

For historical reasons, stemming from the olden times when it was
uncommon to download a lot of files from the internet into the current
directory, empty elements in this list are treated as if the current
directory had been specified.

Nowadays, of course, this treatment is highly dangerous as the current
directory often contains files that have just been downloaded and not
yet been inspected by the user. Unix/Linux users are essentially
expected to be very, very careful to simply not add empty `PATH`
elements, i.e. not to make use of that feature.

On Windows, however, it is quite common for `PATH` to contain empty
elements by mistake, e.g. as an unintended left-over entry when an
application was installed from the Windows Store and then uninstalled
manually.

While it would probably make most sense to safe-guard not only Windows
users, it seems to be common practice to ignore these empty `PATH`
elements _only_ on Windows, but not on other platforms.

Sadly, this practice is followed inconsistently between different
software projects, where projects with few, if any, Windows-based
contributors tend to be less consistent or even "blissful" about it.
Here is a non-exhaustive list:

Cygwin:

	It specifically "eats" empty paths when converting path lists to
	POSIX: https://github.com/cygwin/cygwin/commit/753702223c7d

	I.e. it follows the common practice.

PowerShell:

	It specifically ignores empty paths when searching the `PATH`.
	The reason for this is apparently so self-evident that it is not
	even mentioned here:
	https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_environment_variables#path-information

	I.e. it follows the common practice.

CMD:

	Oh my, CMD. Let's just forget about it, nobody in their right
	(security) mind takes CMD as inspiration. It is so unsafe by
	default that we even planned on dropping `Git CMD` from Git for
	Windows altogether, and only walked back on that plan when we
	found a super ugly hack, just to keep Git's users secure by
	default:

		https://github.com/git-for-windows/MINGW-packages/commit/82172388bb51

	So CMD chooses to hide behind the battle cry "Works as
	Designed!" that all too often leaves users vulnerable. CMD is
	probably the most prominent project whose lead you want to avoid
	following in matters of security.

Win32 API (`CreateProcess()`)

	Just like CMD, `CreateProcess()` adheres to the original design
	of the path lookup in the name of backward compatibility (see
	https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw
	for details):

		If the file name does not contain a directory path, the
		system searches for the executable file in the following
		sequence:

		    1. The directory from which the application loaded.

		    2. The current directory for the parent process.

		    [...]

	I.e. the Win32 API itself chooses backwards compatibility over
	users' safety.

Git LFS:

	There have been not one, not two, but three security advisories
	about Git LFS executing executables from the current directory by
	mistake. As part of one of them, a change was introduced to stop
	treating empty `PATH` elements as equivalent to `.`:
	https://github.com/git-lfs/git-lfs/commit/7cd7bb0a1f0d

	I.e. it follows the common practice.

Go:

	Go does not follow the common practice, and you can think about
	that what you want:
	https://github.com/golang/go/blob/go1.19.3/src/os/exec/lp_windows.go#L114-L135
	https://github.com/golang/go/blob/go1.19.3/src/path/filepath/path_windows.go#L108-L137

Git Credential Manager:

	It tries to imitate Git LFS, but unfortunately misses the empty
	`PATH` element handling. As of time of writing, this is in the
	process of being fixed:
	https://github.com/GitCredentialManager/git-credential-manager/pull/968

So now that we have established that it is a common practice to ignore
empty `PATH` elements on Windows, let's assess this commit's change
using Schneier's Five-Step Process
(https://www.schneier.com/crypto-gram/archives/2002/0415.html#1):

Step 1: What problem does it solve?

	It prevents an entire class of Remote Code Execution exploits via
	Git GUI's `Clone` functionality.

Step 2: How well does it solve that problem?

	Very well. It prevents the attack vector of luring an unsuspecting
	victim into cloning an executable into the worktree root directory
	that Git GUI immediately executes.

Step 3: What other security problems does it cause?

	Maybe non-security problems: If a project (ab-)uses the unsafe
	`PATH` lookup. That would not only be unsafe, though, but
	fragile in the first place because it would break when running
	in a subdirectory. Therefore I would consider this a scenario
	not worth keeping working.

Step 4: What are the costs of this measure?

	Almost nil, except for the time writing up this commit message
	;-)

Step 5: Given the answers to steps two through four, is the security
	measure worth the costs?

	Yes. Keeping Git's users Secure By Default is worth it. It's a
	tiny price to pay compared to the damages even a single
	successful exploit can cost.

So let's follow that common practice in Git GUI, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2023-01-24 14:10:40 +01:00
5dec958dcf The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-23 13:39:52 -08:00
ebed06a3e9 Merge branch 'zh/scalar-progress'
"scalar" learned to give progress bar.

* zh/scalar-progress:
  scalar: show progress if stderr refers to a terminal
2023-01-23 13:39:52 -08:00
5287319bf8 Merge branch 'ds/omit-trailing-hash-in-index'
Quickfix for a topic already in 'master'.

* ds/omit-trailing-hash-in-index:
  t1600: fix racy index.skipHash test
2023-01-23 13:39:52 -08:00
019a1031ea Merge branch 'jc/format-patch-v-unleak'
Plug a small leak.

* jc/format-patch-v-unleak:
  format-patch: unleak "-v <num>"
2023-01-23 13:39:52 -08:00
6e0f966efe Merge branch 'sk/win32-close-handle-upon-pthread-join'
Pthread emulation on Win32 leaked thread handle when a thread is
joined.

* sk/win32-close-handle-upon-pthread-join:
  win32: close handles of threads that have been joined
  win32: prepare pthread.c for change by formatting
2023-01-23 13:39:51 -08:00
5427bb4893 Merge branch 'rs/use-enhanced-bre-on-macos'
Newer regex library macOS stopped enabling GNU-like enhanced BRE,
where '\(A\|B\)' works as alternation, unless explicitly asked with
the REG_ENHANCED flag.  "git grep" now can be compiled to do so, to
retain the old behaviour.

* rs/use-enhanced-bre-on-macos:
  use enhanced basic regular expressions on macOS
2023-01-23 13:39:51 -08:00
cd37c45acf Merge branch 'ab/test-env-helper'
Remove "git env--helper" and demote it to a test-tool subcommand.

* ab/test-env-helper:
  env-helper: move this built-in to "test-tool env-helper"
2023-01-23 13:39:51 -08:00
577bff3a81 Merge branch 'kn/attr-from-tree'
"git check-attr" learned to take an optional tree-ish to read the
.gitattributes file from.

* kn/attr-from-tree:
  attr: add flag `--source` to work with tree-ish
  t0003: move setup for `--all` into new block
2023-01-23 13:39:51 -08:00
8a40af9cab Merge branch 'rs/ls-tree-path-expansion-fix'
"git ls-tree --format='%(path) %(path)' $tree $path" showed the
path three times, which has been corrected.

* rs/ls-tree-path-expansion-fix:
  ls-tree: remove dead store and strbuf for quote_c_style()
  ls-tree: fix expansion of repeated %(path)
2023-01-23 13:39:50 -08:00
b269563512 Merge branch 'en/t6426-todo-cleanup'
Test clean-up.

* en/t6426-todo-cleanup:
  t6426: fix TODO about making test more comprehensive
2023-01-23 13:39:50 -08:00
8844c1125e Merge branch 'ab/cache-api-cleanup'
Code clean-up to tighten the use of in-core index in the API.

* ab/cache-api-cleanup:
  cache API: add a "INDEX_STATE_INIT" macro/function, add release_index()
  read-cache.c: refactor set_new_index_sparsity() for subsequent commit
  sparse-index API: BUG() out on NULL ensure_full_index()
  sparse-index.c: expand_to_path() can assume non-NULL "istate"
  builtin/difftool.c: { 0 }-initialize rather than using memset()
2023-01-23 13:39:49 -08:00
70661d288b Documentation: render dash correctly
Three hyphens are rendered verbatim in documentation, so "--" has to be
used to produce a dash.  Fix asciidoc output for dashes.  This is
similar to previous commits f0b922473e (Documentation: render special
characters correctly, 2021-07-29) and de82095a95 (doc
hash-function-transition: fix asciidoc output, 2021-02-05).

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-23 09:40:14 -08:00
7fb89047cc bisect: fix "reset" when branch is checked out elsewhere
Since 1d0fa89 (checkout: add --ignore-other-wortrees, 2015-01-03) we
have a safety valve in checkout/switch to prevent the same branch from
being checked out simultaneously in multiple worktrees.

If a branch is bisected in a worktree while also being checked out in
another worktree; when the bisection is finished, checking out the
branch back in the current worktree may fail.

Let's teach bisect to use the "--ignore-other-worktrees" flag.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-22 09:23:11 -08:00
5458ba0a4d t0003: call dd with portable blocksize
The command `dd bs=101M count=1` is not portable,
e.g. dd shipped with MacOs does not understand the 'M'.

Use `dd bs=1048576 count=101`, which achives the same, instead.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-22 08:14:40 -08:00
56c8fb1e95 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-21 17:22:01 -08:00
86ccd39a74 Merge branch 'yc/doc-fetch-fix'
Doc fix.

* yc/doc-fetch-fix:
  doc: fix non-existent config name
2023-01-21 17:22:01 -08:00
30b4e5c888 Merge branch 'ab/bisect-cleanup'
Code clean-up.

* ab/bisect-cleanup:
  bisect: no longer try to clean up left-over `.git/head-name` files
  bisect: remove Cogito-related code
  bisect run: fix the error message
  bisect: verify that a bogus option won't try to start a bisection
  bisect--helper: make the order consistently `argc, argv`
  bisect--helper: simplify exit code computation
2023-01-21 17:22:01 -08:00
38a49aba90 Merge branch 'tl/ls-tree-code-clean-up'
Code clean-up.

* tl/ls-tree-code-clean-up:
  t3104: remove shift code in 'test_ls_tree_format'
  ls-tree: cleanup the redundant SPACE
  ls-tree: make "line_termination" less generic
  ls-tree: fold "show_tree_data" into "cb" struct
  ls-tree: use a "struct options"
  ls-tree: don't use "show_tree_data" for "fast" callbacks
2023-01-21 17:22:00 -08:00
d2917b9099 Merge branch 'ph/parse-date-reduced-precision'
Loosen date parsing heuristics.

* ph/parse-date-reduced-precision:
  date.c: allow ISO 8601 reduced precision times
2023-01-21 17:22:00 -08:00
e28d5d2160 Merge branch 'pw/rebase-exec-cleanup'
Code clean-up.

* pw/rebase-exec-cleanup:
  rebase: cleanup "--exec" option handling
2023-01-21 17:22:00 -08:00
9c2003a6cb Merge branch 'pb/doc-orig-head'
Document ORIG_HEAD a bit more.

* pb/doc-orig-head:
  git-rebase.txt: add a note about 'ORIG_HEAD' being overwritten
  revisions.txt: be explicit about commands writing 'ORIG_HEAD'
  git-merge.txt: mention 'ORIG_HEAD' in the Description
  git-reset.txt: mention 'ORIG_HEAD' in the Description
  git-cherry-pick.txt: do not use 'ORIG_HEAD' in example
2023-01-21 17:22:00 -08:00
b106341d57 Merge branch 'yo/doc-use-more-switch-c'
Doc update.

* yo/doc-use-more-switch-c:
  doc: add "git switch -c" as another option on detached HEAD
2023-01-21 17:22:00 -08:00
df786f6efe Merge branch 'sk/merge-filtering-strategies-micro-optim'
Micro optimization.

* sk/merge-filtering-strategies-micro-optim:
  merge: break out of all_strategy loop when strategy is found
2023-01-21 17:21:59 -08:00
42423c61d9 Merge branch 'jk/interop-error'
Test helper improvement.

* jk/interop-error:
  t/interop: report which vanilla git command failed
2023-01-21 17:21:59 -08:00
f2744aa37e Merge branch 'ar/bisect-doc-update'
Doc update.

* ar/bisect-doc-update:
  git-bisect-lk2009: update nist report link
  git-bisect-lk2009: update java code conventions link
2023-01-21 17:21:59 -08:00
013f168211 Merge branch 'ar/test-cleanup'
Test clean-up.

* ar/test-cleanup:
  t7527: use test_when_finished in 'case insensitive+preserving'
  t6422: drop commented out code
  t6003: uncomment test '--max-age=c3, --topo-order'
2023-01-21 17:21:59 -08:00
c253d61137 Merge branch 'jc/doc-diff-patch.txt'
Doc update.

* jc/doc-diff-patch.txt:
  docs: link generating patch sections
2023-01-21 17:21:58 -08:00
fc2735f427 Merge branch 'es/hooks-and-local-env'
Doc update for environment variables set when hooks are invoked.

* es/hooks-and-local-env:
  githooks: discuss Git operations in foreign repositories
2023-01-21 17:21:58 -08:00
60ce816cb6 Merge branch 'rs/dup-array'
Code cleaning.

* rs/dup-array:
  use DUP_ARRAY
  add DUP_ARRAY
  do full type check in BARF_UNLESS_COPYABLE
  factor out BARF_UNLESS_COPYABLE
  mingw: make argv2 in try_shell_exec() non-const
2023-01-21 17:21:58 -08:00
90c47b3fba Merge branch 'jx/t1301-updates'
Test updates.

* jx/t1301-updates:
  t1301: do not change $CWD in "shared=all" test case
  t1301: use test_when_finished for cleanup
  t1301: fix wrong template dir for git-init
2023-01-21 17:21:58 -08:00
904d404274 The eighth batch
The cURL one hasn't cooked for a week in 'next', but let's fast
track it so that linux-musl CI job would be happy.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-20 15:36:22 -08:00
5970a4b797 Merge branch 'jk/read-object-cleanup'
Code clean-up.

* jk/read-object-cleanup:
  object-file: fix indent-with-space
  packfile: inline custom read_object()
  repo_read_object_file(): stop wrapping read_object_file_extended()
  read_object_file_extended(): drop lookup_replace option
  streaming: inline call to read_object_file_extended()
  object-file: inline calls to read_object()
2023-01-20 15:36:21 -08:00
10925f5e8a Merge branch 'jk/curl-avoid-deprecated-api'
Deal with a few deprecation warning from cURL library.

* jk/curl-avoid-deprecated-api:
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
2023-01-20 15:36:21 -08:00
8e4309038f fsck: do not assume NUL-termination of buffers
The fsck code operates on an object buffer represented as a pointer/len
combination. However, the parsing of commits and tags is a little bit
loose; we mostly scan left-to-right through the buffer, without checking
whether we've gone past the length we were given.

This has traditionally been OK because the buffers we feed to fsck
always have an extra NUL after the end of the object content, which ends
any left-to-right scan. That has always been true for objects we read
from the odb, and we made it true for incoming index-pack/unpack-objects
checks in a1e920a0a7 (index-pack: terminate object buffers with NUL,
2014-12-08).

However, we recently added an exception: hash-object asks index_fd() to
do fsck checks. That _may_ have an extra NUL (if we read from a pipe
into a strbuf), but it might not (if we read the contents from the
file). Nor can we just teach it to always add a NUL. We may mmap the
on-disk file, which will not have any extra bytes (if it's a multiple of
the page size). Not to mention that this is a rather subtle assumption
for the fsck code to make.

Instead, let's make sure that the fsck parsers don't ever look past the
size of the buffer they've been given. This _almost_ works already,
thanks to earlier work in 4d0d89755e (Make sure fsck_commit_buffer()
does not run out of the buffer, 2014-09-11). The theory there is that we
check up front whether we have the end of header double-newline
separator. And then any left-to-right scanning we do is OK as long as it
stops when it hits that boundary.

However, we later softened that in 84d18c0bcf (fsck: it is OK for a tag
and a commit to lack the body, 2015-06-28), which allows the
double-newline header to be missing, but does require that the header
ends in a newline. That was OK back then, because of the NUL-termination
guarantees (including the one from a1e920a0a7 mentioned above).

Because 84d18c0bcf guarantees that any header line does end in a
newline, we are still OK with most of the left-to-right scanning. We
only need to take care after completing a line, to check that there is
another line (and we didn't run out of buffer).

Most of these checks are just need to check "buffer < buffer_end" (where
buffer is advanced as we parse) before scanning for the next header
line. But here are a few notes:

  - we don't technically need to check for remaining buffer before
    parsing the very first line ("tree" for a commit, or "object" for a
    tag), because verify_headers() rejects a totally empty buffer. But
    we'll do so in the name of consistency and defensiveness.

  - there are some calls to strchr('\n'). These are actually OK by the
    "the final header line must end in a newline" guarantee from
    verify_headers(). They will always find that rather than run off the
    end of the buffer. Curiously, they do check for a NULL return and
    complain, but I believe that condition can never be reached.

    However, I converted them to use memchr() with a proper size and
    retained the NULL checks. Using memchr() is not much longer and
    makes it more obvious what is going on. Likewise, retaining the NULL
    checks serves as a defensive measure in case my analysis is wrong.

  - commit 9a1a3a4d4c (mktag: allow omitting the header/body \n
    separator, 2021-01-05), does check for the end-of-buffer condition,
    but does so with "!*buffer", relying explicitly on the NUL
    termination. We can accomplish the same thing with a pointer
    comparison. I also folded it into the follow-on conditional that
    checks the contents of the buffer, for consistency with the other
    checks.

  - fsck_ident() uses parse_timestamp(), which is based on strtoumax().
    That function will happily skip past leading whitespace, including
    newlines, which makes it a risk. We can fix this by scanning to the
    first digit ourselves, and then using parse_timestamp() to do the
    actual numeric conversion.

    Note that as a side effect this fixes the fact that we missed
    zero-padded timestamps like "<email>   0123" (whereas we would
    complain about "<email> 0123"). I doubt anybody cares, but I
    mention it here for completeness.

  - fsck_tree() does not need any modifications. It relies on
    decode_tree_entry() to do the actual parsing, and that function
    checks both that there are enough bytes in the buffer to represent
    an entry, and that there is a NUL at the appropriate spot (one
    hash-length from the end; this may not be the NUL for the entry we
    are parsing, but we know that in the worst case, everything from our
    current position to that NUL is a filename, so we won't run out of
    bytes).

In addition to fixing the code itself, we'd like to make sure our rather
subtle assumptions are not violated in the future. So this patch does
two more things:

  - add comments around verify_headers() documenting the link between
    what it checks and the memory safety of the callers. I don't expect
    this code to be modified frequently, but this may help somebody from
    accidentally breaking things.

  - add a thorough set of tests covering truncations at various key
    spots (e.g., for a "tree $oid" line, in the middle of the word
    "tree", right after it, after the space, in the middle of the $oid,
    and right at the end of the line. Most of these are fine already (it
    is only truncating right at the end of the line that is currently
    broken). And some of them are not even possible with the current
    code (we parse "tree " as a unit, so truncating before the space is
    equivalent). But I aimed here to consider the code a black box and
    look for any truncations that would be a problem for a left-to-right
    parser.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 15:39:43 -08:00
06a668cb90 fetch: fix duplicate remote parallel fetch bug
Fetching in parallel from a remote group with a duplicated remote results
in the following:

error: cannot lock ref '<ref>': is at <oid> but expected <oid>

This doesn't happen in serial since fetching from the same remote that
has already been fetched from is a noop. Therefore, remove any duplicated
remotes after remote groups are parsed.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 14:41:48 -08:00
540e7bc477 doc: pretty-formats note wide char limitations, and add tests
The previous commits added clarifications to the column alignment
placeholders, note that the spaces are optional around the parameters.

Also, a proposed extension [1] to allow hard truncation (without
ellipsis '..') highlighted that the existing code does not play well
with wide characters, such as Asian fonts and emojis.

For example, N wide characters take 2N columns so won't fit an odd number
column width, causing misalignment somewhere.

Further analysis also showed that decomposed characters, e.g. separate
`a` + `umlaut` Unicode code-points may also be mis-counted, in some cases
leaving multiple loose `umlauts` all combined together.

Add some notes about these limitations, and add basic tests to demonstrate
them.

The chosen solution for the tests is to substitute any wide character
that overlaps a splitting boundary for the unicode vertical ellipsis
code point as a rare but 'obvious' substitution.

An alternative could be the substitution with a single dot '.' which
matches regular expression usage, and our two dot ellipsis, and further
in scenarios where the bulk of the text is wide characters, would be
obvious. In mainly 'ascii' scenarios a singleton emoji being substituted
by a dot could be confusing.

It is enough that the tests fail cleanly. The final choice for the
substitute character can be deferred.

[1]
https://lore.kernel.org/git/20221030185614.3842-1-philipoakley@iee.email/

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 14:35:15 -08:00
b5cd634d7a doc: pretty-formats describe use of ellipsis in truncation
Commit a7f01c6b4d (pretty: support truncating in %>, %< and %><,
2013-04-19) added the use of ellipsis when truncating placeholder
values.

Show our 'two dot' ellipsis, and examples for the left, middle and
right truncation to avoid any confusion as to which end of the string
is adjusted. (cf justification and sub-string).

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 14:35:15 -08:00
63792c564c doc: pretty-formats document negative column alignments
Commit 066790d7cb (pretty.c: support <direction>|(<negative number>) forms,
2016-06-16) added the option for right justified column alignment without
updating the documentation.

Add an explanation of its use of negative column values.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 14:35:15 -08:00
8bcb8f8e22 doc: pretty-formats: delineate %<|( parameter values
Commit a57523428b (pretty: support padding placeholders, %< %> and %><,
2013-04-19) introduced column width place holders. It also added
separate column position `%<|(` placeholders for display screen based
placement.

Change the display screen parameter reference from 'N' to 'M' and
corresponding descriptives to make the distinction clearer.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 14:35:15 -08:00
d664a7ad20 doc: pretty-formats: separate parameters from placeholders
Commit a57523428b (pretty: support padding placeholders, %< %> and %><,
2013-04-19) introduced columnated place holders. These placeholders
can be confusing as they contain `<` and `>` characters as part
of their placeholders adjacent to the `<N>` parameters.

Add spaces either side of the `<N>` parameters in the title line.
The code (strtol) will consume any spaces around the number values
(assuming they are passed as a quoted string with spaces).
Note that the spaces are optional.

Subsequent commits will clarify other confusions.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 14:35:15 -08:00
221222b278 Sync with 'maint' 2023-01-19 13:49:38 -08:00
844ede312b Sync with maint-2.38
* maint-2.38:
  attr: adjust a mismatched data type
2023-01-19 13:49:08 -08:00
b78628d426 Sync with maint-2.37
* maint-2.37:
  attr: adjust a mismatched data type
2023-01-19 13:48:26 -08:00
f2027d2626 Sync with maint-2.36
* maint-2.36:
  attr: adjust a mismatched data type
2023-01-19 13:48:17 -08:00
5c1fc48d68 Sync with maint-2.35
* maint-2.35:
  attr: adjust a mismatched data type
2023-01-19 13:48:08 -08:00
c508c30968 Sync with maint-2.34
* maint-2.34:
  attr: adjust a mismatched data type
2023-01-19 13:48:00 -08:00
f39fe8fcb2 Sync with maint-2.33
* maint-2.33:
  attr: adjust a mismatched data type
2023-01-19 13:47:42 -08:00
25d7cb600c Sync with maint-2.32
* maint-2.32:
  attr: adjust a mismatched data type
2023-01-19 13:46:04 -08:00
012e0d76dc Sync with maint-2.31
* maint-2.31:
  attr: adjust a mismatched data type
2023-01-19 13:45:37 -08:00
f8bf6b8f3d Sync with maint-2.30
* maint-2.30:
  attr: adjust a mismatched data type
2023-01-19 13:45:23 -08:00
0227130244 attr: adjust a mismatched data type
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.

Windows and 32-bit Linux are among the affected platforms.

In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)

The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.

Let's address this by passing a pointer to a variable of the expected
data type.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 13:38:06 -08:00
fedb8ea2df checkout: document -b/-B to highlight the differences from "git branch"
The existing text read as if "git checkout -b/-B name" were
equivalent to "git branch [-f] name", which clearly was not
what we wanted to say.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 09:44:08 -08:00
590b636737 hash-object: fix descriptor leak with --literally
In hash_object(), we open a descriptor for each file to hash (whether we
got the filename from the command line or --stdin-paths), but never
close it. For the traditional code path, which feeds the result to
index_fd(), this is OK; it closes the descriptor for us.

But 5ba9a93b39 (hash-object: add --literally option, 2014-09-11) added a
second code path, which does not close the descriptor. There we need to
do so ourselves.

You can see the problem in a clone of git.git like this:

  $ git ls-files -s | grep ^100644 | cut -f2 |
    git hash-object --stdin-paths --literally >/dev/null
  fatal: could not open 'builtin/var.c' for reading: Too many open files

After this patch, it completes successfully. I didn't bother with a
test, as it's a pain to deal with descriptor limits portably, and the
fix is so trivial.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 08:24:21 -08:00
bf08abac56 branch: document -f and linked worktree behaviour
"git branch -f name start" forces to recreate the named branch, but
the forcing does not defeat the "do not touch a branch that is
checked out elsewhere" safety valve.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 23:48:11 -08:00
acabd2048e grep: correctly identify utf-8 characters with \{b,w} in -P
When UTF is enabled for a PCRE match, the corresponding flags are
added to the pcre2_compile() call, but PCRE2_UCP wasn't included.

This prevents extending the meaning of the character classes to
include those new valid characters and therefore result in failed
matches for expressions that rely on that extention, for ex:

  $ git grep -P '\bÆvar'

Add PCRE2_UCP so that \w will include Æ and therefore \b could
correctly match the beginning of that word.

This has an impact on performance that has been estimated to be
between 20% to 40% and that is shown through the added performance
test.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 15:24:52 -08:00
97cf0c7de5 branch: improve advice when --recurse-submodules fails
'git branch --recurse-submodules start from-here' fails if any submodule
present in 'from-here' is not yet cloned (under
submodule.propagateBranches=true). We then give this advice:

   "You may try updating the submodules using 'git checkout from-here && git submodule update --init'"

If 'submodule.recurse' is set, 'git checkout from-here' will also fail since
it will try to recursively checkout the submodules.

Improve the advice by adding '--no-recurse-submodules' to the checkout
command.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 15:13:21 -08:00
69bbbe484b hash-object: use fsck for object checks
Since c879daa237 (Make hash-object more robust against malformed
objects, 2011-02-05), we've done some rudimentary checks against objects
we're about to write by running them through our usual parsers for
trees, commits, and tags.

These parsers catch some problems, but they are not nearly as careful as
the fsck functions (which make sense; the parsers are designed to be
fast and forgiving, bailing only when the input is unintelligible). We
are better off doing the more thorough fsck checks when writing objects.
Doing so at write time is much better than writing garbage only to find
out later (after building more history atop it!) that fsck complains
about it, or hosts with transfer.fsckObjects reject it.

This is obviously going to be a user-visible behavior change, and the
test changes earlier in this series show the scope of the impact. But
I'd argue that this is OK:

  - the documentation for hash-object is already vague about which
    checks we might do, saying that --literally will allow "any
    garbage[...] which might not otherwise pass standard object parsing
    or git-fsck checks". So we are already covered under the documented
    behavior.

  - users don't generally run hash-object anyway. There are a lot of
    spots in the tests that needed to be updated because creating
    garbage objects is something that Git's tests disproportionately do.

  - it's hard to imagine anyone thinking the new behavior is worse. Any
    object we reject would be a potential problem down the road for the
    user. And if they really want to create garbage, --literally is
    already the escape hatch they need.

Note that the change here is actually in index_mem(), which handles the
HASH_FORMAT_CHECK flag passed by hash-object. That flag is also used by
"git-replace --edit" to sanity-check the result. Covering that with more
thorough checks likewise seems like a good thing.

Besides being more thorough, there are a few other bonuses:

  - we get rid of some questionable stack allocations of object structs.
    These don't seem to currently cause any problems in practice, but
    they subtly violate some of the assumptions made by the rest of the
    code (e.g., the "struct commit" we put on the stack and
    zero-initialize will not have a proper index from
    alloc_comit_index().

  - likewise, those parsed object structs are the source of some small
    memory leaks

  - the resulting messages are much better. For example:

      [before]
      $ echo 'tree 123' | git hash-object -t commit --stdin
      error: bogus commit object 0000000000000000000000000000000000000000
      fatal: corrupt commit

      [after]
      $ echo 'tree 123' | git.compile hash-object -t commit --stdin
      error: object fails fsck: badTreeSha1: invalid 'tree' line format - bad sha1
      fatal: refusing to create malformed object

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 12:59:45 -08:00
35ff327e2d fsck: provide a function to fsck buffer without object struct
The fsck code has been slowly moving away from requiring an object
struct in commits like 103fb6d43b (fsck: accept an oid instead of a
"struct tag" for fsck_tag(), 2019-10-18), c5b4269b57 (fsck: accept an
oid instead of a "struct commit" for fsck_commit(), 2019-10-18), etc.

However, the only external interface that fsck.c provides is
fsck_object(), which requires an object struct, then promptly discards
everything except its oid and type. Let's factor out the post-discard
part of that function as fsck_buffer(), leaving fsck_object() as a thin
wrapper around it. That will provide more flexibility for callers which
may not have a struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 12:59:44 -08:00
34959d80db t: use hash-object --literally when created malformed objects
Many test scripts use hash-object to create malformed objects to see how
we handle the results in various commands. In some cases we already have
to use "hash-object --literally", because it does some rudimentary
quality checks. But let's use "--literally" more consistently to
future-proof these tests against hash-object learning to be more
careful.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 12:59:44 -08:00
ad5dfeac04 t7030: stop using invalid tag name
We intentionally invalidate the signature of a tag by switching its tag
name from "seventh" to "7th forged". However, the latter is not a valid
tag name because it contains a space. This doesn't currently affect the
test, but we're better off using something syntactically valid. That
reduces the number of possible failure modes in the test, and
future-proofs us if git hash-object gets more picky about its input.

The t7031 script, which was mostly copied from t7030, has the same
problem, so we'll fix it, too.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 12:59:44 -08:00
61cc4be7ec t1006: stop using 0-padded timestamps
The fake objects in t1006 use dummy timestamps like "0000000000 +0000".
While this does make them look more like normal timestamps (which,
unless it is 1970, have many digits), it actually violates our fsck
checks, which complain about zero-padded timestamps.

This doesn't currently break anything, but let's future-proof our tests
against a version of hash-object which is a little more careful about
its input. We don't actually care about the exact values here (and in
fact, the helper functions in this script end up removing the timestamps
anyway, so we don't even have to adjust other parts of the tests).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 12:59:44 -08:00
6e2646075c t1007: modernize malformed object tests
The tests in t1007 for detecting malformed objects have two
anachronisms:

 - they use "sha1" instead of "oid" in variable names, even though the
   script as a whole has been adapted to handle sha256

 - they use test_i18ngrep, which is no longer necessary

Since we'll be adding a new similar test, let's clean these up so they
are all consistently using the modern style.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 12:59:44 -08:00
8534bb4cb1 git-cat-file.txt: fix list continuations rendering literally
With Asciidoctor, all of the '+' introduced in a797c0ea04 ("cat-file:
add mailmap support to --batch-check option", 2022-12-20) render
literally rather than functioning as list continuations. With asciidoc,
this renders just fine. It's not too surprising that there is room for
ambiguity and surprises here, since we have lists within lists.

Simply replacing all of these '+' with empty lines makes this render
fine using both tools. Except, in the third hunk, where after this inner
'*' list ends, we want to continue with more contents of the outer list
item (`--batch-command=<format>`). We can solve any ambiguity here and
make this clear to both tools by wrapping the inner list in an open
block (using "--").

For consistency, let's wrap all three of these inner lists from
a797c0ea04 in open blocks. This also future-proofs us a little -- if we
ever gain more contents after any of those first two lists, as we did
already in a797c0ea04 for the third list, we're prepared and should
render fine with both asciidoc and Asciidoctor from the start.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-18 08:24:39 -08:00
6269f8eaad treewide: always have a valid "index_state.repo" member
When the "repo" member was added to "the_index" in [1] the
repo_read_index() was made to populate it, but the unpopulated
"the_index" variable didn't get the same treatment.

Let's do that in initialize_the_repository() when we set it up, and
likewise for all of the current callers initialized an empty "struct
index_state".

This simplifies code that needs to deal with "the_index" or a custom
"struct index_state", we no longer need to second-guess this part of
the "index_state" deep in the stack. A recent example of such
second-guessing is the "istate->repo ? istate->repo : the_repository"
code in [2]. We can now simply use "istate->repo".

We're doing this by making use of the INDEX_STATE_INIT() macro (and
corresponding function) added in [3], which now have mandatory "repo"
arguments.

Because we now call index_state_init() in repository.c's
initialize_the_repository() we don't need to handle the case where we
have a "repo->index" whose "repo" member doesn't match the "repo"
we're setting up, i.e. the "Complete the double-reference" code in
repo_read_index() being altered here. That logic was originally added
in [1], and was working around the lack of what we now have in
initialize_the_repository().

For "fsmonitor-settings.c" we can remove the initialization of a NULL
"r" argument to "the_repository". This was added back in [4], and was
needed at the time for callers that would pass us the "r" from an
"istate->repo". Before this change such a change to
"fsmonitor-settings.c" would segfault all over the test suite (e.g. in
t0002-gitfile.sh).

This change has wider eventual implications for
"fsmonitor-settings.c". The reason the other lazy loading behavior in
it is required (starting with "if (!r->settings.fsmonitor) ..." is
because of the previously passed "r" being "NULL".

I have other local changes on top of this which move its configuration
reading to "prepare_repo_settings()" in "repo-settings.c", as we could
now start to rely on it being called for our "r". But let's leave all
of that for now, and narrowly remove this particular part of the
lazy-loading.

1. 1fd9ae517c (repository: add repo reference to index_state,
   2021-01-23)
2. ee1f0c242e (read-cache: add index.skipHash config option,
   2023-01-06)
3. 2f6b1eb794 (cache API: add a "INDEX_STATE_INIT" macro/function,
   add release_index(), 2023-01-12)
4. 1e0ea5c431 (fsmonitor: config settings are repository-specific,
   2022-03-25)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 14:32:06 -08:00
dc71be4fda Merge branch 'ds/omit-trailing-hash-in-index' into ab/cache-api-cleanup-users
* ds/omit-trailing-hash-in-index:
  t1600: fix racy index.skipHash test
2023-01-17 14:31:40 -08:00
73f69f22e5 Merge branch 'ab/cache-api-cleanup' into ab/cache-api-cleanup-users
* ab/cache-api-cleanup:
  cache API: add a "INDEX_STATE_INIT" macro/function, add release_index()
  read-cache.c: refactor set_new_index_sparsity() for subsequent commit
  sparse-index API: BUG() out on NULL ensure_full_index()
  sparse-index.c: expand_to_path() can assume non-NULL "istate"
  builtin/difftool.c: { 0 }-initialize rather than using memset()
2023-01-17 14:31:26 -08:00
6c065f72b8 http: support CURLOPT_PROTOCOLS_STR
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.

Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:

  - we don't have to worry that silencing curl's deprecation warnings
    might cause us to miss other more useful ones

  - we'd eventually want to move to the new variant anyway, so this gets
    us set up (albeit with some extra ugly boilerplate for the
    conditional)

There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:

  GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
  if (...http is allowed...)
	GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);

and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.

On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).

This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 08:03:08 -08:00
fe7e44e1ab http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.

But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros).  But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.

Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).

Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.

Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 08:03:08 -08:00
6956015704 http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 08:03:07 -08:00
42ea7a4150 t1600: fix racy index.skipHash test
The test 1600.6 can fail under --stress due to mtime collisions. Most of
the tests include a removal of the index file to guarantee that the
index is updated. However, the submodule test addded in ee1f0c242e
(read-cache: add index.skipHash config option, 2023-01-06) did not
include this removal. Thus, on rare occasions, the test can fail because
the index still has a non-null trailing hash, as detected by the helper
added in da9acde14e (test-lib-functions: add helper for trailing hash,
2023-01-06).

By removing the submodule's index before the 'git -C sub add a' command,
we guarantee that the index is rewritten with the new index.skipHash
config option.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 07:41:44 -08:00
a7caae2729 Sync with 'maint' 2023-01-17 06:59:22 -08:00
37537d6472 attr: adjust a mismatched data type
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.

Windows and 32-bit Linux are among the affected platforms.

In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)

The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.

Let's address this by passing a pointer to a variable of the expected
data type.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 06:58:20 -08:00
508386c6c5 Sync with 2.39.1 2023-01-16 12:11:58 -08:00
262c45b6a1 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-16 12:07:47 -08:00
eaebc89f88 Merge branch 'jk/strncmp-to-api-funcs'
Code clean-up.

* jk/strncmp-to-api-funcs:
  convert trivial uses of strncmp() to skip_prefix()
  convert trivial uses of strncmp() to starts_with()
2023-01-16 12:07:47 -08:00
3ed618f28f Merge branch 'ar/dup-words-fixes'
Typofixes.

* ar/dup-words-fixes:
  *: fix typos which duplicate a word
2023-01-16 12:07:47 -08:00
ffd9238685 Merge branch 'ds/omit-trailing-hash-in-index'
Introduce an optional configuration to allow the trailing hash that
protects the index file from bit flipping.

* ds/omit-trailing-hash-in-index:
  features: feature.manyFiles implies fast index writes
  test-lib-functions: add helper for trailing hash
  read-cache: add index.skipHash config option
  hashfile: allow skipping the hash function
2023-01-16 12:07:47 -08:00
ab85a7de6d Merge branch 'ws/single-file-cone'
The logic to see if we are using the "cone" mode by checking the
sparsity patterns has been tightened to avoid mistaking a pattern
that names a single file as specifying a cone.

* ws/single-file-cone:
  dir: check for single file cone patterns
2023-01-16 12:07:47 -08:00
1120c54c12 Merge branch 'jk/ext-diff-with-relative'
"git diff --relative" did not mix well with "git diff --ext-diff",
which has been corrected.

* jk/ext-diff-with-relative:
  diff: drop "name" parameter from prepare_temp_file()
  diff: clean up external-diff argv setup
  diff: use filespec path to set up tempfiles for ext-diff
2023-01-16 12:07:46 -08:00
af8a3bb853 Merge branch 'ds/bundle-uri-4'
Code clean-up.

* ds/bundle-uri-4:
  test-bundle-uri: drop unused variables
2023-01-16 12:07:46 -08:00
b242e89dff Merge branch 'tr/am--no-verify'
Conditionally skip the pre-applypatch and applypatch-msg hooks when
applying patches with 'git am'.

* tr/am--no-verify:
  am: allow passing --no-verify flag
2023-01-16 12:07:46 -08:00
763f20fb4a Merge branch 'tb/ci-concurrency'
Avoid unnecessary builds in CI, with settings configured in
ci-config.

* tb/ci-concurrency:
  ci: avoid unnecessary builds
2023-01-16 12:07:46 -08:00
42f9a60013 Merge branch 'pw/ci-print-failure-name-fix'
(cosmetic) CI regression fix.

* pw/ci-print-failure-name-fix:
  ci(github): restore "print test failures" step name
2023-01-16 12:07:45 -08:00
7c7357910b Merge branch 'es/t1509-root-fixes'
Test fixes.

* es/t1509-root-fixes:
  t1509: facilitate repeated script invocations
  t1509: make "setup" test more robust
  t1509: fix failing "root work tree" test due to owner-check
2023-01-16 12:07:45 -08:00
c6ab91335a fsck: document the new gitattributes message IDs
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-16 12:03:14 -08:00
2f6b1eb794 cache API: add a "INDEX_STATE_INIT" macro/function, add release_index()
Hopefully in some not so distant future, we'll get advantages from always
initializing the "repo" member of the "struct index_state". To make
that easier let's introduce an initialization macro & function.

The various ad-hoc initialization of the structure can then be changed
over to it, and we can remove the various "0" assignments in
discard_index() in favor of calling index_state_init() at the end.

While not strictly necessary, let's also change the CALLOC_ARRAY() of
various "struct index_state *" to use an ALLOC_ARRAY() followed by
index_state_init() instead.

We're then adding the release_index() function and converting some
callers (including some of these allocations) over to it if they
either won't need to use their "struct index_state" again, or are just
about to call index_state_init().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-16 10:46:58 -08:00
4433bd24e4 scalar: show progress if stderr refers to a terminal
Sometimes when users use scalar to download a monorepo with a long
commit history, they want to check the progress bar to know how long
they still need to wait during the fetch process, but scalar
suppresses this output by default.

So let's check whether scalar stderr refer to a terminal, if so,
show progress, otherwise disable it.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-16 10:42:22 -08:00
5b8db44bdd format-patch: unleak "-v <num>"
The "subject_prefix" member of "struct revision" usually is set to a
borrowed string (either a string literal like "PATCH" that appear in
the program text as a hardcoded default, or the value of
"format.subjectprefix") and is never freed when the containing
revision structure is released.  The "-v <num>" codepath however
violates this rule and stores a pointer to an allocated string to
this member, relinquishing the responsibility to free it when it is
done using the revision structure, leading to a small one-time leak.

Instead, keep track of the string it allocates to let the revision
structure borrow, and clean it up when it is done.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-16 10:31:45 -08:00
c388fcda99 ls-tree: remove dead store and strbuf for quote_c_style()
Stop initializing "name" because it is set again before use.

Let quote_c_style() write directly to "sb" instead of taking a detour
through "quoted".  This avoids an allocation and a string copy.  The
result is the same because the function only appends.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14 19:22:26 -08:00
16fb5c54bd ls-tree: fix expansion of repeated %(path)
expand_show_tree() borrows the base strbuf given to us by read_tree() to
build the full path of the current entry when handling %(path).  Only
its indirect caller, show_tree_fmt(), removes the added entry name.
That works fine as long as %(path) is only included once in the format
string, but accumulates duplicates if it's repeated:

   $ git ls-tree --format='%(path) %(path) %(path)' HEAD M*
   Makefile MakefileMakefile MakefileMakefileMakefile

Reset the length after each use to get the same expansion every time;
here's the behavior with this patch:

   $ ./git ls-tree --format='%(path) %(path) %(path)' HEAD M*
   Makefile Makefile Makefile

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14 19:22:26 -08:00
dcb47e52b0 t6426: fix TODO about making test more comprehensive
t6426.7 (a rename/add testcase) long had a TODO/FIXME comment about
how the test could be improved (with some commented out sample code
that had a few small errors), but those improvements were blocked on
other changes still in progress.  The necessary changes were put in
place years ago but the comment was forgotten.  Remove and fix the
commented out code section and finally remove the big TODO/FIXME
comment.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14 18:28:56 -08:00
4a1baacd46 env-helper: move this built-in to "test-tool env-helper"
Since [1] there has been no reason for keeping "git env--helper" a
built-in. The reason it was a built-in to begin with was to support
the GIT_TEST_GETTEXT_POISON mode removed in that commit. I.e. unlike
the rest of "test-tool" it would potentially be called by the
installed git via "git-sh-i18n.sh".

As none of that applies since [1] we should stop carrying this
technical debt, and move it to t/helper/*. As this mostly move-only
change shows this has the nice bonus that we'll stop wasting time
translating the internal-only strings it emits.

Even though this was a built-in, it was intentionally never
documented, see its introduction in [2]. It never saw use outside of
the test suite, except for the "GIT_TEST_GETTEXT_POISON" use-case
noted above.

1. d162b25f95 (tests: remove support for GIT_TEST_GETTEXT_POISON,
   2021-01-20)
2. b4f207f339 (env--helper: new undocumented builtin wrapping
   git_env_*(), 2019-06-21)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14 18:07:11 -08:00
47cfc9bd7d attr: add flag --source to work with tree-ish
The contents of the .gitattributes files may evolve over time, but "git
check-attr" always checks attributes against them in the working tree
and/or in the index. It may be beneficial to optionally allow the users
to check attributes taken from a commit other than HEAD against paths.

Add a new flag `--source` which will allow users to check the
attributes against a commit (actually any tree-ish would do). When the
user uses this flag, we go through the stack of .gitattributes files but
instead of checking the current working tree and/or in the index, we
check the blobs from the provided tree-ish object. This allows the
command to also be used in bare repositories.

Since we use a tree-ish object, the user can pass "--source
HEAD:subdirectory" and all the attributes will be looked up as if
subdirectory was the root directory of the repository.

We cannot simply use the `<rev>:<path>` syntax without the `--source`
flag, similar to how it is used in `git show` because any non-flag
parameter before `--` is treated as an attribute and any parameter after
`--` is treated as a pathname.

The change involves creating a new function `read_attr_from_blob`, which
given the path reads the blob for the path against the provided source and
parses the attributes line by line. This function is plugged into
`read_attr()` function wherein we go through the stack of attributes
files.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Co-authored-by: toon@iotcl.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14 08:49:55 -08:00
c847e8c228 t0003: move setup for --all into new block
There is some setup code which is used by multiple tests being setup in
`attribute test: --all option`. This means when we run "sh
./t0003-attributes.sh --run=setup,<num>" there is a chance of failing
since we missed this setup block.

So to ensure that setups are independent of test logic, move this to a
new setup block.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Co-authored-by: toon@iotcl.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14 08:49:55 -08:00
ca554bf36c doc: fix non-existent config name
Replace non-existent `branch.<name>.fetch` to `remote.<repository>.fetch`, in
the first example in `git-fetch` doc, which was introduced in
d504f6975d (modernize fetch/merge/pull examples, 2009-10-21).

Rename placeholder `<name>` to `<repository>`, to be consistent with all other
uses in git docs, except that `git-config.txt` uses `remote.<name>.fetch` in
its "Variables" section.

Also add missing monospace markups.

Signed-off-by: Yukai Chou <muzimuzhi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 17:33:32 -08:00
cf4936ed74 t3104: remove shift code in 'test_ls_tree_format'
In t3104-ls-tree-format.sh, There is a legacy 'shift 2' code
and the relevant code block no longer depends on it anymore,
so let's remove it for a small cleanup.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 15:09:23 -08:00
925a7c6b6b ls-tree: cleanup the redundant SPACE
An redundant space was found in ls-tree.c, which is no doubt
a small change, but it might be OK to make a commit on its own.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 15:09:23 -08:00
e6c75d8dd7 ls-tree: make "line_termination" less generic
The "ls-tree" command isn't capable of ending "lines" with anything
except '\n' or '\0', and in the latter case we can avoid calling
write_name_quoted_relative() entirely. Let's do that, less for
optimization and more for clarity, the write_name_quoted_relative()
API itself does much the same thing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 15:09:23 -08:00
65d1f6c9fa ls-tree: fold "show_tree_data" into "cb" struct
After the the preceding two commits the only user of the
"show_tree_data" struct needed it along with the "options" member,
let's instead fold all of that into a "show_tree_data" struct that
we'll use only for that callback.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 15:09:23 -08:00
030a3d5d9e ls-tree: use a "struct options"
As a first step towards being able to turn this code into an API some
day let's change the "static" options in builtin/ls-tree.c into a
"struct ls_tree_options" that can be constructed dynamically without
the help of parse_options().

Because we're now using non-static variables for this we'll need to
clear_pathspec() at the end of cmd_ls_tree(), least various tests
start failing under SANITIZE=leak. The memory leak was already there
before, now it's just being brought to the surface.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 15:09:22 -08:00
7677417b57 ls-tree: don't use "show_tree_data" for "fast" callbacks
As noted in [1] the code that made it in as part of
9c4d58ff2c (ls-tree: split up "fast path" callbacks, 2022-03-23) was
a "maybe a good idea, maybe not" RFC-quality patch. I hadn't looked
very carefully at the resulting patterns.

The implementation shared the "struct show_tree_data data", which was
introduced in e81517155e (ls-tree: introduce struct "show_tree_data",
2022-03-23) both for use in 455923e0a1 (ls-tree: introduce "--format"
option, 2022-03-23), and because the "fat" callback hadn't been split
up as 9c4d58ff2c did.

Now that that's been done we can see that most of what
show_tree_common() was doing could be done lazily by the callbacks
themselves, who in the pre-image were often using an odd mis-match of
their own arguments and those same arguments stuck into the "data"
structure. Let's also have the callers initialize the "type", rather
than grabbing it from the "data" structure afterwards.

1. https://lore.kernel.org/git/cover-0.7-00000000000-20220310T134811Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyronteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 15:09:22 -08:00
de54b5fec4 bisect: no longer try to clean up left-over .git/head-name files
As per the code comment, the `.git/head-name` files were cleaned up for
backwards-compatibility: an old version of `git bisect` could have left
them behind.

Now, just how old would such a version be? As of 0f497e75f0 (Eliminate
confusing "won't bisect on seeked tree" failure, 2008-02-23), `git
bisect` does not write that file anymore. Which corresponds to Git
v1.5.4.4.

Even if the likelihood is non-nil that there might still be users out
there who use such an old version to start a bisection, but then decide
to continue bisecting with a current Git version, it is highly
improbable.

So let's remove that code, at long last.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 14:17:14 -08:00
70d3dbfea9 bisect: remove Cogito-related code
Once upon a time, there was this idea that Git would not actually be a
single coherent program, but rather a set of low-level programs that
users cobble together via shell scripts, or develop high-level user
interfaces for Git, or both.

Cogito was such a high-level user interface, incidentally implemented
via shell scripts that cobble together Git calls.

It did turn out relatively quickly that Git would much rather provide a
useful high-level user interface itself.

As of April 19th, 2007, Cogito was therefore discontinued (see
https://lore.kernel.org/git/20070419124648.GL4489@pasky.or.cz/).

Nevertheless, for almost 15 years after that announcement, Git carried
special code in `git bisect` to accommodate Cogito.

Since it is beyond doubt that there are no more Cogito users, let's
remove the last remnant of Cogito-accommodating code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 14:17:14 -08:00
4de06fbd56 bisect run: fix the error message
In d1bbbe45df (bisect--helper: reimplement `bisect_run` shell function
in C, 2021-09-13), we ported the `bisect run` subcommand to C, including
the part that prints out an error message when the implicit `git bisect
bad` or `git bisect good` failed.

However, the error message was supposed to print out whether the state
was "good" or "bad", but used a bogus (because non-populated) `args`
variable for it. This was fixed in [1], but as of [2] (when
`bisect--helper` was changed to the present `bisect-state') the error
message still talks about implementation details that should not
concern end users.

Fix that, and add a regression test to ensure that the intended form of
the error message.

1. 80c2e9657f (bisect--helper: report actual bisect_state() argument
   on error, 2022-01-18
2. f37d0bdd42 (bisect: fix output regressions in v2.30.0, 2022-11-10)

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 14:17:14 -08:00
2f645b33ba bisect: verify that a bogus option won't try to start a bisection
We do not want `git bisect --bogus-option` to start a bisection. To
verify that, we look for the tell-tale error message `You need to start
by "git bisect start"` and fail if it was found.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 14:17:14 -08:00
6f97792285 bisect--helper: make the order consistently argc, argv
In C, the natural order is for `argc` to come before `argv` by virtue of
the `main()` function declaring the parameters in precisely that order.

It is confusing & distracting, then, when readers familiar with the C
language read code where that order is switched around.

Let's just change the order and avoid that type of developer friction.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 14:17:13 -08:00
7a8d7aaa47 bisect--helper: simplify exit code computation
We _already_ have a function to determine whether a given `enum
bisect_error` value is non-zero but still _actually_ indicates success.

Let's use it instead of duplicating the logic.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 14:17:13 -08:00
ebdc46c242 docs: link generating patch sections
Currently, in the git-log documentation, the reference to generating
patches does not match the section title. This can make the section
"Generating patch text with -p" hard to find, since typically readers of
the documentation will copy and paste to search the page.

Let's make this more convenient for readers by linking it directly to
the section.

Since git-log pulls in diff-generate-patch.txt, we can provide a direct
link to the section. Otherwise, change the verbiage to match exactly
what the section title is, to at least make searching for it an easier
task.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 12:55:14 -08:00
e57d2c5937 rebase: cleanup "--exec" option handling
When handling "--exec" rebase collects the commands into a struct
string_list, then prepends "exec " to each command creating a multi line
string and finally splits that string back into a list of commands. This
is an artifact of the scripted rebase and the need to support "rebase
--preserve-merges". Now that "--preserve-merges" no-longer exists we can
cleanup the way the argument is handled. There is no need to add the
"exec " prefix to the commands as that is added by todo_list_to_strbuf().

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 12:23:14 -08:00
a87a20cbb4 t7527: use test_when_finished in 'case insensitive+preserving'
Most tests in t7527-builtin-fsmonitor.sh that start a daemon, use the
helper function test_when_finished with stop_daemon_delete_repo.
Function stop_daemon_delete_repo explicitly stops the daemon.  Calling
it via test_when_finished is needed for tests that don't check daemon's
automatic shutdown logic [1] and it is needed to avoid daemons being
left running in case of breakage of the logic of automatic shutdown of
the daemon.

Unlike these tests, test 'case insensitive+preserving' added in [2] has
a call to function test_when_finished commented out.  It was commented
out in all versions of the patch [2] during development [3].  This seems
to not be intentional, because neither commit message in [2], nor the
comment above the test mention this line being commented out.  Compare
it, for example, to "# unicode_debug=true" which is explicitly described
by a documentation comment above it.

Uncomment test_when_finished for stop_daemon_delete_repo in test 'case
insensitive+preserving' to ensure that daemons are not left running in
cases when automatic shutdown logic of daemon itself is broken.

[1] See documentation in "fsmonitor--daemon.h" for details.
[2] caa9c37ec0 (t7527: test FSMonitor on case insensitive+preserving
    file system, 2022-05-26)
[3] See mailing list thread
    https://lore.kernel.org/git/41f8cbc2ae45cb86e299eb230ad3cb0319256c37.1653601644.git.gitgitgadget@gmail.com/T/#t

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 12:06:10 -08:00
5da4597297 t6422: drop commented out code
In commit [1] tests in t6422-merge-rename-corner-cases.sh were
refactored to not run setup steps separately.  This included replacing
all tests like

	test_expect_success "setup ..." '
		<code of setup>
	'

with corresponding Shell functions

	test_setup_... () {
		<code of setup>
	}

During this replacement first and last lines of one of such tests got
left commented out in code.  Drop these lines to avoid confusion.

[1] da1e295e00 (t604[236]: do not run setup in separate tests, 2019-10-22)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 12:05:47 -08:00
b3594800eb t6003: uncomment test '--max-age=c3, --topo-order'
Test '--max-age=c3, --topo-order' in t6003-rev-list-topo-order.sh has
been commented out as failing since its introduction in [1].  However,
the test is successful at least since commit [2] -- bisecting further is
harder because of incompatibility of such old Git code with modern
header file <openssl/bn.h> [3].

Uncomment this test to gain test coverage.

[1] f573571a21 ([PATCH] Add t/t6003 with some --topo-order tests,
    2005-07-07)
[2] 765ac8ec46 (Rip out merge-order and make "git log <paths>..." work
    again., 2006-02-28)
[3] BIGNUM used in git's `epoch.c` which was removed in [2] changed
    significantly between OpenSSL 1.0.2 and OpenSSL 1.1.0
    See also https://stackoverflow.com/a/42295243/1083697 and
    https://lore.kernel.org/git/Y71qiCs+oAS2OegH@coredump.intra.peff.net/

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 12:05:41 -08:00
f5156f1885 git-bisect-lk2009: update nist report link
Commit d656218a83 (docs/bisect-lk2009: update nist report link,
2017-04-20) replaced a dead link to news release on nist.gov.  However,
this might be confusing to the reader (like myself) because the article
git-bisect-lk2009.txt quotes from the news release but the exact quote
cannot be found in the full report.  In addition to that, the link added
in 2017 is also dead in 2023.

Replace the reference to nist.gov with an version of the original NIST
news release archived to the Wayback Machine.  Include also an updated
link to a live version of the full report.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:58:51 -08:00
18ecb23c4b git-bisect-lk2009: update java code conventions link
A reference to Java Code Conventions in git-bisect-lk2009.txt uses an
outdated URL that redirects to table of contents for the conventions.
The actual claim about "80%" that this reference backs up is on the
first page of the conventions:

  https://www.oracle.com/java/technologies/javase/codeconventions-introduction.html

Use this newer URL and its title in the reference.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:58:51 -08:00
e750951e74 ls-files: guide folks to --exclude-standard over other --exclude* options
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:55:17 -08:00
4173b806c7 ls-files: clarify descriptions of status tags for -t
Much like the file selection options we tweaked in the last commit, the
status tags printed with -t had descriptions that were easy to
misunderstand, and for many of the same reasons.  Clarify them.

Also, while at it, remove the "semi-deprecated" comment for "git
ls-files -t".  The -t option was marked as semi-deprecated in 5bc0e247c4
("Document ls-files -t as semi-obsolete.", 2010-07-28) because:

    "git ls-files -t" is [...] badly documented, hence we point the
    users to superior alternatives.
    The feature is marked as "semi-obsolete" but not "scheduled for removal"
    since it's a plumbing command, scripts might use it, and Git testsuite
    already uses it to test the state of the index.

Marking it as obsolete because it was easily misunderstood, which I
think was primarily due to documentation problems, is one strategy, but
I think fixing the documentation is a better option.  Especially since
in the intervening time, "git ls-files -t" has become heavily used by
sparse-checkout users where the same confusion just doesn't apply.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:55:17 -08:00
2b02d2df2b ls-files: clarify descriptions of file selection options
The previous descriptions of the file selection options were very easy
to misunderstand.  For example:

  * "Show cached files in the output"
    This could be interpreted as meaning "show files which have been
    modified and git-add'ed, i.e. files which have cached changes
    relative to HEAD".

  * "Show deleted files"
    This could be interpreted as meaning "for each `git rm $FILE` we
    ran, show me $FILE"

  * "Show modified files"
    This could be interpreted as meaning "show files which have been
    modified and git-add'ed" or as "show me files that differ from HEAD"
    or as "show me undeleted files different from HEAD" (given that
    --deleted is a separate option), none of which are correct.

Further, it's not very clear when some options only modify and/or
override other options, as was the case with --ignored, --directory, and
--unmerged (I've seen folks confused by each of them on the mailing
list, sometimes even fellow git developers.)

Tweak these definitions, and the one for --killed, to try to make them
all a bit more clear.  Finally, also clarify early on that duplicate
reports for paths are often expected (both when (a) there are multiple
entries for the file in the index -- i.e. when there are conflicts, and
also (b) when the user specifies options that might pick the same file
multiple times, such as `git ls-files --cached --deleted --modified`
when there is a file with an unstaged deletion).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:55:16 -08:00
2a34b3181d ls-files: add missing documentation for --resolve-undo option
ls-files' --resolve-undo option has existed ever since 9d9a2f4aba
("resolve-undo: basic tests", 2009-12-25), but was never documented.
However, the option has been referred to in the ls-files manual itself
ever since ce74de931d ("ls-files: introduce "--format" option",
2022-07-23), making its omission a bit jarring.  Document this option.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:55:16 -08:00
b56be49984 date.c: allow ISO 8601 reduced precision times
ISO 8601 permits "reduced precision" time representations to omit the
seconds value or both the minutes and the seconds values.  The
abbreviate times could look like 17:45 or 1745 to omit the seconds,
or simply as 17 to omit both the minutes and the seconds.

parse_date_basic accepts the 17:45 format but it rejects the other two.
Change it to accept 4-digit and 2-digit time values when they follow a
recognized date and a 'T'.

Before this change:

$ TZ=UTC test-tool date approxidate 2022-12-13T23:00 2022-12-13T2300 2022-12-13T23
2022-12-13T23:00 -> 2022-12-13 23:00:00 +0000
2022-12-13T2300 -> 2022-12-13 23:54:13 +0000
2022-12-13T23 -> 2022-12-13 23:54:13 +0000

After this change:

$ TZ=UTC helper/test-tool date approxidate 2022-12-13T23:00 2022-12-13T2300 2022-12-13T23
2022-12-13T23:00 -> 2022-12-13 23:00:00 +0000
2022-12-13T2300 -> 2022-12-13 23:00:00 +0000
2022-12-13T23 -> 2022-12-13 23:00:00 +0000

Note: ISO 8601 also allows reduced precision date strings such as
"2022-12" and "2022". This patch does not attempt to address these.

Reported-by: Pat LaVarre <plavarre@purestorage.com>
Signed-off-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:49:04 -08:00
fca2d86c97 t/interop: report which vanilla git command failed
The interop test library sets up wrappers "git.a" and "git.b" to
represent the two versions to be tested. It also wraps vanilla "git" to
report an error, with the goal of catching tests which accidentally fail
to use one of the version-specific wrappers (which could invalidate the
tests in a very subtle way).

But when it catches an invocation of vanilla git, it doesn't give any
details, which makes it very hard to debug exactly which invocation is
responsible (especially if it's buried in a function invocation, etc).
Let's report the arguments passed to git, which helps narrow it down.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 11:48:24 -08:00
5bdf6d4ac0 read-cache.c: refactor set_new_index_sparsity() for subsequent commit
Refactor code added to set_new_index_sparsity() in [1] to eliminate
indentation resulting from putting the body of his function within the
"if" block. Let's instead return early if we have no
istate->repo. This trivial change makes the subsequent commit's diff
smaller.

1. 491df5f679 (read-cache: set sparsity when index is new, 2022-05-10)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 10:36:58 -08:00
29fefafcba sparse-index API: BUG() out on NULL ensure_full_index()
Make the ensure_full_index() function stricter, and have it only
accept a non-NULL "struct index_state". This function (and this
behavior) was added in [1].

The only reason it needed to be this lax was due to interaction with
repo_index_has_changes(). See the addition of that code in [2].

The other reason for why this was needed dates back to interaction
with code added in [3]. In [4] we started calling ensure_full_index()
in unpack_trees(), but the caller added in 34110cd4e3 wants to pass
us a NULL "dst_index". Let's instead do the NULL check in
unpack_trees() itself.

1. 4300f8442a (sparse-index: implement ensure_full_index(), 2021-03-30)
2. 0c18c059a1 (read-cache: ensure full index, 2021-04-01)
3. 34110cd4e3 (Make 'unpack_trees()' have a separate source and
   destination index, 2008-03-06)
4. 6863df3550 (unpack-trees: ensure full index, 2021-03-30)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 10:36:57 -08:00
d2cdf2c285 sparse-index.c: expand_to_path() can assume non-NULL "istate"
This function added in [1] was subsequently used in [2]. All of the
calls to it are in name-hash.c, and come after calls to
lazy_init_name_hash(istate). The first thing that function does is:

	if (istate->name_hash_initialized)
		return;

So we can already assume that we have a non-NULL "istate" here, or
we'd be segfaulting. Let's not confuse matters by making it appear
that's not the case.

1. 71f82d032f (sparse-index: expand_to_path(), 2021-04-12)
2. 4589bca829 (name-hash: use expand_to_path(), 2021-04-12)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 10:36:57 -08:00
0dda3ac925 builtin/difftool.c: { 0 }-initialize rather than using memset()
Refactor an initialization of a variable added in
03831ef7b5 (difftool: implement the functionality in the builtin,
2017-01-19). This refactoring makes a subsequent change smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 10:36:57 -08:00
0c75692ebc merge: break out of all_strategy loop when strategy is found
Once we find a match, there is no point to try finding the second
match in the inner loop.  Break out of the loop once we find the
first match.

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 10:24:57 -08:00
772f8ff826 githooks: discuss Git operations in foreign repositories
Hook authors are periodically caught off-guard by difficult-to-diagnose
errors when their hook invokes Git commands in a repository other than
the local one. In particular, Git environment variables, such as GIT_DIR
and GIT_WORK_TREE, which reference the local repository cause the Git
commands to operate on the local repository rather than on the
repository which the author intended. This is true whether the
environment variables have been set manually by the user or
automatically by Git itself. The same problem crops up when a hook
invokes Git commands in a different worktree of the same repository, as
well.

Recommended best-practice[1,2,3,4,5,6] for avoiding this problem is for
the hook to ensure that Git variables are unset before invoking Git
commands in foreign repositories or other worktrees:

    unset $(git rev-parse --local-env-vars)

However, this advice is not documented anywhere. Rectify this
shortcoming by mentioning it in githooks.txt documentation.

[1]: https://lore.kernel.org/git/YFuHd1MMlJAvtdzb@coredump.intra.peff.net/
[2]: https://lore.kernel.org/git/20200228190218.GC1408759@coredump.intra.peff.net/
[3]: https://lore.kernel.org/git/20190516221702.GA11784@sigill.intra.peff.net/
[4]: https://lore.kernel.org/git/20190422162127.GC9680@sigill.intra.peff.net/
[5]: https://lore.kernel.org/git/20180716183942.GB22298@sigill.intra.peff.net/
[6]: https://lore.kernel.org/git/20150203163235.GA9325@peff.net/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:59:26 -08:00
9e37969e4b doc: add "git switch -c" as another option on detached HEAD
In the "DETACHED HEAD" section in the git-checkout doc, it suggests
using "git checkout -b <branch-name>" to create a new branch on the
detached head.

On the other hand, when you checkout a commit that is not at the tip of
any named branch (e.g., when you checkout a tag), git suggests using
"git switch -c <branch-name>".

Add "git switch -c" as another option and mitigate this inconsistency.

Signed-off-by: Yutaro Ohno <yutaro.ono.418@gmail.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:57:40 -08:00
f1c9243fc5 git-rebase.txt: add a note about 'ORIG_HEAD' being overwritten
'ORIG_HEAD' is written at the start of the rebase, but is not guaranteed
to still point to the original branch tip at the end of the rebase.

Indeed, using other commands that write 'ORIG_HEAD' during the rebase,
like splitting a commit using 'git reset HEAD^', will lead to 'ORIG_HEAD'
being overwritten. This causes confusion for some users [1].

Add a note about that in the 'Description' section, and mention the more
robust alternative of using the branch's reflog.

[1] https://lore.kernel.org/git/28ebf03b-e8bb-3769-556b-c9db17e43dbb@gmail.com/T/#m827179c5adcfb504d67f76d03c8e6942b55e5ed0

Reported-by: Erik Cervin Edin <erik@cervined.in>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:55:46 -08:00
c6eec9cb36 revisions.txt: be explicit about commands writing 'ORIG_HEAD'
When mentioning 'ORIG_HEAD', be explicit about which command write that
pseudo-ref, namely 'git am', 'git merge', 'git rebase' and 'git reset'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:55:46 -08:00
0c514d5766 git-merge.txt: mention 'ORIG_HEAD' in the Description
The fact that 'git merge' writes 'ORIG_HEAD' before performing the merge
is missing from the documentation of the command.

Mention it in the 'Description' section.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:55:46 -08:00
d03c773cf6 git-reset.txt: mention 'ORIG_HEAD' in the Description
The fact that 'git reset' writes 'ORIG_HEAD' before changing HEAD is
mentioned in an example, but is missing from the 'Description' section.

Mention it in the discussion of the "'git reset' [<mode>] [<commit>]"
form of the command.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:55:45 -08:00
e29678bb7c git-cherry-pick.txt: do not use 'ORIG_HEAD' in example
Commit 67ac1e1d57 (cherry-pick/revert: add support for
-X/--strategy-option, 2010-12-10) added an example to the documentation
of 'git cherry-pick'. This example mentions how to abort a failed
cherry-pick and retry with an additional merge strategy option.

The command used in the example to abort the cherry-pick is 'git reset
--merge ORIG_HEAD', but cherry-pick does not write 'ORIG_HEAD' before
starting its operation. So this command would checkout a commit
unrelated to what was at HEAD when the user invoked cherry-pick.

Use 'git cherry-pick --abort' instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:55:45 -08:00
15b63689a1 object-file: fix indent-with-space
Commit b25562e63f (object-file: inline calls to read_object(),
2023-01-07) accidentally indented a conditional block with spaces
instead of a tab.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 09:36:15 -08:00
6e57841096 use DUP_ARRAY
Add a semantic patch for replace ALLOC_ARRAY+COPY_ARRAY with DUP_ARRAY
to reduce code duplication and apply its results.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-09 13:28:36 +09:00
d2ec87a684 add DUP_ARRAY
Add a macro for allocating and populating a shallow copy of an array.
It is intended to replace a sequence like this:

   ALLOC_ARRAY(dst, n);
   COPY_ARRAY(dst, src, n);

With the less repetitve:

   DUP_ARRAY(dst, src, n);

It checks whether the types of source and destination are compatible to
ensure the copy can be used safely.

An easier alternative would be to only consider the source and return
a void pointer, that could be used like this:

   dst = ARRAY_DUP(src, n);

That would be more versatile, as it could be used in declarations as
well.  Making it type-safe would require the use of typeof_unqual from
C23, though.

So use the safe and compatible variant for now.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-09 13:28:36 +09:00
08e8c26665 do full type check in BARF_UNLESS_COPYABLE
Use __builtin_types_compatible_p to perform a full type check if
possible.  Otherwise fall back to the old size comparison, but add a
non-evaluated assignment to catch more type mismatches.  It doesn't flag
copies between arrays with different signedness, but that's as close to
a full type check as it gets without the builtin, as far as I can see.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-09 13:28:36 +09:00
1891846fa4 factor out BARF_UNLESS_COPYABLE
Move the common basic element type check of COPY_ARRAY and MOVE_ARRAY to
a new macro.  This reduces code duplication and simplifies adding more
elaborate checks.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-09 13:28:36 +09:00
09884f352e mingw: make argv2 in try_shell_exec() non-const
Prepare for a stricter type check in COPY_ARRAY by removing the const
qualifier of argv2, like we already do to placate Visual Studio.  We
have to add it back using explicit casts when actually using the
variable, unfortunately, because GCC (rightly) refuses to add it
implicitly.  Similar casts are already used in mingw_execv().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-09 13:28:21 +09:00
a38d39a4c5 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 13:25:20 +09:00
7ec4cccaa5 Merge branch 'cw/ci-whitespace'
CI updates.  We probably want a clean-up to move the long shell
script embedded in yaml file into a separate file, but that can
come later.

* cw/ci-whitespace:
  ci (check-whitespace): move to actions/checkout@v3
  ci (check-whitespace): add links to job output
  ci (check-whitespace): suggest fixes for errors
2023-01-08 13:25:20 +09:00
bfc7ef3554 Merge branch 'js/drop-mingw-test-cmp'
Use `git diff --no-index` as a test_cmp on Windows.

We'd probably need to revisit "do we really want to, and have to,
lose CRLF vs LF?" later, at which time we may be able to further
clean this up by replacing "git diff --no-index" with "diff -u".

* js/drop-mingw-test-cmp:
  tests(mingw): avoid very slow `mingw_test_cmp`
2023-01-08 13:25:19 +09:00
37449fbeb5 Merge branch 'js/ci-disable-cmake-by-default'
Stop running win+VS build by default.

* js/ci-disable-cmake-by-default:
  ci: only run win+VS build & tests in Git for Windows' fork
2023-01-08 13:25:19 +09:00
c2f32bef9c packfile: inline custom read_object()
When the pack code was split into its own file[1], it got a copy of the
static read_object() function. But there's only one caller here, so we
could just inline it. And it's worth doing so, as the name read_object()
invites comparisons to the public read_object_file(), but the two don't
behave quite the same.

[1] The move happened over several commits, but the relevant one here is
    f1d8130be0 (pack: move clear_delta_base_cache(), packed_object_info(),
    unpack_entry(), 2017-08-18).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:52:55 +09:00
0ba05cf2e0 repo_read_object_file(): stop wrapping read_object_file_extended()
The only caller of read_object_file_extended() is the thin wrapper of
repo_read_object_file(). Instead of wrapping, let's just rename the
inner function and let people call it directly. This cleans up the
namespace and reduces confusion.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:52:55 +09:00
7be13f5f74 read_object_file_extended(): drop lookup_replace option
Our sole caller always passes in "1", so we can just drop the parameter
entirely. Anybody who doesn't want this behavior could easily call
oid_object_info_extended() themselves, as we're just a thin wrapper
around it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:52:55 +09:00
34728d7f30 streaming: inline call to read_object_file_extended()
The open_istream_incore() function is the only direct user of
read_object_file_extended(), and the only caller which unsets the
lookup_replace flag. Since read_object_file_extended() is now just a
thin wrapper around oid_object_info_extended(), let's inline the call.
That will let us simplify read_object_file_extended() in the next patch.

The inlined version here is a few more lines because of the query setup,
but it's much more flexible, since we can pass (or omit) any flags we
want.

Note the updated comment in the istream struct definition. It was
already slightly wrong (we never called read_object(); it has been
read_object_file_extended() since day one), but should now be accurate.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:52:54 +09:00
b25562e63f object-file: inline calls to read_object()
Since read_object() is these days just a thin wrapper around
oid_object_info_extended(), and since it only has two callers, let's
just inline those calls. This has a few positive outcomes:

  - it's a net reduction in source code lines

  - even though the callers end up with a few extra lines, they're now
    more flexible and can use object_info flags directly. So no more
    need to convert die_if_corrupt between parameter/flag, and we can
    ask for lookup replacement with a flag rather than doing it
    ourselves.

  - there's one fewer function in an already crowded namespace (e.g.,
    the difference between read_object() and read_object_file() was not
    immediately obvious; now we only have one of them).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:52:54 +09:00
d43b99322b convert trivial uses of strncmp() to skip_prefix()
As with the previous patch, using skip_prefix() is more readable and
less error-prone than a raw strncmp(), because it avoids a
manually-computed length. These cases differ from the previous patch
that uses starts_with() because they care about the value after the
matched prefix.

We can convert these to use skip_prefix() by introducing an extra
variable to hold the out-pointer.

Note in the case in ws.c that to get rid of the magic number "9"
completely, we also switch out "len" for recomputing the pointer
difference. These are equivalent because "len" is always "ep - string".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:34:37 +09:00
20869d1a1d convert trivial uses of strncmp() to starts_with()
It's more readable to use starts_with() instead of strncmp() to match a
prefix, as the latter requires a manually-computed length, and has the
funny "matching is zero" return value common to cmp functions.  This
patch converts several cases which were found with:

  git grep 'strncmp(.*, [0-9]*)'

But note that it doesn't convert all such cases. There are several where
the magic length number is repeated elsewhere in the code, like:

  /* handle "buf" which isn't NUL-terminated and might be too small */
  if (len >= 3 && !strncmp(buf, "foo", 3))

or:

  /* exact match for "foo", but within a larger string */
  if (end - buf == 3 && !strncmp(buf, "foo", 3))

While it would not produce the wrong outcome to use starts_with() in
these cases, we'd still be left with one instance of "3". We're better
to leave them for now, as the repeated "3" makes it clear that the two
are linked (there may be other refactorings that handle both, but
they're out of scope for this patch).

A few things to note while reading the patch:

  - all cases but one are trying to match, and so lose the extra "!".
    The case in the first hunk of urlmatch.c is not-matching, and hence
    gains a "!".

  - the case in remote-fd.c is matching the beginning of "connect foo",
    but we never look at str+8 to parse the "foo" part (which would make
    this a candidate for skip_prefix(), not starts_with()). This seems
    at first glance like a bug, but is a limitation of how remote-fd
    works.

  - the second hunk in urlmatch.c shows some cases adjacent to other
    strncmp() calls that are left. These are of the "exact match within
    a larger string" type, as described above.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:34:35 +09:00
b39a84185e *: fix typos which duplicate a word
Fix typos in code comments which repeat various words.  Most of the
cases are simple in that they repeat a word that usually cannot be
repeated in a grammatically correct sentence.  Just remove the
incorrectly duplicated word in these cases and rewrap text, if needed.

A tricky case is usage of "that that", which is sometimes grammatically
correct.  However, an instance of this in "t7527-builtin-fsmonitor.sh"
doesn't need two words "that", because there is only one daemon being
discussed, so replace the second "that" with "the".

Reword code comment "entries exist on on-disk index" in function
update_one in file cache-tree.c, by replacing incorrect preposition "on"
with "in".

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:28:34 +09:00
54463d32ef use enhanced basic regular expressions on macOS
When 1819ad327b (grep: fix multibyte regex handling under macOS,
2022-08-26) started to use the native regex library instead of Git's
own (compat/regex/), it lost support for alternation in basic
regular expressions.

Bring it back by enabling the flag REG_ENHANCED on macOS when
compiling basic regular expressions.

Reported-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-08 10:06:34 +09:00
17194b195d features: feature.manyFiles implies fast index writes
The recent addition of the index.skipHash config option allows index
writes to speed up by skipping the hash computation for the trailing
checksum. This is particularly critical for repositories with many files
at HEAD, so add this config option to two cases where users in that
scenario may opt-in to such behavior:

 1. The feature.manyFiles config option enables some options that are
    helpful for repositories with many files at HEAD.

 2. 'scalar register' and 'scalar reconfigure' set config options that
    optimize for large repositories.

In both of these cases, set index.skipHash=true to gain this
speedup. Add tests that demonstrate the proper way that
index.skipHash=true can override feature.manyFiles=true.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-07 07:46:14 +09:00
da9acde14e test-lib-functions: add helper for trailing hash
It can be helpful to check that a file format with a trailing hash has a
specific hash in the final bytes of a written file. This is made more
apparent by recent changes that allow skipping the hash algorithm and
writing a null hash at the end of the file instead.

Add a new test_trailing_hash helper and use it in t1600 to verify that
index.skipHash=true really does skip the hash computation, since
'git fsck' does not actually verify the hash. This confirms that when
the config is disabled explicitly in a super project but enabled in a
submodule, then the use of repo_config_get_bool() loads config from the
correct repository in the case of 'git add'. There are other cases where
istate->repo is NULL and thus this config is loaded instead from
the_repository, but that's due to many different code paths initializing
index_state structs in their own way.

Keep the 'git fsck' call to ensure that any potential future change to
check the index hash does not cause an error in this case.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-07 07:46:14 +09:00
ee1f0c242e read-cache: add index.skipHash config option
The previous change allowed skipping the hashing portion of the
hashwrite API, using it instead as a buffered write API. Disabling the
hashwrite can be particularly helpful when the write operation is in a
critical path.

One such critical path is the writing of the index. This operation is so
critical that the sparse index was created specifically to reduce the
size of the index to make these writes (and reads) faster.

This trade-off between file stability at rest and write-time performance
is not easy to balance. The index is an interesting case for a couple
reasons:

1. Writes block users. Writing the index takes place in many user-
   blocking foreground operations. The speed improvement directly
   impacts their use. Other file formats are typically written in the
   background (commit-graph, multi-pack-index) or are super-critical to
   correctness (pack-files).

2. Index files are short lived. It is rare that a user leaves an index
   for a long time with many staged changes. Outside of staged changes,
   the index can be completely destroyed and rewritten with minimal
   impact to the user.

Following a similar approach to one used in the microsoft/git fork [1],
add a new config option (index.skipHash) that allows disabling this
hashing during the index write. The cost is that we can no longer
validate the contents for corruption-at-rest using the trailing hash.

[1] 21fed2d914

We load this config from the repository config given by istate->repo,
with a fallback to the_repository if it is not set.

While older Git versions will not recognize the null hash as a special
case, the file format itself is still being met in terms of its
structure. Using this null hash will still allow Git operations to
function across older versions.

The one exception is 'git fsck' which checks the hash of the index file.
This used to be a check on every index read, but was split out to just
the index in a33fc72fe9 (read-cache: force_verify_index_checksum,
2017-04-14) and released first in Git 2.13.0. Document the versions that
relaxed these restrictions, with the optimistic expectation that this
change will be included in Git 2.40.0.

Here, we disable this check if the trailing hash is all zeroes. We add a
warning to the config option that this may cause undesirable behavior
with older Git versions.

As a quick comparison, I tested 'git update-index --force-write' with
and without index.skipHash=true on a copy of the Linux kernel
repository.

Benchmark 1: with hash
  Time (mean ± σ):      46.3 ms ±  13.8 ms    [User: 34.3 ms, System: 11.9 ms]
  Range (min … max):    34.3 ms …  79.1 ms    82 runs

Benchmark 2: without hash
  Time (mean ± σ):      26.0 ms ±   7.9 ms    [User: 11.8 ms, System: 14.2 ms]
  Range (min … max):    16.3 ms …  42.0 ms    69 runs

Summary
  'without hash' ran
    1.78 ± 0.76 times faster than 'with hash'

These performance benefits are substantial enough to allow users the
ability to opt-in to this feature, even with the potential confusion
with older 'git fsck' versions.

Test this new config option, both at a command-line level and within a
submodule. The confirmation is currently limited to confirm that 'git
fsck' does not complain about the index. Future updates will make this
test more robust.

It is critical that this test is placed before the test_index_version
tests, since those tests obliterate the .git/config file and hence lose
the setting from GIT_TEST_DEFAULT_HASH, if set.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-07 07:46:14 +09:00
1687150b5d hashfile: allow skipping the hash function
The hashfile API is useful for generating files that include a trailing
hash of the file's contents up to that point. Using such a hash is
helpful for verifying the file for corruption-at-rest, such as a faulty
drive causing flipped bits.

Git's index file includes this trailing hash, so it uses a 'struct
hashfile' to handle the I/O to the file. This was very convenient to
allow using the hashfile methods during these operations.

However, hashing the file contents during write comes at a performance
penalty. It's slower to hash the bytes on their way to the disk than
without that step. This problem is made worse by the replacement of
hardware-accelerated SHA1 computations with the software-based sha1dc
computation.

This write cost is significant, and the checksum capability is likely
not worth that cost for such a short-lived file. The index is rewritten
frequently and the only time the checksum is checked is during 'git
fsck'. Thus, it would be helpful to allow a user to opt-out of the hash
computation.

We first need to allow Git to opt-out of the hash computation in the
hashfile API. The buffered writes of the API are still helpful, so it
makes sense to make the change here.

Introduce a new 'skip_hash' option to 'struct hashfile'. When set, the
update_fn and final_fn members of the_hash_algo are skipped. When
finalizing the hashfile, the trailing hash is replaced with the null
hash.

This use of a trailing null hash would be desireable in either case,
since we do not want to special case a file format to have a different
length depending on whether it was hashed or not. When the final bytes
of a file are all zero, we can infer that it was written without
hashing, and thus that verification is not available as a check for file
consistency. This also means that we could easily toggle hashing for any
file format we desire.

A version of this patch has existed in the microsoft/git fork since
2017 [1] (the linked commit was rebased in 2018, but the original dates
back to January 2017). Here, the change to make the index use this fast
path is delayed until a later change.

[1] 21fed2d914

Co-authored-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-07 07:46:14 +09:00
f034bb1cad diff: drop "name" parameter from prepare_temp_file()
The prepare_temp_file() function takes a diff_filespec as well as a
filename. But it is almost certainly an error to pass in a name that
isn't the filespec's "path" parameter, since that is the only thing that
reliably tells us how to find the content (and indeed, this was the
source of a recently-fixed bug).

So let's drop the redundant "name" parameter and just use one->path
throughout the function. This simplifies the interface a little bit, and
makes it impossible for calling code to get it wrong.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-06 21:50:09 +09:00
de8f14e1c0 diff: clean up external-diff argv setup
Since the previous commit, setting up the tempfile for an external diff
uses df->path from the diff_filespec, rather than the logical name. This
means add_external_diff_name() does not need to take a "name" parameter
at all, and we can drop it. And that in turn lets us simplify the
conditional for handling renames (when the "other" name is non-NULL).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-06 21:50:07 +09:00
a0f83e7776 diff: use filespec path to set up tempfiles for ext-diff
When we're going to run an external diff, we have to make the contents
of the pre- and post-images available either by dumping them to a
tempfile, or by pointing at a valid file in the worktree. The logic of
this is all handled by prepare_temp_file(), and we just pass in the
filename and the diff_filespec.

But there's a gotcha here. The "filename" we have is a logical filename
and not necessarily a path on disk or in the repository. This matters in
at least one case: when using "--relative", we may have a name like
"foo", even though the file content is found at "subdir/foo". As a
result, we look for the wrong path, fail to find "foo", and claim that
the file has been deleted (passing "/dev/null" to the external diff,
rather than the correct worktree path).

We can fix this by passing the pathname from the diff_filespec, which
should always be a full repository path (and that's what we want even if
reusing a worktree file, since we're always operating from the top-level
of the working tree).

The breakage seems to go all the way back to cd676a5136 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12). As far as I can tell, before then "name" would always have
been the same as the filespec's "path".

There are two related cases I looked at that aren't buggy:

  1. the only other caller of prepare_temp_file() is run_textconv(). But
     it always passes the filespec's path field, so it's OK.

  2. I wondered if file renames/copies might cause similar confusion.
     But they don't, because run_external_diff() receives two names in
     that case: "name" and "other", which correspond to the two sides of
     the diff. And we did correctly pass "other" when handling the
     post-image side. Barring the use of "--relative", that would always
     match "two->path", the path of the second filespec (and the rename
     destination).

So the only bug is just the interaction with external diff drivers and
--relative.

Reported-by: Carl Baldwin <carl@ecbaldwin.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-06 21:49:55 +09:00
d4e241a145 test-bundle-uri: drop unused variables
Commit 70b9c10373 (bundle-uri client: add helper for testing server,
2022-12-22) added a cmd_ls_remote() function which contains "uploadpack"
and "server_options" variables. Neither of these variables is ever
modified after being initialized, so the code to handle non-NULL and
non-empty values is impossible to reach.

While in theory we might add command-line parsing to set these, let's
drop the dead code for now in the name of cleanliness. It's easy enough
to add it back later if need be.

Noticed by Coverity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-06 21:34:49 +09:00
4dbebc36b0 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-05 15:07:23 +09:00
d4c5400865 Merge branch 'ab/no-more-git-global-super-prefix'
Stop using "git --super-prefix" and narrow the scope of its use to
the submodule--helper.

* ab/no-more-git-global-super-prefix:
  read-tree: add "--super-prefix" option, eliminate global
  submodule--helper: convert "{update,clone}" to their own "--super-prefix"
  submodule--helper: convert "status" to its own "--super-prefix"
  submodule--helper: convert "sync" to its own "--super-prefix"
  submodule--helper: convert "foreach" to its own "--super-prefix"
  submodule--helper: don't use global --super-prefix in "absorbgitdirs"
  submodule.c & submodule--helper: pass along "super_prefix" param
  read-tree + fetch tests: test failing "--super-prefix" interaction
  submodule absorbgitdirs tests: add missing "Migrating git..." tests
2023-01-05 15:07:23 +09:00
bc58ebf84e Merge branch 'ab/bundle-wo-args'
Fix to a small regression in 2.38 days.

* ab/bundle-wo-args:
  bundle <cmd>: have usage_msg_opt() note the missing "<file>"
  builtin/bundle.c: remove superfluous "newargc" variable
  bundle: don't segfault on "git bundle <subcmd>"
2023-01-05 15:07:22 +09:00
6b1e4b13bf Merge branch 'km/doc-branch-start-point'
Typofix.

* km/doc-branch-start-point:
  doc/git-branch: fix --force description typo
2023-01-05 15:07:21 +09:00
09bfb2ed81 Merge branch 'ar/typofix-gitattributes-doc'
Typofix.

* ar/typofix-gitattributes-doc:
  gitattributes.txt: fix typo in "comma separated"
2023-01-05 15:07:21 +09:00
6f212b7c3f Merge branch 'sg/test-oid-wo-incomplete-line'
Test helper updates.

* sg/test-oid-wo-incomplete-line:
  tests: make 'test_oid' print trailing newline
2023-01-05 15:07:19 +09:00
3eac69d267 Merge branch 'dh/mingw-ownership-check-typofix'
Error message typofix.

* dh/mingw-ownership-check-typofix:
  mingw: fix typo in an error message from ownership check
2023-01-05 15:07:18 +09:00
1f9b02b970 Merge branch 'jt/avoid-lazy-fetch-commits'
Even in a repository with promisor remote, it is useless to
attempt to lazily attempt fetching an object that is expected to be
commit, because no "filter" mode omits commit objects.  Take
advantage of this assumption to fail fast on errors.

* jt/avoid-lazy-fetch-commits:
  commit: don't lazy-fetch commits
  object-file: emit corruption errors when detected
  object-file: refactor map_loose_object_1()
  object-file: remove OBJECT_INFO_IGNORE_LOOSE
2023-01-05 15:07:17 +09:00
319c3abadb Merge branch 'sa/cat-file-mailmap--batch-check'
'cat-file' gains mailmap support for its '--batch-check' and '-s'
options.

* sa/cat-file-mailmap--batch-check:
  cat-file: add mailmap support to --batch-check option
  cat-file: add mailmap support to -s option
2023-01-05 15:07:17 +09:00
566902f2db am: allow passing --no-verify flag
The git-am --no-verify flag is analogous to the same flag passed to
git-commit. It bypasses the pre-applypatch and applypatch-msg hooks
if they are enabled.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-05 14:52:25 +09:00
5842710dc2 dir: check for single file cone patterns
The sparse checkout documentation states that the cone mode pattern set
is limited to patterns that either recursively include directories or
patterns that match all files in a directory. In the sparse checkout
file, the former manifest in the form:

    /A/B/C/

while the latter become a pair of patterns either in the form:

    /A/B/
    !/A/B/*/

or in the special case of matching the toplevel files:

    /*
    !/*/

The 'add_pattern_to_hashsets()' function contains checks which serve to
disable cone-mode when non-cone patterns are encountered. However, these
do not catch when the pattern list attempts to match a single file or
directory, e.g. a pattern in the form:

    /A/B/C

This causes sparse-checkout to exhibit unexpected behaviour when such a
pattern is in the sparse-checkout file and cone mode is enabled.
Concretely, with the pattern like the above, sparse-checkout, in
non-cone mode, will only include the directory or file located at
'/A/B/C'. However, with cone mode enabled, sparse-checkout will instead
just manifest the toplevel files but not any file located at '/A/B/C'.

Relatedly, issues occur when supplying the same kind of filter when
partial cloning with '--filter=sparse:oid=<oid>'. 'upload-pack' will
correctly just include the objects that match the non-cone pattern
matching. Which means that checking out the newly cloned repo with the
same filter, but with cone mode enabled, fails due to missing objects.

To fix these issues, add a cone mode pattern check that asserts that
every pattern is either a directory match or the pattern '/*'. Add a
test to verify the new pattern check and modify another to reflect that
non-directory patterns are caught earlier.

Signed-off-by: William Sprent <williams@unity3d.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-05 11:14:28 +09:00
238a9dfe86 win32: close handles of threads that have been joined
After the thread terminates, the handle to the
original thread should be closed.

This change makes win32_pthread_join POSIX compliant.

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-04 15:39:47 +09:00
23a6a12dfa win32: prepare pthread.c for change by formatting
File has been formatted to meet coding guidelines.

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-04 15:39:47 +09:00
7b341645e3 ci(github): restore "print test failures" step name
As well as removing the explicit shell setting d8b21a0fe2 (CI: don't
explicitly pick "bash" shell outside of Windows, fix regression,
2022-12-07) also reverted the name of the print test failures step
introduced by 5aeb145780 (ci(github): bring back the 'print test
failures' step, 2022-06-08). This is unfortunate as 5aeb145780 added a
message to direct contributors to the "print test failures" step when a
test fails and that step is no-longer known by that name on the
non-windows ci jobs.

In principle we could update the message to print the correct name for
the step but then we'd have to deal with having two different names for
the same step on different jobs. It is simpler for the implementation
and contributors to use the same name for this step on all jobs.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-04 15:16:15 +09:00
2b4f5a4e4b The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-02 21:37:19 +09:00
3ed91c5f22 Merge branch 'ps/fsync-refs-fix'
Fix the sequence to fsync $GIT_DIR/packed-refs file that forgot to
flush its output to the disk..

* ps/fsync-refs-fix:
  refs: fix corruption by not correctly syncing packed-refs to disk
2023-01-02 21:37:19 +09:00
039e5a0b70 Merge branch 'sk/win32-pthread-exit-fix'
An API emulation fix.

* sk/win32-pthread-exit-fix:
  win32: use _endthreadex to terminate threads, not ExitThread
2023-01-02 21:37:19 +09:00
e83d57e34a Merge branch 'ew/format-patch-mboxrd'
"git format-patch" learned to honor format.mboxrd even when sending
patches to the standard output stream,

* ew/format-patch-mboxrd:
  format-patch: support format.mboxrd with --stdout
2023-01-02 21:37:19 +09:00
0903d8bbde Merge branch 'ds/bundle-uri-4'
Bundle URIs part 4.

* ds/bundle-uri-4:
  clone: unbundle the advertised bundles
  bundle-uri: download bundles from an advertised list
  bundle-uri: allow relative URLs in bundle lists
  strbuf: introduce strbuf_strip_file_from_path()
  bundle-uri: serve bundle.* keys from config
  bundle-uri client: add helper for testing server
  transport: rename got_remote_heads
  bundle-uri client: add boolean transfer.bundleURI setting
  clone: request the 'bundle-uri' command when available
  t: create test harness for 'bundle-uri' command
  protocol v2: add server-side "bundle-uri" skeleton
2023-01-02 21:37:18 +09:00
3f2e4c09c7 Merge branch 'lk/line-range-parsing-fix'
When given a pattern that matches an empty string at the end of a
line, the code to parse the "git diff" line-ranges fell into an
infinite loop, which has been corrected.

* lk/line-range-parsing-fix:
  line-range: fix infinite loop bug with '$' regex
2023-01-02 21:37:18 +09:00
6bae53b138 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-28 12:06:17 +09:00
48475f43a0 Merge branch 'sa/git-var-sequence-editor'
Just like "git var GIT_EDITOR" abstracts the complex logic to
choose which editor gets used behind it, "git var" now give support
to GIT_SEQUENCE_EDITOR.

* sa/git-var-sequence-editor:
  var: add GIT_SEQUENCE_EDITOR variable
2022-12-28 12:06:17 +09:00
b3b9e5c171 Merge branch 'ss/pull-v-recurse-fix'
"git pull -v --recurse-submodules" attempted to pass "-v" down to
underlying "git submodule update", which did not understand the
request and barfed, which has been corrected.

* ss/pull-v-recurse-fix:
  submodule: accept -v for the update command
2022-12-28 12:06:17 +09:00
6d5e9e53aa bundle <cmd>: have usage_msg_opt() note the missing "<file>"
Improve the usage we emit on e.g. "git bundle create" to note why
we're showing the usage, it's because the "<file>" argument is
missing.

We know that'll be the case for all parse_options_cmd_bundle() users,
as they're passing the "char **bundle_file" parameter, which as the
context shows we're expected to populate.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-28 08:30:52 +09:00
e778ecbcee builtin/bundle.c: remove superfluous "newargc" variable
As noted in 891cb09db6 (bundle: don't segfault on "git bundle
<subcmd>", 2022-12-20) the "newargc" in this function is redundant to
using our own "argc". Let's refactor the function to avoid needlessly
introducing another variable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-28 08:30:01 +09:00
f95526419b gitattributes.txt: fix typo in "comma separated"
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-28 08:29:29 +09:00
27875aeec9 doc/git-branch: fix --force description typo
Update the description of --force to use '<start-point>' rather than
'<startpoint>' to match the spelling used everywhere else in the
git-branch documentation.

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-27 09:45:58 +09:00
8a4e8f6a67 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 11:42:31 +09:00
cd2cc44c02 Merge branch 'ab/darwin-default-to-sha1dc'
Use the SHA1DC implementation on macOS, just like other platforms,
by default.

* ab/darwin-default-to-sha1dc:
  Makefile: use sha1collisiondetection by default on OSX and Darwin
2022-12-26 11:42:07 +09:00
3613ab5df5 Merge branch 'sk/remove-duplicate-includes'
Code clean-up.

* sk/remove-duplicate-includes:
  git: remove duplicate includes
2022-12-26 11:42:07 +09:00
e57caee004 Merge branch 'pg/diff-stat-unmerged-regression-fix'
The output from "git diff --stat" on an unmerged path lost the
terminating LF in Git 2.39, which has been corrected.

* pg/diff-stat-unmerged-regression-fix:
  diff: fix regression with --stat and unmerged file
2022-12-26 11:42:07 +09:00
78d15022e7 Merge branch 'jk/ref-filter-error-reporting-fix'
Clean-ups in error messages produced by "git for-each-ref" and friends.

* jk/ref-filter-error-reporting-fix:
  ref-filter: convert email atom parser to use err_bad_arg()
  ref-filter: truncate atom names in error messages
  ref-filter: factor out "unrecognized %(foo) arg" errors
  ref-filter: factor out "%(foo) does not take arguments" errors
  ref-filter: reject arguments to %(HEAD)
2022-12-26 11:42:06 +09:00
d4539b5c71 Merge branch 'rs/clarify-error-in-write-loose-object'
Code clean-up.

* rs/clarify-error-in-write-loose-object:
  object-file: inline write_buffer()
2022-12-26 11:42:06 +09:00
b0c61be320 Merge branch 'rs/reflog-expiry-cleanup'
Code clean-up.

* rs/reflog-expiry-cleanup:
  reflog: clear leftovers in reflog_expiry_cleanup()
2022-12-26 11:42:06 +09:00
c637bd230d Merge branch 'rs/clear-commit-marks-cleanup'
Code clean-up.

* rs/clear-commit-marks-cleanup:
  commit: skip already cleared parents in clear_commit_marks_1()
2022-12-26 11:42:05 +09:00
d8e406449a Merge branch 'rs/am-parse-options-cleanup'
Code clean-up.

* rs/am-parse-options-cleanup:
  am: don't pass strvec to apply_parse_options()
2022-12-26 11:42:05 +09:00
7124e36ec7 Merge branch 'jk/server-supports-v2-cleanup'
Code clean-up.

* jk/server-supports-v2-cleanup:
  server_supports_v2(): use a separate function for die_on_error
2022-12-26 11:42:05 +09:00
179547932f Merge branch 'jk/unused-post-2.39'
Code clean-up around unused function parameters.

* jk/unused-post-2.39:
  userdiff: mark unused parameter in internal callback
  list-objects-filter: mark unused parameters in virtual functions
  diff: mark unused parameters in callbacks
  xdiff: mark unused parameter in xdl_call_hunk_func()
  xdiff: drop unused parameter in def_ff()
  ws: drop unused parameter from ws_blank_line()
  list-objects: drop process_gitlink() function
  blob: drop unused parts of parse_blob_buffer()
  ls-refs: use repository parameter to iterate refs
2022-12-26 11:42:05 +09:00
c099531b00 Merge branch 'jt/http-fetch-trace2-report-name'
"git http-fetch" (which is rarely used) forgot to identify itself
in the trace2 output.

* jt/http-fetch-trace2-report-name:
  http-fetch: invoke trace2_cmd_name()
2022-12-26 11:42:04 +09:00
4a9b839dd1 Merge branch 'sg/help-autocorrect-config-fix'
The code to auto-correct a misspelt subcommand unnecessarily called
into git_default_config() from the early config codepath, which was
a no-no.  This has bee corrected.

* sg/help-autocorrect-config-fix:
  help.c: fix autocorrect in work tree for bare repository
2022-12-26 11:42:04 +09:00
4002ec3dcf read-tree: add "--super-prefix" option, eliminate global
The "--super-prefix" option to "git" was initially added in [1] for
use with "ls-files"[2], and shortly thereafter "submodule--helper"[3]
and "grep"[4]. It wasn't until [5] that "read-tree" made use of it.

At the time [5] made sense, but since then we've made "ls-files"
recurse in-process in [6], "grep" in [7], and finally
"submodule--helper" in the preceding commits.

Let's also remove it from "read-tree", which allows us to remove the
option to "git" itself.

We can do this because the only remaining user of it is the submodule
API, which will now invoke "read-tree" with its new "--super-prefix"
option. It will only do so when the "submodule_move_head()" function
is called.

That "submodule_move_head()" function was then only invoked by
"read-tree" itself, but now rather than setting an environment
variable to pass "--super-prefix" between cmd_read_tree() we:

- Set a new "super_prefix" in "struct unpack_trees_options". The
  "super_prefixed()" function in "unpack-trees.c" added in [5] will now
  use this, rather than get_super_prefix() looking up the environment
  variable we set earlier in the same process.

- Add the same field to the "struct checkout", which is only needed to
  ferry the "super_prefix" in the "struct unpack_trees_options" all the
  way down to the "entry.c" callers of "submodule_move_head()".

  Those calls which used the super prefix all originated in
  "cmd_read_tree()". The only other caller is the "unlink_entry()"
  caller in "builtin/checkout.c", which now passes a "NULL".

1. 74866d7579 (git: make super-prefix option, 2016-10-07)
2. e77aa336f1 (ls-files: optionally recurse into submodules, 2016-10-07)
3. 89c8626557 (submodule helper: support super prefix, 2016-12-08)
4. 0281e487fd (grep: optionally recurse into submodules, 2016-12-16)
5. 3d415425c7 (unpack-trees: support super-prefix option, 2017-01-17)
6. 188dce131f (ls-files: use repository object, 2017-06-22)
7. f9ee2fcdfa (grep: recurse in-process using 'struct repository', 2017-08-02)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:44 +09:00
f5a6be9d54 submodule--helper: convert "{update,clone}" to their own "--super-prefix"
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper status" to use its own "--super-prefix", instead of
relying on the global "--super-prefix" argument to "git".

We need to convert both of these away from the global "--super-prefix"
at the same time, because "update" will call "clone", but "clone"
itself didn't make use of the global "--super-prefix" for displaying
paths. It was only on the list of sub-commands that accepted it
because "update"'s use of it would set it in its environment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:44 +09:00
04f1fab4a1 submodule--helper: convert "status" to its own "--super-prefix"
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper status" to use its own "--super-prefix", instead of
relying on the global "--super-prefix" argument to "git" itself. See
that earlier commit for the rationale and background.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:44 +09:00
99a32d87f8 submodule--helper: convert "sync" to its own "--super-prefix"
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper sync" to use its own "--super-prefix", instead of
relying on the global "--super-prefix" argument to "git" itself. See
that earlier commit for the rationale and background.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:44 +09:00
677c981260 submodule--helper: convert "foreach" to its own "--super-prefix"
As with a preceding commit to convert "absorbgitdirs", we can convert
"submodule--helper foreach" to use its own "--super-prefix", instead
of relying on the global "--super-prefix" argument to "git"
itself. See that earlier commit for the rationale and background.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:44 +09:00
bb61a962d2 submodule--helper: don't use global --super-prefix in "absorbgitdirs"
The "--super-prefix" facility was introduced in [1] has always been a
transitory hack, which is why we've made it an error to supply it as
an option to "git" to commands that don't know about it.

That's been a good goal, as it has a global effect we haven't wanted
calls to get_super_prefix() from built-ins we didn't expect.

But it has meant that when we've had chains of different built-ins
using it all of the processes in that "chain" have needed to support
it, and worse processes that don't need it have needed to ask for
"SUPPORT_SUPER_PREFIX" because their parent process needs it.

That's how "fsmonitor--daemon" ended up with it, per [2] it's called
from (among other things) "submodule--helper absorbgitdirs", but as we
declared "submodule--helper" as "SUPPORT_SUPER_PREFIX" we needed to
declare "fsmonitor--daemon" as accepting it too, even though it
doesn't care about it.

But in the case of "absorbgitdirs" it only needed "--super-prefix" to
invoke itself recursively, and we'd never have another "in-between"
process in the chain. So we didn't need the bigger hammer of "git
--super-prefix", and the "setenv(GIT_SUPER_PREFIX_ENVIRONMENT, ...)"
that it entails.

Let's instead accept a hidden "--super-prefix" option to
"submodule--helper absorbgitdirs" itself.

Eventually (as with all other "--super-prefix" users) we'll want to
clean this code up so that this all happens in-process. I.e. needing
any variant of "--super-prefix" is itself a hack around our various
global state, and implicit reliance on "the_repository". This stepping
stone makes such an eventual change easier, as we'll need to deal with
less global state at that point.

The "fsmonitor--daemon" test adjusted here was added in [3]. To assert
that it didn't run into the "--super-prefix" message it was asserting
the output it didn't have. Let's instead assert the full output that
we *do* have, using the same pattern as a preceding change to
"t/t7412-submodule-absorbgitdirs.sh" used.

We could also remove the test entirely (as [4] did), but even though
the initial reason for having it is gone we're still getting some
marginal benefit from testing the "fsmonitor" and "submodule
absorbgitdirs" interaction, so let's keep it.

The change here to have either a NULL or non-"" string as a
"super_prefix" instead of the previous arrangement of "" or non-"" is
somewhat arbitrary. We could also decide to never have to check for
NULL.

As we'll be changing the rest of the "git --super-prefix" users to the
same pattern, leaving them all consistent makes sense. Why not pick ""
over NULL? Because that's how the "prefix" works[5], and having
"prefix" and "super_prefix" work the same way will be less
confusing. That "prefix" picked NULL instead of "" is itself
arbitrary, but as it's easy to make this small bit of our overall API
consistent, let's go with that.

1. 74866d7579 (git: make super-prefix option, 2016-10-07)
2. 53fcfbc84f (fsmonitor--daemon: allow --super-prefix argument,
   2022-05-26)
3. 53fcfbc84f (fsmonitor--daemon: allow --super-prefix argument,
   2022-05-26)
4. https://lore.kernel.org/git/20221109004708.97668-5-chooglen@google.com/
5. 9725c8dda2 (built-ins: trust the "prefix" from run_builtin(),
   2022-02-16)

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:43 +09:00
f0a5e5ad57 submodule.c & submodule--helper: pass along "super_prefix" param
Start passing the "super_prefix" along as a parameter to
get_submodule_displaypath() and absorb_git_dir_into_superproject(),
rather than get the value directly as a global.

This is in preparation for subsequent commits, where we'll gradually
phase out get_super_prefix() for an alternative way of getting the
"super_prefix".

Most of the users of this get a get_super_prefix() value, either
directly or by indirection. The exceptions are:

- builtin/rm.c: Doesn't declare SUPPORT_SUPER_PREFIX, so we'd have
  died if this was provided, so it's safe to pass "NULL".

- deinit_submodule(): The "deinit_submodule()" function has never been
  able to use the "git -super-prefix". It will call
  "absorb_git_dir_into_superproject()", but it will only do so from the
  top-level project.

  If "absorbgitdirs" recurses will use the "path" passed to
  "absorb_git_dir_into_superproject()" in "deinit_submodule()" as its
  starting "--super-prefix". So we can safely remove the
  get_super_prefix() call here, and pass NULL instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:43 +09:00
0d1806e53d read-tree + fetch tests: test failing "--super-prefix" interaction
Ever since "git fetch --refetch" was introduced in 0f5e885173 (Merge
branch 'rc/fetch-refetch', 2022-04-04) the test being added here would
fail. This is because "restore" will "read-tree .. --reset <hash>",
which will in turn invoke "fetch". The "fetch" will then die with:

	fatal: fetch doesn't support --super-prefix

This edge case and other "--super-prefix" bugs will be fixed in
subsequent commits, but let's first add a "test_expect_failure" test
for it. It passes until the very last command in the test.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:43 +09:00
49eb1d388a submodule absorbgitdirs tests: add missing "Migrating git..." tests
Fix a blind spots in the tests surrounding "submodule absorbgitdirs"
and test what output we emit, and how emitted the message and behavior
interacts with a "git worktree" where the repository isn't at the base
of the working directory.

The "$(pwd)" instead of "$PWD" here is needed due to Windows, where
the latter will be a path like "/d/a/git/[...]", whereas we need
"D:/a/git/[...]".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26 10:21:43 +09:00
0006e2e3f1 win32: use _endthreadex to terminate threads, not ExitThread
Because we use the C runtime and
use _beginthreadex to create pthreads,
pthread_exit MUST use _endthreadex.

Otherwise, according to Microsoft:
"Failure to do so results in small
memory leaks when the thread
calls ExitThread."

Simply put, this is not the same as ExitThread.

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:34:03 +09:00
4810946f60 format-patch: support format.mboxrd with --stdout
mboxrd is a more robust output format when used with --stdout
and needs more exposure.  Introducing this config knob lets
users choose the more robust format for all their --stdout
uses.

Relying on --pretty=mboxrd and including all of pretty-formats.txt
in the `git format-patch' documentation would likely be
confusing to users.  Furthermore, this setting is useful across
multiple invocations.  So introduce `format.mboxrd' as a boolean
configuration knob that changes the default --pretty=email format
to --pretty=mboxrd when (and only when) --stdout is in use.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:32:45 +09:00
876094ac16 clone: unbundle the advertised bundles
A previous change introduced the transport methods to acquire a bundle
list from the 'bundle-uri' protocol v2 command, when advertised _and_
when the client has chosen to enable the feature.

Teach Git to download and unbundle the data advertised by those bundles
during 'git clone'. This takes place between the ref advertisement and
the object data download, and stateful connections will linger while
the client downloads bundles. In the future, we should consider closing
the remote connection during this process.

Also, since the --bundle-uri option exists, we do not want to mix the
advertised bundles with the user-specified bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
12b0a14b9e bundle-uri: download bundles from an advertised list
The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
'git clone', but is not helpful when the clone operation discovers a
list of URIs from the bundle-uri protocol v2 command. To actually
download and unbundle the advertised bundles, we need a different
mechanism.

Create the new fetch_bundle_list() method which is very similar to
fetch_bundle_uri() except that it relies on download_bundle_list()
instead of fetch_bundle_uri_internal(). The download_bundle_list()
method will recursively call fetch_bundle_uri_internal() if any of the
advertised URIs serve a bundle list instead of a bundle. This will also
follow the bundle.list.mode setting from the input list: "any" will
download only one such URI while "all" will download data from all of
the URIs.

In an identical way to fetch_bundle_uri(), the bundles are unbundled
after all of the bundle lists have been expanded and all necessary URIs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
ebc3947955 bundle-uri: allow relative URLs in bundle lists
Bundle providers may want to distribute that data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
9ea5796495 strbuf: introduce strbuf_strip_file_from_path()
The strbuf_parent_directory() method was added as a static method in
contrib/scalar by d0feac4e8c (scalar: 'register' sets recommended
config and starts maintenance, 2021-12-03) and then removed in
65f6a9eb0b (scalar: constrain enlistment search, 2022-08-18), but now
there is a need for a similar method in the bundle URI feature.

Re-add the method, this time in strbuf.c, but with a new name:
strbuf_strip_file_from_path(). The method requirements are slightly
modified to allow a trailing slash, in which case nothing is done, which
makes the name change valuable.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
738dc7d4a5 bundle-uri: serve bundle.* keys from config
Implement the "bundle-uri" protocol v2 capability by populating the
key=value packet lines from the local Git config. The list of bundles is
provided from the keys beginning with "bundle.".

In the future, we may want to filter this list to be more specific to
the exact known keys that the server intends to share, but for
flexibility at the moment we will assume that the config values are
well-formed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
70b9c10373 bundle-uri client: add helper for testing server
Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.

In the "git clone" case we'll have already done the handshake(),
but not here. Add an extra case to check for this handshake in
get_bundle_uri() for ease of use for future callers.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
1b759e0cf1 transport: rename got_remote_heads
The 'got_remote_heads' member of 'struct git_transport_data' was used
historically to indicate that the initial server connection was made and
the ref advertisement was returned. With protocol v2, that initial
handshake does not necessarily include the ref advertisement, so this
member is not an accurate name. Thankfully, all uses of the member are
only checking to see if the handshake should take place, not whether or
not some local data has the ref advertisement.

Rename the member to 'finished_handshake' to represent the proper state.
Note that the variable is only set to 1 during the handshake() method.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
7cce9074a7 bundle-uri client: add boolean transfer.bundleURI setting
The yet-to-be introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.

The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
0cfde740f0 clone: request the 'bundle-uri' command when available
Set up all the needed client parts of the 'bundle-uri' protocol v2
command, without actually doing anything with the bundle URIs.

If the server says it supports 'bundle-uri' teach Git to issue the
'bundle-uri' command after the 'ls-refs' during 'git clone'. The
returned key=value pairs are passed to the bundle list code which is
tested using a different ingest mechanism in t5750-bundle-uri-parse.sh.

At this point, Git does nothing with that bundle list. It will not
download any of the bundles. That will come in a later change after
these protocol bits are finalized.

The no-op client is initially used only by 'git clone' to test the basic
functionality, and eventually will bootstrap the initial download of Git
objects during a fresh clone. The bundle URI client will not be
integrated into other fetches until a mechanism is created to select a
subset of bundles for download.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
8f788eb8b7 t: create test harness for 'bundle-uri' command
The previous change allowed for a Git server to advertise the
'bundle-uri' command as a capability based on the
uploadPack.advertiseBundleURIs config option. Create a set of tests that
check that this capability is advertised using 'git ls-remote'.

In order to test this functionality across three protocols (file, git,
and http), create lib-bundle-uri-protocol.sh to generalize the tests,
allowing the other test scripts to set an environment variable and
otherwise inherit the setup and tests from this script.

The tests currently only test that the 'bundle-uri' command is
advertised or not. Other actions will be tested as the Git client learns
to request the 'bundle-uri' command and parse its response.

To help with URI escaping, specifically for file paths with a space in
them, extract a 'sed' invocation from t9199-git-svn-info.sh into a
helper function for use here, too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
8b8d9a2298 protocol v2: add server-side "bundle-uri" skeleton
Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".

This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.

The critical bit right now is that the new boolean
uploadPack.adverstiseBundleURIs config value signals whether or not this
capability should be advertised at all.

An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 command with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users
automatically discovering bundle providers that are loosely associated
with the origin server, but without the origin server knowing exactly
which bundles are currently available.

[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulletted list of details.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
ce54672f9b refs: fix corruption by not correctly syncing packed-refs to disk
At GitLab we have recently received a report where a repository was left
with a corrupted `packed-refs` file after the node hard-crashed even
though `core.fsync=reference` was set. This is something that in theory
should not happen if we correctly did the atomic-rename dance to:

    1. Write the data into a temporary file.

    2. Synchronize the temporary file to disk.

    3. Rename the temporary file into place.

So if we crash in the middle of writing the `packed-refs` file we should
only ever see either the old or the new state of the file.

And while we do the dance when writing the `packed-refs` file, there is
indeed one gotcha: we use a `FILE *` stream to write the temporary file,
but don't flush it before synchronizing it to disk. As a consequence any
data that is still buffered will not get synchronized and a crash of the
machine may cause corruption.

Fix this bug by flushing the file stream before we fsync.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:18:12 +09:00
891cb09db6 bundle: don't segfault on "git bundle <subcmd>"
Since aef7d75e58 (builtin/bundle.c: let parse-options parse
subcommands, 2022-08-19) we've been segfaulting if no argument was
provided.

The fix is easy, as all of the "git bundle" subcommands require a
non-option argument we can check that we have arguments left after
calling parse-options().

This makes use of code added in 73c3253d75 (bundle: framework for
options before bundle file, 2019-11-10), before this change that code
has always been unreachable. In 73c3253d75 we'd never reach it as we
already checked "argc < 2" in cmd_bundle() itself.

Then when aef7d75e58 (whose segfault we're fixing here) migrated this
code to the subcommand API it removed that "argc < 2" check, but we
were still checking the wrong "argc" in parse_options_cmd_bundle(), we
need to check the "newargc". The "argc" will always be >= 1, as it
will necessarily contain at least the subcommand name
itself (e.g. "create").

As an aside, this could be safely squashed into this, but let's not do
that for this minimal segfault fix, as it's an unrelated refactoring:

	--- a/builtin/bundle.c
	+++ b/builtin/bundle.c
	@@ -55,13 +55,12 @@ static int parse_options_cmd_bundle(int argc,
	 		const char * const usagestr[],
	 		const struct option options[],
	 		char **bundle_file) {
	-	int newargc;
	-	newargc = parse_options(argc, argv, NULL, options, usagestr,
	+	argc = parse_options(argc, argv, NULL, options, usagestr,
	 			     PARSE_OPT_STOP_AT_NON_OPTION);
	-	if (!newargc)
	+	if (!argc)
	 		usage_with_options(usagestr, options);
	 	*bundle_file = prefix_filename(prefix, argv[0]);
	-	return newargc;
	+	return argc;
	 }

	 static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {

Reported-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:01:09 +09:00
a797c0ea04 cat-file: add mailmap support to --batch-check option
Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `repo_read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

The command git cat-file can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch fixes that documentation.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 15:20:45 +09:00
49050a043b cat-file: add mailmap support to -s option
Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 15:20:45 +09:00
4542582e59 ci (check-whitespace): move to actions/checkout@v3
Get rid of deprecation warnings in the CI runs.  Also gets the latest
security patches.

Signed-off-by: Chris. Webster <chris@webstech.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 10:48:19 +09:00
b3ecdc780d ci (check-whitespace): add links to job output
A message in the step log will refer to the Summary output.

The job summary output is using markdown to improve readability.  The
git commands and commits with errors are now in ordered lists.
Commits and files in error are links to the user's repository.

Signed-off-by: Chris. Webster <chris@webstech.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 10:48:18 +09:00
288e3c4e3b ci (check-whitespace): suggest fixes for errors
Make the errors more visible by adding them to the job summary and
display the git commands that will usually fix the problem.

Signed-off-by: Chris. Webster <chris@webstech.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 10:48:17 +09:00
a0da6deeec ci: only run win+VS build & tests in Git for Windows' fork
It has been a frequent matter of contention that the win+VS jobs not
only take a long time to run, but are also more easily broken than the
other jobs (because they do not use the same `Makefile`-based builds as
all other jobs), and to make matters worse, these breakages are also
much harder to diagnose and fix than other jobs', especially for
contributors who are happy to stay away from Windows.

The purpose of these win+VS jobs is to maintain the CMake-based build
of Git, with the target audience being Visual Studio users on Windows
who are typically quite unfamiliar with `make` and POSIX shell
scripting, but the benefit of whose expertise we want for the Git
project nevertheless.

The CMake support was introduced for that specific purpose, and already
early on concerns were raised that it would put an undue burden on
contributors to ensure that these jobs pass in CI, when they do not have
access to Windows machines (nor want to have that).

This developer's initial hope was that it would be enough to fix win+VS
failures and provide the changes to be squashed into contributors'
patches, and that it would be worth the benefit of attracting
Windows-based developers' contributions.

Neither of these hopes have panned out.

To lower the frustration, and incidentally benefit from using way less
build minutes, let's just not run the win+VS jobs by default, which
appears to be the consensus of the mail thread leading up to
https://lore.kernel.org/git/xmqqk0311blt.fsf@gitster.g/

Since the Git for Windows project still needs to at least try to attract
more of said Windows-based developers, let's keep the jobs, but disable
them everywhere except in Git for Windows' fork. This will help because
Git for Windows' branch thicket is "continuously rebased" via automation
to the `shears/maint`, `shears/main`, `shears/next` and `shears/seen`
branches at https://github.com/git-for-windows/git. That way, the Git
for Windows project will still be notified early on about potential
breakages, but the Git project won't be burdened with fixing them
anymore, which seems to be the best compromise we can get on this issue.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 10:45:37 +09:00
4e57c88e02 line-range: fix infinite loop bug with '$' regex
When the -L argument to "git log" is passed the zero-width regular
expression "$" (as in "-L :$:line-range.c"), this results in an
infinite loop in find_funcname_matching_regexp().

Modify find_funcname_matching_regexp to correctly match the entire line
instead of the zero-width match at eol and update the loop condition to
prevent an infinite loop in the event of other undiscovered corner cases.

The primary change is that we pre-decrement the beginning-of-line marker
('bol') before comparing it to '\n'. In the case of '$', where we match the
'\n' at the end of the line and start the loop with bol == eol, this
ensures that bol will find the beginning of the line on which the match
occurred.

Originally reported in <https://stackoverflow.com/q/74690545/147356>.

Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 10:00:43 +09:00
4eb1ccecd4 mingw: fix typo in an error message from ownership check
When a repository is on a FAT32 file system, the user sees a message
that the path ownership cannot be determined.  Fix a typo in the
message.

Signed-off-by: Daniël Haazen <danielhaazen@hotmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 09:32:46 +09:00
7c2ef319c5 The first batch for 2.40
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-19 11:46:18 +09:00
963f8d3b63 Merge branch 'rj/branch-copy-and-rename'
Fix a pair of bugs in 'git branch'.

* rj/branch-copy-and-rename:
  branch: force-copy a branch to itself via @{-1} is a no-op
2022-12-19 11:46:18 +09:00
f3d9bc801a Merge branch 'rr/status-untracked-advice'
The advice message given by "git status" when it takes long time to
enumerate untracked paths has been updated.

* rr/status-untracked-advice:
  status: modernize git-status "slow untracked files" advice
2022-12-19 11:46:18 +09:00
053650ddad Merge branch 'aw/complete-case-insensitive'
Introduce a case insensitive mode to the Bash completion helpers.

* aw/complete-case-insensitive:
  completion: add case-insensitive match of pseudorefs
  completion: add optional ignore-case when matching refs
2022-12-19 11:46:18 +09:00
ab91f6b7c4 Merge branch 'rs/diff-parseopts'
The way the diff machinery prepares the options array for the
parse_options API has been refactored to avoid resource leaks.

* rs/diff-parseopts:
  diff: remove parseopts member from struct diff_options
  diff: use add_diff_options() in diff_opt_parse()
  diff: factor out add_diff_options()
2022-12-19 11:46:17 +09:00
4e09e0dae6 Merge branch 'sx/pthread-error-check-fix'
Correct pthread API usage.

* sx/pthread-error-check-fix:
  maintenance: compare output of pthread functions for inequality with 0
2022-12-19 11:46:17 +09:00
995916e24f Merge branch 'jk/avoid-redef-system-functions'
The jk/avoid-redef-system-functions-2.30 topic pre-merged for more
recent codebase.

* jk/avoid-redef-system-functions:
2022-12-19 11:46:17 +09:00
efcc48efa7 Merge branch 'jk/avoid-redef-system-functions-2.30'
Redefining system functions for a few functions did not follow our
usual "implement git_foo() and #define foo(args) git_foo(args)"
pattern, which has broken build for some folks.

* jk/avoid-redef-system-functions-2.30:
  git-compat-util: undefine system names before redeclaring them
  git-compat-util: avoid redefining system function names
2022-12-19 11:46:16 +09:00
3c0a988672 Merge branch 'rs/t3920-crlf-eating-grep-fix'
Test fix.

* rs/t3920-crlf-eating-grep-fix:
  t3920: support CR-eating grep
2022-12-19 11:46:14 +09:00
b7bb8828cf Merge branch 'js/t3920-shell-and-or-fix'
Test fix.

* js/t3920-shell-and-or-fix:
  t3920: don't ignore errors of more than one command with `|| true`
2022-12-19 11:46:14 +09:00
636de956c4 Merge branch 'jh/fsmonitor-darwin-modernize'
Stop using deprecated macOS API in fsmonitor.

* jh/fsmonitor-darwin-modernize:
  fsmonitor: eliminate call to deprecated FSEventStream function
2022-12-19 11:46:14 +09:00
314a0af909 Merge branch 'ab/t4023-avoid-losing-exit-status-of-diff'
Test fix.

* ab/t4023-avoid-losing-exit-status-of-diff:
  t4023: fix ignored exit codes of git
2022-12-19 11:46:13 +09:00
4eec47c1cd Merge branch 'ab/t7600-avoid-losing-exit-status-of-git'
Test fix.

* ab/t7600-avoid-losing-exit-status-of-git:
  t7600: don't ignore "rev-parse" exit code in helper
2022-12-19 11:46:13 +09:00
d2caf09d00 Merge branch 'ab/t5314-avoid-losing-exit-status'
Test fix.

* ab/t5314-avoid-losing-exit-status:
  t5314: check exit code of "git"
2022-12-19 11:46:13 +09:00
44265e5b57 Merge branch 'jh/t7527-unflake-by-forcing-cookie'
Make fsmonitor more robust to avoid the flakiness seen in t7527.

* jh/t7527-unflake-by-forcing-cookie:
  fsmonitor: fix race seen in t7527
2022-12-19 11:46:13 +09:00
02ec5e2eec Merge branch 'rs/plug-pattern-list-leak-in-lof'
Leak fix.

* rs/plug-pattern-list-leak-in-lof:
  list-objects-filter: plug pattern_list leak
2022-12-19 11:46:12 +09:00
907951c88b Merge branch 'rs/t4205-do-not-exit-in-test-script'
Test fix.

* rs/t4205-do-not-exit-in-test-script:
  t4205: don't exit test script on failure
2022-12-19 11:46:12 +09:00
a48a88019b tests: make 'test_oid' print trailing newline
Unlike other test helper functions, 'test_oid' doesn't terminate its
output with a LF, but, alas, the reason for this, if any, is not
mentioned in 2c02b110da (t: add test functions to translate
hash-related values, 2018-09-13)).

Now, in the vast majority of cases 'test_oid' is invoked in a command
substitution that is part of a heredoc or supplies an argument to a
command or the value to a variable, and the command substitution would
chop off any trailing LFs, so in these cases the lack or presence of a
trailing LF in its output doesn't matter.  However:

  - There appear to be only three cases where 'test_oid' is not
    invoked in a command substitution:

      $ git grep '\stest_oid ' -- ':/t/*.sh'
      t0000-basic.sh:  test_oid zero >actual &&
      t0000-basic.sh:  test_oid zero >actual &&
      t0000-basic.sh:  test_oid zero >actual &&

    These are all in test cases checking that 'test_oid' actually
    works, and that the size of its output matches the size of the
    corresponding hash function with conditions like

      test $(wc -c <actual) -eq 40

    In these cases the lack of trailing LF does actually matter,
    though they could be trivially updated to account for the presence
    of a trailing LF.

  - There are also a few cases where the lack of trailing LF in
    'test_oid's output actually hurts, because tests need to compare
    its output with LF terminated file contents, forcing developers to
    invoke it as 'echo $(test_oid ...)' to append the missing LF:

      $ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'
      t1302-repo-version.sh:  echo $(test_oid version) >expect &&
      t1500-rev-parse.sh:     echo "$(test_oid algo)" >expect &&
      t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val1)" > foo &&
      t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val2)" > foo &&
      t5313-pack-bounds-checks.sh:    echo $(test_oid oidfff) >file &&

    And there is yet another similar case in an in-flight topic at:

      https://public-inbox.org/git/813e81a058227bd373cec802e443fcd677042fb4.1670862677.git.gitgitgadget@gmail.com/

Arguably we would be better off if 'test_oid' terminated its output
with a LF.  So let's update 'test_oid' accordingly, update its tests
in t0000 to account for the extra character in those size tests, and
remove the now unnecessary 'echo $(...)' command substitutions around
'test_oid' invocations as well.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-19 09:49:11 +09:00
4c3dd9304e var: add GIT_SEQUENCE_EDITOR variable
The editor program used by Git when editing the sequencer "todo" file
is determined by examining a few environment variables and also
affected by configuration variables. Introduce "git var
GIT_SEQUENCE_EDITOR" to give users access to the final result of the
logic without having to know the exact details.

This is very similar in spirit to 44fcb497 (Teach git var about
GIT_EDITOR, 2009-11-11) that introduced "git var GIT_EDITOR".

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-18 11:48:26 +09:00
6f65f84766 submodule: accept -v for the update command
Since a56771a6 (builtin/pull: respect verbosity settings in
submodules, 2018-01-25), "git pull -v --recurse-submodules"
propagates the "-v" to the submodule command, but because the
latter command does not understand the option, it barfs.

Teach "git submodule update" to accept the option to fix it.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-18 10:28:30 +09:00
35898ad24d Makefile: use sha1collisiondetection by default on OSX and Darwin
When the sha1collisiondetection library was added and made the default
in [1] the interaction with APPLE_COMMON_CRYPTO added in [2] and [3]
seems to have been missed. On modern OSX and Darwin we are able to use
Apple's CommonCrypto both for SHA-1, and as a generic (but partial)
OpenSSL replacement.

This left OSX and Darwin without protection against the SHAttered
attack when building Git in its default configuration.

Let's also use sha1collisiondetection on OSX, to do so we'll need to
split up the "APPLE_COMMON_CRYPTO" flag into that flag and a new
"APPLE_COMMON_CRYPTO_SHA1".

Because of this we can stop conflating whether we want to use Apple's
CommonCrypto at all, and whether we want to use it for SHA-1.  This
makes the CI recipe added in [4] simpler.

1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. 4dcd7732db (Makefile: add support for Apple CommonCrypto facility, 2013-05-19)
3. 61067954ce (cache.h: eliminate SHA-1 deprecation warnings on Mac OS X, 2013-05-19)
4. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
   2022-10-20)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-16 06:06:56 +09:00
285da4321a ref-filter: convert email atom parser to use err_bad_arg()
The error message for a bogus argument to %(authoremail), etc, is:

   $ git for-each-ref --format='%(authoremail:foo)'
   fatal: unrecognized email option: foo

Saying just "email" is a little vague; most of the other atom parsers
would use the full name "%(authoremail)", but we can't do that here
because the same function also handles %(taggeremail), etc. Until
recently, passing atom->name was a bad idea, because it erroneously
included the arguments in the atom name. But since the previous commit
taught err_bad_arg() to handle this, we can now do so and get:

  fatal: unrecognized %(authoremail) argument: foo

which is consistent with other atoms.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:14:09 +09:00
1955ef10ed ref-filter: truncate atom names in error messages
If you pass a bogus argument to %(refname), you may end up with a
message like this:

  $ git for-each-ref --format='%(refname:foo)'
  fatal: unrecognized %(refname:foo) argument: foo

which is confusing. It should just say:

  fatal: unrecognized %(refname) argument: foo

which is clearer, and is consistent with most other atom parsers. Those
other parsers do not have the same problem because they pass the atom
name from a string literal in the parser function. But because the
parser for %(refname) also handles %(upstream) and %(push), it instead
uses atom->name, which includes the arguments. The oid atom parser which
handles %(tree), %(parent), etc suffers from the same problem.

It seems like the cleanest fix would be for atom->name to be _just_ the
name, since there's already a separate "args" field. But since that
field is also used for other things, we can't change it easily (e.g.,
it's how we find things in the used_atoms array, and clearly %(refname)
and %(refname:short) are not the same thing).

Instead, we'll teach our error_bad_arg() function to stop at the first
":". This is a little hacky, as we're effectively re-parsing the name,
but the format is simple enough to do this as a one-liner, and this
localizes the change to the error-reporting code.

We'll give the same treatment to err_no_arg(). None of its callers use
this atom->name trick, but it's worth future-proofing it while we're
here.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:14:04 +09:00
dda4fc1a84 ref-filter: factor out "unrecognized %(foo) arg" errors
Atom parsers that take arguments generally have a catch-all for "this
arg is not recognized". Most of them use the same printf template, which
is good, because it makes life easier for translators. Let's pull this
template into a helper function, which makes the code in the parsers
shorter and avoids any possibility of differences.

As with the previous commit, we'll pick an arbitrary atom to make sure
the test suite covers this code.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:14:00 +09:00
a33d0fae76 ref-filter: factor out "%(foo) does not take arguments" errors
Many atom parsers give the same error message, differing only in the
name of the atom. If we use "%s does not take arguments", that should
make life easier for translators, as they only need to translate one
string. And in doing so, we can easily pull it into a helper function to
make sure they are all using the exact same string.

I've added a basic test here for %(HEAD), just to make sure this code is
exercised at all in the test suite. We could cover each such atom, but
the effort-to-reward ratio of trying to maintain an exhaustive list
doesn't seem worth it.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:13:56 +09:00
afc1a946b2 ref-filter: reject arguments to %(HEAD)
The %(HEAD) atom doesn't take any arguments, but unlike other atoms in
the same boat (objecttype, deltabase, etc), it does not detect this
situation and complain. Let's make it consistent with the others.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:13:35 +09:00
209d9cb011 diff: fix regression with --stat and unmerged file
A regression was introduced in

  12fc4ad89e (diff.c: use utf8_strwidth() to count display width, 2022-09-14)

that causes missing newlines after "Unmerged" entries in `git diff
--cached --stat` output.

This problem affects v2.39.0-rc0 through v2.39.0.

Add the missing newline along with a new test to cover this
behavior.

Signed-off-by: Peter Grayson <pete@jpgrayson.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:12:04 +09:00
92cb135855 git: remove duplicate includes
These files are already included; we do not need to include them again

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:09:38 +09:00
b0226007f0 fsmonitor: eliminate call to deprecated FSEventStream function
Replace the call to `FSEventStreamScheduleWithRunLoop()` function with
the suggested `FSEventStreamSetDispatchQueue()` function.

The MacOS version of the builtin FSMonitor feature uses the
`FSEventStreamScheduleWithRunLoop()` function to drive the event loop
and process FSEvents from the system.  This routine has now been
deprecated by Apple.  The MacOS 13 (Ventura) compiler tool chain now
generates a warning when compiling calls to this function.  In
DEVELOPER=1 mode, this now causes a compile error.

The `FSEventStreamSetDispatchQueue()` function is conceptually similar
and is the suggested replacement.  However, there are some subtle
thread-related differences.

Previously, the event stream would be processed by the
`fsm_listen__loop()` thread while it was in the `CFRunLoopRun()`
method.  (Conceptually, this was a blocking call on the lifetime of
the event stream where our thread drove the event loop and individual
events were handled by the `fsevent_callback()`.)

With the change, a "dispatch queue" is created and FSEvents will be
processed by a hidden queue-related thread (that calls the
`fsevent_callback()` on our behalf).  Our `fsm_listen__loop()` thread
maintains the original blocking model by waiting on a mutex/condition
variable pair while the hidden thread does all of the work.

While the deprecated API used by the original were introduced in
macOS 10.5 (Oct 2007), the API used by the updated code were
introduced back in macOS 10.6 (Aug 2009) and has been available
since then.  So this change _could_ break those who have happily
been using 10.5 (if there were such people), but these two dates
both predate the oldest versions of macOS Apple seems to support
anyway, so we should be safe.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:08:27 +09:00
7e2ad1cda2 commit: don't lazy-fetch commits
When parsing commits, fail fast when the commit is missing or
corrupt, instead of attempting to fetch them. This is done by inlining
repo_read_object_file() and setting the flag that prevents fetching.

This is motivated by a situation in which through a bug (not necessarily
through Git), there was corruption in the object store of a partial
clone. In this particular case, the problem was exposed when "git gc"
tried to expire reflogs, which calls repo_parse_commit(), which triggers
fetches of the missing commits.

(There are other possible solutions to this problem including passing an
argument from "git gc" to "git reflog" to inhibit all lazy fetches, but
I think that this fix is at the wrong level - fixing "git reflog" means
that this particular command works fine, or so we think (it will fail if
it somehow needs to read a legitimately missing blob, say, a .gitmodules
file), but fixing repo_parse_commit() will fix a whole class of bugs.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:05:55 +09:00
9e59b38c88 object-file: emit corruption errors when detected
Instead of relying on errno being preserved across function calls, teach
do_oid_object_info_extended() to itself report object corruption when
it first detects it. There are 3 types of corruption being detected:
 - when a replacement object is missing
 - when a loose object is corrupt
 - when a packed object is corrupt and the object cannot be read
   in another way

Note that in the RHS of this patch's diff, a check for ENOENT that was
introduced in 3ba7a06552 (A loose object is not corrupt if it cannot
be read due to EMFILE, 2010-10-28) is also removed. The purpose of this
check is to avoid a false report of corruption if the errno contains
something like EMFILE (or anything that is not ENOENT), in which case
a more generic report is presented. Because, as of this patch, we no
longer rely on such a heuristic to determine corruption, but surface
the error message at the point when we read something that we did not
expect, this check is no longer necessary.

Besides being more resilient, this also prepares for a future patch in
which an indirect caller of do_oid_object_info_extended() will need
such functionality.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:05:55 +09:00
ae285ac449 object-file: refactor map_loose_object_1()
This function can do 3 things:
 1. Gets an fd given a path
 2. Simultaneously gets a path and fd given an OID
 3. Memory maps an fd

Keep 3 (renaming the function accordingly) and inline 1 and 2 into their
respective callers.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:05:55 +09:00
acd6f0d973 object-file: remove OBJECT_INFO_IGNORE_LOOSE
Its last user was removed in 97b2fa08b6 (fetch-pack: drop
custom loose object cache, 2018-11-12), so we can remove it.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:05:55 +09:00
57e2c6ebbe Start the 2.40 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-14 18:32:26 +09:00
26f81233ab Merge branch 'js/t0021-windows-pwd'
Test fix.

* js/t0021-windows-pwd:
  t0021: use Windows-friendly `pwd`
2022-12-14 17:42:18 +09:00
d818458088 Merge branch 'sa/git-var-empty'
"git var UNKNOWN_VARIABLE" and "git var VARIABLE" with the variable
given an empty value used to behave identically.  Now the latter
just gives an empty output, while the former still gives an error
message.

* sa/git-var-empty:
  var: allow GIT_EDITOR to return null
  var: do not print usage() with a correct invocation
2022-12-14 15:55:47 +09:00
cb3d2e535a Merge branch 'rs/multi-filter-args'
Fix a bug where `pack-objects` would not respect multiple `--filter`
arguments when invoked directly.

* rs/multi-filter-args:
  list-objects-filter: remove OPT_PARSE_LIST_OBJECTS_FILTER_INIT()
  pack-objects: simplify --filter handling
  pack-objects: fix handling of multiple --filter options
  t5317: demonstrate failure to handle multiple --filter options
  t5317: stop losing return codes of git ls-files
2022-12-14 15:55:47 +09:00
a1b8e5ec28 Merge branch 'tl/pack-bitmap-absolute-paths'
The pack-bitmap machinery is taught to log the paths of redundant
bitmap(s) to trace2 instead of stderr.

* tl/pack-bitmap-absolute-paths:
  pack-bitmap.c: trace bitmap ignore logs when midx-bitmap is found
  pack-bitmap.c: break out of the bitmap loop early if not tracing
  pack-bitmap.c: avoid exposing absolute paths
  pack-bitmap.c: remove unnecessary "open_pack_index()" calls
2022-12-14 15:55:46 +09:00
06ae40f6e5 Merge branch 'yn/git-jump-emacs'
"git jump" (in contrib/) learned to present the "quickfix list" to
its standard output (instead of letting it consumed by the editor
it invokes), and learned to also drive emacs/emacsclient.

* yn/git-jump-emacs:
  git-jump: invoke emacs/emacsclient
  git-jump: move valid-mode check earlier
  git-jump: add an optional argument '--stdout'
2022-12-14 15:55:46 +09:00
9ea1378d04 Merge branch 'ab/various-leak-fixes'
Various leak fixes.

* ab/various-leak-fixes:
  built-ins: use free() not UNLEAK() if trivial, rm dead code
  revert: fix parse_options_concat() leak
  cherry-pick: free "struct replay_opts" members
  rebase: don't leak on "--abort"
  connected.c: free the "struct packed_git"
  sequencer.c: fix "opts->strategy" leak in read_strategy_opts()
  ls-files: fix a --with-tree memory leak
  revision API: call graph_clear() in release_revisions()
  unpack-file: fix ancient leak in create_temp_file()
  built-ins & libs & helpers: add/move destructors, fix leaks
  dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache"
  read-cache.c: clear and free "sparse_checkout_patterns"
  commit: discard partial cache before (re-)reading it
  {reset,merge}: call discard_index() before returning
  tests: mark tests as passing with SANITIZE=leak
2022-12-14 15:55:46 +09:00
7576e512ce Merge branch 'kz/merge-tree-merge-base'
"merge-tree" learns a new `--merge-base` option.

* kz/merge-tree-merge-base:
  docs: fix description of the `--merge-base` option
  merge-tree.c: allow specifying the merge-base when --stdin is passed
  merge-tree.c: add --merge-base=<commit> option
2022-12-14 15:55:46 +09:00
bee6e7a8f9 Merge branch 'dd/git-bisect-builtin'
`git bisect` becomes a builtin.

* dd/git-bisect-builtin:
  bisect; remove unused "git-bisect.sh" and ".gitignore" entry
  Turn `git bisect` into a full built-in
  bisect--helper: log: allow arbitrary number of arguments
  bisect--helper: handle states directly
  bisect--helper: emit usage for "git bisect"
  bisect test: test exit codes on bad usage
  bisect--helper: identify as bisect when report error
  bisect-run: verify_good: account for non-negative exit status
  bisect run: keep some of the post-v2.30.0 output
  bisect: fix output regressions in v2.30.0
  bisect: refactor bisect_run() to match CodingGuidelines
  bisect tests: test for v2.30.0 "bisect run" regressions
2022-12-14 15:55:45 +09:00
d422d06167 object-file: inline write_buffer()
write_buffer() reports the OS error if it is unable to write.  Its only
caller dies in that case, giving some more context in its last message.

Inline this function and show only a single error message that includes
both the context (writing a loose object file) and the OS error.  This
shortens the code and simplifies the output.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-14 10:29:19 +09:00
c25d9e529d userdiff: mark unused parameter in internal callback
Since f12fa9ee6c (userdiff: add and use for_each_userdiff_driver(),
2021-04-08), lookup of userdiffs is done with a generic
for_each_userdiff_driver(). But the name lookup doesn't use the "type"
field, of course.

We can't get rid of that field from the generic interface because it is
used by t/helper/test-userdiff.c. So mark it as unused in this instance
to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:23 +09:00
d3beb61f93 list-objects-filter: mark unused parameters in virtual functions
The "struct filter" abstract type defines several virtual function
pointers. Not all of the concrete functions need every parameter, but
they have to conform to the generic interface. Mark unused ones to
silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:23 +09:00
61bdc7c5d8 diff: mark unused parameters in callbacks
The diff code provides a format_callback interface, but not every
callback needs each parameter (e.g., the "opt" and "data" parameters are
frequently left unused). Likewise for the output_prefix callback, the
low-level change/add_remove interfaces, the callbacks used by
xdi_diff(), etc.

Mark unused arguments in the callback implementations to quiet
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:23 +09:00
8157ed4046 xdiff: mark unused parameter in xdl_call_hunk_func()
This function is used interchangeably with xdl_emit via a function
pointer, so we can't just drop the unused parameter. Mark it to silence
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:23 +09:00
a361660aef xdiff: drop unused parameter in def_ff()
The def_ff() function is the default "find_func" for finding hunk
headers. It has never used its "priv" argument since it was introduced
in f258475a6e (Per-path attribute based hunk header selection.,
2007-07-06). But back then we used a function pointer to switch between
a caller-provided function and the default, so the two had to conform to
the same interface.

In ff2981f724 (xdiff: factor out match_func_rec(), 2016-05-28), that
pointer indirection went away in favor of code which directly calls
either of the two functions. So there's no need for def_ff() to retain
this unused parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:23 +09:00
c5224f0f4c ws: drop unused parameter from ws_blank_line()
We take a ws_rule parameter, but have never looked at it since the
function was added in 877f23ccb8 (Teach "diff --check" about new blank
lines at end, 2008-06-26). A comment in the function does mention how we
_could_ use it, but nobody has felt the need to do so for over a decade.

We could keep it around as reminder of what could be done, but the
comment serves that purpose. And in the meantime, it triggers
-Wunused-parameter.

So let's drop it, which in turn allows us to drop similar arguments
further up the callstack. I've left the comment intact. It does still
say "ws_rule", but that name is used consistently in the whitespace
code, so the meaning is clear.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:23 +09:00
00271485d4 list-objects: drop process_gitlink() function
Our object graph traversal code has a process_gitlink() function which
we call when we see a gitlink entry. The function does nothing; it was
added in the early days of gitlinks by 6e2f441bd4 (Teach git
list-objects logic to not follow gitlinks, 2007-04-13).

The comment above the function talks about some things we _could_ do.
But in the intervening 15 years, nobody has touched the function, and
the submodule code usually makes its own decisions about when and how to
examine the links. At the generic traversal layer, we can't assume that
the pointed-to commit is available.

Let's drop this placeholder that isn't really helping anything. This
silences some -Wunused-parameter warnings, and also gets rid of a crufty
use of "const unsigned char *" to pass a raw hash value.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:22 +09:00
c1166ca0e2 blob: drop unused parts of parse_blob_buffer()
Our parse_blob_buffer() takes a ptr/len combo, just like
parse_tree_buffer(), etc, and returns success or failure. But it doesn't
actually do anything with them; we just set the "parsed" flag in the
object and return success, without even looking at the contents.

There could be some value to keeping these unused parameters:

  - it's consistent with the parse functions for other object types. But
    we already lost that consistency in 837d395a5c (Replace parse_blob()
    with an explanatory comment, 2010-01-18).

  - As the comment from 837d395a5c explains, callers are supposed to
    make sure they have the object content available. So in theory
    asking for these parameters could serve as a signal. But there are
    only two callers, and one of them always passes NULL (after doing a
    streaming check of the object hash).

    This shows that there aren't likely to be a lot of callers (since
    everyone either uses the type-generic parse functions, or handles
    blobs individually), and that they need to take special care anyway
    (because we usually want to avoid loading whole blobs in memory if
    we can avoid it).

So let's just drop these unused parameters, and likewise the useless
return value. While we're touching the header file, let's move the
declaration of parse_blob_buffer() right below that explanatory comment,
where it's more likely to be seen by people looking for the function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:22 +09:00
91e2ab1587 ls-refs: use repository parameter to iterate refs
The ls_refs() function (for the v2 protocol command of the same name)
takes a repository parameter (like all v2 commands), but ignores it. It
should use it to access the refs.

This isn't a bug in practice, since we only call this function when
serving upload-pack from the main repository. But it's an awkward
gotcha, and it causes -Wunused-parameter to complain.

The main reason we don't use the repository parameter is that the ref
iteration interface we call doesn't have a "refs_" variant that takes a
ref_store. However we can easily add one. In fact, since there is only
one other caller (in ref-filter.c), there is no need to maintain the
non-repository wrapper; that caller can just use the_repository. It's
still a long way from consistently using a repository object, but it's
one small step in the right direction.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:22 +09:00
a31cfe3283 server_supports_v2(): use a separate function for die_on_error
The server_supports_v2() helper lets a caller find out if the server
supports a feature, and will optionally die if it's not supported. This
makes the return value confusing, as it's only meaningful when the
function is not asked to die.

Coverity flagged a new call like:

  /* check that we support "foo" */
  server_supports_v2("foo", 1);

complaining that we usually checked the return value, but this time we
didn't. But this call is correct, and other ones that did:

  if (server_supports_v2("foo", 1))
          do_something_with_foo();

are "wrong", in the sense that we know the conditional will always be
true (but there's no bug; the code is simply misleading).

Let's split the "die" behavior into its own function which returns void,
and modify each caller to use the correct one.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:08:52 +09:00
a658e881c1 am: don't pass strvec to apply_parse_options()
apply_parse_options() passes the array of argument strings to
parse_options(), which removes recognized options.  The removed strings
are not freed, though.

Make a copy of the strvec to pass to the function to retain the pointers
of its strings, so we release them all at the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:07:37 +09:00
4cb39fcf19 commit: skip already cleared parents in clear_commit_marks_1()
Don't put clean parents on the pending list, as they and their ancestors
don't need any treatment and would be skipped later anyway.  This saves
the allocation and release of a commit list item in ca. 20% of the cases
during a run of the test suite.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:07:08 +09:00
b07a819c05 reflog: clear leftovers in reflog_expiry_cleanup()
reflog_expiry_prepare() calls mark_reachable(), which recurively flags
commits as REACHABLE.  The traversal stops beyond a certain age
threshold; the boundary commits also marked as REACHABLE and put back
into mark_list at the end.  unreachable() finishes the traversal down to
the roots if necessary -- but if all interesting commits are younger
than the age threshold then only recent commits need to be visited.

When this optimization works then the boundary commits still sit there
in mark_list at the end.  Clear their REACHABLE flag and release the
commit list allocations.

While at it remove a duplicate code line from mark_reachable(); the same
flag is already set five lines up.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:06:26 +09:00
01443f01b7 Git 2.39.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:25:28 +09:00
96738bb0e1 Sync with 2.38.3 2022-12-13 21:25:15 +09:00
37ed7bf0f1 Git 2.38.3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:24:14 +09:00
fea9f607a8 Sync with Git 2.37.5 2022-12-13 21:23:36 +09:00
e43ac5f23d Git 2.37.5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:20:47 +09:00
431f6e67e6 Merge branch 'maint-2.36' into maint-2.37 2022-12-13 21:20:35 +09:00
ad949b24f8 Git 2.36.4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:19:24 +09:00
8253c00421 Merge branch 'maint-2.35' into maint-2.36 2022-12-13 21:19:11 +09:00
02f4981723 Git 2.35.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:17:26 +09:00
fbabbc30e7 Merge branch 'maint-2.34' into maint-2.35 2022-12-13 21:17:10 +09:00
6c9466944c Git 2.34.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:15:39 +09:00
3748b5b7f5 Merge branch 'maint-2.33' into maint-2.34 2022-12-13 21:15:22 +09:00
7fe9bf55b8 Git 2.33.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:13:48 +09:00
5f22dcc02d Sync with Git 2.32.5 2022-12-13 21:13:11 +09:00
d96ea538e8 Git 2.32.5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:10:27 +09:00
32e357b6df Merge branch 'ps/attr-limits-with-fsck' into maint-2.32 2022-12-13 21:09:56 +09:00
8a755eddf5 Sync with Git 2.31.6 2022-12-13 21:09:40 +09:00
82689d5e5d Git 2.31.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 21:04:03 +09:00
16128765d7 Sync with Git 2.30.7 2022-12-13 21:02:20 +09:00
b7b37a3371 Git 2.30.7
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 20:56:43 +09:00
7abb43cbc8 http-fetch: invoke trace2_cmd_name()
ee4512ed48 ("trace2: create new combined trace facility", 2019-02-
22) introduced trace2_cmd_name() and taught both the Git built-ins and
some non-built-ins to use it. However, http-fetch was not one of them
(perhaps due to its low usage at the time).

Teach http-fetch to invoke this function. After this patch, this
function will be invoked right after argument parsing, just like in
remote-curl.c.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 10:43:07 +09:00
0918d08887 help.c: fix autocorrect in work tree for bare repository
Currently, auto correction doesn't work reliably for commands which must
run in a work tree (e.g. `git status`) in Git work trees which are
created from a bare repository.

As far as I'm able to determine, this has been broken since commit
659fef199f (help: use early config when autocorrecting aliases,
2017-06-14), where the call to `git_config()` in `help_unknown_cmd()`
was replaced with a call to `read_early_config()`. From what I can tell,
the actual cause for the unexpected error is that we call
`git_default_config()` in the `git_unknown_cmd_config` callback instead
of simply returning `0` for config entries which we aren't interested
in.

Calling `git_default_config()` in this callback to `read_early_config()`
seems like a bad idea since those calls will initialize a bunch of state
in `environment.c` (among other things `is_bare_repository_cfg`) before
we've properly detected that we're running in a work tree.

All other callbacks provided to `read_early_config()` appear to only
extract their configurations while simply returning `0` for all other
config keys.

This commit changes the `git_unknown_cmd_config` callback to not call
`git_default_config()`. Instead we also simply return `0` for config
keys which we're not interested in.

Additionally the commit adds a new test case covering `help.autocorrect`
in a work tree created from a bare clone.

Signed-off-by: Simon Gerber <gesimu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 10:01:53 +09:00
a3795bf0e6 tests(mingw): avoid very slow mingw_test_cmp
When Git's test suite uses `test_cmp`, it is not actually trying to
compare binary files as the name `cmp` would suggest to users familiar
with Unix' tools, but the tests instead verify that actual output
matches the expected text.

On Unix, `cmp` works well enough for Git's purposes because only Line
Feed characters are used as line endings. However, on Windows, while
most tools accept Line Feeds as line endings, many tools produce
Carriage Return + Line Feed line endings, including some of the tools
used by the test suite (which are therefore provided via Git for Windows
SDK). Therefore, `cmp` would frequently fail merely due to different
line endings.

To accommodate for that, the `mingw_test_cmp` function was introduced
into Git's test suite to perform a line-by-line comparison that ignores
line endings. This function is a Bash function that is only used on
Windows, everywhere else `cmp` is used.

This is a double whammy because `cmp` is fast, and `mingw_test_cmp` is
slow, even more so on Windows because it is a Bash script function, and
Bash scripts are known to run particularly slowly on Windows due to
Bash's need for the POSIX emulation layer provided by the MSYS2 runtime.

The commit message of 32ed3314c1 (t5351: avoid using `test_cmp` for
binary data, 2022-07-29) provides an illuminating account of the
consequences: On Windows, the platform on which Git could really use all
the help it can get to improve its performance, the time spent on one
entire test script was reduced from half an hour to less than half a
minute merely by avoiding a single call to `mingw_test_cmp` in but a
single test case.

Learning the lesson to avoid shell scripting wherever possible, the Git
for Windows project implemented a minimal replacement for
`mingw_test_cmp` in the form of a `test-tool` subcommand that parses the
input files line by line, ignoring line endings, and compares them.
Essentially the same thing as `mingw_test_cmp`, but implemented in
C instead of Bash. This solution served the Git for Windows project
well, over years.

However, when this solution was finally upstreamed, the conclusion was
reached that a change to use `git diff --no-index` instead of
`mingw_test_cmp` was more easily reviewed and hence should be used
instead.

The reason why this approach was not even considered in Git for Windows
is that in 2007, there was already a motion on the table to use Git's
own diff machinery to perform comparisons in Git's test suite, but it
was dismissed in https://lore.kernel.org/git/xmqqbkrpo9or.fsf@gitster.g/
as undesirable because tests might potentially succeed due to bugs in
the diff machinery when they should not succeed, and those bugs could
therefore hide regressions that the tests try to prevent.

By the time Git for Windows' `mingw-test-cmp` in C was finally
contributed to the Git mailing list, reviewers agreed that the diff
machinery had matured enough and should be used instead.

When the concern was raised that the diff machinery, due to its
complexity, would perform substantially worse than the test helper
originally implemented in the Git for Windows project, a test
demonstrated that these performance differences are well lost within the
100+ minutes it takes to run Git's test suite on Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 07:18:06 +09:00
c48035d29b Git 2.39
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-12 09:59:08 +09:00
31cc8be91d Merge tag 'l10n-2.39.0-rnd1' of https://github.com/git-l10n/git-po
l10n-2.39.0-rnd1

* tag 'l10n-2.39.0-rnd1' of https://github.com/git-l10n/git-po:
  l10n: zh_TW.po: Git 2.39-rc2
  l10n: tr: v2.39.0 updates
  l10n: Update Catalan translation
  l10n: bg.po: Updated Bulgarian translation (5501t)
  l10n: de.po: update German translation
  l10n: zh_CN v2.39.0 round 1
  l10n: fr: v2.39 rnd 1
  l10n: po-id for 2.39 (round 1)
  l10n: sv.po: Update Swedish translation (5501t0f0)
2022-12-12 09:20:49 +09:00
694cb1b2ab Sync with Git 2.38.2 2022-12-11 09:34:51 +09:00
8706a59933 Git 2.38.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-11 09:32:48 +09:00
6d0497d526 l10n: zh_TW.po: Git 2.39-rc2
Signed-off-by: pan93412 <pan93412@gmail.com>
2022-12-11 01:27:25 +08:00
0ddd73fa9f ci: use a newer github-script version
The old version we currently use runs in node.js v12.x, which is being
deprecated in GitHub Actions. The new version uses node.js v16.x.

Incidentally, this also avoids the warning about the deprecated
`::set-output::` workflow command because the newer version of the
`github-script` Action uses the recommended new way to specify outputs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-12-10 16:32:16 +09:00
e71f00f73f Merge branch 'jx/ci-ubuntu-fix' into maint-2.38
Adjust the GitHub CI to newer ubuntu release.

* jx/ci-ubuntu-fix:
  ci: install python on ubuntu
  ci: use the same version of p4 on both Linux and macOS
  ci: remove the pipe after "p4 -V" to catch errors
  github-actions: run gcc-8 on ubuntu-20.04 image
2022-12-10 16:17:47 +09:00
bbfd79af89 Sync with 'maint' 2022-12-10 14:02:22 +09:00
ec9816c6b3 Merge branch 'js/ci-use-newer-up-down-artifact' into maint-2.38
CI fix.

* js/ci-use-newer-up-down-artifact:
  ci: avoid using deprecated {up,down}load-artifacts Action
2022-12-10 14:02:09 +09:00
75efbc1372 Merge branch 'ab/ci-use-macos-12' into maint-2.38
CI fix.

* ab/ci-use-macos-12:
  CI: upgrade to macos-12, and pin OSX version
2022-12-10 14:02:09 +09:00
634d026866 Merge branch 'ab/ci-retire-set-output' into maint-2.38
CI fix.

* ab/ci-retire-set-output:
  CI: migrate away from deprecated "set-output" syntax
2022-12-10 14:02:09 +09:00
8972be0252 Merge branch 'ab/ci-musl-bash-fix' into maint-2.38
CI fix.

* ab/ci-musl-bash-fix:
  CI: don't explicitly pick "bash" shell outside of Windows, fix regression
2022-12-10 14:02:09 +09:00
78c5de91f2 Merge branch 'od/ci-use-checkout-v3-when-applicable' into maint-2.38
Update GitHub CI to use actions/checkout@v3; use of the older
checkout@v2 gets annoying deprecation notices.

* od/ci-use-checkout-v3-when-applicable:
  ci(main): upgrade actions/checkout to v3
2022-12-10 14:02:09 +09:00
481d274aae Merge branch 'js/ci-use-newer-up-down-artifact'
CI fix.

* js/ci-use-newer-up-down-artifact:
  ci: avoid using deprecated {up,down}load-artifacts Action
2022-12-10 14:01:06 +09:00
0b32d1aea2 Merge branch 'ab/ci-use-macos-12'
CI fix.

* ab/ci-use-macos-12:
  CI: upgrade to macos-12, and pin OSX version
2022-12-10 14:01:06 +09:00
82444ead4c Merge branch 'ab/ci-retire-set-output'
CI fix.

* ab/ci-retire-set-output:
  CI: migrate away from deprecated "set-output" syntax
2022-12-10 14:01:05 +09:00
a64bf54bfa Merge branch 'ab/ci-musl-bash-fix'
CI fix.

* ab/ci-musl-bash-fix:
  CI: don't explicitly pick "bash" shell outside of Windows, fix regression
2022-12-10 14:01:05 +09:00
9044a398af Merge branch 'od/ci-use-checkout-v3-when-applicable'
Update GitHub CI to use actions/checkout@v3; use of the older
checkout@v2 gets annoying deprecation notices.

* od/ci-use-checkout-v3-when-applicable:
  ci(main): upgrade actions/checkout to v3
2022-12-10 14:01:05 +09:00
38645f8cb1 mailmap: update email address of Matheus Tavares
I haven't been very active in the community lately, but I'm soon going
to lose access to my previous commit email (@usp.br); so add my current
personal address to mailmap for any future message exchanges or patch
contributions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-10 09:17:36 +09:00
93a7bc8b28 rebase --update-refs: avoid unintended ref deletion
In b3b1a21d1a (sequencer: rewrite update-refs as user edits todo list,
2022-07-19), the 'todo_list_filter_update_refs()' step was added to handle
the removal of 'update-ref' lines from a 'rebase-todo'. Specifically, it
removes potential ref updates from the "update refs state" if a ref does not
have a corresponding 'update-ref' line.

However, because 'write_update_refs_state()' will not update the state if
the 'refs_to_oids' list was empty, removing *all* 'update-ref' lines will
result in the state remaining unchanged from how it was initialized (with
all refs' "after" OID being null). Then, when the ref update is applied, all
refs will be updated to null and consequently deleted.

To fix this, delete the 'update-refs' state file when 'refs_to_oids' is
empty. Additionally, add a tests covering "all update-ref lines removed"
cases.

Reported-by: herr.kaste <herr.kaste@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-12-09 19:31:45 +09:00
27ab4784d5 fsck: implement checks for gitattributes
Recently, a vulnerability was reported that can lead to an out-of-bounds
write when reading an unreasonably large gitattributes file. The root
cause of this error are multiple integer overflows in different parts of
the code when there are either too many lines, when paths are too long,
when attribute names are too long, or when there are too many attributes
declared for a pattern.

As all of these are related to size, it seems reasonable to restrict the
size of the gitattributes file via git-fsck(1). This allows us to both
stop distributing known-vulnerable objects via common hosting platforms
that have fsck enabled, and users to protect themselves by enabling the
`fetch.fsckObjects` config.

There are basically two checks:

    1. We verify that size of the gitattributes file is smaller than
       100MB.

    2. We verify that the maximum line length does not exceed 2048
       bytes.

With the preceding commits, both of these conditions would cause us to
either ignore the complete gitattributes file or blob in the first case,
or the specific line in the second case. Now with these consistency
checks added, we also grow the ability to stop distributing such files
in the first place when `receive.fsckObjects` is enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 17:07:04 +09:00
f8587c31c9 fsck: move checks for gitattributes
Move the checks for gitattributes so that they can be extended more
readily.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 17:05:00 +09:00
a59a8c687f fsck: pull out function to check a set of blobs
In `fsck_finish()` we check all blobs for consistency that we have found
during the tree walk, but that haven't yet been checked. This is only
required for gitmodules right now, but will also be required for a new
check for gitattributes.

Pull out a function `fsck_blobs()` that allows the caller to check a set
of blobs for consistency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 17:05:00 +09:00
bb3a9265e5 fsck: refactor fsck_blob() to allow for more checks
In general, we don't need to validate blob contents as they are opaque
blobs about whose content Git doesn't need to care about. There are some
exceptions though when blobs are linked into trees so that they would be
interpreted by Git. We only have a single such check right now though,
which is the one for gitmodules that has been added in the context of
CVE-2018-11235.

Now we have found another vulnerability with gitattributes that can lead
to out-of-bounds writes and reads. So let's refactor `fsck_blob()` so
that it is more extensible and can check different types of blobs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 17:05:00 +09:00
e0bfc0b3b9 Merge branch 'ps/attr-limits' into maint-2.32 2022-12-09 17:03:49 +09:00
6662a836eb Merge branch 'ps/attr-limits' into maint-2.30 2022-12-09 16:05:52 +09:00
3305300f4c Merge branch 'ps/format-padding-fix' into maint-2.30 2022-12-09 16:02:39 +09:00
304a50adff pretty: restrict input lengths for padding and wrapping formats
Both the padding and wrapping formatting directives allow the caller to
specify an integer that ultimately leads to us adding this many chars to
the result buffer. As a consequence, it is trivial to e.g. allocate 2GB
of RAM via a single formatting directive and cause resource exhaustion
on the machine executing this logic. Furthermore, it is debatable
whether there are any sane usecases that require the user to pad data to
2GB boundaries or to indent wrapped data by 2GB.

Restrict the input sizes to 16 kilobytes at a maximum to limit the
amount of bytes that can be requested by the user. This is not meant
as a fix because there are ways to trivially amplify the amount of
data we generate via formatting directives; the real protection is
achieved by the changes in previous steps to catch and avoid integer
wraparound that causes us to under-allocate and access beyond the
end of allocated memory reagions. But having such a limit
significantly helps fuzzing the pretty format, because the fuzzer is
otherwise quite fast to run out-of-memory as it discovers these
formatters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
f930a23943 utf8: refactor strbuf_utf8_replace to not rely on preallocated buffer
In `strbuf_utf8_replace`, we preallocate the destination buffer and then
use `memcpy` to copy bytes into it at computed offsets. This feels
rather fragile and is hard to understand at times. Refactor the code to
instead use `strbuf_add` and `strbuf_addstr` so that we can be sure that
there is no possibility to perform an out-of-bounds write.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
81c2d4c3a5 utf8: fix checking for glyph width in strbuf_utf8_replace()
In `strbuf_utf8_replace()`, we call `utf8_width()` to compute the width
of the current glyph. If the glyph is a control character though it can
be that `utf8_width()` returns `-1`, but because we assign this value to
a `size_t` the conversion will cause us to underflow. This bug can
easily be triggered with the following command:

    $ git log --pretty='format:xxx%<|(1,trunc)%x10'

>From all I can see though this seems to be a benign underflow that has
no security-related consequences.

Fix the bug by using an `int` instead. When we see a control character,
we now copy it into the target buffer but don't advance the current
width of the string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
937b71cc8b utf8: fix overflow when returning string width
The return type of both `utf8_strwidth()` and `utf8_strnwidth()` is
`int`, but we operate on string lengths which are typically of type
`size_t`. This means that when the string is longer than `INT_MAX`, we
will overflow and thus return a negative result.

This can lead to an out-of-bounds write with `--pretty=format:%<1)%B`
and a commit message that is 2^31+1 bytes long:

    =================================================================
    ==26009==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001168 at pc 0x7f95c4e5f427 bp 0x7ffd8541c900 sp 0x7ffd8541c0a8
    WRITE of size 2147483649 at 0x603000001168 thread T0
        #0 0x7f95c4e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5612bbb1068c in format_and_pad_commit pretty.c:1763
        #2 0x5612bbb1087a in format_commit_item pretty.c:1801
        #3 0x5612bbc33bab in strbuf_expand strbuf.c:429
        #4 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #5 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #6 0x5612bba0a4d5 in show_log log-tree.c:781
        #7 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #8 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #9 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #10 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #11 0x5612bb56c993 in run_builtin git.c:466
        #12 0x5612bb56d397 in handle_builtin git.c:721
        #13 0x5612bb56db07 in run_argv git.c:788
        #14 0x5612bb56e8a7 in cmd_main git.c:923
        #15 0x5612bb803682 in main common-main.c:57
        #16 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #17 0x7f95c4c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #18 0x5612bb5680e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000001168 is located 0 bytes to the right of 24-byte region [0x603000001150,0x603000001168)
    allocated by thread T0 here:
        #0 0x7f95c4ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5612bbcdd556 in xrealloc wrapper.c:136
        #2 0x5612bbc310a3 in strbuf_grow strbuf.c:99
        #3 0x5612bbc32acd in strbuf_add strbuf.c:298
        #4 0x5612bbc33aec in strbuf_expand strbuf.c:418
        #5 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #6 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #7 0x5612bba0a4d5 in show_log log-tree.c:781
        #8 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #9 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #10 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #11 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #12 0x5612bb56c993 in run_builtin git.c:466
        #13 0x5612bb56d397 in handle_builtin git.c:721
        #14 0x5612bb56db07 in run_argv git.c:788
        #15 0x5612bb56e8a7 in cmd_main git.c:923
        #16 0x5612bb803682 in main common-main.c:57
        #17 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0c067fff81d0: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff81e0: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
      0x0c067fff81f0: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
      0x0c067fff8200: fd fd fd fa fa fa fd fd fd fd fa fa 00 00 00 fa
      0x0c067fff8210: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
    =>0x0c067fff8220: fd fa fa fa fd fd fd fa fa fa 00 00 00[fa]fa fa
      0x0c067fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==26009==ABORTING

Now the proper fix for this would be to convert both functions to return
an `size_t` instead of an `int`. But given that this commit may be part
of a security release, let's instead do the minimal viable fix and die
in case we see an overflow.

Add a test that would have previously caused us to crash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
17d23e8a38 utf8: fix returning negative string width
The `utf8_strnwidth()` function calls `utf8_width()` in a loop and adds
its returned width to the end result. `utf8_width()` can return `-1`
though in case it reads a control character, which means that the
computed string width is going to be wrong. In the worst case where
there are more control characters than non-control characters, we may
even return a negative string width.

Fix this bug by treating control characters as having zero width.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
522cc87fdc utf8: fix truncated string lengths in utf8_strnwidth()
The `utf8_strnwidth()` function accepts an optional string length as
input parameter. This parameter can either be set to `-1`, in which case
we call `strlen()` on the input. Or it can be set to a positive integer
that indicates a precomputed length, which callers typically compute by
calling `strlen()` at some point themselves.

The input parameter is an `int` though, whereas `strlen()` returns a
`size_t`. This can lead to implementation-defined behaviour though when
the `size_t` cannot be represented by the `int`. In the general case
though this leads to wrap-around and thus to negative string sizes,
which is sure enough to not lead to well-defined behaviour.

Fix this by accepting a `size_t` instead of an `int` as string length.
While this takes away the ability of callers to simply pass in `-1` as
string length, it really is trivial enough to convert them to instead
pass in `strlen()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
48050c42c7 pretty: fix integer overflow in wrapping format
The `%w(width,indent1,indent2)` formatting directive can be used to
rewrap text to a specific width and is designed after git-shortlog(1)'s
`-w` parameter. While the three parameters are all stored as `size_t`
internally, `strbuf_add_wrapped_text()` accepts integers as input. As a
result, the casted integers may overflow. As these now-negative integers
are later on passed to `strbuf_addchars()`, we will ultimately run into
implementation-defined behaviour due to casting a negative number back
to `size_t` again. On my platform, this results in trying to allocate
9000 petabyte of memory.

Fix this overflow by using `cast_size_t_to_int()` so that we reject
inputs that cannot be represented as an integer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
1de69c0cdd pretty: fix adding linefeed when placeholder is not expanded
When a formatting directive has a `+` or ` ` after the `%`, then we add
either a line feed or space if the placeholder expands to a non-empty
string. In specific cases though this logic doesn't work as expected,
and we try to add the character even in the case where the formatting
directive is empty.

One such pattern is `%w(1)%+d%+w(2)`. `%+d` expands to reference names
pointing to a certain commit, like in `git log --decorate`. For a tagged
commit this would for example expand to `\n (tag: v1.0.0)`, which has a
leading newline due to the `+` modifier and a space added by `%d`. Now
the second wrapping directive will cause us to rewrap the text to
`\n(tag:\nv1.0.0)`, which is one byte shorter due to the missing leading
space. The code that handles the `+` magic now notices that the length
has changed and will thus try to insert a leading line feed at the
original posititon. But as the string was shortened, the original
position is past the buffer's boundary and thus we die with an error.

Now there are two issues here:

    1. We check whether the buffer length has changed, not whether it
       has been extended. This causes us to try and add the character
       past the string boundary.

    2. The current logic does not make any sense whatsoever. When the
       string got expanded due to the rewrap, putting the separator into
       the original position is likely to put it somewhere into the
       middle of the rewrapped contents.

It is debatable whether `%+w()` makes any sense in the first place.
Strictly speaking, the placeholder never expands to a non-empty string,
and consequentially we shouldn't ever accept this combination. We thus
fix the bug by simply refusing `%+w()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
f6e0b9f389 pretty: fix out-of-bounds read when parsing invalid padding format
An out-of-bounds read can be triggered when parsing an incomplete
padding format string passed via `--pretty=format` or in Git archives
when files are marked with the `export-subst` gitattribute.

This bug exists since we have introduced support for truncating output
via the `trunc` keyword a7f01c6b4d (pretty: support truncating in %>, %<
and %><, 2013-04-19). Before this commit, we used to find the end of the
formatting string by using strchr(3P). This function returns a `NULL`
pointer in case the character in question wasn't found. The subsequent
check whether any character was found thus simply checked the returned
pointer. After the commit we switched to strcspn(3P) though, which only
returns the offset to the first found character or to the trailing NUL
byte. As the end pointer is now computed by adding the offset to the
start pointer it won't be `NULL` anymore, and as a consequence the check
doesn't do anything anymore.

The out-of-bounds data that is being read can in fact end up in the
formatted string. As a consequence, it is possible to leak memory
contents either by calling git-log(1) or via git-archive(1) when any of
the archived files is marked with the `export-subst` gitattribute.

    ==10888==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000398 at pc 0x7f0356047cb2 bp 0x7fff3ffb95d0 sp 0x7fff3ffb8d78
    READ of size 1 at 0x602000000398 thread T0
        #0 0x7f0356047cb1 in __interceptor_strchrnul /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725
        #1 0x563b7cec9a43 in strbuf_expand strbuf.c:417
        #2 0x563b7cda7060 in repo_format_commit_message pretty.c:1869
        #3 0x563b7cda8d0f in pretty_print_commit pretty.c:2161
        #4 0x563b7cca04c8 in show_log log-tree.c:781
        #5 0x563b7cca36ba in log_tree_commit log-tree.c:1117
        #6 0x563b7c927ed5 in cmd_log_walk_no_free builtin/log.c:508
        #7 0x563b7c92835b in cmd_log_walk builtin/log.c:549
        #8 0x563b7c92b1a2 in cmd_log builtin/log.c:883
        #9 0x563b7c802993 in run_builtin git.c:466
        #10 0x563b7c803397 in handle_builtin git.c:721
        #11 0x563b7c803b07 in run_argv git.c:788
        #12 0x563b7c8048a7 in cmd_main git.c:923
        #13 0x563b7ca99682 in main common-main.c:57
        #14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    0x602000000398 is located 0 bytes to the right of 8-byte region [0x602000000390,0x602000000398)
    allocated by thread T0 here:
        #0 0x7f0356072faa in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:439
        #1 0x563b7cf7317c in xstrdup wrapper.c:39
        #2 0x563b7cd9a06a in save_user_format pretty.c:40
        #3 0x563b7cd9b3e5 in get_commit_format pretty.c:173
        #4 0x563b7ce54ea0 in handle_revision_opt revision.c:2456
        #5 0x563b7ce597c9 in setup_revisions revision.c:2850
        #6 0x563b7c9269e0 in cmd_log_init_finish builtin/log.c:269
        #7 0x563b7c927362 in cmd_log_init builtin/log.c:348
        #8 0x563b7c92b193 in cmd_log builtin/log.c:882
        #9 0x563b7c802993 in run_builtin git.c:466
        #10 0x563b7c803397 in handle_builtin git.c:721
        #11 0x563b7c803b07 in run_argv git.c:788
        #12 0x563b7c8048a7 in cmd_main git.c:923
        #13 0x563b7ca99682 in main common-main.c:57
        #14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725 in __interceptor_strchrnul
    Shadow bytes around the buggy address:
      0x0c047fff8020: fa fa fd fd fa fa 00 06 fa fa 05 fa fa fa fd fd
      0x0c047fff8030: fa fa 00 02 fa fa 06 fa fa fa 05 fa fa fa fd fd
      0x0c047fff8040: fa fa 00 07 fa fa 03 fa fa fa fd fd fa fa 00 00
      0x0c047fff8050: fa fa 00 01 fa fa fd fd fa fa 00 00 fa fa 00 01
      0x0c047fff8060: fa fa 00 06 fa fa 00 06 fa fa 05 fa fa fa 05 fa
    =>0x0c047fff8070: fa fa 00[fa]fa fa fd fa fa fa fd fd fa fa fd fd
      0x0c047fff8080: fa fa fd fd fa fa 00 00 fa fa 00 fa fa fa fd fa
      0x0c047fff8090: fa fa fd fd fa fa 00 00 fa fa fa fa fa fa fa fa
      0x0c047fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==10888==ABORTING

Fix this bug by checking whether `end` points at the trailing NUL byte.
Add a test which catches this out-of-bounds read and which demonstrates
that we used to write out-of-bounds data into the formatted message.

Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Original-patch-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
b49f309aa1 pretty: fix out-of-bounds read when left-flushing with stealing
With the `%>>(<N>)` pretty formatter, you can ask git-log(1) et al to
steal spaces. To do so we need to look ahead of the next token to see
whether there are spaces there. This loop takes into account ANSI
sequences that end with an `m`, and if it finds any it will skip them
until it finds the first space. While doing so it does not take into
account the buffer's limits though and easily does an out-of-bounds
read.

Add a test that hits this behaviour. While we don't have an easy way to
verify this, the test causes the following failure when run with
`SANITIZE=address`:

    ==37941==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000000baf at pc 0x55ba6f88e0d0 bp 0x7ffc84c50d20 sp 0x7ffc84c50d10
    READ of size 1 at 0x603000000baf thread T0
        #0 0x55ba6f88e0cf in format_and_pad_commit pretty.c:1712
        #1 0x55ba6f88e7b4 in format_commit_item pretty.c:1801
        #2 0x55ba6f9b1ae4 in strbuf_expand strbuf.c:429
        #3 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        #4 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #5 0x55ba6f7884c8 in show_log log-tree.c:781
        #6 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        #7 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        #8 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        #9 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        #10 0x55ba6f2ea993 in run_builtin git.c:466
        #11 0x55ba6f2eb397 in handle_builtin git.c:721
        #12 0x55ba6f2ebb07 in run_argv git.c:788
        #13 0x55ba6f2ec8a7 in cmd_main git.c:923
        #14 0x55ba6f581682 in main common-main.c:57
        #15 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #16 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #17 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000000baf is located 1 bytes to the left of 24-byte region [0x603000000bb0,0x603000000bc8)
    allocated by thread T0 here:
        #0 0x7f2d08ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x55ba6fa5b494 in xrealloc wrapper.c:136
        #2 0x55ba6f9aefdc in strbuf_grow strbuf.c:99
        #3 0x55ba6f9b0a06 in strbuf_add strbuf.c:298
        #4 0x55ba6f9b1a25 in strbuf_expand strbuf.c:418
        #5 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        #6 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #7 0x55ba6f7884c8 in show_log log-tree.c:781
        #8 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        #9 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        #10 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        #11 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        #12 0x55ba6f2ea993 in run_builtin git.c:466
        #13 0x55ba6f2eb397 in handle_builtin git.c:721
        #14 0x55ba6f2ebb07 in run_argv git.c:788
        #15 0x55ba6f2ec8a7 in cmd_main git.c:923
        #16 0x55ba6f581682 in main common-main.c:57
        #17 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #18 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #19 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow pretty.c:1712 in format_and_pad_commit
    Shadow bytes around the buggy address:
      0x0c067fff8120: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
      0x0c067fff8130: fd fd fa fa fd fd fd fd fa fa fd fd fd fa fa fa
      0x0c067fff8140: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff8150: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd
      0x0c067fff8160: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
    =>0x0c067fff8170: fd fd fd fa fa[fa]00 00 00 fa fa fa 00 00 00 fa
      0x0c067fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb

Luckily enough, this would only cause us to copy the out-of-bounds data
into the formatted commit in case we really had an ANSI sequence
preceding our buffer. So this bug likely has no security consequences.

Fix it regardless by not traversing past the buffer's start.

Reported-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
81dc898df9 pretty: fix out-of-bounds write caused by integer overflow
When using a padding specifier in the pretty format passed to git-log(1)
we need to calculate the string length in several places. These string
lengths are stored in `int`s though, which means that these can easily
overflow when the input lengths exceeds 2GB. This can ultimately lead to
an out-of-bounds write when these are used in a call to memcpy(3P):

        ==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588
    WRITE of size 1 at 0x7f1ec62f97fe thread T0
        #0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762
        #2 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        #3 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        #5 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        #6 0x5628e95a44c8 in show_log log-tree.c:781
        #7 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        #8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        #9 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        #10 0x5628e922f1a2 in cmd_log builtin/log.c:883
        #11 0x5628e9106993 in run_builtin git.c:466
        #12 0x5628e9107397 in handle_builtin git.c:721
        #13 0x5628e9107b07 in run_argv git.c:788
        #14 0x5628e91088a7 in cmd_main git.c:923
        #15 0x5628e939d682 in main common-main.c:57
        #16 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839)
    allocated by thread T0 here:
        #0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5628e98774d4 in xrealloc wrapper.c:136
        #2 0x5628e97cb01c in strbuf_grow strbuf.c:99
        #3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327
        #4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761
        #5 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        #6 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        #8 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        #9 0x5628e95a44c8 in show_log log-tree.c:781
        #10 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        #11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        #12 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        #13 0x5628e922f1a2 in cmd_log builtin/log.c:883
        #14 0x5628e9106993 in run_builtin git.c:466
        #15 0x5628e9107397 in handle_builtin git.c:721
        #16 0x5628e9107b07 in run_argv git.c:788
        #17 0x5628e91088a7 in cmd_main git.c:923
        #18 0x5628e939d682 in main common-main.c:57
        #19 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    =>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]
      0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==8340==ABORTING

The pretty format can also be used in `git archive` operations via the
`export-subst` attribute. So this is what in our opinion makes this a
critical issue in the context of Git forges which allow to download an
archive of user supplied Git repositories.

Fix this vulnerability by using `size_t` instead of `int` to track the
string lengths. Add tests which detect this vulnerability when Git is
compiled with the address sanitizer.

Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Original-patch-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Modified-by: Taylor  Blau <me@ttalorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
a244dc5b0a test-lib: add prerequisite for 64-bit platforms
Allow tests that assume a 64-bit `size_t` to be skipped in 32-bit
platforms and regardless of the size of `long`.

This imitates the `LONG_IS_64BIT` prerequisite.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:04 +09:00
bd5df96b79 RelNotes: a couple of typofixes
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 13:36:49 +09:00
35c194dc57 t1509: facilitate repeated script invocations
t1509-root-work-tree.sh, which tests behavior of a Git repository
located at the root `/` directory, refuses to run if it detects the
presence of an existing repository at `/`. This safeguard ensures that
it won't clobber a legitimate repository at that location. However,
because t1509 does a poor job of cleaning up after itself, it runs afoul
of its own safety check on subsequent runs, which makes it painful to
run the script repeatedly since each run requires manual cleanup of
detritus from the previous run.

Address this shortcoming by making t1509 clean up after itself as its
last action. This is safe since the script can only make it to this
cleanup action if it did not find a legitimate repository at `/` in the
first place, so the resources cleaned up here can only have been created
by the script itself.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
2022-12-09 10:41:59 +09:00
ce153b8d4d t1509: make "setup" test more robust
One of the t1509 setup tests is very particular about the output it
expects from `git init`, and fails if the output differs even slightly
which can happen easily if the script is run multiple times since it
doesn't do a good job of cleaning up after itself (i.e. it leaves
detritus in the root directory `/`). One bit of cruft in particular
(`/HEAD`) makes the test fail since its presence causes `git init` to
alter its output; rather than reporting "Initialized empty Git
repository", it instead reports "Reinitialized existing Git repository"
when `/HEAD` is present. Address this problem by making the test do a
more careful job of crafting its intended initial state.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
2022-12-09 10:41:58 +09:00
7790b8c6b5 t1509: fix failing "root work tree" test due to owner-check
When 8959555cee (setup_git_directory(): add an owner check for the
top-level directory, 2022-03-02) tightened security surrounding
directory ownership, it neglected to adjust t1509-root-work-tree.sh to
take the new restriction into account. As a result, since the root
directory `/` is typically not owned by the user running the test
(indeed, t1509 refuses to run as `root`), the ownership check added
by 8959555cee kicks in and causes the test to fail:

    fatal: detected dubious ownership in repository at '/'
    To add an exception for this directory, call:

        git config --global --add safe.directory /

This problem went unnoticed for so long because t1509 is rarely run
since it requires setting up a `chroot` environment or a sacrificial
virtual machine in which `/` can be made writable and polluted by any
user.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
2022-12-09 10:41:58 +09:00
e5a9f4e57d Merge branch 'turkish' of github.com:bitigchi/git-po
* 'turkish' of github.com:bitigchi/git-po:
  l10n: tr: v2.39.0 updates
2022-12-08 08:25:27 +08:00
31e19ec5ee Merge branch 'catalan' of github.com:Softcatala/git-po
* 'catalan' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2022-12-08 08:24:56 +08:00
c72d15ec68 Merge branch 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po
* 'fz/po-zh_CN' of github.com:fangyi-zhou/git-po:
  l10n: zh_CN v2.39.0 round 1
2022-12-08 08:22:57 +08:00
f115c96e7a CI: migrate away from deprecated "set-output" syntax
As noted in [1] and the warnings the CI itself is spewing echoing
outputs to stdout is deprecated, and they should be written to
"$GITHUB_OUTPUT" instead.

1. https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-08 08:47:22 +09:00
1f398446c3 ci: avoid using deprecated {up,down}load-artifacts Action
The deprecated versions of these Actions still use node.js 12 whereas
workflows will need to use node.js 16 to avoid problems going forward.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-08 08:15:23 +09:00
d8b21a0fe2 CI: don't explicitly pick "bash" shell outside of Windows, fix regression
When the "js/ci-github-workflow-markup" topic was originally merged in
[1] it included a change to get rid of the "ci/print-test-failures.sh"
step[2]. This was then brought back in [3] as part of a fix-up patches
on top[4].

The problem was that [3] was not a revert of the relevant parts of
[2], but rather copy/pasted the "ci/print-test-failures.sh" step that
was present for the Windows job to all "ci/print-test-failures.sh"
steps. The Windows steps specified "shell: bash", but the non-Windows
ones did not.

This broke the "ci/print/test-failures.sh" step for the "linux-musl"
job, where we don't have a "bash" shell, just a "/bin/sh" (a
"dash"). This breakage was reported at the time[5], but hadn't been
fixed.

It would be sufficient to change this only for "linux-musl", but let's
change this for both "regular" and "dockerized" to omit the "shell"
line entirely, as we did before [2].

Let's also change undo the "name" change that [3] made while
copy/pasting the "print test failures" step for the Windows job. These
steps are now the same as they were before [2], except that the "if"
includes the "env.FAILED_TEST_ARTIFACTS" test.

1. fc5a070f59 (Merge branch 'js/ci-github-workflow-markup', 2022-06-07)
2. 08dccc8fc1 (ci: make it easier to find failed tests' logs in the
   GitHub workflow, 2022-05-21)
3. 5aeb145780 (ci(github): bring back the 'print test failures' step,
   2022-06-08)
4. d0d96b8280 (Merge branch 'js/ci-github-workflow-markup', 2022-06-17)
5. https://lore.kernel.org/git/220725.86sfmpneqp.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-08 08:06:00 +09:00
01e84b4517 l10n: tr: v2.39.0 updates
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2022-12-07 18:05:59 +03:00
bd390bce17 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2022-12-07 07:35:32 +01:00
d11192255d CI: upgrade to macos-12, and pin OSX version
Per [1] and the warnings our CI is emitting GitHub is phasing in
"macos-12" as their "macos-latest".

As with [2], let's pin our image to a specific version so that we're
not having it swept from under us, and our upgrade cycle can be more
predictable than whenever GitHub changes their images.

1. https://github.com/actions/runner-images/issues/6384
2. 0178420b9c (github-actions: run gcc-8 on ubuntu-20.04 image,
   2022-11-25)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-07 13:36:22 +09:00
86325d36e6 t3920: support CR-eating grep
grep(1) converts CRLF line endings to LF on current MinGW:

   $ uname -sr
   MINGW64_NT-10.0-22621 3.3.6-341.x86_64

   $ printf 'a\r\n' | hexdump.exe -C
   00000000  61 0d 0a                                          |a..|
   00000003

   $ printf 'a\r\n' | grep . | hexdump.exe -C
   00000000  61 0a                                             |a.|
   00000002

Create the intended test file by grepping the original file with LF
line endings and adding CRs explicitly.

The missing CRs went unnoticed because test_cmp on MinGW ignores line
endings since 4d715ac05c (Windows: a test_cmp that is agnostic to random
LF <> CRLF conversions, 2013-10-26).  Fix this test anyway to avoid
depending on that special test_cmp behavior, especially since this is
the only test that needs it.

Piping the output of grep(1) through append_cr has the side-effect of
ignoring its return value.  That means we no longer need the explicit
"|| true" to support commit messages without a body.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-07 13:33:18 +09:00
95494c6f61 t0021: use Windows-friendly pwd
In Git for Windows, when passing paths from shell scripts to regular
Win32 executables, thanks to the MSYS2 runtime a somewhat magic path
conversion happens that lets the shell script think that there is a file
at `/git/Makefile` and the Win32 process it spawned thinks that the
shell script said `C:/git-sdk-64/git/Makefile` instead.

This conversion is documented in detail over here:
https://www.msys2.org/docs/filesystem-paths/#automatic-unix-windows-path-conversion

As all automatic conversions, there are gaps. For example, to avoid
mistaking command-line options like `/LOG=log.txt` (which are quite
common in the Windows world) from being mistaken for a Unix-style
absolute path, the MSYS2 runtime specifically exempts arguments
containing a `=` character from that conversion.

We are about to change `test_cmp` to use `git diff --no-index`, which
involves spawning precisely such a Win32 process.

In combination, this would cause a failure in `t0021-conversion.sh`
where we pass an absolute path containing an equal character to the
`test_cmp` function.

Seeing as the Unix tools like `cp` and `diff` that are used by Git's
test suite in the Git for Windows SDK (thanks to the MSYS2 project)
understand both Unix-style as well as Windows-style paths, we can stave
off this problem by simply switching to Windows-style paths and
side-stepping the need for any automatic path conversion.

Note: The `PATH` variable is obviously special, as it is colon-separated
in the MSYS2 Bash used by Git for Windows, and therefore _cannot_
contain absolute Windows-style paths, lest the colon after the drive
letter is mistaken for a path separator. Therefore, we need to be
careful to keep the Unix-style when modifying the `PATH` variable.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-07 13:22:58 +09:00
c4f732bd42 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5501t)
2022-12-07 09:23:49 +08:00
84f7e2b926 Merge branch 'l10n-de-2.39' of github.com:ralfth/git
* 'l10n-de-2.39' of github.com:ralfth/git:
  l10n: de.po: update German translation
2022-12-07 09:23:24 +08:00
87292b4d64 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.39 (round 1)
2022-12-07 09:22:17 +08:00
b50a9a86be Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5501t0f0)
2022-12-07 09:21:49 +08:00
08714ee16a Merge branch 'fr_v2.39_rnd1' of github.com:jnavila/git
* 'fr_v2.39_rnd1' of github.com:jnavila/git:
  l10n: fr: v2.39 rnd 1
2022-12-07 09:21:25 +08:00
3457ed7f2e l10n: bg.po: Updated Bulgarian translation (5501t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2022-12-06 17:17:34 +01:00
2e71cbbddd Git 2.39-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-06 09:49:31 +09:00
6cf4d908a9 ci(main): upgrade actions/checkout to v3
To be up to date with actions/checkout opens the door to use the latest
features if necessary and get the latest security patches.

This also avoids a couple of deprecation warnings in the CI runs.

Note: The `actions/checkout` Action has been known to be broken in i686
containers as of v2, therefore we keep forcing it to v1 there. See
actions/runner#2115 for more details.

Signed-off-by: Oscar Dominguez <dominguez.celada@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-06 08:22:15 +09:00
3c50032ff5 attr: ignore overly large gitattributes files
Similar as with the preceding commit, start ignoring gitattributes files
that are overly large to protect us against out-of-bounds reads and
writes caused by integer overflows. Unfortunately, we cannot just define
"overly large" in terms of any preexisting limits in the codebase.

Instead, we choose a very conservative limit of 100MB. This is plenty of
room for specifying gitattributes, and incidentally it is also the limit
for blob sizes for GitHub. While we don't want GitHub to dictate limits
here, it is still sensible to use this fact for an informed decision
given that it is hosting a huge set of repositories. Furthermore, over
at GitLab we scanned a subset of repositories for their root-level
attribute files. We found that 80% of them have a gitattributes file
smaller than 100kB, 99.99% have one smaller than 1MB, and only a single
repository had one that was almost 3MB in size. So enforcing a limit of
100MB seems to give us ample of headroom.

With this limit in place we can be reasonably sure that there is no easy
way to exploit the gitattributes file via integer overflows anymore.
Furthermore, it protects us against resource exhaustion caused by
allocating the in-memory data structures required to represent the
parsed attributes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:50:03 +09:00
dfa6b32b5e attr: ignore attribute lines exceeding 2048 bytes
There are two different code paths to read gitattributes: once via a
file, and once via the index. These two paths used to behave differently
because when reading attributes from a file, we used fgets(3P) with a
buffer size of 2kB. Consequentially, we silently truncate line lengths
when lines are longer than that and will then parse the remainder of the
line as a new pattern. It goes without saying that this is entirely
unexpected, but it's even worse that the behaviour depends on how the
gitattributes are parsed.

While this is simply wrong, the silent truncation saves us with the
recently discovered vulnerabilities that can cause out-of-bound writes
or reads with unreasonably long lines due to integer overflows. As the
common path is to read gitattributes via the worktree file instead of
via the index, we can assume that any gitattributes file that had lines
longer than that is already broken anyway. So instead of lifting the
limit here, we can double down on it to fix the vulnerabilities.

Introduce an explicit line length limit of 2kB that is shared across all
paths that read attributes and ignore any line that hits this limit
while printing a warning.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:33:07 +09:00
d74b1fd54f attr: fix silently splitting up lines longer than 2048 bytes
When reading attributes from a file we use fgets(3P) with a buffer size
of 2048 bytes. This means that as soon as a line exceeds the buffer size
we split it up into multiple parts and parse each of them as a separate
pattern line. This is of course not what the user intended, and even
worse the behaviour is inconsistent with how we read attributes from the
index.

Fix this bug by converting the code to use `strbuf_getline()` instead.
This will indeed read in the whole line, which may theoretically lead to
an out-of-memory situation when the gitattributes file is huge. We're
about to reject any gitattributes files larger than 100MB in the next
commit though, which makes this less of a concern.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:29:30 +09:00
a60a66e409 attr: harden allocation against integer overflows
When parsing an attributes line, we need to allocate an array that holds
all attributes specified for the given file pattern. The calculation to
determine the number of bytes that need to be allocated was prone to an
overflow though when there was an unreasonable amount of attributes.

Harden the allocation by instead using the `st_` helper functions that
cause us to die when we hit an integer overflow.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
e1e12e97ac attr: fix integer overflow with more than INT_MAX macros
Attributes have a field that tracks the position in the `all_attrs`
array they're stored inside. This field gets set via `hashmap_get_size`
when adding the attribute to the global map of attributes. But while the
field is of type `int`, the value returned by `hashmap_get_size` is an
`unsigned int`. It can thus happen that the value overflows, where we
would now dereference teh `all_attrs` array at an out-of-bounds value.

We do have a sanity check for this overflow via an assert that verifies
the index matches the new hashmap's size. But asserts are not a proper
mechanism to detect against any such overflows as they may not in fact
be compiled into production code.

Fix this by using an `unsigned int` to track the index and convert the
assert to a call `die()`.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
447ac906e1 attr: fix out-of-bounds read with unreasonable amount of patterns
The `struct attr_stack` tracks the stack of all patterns together with
their attributes. When parsing a gitattributes file that has more than
2^31 such patterns though we may trigger multiple out-of-bounds reads on
64 bit platforms. This is because while the `num_matches` variable is an
unsigned integer, we always use a signed integer to iterate over them.

I have not been able to reproduce this issue due to memory constraints
on my systems. But despite the out-of-bounds reads, the worst thing that
can seemingly happen is to call free(3P) with a garbage pointer when
calling `attr_stack_free()`.

Fix this bug by using unsigned integers to iterate over the array. While
this makes the iteration somewhat awkward when iterating in reverse, it
is at least better than knowingly running into an out-of-bounds read.
While at it, convert the call to `ALLOC_GROW` to use `ALLOC_GROW_BY`
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
34ace8bad0 attr: fix out-of-bounds write when parsing huge number of attributes
It is possible to trigger an integer overflow when parsing attribute
names when there are more than 2^31 of them for a single pattern. This
can either lead to us dying due to trying to request too many bytes:

     blob=$(perl -e 'print "f" . " a=" x 2147483649' | git hash-object -w --stdin)
     git update-index --add --cacheinfo 100644,$blob,.gitattributes
     git attr-check --all file

    =================================================================
    ==1022==ERROR: AddressSanitizer: requested allocation size 0xfffffff800000032 (0xfffffff800001038 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)
        #0 0x7fd3efabf411 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
        #1 0x5563a0a1e3d3 in xcalloc wrapper.c:150
        #2 0x5563a058d005 in parse_attr_line attr.c:384
        #3 0x5563a058e661 in handle_attr_line attr.c:660
        #4 0x5563a058eddb in read_attr_from_index attr.c:769
        #5 0x5563a058ef12 in read_attr attr.c:797
        #6 0x5563a058f24c in bootstrap_attr_stack attr.c:867
        #7 0x5563a058f4a3 in prepare_attr_stack attr.c:902
        #8 0x5563a05905da in collect_some_attrs attr.c:1097
        #9 0x5563a059093d in git_all_attrs attr.c:1128
        #10 0x5563a02f636e in check_attr builtin/check-attr.c:67
        #11 0x5563a02f6c12 in cmd_check_attr builtin/check-attr.c:183
        #12 0x5563a02aa993 in run_builtin git.c:466
        #13 0x5563a02ab397 in handle_builtin git.c:721
        #14 0x5563a02abb2b in run_argv git.c:788
        #15 0x5563a02ac991 in cmd_main git.c:926
        #16 0x5563a05432bd in main common-main.c:57
        #17 0x7fd3ef82228f  (/usr/lib/libc.so.6+0x2328f)

    ==1022==HINT: if you don't care about these errors you may set allocator_may_return_null=1
    SUMMARY: AddressSanitizer: allocation-size-too-big /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 in __interceptor_calloc
    ==1022==ABORTING

Or, much worse, it can lead to an out-of-bounds write because we
underallocate and then memcpy(3P) into an array:

    perl -e '
        print "A " . "\rh="x2000000000;
        print "\rh="x2000000000;
        print "\rh="x294967294 . "\n"
    ' >.gitattributes
    git add .gitattributes
    git commit -am "evil attributes"

    $ git clone --quiet /path/to/repo
    =================================================================
    ==15062==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000002550 at pc 0x5555559884d5 bp 0x7fffffffbc60 sp 0x7fffffffbc58
    WRITE of size 8 at 0x602000002550 thread T0
        #0 0x5555559884d4 in parse_attr_line attr.c:393
        #1 0x5555559884d4 in handle_attr_line attr.c:660
        #2 0x555555988902 in read_attr_from_index attr.c:784
        #3 0x555555988902 in read_attr_from_index attr.c:747
        #4 0x555555988a1d in read_attr attr.c:800
        #5 0x555555989b0c in bootstrap_attr_stack attr.c:882
        #6 0x555555989b0c in prepare_attr_stack attr.c:917
        #7 0x555555989b0c in collect_some_attrs attr.c:1112
        #8 0x55555598b141 in git_check_attr attr.c:1126
        #9 0x555555a13004 in convert_attrs convert.c:1311
        #10 0x555555a95e04 in checkout_entry_ca entry.c:553
        #11 0x555555d58bf6 in checkout_entry entry.h:42
        #12 0x555555d58bf6 in check_updates unpack-trees.c:480
        #13 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
        #14 0x555555785ab7 in checkout builtin/clone.c:724
        #15 0x555555785ab7 in cmd_clone builtin/clone.c:1384
        #16 0x55555572443c in run_builtin git.c:466
        #17 0x55555572443c in handle_builtin git.c:721
        #18 0x555555727872 in run_argv git.c:788
        #19 0x555555727872 in cmd_main git.c:926
        #20 0x555555721fa0 in main common-main.c:57
        #21 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308
        #22 0x555555723f39 in _start (git+0x1cff39)

    0x602000002552 is located 0 bytes to the right of 2-byte region [0x602000002550,0x602000002552) allocated by thread T0 here:
        #0 0x7ffff768c037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
        #1 0x555555d7fff7 in xcalloc wrapper.c:150
        #2 0x55555598815f in parse_attr_line attr.c:384
        #3 0x55555598815f in handle_attr_line attr.c:660
        #4 0x555555988902 in read_attr_from_index attr.c:784
        #5 0x555555988902 in read_attr_from_index attr.c:747
        #6 0x555555988a1d in read_attr attr.c:800
        #7 0x555555989b0c in bootstrap_attr_stack attr.c:882
        #8 0x555555989b0c in prepare_attr_stack attr.c:917
        #9 0x555555989b0c in collect_some_attrs attr.c:1112
        #10 0x55555598b141 in git_check_attr attr.c:1126
        #11 0x555555a13004 in convert_attrs convert.c:1311
        #12 0x555555a95e04 in checkout_entry_ca entry.c:553
        #13 0x555555d58bf6 in checkout_entry entry.h:42
        #14 0x555555d58bf6 in check_updates unpack-trees.c:480
        #15 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
        #16 0x555555785ab7 in checkout builtin/clone.c:724
        #17 0x555555785ab7 in cmd_clone builtin/clone.c:1384
        #18 0x55555572443c in run_builtin git.c:466
        #19 0x55555572443c in handle_builtin git.c:721
        #20 0x555555727872 in run_argv git.c:788
        #21 0x555555727872 in cmd_main git.c:926
        #22 0x555555721fa0 in main common-main.c:57
        #23 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308

    SUMMARY: AddressSanitizer: heap-buffer-overflow attr.c:393 in parse_attr_line
    Shadow bytes around the buggy address:
      0x0c047fff8450: fa fa 00 02 fa fa 00 07 fa fa fd fd fa fa 00 00
      0x0c047fff8460: fa fa 02 fa fa fa fd fd fa fa 00 06 fa fa 05 fa
      0x0c047fff8470: fa fa fd fd fa fa 00 02 fa fa 06 fa fa fa 05 fa
      0x0c047fff8480: fa fa 07 fa fa fa fd fd fa fa 00 01 fa fa 00 02
      0x0c047fff8490: fa fa 00 03 fa fa 00 fa fa fa 00 01 fa fa 00 03
    =>0x0c047fff84a0: fa fa 00 01 fa fa 00 02 fa fa[02]fa fa fa fa fa
      0x0c047fff84b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
      Shadow gap:              cc
    ==15062==ABORTING

Fix this bug by using `size_t` instead to count the number of attributes
so that this value cannot reasonably overflow without running out of
memory before already.

Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
2455720950 attr: fix integer overflow when parsing huge attribute names
It is possible to trigger an integer overflow when parsing attribute
names that are longer than 2^31 bytes because we assign the result of
strlen(3P) to an `int` instead of to a `size_t`. This can lead to an
abort in vsnprintf(3P) with the following reproducer:

    blob=$(perl -e 'print "A " . "B"x2147483648 . "\n"' | git hash-object -w --stdin)
    git update-index --add --cacheinfo 100644,$blob,.gitattributes
    git check-attr --all path

    BUG: strbuf.c:400: your vsnprintf is broken (returned -1)

But furthermore, assuming that the attribute name is even longer than
that, it can cause us to silently truncate the attribute and thus lead
to wrong results.

Fix this integer overflow by using a `size_t` instead. This fixes the
silent truncation of attribute names, but it only partially fixes the
BUG we hit: even though the initial BUG is fixed, we can still hit a BUG
when parsing invalid attribute lines via `report_invalid_attr()`.

This is due to an underlying design issue in vsnprintf(3P) which only
knows to return an `int`, and thus it may always overflow with large
inputs. This issue is benign though: the worst that can happen is that
the error message is misreported to be either truncated or too long, but
due to the buffer being NUL terminated we wouldn't ever do an
out-of-bounds read here.

Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
8d0d48cf21 attr: fix out-of-bounds read with huge attribute names
There is an out-of-bounds read possible when parsing gitattributes that
have an attribute that is 2^31+1 bytes long. This is caused due to an
integer overflow when we assign the result of strlen(3P) to an `int`,
where we use the wrapped-around value in a subsequent call to
memcpy(3P). The following code reproduces the issue:

    blob=$(perl -e 'print "a" x 2147483649 . " attr"' | git hash-object -w --stdin)
    git update-index --add --cacheinfo 100644,$blob,.gitattributes
    git check-attr --all file

    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==8451==ERROR: AddressSanitizer: SEGV on unknown address 0x7f93efa00800 (pc 0x7f94f1f8f082 bp 0x7ffddb59b3a0 sp 0x7ffddb59ab28 T0)
    ==8451==The signal is caused by a READ memory access.
        #0 0x7f94f1f8f082  (/usr/lib/libc.so.6+0x176082)
        #1 0x7f94f2047d9c in __interceptor_strspn /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:752
        #2 0x560e190f7f26 in parse_attr_line attr.c:375
        #3 0x560e190f9663 in handle_attr_line attr.c:660
        #4 0x560e190f9ddd in read_attr_from_index attr.c:769
        #5 0x560e190f9f14 in read_attr attr.c:797
        #6 0x560e190fa24e in bootstrap_attr_stack attr.c:867
        #7 0x560e190fa4a5 in prepare_attr_stack attr.c:902
        #8 0x560e190fb5dc in collect_some_attrs attr.c:1097
        #9 0x560e190fb93f in git_all_attrs attr.c:1128
        #10 0x560e18e6136e in check_attr builtin/check-attr.c:67
        #11 0x560e18e61c12 in cmd_check_attr builtin/check-attr.c:183
        #12 0x560e18e15993 in run_builtin git.c:466
        #13 0x560e18e16397 in handle_builtin git.c:721
        #14 0x560e18e16b2b in run_argv git.c:788
        #15 0x560e18e17991 in cmd_main git.c:926
        #16 0x560e190ae2bd in main common-main.c:57
        #17 0x7f94f1e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #18 0x7f94f1e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #19 0x560e18e110e4 in _start ../sysdeps/x86_64/start.S:115

    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/usr/lib/libc.so.6+0x176082)
    ==8451==ABORTING

Fix this bug by converting the variable to a `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
eb22e7dfa2 attr: fix overflow when upserting attribute with overly long name
The function `git_attr_internal()` is called to upsert attributes into
the global map. And while all callers pass a `size_t`, the function
itself accepts an `int` as the attribute name's length. This can lead to
an integer overflow in case the attribute name is longer than `INT_MAX`.

Now this overflow seems harmless as the first thing we do is to call
`attr_name_valid()`, and that function only succeeds in case all chars
in the range of `namelen` match a certain small set of chars. We thus
can't do an out-of-bounds read as NUL is not part of that set and all
strings passed to this function are NUL-terminated. And furthermore, we
wouldn't ever read past the current attribute name anyway due to the
same reason. And if validation fails we will return early.

On the other hand it feels fragile to rely on this behaviour, even more
so given that we pass `namelen` to `FLEX_ALLOC_MEM()`. So let's instead
just do the correct thing here and accept a `size_t` as line length.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
395bec6b39 Merge branch 'jk/avoid-redef-system-functions-2.30' into jk/avoid-redef-system-functions
* jk/avoid-redef-system-functions-2.30:
  git-compat-util: undefine system names before redeclaring them
2022-12-05 12:16:00 +09:00
e1a95b78d8 git-compat-util: undefine system names before redeclaring them
When we define a macro to point a system function (e.g., flockfile) to
our custom wrapper, we should make sure that the system did not already
define it as a macro. This is rarely a problem, but can cause
compilation failures if both of these are true:

  - we decide to define our own wrapper even though the system provides
    the function; we know this happens at least with uclibc, which may
    declare flockfile, etc, without _POSIX_THREAD_SAFE_FUNCTIONS

  - the system version is declared as a macro; we know this happens at
    least with uclibc's version of getc_unlocked()

So just handling getc_unlocked() would be sufficient to deal with the
real-world case we've seen. But since it's easy to do, we may as well be
defensive about the other macro wrappers added in the previous patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 12:15:37 +09:00
786e67611d maintenance: compare output of pthread functions for inequality with 0
The documentation for pthread_create and pthread_sigmask state that:

"On success, pthread_create() returns 0;
on error, it returns an error number"

As such, we ought to check for an error
by seeing if the output is not 0.

Checking for "less than" is a mistake
as the error code numbers can be greater than 0.

Signed-off-by: Seija <doremylover123@gmail.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 10:15:54 +09:00
500317ae03 t3920: don't ignore errors of more than one command with || true
It is customary to write `A || true` to ignore a potential error exit of
command A. But when we have a sequence `A && B && C || true && D`, then
a failure of any of A, B, or C skips to D right away. This is not
intended here. Turn the command whose failure is to be ignored into a
compound command to ensure it is the only one that is allowed to fail.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 10:02:34 +09:00
5f3bfdc4f3 t4023: fix ignored exit codes of git
Change a "git diff-tree" command to be &&-chained so that we won't
ignore its exit code, see the ea05fd5fbf (Merge branch
'ab/keep-git-exit-codes-in-tests', 2022-03-16) topic for prior art.

This fixes code added in b45563a229 (rename: Break filepairs with
different types., 2007-11-30). Due to hiding the exit code we hid a
memory leak under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 09:28:04 +09:00
4d81ce1b99 t7600: don't ignore "rev-parse" exit code in helper
Change the verify_mergeheads() helper the check the exit code of "git
rev-parse".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 09:27:32 +09:00
e77b88f728 l10n: de.po: update German translation
Reviewed-by: Matthias Rüster <matthias.ruester@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2022-12-02 17:28:32 +01:00
459419567a l10n: zh_CN v2.39.0 round 1
- Revise translation of 'stale'

Reviewed-by: 依云 <lilydjwg@gmail.com>
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2022-12-02 14:04:41 +00:00
243caa8982 t5314: check exit code of "git"
Amend the test added in [1] to check the exit code of the "git"
invocations. An in-flight change[2] introduced a memory leak in these
invocations, which went undetected unless we were running under
"GIT_TEST_SANITIZE_LEAK_LOG=true".

Note that the in-flight change made 8 test files fail, but as far as I
can tell only this one would have had its exit code hidden unless
under "GIT_TEST_SANITIZE_LEAK_LOG=true". The rest would be caught
without it.

We could pick other variable names here than "ln%d", e.g. "commit",
"dummy_blob" and "file_blob", but having the "rev-parse" invocations
aligned makes the difference between them more readable, so let's pick
"ln%d".

1. 4cf2143e02 (pack-objects: break delta cycles before delta-search
   phase, 2016-08-11)
2. https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/
3. faececa53f (test-lib: have the "check" mode for SANITIZE=leak
   consider leak logs, 2022-07-28)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 16:38:12 +09:00
6692d45477 fsmonitor: fix race seen in t7527
Fix racy tests in t7527 by forcing the use of cookie files during all
types of queries.  There were originaly observed on M1 macs with file
system encryption enabled.

There were a series of simple tests, such as "edit some files" and
"create some files", that started the daemon with GIT_TRACE_FSMONITOR
enabled so that the daemon would emit "event: <path>" messages to the
trace log.  The test would make worktree modifications and then grep
the log file to confirm it contained the expected trace messages.
The greps would occasionally racily-fail.  The expected messages
were always present in the log file, just not yet always present
when the greps ran.

NEEDSWORK: One could argue that the tests should use the `test-tool
fsmonitor-client query` and search for the expected pathnames in the
output rather than grepping the trace log, but I'll leave that for a
later exercise.

The racy tests called `test-tool fsmonitor-client query --token 0`
before grepping the log file.  (Presumably to introduce a small delay
and/or to let the daemon sync with the file system following the last
modification, but that was not always sufficient and hence the race.)

When the query arg is just "0", the daemon treated it as a V1
(aka timestamp-relative request) and responded with a "trivial
response" and a new token, but without trying to catch up to the
the file system event stream.  So the "event: <path>" messages
may or may not yet be in the log file when the grep commands
started.

FWIW, if the tests had sent `--token builtin:0:0` instead, it would
have forced a slightly different code path in the daemon that would
cause the daemon to use a cookie file and let it catch up with the
file system event stream.  I did not see any test failures with this
change.

Instead of modifying the test, I updated the fsmonitor--daemon to
always use a cookie file and catch up to the file system on any
query operation, regardless of the format of the request token.
This is safer.

FWIW, I think the effect of the race was limited to the test.
Commands like `git status` would always do a full scan when getting a
trivial response.  The fact that the daemon was slighly behind the
file system when it generated the response token would cause a second
`git status` to get a few extra paths that the client would have to
examine, but it would not be missing paths.

FWIW, I also think that an earlier version of the code always did
the cookie file for all types of queries, but it was optimized out
during a round of reviews or rework and we didn't notice the race.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 09:07:48 +09:00
faebba436e list-objects-filter: plug pattern_list leak
filter_sparse_oid__init() uses add_patterns_from_blob_to_list() to
populate the struct pattern_list member of struct filter_sparse_data.
Release it in the complementing filter_sparse_free().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 08:29:06 +09:00
189e97bc4b diff: remove parseopts member from struct diff_options
repo_diff_setup() builds the struct option array with git diff's command
line options and stores a pointer to it in the parseopts member of
struct diff_options.  The array is freed by diff_setup_done(), but not
by release_revisions().  Thus calling only repo_diff_setup() and
release_revisions() leaks that array.

We could free it in release_revisions() as well to plug that leak, but
there is a better way: Only build it when needed.  Absorb
prep_parse_options() into the last place that uses the parseopts member
of struct diff_options, add_diff_parseopts(), and get rid of said
member.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 08:25:30 +09:00
6c6048fa7f diff: use add_diff_options() in diff_opt_parse()
Prepare the removal of the parseopts member of struct diff_options by
using the API function add_diff_options() instead of accessing it
directly to get the command line option definitions.  Building the copy
by concatenating with an empty option array is slightly awkward, but
simpler than a non-concat version of add_diff_options() would be to use
in places that need concatenation.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 08:25:29 +09:00
c5630c4868 diff: factor out add_diff_options()
Add a function for appending the parseopts member of struct diff_options
to a struct option array.  Use it in two sites instead of accessing the
parseopts member directly.  Decoupling callers from diff internals like
that allows us to change the latter.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 08:25:29 +09:00
77e04b2ed4 t4205: don't exit test script on failure
Only abort the individual check instead of exiting the whole test script
if git show fails.  Noticed with GIT_TEST_PASSING_SANITIZE_LEAK=check.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 08:25:02 +09:00
805265fcf7 Merge branch 'ab/fewer-the-index-macros'
Squelch warnings from Coccinelle

* ab/fewer-the-index-macros:
  cocci: avoid "should ... be a metavariable" warnings
2022-12-01 18:38:07 +09:00
215ae4f264 Merge branch 'ab/gnumake-4.4-fix'
Adjust our Makefiles for GNUmake 4.4

* ab/gnumake-4.4-fix:
  Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
2022-12-01 18:38:07 +09:00
ecbc23e4c5 status: modernize git-status "slow untracked files" advice
`git status` can be slow when there are a large number of
untracked files and directories since Git must search the entire
worktree to enumerate them.  When it is too slow, Git prints
advice with the elapsed search time and a suggestion to disable
the search using the `-uno` option.  This suggestion also carries
a warning that might scare off some users.

However, these days, `-uno` isn't the only option.  Git can reduce
the time taken to enumerate untracked files by caching results from
previous `git status` invocations, when the `core.untrackedCache`
and `core.fsmonitor` features are enabled.

Update the `git status` man page to explain these configuration
options, and update the advice to provide more detail about the
current configuration and to refer to the updated documentation.

Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-01 15:27:41 +09:00
4948ed4731 Merge branch 'jk/avoid-redef-system-functions-2.30'
* jk/avoid-redef-system-functions-2.30
  git-compat-util: avoid redefining system function names
2022-12-01 09:17:22 +09:00
a61c70a7c8 Merge branch 'jk/avoid-redef-system-functions-2.30' into maint
* jk/avoid-redef-system-functions-2.30:
  git-compat-util: avoid redefining system function names
2022-12-01 09:14:46 +09:00
e0c08a4f73 git-compat-util: avoid redefining system function names
Our git-compat-util header defines a few noop wrappers for system
functions if they are not available. This was originally done with a
macro, but in 15b52a44e0 (compat-util: type-check parameters of no-op
replacement functions, 2020-08-06) we switched to inline functions,
because it gives us basic type-checking.

This can cause compilation failures when the system _does_ declare those
functions but we choose not to use them, since the compiler will
complain about the redeclaration. This was seen in the real world when
compiling against certain builds of uclibc, which may leave
_POSIX_THREAD_SAFE_FUNCTIONS unset, but still declare flockfile() and
funlockfile().

It can also be seen on any platform that has setitimer() if you choose
to compile without it (which plausibly could happen if the system
implementation is buggy). E.g., on Linux:

  $ make NO_SETITIMER=IWouldPreferNotTo git.o
      CC git.o
  In file included from builtin.h:4,
                   from git.c:1:
  git-compat-util.h:344:19: error: conflicting types for ‘setitimer’; have ‘int(int,  const struct itimerval *, struct itimerval *)’
    344 | static inline int setitimer(int which UNUSED,
        |                   ^~~~~~~~~
  In file included from git-compat-util.h:234:
  /usr/include/x86_64-linux-gnu/sys/time.h:155:12: note: previous declaration of ‘setitimer’ with type ‘int(__itimer_which_t,  const struct itimerval * restrict,  struct itimerval * restrict)’
    155 | extern int setitimer (__itimer_which_t __which,
        |            ^~~~~~~~~
  make: *** [Makefile:2714: git.o] Error 1

Here I think the compiler is complaining about the lack of "restrict"
annotations in our version, but even if we matched it completely (and
there is no way to match all platforms anyway), it would still complain
about a static declaration following a non-static one. Using macros
doesn't have this problem, because the C preprocessor rewrites the name
in our code before we hit this level of compilation.

One way to fix this would just be to revert most of 15b52a44e0. What we
really cared about there was catching build problems with
precompose_argv(), which most platforms _don't_ build, and which is our
custom function. So we could just switch the system wrappers back to
macros; most people build the real versions anyway, and they don't
change. So the extra type-checking isn't likely to catch bugs.

But with a little work, we can have our cake and eat it, too. If we
define the type-checking wrappers with a unique name, and then redirect
the system names to them with macros, we still get our type checking,
but without redeclaring the system function names.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-01 09:11:59 +09:00
cddd68ae33 cocci: avoid "should ... be a metavariable" warnings
Since [1] running "make coccicheck" has resulted in [2] being emitted
to the *.log files for the "spatch" run, and in the case of "make
coccicheck-test" we'd emit these to the user's terminal.

Nothing was broken as a result, but let's refactor the relevant rules
to eliminate the ambiguity between a possible variable and an
identifier.

1. 0e6550a2c6 (cocci: add a index-compatibility.pending.cocci,
   2022-11-19)
2. warning: line 257: should active_cache be a metavariable?
   warning: line 260: should active_cache_changed be a metavariable?
   warning: line 263: should active_cache_tree be a metavariable?
   warning: line 271: should active_nr be a metavariable?

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-01 07:25:57 +09:00
67b36879fc Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4
Since GNU make 4.4 the semantics of the $(MAKEFLAGS) variable has
changed in a backward-incompatible way, as its "NEWS" file notes:

  Previously only simple (one-letter) options were added to the MAKEFLAGS
  variable that was visible while parsing makefiles.  Now, all options are
  available in MAKEFLAGS.  If you want to check MAKEFLAGS for a one-letter
  option, expanding "$(firstword -$(MAKEFLAGS))" is a reliable way to return
  the set of one-letter options which can be examined via findstring, etc.

This upstream change meant that e.g.:

	make man

Would become very noisy, because in shared.mak we rely on extracting
"s" from the $(MAKEFLAGS), which now contains long options like
"--jobserver-auth=fifo:<path>", which we'll conflate with the "-s"
option.

So, let's change this idiom we've been carrying since [1], [2] and [3]
as the "NEWS" suggests.

Note that the "-" in "-$(MAKEFLAGS)" is critical here, as the variable
will always contain leading whitespace if there are no short options,
but long options are present. Without it e.g. "make --debug=all" would
yield "--debug=all" as the first word, but with it we'll get "-" as
intended. Then "-s" for "-s", "-Bs" for "-s -B" etc.

1. 0c3b4aac8e (git-gui: Support of "make -s" in: do not output
   anything of the build itself, 2007-03-07)
2. b777434383 (Support of "make -s": do not output anything of the
   build itself, 2007-03-07)
3. bb2300976b (Documentation/Makefile: make most operations "quiet",
   2009-03-27)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-01 07:24:12 +09:00
fe20a5e6a4 l10n: fr: v2.39 rnd 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2022-11-30 19:43:02 +01:00
1fe80770f3 l10n: po-id for 2.39 (round 1)
All of updates are new strings translation.

Update following components:

  * builtin/bundle.c
  * builtin/clone.c
  * builtin/commit.c
  * builtin/describe.c
  * builtin/diff.c
  * builtin/fsck.c
  * builtin/gc.c
  * builtin/merge-tree.c
  * builtin/repack.c
  * builtin/revert.c
  * builtin/stash.c
  * builtin/upload-pack.c
  * builtin/worktree.c
  * bundle-uri.c
  * push.c
  * revision.c
  * scalar.c

Translate following new components:

  * builtin/patch-id.c
  * t/helper/test-cache-tree.c
  * t/helper/test-fast-rebase.c
  * t/helper/test-reach.c
  * t/helper/test-serve-v2.c
  * t/helper/test-simple-ipc.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>

po revision bump

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2022-11-30 20:45:30 +07:00
7452749a78 Git 2.39-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 11:00:35 +09:00
4615d3e264 Merge branch 'ps/gnumake-4.4-fix'
* ps/gnumake-4.4-fix:
  Makefile: avoid multiple patterns when recipes generate one file
2022-11-30 10:57:19 +09:00
bcb71d45bf t1301: do not change $CWD in "shared=all" test case
In test case "shared=all", the working directory is permanently changed
to the "sub" directory. This leads to a strange behavior that the
temporary repositories created by subsequent test cases are all in this
"sub" directory, such as "sub/new", "sub/child.git". If we bypass this
test case, all subsequent test cases will have different working
directory.

Besides, all subsequent test cases assuming they are in the "sub"
directory do not run any destructive operations in their parent
directory (".."), and will not make damage out side of $TRASH_DIRECTORY.

So it is a safe change for us to run the test case "shared=all" in
current repository instead of creating and changing to "sub".

For the next test case, the path ".git/info" is assumed to be missing,
but we no longer run the test case in the "sub" repository which is
initialized from an empty template. In order for the test case to run
properly, we can set "TEST_CREATE_REPO_NO_TEMPLATE=1" to initialize the
default repository without a template.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:21:51 +09:00
5d64229ef5 t1301: use test_when_finished for cleanup
Refactor several test cases to use "test_when_finished" for cleanup.

1. For first of these, we used to clean-up outside the test, but instead
   let's use test_when_finished for that.

2. For the second, we used to leave "new" after we are done, but not use
   it at all later. Now we do clean up.

3. For the rest, these child.git test repositories used to follow
   "initialize what we are going to use to a known state before we use"
   pattern, which is not wrong per-se, but now we use "clean up the
   cruft we made after we are done" pattern, which may arguably be
   better simply because the test that makes cruft should know what
   cruft it created better than whatever comes later that may not know.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:21:51 +09:00
a0883a2440 t1301: fix wrong template dir for git-init
The template dir prepared in test case "forced modes" is not used as
expected because a wrong template dir is provided to "git init". This is
because the $CWD for "git-init" command is a sibling directory alongside
the template directory. Change it to the right template directory and
add a protection test using "test_path_is_file".

The wrong template directory was introduced by mistake in commit
e1df7fe43f (init: make --template path relative to $CWD, 2019-05-10).

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:21:50 +09:00
d4f7036887 list-objects-filter: remove OPT_PARSE_LIST_OBJECTS_FILTER_INIT()
OPT_PARSE_LIST_OBJECTS_FILTER_INIT() with a non-NULL second argument
passes a function pointer via an object pointer, which is undefined.  It
may work fine on platforms that implement C99 extension J.5.7 (Function
pointer casts).  Remove the unused macro and avoid the dependency on
that extension.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:35 +09:00
0d5448a554 pack-objects: simplify --filter handling
pack-objects uses OPT_PARSE_LIST_OBJECTS_FILTER_INIT() to initialize the
a rev_info struct lazily before populating its filter member using the
--filter option values.  It tracks whether the initialization is needed
using the .have_revs member of the callback data.

There is a better way: Use a stand-alone list_objects_filter_options
struct and build a rev_info struct with its .filter member after option
parsing.  This allows using the simpler OPT_PARSE_LIST_OBJECTS_FILTER()
and getting rid of the extra callback mechanism.

Even simpler would be using a struct rev_info as before 5cb28270a1
(pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28),
but that would expose a memory leak caused by repo_init_revisions()
followed by release_revisions() without a setup_revisions() call in
between.

Using list_objects_filter_options also allows pushing the rev_info
struct into get_object_list(), where it arguably belongs. Either way,
this is all left for later.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:33 +09:00
825babe5d5 pack-objects: fix handling of multiple --filter options
Since 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't
leak, 2022-03-28) --filter options given to git pack-objects overrule
earlier ones, letting only the leftmost win and leaking the memory
allocated for earlier ones.  Fix that by only initializing the rev_info
struct once.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:33 +09:00
f00d811533 t5317: demonstrate failure to handle multiple --filter options
git pack-objects should accept multiple --filter options as documented
in Documentation/rev-list-options.txt, but currently the last one wins.
Show that using tests with multiple blob size limits

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:32 +09:00
3f75a6e5b4 t5317: stop losing return codes of git ls-files
fb2d0db502 (test-lib-functions: add parsing helpers for ls-files and
ls-tree, 2022-04-04) not only started to use helper functions, it also
started to pipe the output of git ls-files into them directly, without
using a temporary file.  No explanation was given.  This causes the
return code of that git command to be ignored.

Revert that part of the change, use temporary files and check the return
code of git ls-files again.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:32 +09:00
9de31f7bd2 completion: add case-insensitive match of pseudorefs
When GIT_COMPLETION_IGNORE_CASE is set, also allow lowercase completion
text like "head" to match uppercase HEAD and other pseudorefs.

Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 09:58:06 +09:00
9bab766fb2 completion: add optional ignore-case when matching refs
If GIT_COMPLETION_IGNORE_CASE is set, --ignore-case will be added to
git for-each-ref calls so that refs can be matched case insensitively,
even when running on case sensitive filesystems.

Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 09:58:06 +09:00
c80046d63d l10n: sv.po: Update Swedish translation (5501t0f0)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2022-11-29 22:51:11 +01:00
083e01275b A bit more before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-29 10:41:06 +09:00
fd8dcbb07c Merge branch 'ab/doc-synopsis-and-cmd-usage'
Doc and message fix.

* ab/doc-synopsis-and-cmd-usage:
  i18n: fix command template placeholder format
2022-11-29 10:41:06 +09:00
8350c34930 Merge branch 'km/merge-recursive-typofix'
Fix an old typo in an error message.

* km/merge-recursive-typofix:
  merge-recursive: fix variable typo in error message
2022-11-29 10:41:06 +09:00
515ffabccf Merge branch 'jx/ci-ubuntu-fix'
Adjust the GitHub CI to newer ubuntu release.

* jx/ci-ubuntu-fix:
  ci: install python on ubuntu
  ci: use the same version of p4 on both Linux and macOS
  ci: remove the pipe after "p4 -V" to catch errors
  github-actions: run gcc-8 on ubuntu-20.04 image
2022-11-29 10:41:06 +09:00
8165c6af11 Merge branch 'jh/trace2-timers-and-counters'
Test fix.

* jh/trace2-timers-and-counters:
  trace2 tests: guard pthread test with "PTHREAD"
2022-11-29 10:41:05 +09:00
8a40cb1e5a Merge branch 'ah/chainlint-cpuinfo-parse-fix'
The format of a line in /proc/cpuinfo that describes a CPU on s390x
looked different from everybody else, and the code in chainlint.pl
failed to parse it.

* ah/chainlint-cpuinfo-parse-fix:
  chainlint.pl: fix /proc/cpuinfo regexp
2022-11-29 10:41:05 +09:00
f32996d99a Merge branch 'gc/resolve-alternate-symlinks'
Resolve symbolic links when processing the locations of alternate
object stores, since failing to do so can lead to confusing and buggy
behavior.

* gc/resolve-alternate-symlinks:
  object-file: use real paths when adding alternates
2022-11-29 10:41:05 +09:00
c8f4357010 pack-bitmap.c: trace bitmap ignore logs when midx-bitmap is found
When we find a midx bitmap, we do not bother checking for pack
bitmaps, since we can use only one. But since we will warn of unused
bitmaps via trace2, let's continue looking for pack bitmaps when
tracing is enabled.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-29 09:54:56 +09:00
833f4c0514 pack-bitmap.c: break out of the bitmap loop early if not tracing
After opening a bitmap successfully, we try opening others only
because we want to report that other bitmap files are ignored in
the trace2 log.  When trace2 is not enabled, we do not have to
do any of that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-29 09:54:56 +09:00
815c1e8202 Another batch before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-28 12:13:46 +09:00
041df69edd Merge branch 'ab/fewer-the-index-macros'
Progress on removing 'the_index' convenience wrappers.

* ab/fewer-the-index-macros:
  cocci: apply "pending" index-compatibility to some "builtin/*.c"
  cache.h & test-tool.h: add & use "USE_THE_INDEX_VARIABLE"
  {builtin/*,repository}.c: add & use "USE_THE_INDEX_VARIABLE"
  cocci: apply "pending" index-compatibility to "t/helper/*.c"
  cocci & cache.h: apply variable section of "pending" index-compatibility
  cocci & cache.h: apply a selection of "pending" index-compatibility
  cocci: add a index-compatibility.pending.cocci
  read-cache API & users: make discard_index() return void
  cocci & cache.h: remove rarely used "the_index" compat macros
  builtin/{grep,log}.: don't define "USE_THE_INDEX_COMPATIBILITY_MACROS"
  cache.h: remove unused "the_index" compat macros
2022-11-28 12:13:46 +09:00
613999cc5c Merge branch 'sg/plug-line-log-leaks'
A handful of leaks in the line-log machinery have been plugged.

* sg/plug-line-log-leaks:
  diff.c: use diff_free_queue()
  line-log: free the diff queues' arrays when processing merge commits
  line-log: free diff queue when processing non-merge commits
2022-11-28 12:13:46 +09:00
91c43cde25 Merge branch 'es/locate-httpd-module-location-in-test'
Add one more candidate directory that may house httpd modules while
running tests.

* es/locate-httpd-module-location-in-test:
  lib-httpd: extend module location auto-detection
2022-11-28 12:13:45 +09:00
399a9f31f7 Merge branch 'zk/push-use-bitmaps'
Test fix.

* zk/push-use-bitmaps:
  t5516: fail to run in verbose mode
2022-11-28 12:13:45 +09:00
7d7ed48dd5 Merge branch 'ew/prune-with-missing-objects-pack'
"git prune" may try to iterate over .git/objects/pack for trash
files to remove in it, and loudly fail when the directory is
missing, which is not necessary.  The command has been taught to
ignore such a failure.

* ew/prune-with-missing-objects-pack:
  prune: quiet ENOENT on missing directories
2022-11-28 12:13:44 +09:00
15a62fb957 Merge branch 'rs/list-objects-filter-leakfix'
Leakfix.

* rs/list-objects-filter-leakfix:
  list-objects-filter: plug combine_filter_data leak
2022-11-28 12:13:43 +09:00
6accbe3ce7 Merge branch 'pw/config-int-parse-fixes'
Assorted fixes of parsing end-user input as integers.

* pw/config-int-parse-fixes:
  git_parse_signed(): avoid integer overflow
  config: require at least one digit when parsing numbers
  git_parse_unsigned: reject negative values
2022-11-28 12:13:43 +09:00
ba88f8c81d Merge branch 'jk/parse-object-type-mismatch'
`parse_object()` hardening when checking for the existence of a
suspected blob object.

* jk/parse-object-type-mismatch:
  parse_object(): simplify blob conditional
  parse_object(): check on-disk type of suspected blob
  parse_object(): drop extra "has" check before checking object type
2022-11-28 12:13:42 +09:00
9f95c7aefa Makefile: avoid multiple patterns when recipes generate one file
A GNU make pattern rule with multiple targets has always meant that
a single invocation of the recipe will build all the targets.
However in older versions of GNU make a recipe that did not really
build all the targets would be tolerated.

Starting with GNU make 4.4 this behavior is deprecated and pattern
rules are expected to generate files to match all the patterns.
If not all targets are created then GNU make will not consider any
target up to date and will re-run the recipe when it is run again.

Modify Documentation/Makefile to split the man page-creating pattern
rule into a separate pattern rule for each pattern.

Reported-by: Alexander Kanavin <alex.kanavin@gmail.com>
Signed-off-by: Paul Smith <psmith@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-28 10:18:55 +09:00
9508dfd9f5 git-jump: invoke emacs/emacsclient
It works with GIT_EDITOR="emacs", "emacsclient" or "emacsclient -t"

Signed-off-by: Yoichi Nakayama <yoichi.nakayama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 10:49:51 +09:00
64685cb855 git-jump: move valid-mode check earlier
We check if the "mode" argument supplied by the user is valid by seeing
if we have a mode_$mode function defined. But we don't do that until
after creating the tempfile. This is wasteful (we create a tempfile but
never use it), and makes it harder to add new options (the recent stdout
option exits before creating the tempfile, so it misses the check and
"git jump --stdout foo" will produce "git-jump: 92: mode_foo: not found"
rather than the regular usage message).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 10:49:51 +09:00
cfb7b3b391 git-jump: add an optional argument '--stdout'
It can be used with M-x grep on Emacs.

Signed-off-by: Yoichi Nakayama <yoichi.nakayama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 10:49:51 +09:00
d1ddc4e3f6 i18n: fix command template placeholder format
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 10:29:44 +09:00
42db324c0f merge-recursive: fix variable typo in error message
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 10:26:10 +09:00
8774aa56ad send-email: relay '-v N' to format-patch
send-email relays unrecognized arguments to its format-patch call.
Passing '-v N' leads to an error because -v is consumed as
send-email's --validate.  For example,

  git send-email -v 3 @{u}

fails with

  fatal: ambiguous argument '3': unknown revision or path not in the
  working tree.  [...]

To prevent this, add the short --reroll-count option to send-email's
main option list and explicitly provide it to the format-patch call.

There other format-patch options that send-email doesn't relay
properly, including at least -n, -N, and the diff option -D.  Punt on
these because dealing with them is more complicated:

 * they would require configuring send-email to not ignore option case

 * send-email makes three GetOptions() calls with different sets of
   options, the last being the main set of options.  Unlike -v, which
   is consumed by the last GetOptions call, the -n, -N, and -D options
   are consumed as abbreviations by the earlier calls.

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 10:21:43 +09:00
2ad150e35e var: allow GIT_EDITOR to return null
The handling to die early when there is no EDITOR is valuable when
used in normal code (i.e., editor.c). In git-var, where
null/empty-string is a perfectly valid value to return, it doesn't
make as much sense.

Remove this handling from `git var GIT_EDITOR` so that it does not
fail so noisily when there is no defined editor.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 09:35:55 +09:00
26b8abc7b1 var: do not print usage() with a correct invocation
Before, git-var could print usage() even if the command was invoked
correctly with a variable defined in git_vars -- provided that its
read() function returned NULL.

Now, we only print usage() only if it was called with a logical
variable that wasn't defined -- regardless of read().

Since we now know the variable is valid when we call read_var(), we
can avoid printing usage() here (and exiting with code 129) and
instead exit quietly with code 1. While exiting with a different code
can be a breaking change, it's far better than changing the exit
status more generally from 'failure' to 'success'.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 09:35:55 +09:00
0d3507f3e7 ci: install python on ubuntu
Python is missing from the default ubuntu-22.04 runner image, which
prevents git-p4 from working. To install python on ubuntu, we need
to provide the correct package names:

 * On Ubuntu 18.04 (bionic), "/usr/bin/python2" is provided by the
   "python" package, and "/usr/bin/python3" is provided by the "python3"
   package.

 * On Ubuntu 20.04 (focal) and above, "/usr/bin/python2" is provided by
   the "python2" package which has a different name from bionic, and
   "/usr/bin/python3" is provided by "python3".

Since the "ubuntu-latest" runner image has a higher version, its
safe to use "python2" or "python3" package name.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 09:33:43 +09:00
31a1952bbd ci: use the same version of p4 on both Linux and macOS
There would be a segmentation fault when running p4 v16.2 on ubuntu
22.04 which is the latest version of ubuntu runner image for github
actions.

By checking each version from [1], p4d version 21.1 and above can work
properly on ubuntu 22.04. But version 22.x will break some p4 test
cases. So p4 version 21.x is exactly the version we can use.

With this update, the versions of p4 for Linux and macOS happen to be
the same. So we can add the version number directly into the "P4WHENCE"
variable, and reuse it in p4 installation for macOS.

By removing the "LINUX_P4_VERSION" variable from "ci/lib.sh", the
comment left above has nothing to do with p4, but still applies to
git-lfs. Since we have a fixed version of git-lfs installed on Linux,
we may have a different version on macOS.

[1]: https://cdist2.perforce.com/perforce/

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 09:32:56 +09:00
4137c84198 ci: remove the pipe after "p4 -V" to catch errors
When installing p4 as a dependency, we used to pipe output of "p4 -V"
and "p4d -V" to validate the installation and output a condensed version
information. But this would hide potential errors of p4 and would stop
with an empty output. E.g.: p4d version 16.2 running on ubuntu 22.04
causes sigfaults, even before it produces any output.

By removing the pipe after "p4 -V" and "p4d -V", we may get a
verbose output, and stop immediately on errors because we have "set
-e" in "ci/lib.sh". Since we won't look at these trace logs unless
something fails, just including the raw output seems most sensible.

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 09:31:59 +09:00
0178420b9c github-actions: run gcc-8 on ubuntu-20.04 image
GitHub starts to upgrade its runner image "ubuntu-latest" from version
"ubuntu-20.04" to version "ubuntu-22.04". It will fail to find and
install "gcc-8" package on the new runner image.

Change some of the runner images from "ubuntu-latest" to "ubuntu-20.04"
in order to install "gcc-8" as a dependency.

The first revision of this patch tried to replace "$runs_on_pool" in
"ci/*.sh" with a new "$runs_on_os" environment variable based on the
"os" field in the matrix strategy. But these "os" fields in matrix
strategies are obsolete legacies from commit [1] and commit [2], and
are no longer useful. So remove these unused "os" fields.

[1]: c08bb26010 (CI: rename the "Linux32" job to lower-case "linux32",
                 2021-11-23)
[2]: 25715419bf (CI: don't run "make test" twice in one job, 2021-11-23)

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 09:31:12 +09:00
4cc9eb338d docs: fix description of the --merge-base option
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-25 10:11:46 +09:00
199337d6ec object-file: use real paths when adding alternates
When adding an alternate ODB, we check if the alternate has the same
path as the object dir, and if so, we do nothing. However, that
comparison does not resolve symlinks. This makes it possible to add the
object dir as an alternate, which may result in bad behavior. For
example, it can trick "git repack -a -l -d" (possibly run by "git gc")
into thinking that all packs come from an alternate and delete all
objects.

	rm -rf test &&
	git clone https://github.com/git/git test &&
	(
	cd test &&
	ln -s objects .git/alt-objects &&
	# -c repack.updateserverinfo=false silences a warning about not
	# being able to update "info/refs", it isn't needed to show the
	# bad behavior
	GIT_ALTERNATE_OBJECT_DIRECTORIES=".git/alt-objects" git \
		-c repack.updateserverinfo=false repack -a -l -d  &&
	# It's broken!
	git status
	# Because there are no more objects!
	ls .git/objects/pack
	)

Fix this by resolving symlinks and relative paths before comparing the
alternate and object dir. This lets us clean up a number of issues noted
in 37a95862c6 (alternates: re-allow relative paths from environment,
2016-11-07):

- Now that we compare the real paths, duplicate detection is no longer
  foiled by relative paths.
- Using strbuf_realpath() allows us to "normalize" paths that
  strbuf_normalize_path() can't, so we can stop silently ignoring errors
  when "normalizing" paths from the environment.
- We now store an absolute path based on getcwd() (the "future
  direction" named in 37a95862c6), so chdir()-ing in the process no
  longer changes the directory pointed to by the alternate. This is a
  change in behavior, but a desirable one.

Signed-off-by: Glen Choo <chooglen@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-25 09:44:08 +09:00
14903c8e92 trace2 tests: guard pthread test with "PTHREAD"
Since 81071626ba (trace2: add global counter mechanism, 2022-10-24)
these tests have been failing when git is compiled with NO_PTHREADS=Y,
which is always the case e.g. if 'uname -s' is "NONSTOP_KERNEL".

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-25 09:36:26 +09:00
c000d91638 Git 2.39-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-23 11:22:25 +09:00
c197977cb6 Merge branch 'mh/gitcredentials-generate'
Doc update.

* mh/gitcredentials-generate:
  Docs: describe how a credential-generating helper works
2022-11-23 11:22:25 +09:00
f8828f9125 Merge branch 'ps/receive-use-only-advertised'
"git receive-pack" used to use all the local refs as the boundary for
checking connectivity of the data "git push" sent, but now it uses
only the refs that it advertised to the pusher. In a repository with
the .hideRefs configuration, this reduces the resources needed to
perform the check.
cf. <221028.86bkpw805n.gmgdl@evledraar.gmail.com>
cf. <xmqqr0yrizqm.fsf@gitster.g>

* ps/receive-use-only-advertised:
  receive-pack: only use visible refs for connectivity check
  rev-parse: add `--exclude-hidden=` option
  revision: add new parameter to exclude hidden refs
  revision: introduce struct to handle exclusions
  revision: move together exclusion-related functions
  refs: get rid of global list of hidden refs
  refs: fix memory leak when parsing hideRefs config
2022-11-23 11:22:25 +09:00
173fc54b00 Merge branch 'jt/submodule-on-demand'
Push all submodules recursively with
'--recurse-submodules=on-demand'.

* jt/submodule-on-demand:
  Doc: document push.recurseSubmodules=only
2022-11-23 11:22:25 +09:00
8d7b35b43d Merge branch 'sz/macos-fsmonitor-symlinks'
Fix an issue where core.fsmonitor on macOS would not notice created
or modified symbolic links.

* sz/macos-fsmonitor-symlinks:
  fsmonitor--daemon: on macOS support symlink
2022-11-23 11:22:25 +09:00
a655f28a7a Merge branch 'ew/delta-islands-free'
Free structures related to delta islands after use.

* ew/delta-islands-free:
  delta-islands: free island-related data after use
2022-11-23 11:22:25 +09:00
2fe427ecb7 Merge branch 'mg/notes-newline'
Avoid a stray empty newline in the template when creating new notes.

* mg/notes-newline:
  notes: avoid empty line in template
2022-11-23 11:22:25 +09:00
032e8da541 Merge branch 'tb/howto-maintain-git-fixes'
A pair of bugfixes to the Documentation/howto/maintain-git.txt guide.

* tb/howto-maintain-git-fixes:
  Documentation: build redo-seen.sh from jch..seen
  Documentation: build redo-jch.sh from master..jch
2022-11-23 11:22:24 +09:00
cf9721cc46 Merge branch 'es/chainlint-lineno'
Teach chainlint.pl to show corresponding line numbers when printing
the source of a test.

* es/chainlint-lineno:
  chainlint: prefix annotated test definition with line numbers
  chainlint: latch line numbers at which each token starts and ends
  chainlint: sidestep impoverished macOS "terminfo"
2022-11-23 11:22:24 +09:00
ff84d031a9 Merge branch 'pw/rebase-no-reflog-action'
Avoid setting GIT_REFLOG_ACTION to improve readability of the
sequencer internals.

* pw/rebase-no-reflog-action:
  rebase: stop exporting GIT_REFLOG_ACTION
  sequencer: stop exporting GIT_REFLOG_ACTION
2022-11-23 11:22:24 +09:00
4a04f718c0 Merge branch 'ab/t7610-timeout'
Fix a source of flakiness in CI when compiling with SANITIZE=leak.

* ab/t7610-timeout:
  t7610: use "file:///dev/null", not "/dev/null", fixes MinGW
  t7610: fix flaky timeout issue, don't clone from example.com
2022-11-23 11:22:24 +09:00
56a64fcdc3 Merge branch 'rp/maintenance-qol'
'git maintenance register' is taught to write configuration to an
arbitrary path, and 'git for-each-repo' is taught to expand tilde
characters in paths.

* rp/maintenance-qol:
  builtin/gc.c: fix use-after-free in maintenance_unregister()
  maintenance --unregister: fix uninit'd data use & -Wdeclaration-after-statement
  maintenance: add option to register in a specific config
  for-each-repo: interpolate repo path arguments
2022-11-23 11:22:24 +09:00
3b041ea5f7 Merge branch 'pw/strict-label-lookups'
Correct an error where `git rebase` would mistakenly use a branch or
tag named "refs/rewritten/xyz" when missing a rebase label.

* pw/strict-label-lookups:
  sequencer: tighten label lookups
  sequencer: unify label lookup
2022-11-23 11:22:23 +09:00
6adf17050b Merge branch 'gc/redact-h2h3-headers'
Redact headers from cURL's h2h3 module in GIT_CURL_VERBOSE and
others.

* gc/redact-h2h3-headers:
  http: redact curl h2h3 headers in info
  t: run t5551 tests with both HTTP and HTTP/2
2022-11-23 11:22:23 +09:00
4b76998ff0 Merge branch 'ab/coccicheck-incremental'
"make coccicheck" is time consuming. It has been made to run more
incrementally.

* ab/coccicheck-incremental:
  Makefile: don't create a ".build/.build/" for cocci, fix output
  spatchcache: add a ccache-alike for "spatch"
  cocci: run against a generated ALL.cocci
  cocci rules: remove <id>'s from rules that don't need them
  Makefile: copy contrib/coccinelle/*.cocci to build/
  cocci: optimistically use COMPUTE_HEADER_DEPENDENCIES
  cocci: make "coccicheck" rule incremental
  cocci: split off "--all-includes" from SPATCH_FLAGS
  cocci: split off include-less "tests" from SPATCH_FLAGS
  Makefile: split off SPATCH_BATCH_SIZE comment from "cocci" heading
  Makefile: have "coccicheck" re-run if flags change
  Makefile: add ability to TAB-complete cocci *.patch rules
  cocci rules: remove unused "F" metavariable from pending rule
  Makefile + shared.mak: rename and indent $(QUIET_SPATCH_T)
2022-11-23 11:22:23 +09:00
613fb30a49 Merge branch 'es/chainlint-output'
Teach chainlint.pl to annotate the original test definition instead
of the token stream.

* es/chainlint-output:
  chainlint: annotate original test definition rather than token stream
  chainlint: latch start/end position of each token
  chainlint: tighten accuracy when consuming input stream
  chainlint: add explanatory comments
2022-11-23 11:22:23 +09:00
58d80df6a3 Merge branch 'js/remove-stale-scalar-repos'
'scalar reconfigure -a' is taught to automatically remove
scalar.repo entires which no longer exist.

* js/remove-stale-scalar-repos:
  tests(scalar): tighten the stale `scalar.repo` test some
  scalar reconfigure -a: remove stale `scalar.repo` entries
2022-11-23 11:22:23 +09:00
e3d40fb240 Merge branch 'dd/bisect-helper-subcommand'
Fix a regression in the bisect-helper which mistakenly treats
arguments to the command given to 'git bisect run' as arguments to
the helper.

* dd/bisect-helper-subcommand:
  bisect--helper: parse subcommand with OPT_SUBCOMMAND
  bisect--helper: move all subcommands into their own functions
  bisect--helper: remove unused options
2022-11-23 11:22:22 +09:00
1107a3963b Merge branch 'ab/submodule-helper-prep-only'
Preparation to remove git-submodule.sh and replace it with a builtin.

* ab/submodule-helper-prep-only:
  submodule--helper: use OPT_SUBCOMMAND() API
  submodule--helper: drop "update --prefix <pfx>" for "-C <pfx> update"
  submodule--helper: remove --prefix from "absorbgitdirs"
  submodule API & "absorbgitdirs": remove "----recursive" option
  submodule.c: refactor recursive block out of absorb function
  submodule tests: test for a "foreach" blind-spot
  submodule--helper: fix a memory leak in "status"
  submodule tests: add tests for top-level flag output
  submodule--helper: move "config" to a test-tool
2022-11-23 11:22:22 +09:00
1f51b77f4f chainlint.pl: fix /proc/cpuinfo regexp
29fb2ec3 (chainlint.pl: validate test scripts in parallel,
2022-09-01) introduced a function that gets the number of cores from
/proc/cpuinfo on some systems, notably linux.

The regexp it uses (^processor\s*:) fails to match the desired lines in
the s390x architecture, where they look like this:

    processor 0: version = FF, identification = 148F67, machine = 2964

As a result, on s390x that function returns 0 as the number of cores,
and the chainlint.pl script exits without doing anything.

Signed-off-by: Andreas Hasenack <andreas.hasenack@canonical.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-23 10:20:19 +09:00
40286ca2fa parse_object(): simplify blob conditional
Commit 8db2dad7a0 (parse_object(): check on-disk type of suspected blob,
2022-11-17) simplified the conditional for checking if we might have a
blob. But we can simplify it further. In:

  !obj || (obj && obj->type == OBJ_BLOB)

the short-circuit "OR" means "obj" will always be true on the right-hand
side. The compiler almost certainly optimized that out anyway, but
dropping it makes the conditional easier to understand for humans.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-22 10:13:54 +09:00
1c7dc23d41 lib-httpd: extend module location auto-detection
Although it is possible to manually set LIB_HTTPD_PATH and
LIB_HTTPD_MODULE_PATH to point at the location of `httpd` and its
modules, doing so is cumbersome and easily forgotten. To address this,
0d344738dc (t/lib-http.sh: Restructure finding of default httpd
location, 2010-01-02) enhanced lib-httpd.sh to automatically detect the
location of `httpd` and its modules in order to facilitate out-of-the-
box testing on a wider range of platforms. Follow that lead by further
enhancing it to automatically detect the `httpd` modules on Void Linux,
as well.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-22 09:57:53 +09:00
288fcb1c94 t5516: fail to run in verbose mode
The test case "push with config push.useBitmap" of t5516 was introduced
in commit 82f67ee13f (send-pack.c: add config push.useBitmaps,
2022-06-17). It won't work in verbose mode, e.g.:

    $ sh t5516-fetch-push.sh --run='1,115' -v

This is because "git-push" will run in a tty in this case, and the
subcommand "git pack-objects" will contain an argument "--progress"
instead of "-q". Adding a specific option "--quiet" to "git push" will
get a stable result for t5516.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-22 09:16:30 +09:00
7c2dc122f9 list-objects-filter: plug combine_filter_data leak
filter_combine__init() allocates a struct combine_filter_data object and
assigns it to the filter_data member of struct filter_options.  Release
it in the complementing filter_combine__free().

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 16:43:26 +09:00
6974765352 prune: quiet ENOENT on missing directories
$GIT_DIR/objects/pack may be removed to save inodes in shared
repositories.  Quiet down prune in cases where either
$GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
but emit the system error in other cases to help users diagnose
permissions problems or resource constraints.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 15:58:54 +09:00
ac95f5d36a built-ins: use free() not UNLEAK() if trivial, rm dead code
For a lot of uses of UNLEAK() it would be quite tricky to release the
memory involved, or we're missing the relevant *_(release|clear)()
functions. But in these cases we have them already, and can just
invoke them on the variable(s) involved, instead of UNLEAK().

For "builtin/worktree.c" the UNLEAK() was also added in [1], but the
struct member it's unleaking was removed in [2]. The only non-"int"
member of that structure is "const char *keep_locked", which comes to
us via "argv" or a string literal[3].

We have good visibility via the compiler and
tooling (e.g. SANITIZE=address) on bad free()-ing, but none on
UNLEAK() we don't need anymore. So let's prefer releasing the memory
when it's easy.

For "bugreport", "worktree" and "config" we need to start using a "ret
= ..." return pattern. For "builtin/bugreport.c" these UNLEAK() were
added in [4], and for "builtin/config.c" in [1].

For "config" the code seen here was the only user of the "value"
variable. For "ACTION_{RENAME,REMOVE}_SECTION" we need to be sure to
return the right exit code in the cases where we were relying on
falling through to the top-level.

I think there's still a use-case for UNLEAK(), but hat it's changed
since then. Using it so that "we can see the real leaks" is
counter-productive in these cases.

It's more useful to have UNLEAK() be a marker of the remaining odd
cases where it's hard to free() the memory for whatever reason. With
this change less than 20 of them remain in-tree.

1. 0e5bba53af (add UNLEAK annotation for reducing leak false
   positives, 2017-09-08)
2. d861d34a6e (worktree: remove extra members from struct add_opts,
   2018-04-24)
3. 0db4961c49 (worktree: teach `add` to accept --reason <string> with
  --lock, 2021-07-15)
4. 0e5bba53af and 00d8c31105 (commit: fix "author_ident" leak,
   2022-05-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
603f2f5719 revert: fix parse_options_concat() leak
Free memory from parse_options_concat(), which comes from code
originally added (then extended) in [1].

At this point we could get several more tests leak-free by free()-ing
the xstrdup() just above the line being changed, but that one's
trickier than it seems. The sequencer_remove_state() function
supposedly owns it, but sometimes we don't call it. I have a fix for
it, but it's non-trivial, so let's fix the easy one first.

1. c62f6ec341 (revert: add --ff option to allow fast forward when
   cherry-picking, 2010-03-06)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
d1ec656d68 cherry-pick: free "struct replay_opts" members
Call the release_revisions() function added in
1878b5edc0 (revision.[ch]: provide and start using a
release_revisions(), 2022-04-13) in cmd_cherry_pick(), as well as
freeing the xmalloc()'d "revs" member itself.

This is the same change as the one made for cmd_revert() a few lines
above it in fd74ac95ac (revert: free "struct replay_opts" members,
2022-07-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
5ff6e8afac rebase: don't leak on "--abort"
Fix a leak in the recent 6159e7add4 (rebase --abort: improve reflog
message, 2022-10-12). Before that commit we'd strbuf_release() the
reflog message we were formatting, but when that code was refactored
to use "ropts.head_msg" the strbuf_release() was omitted.

Ideally the three users of "ropts" in cmd_rebase() should use
different "ropts" variables, in practice they're completely separate,
as this and the other user in the "switch" statement will "goto
cleanup", which won't touch "ropts".

The third caller after the "switch" is then unreachable if we take
these two branches, so all of them are getting a "{ 0 }" init'd
"ropts".

So it's OK that we're leaving a stale pointer in "ropts.head_msg",
cleaning it up was our responsibility, and it won't be used again.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
dd4143e7bf connected.c: free the "struct packed_git"
The "new_pack" we allocate in check_connected() wasn't being
free'd. Let's do that before we return from the function. This has
leaked ever since "new_pack" was added to this function in
c6807a40dc (clone: open a shortcut for connectivity check,
2013-05-26).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
f1f4ebf432 sequencer.c: fix "opts->strategy" leak in read_strategy_opts()
When "read_strategy_opts()" is called we may have populated the
"opts->strategy" before, so we'll need to free() it to avoid leaking
memory.

We populate it before because we cal get_replay_opts() from within
"rebase.c" with an already populated "opts", which we then copy. Then
if we're doing a "rebase -i" the sequencer API itself will promptly
clobber our alloc'd version of it with its own.

If this code is changed to do, instead of the added free() here a:

	if (opts->strategy)
		opts->strategy = xstrdup("another leak");

We get a couple of stacktraces from -fsanitize=leak showing how we
ended up clobbering the already allocated value, i.e.:

	Direct leak of 6 byte(s) in 1 object(s) allocated from:
	    #0 0x7f2e8cd45545 in __interceptor_malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
	    #1 0x7f2e8cb0fcaa in __GI___strdup string/strdup.c:42
	    #2 0x6c4778 in xstrdup wrapper.c:39
	    #3 0x66bcb8 in read_strategy_opts sequencer.c:2902
	    #4 0x66bf7b in read_populate_opts sequencer.c:2969
	    #5 0x6723f9 in sequencer_continue sequencer.c:5063
	    #6 0x4a4f74 in run_sequencer_rebase builtin/rebase.c:348
	    #7 0x4a64c8 in run_specific_rebase builtin/rebase.c:753
	    #8 0x4a9b8b in cmd_rebase builtin/rebase.c:1824
	    #9 0x407a32 in run_builtin git.c:466
	    #10 0x407e0a in handle_builtin git.c:721
	    #11 0x40803d in run_argv git.c:788
	    #12 0x40850f in cmd_main git.c:923
	    #13 0x4eee79 in main common-main.c:57
	    #14 0x7f2e8ca9f209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
	    #15 0x7f2e8ca9f2bb in __libc_start_main_impl ../csu/libc-start.c:389
	    #16 0x405fd0 in _start (git+0x405fd0)

	Direct leak of 4 byte(s) in 1 object(s) allocated from:
	    #0 0x7f2e8cd45545 in __interceptor_malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
	    #1 0x7f2e8cb0fcaa in __GI___strdup string/strdup.c:42
	    #2 0x6c4778 in xstrdup wrapper.c:39
	    #3 0x4a3c31 in xstrdup_or_null git-compat-util.h:1169
	    #4 0x4a447a in get_replay_opts builtin/rebase.c:163
	    #5 0x4a4f5b in run_sequencer_rebase builtin/rebase.c:346
	    #6 0x4a64c8 in run_specific_rebase builtin/rebase.c:753
	    #7 0x4a9b8b in cmd_rebase builtin/rebase.c:1824
	    #8 0x407a32 in run_builtin git.c:466
	    #9 0x407e0a in handle_builtin git.c:721
	    #10 0x40803d in run_argv git.c:788
	    #11 0x40850f in cmd_main git.c:923
	    #12 0x4eee79 in main common-main.c:57
	    #13 0x7f2e8ca9f209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
	    #14 0x7f2e8ca9f2bb in __libc_start_main_impl ../csu/libc-start.c:389
	    #15 0x405fd0 in _start (git+0x405fd0)

This can be seen in e.g. the 4th test of
"t3404-rebase-interactive.sh".

In the larger picture the ownership of the "struct replay_opts" is
quite a mess, e.g. in this case rebase.c's static "get_replay_opts()"
function partially creates it, but nothing in rebase.c will free()
it. The structure is "mostly owned" by the sequencer API, but it also
expects to get these partially populated versions of it.

It would be better to have rebase keep track of what it allocated, and
free() that, and to pass that as a "const" to the sequencer API, which
would copy what it needs to its own version, and to free() that.

But doing so is a much larger change, and however messy the ownership
boundary is here is consistent with what we're doing already, so let's
just free() this to fix the leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
c07ce0602a ls-files: fix a --with-tree memory leak
Fix a memory leak in overlay_tree_on_index(), we need to
clear_pathspec() at some point, which might as well be after the last
time we use it in the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
fc47252d5b revision API: call graph_clear() in release_revisions()
Call graph_clear() in release_revisions(), this will free memory
allocated by e.g. this command, which will now run without memory
leaks:

	git -P log -1 --graph --no-graph --graph

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
e84a26e32f unpack-file: fix ancient leak in create_temp_file()
Fix a leak that's been with us since 3407bb4940 (Add "unpack-file"
helper that unpacks a sha1 blob into a tmpfile., 2005-04-18). See
00c8fd493a (cat-file: use streaming API to print blobs, 2012-03-07)
for prior art which shows the same API pattern, i.e. free()-ing the
result of read_object_file() after it's used.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
b6046abc0c built-ins & libs & helpers: add/move destructors, fix leaks
Fix various leaks in built-ins, libraries and a test helper here we
were missing a call to strbuf_release(), string_list_clear() etc, or
were calling them after a potential "return".

Comments on individual changes:

- builtin/checkout.c: Fix a memory leak that was introduced in [1]. A
  sibling leak introduced in [2] was recently fixed in [3]. As with [3]
  we should be using the wt_status_state_free_buffers() API introduced
  in [4].

- builtin/repack.c: Fix a leak that's been here since this use of
  "strbuf_release()" was added in a1bbc6c017 (repack: rewrite the shell
  script in C, 2013-09-15). We don't use the variable for anything
  except this loop, so we can instead free it right afterwards.

- builtin/rev-parse: Fix a leak that's been here since this code was
  added in 21d4783538 (Add a parseopt mode to git-rev-parse to bring
  parse-options to shell scripts., 2007-11-04).

- builtin/stash.c: Fix a couple of leaks that have been here since
  this code was added in d4788af875 (stash: convert create to builtin,
  2019-02-25), we strbuf_release()'d only some of the "struct strbuf" we
  allocated earlier in the function, let's release all of them.

- ref-filter.c: Fix a leak in 482c119186 (gpg-interface: improve
  interface for parsing tags, 2021-02-11), we don't use the "payload"
  variable that we ask parse_signature() to populate for us, so let's
  free it.

- t/helper/test-fake-ssh.c: Fix a leak that's been here since this
  code was added in 3064d5a38c (mingw: fix t5601-clone.sh,
  2016-01-27). Let's free the "struct strbuf" as soon as we don't need
  it anymore.

1. c45f0f525d (switch: reject if some operation is in progress,
   2019-03-29)
2. 2708ce62d2 (branch: sort detached HEAD based on a flag,
   2021-01-07)
3. abcac2e19f (ref-filter.c: fix a leak in get_head_description,
   2022-09-25)
4. 962dd7ebc3 (wt-status: introduce wt_status_state_free_buffers(),
   2020-09-27).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
083fd1a264 dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache"
When the "ident" member of the structure was added in
1e8fef609e (untracked cache: guard and disable on system changes,
2015-03-08) this function wasn't updated to free it. Let's do so.

Let's also free the "exclude_per_dir" memory we've been leaking
since[1], while making sure not to free() the constant ".gitignore"
string we add by default[2].

As we now have three struct members we're freeing let's change
free_untracked_cache() to return early if "uc" isn't defined. We won't
hand it to free() now, but that was just for convenience, once we're
dealing with >=2 struct members this pattern is more convenient.

1. f9e6c64958 (untracked cache: load from UNTR index extension,
   2015-03-08)
2. 039bc64e88 (core.excludesfile clean-up, 2007-11-14)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
b5fcb1c006 read-cache.c: clear and free "sparse_checkout_patterns"
The "sparse_checkout_patterns" member was added to the "struct
index_state" in 836e25c51b (sparse-checkout: hold pattern list in
index, 2021-03-30), but wasn't added to discard_index(). Let's do
that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
03267e8656 commit: discard partial cache before (re-)reading it
The read_cache() in prepare_to_commit() would end up clobbering the
pointer we had for a previously populated "the_index.cache_tree" in
the very common case of "git commit" stressed by e.g. the tests being
changed here.

We'd populate "the_index.cache_tree" by calling
"update_main_cache_tree" in prepare_index(), but would not end up with
a "fully prepared" index. What constitutes an existing index is
clearly overly fuzzy, here we'll check "active_nr" (aka
"the_index.cache_nr"), but our "the_index.cache_tree" might have been
malloc()'d already.

Thus the code added in 11c8a74a64 (commit: write cache-tree data when
writing index anyway, 2011-12-06) would end up allocating the
"cache_tree", and would interact here with code added in
7168624c35 (Do not generate full commit log message if it is not
going to be used, 2007-11-28). The result was a very common memory
leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
ab2cf37183 {reset,merge}: call discard_index() before returning
These two built-ins both deal with the index, but weren't discarding
it. In subsequent commits we'll add more free()-ing to discard_index()
that we've missed, but let's first call the existing function.

We can doubtless add discard_index() (or its alias discard_cache()) to
a lot more places, but let's just add it here for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
e5e37517dd tests: mark tests as passing with SANITIZE=leak
This marks tests that have been leak-free since various recent
commits, but which were not marked us such when the memory leak was
fixed. These were mostly discovered with the "check" mode added in
faececa53f (test-lib: have the "check" mode for SANITIZE=leak
consider leak logs, 2022-07-28).

Commits that fixed the last memory leak in these tests. Per narrowing
down when they started to pass under SANITIZE=leak with "bisect":

- t1022-read-tree-partial-clone.sh:
  7e2619d8ff (list_objects_filter_options: plug leak of filter_spec
  strings, 2022-09-08)

- t4053-diff-no-index.sh: 07a6f94a6d (diff-no-index: release prefixed
  filenames, 2022-09-07)

- t6415-merge-dir-to-symlink.sh: bac92b1f39 (Merge branch
  'js/ort-clean-up-after-failed-merge', 2022-08-08).

- t5554-noop-fetch-negotiator.sh:
  66eede4a37 (prepare_repo_settings(): plug leak of config values,
  2022-09-08)

- t2012-checkout-last.sh, t7504-commit-msg-hook.sh,
  t91{15,46,60}-git-svn-*.sh: The in-flight "pw/rebase-no-reflog-action"
  series, upon which this is based:
  https://lore.kernel.org/git/pull.1405.git.1667575142.gitgitgadget@gmail.com/

Let's mark all of these as passing with
"TEST_PASSES_SANITIZE_LEAK=true", to have it regression tested,
including as part of the "linux-leaks" CI job.

Additionally, let's remove the "!SANITIZE_LEAK" prerequisite from
tests that now pass, these were marked as failing in:

- 77e56d55ba (diff.c: fix a double-free regression in a18d66cefb,
  2022-03-17)
- c4d1d52631 (tests: change some 'test $(git) = "x"' to test_cmp,
  2022-03-07)

These were not spotted with the new "check" mode, but manually, it
doesn't cover these sort of prerequisites. There's few enough that we
shouldn't bother to automate it. They'll be going away sooner than
later.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
cb34852270 Merge branch 'pw/rebase-no-reflog-action' into ab/various-leak-fixes
* pw/rebase-no-reflog-action:
  rebase: stop exporting GIT_REFLOG_ACTION
  sequencer: stop exporting GIT_REFLOG_ACTION
2022-11-21 12:32:24 +09:00
07047d6829 cocci: apply "pending" index-compatibility to some "builtin/*.c"
Apply "index-compatibility.pending.cocci" rule to "builtin/*", but
exclude those where we conflict with in-flight changes.

As a result some of them end up using only "the_index", so let's have
them use the more narrow "USE_THE_INDEX_VARIABLE" rather than
"USE_THE_INDEX_COMPATIBILITY_MACROS".

Manual changes not made by coccinelle, that were squashed in:

* Whitespace-wrap argument lists for repo_hold_locked_index(),
  repo_read_index_preload() and repo_refresh_and_write_index(), in cases
  where the line became too long after the transformation.
* Change "refresh_cache()" to "refresh_index()" in a comment in
  "builtin/update-index.c".
* For those whose call was followed by perror("<macro-name>"), change
  it to perror("<function-name>"), referring to the new function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
bdafeae0b9 cache.h & test-tool.h: add & use "USE_THE_INDEX_VARIABLE"
In a preceding commit we fully applied the
"index-compatibility.pending.cocci" rule to "t/helper/*".

Let's now stop defining "USE_THE_INDEX_COMPATIBILITY_MACROS" in
test-tool.h itself, and instead instead define
"USE_THE_INDEX_VARIABLE" in the individual test helpers that need
it. This mirrors how we do the same thing in the "builtin/" directory.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
666f53eb43 {builtin/*,repository}.c: add & use "USE_THE_INDEX_VARIABLE"
Split up the "USE_THE_INDEX_COMPATIBILITY_MACROS" into that setting
and a more narrow "USE_THE_INDEX_VARIABLE". In the case of these
built-ins we only need "the_index" variable, but not the compatibility
wrapper for functions we're not using.

Let's then have some users of "USE_THE_INDEX_COMPATIBILITY_MACROS" use
this more narrow and descriptive define.

For context: The USE_THE_INDEX_COMPATIBILITY_MACROS macro was added to
test-tool.h in f8adbec9fe (cache.h: flip
NO_THE_INDEX_COMPATIBILITY_MACROS switch, 2019-01-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
0ea414a14d cocci: apply "pending" index-compatibility to "t/helper/*.c"
Apply the "index-compatibility.pending.cocci" rule to the "t/helper/*"
directory, a subsequent commit will extend cache.h to further narrow
down the use of "USE_THE_INDEX_COMPATIBILITY_MACROS" in this area.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
dc594180d9 cocci & cache.h: apply variable section of "pending" index-compatibility
Mostly apply the part of "index-compatibility.pending.cocci" that
renames the global variables like "active_nr", which are a shorthand
to referencing (in that case) a struct member as "the_index.cache_nr".

In doing so move more of "index-compatibility.pending.cocci" to
"index-compatibility.cocci".

In the case of "active_nr" we'd have a textual conflict with
"ab/various-leak-fixes" in "next"[1]. Let's exclude that specific case
while moving the rule over from "pending".

1. 407b94280f8 (commit: discard partial cache before (re-)reading it,
   2022-11-08)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
031b2033e0 cocci & cache.h: apply a selection of "pending" index-compatibility
Apply a selection of rules in "index-compatibility.pending.cocci"
tree-wide, and in doing so migrate them to
"index-compatibility.cocci".

As in preceding commits the only manual changes here are the macro
removals in "cache.h", and the update to the '*.cocci" rules. The rest
of the C code changes are the result of applying those updated rules.

Move rules for some rarely used cache compatibility macros from
"index-compatibility.pending.cocci" to "index-compatibility.cocci" and
apply them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
0e6550a2c6 cocci: add a index-compatibility.pending.cocci
Add a coccinelle rule which covers the rest of the macros guarded by
"USE_THE_INDEX_COMPATIBILITY_MACROS" cache.h. If the result of this
were applied it can be reduced down to just:

	#ifdef USE_THE_INDEX_COMPATIBILITY_MACROS
	extern struct index_state the_index;
	#endif

But that patch is just under 2000 lines, so let's first add this as a
"pending", and then incrementally pick changes from it in subsequent
commits. In doing that we'll migrate rules from this
"index-compatibility.pending.cocci" to the "index-compatibility.cocci"
created in a preceding commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
9c5f3ee3b3 read-cache API & users: make discard_index() return void
The discard_index() function has not returned non-zero since
7a51ed66f6 (Make on-disk index representation separate from in-core
one, 2008-01-14), but we've had various code in-tree still acting as
though that might be the case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
fbc1ed629e cocci & cache.h: remove rarely used "the_index" compat macros
Since 4aab5b46f4 (Make read-cache.c "the_index" free., 2007-04-01)
we've been undergoing a slow migration away from these macros, but
haven't made much progress since f8adbec9fe (cache.h: flip
NO_THE_INDEX_COMPATIBILITY_MACROS switch, 2019-01-24).

Let's move forward a bit by changing the users of those macros that
are rare enough that we can convert them in one go, and then remove
the compatibility shim.

The only manual change to the C code here is to "cache.h", the rest is
all the result of applying the new "index-compatibility.cocci".

Even though it's a one-off, let's keep the coccinelle rules for
now. We'll extend them in subsequent commits, and this will help
anything that's in-flight or out-of-tree to migrate.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
8f56511945 builtin/{grep,log}.: don't define "USE_THE_INDEX_COMPATIBILITY_MACROS"
Adding "USE_THE_INDEX_COMPATIBILITY_MACROS" to these two appears to
have been unnecessary from the start, as going back and compiling
f8adbec9fe (cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch,
2019-01-24) without that addition works.

Let's not have these ask for the compatibility macros from cache.h
that they don't need.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:14 +09:00
c74e7b10b6 cache.h: remove unused "the_index" compat macros
The "active_alloc" macro added in 228e94f935 (Move index-related
variables into a structure., 2007-04-01) has not been used since
4aab5b46f4 (Make read-cache.c "the_index" free., 2007-04-01). Let's
remove it.

The rest of these are likewise unused, so let's not keep them
around. E.g. 12cd0bf9b0 (dir: stop using the index compatibility
macros, 2017-05-05) is the last use of "cache_dir_exists".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:14 +09:00
a0789512c5 The thirteenth batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-18 18:48:53 -05:00
e87a229d57 Merge branch 'en/sparse-checkout-design'
Design doc.

* en/sparse-checkout-design:
  sparse-checkout.txt: new document with sparse-checkout directions
2022-11-18 18:44:01 -05:00
26734da056 Merge branch 'jk/branch-delete-detached'
Fix a bug where `git branch -d` did not work on an orphaned HEAD.

* jk/branch-delete-detached:
  branch: gracefully handle '-d' on orphan HEAD
2022-11-18 18:44:00 -05:00
35a62bb579 Merge branch 'mh/credential-unrecognized-attrs'
Docfix.

* mh/credential-unrecognized-attrs:
  docs: clarify that credential discards unrecognised attributes
2022-11-18 18:43:59 -05:00
a92fce4c50 Merge branch 'vd/skip-cache-tree-update'
Avoid calling 'cache_tree_update()' when doing so would be redundant.

* vd/skip-cache-tree-update:
  rebase: use 'skip_cache_tree_update' option
  read-tree: use 'skip_cache_tree_update' option
  reset: use 'skip_cache_tree_update' option
  unpack-trees: add 'skip_cache_tree_update' option
  cache-tree: add perf test comparing update and prime
2022-11-18 18:43:56 -05:00
3f98d7ab1b Merge branch 'mh/increase-credential-cache-timeout'
Update the credential-cache documentation to provide a more realistic
example.

* mh/increase-credential-cache-timeout:
  Documentation: increase example cache timeout to 1 hour
2022-11-18 18:43:55 -05:00
35dc2cf03f Merge branch 'vd/update-refs-delete'
`git rebase --update-refs` would delete references when all `update-ref`
commands in the sequencer were removed, which has been corrected.

* vd/update-refs-delete:
  rebase --update-refs: avoid unintended ref deletion
2022-11-18 18:43:11 -05:00
ad9096881d Merge branch 'tb/repack-expire-to'
"git repack" learns to send cruft objects out of the way into
packfiles outside the repository.

* tb/repack-expire-to:
  builtin/repack.c: implement `--expire-to` for storing pruned objects
  builtin/repack.c: write cruft packs to arbitrary locations
  builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack`
  builtin/repack.c: pass "out" to `prepare_pack_objects`
2022-11-18 18:43:09 -05:00
e53598a5ab Merge branch 'ab/sha-makefile-doc'
Makefile comments updates and reordering to clarify knobs used to
choose SHA implementations.

* ab/sha-makefile-doc:
  Makefile: discuss SHAttered in *_SHA{1,256} discussion
  Makefile: document default SHA-1 backend on OSX
  Makefile & test-tool: replace "DC_SHA1" variable with a "define"
  Makefile: document SHA-1 and SHA-256 default and selection order
  Makefile: document default SHA-256 backend
  Makefile: rephrase the discussion of *_SHA1 knobs
  Makefile: create and use sections for "define" flag listing
  Makefile: correct DC_SHA1 documentation
  INSTALL: remove discussion of SHA-1 backends
  Makefile: always (re)set DC_SHA1 on fallback
2022-11-18 18:43:07 -05:00
69c1d609ba Merge branch 'ab/misc-hook-submodule-run-command'
Various test updates.

* ab/misc-hook-submodule-run-command:
  run-command tests: test stdout of run_command_parallel()
  submodule tests: reset "trace.out" between "grep" invocations
  hook tests: fix redirection logic error in 96e7225b31
2022-11-18 18:43:04 -05:00
7025f54c40 delta-islands: free island-related data after use
On my use case involving 771 islands of Linux on kernel.org,
this reduces memory usage by around 25MB.  The bulk of that
comes from free_remote_islands, since free_config_regexes only
saves around 40k.

This memory is saved early in the memory-intensive pack process,
making it available for the remainder of the long process.

Signed-off-by: Eric Wong <e@80x24.org>
Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-18 18:30:49 -05:00
8db2dad7a0 parse_object(): check on-disk type of suspected blob
In parse_object(), we try to handle blobs by streaming rather than
loading them entirely into memory. The most common case here will be
that we haven't seen the object yet and check oid_object_info(), which
tells us we have a blob.

But we trigger this code on one other case: when we have an in-memory
object struct with type OBJ_BLOB (and without its "parsed" flag set,
since otherwise we'd return early from the function). This indicates
that some other part of the code suspected we have a blob (e.g., it was
mentioned by a tree or tag) but we haven't yet looked at the on-disk
copy.

In this case before hitting the streaming path, we check if we have the
object on-disk at all. This is mostly pointless extra work, as the
streaming path would complain if it couldn't open the object (albeit
with the message "hash mismatch", which is a little misleading).

But it's also insufficient to catch all problems. The streaming code
will only tell us "yes, the on-disk object matches the oid". But it
doesn't actually confirm that what we found was indeed a blob, and
neither does repo_has_object_file().

One way to improve this would be to teach stream_object_signature() to
check the type (either by returning it to us to check, or taking an
"expected" type). But there's an even simpler fix here: if we suspect
the object is a blob, just call oid_object_info() to confirm that we
have it on-disk, and that it really is a blob.

This is slightly less efficient than teaching stream_object_signature()
to do it (since it has to open the object already). But this case very
rarely comes up. In practice, we usually don't have any clue what the
type is, in which case we already call oid_object_info(). This
"suspected" case happens only when some other code created an object
struct but didn't actually parse the blob, which is actually tricky to
trigger at all (see the discussion of the test below).

I reworked the conditional a bit so that instead of:

  if ((suspected_blob && oid_object_info() == OBJ_BLOB)
      (no_clue && oid_object_info() == OBJ_BLOB)

we have the simpler:

  if ((suspected_blob || no_clue) && oid_object_info() == OBJ_BLOB)

This is shorter, but also reflects what we really want say, which is
"have we ruled out this being a blob; if not, check it on-disk".

In either case, if oid_object_info() fails to tell us it's a blob, we'll
skip the streaming code path and call repo_read_object_file(), just as
before. And if we really do have a mismatch with the existing object
struct, we'll eventually call lookup_commit(), etc, via
parse_object_buffer(), which will complain that it doesn't match our
existing obj->type.

So this fixes one of the lingering expect_failure cases from 0616617c7e
(t: introduce tests for unexpected object types, 2019-04-09).  That test
works by peeling a tag that claims to point to a blob (triggering us to
create the struct), but really points to something else, which we later
discover when we call parse_object() as part of the actual traversal).
Prior to this commit, we'd quietly check the sha1 and mark the blob as
"parsed". Now we correctly complain about the mismatch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-18 13:59:31 -05:00
04fb96219a parse_object(): drop extra "has" check before checking object type
When parsing an object of unknown type, we check to see if it's a blob,
so we can use our streaming code path. This uses oid_object_info() to
check the type, but before doing so we call repo_has_object_file(). This
latter is pointless, as oid_object_info() will already fail if the
object is missing. Checking it ahead of time just complicates the code
and is a waste of resources (albeit small).

Let's drop the redundant check.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-18 13:59:31 -05:00
cfbd173ccb branch: force-copy a branch to itself via @{-1} is a no-op
Since 52d59cc645 (branch: add a --copy (-c) option to go with --move
(-m), 2017-06-18) we can copy a branch to make a new branch with the
'-c' (copy) option or to overwrite an existing branch using the '-C'
(force copy) option.  A no-op possibility is considered when we are
asked to copy a branch to itself, to follow the same no-op introduced
for the rename (-M) operation in 3f59481e33 (branch: allow a no-op
"branch -M <current-branch> HEAD", 2011-11-25).  To check for this, in
52d59cc645 we compared the branch names provided by the user, source
(HEAD if omitted) and destination, and a match is considered as this
no-op.

Since ae5a6c3684 (checkout: implement "@{-N}" shortcut name for N-th
last branch, 2009-01-17) a branch can be specified using shortcuts like
@{-1}.  This allows this usage:

	$ git checkout -b test
	$ git checkout -
	$ git branch -C test test  # no-op
	$ git branch -C test @{-1} # oops
	$ git branch -C @{-1} test # oops

As we are using the branch name provided by the user to do the
comparison, if one of the branches is provided using a shortcut we are
not going to have a match and a call to git_config_copy_section() will
happen.  This will make a duplicate of the configuration for that
branch, and with this progression the second call will produce four
copies of the configuration, and so on.

Let's use the interpreted branch name instead for this comparison.

The rename operation is not affected.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 17:16:21 -05:00
bcec6780b2 receive-pack: only use visible refs for connectivity check
When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.

We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of objects reachable that have previously only ever been hidden,
whereas the common case is to push incremental changes that build on top
of the visible object graph.

This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs hidden. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 refs
as it is configured in Gitaly leads to a 4.5-fold speedup:

    Benchmark 1: main
      Time (mean ± σ):     30.977 s ±  0.157 s    [User: 30.226 s, System: 1.083 s]
      Range (min … max):   30.796 s … 31.071 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      6.799 s ±  0.063 s    [User: 6.803 s, System: 0.354 s]
      Range (min … max):    6.729 s …  6.850 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        4.56 ± 0.05 times faster than 'main'

As we mostly go through the same codepaths even in the case where there
are no hidden refs at all compared to the code before there is no change
in performance when no refs are hidden:

    Benchmark 1: main
      Time (mean ± σ):     48.188 s ±  0.432 s    [User: 49.326 s, System: 5.009 s]
      Range (min … max):   47.706 s … 48.539 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     48.027 s ±  0.500 s    [User: 48.934 s, System: 5.025 s]
      Range (min … max):   47.504 s … 48.500 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.00 ± 0.01 times faster than 'main'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:52 -05:00
5ff36c9b6b rev-parse: add --exclude-hidden= option
Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:52 -05:00
8c1bc2a71a revision: add new parameter to exclude hidden refs
Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
not an easy way to obtain the list of all visible or hidden refs right
now. We'll require just that though for a performance improvement in our
connectivity check.

Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--branches`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:52 -05:00
1e9f273ac0 revision: introduce struct to handle exclusions
The functions that handle exclusion of refs work on a single string
list. We're about to add a second mechanism for excluding refs though,
and it makes sense to reuse much of the same architecture for both kinds
of exclusion.

Introduce a new `struct ref_exclusions` that encapsulates all the logic
related to excluding refs and move the `struct string_list` that holds
all wildmatch patterns of excluded refs into it. Rename functions that
operate on this struct to match its name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:52 -05:00
05b9425960 revision: move together exclusion-related functions
Move together the definitions of functions that handle exclusions of
refs so that related functionality sits in a single place, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:51 -05:00
9b67eb6fbe refs: get rid of global list of hidden refs
We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.

Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:51 -05:00
5eeb9aa208 refs: fix memory leak when parsing hideRefs config
When parsing the hideRefs configuration, we first duplicate the config
value so that we can modify it. We then subsequently append it to the
`hide_refs` string list, which is initialized with `strdup_strings`
enabled. As a consequence we again reallocate the string, but never
free the first duplicate and thus have a memory leak.

While we never clean up the static `hide_refs` variable anyway, this is
no excuse to make the leak worse by leaking every value twice. We are
also about to change the way this variable will be handled so that we do
indeed start to clean it up. So let's fix the memory leak by using the
`string_list_append_nodup()` so that we pass ownership of the allocated
string to `hide_refs`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:51 -05:00
3c9b01f0bf notes: avoid empty line in template
When `git notes` prepares the template it adds an empty newline between
the comment header and the content:

>
> #
> # Write/edit the notes for the following object:
>
> # commit 0f3c55d4c2b7864bffb2d92278eff08d0b2e083f
> # etc

This is wrong structurally because that newline is part of the comment,
too, and thus should be commented. Also, it throws off some positioning
strategies of editors and plugins, and it differs from how we do commit
templates.

Change this to follow the standard set by `git commit`:

>
> #
> # Write/edit the notes for the following object:
> #
> # commit 0f3c55d4c2b7864bffb2d92278eff08d0b2e083f
>

Tests pass unchanged after this code change.

Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-16 14:57:32 -05:00
23fb328c8d t7610: use "file:///dev/null", not "/dev/null", fixes MinGW
On MinGW the "/dev/null" is translated to "nul" on command-lines, even
though as in this case it'll never end up referring to an actual file.

So on Windows the fix for the previous "example.com" timeout issue in
8354cf752e (t7610: fix flaky timeout issue, don't clone from
example.com, 2022-11-05) would yield:

  fatal: repo URL: 'nul' must be absolute or begin with ./|../

Let's evade this yet again by prefixing this with "file://", which
makes this pass in the Windows CI.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-15 20:05:02 -05:00
049141dce9 bisect; remove unused "git-bisect.sh" and ".gitignore" entry
Since 73fce29427 (Turn `git bisect` into a full built-in, 2022-11-10)
we've used builtin/bisect.c instead of git-bisect.sh to implement the
"bisect" command.

Let's remove the unused leftover script, and the ".gitignore" entry for
the "git-bisect--helper", which also hasn't been built since
73fce29427.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-15 14:38:16 -05:00
03744bbdc4 builtin/gc.c: fix use-after-free in maintenance_unregister()
While trying to fix a move based on an uninitialized value (along with a
declaration after the first statement), be0fd57228
(maintenance --unregister: fix uninit'd data use &
-Wdeclaration-after-statement, 2022-11-15) unintentionally introduced a
use-after-free.

The problem arises when `maintenance_unregister()` sees a non-NULL
`config_file` string and thus tries to call
git_configset_get_value_multi() to lookup the corresponding values.

We store the result off, and then call git_configset_clear(), which
frees the pointer that we just stored. We then try to read that
now-freed pointer a few lines below, and there we have our
use-after-free:

    $ ./t7900-maintenance.sh -vxi --run=23 --valgrind
    [...]
    + git maintenance unregister --config-file ./other
    ==3048727== Invalid read of size 8
    ==3048727==    at 0x1869CA: maintenance_unregister (gc.c:1590)
    ==3048727==    by 0x188F42: cmd_maintenance (gc.c:2651)
    ==3048727==    by 0x128C62: run_builtin (git.c:466)
    ==3048727==    by 0x12907E: handle_builtin (git.c:721)
    ==3048727==    by 0x1292EC: run_argv (git.c:788)
    ==3048727==    by 0x12988E: cmd_main (git.c:926)
    ==3048727==    by 0x21ED39: main (common-main.c:57)
    ==3048727==  Address 0x4b38bc8 is 24 bytes inside a block of size 64 free'd
    ==3048727==    at 0x484617B: free (vg_replace_malloc.c:872)
    ==3048727==    by 0x2D207E: free_individual_entries (hashmap.c:188)
    ==3048727==    by 0x2D2153: hashmap_clear_ (hashmap.c:207)
    ==3048727==    by 0x270B5C: git_configset_clear (config.c:2375)
    ==3048727==    by 0x1869AC: maintenance_unregister (gc.c:1585)
    ==3048727==    by 0x188F42: cmd_maintenance (gc.c:2651)
    ==3048727==    by 0x128C62: run_builtin (git.c:466)
    ==3048727==    by 0x12907E: handle_builtin (git.c:721)
    ==3048727==    by 0x1292EC: run_argv (git.c:788)
    ==3048727==    by 0x12988E: cmd_main (git.c:926)
    ==3048727==    by 0x21ED39: main (common-main.c:57)
    [...]

Resolve this via a partial-revert of be0fd57228. The config_set struct
now gets a zero initialization, which makes free()-ing it a noop even
without calling git_configset_init(). When we do initialize it to a
non-zero value, it is only free()'d after our last read of `list`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-15 13:56:11 -05:00
be0fd57228 maintenance --unregister: fix uninit'd data use & -Wdeclaration-after-statement
Since (maintenance: add option to register in a specific config,
2022-11-09) we've been unable to build with "DEVELOPER=1" without
"DEVOPTS=no-error", as the added code triggers a
"-Wdeclaration-after-statement" warning.

And worse than that, the data handed to git_configset_clear() is
uninitialized, as can be spotted with e.g.:

	./t7900-maintenance.sh -vixd --run=23 --valgrind
	[...]
	+ git maintenance unregister --force
	Conditional jump or move depends on uninitialised value(s)
	   at 0x6B5F1E: git_configset_clear (config.c:2367)
	   by 0x4BA64E: maintenance_unregister (gc.c:1619)
	   by 0x4BD278: cmd_maintenance (gc.c:2650)
	   by 0x409905: run_builtin (git.c:466)
	   by 0x40A21C: handle_builtin (git.c:721)
	   by 0x40A58E: run_argv (git.c:788)
	   by 0x40AF68: cmd_main (git.c:926)
	   by 0x5D39FE: main (common-main.c:57)
	 Uninitialised value was created by a stack allocation
	   at 0x4BA22C: maintenance_unregister (gc.c:1557)

Let's fix both of these issues, and also move the scope of the
variable to the "if" statement it's used in, to make it obvious where
it's used.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-15 12:31:53 -05:00
1f80129d61 maintenance: add option to register in a specific config
maintenance register currently records the maintenance repo exclusively
within the user's global configuration, but other configuration files
may be relevant when running maintenance if they are included from the
global config. This option allows the user to choose where maintenance
repos are recorded.

Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 22:39:25 -05:00
13d5bbdf72 for-each-repo: interpolate repo path arguments
This is a quality of life change for git-maintenance, so repos can be
recorded with the tilde syntax. The register subcommand will not record
repos in this format by default.

Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 22:39:25 -05:00
eea7033409 The twelfth batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 19:56:07 -05:00
3c5d0ce3f5 Merge branch 'vh/my-first-contribution-typo'
Documentation fix.

* vh/my-first-contribution-typo:
  Documentation: fix typo
2022-11-14 19:53:55 -05:00
859899ddc1 Merge branch 'ks/partialclone-casing'
Documentation fix.

* ks/partialclone-casing:
  repository-version.txt: partialClone casing change
2022-11-14 19:53:43 -05:00
dc8be3971c Merge branch 'mh/password-can-be-pat'
Documentation update to git-credential(1).

* mh/password-can-be-pat:
  Documentation/gitcredentials.txt: mention password alternatives
2022-11-14 19:53:42 -05:00
69eb1be693 Merge branch 'js/ci-set-output'
Update the actions/github-script dependency in CI to avoid a
deprecation warning.

* js/ci-set-output:
  ci: use a newer `github-script` version
2022-11-14 19:53:38 -05:00
311bf13147 Merge branch 'ab/rev-info-init'
Progress on being able to initialize a rev_info struct with a macro.

* ab/rev-info-init:
  revisions API: extend the nascent REV_INFO_INIT macro
2022-11-14 19:53:37 -05:00
d0c3853034 Merge branch 'al/trace2-clearing-skip-worktree'
Add trace2 counters to the region to clear skip worktree bits in a
sparse checkout.

* al/trace2-clearing-skip-worktree:
  index: raise a bug if the index is materialised more than once
  index: add trace2 region for clear skip worktree
2022-11-14 19:53:34 -05:00
561f3948a5 Merge branch 'do/modernize-t7001'
Modernize test script to avoid "test -f" and friends.

* do/modernize-t7001:
  t7001-mv.sh: modernizing test script using functions
2022-11-14 19:53:31 -05:00
dabb9d875f Docs: describe how a credential-generating helper works
Previously the docs only described storage helpers.

A concrete example: Git Credential Manager can generate credentials
for GitHub and GitLab via OAuth.
https://github.com/GitCredentialManager/git-credential-manager

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 18:18:59 -05:00
c5353c4552 Documentation: fix typo
Signed-off-by: Vlad-Stefan Harbuz <vlad@vladh.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 18:14:58 -05:00
b637a41ebe http: redact curl h2h3 headers in info
With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
"Authorization" and "Cookie" get redacted. However, since [1], curl's
h2h3 module (invoked when using HTTP/2) also prints headers in its
"info", which don't get redacted. For example,

  echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
  GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
    -c 'http.cookiefile=cookiefile' \
    -c 'http.version=' \
    ls-remote https://github.com/git/git refs/heads/main 2>output &&
  grep 'cookie' output

produces output like:

  23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
  23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>

Teach http.c to check for h2h3 headers in info and redact them using the
existing header redaction logic. This fixes the broken redaction logic
that we noted in the previous commit, so mark the redaction tests as
passing under HTTP2.

[1] f8c3724aa9

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 17:42:46 -05:00
73c49a4474 t: run t5551 tests with both HTTP and HTTP/2
We have occasionally seen bugs that affect Git running only against an
HTTP/2 web server, not an HTTP one. For instance, b66c77a64e (http:
match headers case-insensitively when redacting, 2021-09-22). But since
we have no test coverage using HTTP/2, we only uncover these bugs in the
wild.

That commit gives a recipe for converting our Apache setup to support
HTTP/2, but:

  - it's not necessarily portable

  - we don't want to just test HTTP/2; we really want to do a variety of
    basic tests for _both_ protocols

This patch handles both problems by running a duplicate of t5551
(labeled as t5559 here) with an alternate-universe setup that enables
HTTP/2. So we'll continue to run t5551 as before, but run the same
battery of tests again with HTTP/2. If HTTP/2 isn't supported on a given
platform, then t5559 should bail during the webserver setup, and
gracefully skip all tests (unless GIT_TEST_HTTPD has been changed from
"auto" to "yes", where the point is to complain when webserver setup
fails).

In theory other http-related test scripts could benefit from the same
duplication, but doing t5551 should give us a reasonable check of basic
functionality, and would have caught both bugs we've seen in the wild
with HTTP/2.

A few notes on the implementation:

  - a script enables the server side config by calling enable_http2
    before starting the webserver. This avoids even trying to load any
    HTTP/2 config for t5551 (which is what lets it keep working with
    regular HTTP even on systems that don't support it). This also sets
    a prereq which can be used by individual tests.

  - As discussed in b66c77a64e, the http2 module isn't compatible with
    the "prefork" mpm, so we need to pick something else. I chose
    "event" here, which works on my Debian system, but it's possible
    there are platforms which would prefer something else. We can adjust
    that later if somebody finds such a platform.

  - The test "large fetch-pack requests can be sent using chunked
    encoding" makes sure we use a chunked transfer-encoding by looking
    for that header in the trace. But since HTTP/2 has its own streaming
    mechanisms, we won't find such a header. We could skip the test
    entirely by marking it with !HTTP2. But there's some value in making
    sure that the fetch itself succeeded. So instead, we'll confirm that
    either we're using HTTP2 _or_ we saw the expected chunked header.

  - the redaction tests fail under HTTP/2 with recent versions of curl.
    This is a bug! I've marked them with !HTTP2 here to skip them under
    t5559 for the moment. Using test_expect_failure would be more
    appropriate, but would require a bunch of boilerplate. Since we'll
    be fixing them momentarily, let's just skip them for now to keep the
    test suite bisectable, and we can re-enable them in the commit that
    fixes the bug.

  - one alternative layout would be to push most of t5551 into a
    lib-t5551.sh script, then source it from both t5551 and t5559.
    Keeping t5551 intact seemed a little simpler, as its one less level
    of indirection for people fixing bugs/regressions in the non-HTTP/2
    tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 17:42:46 -05:00
8ddc06631b pack-bitmap.c: avoid exposing absolute paths
In "open_midx_bitmap_1()" and "open_pack_bitmap_1()", when we find that
there are multiple bitmaps, we will only open the first one and then
leave warnings about the remaining pack information, the information
will contain the absolute path of the repository, for example in a
alternates usage scenario. So let's hide this kind of potentially
sensitive information in this commit.

Found-by: XingXin <moweng.xx@antgroup.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 17:21:16 -05:00
2aa84d5f3e pack-bitmap.c: remove unnecessary "open_pack_index()" calls
When trying to open a pack bitmap, we call open_pack_bitmap_1() in a
loop, during which it tries to open up the pack index corresponding
with each available pack.

It's likely that we'll end up relying on objects in that pack later
in the process (in which case we're doing the work of opening the
pack index optimistically), but not guaranteed.

For instance, consider a repository with a large number of small
packs, and one large pack with a bitmap. If we see that bitmap pack
last in our loop which calls open_pack_bitmap_1(), the current code
will have opened *all* pack index files in the repository. If the
request can be served out of the bitmapped pack alone, then the time
spent opening these idx files was wasted.S

Since open_pack_bitmap_1() calls is_pack_valid() later on (which in
turns calls open_pack_index() itself), we can just drop the earlier
call altogether.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 17:21:16 -05:00
e62f779ae6 Doc: document push.recurseSubmodules=only
Git learned pushing submodules without pushing the superproject by
the user specifying --recurse-submodules=only through 6c656c3fe4
("submodules: add RECURSE_SUBMODULES_ONLY value", 2016-12-20) and
225e8bf778 ("push: add option to push only submodules", 2016-12-20).
For users who use this feature regularly, it is desirable to have an
equivalent configuration.

It turns out that such a configuration (push.recurseSubmodules=only) is
already supported, even though it is neither documented nor mentioned
in the commit messages, due to the way the --recurse-submodules=only
feature was implemented (a function used to parse --recurse-submodules
was updated to support "only", but that same function is used to parse
push.recurseSubmodules too). What is left is to document it and test it,
which is what this commit does.

There is a possible point of confusion when recursing into a submodule
that itself has the push.recurseSubmodules=only configuration, because
if a repository has only its submodules pushed and not itself, its
superproject can never be pushed. Therefore, treat such configurations
as being "on-demand", and print a warning message.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 16:55:50 -05:00
7fd54b6238 docs: clarify that credential discards unrecognised attributes
It was previously unclear how unrecognised attributes are handled.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-12 23:57:34 -05:00
501e3bab99 merge-tree.c: allow specifying the merge-base when --stdin is passed
The previous commit added a `--merge-base` option in order to allow
using a specified merge-base for the merge.  Extend the input accepted
by `--stdin` to also allow a specified merge-base with each merge
requested.  For example:

    printf "<b3> -- <b1> <b2>" | git merge-tree --stdin

does a merge of b1 and b2, and uses b3 as the merge-base.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-12 23:53:04 -05:00
66265a693e merge-tree.c: add --merge-base=<commit> option
This patch will give our callers more flexibility to use `git merge-tree`,
such as:

    git merge-tree --write-tree --merge-base=branch^ HEAD branch

This does a merge of HEAD and branch, but uses branch^ as the merge-base.

And the reason why using an option flag instead of a positional argument
is to allow additional commits passed to merge-tree to be handled via an
octopus merge in the future.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-12 23:53:04 -05:00
a90085b68c tests(scalar): tighten the stale scalar.repo test some
As pointed out by Stolee, the previous incarnation of this test case was
not stringent enough: we want to verify that _only_ the stale entries
are removed (previously, the test case would have succeeded even if all
entries had been removed).

Let's rectify this and verify that the other entries are left intact.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:24:36 -05:00
29c550f0af repository-version.txt: partialClone casing change
Remotes are considered "promisor" if extensions.partialClone and some
other configuration variables are set. The casing for this in
Documentation/technical/repository-version.txt is not proper and may
cause confusion. This change corrects this casing.

Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:23:12 -05:00
0d12792f5f Makefile: don't create a ".build/.build/" for cocci, fix output
Fix a couple of issues in the recently merged 0f3c55d4c2b (Merge
branch 'ab/coccicheck-incremental' into next, 2022-11-08):

In copying over the "contrib/coccinelle/" rules to
".build/contrib/coccinelle/" we inadvertently ended up with a
".build/.build/contrib/coccinelle/" as well. We'd generate the
per-file patches in the former, and keep the rule and overall result
in the latter. E.g. running:

	make contrib/coccinelle/free.cocci.patch COCCI_SOURCES="attr.c grep.c"

Would, per "tree -a .build" yield the following result:

	.build
	├── .build
	│   └── contrib
	│       └── coccinelle
	│           └── free.cocci.patch
	│               ├── attr.c
	│               ├── attr.c.log
	│               ├── grep.c
	│               └── grep.c.log
	└── contrib
	    └── coccinelle
	        ├── FOUND_H_SOURCES
	        ├── free.cocci
	        └── free.cocci.patch

Now we'll instead generate all of our files in
".build/contrib/coccinelle/". Fixing this required renaming the
directory where we keep our per-file patches, as we'd otherwise
conflict with the result.

Now the per-file patch directory is named e.g. "free.cocci.d". And the
end result will now be:

	.build
	└── contrib
	    └── coccinelle
	        ├── FOUND_H_SOURCES
	        ├── free.cocci
	        ├── free.cocci.d
	        │   ├── attr.c.patch
	        │   ├── attr.c.patch.log
	        │   ├── grep.c.patch
	        │   └── grep.c.patch.log
	        └── free.cocci.patch

The per-file patches now have a ".patch" file suffix, which fixes
another issue reported against 0f3c55d4c2b: The summary output was
confusing. Before for the "make" command above we'd emit:

	[...]
	MKDIR -p .build/contrib/coccinelle
	CP contrib/coccinelle/free.cocci .build/contrib/coccinelle/free.cocci
	GEN .build/contrib/coccinelle/FOUND_H_SOURCES
	MKDIR -p .build/.build/contrib/coccinelle/free.cocci.patch
	SPATCH .build/.build/contrib/coccinelle/free.cocci.patch/grep.c
	SPATCH .build/.build/contrib/coccinelle/free.cocci.patch/attr.c
	SPATCH CAT $^ >.build/contrib/coccinelle/free.cocci.patch
	CP .build/contrib/coccinelle/free.cocci.patch contrib/coccinelle/free.cocci.patch

But now we'll instead emit (identical output at the start omitted):

	[...]
	MKDIR -p .build/contrib/coccinelle/free.cocci.d
	SPATCH grep.c >.build/contrib/coccinelle/free.cocci.d/grep.c.patch
	SPATCH attr.c >.build/contrib/coccinelle/free.cocci.d/attr.c.patch
	SPATCH CAT .build/contrib/coccinelle/free.cocci.d/**.patch >.build/contrib/coccinelle/free.cocci.patch
	CP .build/contrib/coccinelle/free.cocci.patch contrib/coccinelle/free.cocci.patch

I.e. we have an "SPATCH" line that makes it clear that we're running
against the "{attr,grep}.c" file. The "SPATCH CAT" is then altered to
correspond to it, showing that we're concatenating the
"free.cocci.d/**.patch" files into one generated "free.cocci.patch" at
the end.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:21:45 -05:00
73fce29427 Turn git bisect into a full built-in
Now that the shell script hands off to the `bisect--helper` to do
_anything_ (except to show the help), it is but a tiny step to let the
helper implement the actual `git bisect` command instead.

This retires `git-bisect.sh`, concluding a multi-year journey that many
hands helped with, in particular Pranit Bauna, Tanushree Tumane and
Miriam Rubio.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:06:02 -05:00
0da4b538e4 bisect--helper: log: allow arbitrary number of arguments
In a later change, we would like to turn bisect into a builtin by
renaming bisect--helper.

However, there's an oddity that "git bisect log" accepts any number of
arguments and it will just ignore them all.

Let's prepare for the next step by ignoring any arguments passed to
"git bisect--helper log"

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:06:01 -05:00
df63421be9 bisect--helper: handle states directly
In preparation for making `git bisect` a real built-in, let's prepare
the `bisect--helper` built-in to handle `git bisect--helper good` and
`git bisect--helper bad`, i.e. eliminate the need of `state` subcommand.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:06:00 -05:00
5512376ae1 bisect--helper: emit usage for "git bisect"
In subsequent commits we'll be removing "git-bisect.sh" in favor of
promoting "bisect--helper" to a "bisect" built-in.

In doing that we'll first need to have it support "git bisect--helper
<cmd>" rather than "git bisect--helper --<cmd>", and then finally have
its "-h" output claim to be "bisect" rather than "bisect--helper".

Instead of suffering that churn let's start claiming to be "git
bisect" now. In just a few commits this will be true, and in the
meantime emitting the "wrong" usage information from the helper is a
small price to pay to avoid the churn.

Let's also declare "BUILTIN_*" macros, when we eventually migrate the
sub-commands themselves to parse_options() we'll be able to re-use the
strings. See 0afd556b2e (worktree: define subcommand -h in terms of
command -h, 2022-10-13) for a recent example.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:58 -05:00
929bf9db28 bisect test: test exit codes on bad usage
Address a test blindspot, the "log" command is the odd one out because
"git-bisect.sh" ignores any arguments it receives. Let's test both the
exit codes we expect, and the stderr and stdout we're emitting.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:57 -05:00
252060be77 bisect--helper: identify as bisect when report error
In a later change, we will convert the bisect--helper to be builtin
bisect. Let's start by self-identifying it's the real bisect when reporting
error.

This change is safe since 'git bisect--helper' is an implementation
detail, users aren't expected to call 'git bisect--helper'.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:55 -05:00
8962f8f888 bisect-run: verify_good: account for non-negative exit status
Some system never reports negative exit code at all, they reports them
as bigger-than-128 instead.  We take extra care for those systems in the
later check for normal 'do_bisect_run' loop.

Let's check it here, too.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:53 -05:00
461fec41fa bisect run: keep some of the post-v2.30.0 output
Preceding commits fixed output and behavior regressions in
d1bbbe45df (bisect--helper: reimplement `bisect_run` shell function
in C, 2021-09-13), which did not claim to be changing the output of
"git bisect run".

But some of the output it emitted was subjectively better, so once
we've asserted that we're back on v2.29.0 behavior, let's change some
of it back:

- We now quote the arguments again, but omit the first " " when
  printing the "running" line.
- Ditto for other cases where we emitted the argument
- We say "found first bad commit" again, not just "run success"

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Based-on-patch-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:52 -05:00
f37d0bdd42 bisect: fix output regressions in v2.30.0
When d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13) reimplemented parts of "git bisect run" in
C it changed the output we emitted so that:

 - The "running ..." line was now quoted
 - We lost the \n after our output
 - We started saying "bisect found ..." instead of "bisect run success"

Arguably some of this is better now, but as d1bbbe45df did not
advocate for changing the output, let's revert this for now. It'll be
easy to change it back if that's what we'd prefer.

This does not change the one remaining use of "command.buf" to emit
the quoted argument, as that's new in d1bbbe45df.

Some of these cases were not tested for in the tests added in the
preceding commit, I didn't have time to fleshen those out, but a look
at f1de981e8b will show that the other output being adjusted here is
now equivalent to what it was before d1bbbe45df.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:51 -05:00
bdd2aa8a8b bisect: refactor bisect_run() to match CodingGuidelines
We didn't add "{}" to all "if/else" branches, and one "error" was
mis-indented. Let's fix that first, which makes subsequent commits
smaller. In the case of the "if" we can simply early return instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:50 -05:00
982fecf7c1 bisect tests: test for v2.30.0 "bisect run" regressions
Add three failing tests which succeed on v2.29.0, but due to the topic
merged at [1] (specifically [2]) have been failing since then. We'll
address those regressions in subsequent commits.

There was also a "regression" where:

	git bisect run ./missing-script.sh

Would count a non-existing script as "good", as the shell would exit
with 127. That edge case is a bit too insane to preserve, so let's not
add it to these regression tests.

There was another regression that 'git bisect' consumed some options
that was meant to passed down to program run with 'git bisect run'.
Since that regression is breaking user's expectation, it has been fixed
earlier without this patch queued.

1. 0a4cb1f1f2 (Merge branch 'mr/bisect-in-c-4', 2021-09-23)
2. d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
   function in C, 2021-09-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:48 -05:00
2445d34fb9 Merge branch 'dd/bisect-helper-subcommand' into dd/git-bisect-builtin
* dd/bisect-helper-subcommand:
  bisect--helper: parse subcommand with OPT_SUBCOMMAND
  bisect--helper: move all subcommands into their own functions
  bisect--helper: remove unused options
2022-11-11 17:05:43 -05:00
e9011b6092 bisect--helper: parse subcommand with OPT_SUBCOMMAND
As of it is, we're parsing subcommand with OPT_CMDMODE, which will
continue to parse more options even if the command has been found.

When we're running "git bisect run" with a command that expecting
a "--log" or "--no-log" arguments, or one of those "--bisect-..."
arguments, bisect--helper may mistakenly think those options are
bisect--helper's option.

We may fix those problems by passing "--" when calling from
git-bisect.sh, and skip that "--" in bisect--helper. However, it may
interfere with user's "--".

Let's parse subcommand with OPT_SUBCOMMAND since that API was born for
this specific use-case.

Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:04:57 -05:00
464ce0aba8 bisect--helper: move all subcommands into their own functions
In a later change, we will use OPT_SUBCOMMAND to parse sub-commands to
avoid consuming non-option opts.

Since OPT_SUBCOMMAND needs a function pointer to operate,
let's move it now.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:04:54 -05:00
58786d73ba bisect--helper: remove unused options
'git-bisect.sh' used to have a 'bisect_next_check' to check if we have
both good/bad, old/new terms set or not.  In commit 129a6cf344
(bisect--helper: `bisect_next_check` shell function in C, 2019-01-02),
a subcommand for bisect--helper was introduced to port the check to C.
Since d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13), all users of 'bisect_next_check' was
re-implemented in C, this subcommand was no longer used but we forgot
to remove '--bisect-next-check'.

'git-bisect.sh' also used to have a 'bisect_write' function, whose
third positional parameter was a "nolog" flag.  This flag was only used
when 'bisect_start' invoked 'bisect_write' to write the starting good
and bad revisions.  Then 0f30233a11 (bisect--helper: `bisect_write`
shell function in C, 2019-01-02) ported it to C as a command mode of
'bisect--helper', which (incorrectly) added the '--no-log' option,
and convert the only place ('bisect_start') that call 'bisect_write'
with 'nolog' to 'git bisect--helper --bisect-write' with 'nolog'
instead of '--no-log', since 'bisect--helper' has command modes not
subcommands, all other command modes see and handle that option as well.
This bogus state didn't last long, however, because in the same patch
series 06f5608c14 (bisect--helper: `bisect_start` shell function
partially in C, 2019-01-02) the C reimplementation of bisect_start()
started calling the bisect_write() C function, this time with the
right 'nolog' function parameter. From then on there was no need for
the '--no-log' option in 'bisect--helper'. Eventually all bisect
subcommands were ported to C as 'bisect--helper' command modes, each
calling the bisect_write() C function instead, but when the
'--bisect-write' command mode was removed in 68efed8c8a
(bisect--helper: retire `--bisect-write` subcommand, 2021-02-03) it
forgot to remove that '--no-log' option.
'--no-log' option had never been used and it's unused now.

Let's remove --bisect-next-check and --no-log from option parsing.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:04:52 -05:00
48d69d8f2f chainlint: prefix annotated test definition with line numbers
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.

To further assist the test author, display line numbers along with the
annotated test definition, thus allowing the author to jump directly to
each problematic line.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 16:56:21 -05:00
bf42f0a030 chainlint: latch line numbers at which each token starts and ends
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.

To further assist the test author, an upcoming change will display line
numbers along with the annotated test definition, thus allowing the
author to jump directly to each problematic line. As preparation,
upgrade Lexer to latch the line numbers at which each token starts and
ends, and return that information with the token itself.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 16:56:21 -05:00
5451877f87 chainlint: sidestep impoverished macOS "terminfo"
Although the macOS Terminal.app is "xterm"-compatible, its corresponding
"terminfo" entries -- such as "xterm", "xterm-256color", and
"xterm-new"[1] -- neglect to mention capabilities which Terminal.app
actually supports (such as "dim text"). This oversight on Apple's part
ends up penalizing users of "good citizen" console programs which
consult "terminfo" to tailor their output based upon reported terminal
capabilities (as opposed to programs which assume that the terminal
supports ANSI codes). The same problem is present in other Apple
"terminfo" entries, such as "nsterm"[2], with which macOS Terminal.app
may be configured.

Sidestep this Apple problem by imbuing get_colors() with specific
knowledge of capabilities common to "xterm" and "nsterm", rather than
trusting "terminfo" to report them correctly. Although hard-coding such
knowledge is ugly, "xterm" support is nearly ubiquitous these days, and
Git itself sets precedence by assuming support for ANSI color codes. For
other terminal types, fall back to querying "terminfo" via `tput` as
usual.

FOOTNOTES

[1] iTerm2 FAQ suggests "xterm-new": https://iterm2.com/faq.html

[2] Neovim documentation recommends terminal type "nsterm" with
    Terminal.app: https://neovim.io/doc/user/term.html#terminfo

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 16:56:21 -05:00
688d82f254 sequencer: tighten label lookups
The `label` command creates a ref refs/rewritten/<label> that the
`reset` and `merge` commands resolve by calling lookup_label(). That
uses lookup_commit_reference_by_name() to look up the label ref. As
lookup_commit_reference_by_name() uses the dwim rules when looking up
the label it will look for a branch named
refs/heads/refs/rewritten/<label> and return that instead of an error if
the branch exists and the label does not. Fix this by using read_ref()
followed by lookup_commit_object() when looking up labels.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 23:36:24 -05:00
82766b2961 sequencer: unify label lookup
The arguments to the `reset` and `merge` commands may be a label created
with a `label` command or an arbitrary commit name. The `merge` command
uses the lookup_label() function to lookup its arguments but `reset` has
a slightly different version of that function in do_reset(). Reduce this
code duplication by calling lookup_label() from do_reset() as well.

This change improves the behavior of `reset` when the argument is a
tree.  Previously `reset` would accept a tree only for the rebase to
fail with

       update_ref failed for ref 'HEAD': cannot update ref 'HEAD': trying to write non-commit object da5497437fd67ca928333aab79c4b4b55036ea66 to branch 'HEAD'

Using lookup_label() means do_reset() will now error out straight away
if its argument is not a commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 23:36:24 -05:00
652bd0211d rebase: use 'skip_cache_tree_update' option
Enable the 'skip_cache_tree_update' option in both 'do_reset()'
('sequencer.c') and 'reset_head()' ('reset.c'). Both of these callers invoke
'prime_cache_tree()' after 'unpack_trees()', so we can remove an unnecessary
cache tree rebuild by skipping 'cache_tree_update()'.

When testing with 'p3400-rebase.sh' and 'p3404-rebase-interactive.sh', the
performance change of this update was negligible, likely due to the
operation being dominated by more expensive operations (like checking out
trees). However, since the change doesn't harm performance, it's worth
keeping this 'unpack_trees()' usage consistent with others that subsequently
invoke 'prime_cache_tree()'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:34 -05:00
dc5d40f5bc read-tree: use 'skip_cache_tree_update' option
When running 'read-tree' with a single tree and no prefix,
'prime_cache_tree()' is called after the tree is unpacked. In that
situation, skip a redundant call to 'cache_tree_update()' in
'unpack_trees()' by enabling the 'skip_cache_tree_update' unpack option.

Removing the redundant cache tree update provides a substantial performance
improvement to 'git read-tree <tree-ish>', as shown by a test added to
'p0006-read-tree-checkout.sh':

Test                          before            after
----------------------------------------------------------------------
read-tree br_ballast_plus_1   3.94(1.80+1.57)   3.00(1.14+1.28) -23.9%

Note that the 'read-tree' in 't1022-read-tree-partial-clone.sh' is updated
to read two trees, rather than one. The test was first introduced in
d3da223f22 (cache-tree: prefetch in partial clone read-tree, 2021-07-23) to
exercise the 'cache_tree_update()' code path, as used in 'git merge'. Since
this patch drops the call to 'cache_tree_update()' in single-tree 'git
read-tree', change the test to use the two-tree variant so that
'cache_tree_update()' is called as intended.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:34 -05:00
0e47bca0f7 reset: use 'skip_cache_tree_update' option
Enable the 'skip_cache_tree_update' option in the variants that call
'prime_cache_tree()' after 'unpack_trees()' (specifically, 'git reset
--mixed' and 'git reset --hard'). This avoids redundantly rebuilding the
cache tree in both 'cache_tree_update()' at the end of 'unpack_trees()' and
in 'prime_cache_tree()', resulting in a small (but consistent) performance
improvement. From the newly-added 'p7102-reset.sh' test:

Test                         before            after
--------------------------------------------------------------------
7102.1: reset --hard (...)   2.11(0.40+1.54)   1.97(0.38+1.47) -6.6%

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:34 -05:00
68fcd48baf unpack-trees: add 'skip_cache_tree_update' option
Add (disabled by default) option to skip the 'cache_tree_update()' at the
end of 'unpack_trees()'. In many cases, this cache tree update is redundant
because the caller of 'unpack_trees()' immediately follows it with
'prime_cache_tree()', rebuilding the entire cache tree from scratch. While
these operations aren't the most expensive part of operations like 'git
reset', the duplicate calls still create a minor unnecessary slowdown.

Introduce an option for callers to skip the 'cache_tree_update()' in
'unpack_trees()' if it is redundant (that is, if 'prime_cache_tree()' is
called afterwards). At the moment, no 'unpack_trees()' callers use the new
option; they will be updated in subsequent patches.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:34 -05:00
94fcf0e852 cache-tree: add perf test comparing update and prime
Add a performance test comparing the execution times of 'prime_cache_tree()'
and 'cache_tree_update(_, WRITE_TREE_SILENT | WRITE_TREE_REPAIR)'. The goal
of comparing these two is to identify which is the faster method for
rebuilding an invalid cache tree, ultimately to remove one when both are
(reundantly) called in immediate succession.

Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call
each tested function multiple times to ensure the reported times (to 0.01s
resolution) convey the differences between them.

The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to
capture a baseline for the overhead associated with running the tool),
'cache_tree_update()', and 'prime_cache_tree()' on four scenarios:

- A completely valid cache tree
- A cache tree with 2 invalid paths
- A cache tree with 50 invalid paths
- A completely empty cache tree

Example results:

Test                                        this tree
-----------------------------------------------------------
0090.2: no-op, clean                        1.27(0.48+0.52)
0090.3: prime_cache_tree, clean             2.02(0.83+0.85)
0090.4: cache_tree_update, clean            1.30(0.49+0.54)
0090.5: no-op, invalidate 2                 1.29(0.48+0.54)
0090.6: prime_cache_tree, invalidate 2      1.98(0.81+0.83)
0090.7: cache_tree_update, invalidate 2     2.12(0.94+0.86)
0090.8: no-op, invalidate 50                1.32(0.50+0.55)
0090.9: prime_cache_tree, invalidate 50     2.10(0.86+0.89)
0090.10: cache_tree_update, invalidate 50   2.35(1.14+0.90)
0090.11: no-op, empty                       1.33(0.50+0.54)
0090.12: prime_cache_tree, empty            2.04(0.84+0.87)
0090.13: cache_tree_update, empty           2.51(1.27+0.92)

These timings show that, while 'cache_tree_update()' is faster when the
cache tree is completely valid, it is equal to or slower than
'prime_cache_tree()' when there are any invalid paths. Since the redundant
calls are mostly in scenarios where the cache tree will be at least
partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will
likely perform better than 'cache_tree_update()' in typical cases.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:33 -05:00
eb20e63f5a branch: gracefully handle '-d' on orphan HEAD
When deleting a branch, "git branch -d" has a safety check that ensures
the branch is merged to its upstream (if any), or to HEAD. To do that,
naturally we try to resolve HEAD to a commit object. If we're on an
orphan branch (i.e., HEAD points to a branch that does not yet exist),
that will fail, and we'll bail with an error:

  $ git branch -d to-delete
  fatal: Couldn't look up commit object for HEAD

This usually isn't that big of a deal. The deletion would fail anyway,
since the branch isn't merged to HEAD, and you'd need to use "-D" (or
"-f"). And doing so skips the HEAD resolution, courtesy of 67affd5173
(git-branch -D: make it work even when on a yet-to-be-born branch,
2006-11-24).

But there are still two problems:

  1. The error message isn't very helpful. We should give the usual "not
     fully merged" message, which points the user at "branch -D". That
     was a problem even back in 67affd5173.

  2. Even without a HEAD, these days it's still possible for the
     deletion to succeed. After 67affd5173, commit 99c419c915 (branch
     -d: base the "already-merged" safety on the branch it merges with,
     2009-12-29) made it OK to delete a branch if it is merged to its
     upstream.

We can fix both by removing the die() in delete_branches() completely,
leaving head_rev NULL in this case. It's tempting to stop there, as it
appears at first glance that the rest of the code does the right thing
with a NULL. But sadly, it's not quite true.

We end up feeding the NULL to repo_is_descendant_of(). In the
traditional code path there, we call repo_in_merge_bases_many(). It
feeds the NULL to repo_parse_commit(), which is smart enough to return
an error, and we immediately return "no, it's not a descendant".

But there's an alternate code path: if we have a commit graph with
generation numbers, we end up in can_all_from_reach(), which does
eventually try to set a flag on the NULL commit and segfaults.

So instead, we'll teach the local branch_merged() helper to treat a NULL
as "not merged". This would be a little more elegant in in_merge_bases()
itself, but that function is called in a lot of places, and it's not
clear that quietly returning "not merged" is the right thing everywhere
(I'd expect in many cases, feeding a NULL is a sign of a bug).

There are four tests here:

  a. The first one confirms that deletion succeeds with an orphaned HEAD
     when the branch is merged to its upstream. This is case (2) above.

  b. Same, but with commit graphs enabled. Even if it is merged to
     upstream, we still check head_rev so that we can say "deleting
     because it's merged to upstream, even though it's not merged to
     HEAD". Without the second hunk in branch_merged(), this test would
     segfault in can_all_from_reach().

  c. The third one confirms that we correctly say "not merged to HEAD"
     when we can't resolve HEAD, and reject the deletion.

  d. Same, but with commit graphs enabled. Without the first hunk in
     branch_merged(), this one would segfault.

Reported-by: Martin von Zweigbergk <martinvonz@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:42:45 -05:00
14770cf0de git_parse_signed(): avoid integer overflow
git_parse_signed() checks that the absolute value of the parsed string
is less than or equal to a caller supplied maximum value. When
calculating the absolute value there is a integer overflow if `val ==
INTMAX_MIN`. To fix this avoid negating `val` when it is negative by
having separate overflow checks for positive and negative values.

An alternative would be to special case INTMAX_MIN before negating `val`
as it is always out of range. That would enable us to keep the existing
code but I'm not sure that the current two-stage check is any clearer
than the new version.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 21:30:39 -05:00
7595c0ece1 config: require at least one digit when parsing numbers
If the input to strtoimax() or strtoumax() does not contain any digits
then they return zero and set `end` to point to the start of the input
string.  git_parse_[un]signed() do not check `end` and so fail to return
an error and instead return a value of zero if the input string is a
valid units factor without any digits (e.g "k").

Tests are added to check that 'git config --int' and OPT_MAGNITUDE()
reject a units specifier without a leading digit.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 21:30:39 -05:00
84356ff770 git_parse_unsigned: reject negative values
git_parse_unsigned() relies on strtoumax() which unfortunately parses
negative values as large positive integers. Fix this by rejecting any
string that contains '-' as we do in strtoul_ui(). I've chosen to treat
negative numbers as invalid input and set errno to EINVAL rather than
ERANGE one the basis that they are never acceptable if we're looking for
a unsigned integer. This is also consistent with the existing behavior
of rejecting "1–2" with EINVAL.

As we do not have unit tests for this function it is tested indirectly
by checking that negative values of reject for core.bigFileThreshold are
rejected. As this function is also used by OPT_MAGNITUDE() a test is
added to check that rejects negative values too.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 21:30:38 -05:00
f13c3f28e7 Documentation: increase example cache timeout to 1 hour
Previously, the example *decreased* the cache timeout compared to the
default, making it less user friendly.

Instead, nudge users to make cache more usable. Many users choose
store over cache.
https://lore.kernel.org/git/CAGJzqskRYN49SeS8kSEN5-vbB_Jt1QvAV9QhS6zNuKh0u8wxPQ@mail.gmail.com/

The default timeout remains 15 minutes. A stronger nudge would
be to increase that.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 21:28:53 -05:00
0e34efb31d rebase: stop exporting GIT_REFLOG_ACTION
Now that struct replay_opts has a reflog_action member we no longer
need to export GIT_REFLOG_ACTION when starting a rebase. If the user
has set GIT_REFLOG_ACTION then we use it when initializing
reflog_action.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 18:15:54 -05:00
d188a60d72 sequencer: stop exporting GIT_REFLOG_ACTION
Each time it picks a commit the sequencer copies the GIT_REFLOG_ACITON
environment variable so it can temporarily change it and then restore
the previous value. This results in code that is hard to follow and also
leaks memory because (i) we fail to free the copy when we've finished
with it and (ii) each call to setenv() leaks the previous value. Instead
pass the reflog action around in a variable and use it to set
GIT_REFLOG_ACTION in the child environment when running "git commit".

Within the sequencer GIT_REFLOG_ACTION is no longer set and is only read
by sequencer_reflog_action(). It is still set by rebase before calling
the sequencer, that will be addressed in the next commit. cherry-pick
and revert are unaffected as they do not set GIT_REFLOG_ACTION before
calling the sequencer.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 18:15:43 -05:00
8354cf752e t7610: fix flaky timeout issue, don't clone from example.com
When t7610-mergetool.sh runs without failures the git://example.com
submodule URLs will never be used. That's because we "git submodule
add" it, but then manually populate them so that subsequent "git
submodule update -N" won't attempt to clone it, only update it without
fetching.

But if we fail in an earlier test it'll have the knock-on effect of
having later tests hang on that "git submodule update -N" as we
attempt to clone this repository from example.com.

This can be reproduced on "master" by running the test with
SANITIZE=leak without "--immediate". With
"GIT_TEST_PASSING_SANITIZE_LEAK=true" (which the linux-leaks job uses)
we'll skip the test entirely. So we'll only run into this when running
it manually, or with the "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode.

That's not because the failure has anything to do with leak detection
per-se. It just so happens that we have a leak that'll fail before
we've managed to fully set these up, and therefore "git submodule
update -N" ends up spawning "git clone".

Let's instead continue lying about the origin of this submodule by
providing a URL for it that doesn't work, but now one that *really*
doesn't work: /dev/null. If the test is passing we won't ever use
this, and if we have knock-on failures we'll fail early, instead of
waiting for a timeout.

The behavior of "-N" here might be surprising to some, since it's
explained as "[if you use -N we] don’t fetch new objects from the
remote site". But (perhaps counter-intuitively) it's only talking
about if it needs to do so via "git fetch". In this case we'll end up
spawning a "git clone", as we have no submodule set up.

See ff7f089ed1 (mergetool: Teach about submodules, 2011-04-13) for
the commit that implemented these "example.com" tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 17:29:31 -05:00
3a79a8085b Merge branch 'es/chainlint-output' into es/chainlint-lineno
* es/chainlint-output:
  chainlint: annotate original test definition rather than token stream
  chainlint: latch start/end position of each token
  chainlint: tighten accuracy when consuming input stream
  chainlint: add explanatory comments
2022-11-09 16:41:35 -05:00
319605f8f0 The eleventh batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 17:18:48 -05:00
be4ac3b197 Merge branch 'rs/no-more-run-command-v'
Simplify the run-command API.

* rs/no-more-run-command-v:
  replace and remove run_command_v_opt()
  replace and remove run_command_v_opt_cd_env_tr2()
  replace and remove run_command_v_opt_tr2()
  replace and remove run_command_v_opt_cd_env()
  use child_process members "args" and "env" directly
  use child_process member "args" instead of string array variable
  sequencer: simplify building argument list in do_exec()
  bisect--helper: factor out do_bisect_run()
  bisect: simplify building "checkout" argument list
  am: simplify building "show" argument list
  run-command: fix return value comment
  merge: remove always-the-same "verbose" arguments
2022-11-08 17:15:12 -05:00
3e9303dc8e Merge branch 'rs/archive-filter-error-once'
"git archive" mistakenly complained twice about a missing executable,
which has been corrected.

* rs/archive-filter-error-once:
  archive-tar: report filter start error only once
2022-11-08 17:15:09 -05:00
ec9a46af4f Merge branch 'ma/drop-redundant-diagnostic'
A redundant diagnostic message is dropped from test_path_is_missing().

* ma/drop-redundant-diagnostic:
  test-lib-functions: drop redundant diagnostic print
2022-11-08 17:15:06 -05:00
d957761eff Merge branch 'vb/ls-files-docfix'
Docfix.

* vb/ls-files-docfix:
  ls-files: fix --ignored and --killed flags in synopsis
2022-11-08 17:14:53 -05:00
15df8418a5 Merge branch 'jk/ref-filter-parsing-bugs'
Various tests exercising the transfer.credentialsInUrl configuration
are taught to avoid making requests which require resolving localhost
to reduce CI-flakiness.

* jk/ref-filter-parsing-bugs:
  ref-filter: fix parsing of signatures with CRLF and no body
  ref-filter: fix parsing of signatures without blank lines
2022-11-08 17:14:52 -05:00
4b6302c72f Merge branch 'po/glossary-around-traversal'
The glossary entries for "commit-graph file" and "reachability
bitmap" have been added.

* po/glossary-around-traversal:
  glossary: add reachability bitmap description
  glossary: add "commit graph" description
  doc: use 'object database' not ODB or abbreviation
  doc: use "commit-graph" hyphenation consistently
2022-11-08 17:14:51 -05:00
06e7696025 Merge branch 'jc/set-gid-bit-less-aggressively'
The adjust_shared_perm() helper function learned to refrain from
setting the "g+s" bit on directories when it is not necessary.

* jc/set-gid-bit-less-aggressively:
  adjust_shared_perm(): leave g+s alone when the group does not matter
2022-11-08 17:14:49 -05:00
bdd42e34e3 Merge branch 'es/mark-gc-cruft-as-experimental'
Enable gc.cruftpacks by default for those who opt into
feature.experimental setting.

* es/mark-gc-cruft-as-experimental:
  config: let feature.experimental imply gc.cruftPacks=true
  gc: add tests for --cruft and friends
2022-11-08 17:14:48 -05:00
098b1d07bc Merge branch 'tb/howto-using-redo-script'
Doc update.

* tb/howto-using-redo-script:
  Documentation/howto/maintain-git.txt: fix Meta/redo-jch.sh invocation
2022-11-08 17:14:45 -05:00
54e95b4663 Documentation/gitcredentials.txt: mention password alternatives
Git asks for a "password", but the user might use a
personal access token or OAuth access token instead.

Example:

    Password for 'https://AzureDiamond@github.com':

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 16:46:54 -05:00
ee0e7fc927 fsmonitor--daemon: on macOS support symlink
Resolves a problem where symbolic links were not showing up in diff when
created or modified.

kFSEventStreamEventFlagItemIsSymlink is also treated as a file update.
This is because kFSEventStreamEventFlagItemIsFile is not included in
FSEvents when creating or deleting symbolic links. For example:

$ ln -snf t test
  fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|
$ ln -snf ci test
  fsevent: '/path/to/dir/test', flags=0x40200 ItemIsSymlink|ItemRemoved|
  fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|

Signed-off-by: srz_zumix <zumix.cpp@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 16:36:09 -05:00
916ebb327c revisions API: extend the nascent REV_INFO_INIT macro
Have the REV_INFO_INIT macro added in [1] declare more members of
"struct rev_info" that we can initialize statically, and have
repo_init_revisions() do so with the memcpy(..., &blank) idiom
introduced in [2].

As the comment for the "REV_INFO_INIT" macro notes this still isn't
sufficient to initialize a "struct rev_info" for use yet. But we are
getting closer to that eventual goal.

Even though we can't fully initialize a "struct rev_info" with
REV_INFO_INIT it's useful for readability to clearly separate those
things that we can statically initialize, and those that we can't.

This change could replace the:

	list_objects_filter_init(&revs->filter);

In the repo_init_revisions() with this line, at the end of the
REV_INFO_INIT deceleration in revisions.h:

	.filter = LIST_OBJECTS_FILTER_INIT, \

But doing so would produce a minor conflict with an outstanding
topic[3]. Let's skip that for now. I have follow-ups to initialize
more of this statically, e.g. changes to get rid of grep_init(). We
can initialize more members with the macro in a future series.

1. f196c1e908 (revisions API users: use release_revisions() needing
   REV_INFO_INIT, 2022-04-13)
2. 5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
   macro, 2021-07-01)
3. https://lore.kernel.org/git/265b292ed5c2de19b7118dfe046d3d9d932e2e89.1667901510.git.ps@pks.im/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 16:34:01 -05:00
63357b79c9 ci: use a newer github-script version
The old version we currently use runs in node.js v12.x, which is being
deprecated in GitHub Actions. The new version uses node.js v16.x.

Incidentally, this also avoids the warning about the deprecated
`::set-output::` workflow command because the newer version of the
`github-script` Action uses the recommended new way to specify outputs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:35:13 -05:00
73c768dae9 chainlint: annotate original test definition rather than token stream
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.

However, now that each parsed token carries positional information, the
location of a detected problem can be pinpointed precisely in the
original test definition. Therefore, take advantage of this information
to annotate the test definition itself rather than annotating the parsed
token stream, thus making it easier for a test author to relate a
problem back to the source.

Maintaining the positional meta-information associated with each
detected problem requires a slight change in how the problems are
managed internally. In particular, shell syntax such as:

    msg="total: $(cd data; wc -w *.txt) words"

requires the lexical analyzer to recursively invoke the parser in order
to detect problems within the $(...) expression inside the double-quoted
string. In this case, the recursive parse context will detect the broken
&&-chain between the `cd` and `wc` commands, returning the token stream:

    cd data ; ?!AMP?! wc -w *.txt

However, the parent parse context will see everything inside the
double-quotes as a single string token:

    "total: $(cd data ; ?!AMP?! wc -w *.txt) words"

losing whatever positional information was attached to the ";" token
where the problem was detected.

One way to preserve the positional information of a detected problem in
a recursive parse context within a string would be to attach the
positional information to the annotation textually; for instance:

    "total: $(cd data ; ?!AMP:21:22?! wc -w *.txt) words"

and then extract the positional information when annotating the original
test definition.

However, a cleaner and much simpler approach is to maintain the list of
detected problems separately rather than embedding the problems as
annotations directly in the parsed token stream. Not only does this
ensure that positional information within recursive parse contexts is
not lost, but it keeps the token stream free from non-token pollution,
which may simplify implementation of validations added in the future
since they won't have to handle non-token "?!FOO!?" items specially.

Finally, the chainlint self-test "expect" files need a few mechanical
adjustments now that the original test definitions are emitted rather
than the parsed token stream. In particular, the following items missing
from the historic parsed-token output are now preserved verbatim:

    * indentation (and whitespace, in general)

    * comments

    * here-doc bodies

    * here-doc tag quoting (i.e. "\EOF")

    * line-splices (i.e. "\" at the end of a line)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
5f0321a9f2 chainlint: latch start/end position of each token
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.

To address this shortcoming, an upcoming change will make it print out
an annotated copy of the original test definition rather than the
tokenized representation. In order to do so, it will need to know the
start and end positions of each token in the original test definition.
As preparation, upgrade TestParser::scan_token() to latch the start and
end position of the token being scanned, and return that information
along with the token itself. A subsequent change will take advantage of
this positional information.

In terms of implementation, TestParser::scan_token() is retrofitted to
return a tuple consisting of the token's lexeme and its start and end
positions, rather than returning just the lexeme. However, an
alternative would be to define a class which represents a token:

    package Token;

    sub new {
        my ($class, $lexeme, $start, $end) = @_;
        bless [$lexeme, $start, $end] => $class;
    }

    sub as_string {
        my $self = shift @_;
        return $self->[0];
    }

    sub compare {
        my ($x, $y) = @_;
        if (UNIVERSAL::isa($y, 'Token')) {
            return $x->[0] cmp $y->[0];
        }
        return $x->[0] cmp $y;
    }

    use overload (
        '""' => 'as_string',
        'cmp' => 'compare'
    );

The major benefit of the class-based approach is that it is entirely
non-invasive; it requires no additional changes to the rest of the
script since a Token converts automatically to a string, which is what
scan_token() historically returned.

The big downside to the Token approach, however, is that it is _slow_;
on this developer's (old) machine, it increases user-time by an
unacceptable seven seconds when scanning all test scripts in the
project. Hence, the simple tuple approach is employed instead since it
adds only a fraction of a second user-time.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
ca748f5183 chainlint: tighten accuracy when consuming input stream
To extract the next token in the input stream, Lexer::scan_token() finds
the start of the token by skipping whitespace, then consumes characters
belonging to the token until it encounters a non-token character, such
as an operator, punctuation, or whitespace. In the case of an operator
or punctuation which ends a token, before returning the just-scanned
token, it pushes that operator or punctuation character back onto the
input stream to ensure that it will be the first character consumed by
the next call to scan_token().

However, scan_token() is intentionally lax when whitespace ends a token;
it doesn't bother pushing the whitespace character back onto the token
stream since it knows that the next call to scan_token() will, as its
first step, skip over whitespace anyhow when looking for the start of
the token.

Although such laxity is harmless for the proper functioning of the
lexical analyzer, it does make it difficult to precisely identify the
token's end position in the input stream. Accurate token position
information may be desirable, for instance, to annotate problems or
highlight other interesting facets of the input found during the parsing
phase. To accommodate such possibilities, tighten scan_token() by making
it push the token-ending whitespace character back onto the input
stream, just as it does for other token-ending characters.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
c90d81f8bb chainlint: add explanatory comments
The logic in TestParser::accumulate() for detecting broken &&-chains is
mostly well-commented, but a couple branches which were deemed obvious
and straightforward lack comments. In retrospect, though, these cases
may give future readers pause, so comment them, as well.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
69d94464e1 submodule--helper: use OPT_SUBCOMMAND() API
Have the cmd_submodule__helper() use the OPT_SUBCOMMAND() API
introduced in fa83cc834d (parse-options: add support for parsing
subcommands, 2022-08-19).

This is only a marginal reduction in line count, but once we start
unifying this with a yet-to-be-added "builtin/submodule.c" it'll be
much easier to reason about those changes, as they'll both use
OPT_SUBCOMMAND().

We don't need to worry about "argv[0]" being NULL in the die() because
we'd have errored out in parse_options() as we're not using
"PARSE_OPT_SUBCOMMAND_OPTIONAL".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
1b6e2001c7 submodule--helper: drop "update --prefix <pfx>" for "-C <pfx> update"
Since 29a5e9e1ff (submodule--helper update-clone: learn --init,
2022-03-04) we've been passing "-C <prefix>" from "git-submodule.sh"
whenever we pass "--prefix <prefix>", so the latter is redundant to
the former. Let's drop the "--prefix" option.

Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
64f48ad1f0 submodule--helper: remove --prefix from "absorbgitdirs"
Let's pass the "-C <prefix>" option instead to "absorbgitdirs" from
its only caller.

When it was added in f6f8586140 (submodule: add absorb-git-dir
function, 2016-12-12) there were other "submodule--helper" subcommands
that were invoked with "-C <prefix>", so we could have done this all
along.

Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
82ff87789b submodule API & "absorbgitdirs": remove "----recursive" option
Remove the "----recursive" option to "git submodule--helper
absorbgitdirs" (yes, with 4 dashes, not 2).

This option and all the "else" when "flags &
ABSORB_GITDIR_RECURSE_SUBMODULES" is false has never been used since
it was added in f6f8586140 (submodule: add absorb-git-dir function,
2016-12-12), which we'd have had to do as "----recursive", a
"--recursive" would have errored out.

It would be nice to follow-up with an optbug() assertion to
parse-options.c for such funnily named options, I manually validated
that this was the only long option whose name started with "-", but
let's skip adding such an assertion for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
46e87b5482 submodule.c: refactor recursive block out of absorb function
A move and indentation-only change to move the
ABSORB_GITDIR_RECURSE_SUBMODULES case into its own function, which as
we'll see makes the subsequent commit changing this code much smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
d50d8485ef submodule tests: test for a "foreach" blind-spot
We tested for "--" followed by command names, but not for "--"
followed by an argument that looks like an option, let's do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
435285bd82 submodule--helper: fix a memory leak in "status"
The "status" sub-command was leaking the "struct strvec" it was
setting up for the reasons explained in f92dbdbc6a (revisions API:
don't leak memory on argv elements that need free()-ing, 2022-08-02),
so let's use the "free_removed_argv_elements" option to
setup_revisions() to fix the leak.

Even if we did that, clobbering the "diff_files_args.nr" with the
return value of setup_revisions() would leave leaks in place, but we
can just stop clobbering it.

Ever since that code was added in a9f8a37584 (submodule: port
submodule subcommand 'status' from shell to C, 2017-10-06) we've had
no reason to modify the "nr" member ("argc" at the time): The next use
of "diff_files_args" after this is the "strvec_clear()" at the end of
the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
44874cbd19 submodule tests: add tests for top-level flag output
Exhaustively test for how combining various "mixed-level" "git
submodule" option works. "Mixed-level" here means options that are
accepted by a mixture of the top-level "submodule" command, and
e.g. the "status" sub-command.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
cc74a4ac72 submodule--helper: move "config" to a test-tool
As with other moves to "test-tool" in f322e9f51b (Merge branch
'ab/submodule-helper-prep', 2022-09-13) the "config" sub-command was
only used by our own tests.

It was last used by "git submodule" itself in code that went away with
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10).

Let's move it over, and while doing so make it easier to reason about
by splitting up the various uses for it into separate sub-commands, so
that we don't need to count arguments to see what it does.

This also has the advantage that we stop wasting future translator
time on this command, currently the usage information for this
internal-only tool has been translated into several languages. The use
of the "_" function has also been removed from the "please make
sure..." message.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
eb5b03a9c0 ci: avoid unnecessary builds
Whenever a branch is pushed to a repository which has GitHub Actions
enabled, a bunch of new workflow runs are started.

We sometimes see contributors push multiple branch updates in rapid
succession, which in conjunction with the impressive time swallowed by
even just a single CI build frequently leads to many queued-up runs.

This is particularly problematic in the case of Pull Requests where a
single contributor can easily (inadvertently) prevent timely builds for
other contributors when using a shared repository.

To help with this situation, let's use the `concurrency` feature of
GitHub workflows, essentially canceling GitHub workflow runs that are
obsoleted by more recent runs:

  https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency

For workflows that *do* want the behavior in the pre-image of this
patch, they can use the ci-config feature to disable the new behavior by
adding an executable script on the ci-config branch called
'skip-concurrent' which terminates with a non-zero exit code.

Original-patch-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 13:26:20 -05:00
d00fa5528b Makefile: discuss SHAttered in *_SHA{1,256} discussion
Let's mention the SHAttered attack and more generally why we use the
sha1collisiondetection backend by default, and note that for SHA-256
the user should feel free to pick any of the supported backends as far
as hashing security is concerned.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
fb8d7add06 Makefile: document default SHA-1 backend on OSX
Since [1] the default SHA-1 backend on OSX has been
APPLE_COMMON_CRYPTO. Per [2] we'll skip using it on anything older
than Mac OS X 10.4 "Tiger"[3].

When "DC_SHA1" was made the default in [4] this interaction between it
and APPLE_COMMON_CRYPTO seems to have been missed in. Ever since
DC_SHA1 was "made the default" we've still used Apple's CommonCrypto
instead of sha1collisiondetection on modern versions of Darwin and
OSX.

1. 61067954ce (cache.h: eliminate SHA-1 deprecation warnings on Mac
   OS X, 2013-05-19)
2. 9c7a0beee0 (config.mak.uname: set NO_APPLE_COMMON_CRYPTO on older
   systems, 2014-08-15)
3. We could probably drop "NO_APPLE_COMMON_CRYPTO", as nobody's likely
   to care about such on old version of OSX anymore. But let's leave that
   for now.
4. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
dc1cf3580e Makefile & test-tool: replace "DC_SHA1" variable with a "define"
Address the root cause of technical debt we've been carrying since
sha1collisiondetection was made the default in [1]. In a preceding
commit we narrowly fixed a bug where the "DC_SHA1" variable would be
unset (in combination with "NO_APPLE_COMMON_CRYPTO=" on OSX), even
though we had the sha1collisiondetection library enabled.

But the only reason we needed to have such a user-exposed knob went
away with [1], and it's been doing nothing useful since then. We don't
care if you define DC_SHA1=*, we only care that you don't ask for any
other SHA-1 implementation. If it turns out that you didn't, we'll use
sha1collisiondetection, whether you had "DC_SHA1" set or not.

As a result of this being confusing we had e.g. [2] for cmake and the
recent [3] for ci/lib.sh setting "DC_SHA1" explicitly, even though
this was always a NOOP.

A much simpler way to do this is to stop having the Makefile and
CMakeLists.txt set "DC_SHA1" to be picked up by the test-lib.sh, let's
instead add a trivial "test-tool sha1-is-sha1dc". It returns zero if
we're using sha1collisiondetection, non-zero otherwise.

1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. c4b2f41b5f (cmake: support for testing git with ctest, 2020-06-26)
3. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
   2022-10-20)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
ed605fa1a8 Makefile: document SHA-1 and SHA-256 default and selection order
For the *_SHA1 and *_SHA256 flags we've discussed the various flags,
but not the fact that when you define multiple flags we'll pick one.

Which one we pick depends on the order they're listed in the Makefile,
which differed from the order we discussed them in this documentation.

Let's be explicit about how we select these, and re-arrange the
listings so that they're listed in the priority order we've picked.

I'd personally prefer that the selection was more explicit, and that
we'd error out if conflicting flags were provided, but per the
discussion downhtread of[1] the consensus was to keep theses semantics.

This behavior makes it easier to e.g. integrate with autoconf-like
systems, where the configuration can provide everything it can
support, and Git is tasked with picking the first one it prefers.

1. https://lore.kernel.org/git/220710.86mtdh81ty.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
84d71c2021 Makefile: document default SHA-256 backend
Since 27dc04c545 (sha256: add an SHA-256 implementation using
libgcrypt, 2018-11-14) we've claimed to support a BLK_SHA256 flag, but
there's no such SHA-256 backend.

Instead we fall back on adding "sha256/block/sha256.o" to "LIB_OBJS"
and adding "-DSHA256_BLK" to BASIC_CFLAGS.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
f569897cda Makefile: rephrase the discussion of *_SHA1 knobs
In the preceding commit the discussion of the *_SHA1 knobs was left
as-is to benefit from a smaller diff, but since we're changing these
let's use the same phrasing we use for most other knobs. E.g. "define
X", not "define X environment variable", and get rid of the "when
running make to link with" entirely.

Furthermore the discussion of DC_SHA1* options is now under a "Options
for the sha1collisiondetection implementation" heading, so we don't
need to clarify that these options go along with DC_SHA1=Y, so let's
rephrase them accordingly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
34b660e3e6 Makefile: create and use sections for "define" flag listing
Since the "Define ..." template of comments at the top of the Makefile
was started in 5bdac8b326 ([PATCH] Improve the compilation-time
settings interface, 2005-07-29) we've had a lot more flags added,
including flags that come in "groups". Not having any obvious
structure to the >500 line comment at the top of the Makefile has made
it hard to follow.

This change is almost entirely a move-only change, the two paragraphs
at the start of the first two sections are new, and so are the added
sections themselves, but other than that no lines are changed, only
moved.

We now list Makefile-only flags at the start, followed by stand-alone
flags, and then cover "optional library" flags in their respective
groups, followed by SHA-1 and SHA-256 flags, and finally
DEVELOPER-specific flags.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
b425ba2380 Makefile: correct DC_SHA1 documentation
The claim that DC_SHA1 takes priority over other *_SHA1 knobs was true
when it was added in [1], But that hasn't been the case since it was
made the fallback default in [2].

We should be making it not only the default, but something that takes
priority over other *_SHA1 knobs, but that's outside the scope of this
change. For now let's correct the documentation to match reality.

Let's also remove the "unconditionally enable" wording, per the above
the enabling of "DC_SHA1" is conditional on these other flags.

The "Define DC_SHA1" here is also a lie, actually it's "we don't care
if you define DC_SHA1, just don't define anything else", but that's a
more general issue that'll be addressed in a subsequent commit. Let's
first stop pretending that this setting (which we actually don't even
use) takes priority over anything else.

1. 8325e43b82 (Makefile: add DC_SHA1 knob, 2017-03-16)
2. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
0ced11d32f INSTALL: remove discussion of SHA-1 backends
The claim that OpenSSL is the default SHA-1 backend hasn't been true
since e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17),
but more importantly tweaking the SHA-1 backend isn't something that's
common enough to warrant discussing in the INSTALL document, so let's
remove this paragraph.

This discussion was originally added in c538d2d34a (Add some
installation notes in INSTALL, 2005-06-17) when tweaking the default
backend was more common. The current wording was added in
5beb577db8 (INSTALL: Describe dependency knobs from Makefile,
2009-09-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
e47913e8e2 Makefile: always (re)set DC_SHA1 on fallback
Fix an edge case introduced in in e6b07da278 (Makefile: make DC_SHA1
the default, 2017-03-17), when DC_SHA1 was made the default fallback
we started unconditionally adding to BASIC_CFLAGS and LIB_OBJS, so
we'd use the sha1collisiondetection by default.

But the "DC_SHA1" variable remained unset, so e.g.:

	make test DC_SHA1= T=t0013*.sh

Would skip the sha1collisiondetection tests, as we'd write
"DC_SHA1=''" to "GIT-BUILD-OPTIONS", but if we manually removed that
test prerequisite we'd pass the test (which we couldn't if we weren't
using sha1collisiondetection).

So let's have the fallback assignment use the 'override' directive
instead of the ":=" simply expanded variable introduced in
e6b07da278. In this case we explicitly want to override the user's
choice.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
a6c6f6d2fe ls-files: fix --ignored and --killed flags in synopsis
Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 21:55:06 -05:00
20d87d3291 sparse-checkout.txt: new document with sparse-checkout directions
Once upon a time, Matheus wrote some patches to make
   git grep [--cached | <REVISION>] ...
restrict its output to the sparsity specification when working in a
sparse checkout[1].  That effort got derailed by two things:

  (1) The --sparse-index work just beginning which we wanted to avoid
      creating conflicts for
  (2) Never deciding on flag and config names and planned high level
      behavior for all commands.

More recently, Shaoxuan implemented a more limited form of Matheus'
patches that only affected --cached, using a different flag name,
but also changing the default behavior in line with what Matheus did.
This again highlighted the fact that we never decided on command line
flag names, config option names, and the big picture path forward.

The --sparse-index work has been mostly complete (or at least released
into production even if some small edges remain) for quite some time
now.  We have also had several discussions on flag and config names,
though we never came to solid conclusions.  Stolee once upon a time
suggested putting all these into some document in
Documentation/technical[3], which Victoria recently also requested[4].
I'm behind the times, but here's a patch attempting to finally do that.

[1] https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/
    (See his second link in that email in particular)
[2] https://lore.kernel.org/git/20220908001854.206789-2-shaoxuan.yuan02@gmail.com/
[3] https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/
    (Scroll to the very end for the final few paragraphs)
[4] https://lore.kernel.org/git/cafcedba-96a2-cb85-d593-ef47c8c8397c@github.com/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 18:15:45 -05:00
44da9e0841 rebase --update-refs: avoid unintended ref deletion
In b3b1a21d1a (sequencer: rewrite update-refs as user edits todo list,
2022-07-19), the 'todo_list_filter_update_refs()' step was added to handle
the removal of 'update-ref' lines from a 'rebase-todo'. Specifically, it
removes potential ref updates from the "update refs state" if a ref does not
have a corresponding 'update-ref' line.

However, because 'write_update_refs_state()' will not update the state if
the 'refs_to_oids' list was empty, removing *all* 'update-ref' lines will
result in the state remaining unchanged from how it was initialized (with
all refs' "after" OID being null). Then, when the ref update is applied, all
refs will be updated to null and consequently deleted.

To fix this, delete the 'update-refs' state file when 'refs_to_oids' is
empty. Additionally, add a tests covering "all update-ref lines removed"
cases.

Reported-by: herr.kaste <herr.kaste@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 14:16:45 -05:00
c90db53d20 scalar reconfigure -a: remove stale scalar.repo entries
Every once in a while, a Git for Windows installation fails because the
attempt to reconfigure a Scalar enlistment failed because it was deleted
manually without removing the corresponding entries in the global Git
config.

In f5f0842d0b (scalar: let 'unregister' handle a deleted enlistment
directory gracefully, 2021-12-03), we already taught `scalar delete` to
handle the case of a manually deleted enlistment gracefully. This patch
adds the same graceful handling to `scalar reconfigure --all`.

This patch is best viewed with `--color-moved`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 13:57:13 -05:00
8c7abdc596 index: raise a bug if the index is materialised more than once
If clear_skip_worktree_from_present_files() encounter a sparse directory,
it fully materialise the index which should expand any sparse directories
and start going through each entries again. If this happens more than once,
raise it with a BUG.

Signed-off-by: Anh Le <anh@canva.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-04 20:28:28 -04:00
89aaab11a3 index: add trace2 region for clear skip worktree
When using sparse checkout, clear_skip_worktree_from_present_files() must
enumerate index entries to find ones with the SKIP_WORKTREE bit to
determine whether those index entries exist on disk (in which case their
SKIP_WORKTREE bit should be removed).

In a large repository, this may take considerable time depending on the
size of the index.

Add a trace2 region to surface this information, keeping a count of how
many paths have been checked. Separately, keep counts after a full index is
materialized.

Signed-off-by: Anh Le <anh@canva.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-04 20:28:28 -04:00
7cccf5b6c9 t7001-mv.sh: modernizing test script using functions
Test script to verify the presence/absence of files, paths, directories,
symlinks and other features in 'git mv' command are using the command
format:

'test (-e|f|d|h|...)'

Replace them with helper functions of format:

'test_path_is_*'

Replacing idiomatic helper functions:

'! test_path_is_*'

with

'test_path_is_missing'

This uses values of 'test_path_bar' in place of '! test_path_foo' to
bring in the helpful factor of indicating the failure of tests after the
mv command has been used, that is, it echoes if the feature/test_path
exists.

Signed-off-by: Debra Obondo <debraobondo@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-04 17:58:23 -04:00
3b08839926 The tenth batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-03 20:41:55 -04:00
fadacf2040 Merge branch 'jk/avoid-localhost'
Various tests exercising the transfer.credentialsInUrl configuration
are taught to avoid making requests which require resolving localhost
to reduce CI-flakiness.

* jk/avoid-localhost:
  t5516/t5601: be less strict about the number of credential warnings
  t5516: move plaintext-password tests from t5601 and t5516
2022-11-03 20:41:07 -04:00
8e1c5fcf28 ref-filter: fix parsing of signatures with CRLF and no body
This commit fixes a bug when parsing tags that have CRLF line endings, a
signature, and no body, like this (the "^M" are marking the CRs):

  this is the subject^M
  -----BEGIN PGP SIGNATURE-----^M
  ^M
  ...some stuff...^M
  -----END PGP SIGNATURE-----^M

When trying to find the start of the body, we look for a blank line
separating the subject and body. In this case, there isn't one. But we
search for it using strstr(), which will find the blank line in the
signature.

In the non-CRLF code path, we check whether the line we found is past
the start of the signature, and if so, put the body pointer at the start
of the signature (effectively making the body empty). But the CRLF code
path doesn't catch the same case, and we end up with the body pointer in
the middle of the signature field. This has two visible problems:

  - printing %(contents:subject) will show part of the signature, too,
    since the subject length is computed as (body - subject)

  - the length of the body is (sig - body), which makes it negative.
    Asking for %(contents:body) causes us to cast this to a very large
    size_t when we feed it to xmemdupz(), which then complains about
    trying to allocate too much memory.

These are essentially the same bugs fixed in the previous commit, except
that they happen when there is a CRLF blank line in the signature,
rather than no blank line at all. Both are caused by the refactoring in
9f75ce3d8f (ref-filter: handle CRLF at end-of-line more gracefully,
2020-10-29).

We can fix this by doing the same "sigstart" check that we do in the
non-CRLF case. And rather than repeat ourselves, we can just use
short-circuiting OR to collapse both cases into a single conditional.
I.e., rather than:

  if (strstr("\n\n"))
    ...found blank, check if it's in signature...
  else if (strstr("\r\n\r\n"))
    ...found blank, check if it's in signature...
  else
    ...no blank line found...

we can collapse this to:

  if (strstr("\n\n")) ||
      strstr("\r\n\r\n")))
    ...found blank, check if it's in signature...
  else
    ...no blank line found...

The tests show the problem and the fix. Though it wasn't broken, I
included contents:signature here to make sure it still behaves as
expected, but note the shell hackery needed to make it work. A
less-clever option would be to skip using test_atom and just "append_cr
>expected" ourselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:36:04 -04:00
b01e1c7ef0 ref-filter: fix parsing of signatures without blank lines
When ref-filter is asked to show %(content:subject), etc, we end up in
find_subpos() to parse out the three major parts: the subject, the body,
and the signature (if any).

When searching for the blank line between the subject and body, if we
don't find anything, we try to treat the whole message as the subject,
with no body. But our idea of "the whole message" needs to take into
account the signature, too. Since 9f75ce3d8f (ref-filter: handle CRLF at
end-of-line more gracefully, 2020-10-29), the code instead goes all the
way to the end of the buffer, which produces confusing output.

Here's an example. If we have a tag message like this:

  this is the subject
  -----BEGIN SSH SIGNATURE-----
  ...some stuff...
  -----END SSH SIGNATURE-----

then the current parser will put the start of the body at the end of the
whole buffer. This produces two buggy outcomes:

  - since the subject length is computed as (body - subject), showing
    %(contents:subject) will print both the subject and the signature,
    rather than just the single line

  - since the body length is computed as (sig - body), and the body now
    starts _after_ the signature, we end up with a negative length!
    Fortunately we never access out-of-bounds memory, because the
    negative length is fed to xmemdupz(), which casts it to a size_t,
    and xmalloc() bails trying to allocate an absurdly large value.

    In theory it would be possible for somebody making a malicious tag
    to wrap it around to a more reasonable value, but it would require a
    tag on the order of 2^63 bytes. And even if they did, all they get
    is an out of bounds string read. So the security implications are
    probably not interesting.

We can fix both by correctly putting the start of the body at the same
index as the start of the signature (effectively making the body empty).

Note that this is a real issue with signatures generated with gpg.format
set to "ssh", which would look like the example above. In the new tests
here I use a hard-coded tag message, for a few reasons:

  - regardless of what the ssh-signing code produces now or in the
    future, we should be testing this particular case

  - skipping the actual signature makes the tests simpler to write (and
    allows them to run on more systems)

  - t6300 has helpers for working with gpg signatures; for the purposes
    of this bug, "BEGIN PGP" is just as good a demonstration, and this
    simplifies the tests

Curiously, the same issue doesn't happen with real gpg signatures (and
there are even existing tests in t6300 with cover this). Those have a
blank line between the header and the content, like:

  this is the subject
  -----BEGIN PGP SIGNATURE-----

  ...some stuff...
  -----END PGP SIGNATURE-----

Because we search for the subject/body separator line with a strstr(),
we find the blank line in the signature, even though it's outside of
what we'd consider the body. But that puts us unto a separate code path,
which realizes that we're now in the signature and adjusts the line back
to "sigstart". So this patch is basically just making the "no line found
at all" case match that. And note that "sigstart" is always defined (if
there is no signature, it points to the end of the buffer as you'd
expect).

Reported-by: Martin Englund <martin@englund.nu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:36:04 -04:00
6fae3aaf22 spatchcache: add a ccache-alike for "spatch"
Add a rather trivial "spatchcache", with this running e.g.:

	make cocciclean
	make contrib/coccinelle/free.cocci.patch \
		SPATCH=contrib/coccicheck/spatchcache \
		SPATCH_FLAGS=--very-quiet

Is cut down from ~20s to ~5s on my system. Much of that is either
fixable shell overhead, or the around 40 files we "CANTCACHE" (see the
implementation).

This uses "redis" as a cache by default, but it's configurable. See
the embedded documentation.

This is *not* like ccache in that we won't cache failed spatch
invocations, or those where spatch suggests changes for us. Those
cases are so rare that I didn't think it was worth the bother, by far
the most common case is that it has no suggested changes. We'll also
refuse to cache any "spatch" invocation that has output on stderr,
which means that "--very-quiet" must be added to "SPATCH_FLAGS".

Because we narrow the cache to that we don't need to save away stdout,
stderr & the exit code. We simply cache the cases where we had no
suggested changes.

Another benchmark is to compare this with the previous
SPATCH_BATCH_SIZE=N, as noted in [1]. Before this (on my 8 core system) running:

	make clean; time make contrib/coccinelle/array.cocci.patch SPATCH_BATCH_SIZE=0

Would take 33s, but with the preceding changes running without this
"spatchcache" is slightly slower, or around 35s:

	make clean; time make contrib/coccinelle/array.cocci.patch

Now doing the same with SPATCH=contrib/coccinelle/spatchcache will
take around 6s, but we'll need to compile the *.o files first to take
full advantage of it (which can be fast with "ccache"):

	make clean; make; time make contrib/coccinelle/array.cocci.patch SPATCH=contrib/coccinelle/spatchcache

1. https://lore.kernel.org/git/YwdRqP1CyUAzCEn2@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
d0e624aed7 cocci: run against a generated ALL.cocci
The preceding commits to make the "coccicheck" target incremental made
it slower in some cases. As an optimization let's not have the
many=many mapping of <*.cocci>=<*.[ch]>, but instead concat the
<*.cocci> into an ALL.cocci, and then run one-to-many
ALL.cocci=<*.[ch]>.

A "make coccicheck" is now around 2x as fast as it was on "master",
and around 1.5x as fast as the preceding change to make the run
incremental:

	$ git hyperfine -L rev origin/master,HEAD~,HEAD -p 'make clean' 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' -r 3
	Benchmark 1: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'origin/master
	  Time (mean ± σ):      4.258 s ±  0.015 s    [User: 27.432 s, System: 1.532 s]
	  Range (min … max):    4.241 s …  4.268 s    3 runs

	Benchmark 2: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD~
	  Time (mean ± σ):      5.365 s ±  0.079 s    [User: 36.899 s, System: 1.810 s]
	  Range (min … max):    5.281 s …  5.436 s    3 runs

	Benchmark 3: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD
	  Time (mean ± σ):      2.725 s ±  0.063 s    [User: 14.796 s, System: 0.233 s]
	  Range (min … max):    2.667 s …  2.792 s    3 runs

	Summary
	  'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD' ran
	    1.56 ± 0.04 times faster than 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'origin/master'
	    1.97 ± 0.05 times faster than 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD~'

This can be turned off with SPATCH_CONCAT_COCCI, but as the
beneficiaries of "SPATCH_CONCAT_COCCI=" would mainly be those
developing the *.cocci rules themselves, let's leave this optimization
on by default.

For more information see my "Optimizing *.cocci rules by concat'ing
them" (<220901.8635dbjfko.gmgdl@evledraar.gmail.com>) on the
cocci@inria.fr mailing list.

This potentially changes the results of our *.cocci rules, but as
noted in that discussion it should be safe for our use. We don't name
rules, or if we do their names don't conflict across our *.cocci
files.

To the extent that we'd have any inter-dependencies between rules this
doesn't make that worse, as we'd have them now if we ran "make
coccicheck", applied the results, and would then have (due to
hypothetical interdependencies) suggested changes on the subsequent
"make coccicheck".

Our "coccicheck-test" target makes use of the ALL.cocci when running
tests, e.g. when testing unused.{c,out} we test it against ALL.cocci,
not unused.cocci. We thus assert (to the extent that we have test
coverage) that this concatenation doesn't change the expected results
of running these rules.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
340a4cb25c cocci rules: remove <id>'s from rules that don't need them
The <id> in the <rulename> part of the coccinelle syntax[1] is for our
purposes there to declares if we have inter-dependencies between
different rules.

But such <id>'s must be unique within a given semantic patch file.  As
we'll be processing a concatenated version of our rules in the
subsequent commit let's remove these names. They weren't being used
for the semantic patches themselves, and equated to a short comment
about the rule.

Both the filename and context of the rules makes it clear what they're
doing, so we're not gaining anything from keeping these. Retaining
them goes against recommendations that "contrib/coccinelle/README"
will be making in the subsequent commit.

This leaves only one named rule in our sources, where it's needed for
a "<id> <-> <extends> <id>" relationship:

	$ git -P grep '^@ ' -- contrib/coccinelle/
	contrib/coccinelle/swap.cocci:@ swap @
	contrib/coccinelle/swap.cocci:@ extends swap @

1. https://coccinelle.gitlabpages.inria.fr/website/docs/main_grammar.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
202086b85c Makefile: copy contrib/coccinelle/*.cocci to build/
Change the "coccinelle" rule so that we first copy the *.cocci source
in e.g. "contrib/coccinelle/strbuf.cocci" to
".build/contrib/coccinelle/strbuf.cocci" before operating on it.

For now this serves as a rather pointless indirection, but prepares us
for the subsequent commit where we'll be able to inject generated
*.cocci files. Having the entire dependency tree live inside .build/*
simplifies both the globbing we'd need to do, and any "clean" rules.

It will also help for future targets which will want to act on the
generated patches or the logs, e.g. targets to alert if we can't parse
certain files (or, less so than usual) with "spatch", and e.g. a
replacement for "ci/run-static-analysis.sh". Such a replacement won't
care about placing the patches in the in-tree, only whether they're
"OK" (and about the diff).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
316e3886e3 cocci: optimistically use COMPUTE_HEADER_DEPENDENCIES
Improve the incremental rebuilding support of "coccicheck" by
piggy-backing on the computed dependency information of the
corresponding *.o file, rather than rebuilding all <RULE>/<FILE> pairs
if either their corresponding file changes, or if any header changes.

This in effect uses the same method that the "sparse" target was made
to use in c234e8a0ec (Makefile: make the "sparse" target non-.PHONY,
2021-09-23), except that the dependency on the *.o file isn't a hard
one, we check with $(wildcard) if the *.o file exists, and if so we'll
depend on it.

This means that the common case of:

	make
	make coccicheck

Will benefit from incremental rebuilding, now changing e.g. a header
will only re-run "spatch" on those those *.c files that make use of
it:

By depending on the *.o we piggy-back on
COMPUTE_HEADER_DEPENDENCIES. See c234e8a0ec (Makefile: make the
"sparse" target non-.PHONY, 2021-09-23) for prior art of doing that
for the *.sp files. E.g.:

    make contrib/coccinelle/free.cocci.patch
    make -W column.h contrib/coccinelle/free.cocci.patch

Will take around 15 seconds for the second command on my 8 core box if
I didn't run "make" beforehand to create the *.o files. But around 2
seconds if I did and we have those "*.o" files.

Notes about the approach of piggy-backing on *.o for dependencies:

 * It *is* a trade-off since we'll pay the extra cost of running the C
   compiler, but we're probably doing that anyway. The compiler is much
   faster than "spatch", so even though we need to re-compile the *.o to
   create the dependency info for the *.c for "spatch" it's
   faster (especially if using "ccache").

 * There *are* use-cases where some would like to have *.o files
   around, but to have the "make coccicheck" ignore them. See:
   https://lore.kernel.org/git/20220826104312.GJ1735@szeder.dev/

   For those users a:

	make
	make coccicheck SPATCH_USE_O_DEPENDENCIES=

   Will avoid considering the *.o files.

 * If that *.o file doesn't exist we'll depend on an intermediate file
   of ours which in turn depends on $(FOUND_H_SOURCES).

   This covers both an initial build, or where "coccicheck" is run
   without running "all" beforehand, and because we run "coccicheck"
   on e.g. files in compat/* that we don't know how to build unless
   the requisite flag was provided to the Makefile.

   Most of the runtime of "incremental" runs is now spent on various
   compat/* files, i.e. we conditionally add files to COMPAT_OBJS, and
   therefore conflate whether we *can* compile an object and generate
   dependency information for it with whether we'd like to link it
   into our binary.

   Before this change the distinction didn't matter, but now one way
   to make this even faster on incremental builds would be to peel
   those concerns apart so that we can see that e.g. compat/mmap.c
   doesn't depend on column.h.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
f1c903debd cocci: make "coccicheck" rule incremental
Optimize the very slow "coccicheck" target to take advantage of
incremental rebuilding, and fix outstanding dependency problems with
the existing rule.

The rule is now faster both on the initial run as we can make better
use of GNU make's parallelism than the old ad-hoc combination of
make's parallelism combined with $(SPATCH_BATCH_SIZE) and/or the
"--jobs" argument to "spatch(1)".

It also makes us *much* faster when incrementally building, it's now
viable to "make coccicheck" as topic branches are merged down.

The rule didn't use FORCE (or its equivalents) before, so a:

	make coccicheck
	make coccicheck

Would report nothing to do on the second iteration. But all of our
patch output depended on all $(COCCI_SOURCES) files, therefore e.g.:

    make -W grep.c coccicheck

Would do a full re-run, i.e. a a change in a single file would force
us to do a full re-run.

The reason for this (not the initial rationale, but my analysis) is:

* Since we create a single "*.cocci.patch+" we don't know where to
  pick up where we left off, or how to incrementally merge e.g. a
  "grep.c" change with an existing *.cocci.patch.

* We've been carrying forward the dependency on the *.c files since
  63f0a758a0 (add coccicheck make target, 2016-09-15) the rule was
  initially added as a sort of poor man's dependency discovery.

  As we don't include other *.c files depending on other *.c files
  has always been broken, as could be trivially demonstrated
  e.g. with:

       make coccicheck
       make -W strbuf.h coccicheck

  However, depending on the corresponding *.c files has been doing
  something, namely that *if* an API change modified both *.c and *.h
  files we'd catch the change to the *.h we care about via the *.c
  being changed.

  For API changes that happened only via *.h files we'd do the wrong
  thing before this change, but e.g. for function additions (not
  "static inline" ones) catch the *.h change by proxy.

Now we'll instead:

 * Create a <RULE>/<FILE> pair in the .build directory, E.g. for
   swap.cocci and grep.c we'll create
   .build/contrib/coccinelle/swap.cocci.patch/grep.c.

   That file is the diff we'll apply for that <RULE>-<FILE>
   combination, if there's no changes to me made (the common case)
   it'll be an empty file.

 * Our generated *.patch
   file (e.g. contrib/coccinelle/swap.cocci.patch) is now a simple "cat
   $^" of all of all of the <RULE>/<FILE> files for a given <RULE>.

   In the case discussed above of "grep.c" being changed we'll do the
   full "cat" every time, so they resulting *.cocci.patch will always
   be correct and up-to-date, even if it's "incrementally updated".

   See 1cc0425a27 (Makefile: have "make pot" not "reset --hard",
   2022-05-26) for another recent rule that used that technique.

As before we'll:

 * End up generating a contrib/coccinelle/swap.cocci.patch, if we
   "fail" by creating a non-empty patch we'll still exit with a zero
   exit code.

   Arguably we should move to a more Makefile-native way of doing
   this, i.e. fail early, and if we want all of the "failed" changes
   we can use "make -k", but as the current
   "ci/run-static-analysis.sh" expects us to behave this way let's
   keep the existing behavior of exhaustively discovering all cocci
   changes, and only failing if spatch itself errors out.

Further implementation details & notes:

 * Before this change running "make coccicheck" would by default end
   up pegging just one CPU at the very end for a while, usually as
   we'd finish whichever *.cocci rule was the most expensive.

   This could be mitigated by combining "make -jN" with
   SPATCH_BATCH_SIZE, see 960154b9c1 (coccicheck: optionally batch
   spatch invocations, 2019-05-06).

   There will be cases where getting rid of "SPATCH_BATCH_SIZE" makes
   things worse, but a from-scratch "make coccicheck" with the default
   of SPATCH_BATCH_SIZE=1 (and tweaking it doesn't make a difference)
   is faster (~3m36s v.s. ~3m56s) with this approach, as we can feed
   the CPU more work in a less staggered way.

 * Getting rid of "SPATCH_BATCH_SIZE" particularly helps in cases
   where the default of 1 yields parallelism under "make coccicheck",
   but then running e.g.:

       make -W contrib/coccinelle/swap.cocci coccicheck

   I.e. before that would use only one CPU core, until the user
   remembered to adjust "SPATCH_BATCH_SIZE" differently than the
   setting that makes sense when doing a non-incremental run of "make
   coccicheck".

 * Before the "make coccicheck" rule would have to clean
   "contrib/coccinelle/*.cocci.patch*", since we'd create "*+" and
   "*.log" files there. Now those are created in
   .build/contrib/coccinelle/, which is covered by the "cocciclean" rule
   already.

Outstanding issues & future work:

 * We could get rid of "--all-includes" in favor of manually
   specifying a list of includes to give to "spatch(1)".

   As noted upthread of [1] a naïve removal of "--all-includes" will
   result in broken *.cocci patches, but if we know the exhaustive
   list of includes via COMPUTE_HEADER_DEPENDENCIES we don't need to
   re-scan for them, we could grab the headers to include from the
   .depend.d/<file>.o.d and supply them with the "--include" option to
   spatch(1).q

1. https://lore.kernel.org/git/87ft18tcog.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
60cfad9cbe cocci: split off "--all-includes" from SPATCH_FLAGS
Per the rationale in 7b63ea5750 (Makefile: remove mandatory "spatch"
arguments from SPATCH_FLAGS, 2022-07-05) we have certain flags that
are truly mandatory, such as "--sp-file" and "--patch .". The
"--all-includes" flag is also critical, but per [1] we might want to
ad-hoc tweak it occasionally for testing or one-offs.

But being unable to set e.g. SPATCH_FLAGS="--verbose-parsing" without
breaking how our "spatch" works isn't ideal, i.e. before this we'd
need to know about the default include flags, and specify:
SPATCH_FLAGS="--all-includes --verbose-parsing".

If we were then to change the default include flag (e.g. to
"--recursive-includes") in the future any such one-off commands would
need to be correspondingly updated.

Let's instead leave the SPATCH_FLAGS for the user, while creating a
new SPATCH_INCLUDE_FLAGS to allow for ad-hoc testing of the include
strategy itself.

1. https://lore.kernel.org/git/20220823095733.58685-1-szeder.dev@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
b75f2701c6 cocci: split off include-less "tests" from SPATCH_FLAGS
Amend the "coccicheck-test" rule added in f7ff6597a7 (cocci: add a
"coccicheck-test" target and test *.cocci rules, 2022-07-05) to stop
using "--all-includes". The flags we'll need for the tests are
different than the ones we'll need for our main source code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
49f54c4955 Makefile: split off SPATCH_BATCH_SIZE comment from "cocci" heading
Split off the "; setting[...]" part of the comment added in In
960154b9c1 (coccicheck: optionally batch spatch invocations,
2019-05-06), and restore what we had before that, which was a comment
indicating that variables for the "coccicheck" target were being set
here.

When 960154b9c1 amended the heading to discuss SPATCH_BATCH_SIZE it
left no natural place to add a new comment about other flags that
preceded it. As subsequent commits will add such comments we need to
split the existing comment up.

The wrapping for the "SPATCH_BATCH_SIZE" is now a bit odd, but
minimizes the diff size. As a subsequent commit will remove that
feature altogether this is worth it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
09d9a69e31 Makefile: have "coccicheck" re-run if flags change
Fix an issue with the "coccicheck" family of rules that's been here
since 63f0a758a0 (add coccicheck make target, 2016-09-15), unlike
e.g. "make grep.o" we wouldn't re-run it when $(SPATCH) or
$(SPATCH_FLAGS) changed. To test new flags we needed to first do a
"make cocciclean".

This now uses the same (copy/pasted) pattern as other "DEFINES"
rules. As a result we'll re-run properly. This can be demonstrated
e.g. on the issue noted in [1]:

	$ make contrib/coccinelle/xcalloc.cocci.patch COCCI_SOURCES=promisor-remote.c V=1
	[...]
	    SPATCH contrib/coccinelle/xcalloc.cocci
	$ make contrib/coccinelle/xcalloc.cocci.patch COCCI_SOURCES=promisor-remote.c SPATCH_FLAGS="--all-includes --recursive-includes"
	    * new spatch flags
	    SPATCH contrib/coccinelle/xcalloc.cocci
	     SPATCH result: contrib/coccinelle/xcalloc.cocci.patch
	$

1. https://lore.kernel.org/git/20220823095602.GC1735@szeder.dev/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
e603a140ae Makefile: add ability to TAB-complete cocci *.patch rules
Declare the contrib/coccinelle/<rule>.cocci.patch rules in such a way
as to allow TAB-completion, and slightly optimize the Makefile by
cutting down on the number of $(wildcard) in favor of defining
"coccicheck" and "coccicheck-pending" in terms of the same
incrementally filtered list.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
895ae7ae2a cocci rules: remove unused "F" metavariable from pending rule
Fix an issue with a rule added in 9b45f49981 (object-store: prepare
has_{sha1, object}_file to handle any repo, 2018-11-13). We've been
spewing out this warning into our $@.log since that rule was added:

	warning: rule starting on line 21: metavariable F not used in the - or context code

We should do a better job of scouring our coccinelle log files for
such issues, but for now let's fix this as a one-off.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:15 -04:00
c4864e3755 Makefile + shared.mak: rename and indent $(QUIET_SPATCH_T)
In f7ff6597a7 (cocci: add a "coccicheck-test" target and test *.cocci
rules, 2022-07-05) we abbreviated "_TEST" to "_T" to have it align
with the rest of the "="'s above it.

Subsequent commits will add more QUIET_SPATCH_* variables, so let's
stop abbreviating this, and indent it in preparation for adding more
of these variables.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:15 -04:00
586d8b5052 diff.c: use diff_free_queue()
Use diff_free_queue() instead of open-coding it.  This shortens the
code and make it less repetitive.

Note that the second hunk in diff_flush() is interesting, because the
'free_queue' label separates the loop freeing the queue's filepairs
from free()-ing the queue's internal array.  This is somewhat
suspicious, but it was not an issue before: there is only one place
from where we jump to this label with a goto, and that is protected by
an 'if (!q->nr && ...)' condition, i.e. we only skipped the loop
freeing the filepairs when there were no filepairs in the queue to
begin with.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 20:16:34 -04:00
ef84222fa9 line-log: free the diff queues' arrays when processing merge commits
When processing merge commits, the line-level log first creates an
array of diff queues, each comparing the merge commit with one of its
parents, to check whether any of the files in the given line ranges
were modified.  Alas, when freeing these queues it only frees the
filepairs in the queues, but not the queues' internal arrays holding
pointers to those filepairs.

Use the diff_free_queue() helper function introduced in the previous
commit to free the diff queues' internal arrays as well.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 20:16:34 -04:00
04ae00062d line-log: free diff queue when processing non-merge commits
When processing a non-merge commit, the line-level log first asks the
tree-diff machinery whether any of the files in the given line ranges
were modified between the current commit and its parent, and if some
of them were, then it loads the contents of those files from both
commits to see whether their line ranges were modified and/or need to
be adjusted.  Alas, it doesn't free() the diff queue holding the
results of that query and the contents of those files once its done.
This can add up to a substantial amount of leaked memory, especially
when the file in question is big and is frequently modified: a user
reported "Out of memory, malloc failed" errors with a 2MB text file
that was modified ~2800 times [1] (I estimate the leak would use up
almost 11GB memory in that case).

Free that diff queue to plug this memory leak.  However, instead of
simply open-coding the necessary three lines, add them as a helper
function to the diff API, because it will be useful elsewhere as well.

[1] https://public-inbox.org/git/CAFOPqVXz2XwzX8vGU7wLuqb2ZuwTuOFAzBLRM_QPk+NJa=eC-g@mail.gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 20:16:34 -04:00
db8016b43f t5516/t5601: be less strict about the number of credential warnings
It is unclear as to _why_, but under certain circumstances the warning
about credentials being passed as part of the URL seems to be swallowed
by the `git remote-https` helper in the Windows jobs of Git's CI builds.

Since it is not actually important how many times Git prints the
warning/error message, as long as it prints it at least once, let's just
make the test a bit more lenient and test for the latter instead of the
former, which works around these CI issues.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-01 16:35:05 -04:00
762521e8a5 t5516: move plaintext-password tests from t5601 and t5516
Commit 6dcbdc0d66 (remote: create fetch.credentialsInUrl config,
2022-06-06) added tests for our handling of passwords in URLs. Since the
obvious URL to be affected is git-over-http, the tests use http. However
they don't set up a test server; they just try to access
https://localhost, assuming it will fail (because the nothing is
listening there).

This causes some possible problems:

  - There might be a web server running on localhost, and we do not
    actually want to connect to that.

  - The DNS resolver, or the local firewall, might take a substantial
    amount of time (or forever, whichever comes first) to fail to
    connect, slowing down the tests cases unnecessarily.

  - Since there's no server, our tests for "allow" and "warn" still
    expect the clone/fetch/push operations to fail, even though in the
    real world we'd expect these to succeed. We scrape stderr to see
    what happened, but it's not as robust as a more realistic test.

Let's instead move these to t5551, which is all about testing http and
where we have a real server. That eliminates any issues with contacting
a strange URL, and lets the "allow" and "warn" tests confirm that the
operation actually succeeds.

It's not quite a verbatim move for a few reasons:

  - we can drop the LIBCURL dependency; it's already part of
    lib-httpd.sh

  - we'll use HTTPD_URL_USER_PASS, etc, instead of our fake URL. To
    avoid repetition, we'll add a few extra variables.

  - the "https://username:@localhost" test uses a funny URL that
    lib-httpd.sh doesn't provide. We'll similarly construct it in a
    variable. Note that we're hard-coding the lib-httpd username here,
    but t5551 already does that everywhere.

  - for the "domain:port" test, the URL provided by lib-httpd is fine,
    since our test server will always be on an exotic port. But we'll
    confirm in the test that this is so.

  - since our message-matching is done via grep, I simplified it to use
    a regex, rather than trying to massage lib-httpd's variables.
    Arguably this makes it more readable, too, while retaining the bits
    we care about: the fatal/warning distinction, the "uses plaintext"
    message, and the fact that the password was redacted.

  - we'll use the /auth/ path for the repo, which shows that we are
    indeed making use of the auth information when needed.

  - we'll also use /smart/; most of these tests could be done via /dumb/
    in t5550, but setting up pushes there requires extra effort and
    dependencies. The smart protocol is what most everyone is using
    these days anyway.

This patch is my own, but I stole the analysis and a few bits of the
commit message from a patch by Johannes Schindelin.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-01 16:35:05 -04:00
cc8f95c042 test-lib-functions: drop redundant diagnostic print
`test_path_is_missing` was introduced back in 2caf20c52b ("test-lib:
user-friendly alternatives to test [-d|-f|-e]", 2010-08-10). It took the
path that was supposed to be missing, as well as an optional "diagnosis"
that would be echoed if the path was found to be alive.

Commit 45a2686441 ("test-lib-functions: remove bug-inducing
"diagnostics" helper param", 2021-02-12) dropped this diagnostic
functionality from several `test_path_is_foo` helpers, but note how it
tweaked the README entry on `test_path_is_missing` without actually
adjusting its implementation.

Commit e7884b353b ("test-lib-functions: assert correct parameter count",
2021-02-12) then followed up by asserting that we get just a single
argument.

This history leaves us in a state where we assert that we have exactly
one argument, then go on to anyway check for arguments, echoing them
all. It's clear that we can simplify this code. We should also note that
we run `ls -ld "$1"`, so printing the filename a second time doesn't
really buy us anything. Thus, we can drop the whole `if` block as
redundant.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 21:12:09 -04:00
c805f06b01 Documentation: build redo-seen.sh from jch..seen
In a similar spirit as the previous commit, the 'seen' branch gets
rebuilt by reintegrating topics between 'jch' and the (old) tip of
'seen'.

Update the instructions on how to generate Meta/redo-seen.sh for the
first time to reflect this.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 18:52:21 -04:00
7fa56b1a00 Documentation: build redo-jch.sh from master..jch
Rebuilding the 'jch' branch begins by reintegrating any topics between
'master' and 'jch', not 'master' and 'seen'.

In the maintainer guide, the documentation isn't quite right, since the
initial input to Meta/Reintegrate is "master..seen", not "master..jch".
This can lead to confusing results when generating the Meta/redo-jch.sh
script for the first time.

Additionally, rebuilding 'jch' takes place in two steps. First, running
the script up to the first "### match next" cut-line, and then comparing
the result with what's on 'next' (i.e. with "git diff jch next"). Then,
the remaining set of topics get merged down to 'jch' (which aren't on
'next') by running the entire "redo-jch.sh" script.

Clarify the documentation to reflect this.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 18:52:16 -04:00
fe004a4333 run-command tests: test stdout of run_command_parallel()
Extend the tests added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) to test stdout in
addition to stderr.

When the "ungroup" feature was added in fd3aaf53f7 (run-command: add
an "ungroup" option to run_process_parallel(), 2022-06-07) its tests
were made to test both the stdout and stderr, but these existing tests
were left alone. Let's also exhaustively test our expected output
here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 00:16:37 -04:00
ac48da5a92 submodule tests: reset "trace.out" between "grep" invocations
Fix test patterns added in 62104ba14a (submodules: allow parallel
fetching, add tests and documentation, 2015-12-15) and
a028a1930c (fetching submodules: respect `submodule.fetchJobs` config
option, 2016-02-29).

In the former case we were leaving a trace.out file at the top-level
for any subsequent tests (there are none, currently). Let's clean the
file up instead.

In the latter case we were testing that a given configuration would
result in "N tasks" in the log, but we were grepping through the log
for all previous such tests, when we really meant to clear the logs
between the "grep" invocations.

In practice this resulted in no logic error, as e.g. "--fetch 7" would
not print out a "9 tasks" line, but let's be paranoid and stop
implicitly assuming that that's the case.

This change was originally left out of 51243f9f0f (run-command API:
don't fall back on online_cpus(), 2022-10-12), which added the
">trace.out" seen at the end of the context.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 00:16:37 -04:00
035cccf46e hook tests: fix redirection logic error in 96e7225b31
The tests added in 96e7225b31 (hook: add 'run' subcommand,
2021-12-22) were redirecting to "actual" both in the body of the hook
itself and in the testing code below.

The net result was that the "2>>actual" redirection later in the test
wasn't doing anything. Let's have those redirection do what it looks
like they're doing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 00:16:37 -04:00
c03801e19c The ninth batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 21:14:28 -04:00
2f503ee0d7 Merge branch 'jt/skipping-negotiator-wo-recursion'
Rewrite a deep recursion in the skipping negotiator to use a loop
with on-heap prio queue to avoid stack wastage.

* jt/skipping-negotiator-wo-recursion:
  negotiator/skipping: avoid stack overflow
2022-10-30 21:04:44 -04:00
1e230dfd6c Merge branch 'jc/doc-fsck-msgids'
Add documentation for message IDs in fsck error messages.

* jc/doc-fsck-msgids:
  Documentation: add lint-fsck-msgids
  fsck: document msg-id
  fsck: remove the unused MISSING_TREE_OBJECT
  fsck: remove the unused BAD_TAG_OBJECT
2022-10-30 21:04:44 -04:00
b1e3dd68ee Merge branch 'en/merge-tree-sequence'
"git merge-tree --stdin" is a new way to request a series of merges
and report the merge results.

* en/merge-tree-sequence:
  merge-tree: support multiple batched merges with --stdin
  merge-tree: update documentation for differences in -z output
2022-10-30 21:04:44 -04:00
d32dd8add5 Merge branch 'ds/bundle-uri-3'
Define the logical elements of a "bundle list", data structure to
store them in-core, format to transfer them, and code to parse
them.

* ds/bundle-uri-3:
  bundle-uri: suppress stderr from remote-https
  bundle-uri: quiet failed unbundlings
  bundle: add flags to verify_bundle()
  bundle-uri: fetch a list of bundles
  bundle: properly clear all revision flags
  bundle-uri: limit recursion depth for bundle lists
  bundle-uri: parse bundle list in config format
  bundle-uri: unit test "key=value" parsing
  bundle-uri: create "key=value" line parsing
  bundle-uri: create base key-value pair parsing
  bundle-uri: create bundle_list struct and helpers
  bundle-uri: use plain string in find_temp_filename()
2022-10-30 21:04:44 -04:00
bf0d9d0d34 Merge branch 'rj/branch-do-not-exit-with-minus-one-status'
"git branch --edit-description" can exit with status -1 which is
not a good practice; it learned to use 1 as everybody else instead.

* rj/branch-do-not-exit-with-minus-one-status:
  branch: error code with --edit-description
2022-10-30 21:04:43 -04:00
0c025612d4 Merge branch 'rj/branch-copy-rename-error-codepath-cleanup'
Code simplification.

* rj/branch-copy-rename-error-codepath-cleanup:
  branch: error copying or renaming a detached HEAD
2022-10-30 21:04:43 -04:00
c41ec63ef5 Merge branch 'tb/cap-patch-at-1gb'
"git apply" limits its input to a bit less than 1 GiB.

* tb/cap-patch-at-1gb:
  apply: reject patches larger than ~1 GiB
2022-10-30 21:04:43 -04:00
c7ccd4eae9 Merge branch 'jr/embargoed-releases-doc'
The role the security mailing list plays in an embargoed release
has been documented.

* jr/embargoed-releases-doc:
  embargoed releases: also describe the git-security list and the process
2022-10-30 21:04:43 -04:00
969230b64f Merge branch 'en/ort-dir-rename-and-symlink-fix'
Merging a branch with directory renames into a branch that changes
the directory to a symlink was mishandled by the ort merge
strategy, which has been corrected.

* en/ort-dir-rename-and-symlink-fix:
  merge-ort: fix bug with dir rename vs change dir to symlink
2022-10-30 21:04:43 -04:00
a23e0b69e2 Merge branch 'pb/subtree-split-and-merge-after-squashing-tag-fix'
A bugfix to "git subtree" in its split and merge features.

* pb/subtree-split-and-merge-after-squashing-tag-fix:
  subtree: fix split after annotated tag was squashed merged
  subtree: fix squash merging after annotated tag was squashed merged
  subtree: process 'git-subtree-split' trailer in separate function
  subtree: use named variables instead of "$@" in cmd_pull
  subtree: define a variable before its first use in 'find_latest_squash'
  subtree: prefix die messages with 'fatal'
  subtree: add 'die_incompatible_opt' function to reduce duplication
  subtree: use 'git rev-parse --verify [--quiet]' for better error messages
  test-lib-functions: mark 'test_commit' variables as 'local'
2022-10-30 21:04:43 -04:00
8851c4b065 Merge branch 'pw/rebase-reflog-fixes'
Fix some bugs in the reflog messages when rebasing and changes the
reflog messages of "rebase --apply" to match "rebase --merge" with
the aim of making the reflog easier to parse.

* pw/rebase-reflog-fixes:
  rebase: cleanup action handling
  rebase --abort: improve reflog message
  rebase --apply: make reflog messages match rebase --merge
  rebase --apply: respect GIT_REFLOG_ACTION
  rebase --merge: fix reflog message after skipping
  rebase --merge: fix reflog when continuing
  t3406: rework rebase reflog tests
  rebase --apply: remove duplicated code
2022-10-30 21:04:43 -04:00
003f815dd9 Merge branch 'pw/rebase-keep-base-fixes'
"git rebase --keep-base" used to discard the commits that are
already cherry-picked to the upstream, even when "keep-base" meant
that the base, on top of which the history is being rebuilt, does
not yet include these cherry-picked commits.  The --keep-base
option now implies --reapply-cherry-picks and --no-fork-point
options.

* pw/rebase-keep-base-fixes:
  rebase --keep-base: imply --no-fork-point
  rebase --keep-base: imply --reapply-cherry-picks
  rebase: factor out branch_base calculation
  rebase: rename merge_base to branch_base
  rebase: store orig_head as a commit
  rebase: be stricter when reading state files containing oids
  t3416: set $EDITOR in subshell
  t3416: tighten two tests
2022-10-30 21:04:42 -04:00
e5be3c632a Merge branch 'jh/trace2-timers-and-counters'
Two new facilities, "timer" and "counter", are introduced to the
trace2 API.

* jh/trace2-timers-and-counters:
  trace2: add global counter mechanism
  trace2: add stopwatch timers
  trace2: convert ctx.thread_name from strbuf to pointer
  trace2: improve thread-name documentation in the thread-context
  trace2: rename the thread_name argument to trace2_thread_start
  api-trace2.txt: elminate section describing the public trace2 API
  tr2tls: clarify TLS terminology
  trace2: use size_t alloc,nr_open_regions in tr2tls_thread_ctx
2022-10-30 21:04:42 -04:00
c112d8d9c2 Merge branch 'tb/shortlog-group'
"git shortlog" learned to group by the "format" string.

* tb/shortlog-group:
  shortlog: implement `--group=committer` in terms of `--group=<format>`
  shortlog: implement `--group=author` in terms of `--group=<format>`
  shortlog: extract `shortlog_finish_setup()`
  shortlog: support arbitrary commit format `--group`s
  shortlog: extract `--group` fragment for translation
  shortlog: make trailer insertion a noop when appropriate
  shortlog: accept `--date`-related options
2022-10-30 21:04:42 -04:00
71aa6e3d85 Merge branch 'rs/absorb-git-dir-simplify'
Code simplification by using strvec_pushf() instead of building an
argument in a separate strbuf.

* rs/absorb-git-dir-simplify:
  submodule: use strvec_pushf() for --super-prefix
2022-10-30 21:04:42 -04:00
c88895e67b Merge branch 'jk/repack-tempfile-cleanup'
The way "git repack" creared temporary files when it received a
signal was prone to deadlocking, which has been corrected.

* jk/repack-tempfile-cleanup:
  t7700: annotate cruft-pack failure with ok=sigpipe
  repack: drop remove_temporary_files()
  repack: use tempfiles for signal cleanup
  repack: expand error message for missing pack files
  repack: populate extension bits incrementally
  repack: convert "names" util bitfield to array
2022-10-30 21:04:42 -04:00
75f416ec6a Merge branch 'sg/stable-docdep'
Make sure generated dependency file is stably sorted to help
developers debugging their build issues.

* sg/stable-docdep:
  Documentation/build-docdep.perl: generate sorted output
2022-10-30 21:04:42 -04:00
576b19924e Merge branch 'sd/doc-smtp-encryption'
* sd/doc-smtp-encryption:
  docs: git-send-email: difference between ssl and tls smtp-encryption
2022-10-30 21:04:42 -04:00
160314e625 Merge branch 'jz/patch-id'
A new "--include-whitespace" option is added to "git patch-id", and
existing bugs in the internal patch-id logic that did not match
what "git patch-id" produces have been corrected.

* jz/patch-id:
  builtin: patch-id: remove unused diff-tree prefix
  builtin: patch-id: add --verbatim as a command mode
  patch-id: fix patch-id for mode changes
  builtin: patch-id: fix patch-id with binary diffs
  patch-id: use stable patch-id for rebases
  patch-id: fix stable patch id for binary / header-only
2022-10-30 21:04:41 -04:00
8fea12ab40 glossary: add reachability bitmap description
Describe the purpose of the reachability bitmap.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 19:58:46 -04:00
4973726c5d glossary: add "commit graph" description
Git has an additional "commit graph" capability that supplements the
normal commit object's directed acyclic graph (DAG). The supplemental
commit graph file is designed for speed of access.

Describe the commit graph both from the normative DAG view point and
from the commit graph file perspective.

Also, clarify the link between the branch ref and branch tip
by linking to the `ref` glossary entry, matching this commit graph
entry.

The commit-graph file is also distinguished by its hyphenation.

Subsequent commit catches the few cases where the hyphenation of
commit-graph was missing.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 19:58:46 -04:00
fa8e8d5b31 doc: use 'object database' not ODB or abbreviation
The abbreviation 'ODB' is used in the technical documentation
sections for commit-graph and parallel-checkout, along with an
'odb' option in `git-pack-redundant`, without expansion.

Use 'object database' in full, in those entries. The text has not
been reflowed to keep the changes minimal.

While in the glossary for `object` terms, add the common`oid`
abbreviation to its entry.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 19:58:46 -04:00
776ba91a5e doc: use "commit-graph" hyphenation consistently
Note, historical release notes have not been updated.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 19:58:40 -04:00
1e4ea950f7 archive-tar: report filter start error only once
A missing tar filter is reported by start_command() using error(), but
also by its caller, write_tar_filter_archive(), using die():

   $ git -c tar.invalid.command=foo archive --format=invalid HEAD
   error: cannot run foo: No such file or directory
   fatal: unable to start 'foo' filter: No such file or directory

The second message contains all relevant information and even says that
the failed command was intended to be used as a filter.  Silence the
first one because it's redundant.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 19:50:43 -04:00
ddbb47fde9 replace and remove run_command_v_opt()
Replace the remaining calls of run_command_v_opt() with run_command()
calls and explict struct child_process variables.  This is more verbose,
but not by much overall.  The code becomes more flexible, e.g. it's easy
to extend to conditionally add a new argument.

Then remove the now unused function and its own flag names, simplifying
the run-command API.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:51 -04:00
ef249b398e replace and remove run_command_v_opt_cd_env_tr2()
The convenience function run_command_v_opt_cd_env_tr2() has no external
callers left.  Inline it and remove it from the API.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:50 -04:00
d82dbbd849 replace and remove run_command_v_opt_tr2()
The convenience function run_command_v_opt_tr2() is only used by a
single caller.  Use struct child_process and run_command() directly
instead and remove the underused function.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:48 -04:00
eb5b6b57d0 replace and remove run_command_v_opt_cd_env()
run_command_v_opt_cd_env() is only used in an example in a comment.  Use
the struct child_process member "env" and run_command() directly instead
and then remove the unused convenience function.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:47 -04:00
0e90673957 use child_process members "args" and "env" directly
Build argument list and environment of child processes by using
struct child_process and populating its members "args" and "env"
directly instead of maintaining separate strvecs and letting
run_command_v_opt() and friends populate these members.  This is
simpler, shorter and slightly more efficient.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:40 -04:00
4120294cbf use child_process member "args" instead of string array variable
Use run_command() with a struct child_process variable and populate its
"args" member directly instead of building a string array and passing it
to run_command_v_opt().  This avoids the use of magic index numbers and
makes simplifies the possible addition of more arguments in the future.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:39 -04:00
242aa33de0 sequencer: simplify building argument list in do_exec()
Build child_argv during initialization, taking advantage of the C99
support for initialization expressions that are not compile time
constants.  This avoids the use of a magic index constant and is shorter
and simpler.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:37 -04:00
eede29aa35 bisect--helper: factor out do_bisect_run()
Deduplicate the code for reporting and starting the bisect run command
by moving it to a short helper function.  Use a string array instead of
a strvec to prepare the arguments, for simplicity.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:36 -04:00
48750b2d0d bisect: simplify building "checkout" argument list
Reduce the scope of argv_checkout, which allows to fully build it during
initialization.  Use oid_to_hex() instead of oid_to_hex_r(), because
that's simpler and using the static buffer of the former is just as safe
as the old static argv_checkout.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:35 -04:00
75c92a0540 am: simplify building "show" argument list
Build the string array av during initialization, without any magic
numbers or heap allocations.  Not duplicating the result of oid_to_hex()
is safe because run_command_v_opt() duplicates all arguments already.
(It would even be safe if it didn't, but that's a different story.)

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:33 -04:00
53c4be3fd8 run-command: fix return value comment
483bbd4e4c (run-command: introduce child_process_init(), 2014-08-19) and
2d71608ec0 (run-command: factor out child_process_clear(), 2015-10-24)
added help texts about child_process_init() and child_process_clear()
without updating the immediately following documentation of return codes
that only applied to the preexisting functions.

4c4066d95d (run-command: move doc to run-command.h, 2019-11-17) started
to list the functions explicitly that this paragraph applies to, but
still wrongly included child_process_init() and child_process_clear().
Remove their names from that list.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:32 -04:00
9397f3cf7e merge: remove always-the-same "verbose" arguments
Simplify the code that builds the arguments for the "read-tree"
invocation in reset_hard() and read_empty() to remove the "verbose"
parameter.

Before 172b6428d0 (do not overwrite untracked during merge from
unborn branch, 2010-11-14) there was a "reset_hard()" function that
would be called in two places, one of those passed a "verbose=1", the
other a "verbose=0".

After 172b6428d0 when read_empty() was split off from reset_hard()
both of these functions only had one caller. The "verbose" in
read_empty() would always be false, and the one in reset_hard() would
always be true.

There was never a good reason for the code to act this way, it
happened because the read_empty() function was a copy/pasted and
adjusted version of reset_hard().

Since we're no longer conditionally adding the "-v" parameter
here (and we'd only add it for "reset_hard()" we'll be able to move to
a simpler and safer run-command API in the subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:31 -04:00
671bbf7b9d adjust_shared_perm(): leave g+s alone when the group does not matter
Julien Moutinho reports that in an environment where directory does
not have BSD group semantics and requires the g+s to be set (aka
FORCE_DIR_SET_GID), but the system forbids chmod() to touch the g+s
bit, adjust_shared_perm() fails even when the repository is for
private use with perm = 0600, because we unconditionally try to set
the g+s bit.

When we grant extra access based on group membership (i.e. the
directory has either g+r or g+w bit set), which group the directory
and its contents are owned by matters.  But otherwise (e.g. perm is
set to 0600, in Julien's case), flipping g+s bit is not necessary.

Reported-by: Julien Moutinho <julm+git@sourcephile.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-28 14:55:27 -07:00
63bba4fdd8 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-28 11:27:01 -07:00
7d5a4d86a6 Merge branch 'tb/diffstat-with-utf8-strwidth'
"git diff --stat" etc. were invented back when everything was ASCII
and strlen() was a way to measure the display width of a string;
adjust them to compute the display width assuming UTF-8 pathnames.

* tb/diffstat-with-utf8-strwidth:
  diff: leave NEEDWORK notes in show_stats() function
  diff.c: use utf8_strwidth() to count display width
2022-10-28 11:26:55 -07:00
330135ac81 Merge branch 'mm/git-pm-try-catch-syntax-fix'
Fix a longstanding syntax error in Git.pm error codepath.

* mm/git-pm-try-catch-syntax-fix:
  Git.pm: trust rev-parse to find bare repositories
  Git.pm: add semicolon after catch statement
2022-10-28 11:26:54 -07:00
c5dd7773e1 Merge branch 'tb/remove-unused-pack-bitmap'
When creating a multi-pack bitmap, remove per-pack bitmap files
unconditionally as they will never be consulted.

* tb/remove-unused-pack-bitmap:
  builtin/repack.c: remove redundant pack-based bitmaps
2022-10-28 11:26:54 -07:00
7b9b634ca5 Merge branch 'ab/doc-synopsis-and-cmd-usage'
The short-help text shown by "git cmd -h" and the synopsis text
shown at the beginning of "git help cmd" have been made more
consistent.

* ab/doc-synopsis-and-cmd-usage: (34 commits)
  tests: assert consistent whitespace in -h output
  tests: start asserting that *.txt SYNOPSIS matches -h output
  doc txt & -h consistency: make "worktree" consistent
  worktree: define subcommand -h in terms of command -h
  reflog doc: list real subcommands up-front
  doc txt & -h consistency: make "commit" consistent
  doc txt & -h consistency: make "diff-tree" consistent
  doc txt & -h consistency: use "[<label>...]" for "zero or more"
  doc txt & -h consistency: make "annotate" consistent
  doc txt & -h consistency: make "stash" consistent
  doc txt & -h consistency: add missing options
  doc txt & -h consistency: use "git foo" form, not "git-foo"
  doc txt & -h consistency: make "bundle" consistent
  doc txt & -h consistency: make "read-tree" consistent
  doc txt & -h consistency: make "rerere" consistent
  doc txt & -h consistency: add missing options and labels
  doc txt & -h consistency: make output order consistent
  doc txt & -h consistency: add or fix optional "--" syntax
  doc txt & -h consistency: fix mismatching labels
  doc SYNOPSIS & -h: use "-" to separate words in labels, not "_"
  ...
2022-10-28 11:26:54 -07:00
5af5e54106 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-27 15:25:55 -07:00
2843bdeaca Sync with 'maint' 2022-10-27 15:25:24 -07:00
e7e5c6f715 Downmerge a bit more for 2.38.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-27 15:24:23 -07:00
40d2f93bde Merge branch 'rs/archive-dedup-printf' into maint-2.38
Code simplification.

* rs/archive-dedup-printf:
  archive: deduplicate verbose printing
2022-10-27 15:24:14 -07:00
4532cd8377 Merge branch 'jh/struct-zero-init-with-older-clang' into maint-2.38
Work around older clang that warns against C99 zero initialization
syntax for struct.

* jh/struct-zero-init-with-older-clang:
  config.mak.dev: disable suggest braces error on old clang versions
2022-10-27 15:24:13 -07:00
92cd390849 Merge branch 'rs/use-fspathncmp' into maint-2.38
Code clean-up.

* rs/use-fspathncmp:
  dir: use fspathncmp() in pl_hashmap_cmp()
2022-10-27 15:24:13 -07:00
64de207727 Merge branch 'rj/branch-edit-desc-unborn' into maint-2.38
"git branch --edit-description" on an unborh branch misleadingly
said that no such branch exists, which has been corrected.

* rj/branch-edit-desc-unborn:
  branch: description for non-existent branch errors
2022-10-27 15:24:13 -07:00
94f76c6ad9 Merge branch 'pw/remove-rebase-p-test' into maint-2.38
Remove outdated test.

* pw/remove-rebase-p-test:
  t3435: remove redundant test case
2022-10-27 15:24:13 -07:00
196b784428 Merge branch 'jc/use-of-uc-in-log-messages' into maint-2.38
Clarify that "the sentence after <area>: prefix does not begin with
a capital letter" rule applies only to the commit title.

* jc/use-of-uc-in-log-messages:
  SubmittingPatches: use usual capitalization in the log message body
2022-10-27 15:24:13 -07:00
606c7e2147 Merge branch 'jc/tmp-objdir' into maint-2.38
The code to clean temporary object directories (used for
quarantine) tried to remove them inside its signal handler, which
was a no-no.

* jc/tmp-objdir:
  tmp-objdir: skip clean up when handling a signal
2022-10-27 15:24:12 -07:00
3cf20d1957 Merge branch 'dd/document-runtime-prefix-better' into maint-2.38
Update comment in the Makefile about the RUNTIME_PREFIX config knob.

* dd/document-runtime-prefix-better:
  Makefile: clarify runtime relative gitexecdir
2022-10-27 15:24:12 -07:00
cf649a3613 Merge branch 'ab/unused-annotation' into maint-2.38
Compilation fix for ancient compilers.

* ab/unused-annotation:
  git-compat-util.h: GCC deprecated message arg only in GCC 4.5+
2022-10-27 15:24:12 -07:00
a9514e3b95 Merge branch 'tb/midx-repack-ignore-cruft-packs' into maint-2.38
"git multi-pack-index repack/expire" used to repack unreachable
cruft into a new pack, which have been corrected.
cf. <63a1c3d4-eff3-af10-4263-058c88e74594@github.com>

* tb/midx-repack-ignore-cruft-packs:
  midx.c: avoid cruft packs with non-zero `repack --batch-size`
  midx.c: remove unnecessary loop condition
  midx.c: replace `xcalloc()` with `CALLOC_ARRAY()`
  midx.c: avoid cruft packs with `repack --batch-size=0`
  midx.c: prevent `expire` from removing the cruft pack
  Documentation/git-multi-pack-index.txt: clarify expire behavior
  Documentation/git-multi-pack-index.txt: fix typo
2022-10-27 15:24:11 -07:00
1b97c136cc Merge branch 'so/diff-merges-cleanup' into maint-2.38
Code clean-up.

* so/diff-merges-cleanup:
  diff-merges: clarify log.diffMerges documentation
  diff-merges: cleanup set_diff_merges()
  diff-merges: cleanup func_by_opt()
2022-10-27 15:24:11 -07:00
feba8be3f0 Merge branch 'rj/ref-filter-get-head-description-leakfix' into maint-2.38
Leakfix.

* rj/ref-filter-get-head-description-leakfix:
  ref-filter.c: fix a leak in get_head_description
2022-10-27 15:24:11 -07:00
ded944ff29 Merge branch 'jc/environ-docs' into maint-2.38
Documentation on various Boolean GIT_* environment variables have
been clarified.

* jc/environ-docs:
  environ: GIT_INDEX_VERSION affects not just a new repository
  environ: simplify description of GIT_INDEX_FILE
  environ: GIT_FLUSH should be made a usual Boolean
  environ: explain Boolean environment variables
  environ: document GIT_SSL_NO_VERIFY
2022-10-27 15:24:09 -07:00
86fa96860b Makefile: force -O0 when compiling with SANITIZE=leak
Cherry pick commit d3775de0 (Makefile: force -O0 when compiling with
SANITIZE=leak, 2022-10-18), as otherwise the leak checker at GitHub
Actions CI seems to fail with a false positive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-27 15:12:22 -07:00
246eedf2bc Merge branch 'js/cmake-updates'
Update to build procedure with VS using CMake/CTest.

* js/cmake-updates:
  cmake: increase time-out for a long-running test
  cmake: avoid editing t/test-lib.sh
  add -p: avoid ambiguous signed/unsigned comparison
  cmake: copy the merge tools for testing
  cmake: make it easier to diagnose regressions in CTest runs
2022-10-27 14:51:53 -07:00
702bb4baea Merge branch 'nw/t1002-cleanup'
Code clean-up in test.

* nw/t1002-cleanup:
  t1002: modernize outdated conditional
2022-10-27 14:51:53 -07:00
6ae1a6eaf2 Merge branch 'ab/run-hook-api-cleanup'
Move a global variable added as a hack during regression fixes to
its proper place in the API.

* ab/run-hook-api-cleanup:
  run-command.c: remove "max_processes", add "const" to signal() handler
  run-command.c: pass "opts" further down, and use "opts->processes"
  run-command.c: use "opts->processes", not "pp->max_processes"
  run-command.c: don't copy "data" to "struct parallel_processes"
  run-command.c: don't copy "ungroup" to "struct parallel_processes"
  run-command.c: don't copy *_fn to "struct parallel_processes"
  run-command.c: make "struct parallel_processes" const if possible
  run-command API: move *_tr2() users to "run_processes_parallel()"
  run-command API: have run_process_parallel() take an "opts" struct
  run-command.c: use designated init for pp_init(), add "const"
  run-command API: don't fall back on online_cpus()
  run-command API: make "n" parameter a "size_t"
  run-command tests: use "return", not "exit"
  run-command API: have "run_processes_parallel{,_tr2}()" return void
  run-command test helper: use "else if" pattern
2022-10-27 14:51:53 -07:00
f62c546455 Merge branch 'tb/save-keep-pack-during-geometric-repack'
When geometric repacking feature is in use together with the
--pack-kept-objects option, we lost packs marked with .keep files.

* tb/save-keep-pack-during-geometric-repack:
  repack: don't remove .keep packs with `--pack-kept-objects`
2022-10-27 14:51:53 -07:00
220604042c Merge branch 'jk/unused-anno-more'
More UNUSED annotation to help using -Wunused option with the
compiler.

* jk/unused-anno-more:
  ll-merge: mark unused parameters in callbacks
  diffcore-pickaxe: mark unused parameters in pickaxe functions
  convert: mark unused parameter in null stream filter
  apply: mark unused parameters in noop error/warning routine
  apply: mark unused parameters in handlers
  date: mark unused parameters in handler functions
  string-list: mark unused callback parameters
  object-file: mark unused parameters in hash_unknown functions
  mark unused parameters in trivial compat functions
  update-index: drop unused argc from do_reupdate()
  submodule--helper: drop unused argc from module_list_compute()
  diffstat_consume(): assert non-zero length
2022-10-27 14:51:52 -07:00
99bb1a0bea Merge branch 'tb/midx-bitmap-selection-fix'
A bugfix with tracing support in midx codepath

* tb/midx-bitmap-selection-fix:
  pack-bitmap-write.c: instrument number of reused bitmaps
  midx.c: instrument MIDX and bitmap generation with trace2 regions
  midx.c: consider annotated tags during bitmap selection
  midx.c: fix whitespace typo
2022-10-27 14:51:52 -07:00
c695592850 config: let feature.experimental imply gc.cruftPacks=true
We are interested in exploring whether gc.cruftPacks=true should become
the default value.

To determine whether it is safe to do so, let's encourage more users to
try it out.

Users who have set feature.experimental=true have already volunteered to
try new and possibly-breaking config changes, so let's try this new
default with that set of users.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-26 14:39:31 -07:00
12253ab6d0 gc: add tests for --cruft and friends
In 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects via
loose, 2022-05-20) gc learned to respect '--cruft' and 'gc.cruftPacks'.
'--cruft' is exercised in t5329-pack-objects-cruft.sh, but in a way that
doesn't check whether a lone gc run generates these cruft packs.
'gc.cruftPacks' is never exercised.

Add some tests to exercise these options to gc in the gc test suite.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-26 14:39:30 -07:00
6c3b077c71 Documentation/howto/maintain-git.txt: fix Meta/redo-jch.sh invocation
The Meta/redo-jch.sh script is generated a few lines earlier by running:

    $ Meta/Reintegrate master..seen >Meta/redo-jch.sh

But the resulting script is not necessarily executable. Later mentions
of this script invoke it with sh (instead of directly), but this one is
an odd one out.

Update the documentation to invoke the Meta/redo-jch.sh script with sh
in case the maintainer has not made the script executable.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-26 13:35:41 -07:00
8f24115165 branch: error code with --edit-description
Since c2d17ba3db (branch --edit-description: protect against mistyped
branch name, 2012-02-05) we return -1 on error editing the branch
description.

Let's change to 1, which follows the established convention and it is
better for portability reasons.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-26 10:52:37 -07:00
77e7267e47 branch: error copying or renaming a detached HEAD
In c847f53712 (Detached HEAD (experimental), 2007-01-01) an error
condition was introduced in rename_branch() to prevent renaming, later
also copying, a detached HEAD.

The condition used was checking for NULL in oldname, the source branch
to rename/copy.  That condition cannot be satisfied because if no source
branch is specified, HEAD is going to be used in the call.

The error issued instead is:

	fatal: Invalid branch name: 'HEAD'

Let's remove the condition in copy_or_rename_branch() (the current
function name) and check for HEAD before calling it, dying with the
original intended error if we're in a detached HEAD.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-26 10:52:24 -07:00
db29e6bbae Sync with 'maint' 2022-10-26 10:49:20 -07:00
4654134976 negotiator/skipping: avoid stack overflow
mark_common() in negotiator/skipping.c may overflow the stack due to
recursive function calls. Avoid this by instead recursing using a
heap-allocated data structure.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-25 17:14:40 -07:00
b715529770 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-25 17:11:44 -07:00
4039b8f112 Merge branch 'jc/more-sanitizer-at-ci'
Enable address and undefined sanitizer tasks at GitHub Actions CI.

* jc/more-sanitizer-at-ci:
  ci: add address and undefined sanitizer tasks
2022-10-25 17:11:44 -07:00
bda957de7c Merge branch 'jc/ci-osx-with-sha1dc'
Give a bit more diversity to macOS CI by using sha1dc in one of the
jobs (the other one tests Apple Common Crypto).

* jc/ci-osx-with-sha1dc:
  ci: use DC_SHA1=YesPlease on osx-clang job for CI
2022-10-25 17:11:44 -07:00
777f548b5a Merge branch 'gc/bare-repo-discovery'
Allow configuration files in "protected" scopes to include other
configuration files.

* gc/bare-repo-discovery:
  config: respect includes in protected config
2022-10-25 17:11:44 -07:00
b988427918 Merge branch 'rs/diff-caret-bang-with-parents'
"git diff rev^!" did not show combined diff to go to the rev from
its parents.

* rs/diff-caret-bang-with-parents:
  diff: support ^! for merges
  revisions.txt: unspecify order of resolved parts of ^!
  revision: use strtol_i() for exclude_parent
2022-10-25 17:11:43 -07:00
7d8dc5a1af Downmerge a handful of topics for 2.38.2 2022-10-25 17:11:39 -07:00
1f49b5171a Merge branch 'jk/cleanup-callback-parameters' into maint-2.38
Code clean-up.

* jk/cleanup-callback-parameters:
  attr: drop DEBUG_ATTR code
  commit: avoid writing to global in option callback
  multi-pack-index: avoid writing to global in option callback
  test-submodule: inline resolve_relative_url() function
2022-10-25 17:11:39 -07:00
28f9cd0d5f Merge branch 'rs/gc-pack-refs-simplify' into maint-2.38
Code clean-up.

* rs/gc-pack-refs-simplify:
  gc: simplify maintenance_task_pack_refs()
2022-10-25 17:11:39 -07:00
b30a4435ed Merge branch 'nb/doc-mergetool-typofix' into maint-2.38
Typofix.

* nb/doc-mergetool-typofix:
  mergetool.txt: typofix 'overwriten' -> 'overwritten'
2022-10-25 17:11:38 -07:00
553ea9d8c7 Merge branch 'jk/sequencer-missing-author-name-check' into maint-2.38
Typofix in code.

* jk/sequencer-missing-author-name-check:
  sequencer: detect author name errors in read_author_script()
2022-10-25 17:11:38 -07:00
ff8d1ec5b8 Merge branch 'ds/bundle-uri-docfix' into maint-2.38
Doc formatting fix.

* ds/bundle-uri-docfix:
  bundle-uri: fix technical doc issues
2022-10-25 17:11:37 -07:00
71220d8e54 Merge branch 'ab/test-malloc-with-sanitize-leak' into maint-2.38
Test fix.

* ab/test-malloc-with-sanitize-leak:
  test-lib: have SANITIZE=leak imply TEST_NO_MALLOC_CHECK
2022-10-25 17:11:37 -07:00
3ae0094a91 Merge branch 'rs/bisect-start-leakfix' into maint-2.38
Code clean-up that results in plugging a leak.

* rs/bisect-start-leakfix:
  bisect--helper: plug strvec leak
2022-10-25 17:11:37 -07:00
1155c8efbb Merge branch 'jc/branch-description-unset' into maint-2.38
"GIT_EDITOR=: git branch --edit-description" resulted in failure,
which has been corrected.

* jc/branch-description-unset:
  branch: do not fail a no-op --edit-desc
2022-10-25 17:11:37 -07:00
48b754ddc0 Merge branch 'pw/ssh-sign-report-errors' into maint-2.38
The codepath to sign learned to report errors when it fails to read
from "ssh-keygen".

* pw/ssh-sign-report-errors:
  ssh signing: return an error when signature cannot be read
2022-10-25 17:11:35 -07:00
3694b3844e Merge branch 'pw/mailinfo-b-fix' into maint-2.38
Fix a logic in "mailinfo -b" that miscomputed the length of a
substring, which lead to an out-of-bounds access.

* pw/mailinfo-b-fix:
  mailinfo -b: fix an out of bounds access
2022-10-25 17:11:35 -07:00
4dccc006b0 Merge branch 'rs/test-httpd-in-C-locale' into maint-2.38
Force C locale while running tests around httpd to make sure we can
find expected error messages in the log.

* rs/test-httpd-in-C-locale:
  t/lib-httpd: pass LANG and LC_ALL to Apache
2022-10-25 17:11:35 -07:00
bcf22f29df Merge branch 'js/merge-ort-in-read-only-repo' into maint-2.38
In read-only repositories, "git merge-tree" tried to come up with a
merge result tree object, which it failed (which is not wrong) and
led to a segfault (which is bad), which has been corrected.

* js/merge-ort-in-read-only-repo:
  merge-ort: return early when failing to write a blob
  merge-ort: fix segmentation fault in read-only repositories
2022-10-25 17:11:34 -07:00
7f8a6caee5 Merge branch 'ja/rebase-i-avoid-amending-self' into maint-2.38
"git rebase -i" can mistakenly attempt to apply a fixup to a commit
itself, which has been corrected.

* ja/rebase-i-avoid-amending-self:
  sequencer: avoid dropping fixup commit that targets self via commit-ish
2022-10-25 17:11:34 -07:00
cf96b393d6 Merge branch 'jk/fsck-on-diet' into maint-2.38
"git fsck" failed to release contents of tree objects already used
from the memory, which has been fixed.

* jk/fsck-on-diet:
  parse_object_buffer(): respect save_commit_buffer
  fsck: turn off save_commit_buffer
  fsck: free tree buffers after walking unreachable objects
2022-10-25 17:11:33 -07:00
1655ac884a Merge branch 'ah/fsmonitor-daemon-usage-non-l10n' into maint-2.38
Fix messages incorrectly marked for translation.

* ah/fsmonitor-daemon-usage-non-l10n:
  fsmonitor--daemon: don't translate literal commands
2022-10-25 17:11:33 -07:00
0d5d92906a Merge branch 'jk/clone-allow-bare-and-o-together' into maint-2.38
"git clone" did not like to see the "--bare" and the "--origin"
options used together without a good reason.

* jk/clone-allow-bare-and-o-together:
  clone: allow "--bare" with "-o"
2022-10-25 17:11:33 -07:00
665d7e08b4 Merge branch 'jk/remote-rename-without-fetch-refspec' into maint-2.38
"git remote rename" failed to rename a remote without fetch
refspec, which has been corrected.

* jk/remote-rename-without-fetch-refspec:
  remote: handle rename of remote without fetch refspec
2022-10-25 17:11:32 -07:00
457f863fb4 Merge branch 'vd/fix-unaligned-read-index-v4' into maint-2.38
The codepath that reads from the index v4 had unaligned memory
accesses, which has been corrected.

* vd/fix-unaligned-read-index-v4:
  read-cache: avoid misaligned reads in index v4
2022-10-25 17:11:32 -07:00
c72f2febae Merge branch 'ab/coding-guidelines-c99' into maint-2.38
Update CodingGuidelines to clarify what features to use and avoid
in C99.

* ab/coding-guidelines-c99:
  CodingGuidelines: recommend against unportable C99 struct syntax
  CodingGuidelines: mention C99 features we can't use
  CodingGuidelines: allow declaring variables in for loops
  CodingGuidelines: mention dynamic C99 initializer elements
  CodingGuidelines: update for C99
2022-10-25 17:11:32 -07:00
3882a0d3ad Documentation: add lint-fsck-msgids
During the initial development of the fsck-msgids.txt feature, it
has become apparent that it is very much error prone to make sure
the description in the documentation file are sorted and correctly
match what is in the fsck.h header file.

Add a quick-and-dirty Perl script and doc-lint target to sanity
check that the fsck-msgids.txt is consistent with the error type
list in the fsck.h header file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-25 15:44:19 -07:00
f6534dbda4 fsck: document msg-id
The documentation lacks mention of specific <msg-id> that are supported.
While git-help --config will display a list of these options, often
developers' first instinct is to consult the git docs to find valid
config values.

Add a list of fsck error messages, and link to it from the git-fsck
documentation.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-25 15:44:18 -07:00
7edfb883ab fsck: remove the unused MISSING_TREE_OBJECT
This error type has never been used since it was introduced at
159e7b08 (fsck: detect gitmodules files, 2018-05-02).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-25 15:44:18 -07:00
51691fed06 fsck: remove the unused BAD_TAG_OBJECT
2175a0c6 (fsck: stop checking tag->tagged, 2019-10-18) stopped
checking the tagged object referred to by a tag object, which is what the
error message BAD_TAG_OBJECT was for. Since then the BAD_TAG_OBJECT
message is no longer used anywhere.

Remove the BAD_TAG_OBJECT msg-id.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-25 15:44:18 -07:00
f1c0e3946e apply: reject patches larger than ~1 GiB
The apply code is not prepared to handle extremely large files. It uses
"int" in some places, and "unsigned long" in others.

This combination leads to unfortunate problems when switching between
the two types. Using "int" prevents us from handling large files, since
large offsets will wrap around and spill into small negative values,
which can result in wrong behavior (like accessing the patch buffer with
a negative offset).

Converting from "unsigned long" to "int" also has truncation problems
even on LLP64 platforms where "long" is the same size as "int", since
the former is unsigned but the latter is not.

To avoid potential overflow and truncation issues in `git apply`, apply
similar treatment as in dcd1742e56 (xdiff: reject files larger than
~1GB, 2015-09-24), where the xdiff code was taught to reject large
files for similar reasons.

The maximum size was chosen somewhat arbitrarily, but picking a value
just shy of a gigabyte allows us to double it without overflowing 2^31-1
(after which point our value would wrap around to a negative number).
To give ourselves a bit of extra margin, the maximum patch size is a MiB
smaller than a full GiB, which gives us some slop in case we allocate
"(records + 1) * sizeof(int)" or similar.

Luckily, the security implications of these conversion issues are
relatively uninteresting, because a victim needs to be convinced to
apply a malicious patch.

Reported-by: 정재우 <thebound7@gmail.com>
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-25 15:21:17 -07:00
a294443fa1 embargoed releases: also describe the git-security list and the process
With the recent turnover on the git-security list, questions came up how
things are usually run. Rather than answering questions individually,
extend Git's existing documentation about security vulnerabilities to
describe the git-security mailing list, how things are run on that list,
and what to expect throughout the process from the time a security bug
is reported all the way to the time when a fix is released.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Julia Ramer <gitprplr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 16:03:59 -07:00
0d32ae8d7f builtin: patch-id: remove unused diff-tree prefix
The last git version that had "diff-tree" in the header text
of "git diff-tree" output was v1.3.0 from 2006. The header text
was changed from "diff-tree" to "commit" in 91539833
("Log message printout cleanups").

Given how long ago this change was made, it is highly unlikely that
anyone is still feeding in outputs from that git version.

Remove the handling of the "diff-tree" prefix and document the
source of the other prefixes so that the overall functionality
is more clear.

Signed-off-by: Jerry Zhang <Jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 15:44:20 -07:00
2871f4d447 builtin: patch-id: add --verbatim as a command mode
There are situations where the user might not want the default
setting where patch-id strips all whitespace. They might be working
in a language where white space is syntactically important, or they
might have CI testing that enforces strict whitespace linting. In
these cases, a whitespace change would result in the patch
fundamentally changing, and thus deserving of a different id.

Add a new mode that is exclusive of --stable and --unstable called
--verbatim. It also corresponds to the config
patchid.verbatim = true. In this mode, the stable algorithm is
used and whitespace is not stripped from the patch text.

Users of --unstable mainly care about compatibility with old git
versions, which unstripping the whitespace would break. Thus there
isn't a usecase for the combination of --verbatim and --unstable,
and we don't expose this so as to not add maintainence burden.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
fixes https://github.com/Skydio/revup/issues/2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 15:44:20 -07:00
93105aba6c patch-id: fix patch-id for mode changes
Currently patch-id as used in rebase and cherry-pick does not account
for file modes if the file is modified. One consequence of this is
that if you have a local patch that changes modes, but upstream
has applied an outdated version of the patch that doesn't include
that mode change, "git rebase" will drop your local version of the
patch along with your mode changes. It also means that internal
patch-id doesn't produce the same output as the builtin, which does
account for mode changes due to them being part of diff output.

Fix by adding mode to the patch-id if it has changed, in the same
format that would be produced by diff, so that it is compatible
with builtin patch-id.

Signed-off-by: Jerry Zhang <Jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 15:44:20 -07:00
0df19eb9d9 builtin: patch-id: fix patch-id with binary diffs
"git patch-id" currently doesn't produce correct output if the
incoming diff has any binary files. Add logic to get_one_patchid
to handle the different possible styles of binary diff. This
attempts to keep resulting patch-ids identical to what would be
produced by the counterpart logic in diff.c, that is it produces
the id by hashing the a and b oids in succession.

In general we handle binary diffs by first caching the object ids from
the "index" line and using those if we then find an indication
that the diff is binary.

The input could contain patches generated with "git diff --binary". This
currently breaks the parse logic and results in multiple patch-ids
output for a single commit. Here we have to skip the contents of the
patch itself since those do not go into the patch id. --binary
implies --full-index so the object ids are always available.

When the diff is generated with --full-index there is no patch content
to skip over.

When a diff is generated without --full-index or --binary, it will
contain abbreviated object ids. This will still result in a sufficiently
unique patch-id when hashed, but does not match internal patch id
output. We'll call this ok for now as we already need specialized
arguments to diff in order to match internal patch id (namely -U3).

Signed-off-by: Jerry Zhang <Jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 15:44:19 -07:00
51276c1832 patch-id: use stable patch-id for rebases
Git doesn't persist patch-ids during the rebase process, so there is
no need to specifically invoke the unstable variant. Use the stable
logic for all internal patch-id calculations to minimize the number of
code paths and improve test coverage.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 15:44:19 -07:00
0570be79ea patch-id: fix stable patch id for binary / header-only
Patch-ids for binary patches are found by hashing the object
ids of the before and after objects in succession. However in
the --stable case, there is a bug where hunks are not flushed
for binary and header-only patch ids, which would always result
in a patch-id of 0000. The --unstable case is currently correct.

Reorder the logic to branch into 3 cases for populating the
patch body: header-only which populates nothing, binary which
populates the object ids, and normal which populates the text
diff. All branches will end up flushing the hunk.

Don't populate the ---a/ and +++b/ lines for binary diffs, to correspond
to those lines not being present in the "git diff" text output.
This is necessary because we advertise that the patch-id calculated
internally and used in format-patch is the same that what the
builtin "git patch-id" would produce when piped from a diff.

Update the test to run on both binary and normal files.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 15:44:19 -07:00
7b11234e3b shortlog: implement --group=committer in terms of --group=<format>
In the same spirit as the previous commit, reimplement
`--group=committer` as a special case of `--group=<format>`, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 14:48:05 -07:00
9c10d4ff24 shortlog: implement --group=author in terms of --group=<format>
Instead of handling SHORTLOG_GROUP_AUTHOR separately, reimplement it as
a special case of the new `--group=<format>` mode, where the author mode
is a shorthand for `--group='%aN <%aE>'.

Note that we still need to keep the SHORTLOG_GROUP_AUTHOR enum since it
has a different meaning in `read_from_stdin()`, where it is still used
for a different purpose.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 14:48:05 -07:00
10538e2a62 shortlog: extract shortlog_finish_setup()
Extract a function which finishes setting up the shortlog struct for
use. The caller in `make_cover_letter()` does not care about trailer
sorting, so it isn't strictly necessary to add a call there in this
patch.

But the next patch will add additional functionality to the new
`shortlog_finish_setup()` function, which the caller in
`make_cover_letter()` will care about.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 14:48:05 -07:00
3dc95e09e1 shortlog: support arbitrary commit format --groups
In addition to generating a shortlog based on committer, author, or the
identity in one or more specified trailers, it can be useful to generate
a shortlog based on an arbitrary commit format.

This can be used, for example, to generate a distribution of commit
activity over time, like so:

    $ git shortlog --group='%cd' --date='format:%Y-%m' -s v2.37.0..
       117  2022-06
       274  2022-07
       324  2022-08
       263  2022-09
         7  2022-10

Arbitrary commit formats can be used. In fact, `git shortlog`'s default
behavior (to count by commit authors) can be emulated as follows:

    $ git shortlog --group='%aN <%aE>' ...

and future patches will make the default behavior (as well as
`--committer`, and `--group=trailer:<trailer>`) special cases of the
more flexible `--group` option.

Note also that the SHORTLOG_GROUP_FORMAT enum value is used only to
designate that `--group:<format>` is in use when in stdin mode to
declare that the combination is invalid.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 14:48:05 -07:00
b017d3dae9 shortlog: extract --group fragment for translation
The subsequent commit will add another unhandled case in
`read_from_stdin()` which will want to use the same message as with
`--group=trailer`.

Extract the "--group=trailer" part from this message so the same
translation key can be used for both cases.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 14:48:05 -07:00
0b293df964 shortlog: make trailer insertion a noop when appropriate
When there are no trailers to insert, it is natural that
insert_records_from_trailers() should return without having done any
work.

But instead we guard this call unnecessarily by first checking whether
`log->groups` has the `SHORTLOG_GROUP_TRAILER` bit set.

Prepare to match a similar pattern in the future where a function which
inserts records of a certain type does no work when no specifiers
matching that type are given.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 14:48:05 -07:00
251554c269 shortlog: accept --date-related options
Prepare for a future patch which will introduce arbitrary pretty formats
via the `--group` argument.

To allow additional customizability (for example, to support something
like `git shortlog -s --group='%aD' --date='format:%Y-%m' ...` (which
groups commits by the datestring 'YYYY-mm' according to author date), we
must store off the `--date` parsed from calling `parse_revision_opt()`.

Note that this also affects custom output `--format` strings in `git
shortlog`. Though this is a behavior change, this is arguably fixing a
long-standing bug (ie., that `--format` strings are not affected by
`--date` specifiers as they should be).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 14:48:05 -07:00
91badeba32 builtin/repack.c: implement --expire-to for storing pruned objects
When pruning objects with `--cruft`, `git repack` offers some
flexibility when selecting the set of which objects are pruned via the
`--cruft-expiration` option.

This is useful for expiring objects which are older than the grace
period, making races where to-be-pruned objects become reachable and
then ancestors of freshly pushed objects, leaving the repository in a
corrupt state after pruning substantially less likely [1].

But in practice, such races are impossible to avoid entirely, no matter
how long the grace period is. To prevent this race, it is often
advisable to temporarily put a repository into a read-only state. But in
practice, this is not always practical, and so some middle ground would
be nice.

This patch introduces a new option, `--expire-to`, which teaches `git
repack` to write an additional cruft pack containing just the objects
which were pruned from the repository. The caller can specify a
directory outside of the current repository as the destination for this
second cruft pack.

This makes it possible to prune objects from a repository, while still
holding onto a supplemental copy of them outside of the original
repository. Having this copy on-disk makes it substantially easier to
recover objects when the aforementioned race is encountered.

`--expire-to` is implemented in a somewhat convoluted manner, which is
to take advantage of the fact that the first time `write_cruft_pack()`
is called, it adds the name of the cruft pack to the `names` string
list. That means the second time we call `write_cruft_pack()`, objects
in the previously-written cruft pack will be excluded.

As long as the caller ensures that no objects are expired during the
second pass, this is sufficient to generate a cruft pack containing all
objects which don't appear in any of the new packs written by `git
repack`, including the cruft pack. In other words, all of the objects
which are about to be pruned from the repository.

It is important to note that the destination in `--expire-to` does not
necessarily need to be a Git repository (though it can be) Notably, the
expired packs do not contain all ancestors of expired objects. So if the
source repository contains something like:

              <unreachable>
             /
    C1 --- C2
      \
       refs/heads/master

where C2 is unreachable, but has a parent (C1) which is reachable, and
C2 would be pruned, then the expiry pack will contain only C2, not C1.

[1]: https://lore.kernel.org/git/20190319001829.GL29661@sigill.intra.peff.net/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 13:39:42 -07:00
c12cda479e builtin/repack.c: write cruft packs to arbitrary locations
In the following commit, a new write_cruft_pack() caller will be added
which wants to write a cruft pack to an arbitrary location. Prepare for
this by adding a parameter which controls the destination of the cruft
pack.

For now, provide "packtmp" so that this commit does not change any
behavior.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 13:39:42 -07:00
eddad36860 builtin/repack.c: pass "cruft_expiration" to write_cruft_pack
`builtin/repack.c`'s `write_cruft_pack()` is used to generate the cruft
pack when `--cruft` is supplied. It uses a static variable
"cruft_expiration" which is filled in by option parsing.

A future patch will add an `--expire-to` option which allows `git
repack` to write a cruft pack containing the pruned objects out to a
separate repository. In order to implement this functionality, some
callers will have to pass a value for `cruft_expiration` different than
the one filled out by option parsing.

Prepare for this by teaching `write_cruft_pack` to take a
"cruft_expiration" parameter, instead of reading a single static
variable.

The (sole) existing caller of `write_cruft_pack()` will pass the value
for "cruft_expiration" filled in by option parsing, retaining existing
behavior. This means that we can make the variable local to
`cmd_repack()`, and eliminate the static declaration.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 13:39:42 -07:00
4e7b65ba8e builtin/repack.c: pass "out" to prepare_pack_objects
`builtin/repack.c`'s `prepare_pack_objects()` is used to prepare a set
of arguments to a `pack-objects` process which will generate a desired
pack.

A future patch will add an `--expire-to` option which allows `git
repack` to write a cruft pack containing the pruned objects out to a
separate repository. Prepare for this by teaching that function to write
packs to an arbitrary location specified by the caller.

All existing callers of `prepare_pack_objects()` will pass `packtmp` for
`out`, retaining the existing behavior.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 13:39:42 -07:00
81071626ba trace2: add global counter mechanism
Add global counters mechanism to Trace2.

The Trace2 counters mechanism adds the ability to create a set of
global counter variables and an API to increment them efficiently.
Counters can optionally report per-thread usage in addition to the sum
across all threads.

Counter events are emitted to the Trace2 logs when a thread exits and
at process exit.

Counters are an alternative to `data` and `data_json` events.

Counters are useful when you want to measure something across the life
of the process, when you don't want per-measurement events for
performance reasons, when the data does not fit conveniently within a
region, or when your control flow does not easily let you write the
final total.  For example, you might use this to report the number of
calls to unzip() or the number of de-delta steps during a checkout.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:26 -07:00
8ad575646c trace2: add stopwatch timers
Add stopwatch timer mechanism to Trace2.

Timers are an alternative to Trace2 Regions.  Regions are useful for
measuring the time spent in various computation phases, such as the
time to read the index, time to scan for unstaged files, time to scan
for untracked files, and etc.

However, regions are not appropriate in all places.  For example,
during a checkout, it would be very inefficient to use regions to
measure the total time spent inflating objects from the ODB from
across the entire lifetime of the process; a per-unzip() region would
flood the output and significantly slow the command; and some form of
post-processing would be requried to compute the time spent in unzip().

Timers can be used to measure a series of timer intervals and emit
a single summary event (at thread and/or process exit).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:26 -07:00
24a4c45da9 trace2: convert ctx.thread_name from strbuf to pointer
Convert the `tr2tls_thread_ctx.thread_name` field from a `strbuf`
to a "const char*" pointer.

The `thread_name` field is a constant string that is constructed when
the context is created.  Using a (non-const) `strbuf` structure for it
caused some confusion in the past because it implied that someone
could rename a thread after it was created.  That usage was not
intended.  Change it to a const pointer to make the intent more clear.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:26 -07:00
3124793604 trace2: improve thread-name documentation in the thread-context
Improve the documentation of the tr2tls_thread_ctx.thread_name field
and its relation to the tr2tls_thread_ctx.thread_id field.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:25 -07:00
a70839cf36 trace2: rename the thread_name argument to trace2_thread_start
Rename the `thread_name` argument in `tr2tls_create_self()` and
`trace2_thread_start()` to be `thread_base_name` to make it clearer
that the passed argument is a component used in the construction of
the actual `struct tr2tls_thread_ctx.thread_name` variable.

The base name will be used along with the thread id to create a
unique thread name.

This commit does not change how the `thread_name` field is
allocated or stored within the `tr2tls_thread_ctx` structure.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:25 -07:00
8e8c5ad27a api-trace2.txt: elminate section describing the public trace2 API
Eliminate the mostly obsolete `Public API` sub-section from the
`Trace2 API` section in the documentation.  Strengthen the referral
to `trace2.h`.

Most of the technical information in this sub-section was moved to
`trace2.h` in 6c51cb525d (trace2: move doc to trace2.h, 2019-11-17) to
be adjacent to the function prototypes.  The remaining text wasn't
that useful by itself.

Furthermore, the text would need a bit of overhaul to add routines
that do not immediately generate a message, such as stopwatch timers.
So it seemed simpler to just get rid of it.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:25 -07:00
5bbb925137 tr2tls: clarify TLS terminology
Reduce or eliminate use of the term "TLS" in the Trace2 code.

The term "TLS" has two popular meanings: "thread-local storage" and
"transport layer security".  In the Trace2 source, the term is associated
with the former.  There was concern on the mailing list about it refering
to the latter.

Update the source and documentation to eliminate the use of the "TLS" term
or replace it with the phrase "thread-local storage" to reduce ambiguity.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:25 -07:00
545ddca0c3 trace2: use size_t alloc,nr_open_regions in tr2tls_thread_ctx
Use "size_t" rather than "int" for the "alloc" and "nr_open_regions"
fields in the "tr2tls_thread_ctx".  These are used by ALLOC_GROW().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24 12:45:25 -07:00
cdc3db33ce submodule: use strvec_pushf() for --super-prefix
absorb_git_dir_into_superproject() uses a strbuf and strvec_pushl() to
build and add the --super-prefix option and its argument.  Use a single
strvec_pushf() call to add the stuck form instead, which reduces the
code size and avoids a strbuf allocation and release.  The same is
already done in submodule_reset_index() and submodule_move_head().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-23 14:07:32 -07:00
9b3fadfd06 t7700: annotate cruft-pack failure with ok=sigpipe
One of our tests intentionally causes the cruft-pack generation phase of
repack to fail, in order to stimulate an exit from repack at the desired
moment. It does so by feeding a bogus option argument to pack-objects.
This is a simple and reliable way to get pack-objects to fail, but it
has one downside: pack-objects will die before reading its stdin, which
means the caller repack may racily get SIGPIPE writing to it.

For the purposes of this test, that's OK. We are checking whether repack
cleans up already-created .tmp files, and it will do so whether it exits
or dies by signal (because the tempfile API hooks both).

But we have to tell test_must_fail that either outcome is OK, or it
complains about the signal. Arguably this is a workaround (compared to
fixing repack), as repack dying to SIGPIPE means that it loses the
opportunity to give a more detailed message. But we don't actually write
such a message anyway; we rely on pack-objects to have written something
useful to stderr, and it does. In either case (signal or exit), that is
the main thing the user will see.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-23 11:08:45 -07:00
ec1edbcb56 merge-tree: support multiple batched merges with --stdin
Add an option, --stdin, to merge-tree which will accept lines of input
with two branches to merge per line, and which will perform all the
merges and give output for each in turn.  This option implies -z, and
modifies the output to also include a merge status since the exit code
of the program can no longer convey that information now that multiple
merges are involved.

This could be useful, for example, by Git hosting providers.  When one
branch is updated, one may want to check whether all code reviews
targetting that branch can still cleanly merge.  Avoiding the overhead
of starting up a separate process for each of those code reviews might
provide significant savings in a repository with many code reviews.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-22 22:21:26 -07:00
a9f5bb83e0 merge-tree: update documentation for differences in -z output
The Informational Messages was updated in de90581141 ("merge-ort:
optionally produce machine-readable output", 2022-06-18) to provide more
detailed and machine parseable output when `-z` is passed, but the
Documentation was not updated to reflect these changes.  Update it now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-22 22:21:24 -07:00
20da61f25f Git.pm: trust rev-parse to find bare repositories
When initializing a repository object, we run "git rev-parse --git-dir"
to let the C version of Git find the correct directory. But curiously,
if this fails we don't automatically say "not a git repository".
Instead, we do our own pure-perl check to see if we're in a bare
repository.

This makes little sense, as rev-parse will report both bare and non-bare
directories. This logic comes from d5c7721d58 (Git.pm: Add support for
subdirectories inside of working copies, 2006-06-24), but I don't see
any reason given why we can't just rely on rev-parse. Worse, because we
treat any non-error response from rev-parse as a non-bare repository,
we'll erroneously set the object's WorkingCopy, even in a bare
repository.

But it gets worse. Since 8959555cee (setup_git_directory(): add an owner
check for the top-level directory, 2022-03-02), it's actively wrong (and
dangerous). The perl code doesn't implement the same ownership checks.
And worse, after "finding" the bare repository, it sets GIT_DIR in the
environment, which tells any subsequent Git commands that we've
confirmed the directory is OK, and to trust us. I.e., it re-opens the
vulnerability plugged by 8959555cee when using Git.pm's repository
discovery code.

We can fix this by just relying on rev-parse to tell us when we're not
in a repository, which fixes the vulnerability. Furthermore, we'll ask
its --is-bare-repository function to tell us if we're bare or not, and
rely on that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-22 16:39:48 -07:00
2b86c10084 merge-ort: fix bug with dir rename vs change dir to symlink
When changing a directory to a symlink on one side of history, and
renaming the parent of that directory to a different directory name
on the other side, e.g. with this kind of setup:

    Base commit: Has a file named dir/subdir/file
    Side1:       Rename dir/ -> renamed-dir/
    Side2:       delete dir/subdir/file, add dir/subdir as symlink

Then merge-ort was running into an assertion failure:

    git: merge-ort.c:2622: apply_directory_rename_modifications: Assertion `ci->dirmask == 0' failed

merge-recursive did not have as obvious an issue handling this case,
likely because we never fixed it to handle the case from commit
902c521a35 ("t6423: more involved directory rename test", 2020-10-15)
where we need to be careful about nested renames when a directory rename
occurs (dir/ -> renamed-dir/ implies dir/subdir/ ->
renamed-dir/subdir/).  However, merge-recursive does have multiple
problems with this testcase:

  * Incorrect stages for the file: merge-recursive omits the stage in
    the index corresponding to the base stage, making `git status`
    report "added by us" for renamed-dir/subdir/file instead of the
    expected "deleted by them".

  * Poor directory/file conflict handling: For the renamed-dir/subdir
    symlink, instead of reporting a file/directory conflict as
    expected, it reports "Error: Refusing to lose untracked file at
    renamed-dir/subdir".  This is a lie because there is no untracked
    file at that location.  It then does the normal suboptimal
    merge-recursive thing of having the symlink be tracked in the index
    at a location where it can't be written due to D/F conflicts
    (namely, renamed-dir/subdir), but writes it to the working tree at
    a different location as a new untracked file (namely,
    renamed-dir/subdir~B^0)

Technically, these problems don't prevent the user from resolving the
merge if they can figure out to ignore the confusion, but because both
pieces of output are quite confusing I don't want to modify the test
to claim the recursive also passes it even if it doesn't have the bug
that ort did.

So, fix the bug in ort by splitting the conflict_info for "dir/subdir"
into two, one for the directory part, one for the file (i.e. symlink)
part, since the symlink is being renamed by directory rename detection.
The directory part is needed for proper nesting, since there are still
conflict_info fields for files underneath it (though those are marked
as is_null, they are still present until the entries are processed,
and the entry processing wants every non-toplevel entry to have a
parent directory).

Reported-by: Stefano Rivera <stefano@rivera.za.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-22 16:10:33 -07:00
193430717a repack: drop remove_temporary_files()
After we've successfully finished the repack, we call
remove_temporary_files(), which looks for and removes any files matching
".tmp-$$-pack-*", where $$ is the pid of the current process. But this
is pointless. If we make it this far in the process, we've already
renamed these tempfiles into place, and there is nothing left to delete.

Nor is there a point in trying to call it to clean up when we _aren't_
successful. It's not safe for using in a signal handler, and the
previous commit already handed that job over to the tempfile API.

It might seem like it would be useful to clean up stray .tmp files left
by other invocations of git-repack. But it won't clean those files; it
only matches ones with its pid, and leaves the rest. Fortunately, those
are cleaned up naturally by successive calls to git-repack; we'll
consider .tmp-*.pack the same as normal packfiles, so "repack -ad", etc,
will roll up their contents and eventually delete them.

The one case that could matter is if pack-objects generates an extension
we don't know about, like ".tmp-pack-$$-$hash.some-new-ext". The current
code will quietly delete such a file, while after this patch we'd leave
it in place. In practice this doesn't happen, and would be indicative of
a bug. Leaving the file as cruft is arguably a better behavior, as it
means somebody is more likely to eventually notice and fix the bug.  If
we really wanted to be paranoid, we could scan for and warn about such
files, but that seems like overkill.

There's nothing to test with regard to the removal of this function. It
was doing nothing, so the behavior should be the same.  However, we can
verify (and protect) our assumption that "repack -ad" will eventually
remove stray files by adding a test for that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 18:03:52 -07:00
9cf10d8786 repack: use tempfiles for signal cleanup
When git-repack exits due to a signal, it tries to clean up by calling
its remove_temporary_files() function, which walks through the packs dir
looking for ".tmp-$$-pack-*" files to delete (where "$$" is the pid of
the current process).

The biggest problem here is that remove_temporary_files() is not safe to
call in a signal handler. It uses opendir(), which isn't on the POSIX
async-signal-safe list. The details will be platform-specific, but a
likely issue is that it needs to allocate memory; if we receive a signal
while inside malloc(), etc, we'll conflict on the allocator lock and
deadlock with ourselves.

We can fix this by just cleaning up the files directly, without walking
the directory. We already know the complete list of .tmp-* files that
were generated, because we recorded them via populate_pack_exts(). When
we find files there, we can use register_tempfile() to record the
filenames. If we receive a signal, then the tempfile API will clean them
up for us, and it's async-safe and pretty battle-tested.

Note that this is slightly racier than the existing scheme. We don't
record the filenames until pack-objects tells us the hash over stdout.
So during the period between it generating the file and reporting the
hash, we'd fail to clean up. However, that period is very small. During
most of the pack generation process pack-objects is using its own
internal tempfiles. It's only at the very end that it moves them into
the names git-repack expects, and then it immediately reports the name
to us. Given that cleanup like this is best effort (after all, we may
get SIGKILL), this level of race is acceptable.

When we register the tempfiles, we'll record them locally and use the
result to call rename_tempfile(), rather than renaming by hand.  This
isn't strictly necessary, as once we've renamed the files they're gone,
and the tempfile API's cleanup unlink() would simply become a pointless
noop. But managing the lifetimes of the tempfile objects is the cleanest
thing to do, and the tempfile pointers naturally fill the same role as
the old booleans.

This patch also fixes another small problem. We only hook signals, and
don't set up an atexit handler. So if we see an error that causes us to
die(), we'll leave the .tmp-* files in place. But since the tempfile API
handles this for us, this is now fixed for free. The new test covers
this by stimulating a failure of pack-objects when generating a cruft
pack. Before this patch, the .tmp-* file for the main pack would have
been left, but now we correctly clean it up.

Two small subtleties on the implementation:

  - in the renaming loop, we can stop re-constructing fname_old; we only
    use it when we have a tempfile to rename, so we can just ask the
    tempfile for its path (which, barring bugs, should be identical)

  - when renaming fails, our error message mentions fname_old. But since
    a failed rename_tempfile() invalidates the tempfile struct, we'll
    lose access to that string. Instead, let's mention the destination
    filename, which is what most other callers do.

Reported-by: Jan Pokorný <poki@fnusa.cz>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 18:03:52 -07:00
a4880b20cc repack: expand error message for missing pack files
If pack-objects tells us it generated pack $hash, we expect to find
.tmp-$$-pack-$hash.pack, .idx, .rev, and so on. Some of these files are
optional, but others are not. For the required ones, we'll bail with an
error if any of them is missing.

The error message is just "missing required file", which is a bit vague.
We should be more clear that it is not the user's fault, but rather that
the sub-pgoram we called is not operating as expected. In practice,
nobody should ever see this message, as it would generally only be
caused by a bug in Git.

It probably doesn't make sense to convert this to a BUG(), though, as
there are other (unlikely) possibilities, such as somebody else racily
deleting the files, filesystem errors causing stat() to fail, and so on.

A nice side effect here is that we stop relying on fname_old in this
code path, which will let us deal with it only in the first part of the
conditional.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 18:03:52 -07:00
b639606fd0 repack: populate extension bits incrementally
After generating the main pack and then any additional cruft packs, we
iterate over the "names" list (which contains hashes of packs generated
by pack-objects), and call populate_pack_exts() for each.

There's one small problem with this. In repack_promisor_objects(), we
may add entries to "names" and call populate_pack_exts() for them.
Calling it again is mostly just wasteful, as we'll stat() the filename
with each possible extension, get the same result, and just overwrite
our bits.

So we could drop the call there, and leave the final loop to populate
all of the bits. But instead, this patch does the reverse: drops the
final loop, and teaches the other two sites to populate the bits as they
add entries.

This makes the code easier to reason about, as you never have to worry
about when the util field is valid; it is always valid for each entry.

It also serves my ulterior purpose: recording the generated filenames as
soon as possible will make it easier for a future patch to use them for
cleaning up from a failed operation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 18:03:52 -07:00
d3d9c51973 repack: convert "names" util bitfield to array
We keep a string_list "names" containing the hashes of packs generated
on our behalf by pack-objects. The util field of each item is treated as
a bitfield that tells us which extensions (.pack, .idx, .rev, etc) are
present for each name.

Let's switch this to allocating a real array. That will give us room in
a future patch to store more data than just a single bit per extension.
And it makes the code a little easier to read, as we avoid casting back
and forth between uintptr_t and a void pointer.

Since the only thing we're storing is an array, we could just allocate
it directly. But instead I've put it into a named struct here. That
further increases readability around the casts, and in particular helps
differentiate us from other string_lists in the same file which use
their util field differently. E.g., the existing_*_packs lists still do
bit-twiddling, but their bits have different meaning than the ones in
"names". This makes it hard to grep around the code to see how the util
fields are used; now you can look for "generated_pack_data".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 18:03:52 -07:00
ce8529b2bb diff: leave NEEDWORK notes in show_stats() function
The previous step made an attempt to correctly compute display
columns allocated and padded different parts of diffstat output.
There are at least two known codepaths in the function that still
mixes up display widths and byte length that need to be fixed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 15:02:31 -07:00
1762382ab1 subtree: fix split after annotated tag was squashed merged
The previous commit fixed a failure in 'git subtree merge --squash' when
the previous squash-merge merged an annotated tag of the subtree
repository which is missing locally.

The same failure happens in 'git subtree split', either directly or when
called by 'git subtree push', under the same circumstances: 'cmd_split'
invokes 'find_existing_splits', which loops through previous commits and
invokes 'git rev-parse' (via 'process_subtree_split_trailer') on the
value of any 'git subtree-split' trailer it finds. This fails if this
value is the hash of an annotated tag which is missing locally.

Add a new optional argument 'repository' to 'cmd_split' and
'find_existing_splits', and invoke 'cmd_split' with that argument from
'cmd_push'. This allows 'process_subtree_split_trailer' to try to fetch
the missing tag from the 'repository' if it's not available locally,
mirroring the new behaviour of 'git subtree pull' and 'git subtree
merge'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:06 -07:00
0d330673d4 subtree: fix squash merging after annotated tag was squashed merged
When 'git subtree merge --squash $ref' is invoked, either directly or
through 'git subtree pull --squash $repo $ref', the code looks for the
latest squash merge of the subtree in order to create the new merge
commit as a child of the previous squash merge.

This search is done in function 'process_subtree_split_trailer', invoked
by 'find_latest_squash', which looks for the most recent commit with a
'git-subtree-split' trailer; that trailer's value is the object name in
the subtree repository of the ref that was last squash-merged. The
function verifies that this object is present locally with 'git
rev-parse', and aborts if it's not.

The hash referenced by the 'git-subtree-split' trailer is guaranteed to
correspond to a commit since it is the result of running 'git rev-parse
-q --verify "$1^{commit}"' on the first argument of 'cmd_merge' (this
corresponds to 'rev' in 'cmd_merge' which is passed through to
'new_squash_commit' and 'squash_msg').

But this is only the case since e4f8baa88a (subtree: parse revs in
individual cmd_ functions, 2021-04-27), which went into Git 2.32. Before
that commit, 'cmd_merge' verified the revision it was given using 'git
rev-parse --revs-only "$@"'. Such an invocation, when fed the name of an
annotated tag, would return the hash of the tag, not of the commit
referenced by the tag.

This leads to a failure in 'find_latest_squash' when squash-merging if
the most recent squash-merge merged an annotated tag of the subtree
repository, using a pre-2.32 version of 'git subtree', unless that
previous annotated tag is present locally (which is not usually the
case).

We can fix this by fetching the object directly by its hash in
'process_subtree_split_trailer' when 'git rev-parse' fails, but in order
to do so we need to know the name or URL of the subtree repository.
This is not possible in general for 'git subtree merge', but is easy
when it is invoked through 'git subtree pull' since in that case the
subtree repository is passed by the user at the command line.

Allow the 'git subtree pull' scenario to work out-of-the-box by adding
an optional 'repository' argument to functions 'cmd_merge',
'find_latest_squash' and 'process_subtree_split_trailer', and invoke
'cmd_merge' with that 'repository' argument in 'cmd_pull'.

If 'repository' is absent in 'process_subtree_split_trailer', instruct
the user to try fetching the missing object directly.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:06 -07:00
f10d31cf2d subtree: process 'git-subtree-split' trailer in separate function
Both functions 'find_latest_squash' (called by 'git subtree merge
--squash' and 'git subtree split --rejoin') and 'find_existing_splits'
(called by git 'subtree split') loop through commits that have a
'git-subtree-dir' trailer, and then process the 'git-subtree-mainline'
and 'git-subtree-split' trailers for those commits.

The processing done for the 'git-subtree-split' trailer is simple: we
check if the object exists with 'rev-parse' and set the variable
'sub' to the object name, or we die if the object does not exist.

In a future commit we will add more steps to the processing of this
trailer in order to make the code more robust.

To reduce code duplication, move the processing of the
'git-subtree-split' trailer to a dedicated function,
'process_subtree_split_trailer'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:06 -07:00
7990142eb1 subtree: use named variables instead of "$@" in cmd_pull
'cmd_pull' already checks that only two arguments are given,
'repository' and 'ref'. Define variables with these names instead of
using the positional parameter $2 and "$@".

This will allow a subsequent commit to pass 'repository' to 'cmd_merge'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:06 -07:00
34ab458cb1 subtree: define a variable before its first use in 'find_latest_squash'
The function 'find_latest_squash' takes a single argument, 'dir', but a
debug statement uses this variable before it takes its value from $1.

This statement thus gets the value of 'dir' from the calling function,
which currently is the same as the 'dir' argument, so it works but it
is confusing.

Move the definition of 'dir' before its first use.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:05 -07:00
5626a9e2a9 subtree: prefix die messages with 'fatal'
Just as was done in 0008d12284 (submodule: prefix die messages with
'fatal', 2021-07-10) for 'git-submodule.sh', make the 'die' messages
outputed by 'git-subtree.sh' more in line with the rest of the code base
by prefixing them with "fatal: ", and do not capitalize their first
letter.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:05 -07:00
2e94339fdc subtree: add 'die_incompatible_opt' function to reduce duplication
9a3e3ca2ba (subtree: be stricter about validating flags, 2021-04-27)
added validation code to check that options given to 'git subtree <cmd>'
made sense with the command being used.

Refactor these checks by adding a 'die_incompatible_opt' function to
reduce code duplication.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:05 -07:00
a50fcc13dd subtree: use 'git rev-parse --verify [--quiet]' for better error messages
There are three occurences of 'git rev-parse <rev>' in 'git-subtree.sh'
where the command expects a revision and the script dies or exits if the
revision can't be found. In that case, the error message from 'git
rev-parse' is:

    $ git rev-parse <bad rev>
    <bad rev>
    fatal: ambiguous argument '<bad rev>': unknown revision or path not in the working tree.
    Use '--' to separate paths from revisions, like this:
    'git <command> [<revision>...] -- [<file>...]'

This is a little confusing to the user, since this error message is
outputed by 'git subtree'.

At these points in the script, we know that we are looking for a single
revision, so be explicit by using '--verify', resulting in a little
better error message:

    $ git rev-parse --verify <bad rev>
    fatal: Needed a single revision

In the two occurences where we 'die' if 'git rev-parse' fails, 'git
subtree' outputs "could not rev-parse split hash $b from commit $sq", so
we actually do not need the supplementary error message from 'git
rev-parse'; add '--quiet' to silence it.

In the third occurence, we 'exit', so keep the error message from 'git
rev-parse'. Note that this messsage is still suboptimal since it can be
understood to mean that 'git rev-parse' did not receive a single
revision as argument, which is not the case here: the command did
receive a single revision, but the revision is not resolvable to an
available object.

The alternative would be to use '--' after the revision, as suggested by
the first error message, resulting in a clearer error message:

    $ git rev-parse <bad rev> --
    fatal: bad revision '<bad rev>'

Unfortunately we can't use that syntax because in the more common case
of the revision resolving to a known object, the command outputs the
object's hash, a newline, and the dashdash, which breaks the 'git
subtree' script.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:05 -07:00
455f0adf57 test-lib-functions: mark 'test_commit' variables as 'local'
Some variables in 'test_commit' have names that are common enough that
it is very likely that test authors might use them in a test. If they do
so and use 'test_commit' between setting such a variable and using it,
the variable value from 'test_commit' will leak back into the test and
most likely break it.

Prevent that by marking all variables in 'test_commit' as 'local'. This
allow a subsequent commit to use a 'tag' variable.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 13:51:05 -07:00
3dc6b4e027 Documentation/build-docdep.perl: generate sorted output
To make sure that our manpages are rebuilt when any of the included
source files change and only the affected manpages are rebuilt,
'build-docdep.perl' scans our documentation source files for include
directives, and outputs 'make' dependencies to be included by
'Documentation/Makefile'.  This script relies on Perl's hash data
structures, and generates its output while iterating over them, and
since hashes in Perl are very much unordered, the output varies
greatly from run to run, both the order of targets and the order of
dependencies of each target.

This lack of ordering doesn't matter for 'make', because it cares
neither about the order of targets in a Makefile nor about the order
of a target's dependencies.  However, it does matter to developers
looking into build issues potentially involving these generated
dependencies, as it's rather hard to tell whether there are any
relevant (i.e. not order-only) changes among the dependencies compared
to the previous run.

So let's make 'build-docdep.perl's output stable and ordered by
sorting the keys of the hashes before iterating over them.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 11:39:38 -07:00
1fc3c0ad40 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21 11:37:36 -07:00
c2058ea237 Merge branch 'rj/branch-edit-description-with-nth-checkout'
"git branch --edit-description @{-1}" is now a way to edit branch
description of the branch you were on before switching to the
current branch.

* rj/branch-edit-description-with-nth-checkout:
  branch: support for shortcuts like @{-1}, completed
2022-10-21 11:37:29 -07:00
1f20aa22d7 Merge branch 'ds/cmd-main-reorder'
Code clean-up.

* ds/cmd-main-reorder:
  git.c: improve code readability in cmd_main()
2022-10-21 11:37:29 -07:00
91d3d7e6e2 Merge branch 'ab/grep-simplify-extended-expression'
Giving "--invert-grep" and "--all-match" without "--grep" to the
"git log" command resulted in an attempt to access grep pattern
expression structure that has not been allocated, which has been
corrected.

* ab/grep-simplify-extended-expression:
  grep.c: remove "extended" in favor of "pattern_expression", fix segfault
2022-10-21 11:37:28 -07:00
4a48c7d25f Merge branch 'jc/symbolic-ref-no-recurse'
After checking out a "branch" that is a symbolic-ref that points at
another branch, "git symbolic-ref HEAD" reports the underlying
branch, not the symbolic-ref the user gave checkout as argument.
The command learned the "--no-recurse" option to stop after
dereferencing a symbolic-ref only once.

* jc/symbolic-ref-no-recurse:
  symbolic-ref: teach "--[no-]recurse" option
2022-10-21 11:37:28 -07:00
6269c46ada Merge branch 'jk/use-o0-in-leak-sanitizer'
Avoid false-positive from LSan whose assumption may be broken with
higher optimization levels.

* jk/use-o0-in-leak-sanitizer:
  Makefile: force -O0 when compiling with SANITIZE=leak
2022-10-21 11:37:27 -07:00
cc7574322f Merge branch 'ab/macos-build-fix-with-sha1dc'
Enable macOS build with sha1dc hash function.

* ab/macos-build-fix-with-sha1dc:
  fsmonitor OSX: compile with DC_SHA1=YesPlease
2022-10-21 11:37:27 -07:00
1ad5c3df35 ci: use DC_SHA1=YesPlease on osx-clang job for CI
7b8cfe34 (Merge branch 'ed/fsmonitor-on-networked-macos',
2022-10-17) broke the build on macOS with sha1dc by bypassing our
hash abstraction (git_SHA_CTX etc.), but it wasn't caught before the
problematic topic was merged down to the 'master' branch.  Nobody
was even compile testing with DC_SHA1 set, although it is the
recommended choice in these days for folks when they use SHA-1.

This was because the default for macOS uses Apple Common Crypto, and
both of the two CI jobs did not override the default.  Tweak one of
them to use DC_SHA1 to improve the coverage.

We may want to give similar diversity for Linux jobs so that some of
them build with other implementations of SHA-1; they currently all
build and test with DC_SHA1 as that is the default on everywhere
other than macOS.

But let's start small to fill only the immediate need.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-20 10:01:37 -07:00
1c0962c0c4 ci: add address and undefined sanitizer tasks
The current code is clean with these two sanitizers, and we would
like to keep it that way by running the checks for any new code.

The signal of "passed with asan, but not ubsan" (or vice versa) is
not that useful in practice, so it is tempting to run both santizers
in a single task, but it seems to take forever, so tentatively let's
try having two separate ones.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-20 09:20:59 -07:00
45c9f05c44 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 15:38:06 -07:00
617e9991d4 Merge branch 'jh/struct-zero-init-with-older-clang'
Work around older clang that warns against C99 zero initialization
syntax for struct.

* jh/struct-zero-init-with-older-clang:
  config.mak.dev: disable suggest braces error on old clang versions
2022-10-19 15:38:06 -07:00
fe9c607509 Merge branch 'rs/archive-dedup-printf'
Code simplification.

* rs/archive-dedup-printf:
  archive: deduplicate verbose printing
2022-10-19 15:38:06 -07:00
179eb1d967 Merge branch 'ab/coding-guidelines-c99'
Update CodingGuidelines to clarify what features to use and avoid
in C99.

* ab/coding-guidelines-c99:
  CodingGuidelines: recommend against unportable C99 struct syntax
  CodingGuidelines: mention C99 features we can't use
  CodingGuidelines: allow declaring variables in for loops
  CodingGuidelines: mention dynamic C99 initializer elements
  CodingGuidelines: update for C99
2022-10-19 15:38:05 -07:00
c858750b41 cmake: increase time-out for a long-running test
As suggested in
https://github.com/git-for-windows/git/issues/3966#issuecomment-1221264238,
t7112 can run for well over one hour, which seems to be the default
maximum run time at least when running CTest-based tests in Visual
Studio.

Let's increase the time-out as a stop gap to unblock developers wishing
to run Git's test suite in Visual Studio.

Note: The actual run time is highly dependent on the circumstances. For
example, in Git's CI runs, the Windows-based tests typically take a bit
over 5 minutes to run. CI runs have the added benefit that Windows
Defender (the common anti-malware scanner on Windows) is turned off,
something many developers are not at liberty to do on their work
stations. When Defender is turned on, even on this developer's high-end
Ryzen system, t7112 takes over 15 minutes to run.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 12:33:05 -07:00
ee9e66e4e7 cmake: avoid editing t/test-lib.sh
In 7f5397a07c (cmake: support for testing git when building out of the
source tree, 2020-06-26), we implemented support for running Git's test
scripts even after building Git in a different directory than the source
directory.

The way we did this was to edit the file `t/test-lib.sh` to override
`GIT_BUILD_DIR` to point somewhere else than the parent of the `t/`
directory.

This is unideal because it always leaves a tracked file marked as
modified, and it is all too easy to commit that change by mistake.

Let's change the strategy by teaching `t/test-lib.sh` to detect the
presence of a file called `GIT-BUILD-DIR` in the source directory. If it
exists, the contents are interpreted as the location to the _actual_
build directory. We then write this file as part of the CTest
definition.

To support building Git via a regular `make` invocation after building
it using CMake, we ensure that the `GIT-BUILD-DIR` file is deleted (for
convenience, this is done as part of the Makefile rule that is already
run with every `make` invocation to ensure that `GIT-BUILD-OPTIONS` is
up to date).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 12:33:05 -07:00
79d266223a add -p: avoid ambiguous signed/unsigned comparison
In the interactive `add` operation, users can choose to jump to specific
hunks, and Git will present the hunk list in that case. To avoid showing
too many lines at once, only a maximum of 21 hunks are shown, skipping
the "mode change" pseudo hunk.

The comparison performed to skip the "mode change" pseudo hunk (if any)
compares a signed integer `i` to the unsigned value `mode_change` (which
can be 0 or 1 because it is a 1-bit type).

According to section 6.3.1.8 of the C99 standard (see e.g.
https://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf), what should
happen is an automatic conversion of the "lesser" type to the "greater"
type, but since the types differ in signedness, it is ill-defined what
is the correct "usual arithmetic conversion".

Which means that Visual C's behavior can (and does) differ from GCC's:
When compiling Git using the latter, `add -p`'s `goto` command shows no
hunks by default because it casts a negative start offset to a pretty
large unsigned value, breaking the "goto hunk" test case in
`t3701-add-interactive.sh`.

Let's avoid that by converting the unsigned bit explicitly to a signed
integer.

Note: This is a long-standing bug in the Visual C build of Git, but it
has never been caught because t3701 is skipped when `NO_PERL` is set,
which is the case in the `vs-test` jobs of Git's CI runs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 11:55:28 -07:00
6a83b5f081 cmake: copy the merge tools for testing
Even when running the tests via CTest, t7609 and t7610 rely on more than
only a few mergetools to be copied to the build directory. Let's make it
so.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 11:55:28 -07:00
2ea1d8b556 cmake: make it easier to diagnose regressions in CTest runs
When a test script fails in Git's test suite, the usual course of action
is to re-run it using options to increase the verbosity of the output,
e.g. `-v` and `-x`.

Like in Git's CI runs, when running the tests in Visual Studio via the
CTest route, it is cumbersome or at least requires a very unintuitive
approach to pass options to the test scripts: the CMakeLists.txt file
would have to be modified, passing the desired options to _all_ test
scripts, and then the CMake Cache would have to be reconfigured before
running the test in question individually. Unintuitive at best, and
opposite to the niceties IDE users expect.

So let's just pass those options by default: This will not clutter any
output window but the log that is written to a log file will have
information necessary to figure out test failures.

While at it, also imitate what the Windows jobs in Git's CI runs do to
accelerate running the test scripts: pass the `--no-bin-wrappers` and
`--no-chain-lint` options.

This makes the test runs noticeably faster because the `bin-wrappers/`
scripts as well as the `chain-lint` code make heavy use of POSIX shell
scripting, which is really, really slow on Windows due to the need to
emulate POSIX behavior via the MSYS2 runtime. In a test by Eric
Sunshine, it added two minutes (!) just to perform the chain-lint task.

The idea of adding a CMake config option (á la `GIT_TEST_OPTS`) was
considered during the development of this patch, but then dropped: such
a setting is global, across _all_ tests, where e.g. `--run=...` would
not make sense. Users wishing to override these new defaults are better
advised running the test script manually, in a Git Bash, with full
control over the command line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 11:55:28 -07:00
32205655dc fsmonitor OSX: compile with DC_SHA1=YesPlease
As we'll address in subsequent commits the "DC_SHA1=YesPlease" is not
on by default on OSX, instead we use Apple Common Crypto's SHA-1
implementation.

In 6beb2688d3 (fsmonitor: relocate socket file if .git directory is
remote, 2022-10-04) the build was broken with "DC_SHA1=YesPlease" (and
probably other non-"APPLE_COMMON_CRYPTO" SHA-1 backends).

So let's extract the fix for this from [1] to get the build working
again with "DC_SHA1=YesPlease". In addition to the fix in [1] we also
need to replace "SHA_DIGEST_LENGTH" with "GIT_MAX_RAWSZ".

1. https://lore.kernel.org/git/c085fc15b314abcb5e5ca6b4ee5ac54a28327cab.1665326258.git.gitgitgadget@gmail.com/

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 09:34:47 -07:00
d3775de074 Makefile: force -O0 when compiling with SANITIZE=leak
Compiling with -O2 can interact badly with LSan's leak-checker, causing
false positives. Imagine a simplified example like:

  char *str = allocate_some_string();
  if (some_func(str) < 0)
          die("bad str");
  free(str);

The compiler may eliminate "str" as a stack variable, and just leave it
in a register. The register is preserved through most of the function,
including across the call to some_func(), since we'd eventually need to
free it. But because die() is marked with NORETURN, the compiler knows
that it doesn't need to save registers, and just clobbers it.

When die() eventually exits, the leak-checker runs. It looks in
registers and on the stack for any reference to the memory allocated by
str (which would indicate that it's not leaked), but can't find one.  So
it reports it as a leak.

Neither system is wrong, really. The C standard (mostly section 5.1.2.3)
defines an abstract machine, and compilers are allowed to modify the
program as long as the observable behavior of that abstract machine is
unchanged. Looking at random memory values on the stack is undefined
behavior, and not something that the optimizer needs to support. But
there really isn't any other way for a leak checker to work; it
inherently has to do undefined things like scouring memory for pointers.
So the two things are inherently at odds with each other. We can't fix
it by changing the code, because from the perspective of the program
running in an abstract machine, there is no leak.

This has caused real false positives in the past, like:

  - https://lore.kernel.org/git/patch-v3-5.6-9a44204c4c9-20211022T175227Z-avarab@gmail.com/

  - https://lore.kernel.org/git/Yy4eo6500C0ijhk+@coredump.intra.peff.net/

  - https://lore.kernel.org/git/Y07yeEQu+C7AH7oN@nand.local/

This patch makes those go away by forcing -O0 when compiling with LSan.
There are a few ways we could do this:

  - we could just teach the linux-leaks CI job to set -O0. That's the
    smallest change, and means we wouldn't get spurious CI failures. But
    it doesn't help people looking for leaks manually or in a specific
    test (and because the problem depends on the vagaries of the
    optimizer, investigating these can waste a lot of time in
    head-scratching as the problem comes and goes)

  - we default to -O2 in CFLAGS; we could pull this out to a separate
    variable ("-O$(O)" or something) and modify "O" when LSan is in use.
    This is the most flexible, in that you could still build with "make
    O=2 SANITIZE=leak" if you really wanted to (say, for experimenting).
    But it would also fail to kick in if the user defines their own
    CFLAGS variable, which again leads to head-scratching.

  - we can just stick -O0 into BASIC_CFLAGS when enabling LSan. Since
    this comes after the user-provided CFLAGS, it will override any
    previous -O setting found there. This is more foolproof, albeit less
    flexible. If you want to experiment with an optimized leak-checking
    build, you'll have to put "-O2 -fsanitize=leak" into CFLAGS
    manually, rather than using our SANITIZE=leak Makefile magic.

Since the final one is the least likely to break in normal use, this
patch uses that approach.

The resulting build is a little slower, of course, but since LSan is
already about 2x slower than a regular build, another 10% slowdown isn't
that big a deal.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19 08:32:39 -07:00
77a1310e6b Git.pm: add semicolon after catch statement
When attempting to initialize a repository object in an unsafe
directory, a syntax error is reported (Can't use string as a HASH ref
while strict refs in use). Fix this runtime error by adding the required
semicolon after the catch statement.

Without the semicolon, the result of the following line (i.e., the
result of Cwd::abs_path) is passed as the third argument to Error.pm's
catch function. That function expects that its third argument,
$clauses, is a hash reference, and trying to access a string as a hash
reference is a fatal error.

[1] https://lore.kernel.org/git/20221011182607.f1113fff-9333-427d-ba45-741a78fa6040@korelogic.com/

Reported-by: Hank Leininger <hlein@korelogic.com>
Signed-off-by: Michael McClimon <michael@mcclimon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 22:13:04 -07:00
197443e80a repack: don't remove .keep packs with --pack-kept-objects
`git repack` supports a `--pack-kept-objects` flag which more or less
translates to whether or not we pass `--honor-pack-keep` down to `git
pack-objects` when assembling a new pack.

This behavior has existed since ee34a2bead (repack: add
`repack.packKeptObjects` config var, 2014-03-03). In that commit, the
documentation was extended to say:

    [...] Note that we still do not delete `.keep` packs after
    `pack-objects` finishes.

Unfortunately, this is not the case when `--pack-kept-objects` is
combined with a `--geometric` repack. When doing a geometric repack, we
include `.keep` packs when enumerating available packs only when
`pack_kept_objects` is set.

So this all works fine when `--no-pack-kept-objects` (or similar) is
given. Kept packs are excluded from the geometric roll-up, so when we go
to delete redundant packs (with `-d`), no `.keep` packs appear "below
the split" in our geometric progression.

But when `--pack-kept-objects` is given, things can go awry. Namely,
when a kept pack is included in the list of packs tracked by the
`pack_geometry` struct *and* part of the pack roll-up, we will delete
the `.keep` pack when we shouldn't.

Note that this *doesn't* result in object corruption, since the `.keep`
pack's objects are still present in the new pack. But the `.keep` pack
itself is removed, which violates our promise from back in ee34a2bead.

But there's more. Because `repack` computes the geometric roll-up
independently from selecting which packs belong in a MIDX (with
`--write-midx`), this can lead to odd behavior. Consider when a `.keep`
pack appears below the geometric split (ie., its objects will be part of
the new pack we generate).

We'll write a MIDX containing the new pack along with the existing
`.keep` pack. But because the `.keep` pack appears below the geometric
split line, we'll (incorrectly) try to remove it. While this doesn't
corrupt the repository, it does cause us to remove the MIDX we just
wrote, since removing that pack would invalidate the new MIDX.

Funny enough, this behavior became far less noticeable after e4d0c11c04
(repack: respect kept objects with '--write-midx -b', 2021-12-20), which
made `pack_kept_objects` be enabled by default only when we were writing
a non-MIDX bitmap.

But e4d0c11c04 didn't resolve this bug, it just made it harder to notice
unless callers explicitly passed `--pack-kept-objects`.

The solution is to avoid trying to remove `.keep` packs during
`--geometric` repacks, even when they appear below the geometric split
line, which is the approach this patch implements.

Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:29:23 -07:00
55d902cd61 builtin/repack.c: remove redundant pack-based bitmaps
When we write a MIDX bitmap after repacking, it is possible that the
repository would be left in a state with both pack- and multi-pack
reachability bitmaps.

This can occur, for instance, if a pack that was kept (either by having
a .keep file, or during a geometric repack in which it is not rolled up)
has a bitmap file, and the repack wrote a multi-pack index and bitmap.

When loading a reachability bitmap for the repository, the multi-pack
one is always preferred, so the pack-based one is redundant. Let's
remove it unconditionally, even if '-d' isn't passed, since there is no
practical reason to keep both around. The patch below does just that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:26:16 -07:00
4b992f0a24 ll-merge: mark unused parameters in callbacks
We have a generic ll_merge_fn, but not every implementation needs every
parameter. In particular, neither binary nor ext merges care about names
(since they do not generate conflict markers), and most do not need to
look at the ll_merge_driver itself.

Ironically, neither ll_xdl_merge() nor ll_union_merge() needs to have
their driver parameter annotated (even though both are named
drv_unused!).  This is because they may fall back to calling
ll_binary_merge() directly. And even though that function won't look at
it, we still pass it along, and hence it is "used" in the caller.

We could get away with passing NULL, but that's likely more confusing
and brittle than just passing along our own driver. And we have to keep
the driver parameter in all callbacks, since ll_ext_merge() uses it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:04 -07:00
0ada4b9bfe diffcore-pickaxe: mark unused parameters in pickaxe functions
We have a virtual pickaxe_fn for handling -G versus -S pickaxe options.
They need to take the same set of parameters, but of course they care
about different ones (e.g., a regex -G will never use a kwset).

Mark the unused ones to appease -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:04 -07:00
dfd2a23885 convert: mark unused parameter in null stream filter
The null stream filter unsurprisingly does not look at its "filter"
argument, since it just eats bytes. But we can't drop it, since it has
to conform to the same virtual interface that real filters do. Mark the
unused parameter to appease -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:04 -07:00
7506535775 apply: mark unused parameters in noop error/warning routine
We squelch error/warning output by passing a noop handler to
set_error_routine(). We need to tell the compiler that this is intended
so that it doesn't trigger -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:04 -07:00
0cff86990c apply: mark unused parameters in handlers
In parse_git_diff_header(), we have a table-driven parser that maps
strings to handler functions. Not all handlers need all of the
parameters; let's mark the unused ones to appease -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:04 -07:00
7829746a6c date: mark unused parameters in handler functions
When parsing approxidates, we use a table to map special strings (like
"noon") to functions which handle them. Not all functions need the "now"
parameter, as they are not relative (e.g., "yesterday" does, but "pm"
does not). Let's annotate those to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:04 -07:00
1ee3471045 string-list: mark unused callback parameters
String-lists may be used with callbacks for clearing or iteration. These
callbacks need to conform to a particular interface, even though not
every callback needs all of its parameters. Mark the unused ones to make
-Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:04 -07:00
9eb6cdadd1 object-file: mark unused parameters in hash_unknown functions
The 0'th entry of our hash_algos array fills out the virtual methods
with a series of functions which simply BUG(). This is the right thing
to do, since the point is to catch use of an invalid algo parameter, but
we need to annotate them to appease -Wunused-parameters.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:03 -07:00
808e91956d mark unused parameters in trivial compat functions
When a platform feature isn't available or in use, we sometimes
conditionally compile empty or trivial functions to turn these into
noops. We need to annotate their parameters so that -Wunused-parameters
won't complain about them.

Note that there are many more of these in compat/mingw.h, but we'll
leave them for now, as there's some trickery required to get the UNUSED
macro available there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:03 -07:00
827f8305c4 update-index: drop unused argc from do_reupdate()
The parse-options callback for --again soaks up all remaining options by
manipulating the parse_opt_ctx's argc and argv fields. Even though it
has to look at both, the actual parsing happens via the do_reupdate()
helper, which only looks at the argv half (by passing it along to
parse_pathspec). So that helper doesn't need to see argc at all.

Note that the helper does look at "argv + 1" without confirming that
argc is greater than 0. We know this is correct because it is skipping
past the actual "--again" string, which will always be present. However,
to make what's going on more obvious, let's move that "+1" into the
caller, which has the matching "-1" when fixing up the ctx's argc/argv.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:03 -07:00
70aa1d7576 submodule--helper: drop unused argc from module_list_compute()
The module_list_compute() function takes an argc/argv pair, but never
looks at argc. This is OK, as the NULL terminator in argv is sufficient
for our purposes (we feed it to parse_pathspec(), which takes only the
array, not a count).

Note that one of the callers _looks_ like it would be buggy, but isn't:
we pass 0/NULL for argc/argv from module_foreach(), so finding the
terminating NULL in that argv naively would segfault. However,
parse_pathspec() is smart enough to interpret a bare NULL as an empty
argv.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:03 -07:00
0e5a87e042 diffstat_consume(): assert non-zero length
The callback interface for xdiff_emit_line_fn gives us a line/len pair,
but diffstat_consume() never looks at "len". At first glance this seems
like a bug that could cause us to read further than xdiff intends. But
in practice, we read only the first character, and xdiff would never
pass us an empty line.

Let's add a run-time assertion that this is true, which clarifies our
assumption and silences -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 21:24:03 -07:00
9c32cfb49c Sync with v2.38.1 2022-10-17 15:46:09 -07:00
4732897cf0 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 14:57:21 -07:00
8938463745 Merge branch 'pw/remove-rebase-p-test'
Remove outdated test.

* pw/remove-rebase-p-test:
  t3435: remove redundant test case
2022-10-17 14:56:35 -07:00
4050354b14 Merge branch 'rj/branch-edit-desc-unborn'
"git branch --edit-description" on an unborh branch misleadingly
said that no such branch exists, which has been corrected.

* rj/branch-edit-desc-unborn:
  branch: description for non-existent branch errors
2022-10-17 14:56:35 -07:00
a2e618cb0f Merge branch 'jt/promisor-remote-fetch-tweak'
Remove error detection from a function that fetches from promisor
remotes, and make it die when such a fetch fails to bring all the
requested objects, to give an early failure to various operations.

* jt/promisor-remote-fetch-tweak:
  promisor-remote: die upon failing fetch
  promisor-remote: remove a return value
2022-10-17 14:56:35 -07:00
2790ba84b6 Merge branch 'rs/use-fspathncmp'
Code clean-up.

* rs/use-fspathncmp:
  dir: use fspathncmp() in pl_hashmap_cmp()
2022-10-17 14:56:35 -07:00
138c400903 Merge branch 'jc/use-of-uc-in-log-messages'
Clarify that "the sentence after <area>: prefix does not begin with
a capital letter" rule applies only to the commit title.

* jc/use-of-uc-in-log-messages:
  SubmittingPatches: use usual capitalization in the log message body
2022-10-17 14:56:35 -07:00
8e28728cbb Merge branch 'dd/document-runtime-prefix-better'
Update comment in the Makefile about the RUNTIME_PREFIX config knob.

* dd/document-runtime-prefix-better:
  Makefile: clarify runtime relative gitexecdir
2022-10-17 14:56:34 -07:00
44ec91ba4f Merge branch 'ab/unused-annotation'
Compilation fix for ancient compilers.

* ab/unused-annotation:
  git-compat-util.h: GCC deprecated message arg only in GCC 4.5+
2022-10-17 14:56:34 -07:00
aff81ec1c8 Merge branch 'jc/tmp-objdir'
The code to clean temporary object directories (used for
quarantine) tried to remove them inside its signal handler, which
was a no-no.

* jc/tmp-objdir:
  tmp-objdir: skip clean up when handling a signal
2022-10-17 14:56:33 -07:00
272be0db8b Merge branch 'jc/branch-description-unset'
"GIT_EDITOR=: git branch --edit-description" resulted in failure,
which has been corrected.

* jc/branch-description-unset:
  branch: do not fail a no-op --edit-desc
2022-10-17 14:56:33 -07:00
86cc5ee3b7 Merge branch 'jk/cleanup-callback-parameters'
Code clean-up.

* jk/cleanup-callback-parameters:
  attr: drop DEBUG_ATTR code
  commit: avoid writing to global in option callback
  multi-pack-index: avoid writing to global in option callback
  test-submodule: inline resolve_relative_url() function
2022-10-17 14:56:32 -07:00
8646100e05 Merge branch 'rs/bisect-start-leakfix'
Code clean-up that results in plugging a leak.

* rs/bisect-start-leakfix:
  bisect--helper: plug strvec leak
2022-10-17 14:56:32 -07:00
7b8cfe34d9 Merge branch 'ed/fsmonitor-on-networked-macos'
By default, use of fsmonitor on a repository on networked
filesystem is disabled. Add knobs to make it workable on macOS.

* ed/fsmonitor-on-networked-macos:
  fsmonitor: fix leak of warning message
  fsmonitor: add documentation for allowRemote and socketDir options
  fsmonitor: check for compatability before communicating with fsmonitor
  fsmonitor: deal with synthetic firmlinks on macOS
  fsmonitor: avoid socket location check if using hook
  fsmonitor: relocate socket file if .git directory is remote
  fsmonitor: refactor filesystem checks to common interface
2022-10-17 14:56:31 -07:00
9a1925b08f rebase: cleanup action handling
Treating the action as a string is a hang over from the scripted
rebase. The last commit removed the only remaining use of the action
that required a string so lets convert the other action users to use
the existing action enum instead. If we ever need the action name as a
string in the future the action_names array exists exactly for that
purpose.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
6159e7add4 rebase --abort: improve reflog message
When aborting a rebase the reflog message looks like

	rebase (abort): updating HEAD

which is not very informative. Improve the message by mentioning the
branch that we are returning to as we do at the end of a successful
rebase so it looks like.

	rebase (abort): returning to refs/heads/topic

If GIT_REFLOG_ACTION is set in the environment we no longer omit
"(abort)" from the reflog message. We don't omit "(start)" and
"(finish)" when starting and finishing a rebase in that case so we
shouldn't omit "(abort)".

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
be0d29d301 rebase --apply: make reflog messages match rebase --merge
The apply backend creates slightly different reflog messages to the
merge backend when starting or finishing a rebase and when picking
commits. These differences make it harder than it needs to be to parse
the reflog (I have a script that reads the finishing messages from
rebase and it is a pain to have to accommodate two different message
formats). While it is possible to determine the backend used for a
rebase from the reflog messages, the differences are not designed for
that purpose. c2417d3af7 (rebase: drop '-i' from the reflog for
interactive-based rebases, 2020-02-15) removed the clear distinction
between the reflog messages of the two backends without complaint.

As the merge backend is the default it is likely to be the format most
common in existing reflogs. For that reason the apply backend is changed
to format its reflog messages to match the merge backend as closely as
possible. Note that there is still a difference as when committing a
conflict resolution the apply backend will use "(pick)" rather than
"(continue)" because it is not currently possible to change the message
for a single commit.

In addition to c2417d3af7 we also changed the reflog messages in
68aa495b59 (rebase: implement --merge via the interactive machinery,
2018-12-11) and 2ac0d6273f (rebase: change the default backend from "am"
to "merge", 2020-02-15). This commit makes the same change to "git
rebase --apply" that 2ac0d6273f made to "git rebase" without any backend
specific options. As the messages are changed to use an existing format
any scripts that can parse the reflog messages of the default rebase
backend should be unaffected by this change.

There are existing tests for the messages from both backends which are
adjusted to ensure that they do not get out of sync in the future.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
33f2b61ff9 rebase --apply: respect GIT_REFLOG_ACTION
The reflog messages when finishing a rebase hard code "rebase" rather
than using GIT_REFLOG_ACTION.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
1f2d5dc4d2 rebase --merge: fix reflog message after skipping
The reflog message for every pick after running "rebase --skip" looks
like

	rebase (skip) (pick): commit subject line

Fix this by not appending " (skip)" to the reflog action.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
da1d63363f rebase --merge: fix reflog when continuing
The reflog message for a conflict resolution committed by "rebase
--continue" looks like

	rebase (continue): commit subject line

Unfortunately the reflog message each subsequent pick look like

	rebase (continue) (pick): commit subject line

Fix this by setting the reflog message for "rebase --continue" in
sequencer_continue() so it does not affect subsequent commits. This
introduces a memory leak similar to the one leaking GIT_REFLOG_ACTION
in pick_commits(). Both of these will be fixed in a future series that
stops the sequencer calling setenv().

If we fail to commit the staged changes then we error out so
GIT_REFLOG_ACTION does not need to be reset in that case.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
4e5e1b4b61 t3406: rework rebase reflog tests
Refactor the tests in preparation for adding more tests in the next
few commits. The reworked tests use the same function for testing both
the "merge" and "apply" backends. The test coverage for the "apply"
backend now includes setting GIT_REFLOG_ACTION.

Note that rebasing the "conflicts" branch does not create any
conflicts yet. A commit to do that will be added in the next commit
and the diff ends up smaller if we have don't rename the branch when
it is added.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
57a1498592 rebase --apply: remove duplicated code
Use move_to_original_branch() when reattaching HEAD after a fast-forward
rather than open coding a copy of that code. move_to_original_branch()
does not call reset_head() if head_name is NULL but there should be no
user visible changes even though we currently call reset_head() in that
case. The reason for this is that the reset_head() call does not add a
message to the reflog because we're not changing the commit that HEAD
points to and so lock_ref_for_update() elides the update. When head_name
is not NULL then reset_head() behaves like "git symbolic-ref" and so the
reflog is updated.

Note that the removal of "strbuf_release(&msg)" is safe as there is an
identical call just above this hunk which can be seen by viewing the
diff with -U6.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 12:55:03 -07:00
a524c627a4 Merge branch 'pw/rebase-keep-base-fixes' into pw/rebase-reflog-fixes
* pw/rebase-keep-base-fixes:
  rebase --keep-base: imply --no-fork-point
  rebase --keep-base: imply --reapply-cherry-picks
  rebase: factor out branch_base calculation
  rebase: rename merge_base to branch_base
  rebase: store orig_head as a commit
  rebase: be stricter when reading state files containing oids
  t3416: set $EDITOR in subshell
  t3416: tighten two tests
2022-10-17 12:54:27 -07:00
aa1df8146d rebase --keep-base: imply --no-fork-point
Given the name of the option it is confusing if --keep-base actually
changes the base of the branch without --fork-point being explicitly
given on the command line.

The combination of --keep-base with an explicit --fork-point is still
supported even though --fork-point means we do not keep the same base
if the upstream branch has been rewound.  We do this in case anyone is
relying on this behavior which is tested in t3431[1]

[1] https://lore.kernel.org/git/20200715032014.GA10818@generichostname/

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:53:03 -07:00
ce5238a690 rebase --keep-base: imply --reapply-cherry-picks
As --keep-base does not rebase the branch it is confusing if it
removes commits that have been cherry-picked to the upstream branch.
As --reapply-cherry-picks is not supported by the "apply" backend this
commit ensures that cherry-picks are reapplied by forcing the upstream
commit to match the onto commit unless --no-reapply-cherry-picks is
given.

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:53:03 -07:00
d42c9ffa0f rebase: factor out branch_base calculation
Separate out calculating the merge base between 'onto' and 'HEAD' from
the check for whether we can fast-forward or not. This means we can skip
the fast-forward checks when the rebase is forced and avoid calculating
the merge-base between 'HEAD' and 'onto' when --keep-base is given.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:53:03 -07:00
a77060218d rebase: rename merge_base to branch_base
merge_base is not a very descriptive name, the variable always holds
the merge-base of 'branch' and 'onto' which is commit at the base of
the branch being rebased so rename it to branch_base.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:53:03 -07:00
f21becdd94 rebase: store orig_head as a commit
Using a struct commit rather than a struct oid to hold orig_head means
that we error out straight away if the branch being rebased does not
point to a commit. It also simplifies the code that handles finding
the merge base and fork point as it no longer has to convert from an
oid to a commit.

To avoid changing the behavior of "git rebase <upstream> <branch>" we
keep the existing call to read_ref() and use lookup_commit_object()
on the oid returned by that rather than calling
lookup_commit_reference_by_name() which applies the ref dwim rules to
its argument.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:53:03 -07:00
b8dbfd030c rebase: be stricter when reading state files containing oids
The state files for 'onto' and 'orig_head' should contain a full hex
oid, change the reading functions from get_oid() to get_oid_hex() to
reflect this. They should also name commits and not tags so add and use
a function that looks up a commit from an oid like
lookup_commit_reference() but without dereferencing tags.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:53:00 -07:00
05ec41855d t3416: set $EDITOR in subshell
As $EDITOR is exported, setting it in one test affects all subsequent
tests. Avoid this by always setting it in a subshell. Also remove a
couple of unnecessary call to set_fake_editor where the editor does
not change the todo list.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:45:09 -07:00
96601a26b4 t3416: tighten two tests
Add a check for the correct error message to the tests that check we
require a single merge base so we can be sure the rebase failed for
the correct reason. Also rename the tests to reflect what they are
testing.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17 11:45:09 -07:00
8d2863e4ed t1002: modernize outdated conditional
Tests in this script use an unusual and hard to reason about
conditional construct

    if expression; then false; else :; fi

Change them to use more idiomatic construct:

    ! expression

Cc: Christian Couder  <christian.couder@gmail.com>
Cc: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Nsengiyumva  Wilberforce <nsengiyumvawilberforce@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-14 09:16:50 -07:00
e9c3839944 pack-bitmap-write.c: instrument number of reused bitmaps
When debugging bitmap generation performance, it is useful to know how
many bitmaps were generated from scratch, and how many were the result
of permuting the bit-order of an existing bitmap.

Keep track of the latter, and emit the count as a trace2_data line to
aid in debugging.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 13:35:08 -07:00
2dcff52524 midx.c: instrument MIDX and bitmap generation with trace2 regions
When debugging MIDX and MIDX-bitmap related issues, it is useful to
figure out where Git is spending its time.

GitHub has been using the below trace2 regions to instrument various
components of generating a MIDX itself, as well time spent preparing to
build a MIDX bitmap.

These are limited to instrumenting the following functions:

  - midx.c::find_commits_for_midx_bitmap()
  - midx.c::midx_pack_order()
  - midx.c::prepare_midx_packing_data()
  - midx.c::write_midx_bitmap()
  - midx.c::write_midx_internal()
  - midx.c::write_midx_reverse_index()

to start and end with a trace2_region_enter() and trace2_region_leave(),
respectively.

The category for all of these is "midx", which matches the existing
convention. The region description matches the name of the function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 13:35:07 -07:00
1dc4f1ef0d midx.c: consider annotated tags during bitmap selection
When generating a multi-pack bitmap without a `--refs-snapshot` (e.g.,
by running `git multi-pack-index write --bitmap` directly), we determine
the set of bitmap-able commits by enumerating each reference, and adding
the referrent as the tip of a reachability traversal when it appears
somewhere in the MIDX. (Any commit we encounter during the reachability
traversal then becomes a candidate for bitmap selection).

But we incorrectly avoid peeling the object at the tip of each
reference. So if we see some reference that points at an annotated tag
(which in turn points through zero or more additional annotated tags at
a commit), that we will not add it as a tip for the reachability
traversal. This means that if some commit C is only referenced through
one or more annotated tag(s), then C won't become a bitmap candidate.

Correct this by peeling the reference tips as we enumerate them to
ensure that we consider commits which are the targets of annotated tags,
in addition to commits which are referenced directly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 13:35:05 -07:00
a8437f3cb1 midx.c: fix whitespace typo
This was unintentionally introduced via 893b563505 (midx: inline
nth_midxed_pack_entry(), 2021-09-11) where "struct repository *r"
became "struct repository * r".

The latter does not adhere to our usual style conventions, so fix that
up to look more like our usual declarations.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 13:35:03 -07:00
ecec57b3c9 config: respect includes in protected config
Protected config is implemented by reading a fixed set of paths,
which ignores config [include]-s. Replace this implementation with a
call to config_with_options(), which handles [include]-s and saves us
from duplicating the logic of 1) identifying which paths to read and 2)
reading command line config.

As a result, git_configset_add_parameters() is unused, so remove it. It
was introduced alongside protected config in 5b3c650777 (config: learn
`git_protected_config()`, 2022-07-14) as a way to handle command line
config.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 11:39:46 -07:00
a0343f3002 tests: assert consistent whitespace in -h output
Add a test for the *.txt and *.c output assertions which asserts that
for "-h" lines that aren't the "usage: " or " or: " lines they start
with the same amount of whitespace. This ensures that we won't have
buggy output like:

   [...]
   or: git tag [-n[<num>]]
               [...]
       [--create-reflog] [...]

Which should instead be like this, i.e. the options lines should be
aligned:

   [...]
   or: git tag [-n[<num>]]
               [...]
               [--create-reflog] [...]

It would be better to be able to use "test_cmp" here, i.e. to
construct the output we expect, and compare it against the actual
output.

For most built-in commands this would be rather straightforward. In
"t0450-txt-doc-vs-help.sh" we already compute the whitespace that a
"git-$builtin" needs, and strip away "usage: " or " or: " from the
start of lines. The problem is:

 * For commands that implement subcommands, such as "git bundle", we
   don't know whether e.g. "git bundle create" is the subcommand
   "create", or the argument "create" to "bundle" for the purposes of
   alignment.

   We *do* have that information from the *.txt version, since the
   part within the ''-quotes should be the command & subcommand, but
   that isn't consistent (e.g. see "git bundle" and "git
   commit-graph", only the latter is correct), and parsing that out
   would be non-trivial.

 * If we were to make this stricter we have various
   non-parse_options() users (e.g. "git diff-tree") that don't have the
   nicely aligned output which we've had since
   4631cfc20b (parse-options: properly align continued usage output,
   2021-09-21).

So rather than make perfect the enemy of the good let's assert that
for those lines that are indented they should all use the same
indentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:58 -07:00
c39fffc1c9 tests: start asserting that *.txt SYNOPSIS matches -h output
There's been a lot of incremental effort to make the SYNOPSIS output
in our documentation consistent with the -h output,
e.g. cbe485298b (git reflog [expire|delete]: make -h output
consistent with SYNOPSIS, 2022-03-17) is one recent example, but that
effort has been an uphill battle due to the lack of regression
testing.

This adds such regression testing, we can parse out the SYNOPSIS
output with "sed", and it turns out it's relatively easy to normalize
it and the "-h" output to match on another.

We now ensure that we won't have regressions when it comes to the list
of commands in "expect_help_to_match_txt" below, and in subsequent
commits we'll make more of them consistent.

The naïve parser here gets quite a few things wrong, but it doesn't
need to be perfect, just good enough that we can compare /some/ of
this help output. There's no cases where the output would match except
for the parser's stupidity, it's all cases of e.g. comparing the *.txt
to non-parse_options() output.

Since that output is wildly different than the *.txt anyway let's
leave this for now, we can fix the parser some other time, or it won't
become necessary as we'll e.g. convert more things to using
parse_options().

Having a special-case for "merge-tree"'s 1f0c3a29da (merge-tree:
implement real merges, 2022-06-18) is a bit ugly, but preferred to
blessing that " (deprecated)" pattern for other commands. We'd
probably want to add some other way of marking deprecated commands in
the SYNOPSIS syntax. Syntactically 1f0c3a29da3's way of doing it is
indistinguishable from the command taking an optional literal
"deprecated" string as an argument.

Some of the issues that are left:

 * "git show -h", "git whatchanged -h" and "git reflog --oneline -h"
   all showing "git log" and "git show" usage output. I.e. the
   "builtin_log_usage" in builtin/log.c doesn't take into account what
   command we're running.

 * Commands which implement subcommands such as like
   "multi-pack-index", "notes", "remote" etc. having their subcommands
   in a very different order in the *.txt and *.c. Fixing it would
   require some verbose diffs, so it's been left alone for now.

 * Commands such as "format-patch" have a very long argument list in
   the *.txt, but just "[<options>]" in the *.c.

   What to do about these has been left out of this series, except to
   the extent that preceding commits changed "[<options>]" (or
   equivalent) to the list of options in cases where that list of
   options was tiny, or we clearly meant to exhaustively list the
   options in both *.txt and *.c.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:58 -07:00
97f03a5628 doc txt & -h consistency: make "worktree" consistent
Make the "worktree" -h output consistent with the *.txt version.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:58 -07:00
0afd556b2e worktree: define subcommand -h in terms of command -h
Avoid repeating the "-h" output for the "git worktree" command, and
instead define the usage of each subcommand with macros, so that the
"-h" output for the command itself can re-use those definitions. See
[1], [2] and [3] for prior art using the same pattern.

1. b25b727494 (builtin/multi-pack-index.c: define common usage with a
   macro, 2021-03-30)
2. 8757b35d44 (commit-graph: define common usage with a macro,
   2021-08-23)
3. 1e91d3faf6 (reflog: move "usage" variables and use macros,
   2022-03-17)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:58 -07:00
4618d2ca82 reflog doc: list real subcommands up-front
Change the "git reflog" documentation to exhaustively list the
subcommands it accepts in the SYNOPSIS, as opposed to leaving that for
a "[verse]" in the DESCRIPTION section. This documentation style was
added in cf39f54efc (git reflog show, 2007-02-08), but isn't how
other commands which take subcommands are documented.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:58 -07:00
423be1f83c doc txt & -h consistency: make "commit" consistent
Make the "-h" output of "git commit" consistent with the *.txt version
by exhaustively listing the options that it takes.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:58 -07:00
320ee66de8 doc txt & -h consistency: make "diff-tree" consistent
Make the "diff-tree -h" output consistent with the *.txt version.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
463ea0cfae doc txt & -h consistency: use "[<label>...]" for "zero or more"
Correct uses of "<label>..." where we really meant to say
"[<label>...]", i.e. the command in question taken an optional set of
"<label>". As the CodingGuidelines notes "[o]ptional parts [should be]
enclosed in square brackets".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
df8738116f doc txt & -h consistency: make "annotate" consistent
The cmd_blame() already detected whether it was processing "blame" or
"annotate", but it didn't adjust its usage output accordingly. Let's
do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
951ec747d4 doc txt & -h consistency: make "stash" consistent
Amend both the -h output and *.txt to match one another. In this case
the *.txt didn't list the "save" subcommand, and the "-h" was
similarly missing some commands.

Let's also convert the *.c code to use a macro definition, similar to
that used in preceding commits. This avoids duplication.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
d9054a19ed doc txt & -h consistency: add missing options
Change those built-in commands that were attempting to exhaustively
list the options in the "-h" output to actually do so, and always
have *.txt documentation know about the exhaustive list of options.

Let's also fix the documentation and -h output for those built-in
commands where the *.txt and -h output was a mismatch of missing
options on both sides.

In the case of "interpret-trailers" fixing the missing options reveals
that the *.txt version was implicitly claiming that the command had
two operating modes, which a look at the -h version (and studying the
documentation) will show is not the case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
3e4ebe3a40 doc txt & -h consistency: use "git foo" form, not "git-foo"
Use the "git cmd" form instead of "git-cmd" for both "git
receive-pack" and "git credential-cache--daemon".

For "git-receive-pack" we do have a binary with that name, even when
installed with SKIP_DASHED_BUILT_INS=YesPlease, but for the purposes
of the SYNOPSIS let's use the "git cmd" form like everywhere else. It
can be invoked like that (and our tests do so), the parts of our
documentation that explain when you need to use the dashed form do so,
and use it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
a5748670e3 doc txt & -h consistency: make "bundle" consistent
Amend the -h output to match that of the *.txt output, the differences
were fairly small. In the case of "[<options>]" we only have a few of
them, so let's exhaustively list them as in the *.txt.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
e8eeda1f9e doc txt & -h consistency: make "read-tree" consistent
The C version was right to use "()" in place of "[]" around the option
listing, let's update the *.txt version accordingly, and furthermore
list the *.c options in the same order as the *.txt.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:56 -07:00
d7756184c9 doc txt & -h consistency: make "rerere" consistent
For "rerere" say "pathspec" consistently, and list the subcommands in
the order that they're discussed in the "COMMANDS" section of the
documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:56 -07:00
8c9e292dc0 doc txt & -h consistency: add missing options and labels
Fix various issues of SYNOPSIS and -h output syntax where:

 * Options such as --force were missing entirely
 * ...or the short option, such as -f

 * We said "opts" or "options", but could instead enumerate
   the (small) set of supported options

 * Options that were missing entirely (ls-remote's --sort=<key>)

   As we can specify "--sort" multiple times (it's backed by a
   string-list" it should really be "[(--sort=<key>)...]", which is
   what "git for-each-ref" lists it as, but let's leave that issue for
   a subsequent cleanup, and stop at making these consistent. Other
   "ref-filter.h" users share the same issue, e.g. "git-branch.txt".

 * For "verify-tag" and "verify-commit" we were missing the "--raw"
   option.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:56 -07:00
8f5f2f646a doc txt & -h consistency: make output order consistent
Fix cases where the SYNOPSIS and -h output was presented in a
different order.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:56 -07:00
c08cfc395f doc txt & -h consistency: add or fix optional "--" syntax
Add the "[--]" for those cases where the *.txt and -h were
inconsistent, or where we incorrectly stated in one but not the other
that the "--" was mandatory.

In the case of "rev-list" both sides were wrong, as we we don't
require one or more paths if "--" is used, e.g. this is OK:

	git rev-list HEAD --

That part of this change is not a "doc txt & -h consistency" change,
as we're changing both versions, doing so here makes both sides
consistent.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:56 -07:00
f6a8ef0700 doc txt & -h consistency: fix mismatching labels
Fix various inconsistencies between command SYNOPSIS and the
corresponding -h output where our translatable labels didn't match
up.

In some cases we need to adjust the prose that follows the SYNOPSIS
accordingly, as it refers back to the changed label.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:56 -07:00
a0c3244796 doc SYNOPSIS & -h: use "-" to separate words in labels, not "_"
Change "builtin/credential-cache--daemon.c" to use "<socket-path>" not
"<socket_path>" in a placeholder label, almost all of our
documentation uses this form.

This is now consistent with the "If a placeholder has multiple words,
they are separated by dashes" guideline added in
9c9b4f2f8b (standardize usage info string format, 2015-01-13), let's
add a now-passing test to assert that that's the case.

To do this we need to introduce a very sed-powered parser to extract
the SYNOPSIS from the *.txt, and handle not all commands with "-h"
having a corresponding *.txt (e.g. "bisect--helper"). We'll still want
to handle syntax edge cases in the *.txt in subsequent commits for
other checks, but let's do that then.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:56 -07:00
23a9235d52 doc txt & -h consistency: use "<options>", not "<options>..."
It's arguably more correct to say "[<option>...]" than either of these
forms, but the vast majority of our documentation uses the
"[<options>]" form to indicate an arbitrary number of options, let's
do the same in these cases, which were the odd ones out.

In the case of "mv" and "sparse-checkout" let's add the missing "[]"
to indicate that these are optional.

In the case of "t/helper/test-proc-receive.c" there is no *.txt
version, making it the only hunk in this commit that's not a "doc txt
& -h consistency" change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:55 -07:00
007512152e stash doc SYNOPSIS & -h: correct padding around "[]()"
The whitespace padding of alternatives should be of the form "[-f |
--force]" not "[-f|--force]". Likewise we should not have padding
before the first option, so "(--all | <pack-filename>...)" is correct,
not "( --all | <pack-filename>... )".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:55 -07:00
e2f4e7e8c0 doc txt & -h consistency: correct padding around "[]()"
The whitespace padding of alternatives should be of the form "[-f |
--force]" not "[-f|--force]". Likewise we should not have padding
before the first option, so "(--all | <pack-filename>...)" is correct,
not "( --all | <pack-filename>... )".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:55 -07:00
8bc6f92486 doc txt & -h consistency: balance unbalanced "[" and "]"
Fix a "-h" output syntax issue introduced when "--diagnose" was added
in aac0e8ffee (builtin/bugreport.c: create '--diagnose' option,
2022-08-12): We need to close the "[" we opened. The
corresponding *.txt change did not have the same issue.

The "help -h" output then had one "]" too many, which is an issue
introduced in b40845293b (help: correct the usage string in -h and
documentation, 2021-09-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:55 -07:00
dfc833332a doc txt & -h consistency: add "-z" to cat-file "-h"
Fix a bug in db9d67f2e9 (builtin/cat-file.c: support NUL-delimited
input with `-z`, 2022-07-22), before that change the SYNOPSIS and "-h"
output were the same, but not afterwards.

That change followed a similar earlier divergence in
473fa2df08 (Documentation: add --batch-command to cat-file synopsis,
2022-04-07). Subsequent commits will fix this sort of thing more
systematically, but let's fix this one as a one-off.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:55 -07:00
d4056dba1f doc txt & -h consistency: fix incorrect alternates syntax
Fix the incorrect "[-o | --option <argument>]" syntax, which should be
"[(-o | --option) <argument>]", we were previously claiming that only
the long option accepted the "<argument>", which isn't what we meant.

This syntax issue for "bugreport" originated in
238b439d69 (bugreport: add tool to generate debugging info,
2020-04-16), and for "diagnose" in 6783fd3cef (builtin/diagnose.c:
create 'git diagnose' builtin, 2022-08-12), which copied and adjusted
"bugreport" documentation and code.

In the case of "Documentation/git-stash.txt" and "builtin/stash.c"
this is not a "doc txt & -h consistency" change, as we're changing
both versions, doing so here makes a subsequent change smaller.

In that case fix the incorrect "[-o | --option <argument>]" syntax,
which should be "[(-o | --option) <argument>]", we were previously
claiming that only the long option accepted the "<argument>", which
isn't what we meant.

The "stash" issue has been with us in both the "-h" and *.txt versions
since bd514cada4 (stash: introduce 'git stash store', 2013-06-15).

We could claim that this isn't a syntax issue if a "vertical bar binds
tighter than option and its argument", but such a rule would change
e.g. this "cat-file" SYNOPSIS example to mean something we don't:

	... [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]

We have various other examples where the post-image here is already
used, e.g. for "format-patch" ("-o"), "grep" ("-m"),
"submodule" ("set-branch -b") etc.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:55 -07:00
5af8b61cc3 doc txt & -h consistency: word-wrap
Change the documentation and -h output for those built-in commands
where both the -h output and *.txt were lacking in word-wrapping.

There are many more built-ins that could use this treatment, this
change is narrowed to those where this whitespace change is needed to
make the -h and *.txt consistent in the end.

In the case of "Documentation/git-hash-object.txt" and
"builtin/hash-object.c" this is not a "doc txt & -h consistency"
change, as we're changing both versions, doing so here makes a
subsequent change smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:55 -07:00
acf7828e38 built-ins: consistently add "\n" between "usage" and options
Change commands in the "diff" family and "rev-list" to separate the
usage information and option listing with an empty line.

In the case of "git diff -h" we did this already (but let's use a
consistent "\n" pattern there), for the rest these are now consistent
with how the parse_options() API would emit usage.

As we'll see in a subsequent commit this also helps to make the "git
<cmd> -h" output more easily machine-readable, as we can assume that
the usage information is separated from the options by an empty line.

Note that "COMMON_DIFF_OPTIONS_HELP" starts with a "\n", so the
seeming omission of a "\n" here is correct, the second one is provided
by the macro.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:54 -07:00
6df5494f73 doc SYNOPSIS: consistently use ' for commands
Most of our commands use ''-quotation only for the name of the command
itself, and not its (optional) arguments. Let's do the same for these.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:54 -07:00
b2ca7e417e doc SYNOPSIS: don't use ' for subcommands
Almost all of our documentation doesn't use "'" syntax for
subcommands, but these did, let's make them consistent with the
rest.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:54 -07:00
f587d16471 bundle: define subcommand -h in terms of command -h
Avoid repeating the "-h" output for the "git bundle" command, and
instead define the usage of each subcommand with macros, so that the
"-h" output for the command itself can re-use those definitions. See
[1], [2] and [3] for prior art using the same pattern.

1. b25b727494 (builtin/multi-pack-index.c: define common usage with a
   macro, 2021-03-30)
2. 8757b35d44 (commit-graph: define common usage with a macro,
   2021-08-23)
3. 1e91d3faf6 (reflog: move "usage" variables and use macros,
   2022-03-17)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:54 -07:00
968a04e447 builtin/bundle.c: indent with tabs
Fix indentation issues introduced with 73c3253d75 (bundle: framework
for options before bundle file, 2019-11-10), and carried forward in
some subsequent commits.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:54 -07:00
6584fcc5c8 CodingGuidelines: update and clarify command-line conventions
Edit the section which explains how to create a good SYNOPSIS section
for clarity and accuracy, it was mostly introduced in
c455bd8950 (CodingGuidelines: Add a section on writing documentation,
2010-11-04):

 * Change "extra" example to "file", which now naturally follows from
   previous "<file>..." example (one or more) to "[<file>...]" (zero or
   more).

 * Explain how we prefer spacing around "[]()" tokens and "|"
   alternatives, this is not a new policy, but just codifies what's
   already the pattern in the most wide use in the documentation.

Having a space around " | " for flags, but not for flag values is
inconsistent, but this style guide codifies existing
patterns. Grepping shows that we don't have any instance matching the
second "Don't" example:

	git grep -E -h -o '=\([^)]+\)' -- builtin Documentation/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:54 -07:00
e5e6667b48 tests: assert *.txt SYNOPSIS and -h output
Add a test to assert basic compliance with the CodingGuidelines in the
SYNOPSIS and builtin -h output. For now we only assert that the "-h"
output doesn't have "\t" characters, as a very basic syntax check.

Subsequent commits will expand on the checks here as various issues
are fixed, but let's first add the test scaffolding.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:54 -07:00
0b0ab95f17 run-command.c: remove "max_processes", add "const" to signal() handler
As with the *_fn members removed in a preceding commit, let's not copy
the "processes" member of the "struct run_process_parallel_opts" over
to the "struct parallel_processes".

In this case we need the number of processes for the kill_children()
function, which will be called from a signal handler. To do that
adjust this code added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) so that we use a
dedicated "struct parallel_processes_for_signal" for passing data to
the signal handler, in addition to the "struct parallel_process" it'll
now have access to our "opts" variable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:42 -07:00
d1610eef3f run-command.c: pass "opts" further down, and use "opts->processes"
Continue the migration away from the "max_processes" member of "struct
parallel_processes" to the "processes" member of the "struct
run_process_parallel_opts", in this case we needed to pass the "opts"
further down into pp_cleanup() and pp_buffer_stderr().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:42 -07:00
9f3df6c048 run-command.c: use "opts->processes", not "pp->max_processes"
Neither the "processes" nor "max_processes" members ever change after
their initialization, and they're always equivalent, but some existing
code used "pp->max_processes" when we were already passing the "opts"
to the function, let's use the "opts" directly instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:42 -07:00
2aa8d2259f run-command.c: don't copy "data" to "struct parallel_processes"
As with the *_fn members removed in a preceding commit, let's not copy
the "data" member of the "struct run_process_parallel_opts" over to
the "struct parallel_processes". Now that we're passing the "opts"
down there's no reason to do so.

This makes the code easier to follow, as we have a "const" attribute
on the "struct run_process_parallel_opts", but not "struct
parallel_processes". We do not alter the "ungroup" argument, so
storing it in the non-const structure would make this control flow
less obvious.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:42 -07:00
357f8e6e18 run-command.c: don't copy "ungroup" to "struct parallel_processes"
As with the *_fn members removed in the preceding commit, let's not
copy the "ungroup" member of the "struct run_process_parallel_opts"
over to the "struct parallel_processes". Now that we're passing the
"opts" down there's no reason to do so.

This makes the code easier to follow, as we have a "const" attribute
on the "struct run_process_parallel_opts", but not "struct
parallel_processes". We do not alter the "ungroup" argument, so
storing it in the non-const structure would make this control flow
less obvious.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
fa93951d79 run-command.c: don't copy *_fn to "struct parallel_processes"
The only remaining reason for copying the callbacks in the "struct
run_process_parallel_opts" over to the "struct parallel_processes" was
to avoid two if/else statements in case the "start_failure" and
"task_finished" callbacks were NULL.

Let's handle those cases in pp_start_one() and pp_collect_finished()
instead, and avoid the default_* stub functions, and the need to copy
this data around.

Organizing the code like this made more sense before the "struct
run_parallel_parallel_opts" existed, as we'd have needed to pass each
of these as a separate parameter.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
e39c9de860 run-command.c: make "struct parallel_processes" const if possible
Add a "const" to two "struct parallel_processes" parameters where
we're not modifying anything in "pp". For kill_children() we'll call
it from both the signal handler, and from run_processes_parallel()
itself. Adding a "const" there makes it clear that we don't need to
modify any state when killing our children.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
36d69bf77e run-command API: move *_tr2() users to "run_processes_parallel()"
Have the users of the "run_processes_parallel_tr2()" function use
"run_processes_parallel()" instead. In preceding commits the latter
was refactored to take a "struct run_process_parallel_opts" argument,
since the only reason for "run_processes_parallel_tr2()" to exist was
to take arguments that are now a part of that struct we can do away
with it.

See ee4512ed48 (trace2: create new combined trace facility,
2019-02-22) for the addition of the "*_tr2()" variant of the function,
it was used by every caller except "t/helper/test-run-command.c"..

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
6e5ba0bae4 run-command API: have run_process_parallel() take an "opts" struct
As noted in fd3aaf53f7 (run-command: add an "ungroup" option to
run_process_parallel(), 2022-06-07) which added the "ungroup" passing
it to "run_process_parallel()" via the global
"run_processes_parallel_ungroup" variable was a compromise to get the
smallest possible regression fix for "maint" at the time.

This follow-up to that is a start at passing that parameter and others
via a new "struct run_process_parallel_opts", as the earlier
version[1] of what became fd3aaf53f7 did.

Since we need to change all of the occurrences of "n" to
"opt->SOMETHING" let's take the opportunity and rename the terse "n"
to "processes". We could also have picked "max_processes", "jobs",
"threads" etc., but as the API is named "run_processes_parallel()"
let's go with "processes".

Since the new "run_processes_parallel()" function is able to take an
optional "tr2_category" and "tr2_label" via the struct we can at this
point migrate all of the users of "run_processes_parallel_tr2()" over
to it.

But let's not migrate all the API users yet, only the two users that
passed the "ungroup" parameter via the
"run_processes_parallel_ungroup" global

1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
c333e6f3a8 run-command.c: use designated init for pp_init(), add "const"
Use a designated initializer to initialize those parts of pp_init()
that don't need any conditionals for their initialization, this sets
us on a path to pp_init() itself into mostly a validation and
allocation function.

Since we're doing that we can add "const" to some of the members of
the "struct parallel_processes", which helps to clarify and
self-document this code. E.g. we never alter the "data" pointer we
pass t user callbacks, nor (after the preceding change to stop
invoking online_cpus()) do we change "max_processes", the same goes
for the "ungroup" option.

We can also do away with a call to strbuf_init() in favor of macro
initialization, and to rely on other fields being NULL'd or zero'd.

Making members of a struct "const" rather that the pointer to the
struct itself is usually painful, as e.g. it precludes us from
incrementally setting up the structure. In this case we only set it up
with the assignment in run_process_parallel() and pp_init(), and don't
pass the struct pointer around as "const", so making individual
members "const" is worth the potential hassle for extra safety.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
51243f9f0f run-command API: don't fall back on online_cpus()
When a "jobs = 0" is passed let's BUG() out rather than fall back on
online_cpus(). The default behavior was added when this API was
implemented in c553c72eed (run-command: add an asynchronous parallel
child processor, 2015-12-15).

Most of our code in-tree that scales up to "online_cpus()" by default
calls that function by itself. Keeping this default behavior just for
the sake of two callers means that we'd need to maintain this one spot
where we're second-guessing the config passed down into pp_init().

The preceding commit has an overview of the API callers that passed
"jobs = 0". There were only two of them (actually three, but they
resolved to these two config parsing codepaths).

The "fetch.parallel" caller already had a test for the
"fetch.parallel=0" case added in 0353c68818 (fetch: do not run a
redundant fetch from submodule, 2022-05-16), but there was no such
test for "submodule.fetchJobs". Let's add one here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
6a48b428b4 run-command API: make "n" parameter a "size_t"
Make the "n" variable added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) a "size_t". As
we'll see in a subsequent commit we do pass "0" here, but never "jobs
< 0".

We could have made it an "unsigned int", but as we're having to change
this let's not leave another case in the codebase where a size_t and
"unsigned int" size differ on some platforms. In this case it's likely
to never matter, but it's easier to not need to worry about it.

After this and preceding changes:

	make run-command.o DEVOPTS=extra-all CFLAGS=-Wno-unused-parameter

Only has one (and new) -Wsigned-compare warning relevant to a
comparison about our "n" or "{nr,max}_processes": About using our
"n" (size_t) in the same expression as online_cpus() (int). A
subsequent commit will adjust & deal with online_cpus() and that
warning.

The only users of the "n" parameter are:

 * builtin/fetch.c: defaults to 1, reads from the "fetch.parallel"
   config. As seen in the code that parses the config added in
   d54dea77db (fetch: let --jobs=<n> parallelize --multiple, too,
   2019-10-05) will die if the git_config_int() return value is < 0.

   It will however pass us n = 0, as we'll see in a subsequent commit.

 * submodule.c: defaults to 1, reads from "submodule.fetchJobs"
   config. Read via code originally added in a028a1930c (fetching
   submodules: respect `submodule.fetchJobs` config option, 2016-02-29).

   It now piggy-backs on the the submodule.fetchJobs code and
   validation added in f20e7c1ea2 (submodule: remove
   submodule.fetchjobs from submodule-config parsing, 2017-08-02).

   Like builtin/fetch.c it will die if the git_config_int() return
   value is < 0, but like builtin/fetch.c it will pass us n = 0.

 * builtin/submodule--helper.c: defaults to 1. Read via code
   originally added in 2335b870fa (submodule update: expose parallelism
   to the user, 2016-02-29).

   Since f20e7c1ea2 (submodule: remove submodule.fetchjobs from
   submodule-config parsing, 2017-08-02) it shares a config parser and
   semantics with the submodule.c caller.

 * hook.c: hardcoded to 1, see 96e7225b31 (hook: add 'run'
   subcommand, 2021-12-22).

 * t/helper/test-run-command.c: can be -1 after parsing the arguments,
   but will then be overridden to online_cpus() before passing it to
   this API. See be5d88e112 (test-tool run-command: learn to run (parts
   of) the testsuite, 2019-10-04).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
910e2b372f run-command tests: use "return", not "exit"
Change the "run-command" test helper to "return" instead of calling
"exit", see 338abb0f04 (builtins + test helpers: use return instead
of exit() in cmd_*, 2021-06-08)

Because we'd previously gotten past the SANITIZE=leak check by using
exit() here we need to move to "goto cleanup" pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
7dd5762d9f run-command API: have "run_processes_parallel{,_tr2}()" return void
Change the "run_processes_parallel{,_tr2}()" functions to return void,
instead of int. Ever since c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) they have
unconditionally returned 0.

To get a "real" return value out of this function the caller needs to
get it via the "task_finished_fn" callback, see the example in hook.c
added in 96e7225b31 (hook: add 'run' subcommand, 2021-12-22).

So the "result = " and "if (!result)" code added to "builtin/fetch.c"
d54dea77db (fetch: let --jobs=<n> parallelize --multiple, too,
2019-10-05) has always been redundant, we always took that "if"
path. Likewise the "ret =" in "t/helper/test-run-command.c" added in
be5d88e112 (test-tool run-command: learn to run (parts of) the
testsuite, 2019-10-04) wasn't used, instead we got the return value
from the "if (suite.failed.nr > 0)" block seen in the context.

Subsequent commits will alter this API interface, getting rid of this
always-zero return value makes it easier to understand those changes.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
a083f94c21 run-command test helper: use "else if" pattern
Adjust the cmd__run_command() to use an "if/else if" chain rather than
mutually exclusive "if" statements. This non-functional change makes a
subsequent commit smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
a2634646eb docs: git-send-email: difference between ssl and tls smtp-encryption
New explanation for the difference between these values.
It's hard to understand what they do based only on the names.
New description of used default ports.

Signed-off-by: Sotir Danailov <sndanailov@wired4ever.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 11:08:37 -07:00
8628a842bd bundle-uri: suppress stderr from remote-https
When downloading bundles from a git-remote-https subprocess, the bundle
URI logic wants to be opportunistic and download as much as possible and
work with what did succeed. This is particularly important in the "any"
mode, where any single bundle success will work.

If the URI is not available, the git-remote-https process will die()
with a "fatal:" error message, even though that error is not actually
fatal to the super process. Since stderr is passed through, it looks
like a fatal error to the user.

Suppress stderr to avoid these errors from bubbling to the surface. The
bundle URI API adds its own warning() messages on these failures.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
70334fc3eb bundle-uri: quiet failed unbundlings
When downloading a list of bundles in "all" mode, Git has no
understanding of the dependencies between the bundles. Git attempts to
unbundle the bundles in some order, but some may not pass the
verify_bundle() step because of missing prerequisites. This is passed as
error messages to the user, even when they eventually succeed in later
attempts after their dependent bundles are unbundled.

Add a new VERIFY_BUNDLE_QUIET flag to verify_bundle() that avoids the
error messages from the missing prerequisite commits. The method still
returns the number of missing prerequisit commits, allowing callers to
unbundle() to notice that the bundle failed to apply.

Use this flag in bundle-uri.c and test that the messages go away for
'git clone --bundle-uri' commands.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
89bd7fedf9 bundle: add flags to verify_bundle()
The verify_bundle() method has a 'verbose' option, but we will want to
extend this method to have more granular control over its output. First,
replace this 'verbose' option with a new 'flags' option with a single
possible value: VERIFY_BUNDLE_VERBOSE.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
c23f592117 bundle-uri: fetch a list of bundles
When the content at a given bundle URI is not understood as a bundle
(based on inspecting the initial content), then Git currently gives up
and ignores that content. Independent bundle providers may want to split
up the bundle content into multiple bundles, but still make them
available from a single URI.

Teach Git to attempt parsing the bundle URI content as a Git config file
providing the key=value pairs for a bundle list. Git then looks at the
mode of the list to see if ANY single bundle is sufficient or if ALL
bundles are required. The content at the selected URIs are downloaded
and the content is inspected again, creating a recursive process.

To guard the recursion against malformed or malicious content, limit the
recursion depth to a reasonable four for now. This can be converted to a
configured value in the future if necessary. The value of four is twice
as high as expected to be useful (a bundle list is unlikely to point to
more bundle lists).

To test this scenario, create an interesting bundle topology where three
incremental bundles are built on top of a single full bundle. By using a
merge commit, the two middle bundles are "independent" in that they do
not require each other in order to unbundle themselves. They each only
need the base bundle. The bundle containing the merge commit requires
both of the middle bundles, though. This leads to some interesting
decisions when unbundling, especially when we later implement heuristics
that promote downloading bundles until the prerequisite commits are
satisfied.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
c96060b0ce bundle: properly clear all revision flags
The verify_bundle() method checks two things for a bundle's
prerequisites:

 1. Are these objects in the object store?
 2. Are these objects reachable from our references?

In this second question, multiple uses of verify_bundle() in the same
process can report an invalid bundle even though it is correct. The
reason is due to not clearing all of the commit marks on the commits
previously walked.

The revision walk machinery was first introduced in-process by
fb9a54150d (git-bundle: avoid fork() in verify_bundle(), 2007-02-22).
This implementation used "-1" as the set of flags to clear. The next
meaningful change came in 2b064697a5 (revision traversal: retire
BOUNDARY_SHOW, 2007-03-05), which introduced the PREREQ_MARK flag
instead of a flag normally controlled by the revision-walk machinery.

In 86a0a408b9 (commit: factor out
clear_commit_marks_for_object_array, 2011-10-01), the loop over the
array of commits was replaced with a new
clear_commit_marks_for_object_array(), but simultaneously the "-1" value
was replaced with "ALL_REV_FLAGS", which stopped un-setting the
PREREQ_MARK flag. This means that if multiple commits were marked by the
PREREQ_MARK in a previous run of verify_bundle(), then this loop could
terminate early due to 'i' going to zero:

	while (i && (commit = get_revision(&revs)))
		if (commit->object.flags & PREREQ_MARK)
			i--;

The flag clearing work was changed again in 63647391e6 (bundle: avoid
using the rev_info flag leak_pending, 2017-12-25), but that was only
cosmetic and did not change the behavior.

It may seem that it would be sufficient to add the PREREQ_MARK flag to
the clear_commit_marks() call in its current location. However, we
actually need to do it in the "cleanup:" step, since the first loop
checking "Are these objects in the object store?" might add the
PREREQ_MARK flag to some objects and then terminate without performing a
walk due to one missing object. By clearing the flags in all cases, we
avoid this issue when running verify_bundle() multiple times in the same
process.

Moving this loop to the cleanup step alone would cause a segfault when
running 'git bundle verify' outside of a repository, but this is because
of that error condition using "goto cleanup" when returning is perfectly
safe. Nothing has been initialized at that point, so we can return
immediately without causing any leaks.

This behavior is verified carefully by a test that will be added soon
when Git learns to download bundle lists in a 'git clone --bundle-uri'
command.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
20c1e2a68b bundle-uri: limit recursion depth for bundle lists
The next change will start allowing us to parse bundle lists that are
downloaded from a provided bundle URI. Those lists might point to other
lists, which could proceed to an arbitrary depth (and even create
cycles). Restructure fetch_bundle_uri() to have an internal version that
has a recursion depth. Compare that to a new max_bundle_uri_depth
constant that is twice as high as we expect this depth to be for any
legitimate use of bundle list linking.

We can consider making max_bundle_uri_depth a configurable value if
there is demonstrated value in the future.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
738e5245fa bundle-uri: parse bundle list in config format
When a bundle provider wants to operate independently from a Git remote,
they want to provide a single, consistent URI that users can use in
their 'git clone --bundle-uri' commands. At this point, the Git client
expects that URI to be a single bundle that can be unbundled and used to
bootstrap the rest of the clone from the Git server. This single bundle
cannot be re-used to assist with future incremental fetches.

To allow for the incremental fetch case, teach Git to understand a
bundle list that could be advertised at an independent bundle URI. Such
a bundle list is likely to be inspected by human readers, even if only
by the bundle provider creating the list. For this reason, we can take
our expected "key=value" pairs and instead format them using Git config
format.

Create bundle_uri_parse_config_format() to parse a file in config format
and convert that into a 'struct bundle_list' filled with its
understanding of the contents.

Be careful to use error_action CONFIG_ERROR_ERROR when calling
git_config_from_file_with_options() because the default action for
git_config_from_file() is to die() on a parsing error.  The current
warning isn't particularly helpful if it arises to a user, but it will
be made more verbose at a higher layer later.

Update 'test-tool bundle-uri' to take this config file format as input.
It uses a filename instead of stdin because there is no existing way to
parse a FILE pointer in the config machinery. Using
git_config_from_mem() is overly complicated and more likely to introduce
bugs than this simpler version.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
d796cedbe8 bundle-uri: unit test "key=value" parsing
Create a new 'test-tool bundle-uri' test helper. This helper will assist
in testing logic deep in the bundle URI feature.

This change introduces the 'parse-key-values' subcommand, which parses
an input file as a list of lines. These are fed into
bundle_uri_parse_line() to test how we construct a 'struct bundle_list'
from that data. The list is then output to stdout as if the key-value
pairs were a Git config file.

We use an input file instead of stdin because of a future change to
parse in config-file format that works better as an input file.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
9424e373fd bundle-uri: create "key=value" line parsing
When advertising a bundle list over Git's protocol v2, we will use
packet lines. Each line will be of the form "key=value" representing a
bundle list. Connect the API necessary for Git's transport to the
key-value pair parsing created in the previous change.

We are not currently implementing this protocol v2 functionality, but
instead preparing to expose this parsing to be unit-testable.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
bff03c47f7 bundle-uri: create base key-value pair parsing
There will be two primary ways to advertise a bundle list: as a list of
packet lines in Git's protocol v2 and as a config file served from a
bundle URI. Both of these fundamentally use a list of key-value pairs.
We will use the same set of key-value pairs across these formats.

Create a new bundle_list_update() method that is currently unusued, but
will be used in the next change. It inspects each key to see if it is
understood and then applies it to the given bundle_list. Here are the
keys that we teach Git to understand:

* bundle.version: This value should be an integer. Git currently
  understands only version 1 and will ignore the list if the version is
  any other value. This version can be increased in the future if we
  need to add new keys that Git should not ignore. We can add new
  "heuristic" keys without incrementing the version.

* bundle.mode: This value should be one of "all" or "any". If this
  mode is not understood, then Git will ignore the list. This mode
  indicates whether Git needs all of the bundle list items to make a
  complete view of the content or if any single item is sufficient.

The rest of the keys use a bundle identifier "<id>" as part of the key
name. Keys using the same "<id>" describe a single bundle list item.

* bundle.<id>.uri: This stores the URI of the bundle item. This
  currently is expected to be an absolute URI, but will be relaxed to be
  a relative URI in the future.

While parsing, return an error if a URI key is repeated, since we can
make that restriction with bundle lists.

Make the git_parse_int() method global so we can parse the integer
version value carefully.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
0634f717a3 bundle-uri: create bundle_list struct and helpers
It will likely be rare where a user uses a single bundle URI and expects
that URI to point to a bundle. Instead, that URI will likely be a list
of bundles provided in some format. Alternatively, the Git server could
advertise a list of bundles.

In anticipation of these two ways of advertising multiple bundles,
create a data structure that represents such a list. This will be
populated using a common API, but for now focus on what data can be
represented.

Each list contains a number of remote_bundle_info structs. These contain
an 'id' that is used to uniquely identify them in the list, and also a
'uri' that contains the location of its data. Finally, there is a strbuf
containing the filename used when Git downloads the contents to disk.

The list itself stores these remote_bundle_info structs in a hashtable
using 'id' as the key. The order of the structs in the input is
considered unimportant, but future modifications to the format and these
data structures will place ordering possibilities on the set. The list
also has a few "global" properties, including the version (used when
parsing the list) and the mode. The mode is one of these two options:

1. BUNDLE_MODE_ALL: all listed URIs are intended to be combined
   together. The client should download all of the advertised data to
   have a complete copy of the data.

2. BUNDLE_MODE_ANY: any one listed item is sufficient to have a complete
   copy of the data. The client can choose arbitrarily from these
   options. In the future, the client may use pings to find the closest
   URI among geodistributed replicas, or use some other heuristic
   information added to the format.

This API is currently unused, but will soon be expanded with parsing
logic and then be consumed by the bundle URI download logic.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
23b6d00ba7 bundle-uri: use plain string in find_temp_filename()
The find_temp_filename() method was created in 53a50892be (bundle-uri:
create basic file-copy logic, 2022-08-09) and uses odb_mkstemp() to
create a temporary filename. The odb_mkstemp() method uses a strbuf in
its interface, but we do not need to continue carrying a strbuf
throughout the bundle URI code.

Convert the find_temp_filename() method to use a 'char *' and modify its
only caller. This makes sense that we don't actually need to modify this
filename directly later, so using a strbuf is overkill.

This change will simplify the data structure for tracking a bundle list
to use plain strings instead of strbufs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
d420dda057 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 10:36:12 -07:00
c68bd3ec22 Merge branch 'rs/gc-pack-refs-simplify'
Code clean-up.

* rs/gc-pack-refs-simplify:
  gc: simplify maintenance_task_pack_refs()
2022-10-11 10:36:12 -07:00
39c1578c5e Merge branch 'nb/doc-mergetool-typofix'
Typofix.

* nb/doc-mergetool-typofix:
  mergetool.txt: typofix 'overwriten' -> 'overwritten'
2022-10-11 10:36:12 -07:00
b0416d8f4a Merge branch 'jk/sequencer-missing-author-name-check'
Typofix in code.

* jk/sequencer-missing-author-name-check:
  sequencer: detect author name errors in read_author_script()
2022-10-11 10:36:12 -07:00
644195e02f Merge branch 'pw/ssh-sign-report-errors'
The codepath to sign learned to report errors when it fails to read
from "ssh-keygen".

* pw/ssh-sign-report-errors:
  ssh signing: return an error when signature cannot be read
2022-10-11 10:36:11 -07:00
601bb23876 Merge branch 'pw/mailinfo-b-fix'
Fix a logic in "mailinfo -b" that miscomputed the length of a
substring, which lead to an out-of-bounds access.

* pw/mailinfo-b-fix:
  mailinfo -b: fix an out of bounds access
2022-10-11 10:36:11 -07:00
654f5cedbc Merge branch 'rs/test-httpd-in-C-locale'
Force C locale while running tests around httpd to make sure we can
find expected error messages in the log.

* rs/test-httpd-in-C-locale:
  t/lib-httpd: pass LANG and LC_ALL to Apache
2022-10-11 10:36:11 -07:00
d54f0c5a44 Merge branch 'ds/bundle-uri-docfix'
Doc formatting fix.

* ds/bundle-uri-docfix:
  bundle-uri: fix technical doc issues
2022-10-11 10:36:10 -07:00
438c2f859b CodingGuidelines: recommend against unportable C99 struct syntax
Per 33665d98e6 (reftable: make assignments portable to AIX xlc
v12.01, 2022-03-28) forms like ".a.b = *c" can be replaced by using
".a = { .b = *c }" instead.

We'll probably allow these sooner than later, but since the workaround
is trivial let's note it among the C99 features we'd like to hold off
on for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 08:55:01 -07:00
db84376f98 grep.c: remove "extended" in favor of "pattern_expression", fix segfault
Since 79d3696cfb (git-grep: boolean expression on pattern matching.,
2006-06-30) the "pattern_expression" member has been used for complex
queries (AND/OR...), with "pattern_list" being used for the simple OR
queries. Since then we've used both "pattern_expression" and its
associated boolean "extended" member to see if we have a complex
expression.

Since f41fb662f5 (revisions API: have release_revisions() release
"grep_filter", 2022-04-13) we've had a subtle bug relating to that: If
we supplied options that were only used for "complex queries", but
didn't supply the query itself we'd set "opt->extended", but would
have a NULL "pattern_expression". As a result these would segfault as
we tried to call "free_grep_patterns()" from "release_revisions()":

	git -P log -1 --invert-grep
	git -P log -1 --all-match

The root cause of this is that we were conflating the state management
we needed in "compile_grep_patterns()" itself with whether or not we
had an "opt->pattern_expression" later on.

In this cases as we're going through "compile_grep_patterns()" we have
no "opt->pattern_list" but have "opt->no_body_match" or
"opt->all_match". So we'd set "opt->extended = 1", but not "return" on
"opt->extended" as that's an "else if" in the same "if" statement.

That behavior is intentional and required, as the common case is that
we have an "opt->pattern_list" that we're about to parse into the
"opt->pattern_expression".

But we don't need to keep track of this "extended" flag beyond the
state management in compile_grep_patterns() itself. It needs it, but
once we're out of that function we can rely on
"opt->pattern_expression" being non-NULL instead for using these
extended patterns.

As 79d3696cfb itself shows we've assumed that there's a one-to-one
mapping between the two since the very beginning. I.e. "match_line()"
would check "opt->extended" to see if it should call "match_expr()",
and the first thing we do in that function is assume that we have a
"opt->pattern_expression". We'd then call "match_expr_eval()", which
would have died if that "opt->pattern_expression" was NULL.

The "die" was added in c922b01f54 (grep: fix segfault when "git grep
'('" is given, 2009-04-27), and can now be removed as it's now clearly
unreachable. We still do the right thing in the case that prompted
that fix:

	git grep '('
	fatal: unmatched parenthesis

Arguably neither the "--invert-grep" option added in [1] nor the
earlier "--all-match" option added in [2] were intended to be used
stand-alone, and another approach[3] would be to error out in those
cases. But since we've been treating them as a NOOP when given without
--grep for a long time let's keep doing that.

We could also return in "free_pattern_expr()" if the argument is
non-NULL, as an alternative fix for this segfault does [4]. That would
be more elegant in making the "free_*()" function behave like
"free()", but it would also remove a sanity check: The
"free_pattern_expr()" function calls itself recursively, and only the
top-level is allowed to be NULL, let's not conflate those two
conditions.

1. 22dfa8a23d (log: teach --invert-grep option, 2015-01-12)
2. 0ab7befa31 (grep --all-match, 2006-09-27)
3. https://lore.kernel.org/git/patch-1.1-f4b90799fce-20221010T165711Z-avarab@gmail.com/
4. http://lore.kernel.org/git/7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com

Reported-by: orygaw <orygaw@protonmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 08:48:54 -07:00
e3733b646d archive: deduplicate verbose printing
94bc671a1f (Add directory pattern matching to attributes, 2012-12-08)
moved the code for adding the trailing slash to names of directories and
submodules up.  This left both branches of the if statement starting
with the same conditional fprintf call.  Deduplicate it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 08:35:10 -07:00
c4f9490790 fsmonitor: fix leak of warning message
The fsm_settings__get_incompatible_msg() function returns an allocated
string.  So we can't pass its result directly to warning(); we must hold
on to the pointer and free it to avoid a leak.

The leak here is small and fixed size, but Coverity complained, and
presumably SANITIZE=leaks would eventually.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 22:16:56 -07:00
0dc4e5c574 branch: support for shortcuts like @{-1}, completed
branch command with options "edit-description", "set-upstream-to" and
"unset-upstream" expects a branch name.  Since ae5a6c3684 (checkout:
implement "@{-N}" shortcut name for N-th last branch, 2009-01-17) a
branch can be specified using shortcuts like @{-1}.  Those shortcuts
need to be resolved when considering the arguments.

We can modify the description of the previously checked out branch with:

$ git branch --edit--description @{-1}

We can modify the upstream of the previously checked out branch with:

$ git branch --set-upstream-to upstream @{-1}
$ git branch --unset-upstream @{-1}

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 16:28:59 -07:00
d7d850e2b9 CodingGuidelines: mention C99 features we can't use
The C99 section of the CodingGuidelines is a good overview of what we
can use, but is sorely lacking in what we can't use. Something that
comes up occasionally is the portability of %z.

Per [1] we couldn't use it for the longest time due to MSVC not
supporting it, but nowadays by requiring C99 we rely on the MSVC
version that does, but we can't use it yet because a C library that
MinGW uses doesn't support it.

1. https://lore.kernel.org/git/a67e0fd8-4a14-16c9-9b57-3430440ef93c@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:12 -07:00
82dd01d81b CodingGuidelines: allow declaring variables in for loops
Since 44ba10d671 (revision: use C99 declaration of variable in for()
loop, 2021-11-14) released with v2.35.0 we've had a variable declared
with in a for loop.

Since then we've had inadvertent follow-ups to that with at least
cb2607759e (merge-ort: store more specific conflict information,
2022-06-18) released with v2.38.0.

As November 2022 is within the window of this upcoming release,
let's update the guideline to allow this.  We can have the promised
"revisit" discussion while this patch cooks, and drop it if it turns
out that it is still premature, which is not expected to happen at
this moment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:11 -07:00
442c27dde7 CodingGuidelines: mention dynamic C99 initializer elements
The first use of variables in initializer elements appears to have
been 2b6854c863 (Cleanup variables in cat-file, 2007-04-21) released
with v1.5.2.

Some of those caused portability issues, and e.g. that "cat-file" use
was changed in 66dbfd55e3 (Rewrite dynamic structure initializations
to runtime assignment, 2010-05-14) which went out with v1.7.2.

But curiously 66dbfd55e3 missed some of them, e.g. an archive.c use
added in d5f53d6d6f (archive: complain about path specs that don't
match anything, 2009-12-12), and another one in merge-index.c (later
builtin/merge-index.c) in 0077138cd9 (Simplify some instances of
run_command() by using run_command_v_opt()., 2009-06-08).

As far as I can tell there's been no point since 2b6854c863 in 2007
where a compiler that didn't support this has been able to compile
git. Presumably 66dbfd55e3 was an attempt to make headway with wider
portability that ultimately wasn't completed.

In any case, we are thoroughly reliant on this syntax at this point,
so let's update the guidelines, see
https://lore.kernel.org/git/xmqqy1tunjgp.fsf@gitster.g/ for the
initial discussion.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:11 -07:00
e88a2d02dc CodingGuidelines: update for C99
Since 7bc341e21b (git-compat-util: add a test balloon for C99
support, 2021-12-01) we've had a hard dependency on C99, but the prose
in CodingGuidelines was written under the assumption that we were
using C89 with a few C99 features.

As the updated prose notes we'd still like to hold off on novel C99
features, but let's make it clear that we target that C version, and
then enumerate new C99 features that are safe to use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:11 -07:00
a677d3c416 t3435: remove redundant test case
rebase --preserve-merges no longer exists so there is no point in
carrying this failing test case.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 11:18:18 -07:00
54795d37d9 config.mak.dev: disable suggest braces error on old clang versions
Add the "-Wno-missing-braces" option when building with an old version
of clang to suppress the "suggest braces around initialization" error
in developer mode.

For example, using an old version of clang gives the following errors
(when in DEVELOPER=1 mode):

$ make builtin/merge-file.o
    CC builtin/merge-file.o
builtin/merge-file.c:29:23: error: suggest braces around initialization \
			    of subobject [-Werror,-Wmissing-braces]
        mmfile_t mmfs[3] = { 0 };
                             ^
                             {}
builtin/merge-file.c:31:20: error: suggest braces around initialization \
			    of subobject [-Werror,-Wmissing-braces]
        xmparam_t xmp = { 0 };
                          ^
                          {}
2 errors generated.

This example compiles without error/warning with updated versions of
clang.  Since this is an obsolete error, use the -Wno-missing-braces
option to silence the warning when using an older compiler.  This
avoids the need to update the code to use "{{0}}" style
initializations.

Upstream clang version 8 has the problem.  It was fixed in version 9.

The version of clang distributed by Apple with XCode has its own
unique set of version numbers.  Apple clang version 11 has the
problem.  It was fixed in version 12.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 11:15:31 -07:00
e85701b4af The (real) first batch for 2.39
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 10:09:09 -07:00
19118cb857 Merge branch 'js/merge-ort-in-read-only-repo'
In read-only repositories, "git merge-tree" tried to come up with a
merge result tree object, which it failed (which is not wrong) and
led to a segfault (which is bad), which has been corrected.

* js/merge-ort-in-read-only-repo:
  merge-ort: return early when failing to write a blob
  merge-ort: fix segmentation fault in read-only repositories
2022-10-10 10:08:43 -07:00
a215853545 Merge branch 'tb/midx-repack-ignore-cruft-packs'
"git multi-pack-index repack/expire" used to repack unreachable
cruft into a new pack, which have been corrected.

* tb/midx-repack-ignore-cruft-packs:
  midx.c: avoid cruft packs with non-zero `repack --batch-size`
  midx.c: remove unnecessary loop condition
  midx.c: replace `xcalloc()` with `CALLOC_ARRAY()`
  midx.c: avoid cruft packs with `repack --batch-size=0`
  midx.c: prevent `expire` from removing the cruft pack
  Documentation/git-multi-pack-index.txt: clarify expire behavior
  Documentation/git-multi-pack-index.txt: fix typo
2022-10-10 10:08:43 -07:00
38bb92cf46 Merge branch 'hn/parse-worktree-ref'
Code and semantics cleaning.

* hn/parse-worktree-ref:
  refs: unify parse_worktree_ref() and ref_type()
2022-10-10 10:08:43 -07:00
dc154c39f7 Merge branch 'ja/rebase-i-avoid-amending-self'
"git rebase -i" can mistakenly attempt to apply a fixup to a commit
itself, which has been corrected.

* ja/rebase-i-avoid-amending-self:
  sequencer: avoid dropping fixup commit that targets self via commit-ish
2022-10-10 10:08:43 -07:00
83b2b47850 Merge branch 'rj/ref-filter-get-head-description-leakfix'
Leakfix.

* rj/ref-filter-get-head-description-leakfix:
  ref-filter.c: fix a leak in get_head_description
2022-10-10 10:08:42 -07:00
a1fdfb0975 Merge branch 'jc/environ-docs'
Documentation on various Boolean GIT_* environment variables have
been clarified.

* jc/environ-docs:
  environ: GIT_INDEX_VERSION affects not just a new repository
  environ: simplify description of GIT_INDEX_FILE
  environ: GIT_FLUSH should be made a usual Boolean
  environ: explain Boolean environment variables
  environ: document GIT_SSL_NO_VERIFY
2022-10-10 10:08:41 -07:00
2e6c1b59fd Merge branch 'ah/branch-autosetupmerge-grammofix'
Fix grammar of a message introduced in previous round.

* ah/branch-autosetupmerge-grammofix:
  push: improve grammar of branch.autoSetupMerge advice
2022-10-10 10:08:40 -07:00
82d5a8483e Merge branch 'ab/test-malloc-with-sanitize-leak'
Test fix.

* ab/test-malloc-with-sanitize-leak:
  test-lib: have SANITIZE=leak imply TEST_NO_MALLOC_CHECK
2022-10-10 10:08:40 -07:00
67bf4a83e9 Merge branch 'sy/sparse-grep'
"git grep" learned to expand the sparse-index more lazily and on
demand in a sparse checkout.

* sy/sparse-grep:
  builtin/grep.c: integrate with sparse index
2022-10-10 10:08:40 -07:00
4b4d97cfda Merge branch 'ds/scalar-unregister-idempotent'
"scalar unregister" in a repository that is already been
unregistered reported an error.

* ds/scalar-unregister-idempotent:
  string-list: document iterator behavior on NULL input
  gc: replace config subprocesses with API calls
  scalar: make 'unregister' idempotent
  maintenance: add 'unregister --force'
2022-10-10 10:08:40 -07:00
dc6dd55f70 Merge branch 'mc/cred-helper-ignore-unknown'
Most credential helpers ignored unknown entries in a credential
description, but a few died upon seeing them.  The latter were
taught to ignore them, too

* mc/cred-helper-ignore-unknown:
  osxkeychain: clarify that we ignore unknown lines
  netrc: ignore unknown lines (do not die)
  wincred: ignore unknown lines (do not die)
2022-10-10 10:08:40 -07:00
20a5dd670c Merge branch 'jk/remote-rename-without-fetch-refspec'
"git remote rename" failed to rename a remote without fetch
refspec, which has been corrected.

* jk/remote-rename-without-fetch-refspec:
  remote: handle rename of remote without fetch refspec
2022-10-10 10:08:39 -07:00
7aeb0d4c47 Merge branch 'jk/clone-allow-bare-and-o-together'
"git clone" did not like to see the "--bare" and the "--origin"
options used together without a good reason.

* jk/clone-allow-bare-and-o-together:
  clone: allow "--bare" with "-o"
2022-10-10 10:08:39 -07:00
fdbfac60fd Merge branch 'jk/fsck-on-diet'
"git fsck" failed to release contents of tree objects already used
from the memory, which has been fixed.

* jk/fsck-on-diet:
  parse_object_buffer(): respect save_commit_buffer
  fsck: turn off save_commit_buffer
  fsck: free tree buffers after walking unreachable objects
2022-10-10 10:08:39 -07:00
d194e61ea7 Merge branch 'so/diff-merges-cleanup'
Code clean-up.

* so/diff-merges-cleanup:
  diff-merges: clarify log.diffMerges documentation
  diff-merges: cleanup set_diff_merges()
  diff-merges: cleanup func_by_opt()
2022-10-10 10:08:39 -07:00
ab26e44d98 Merge branch 'ah/fsmonitor-daemon-usage-non-l10n'
Fix messages incorrectly marked for translation.

* ah/fsmonitor-daemon-usage-non-l10n:
  fsmonitor--daemon: don't translate literal commands
2022-10-10 10:08:39 -07:00
b77e3bdd97 symbolic-ref: teach "--[no-]recurse" option
Suppose you are managing many maintenance tracks in your project,
and some of the more recent ones are maint-2.36 and maint-2.37.
Further imagine that your project recently tagged the official 2.38
release, which means you would need to start maint-2.38 track soon,
by doing:

  $ git checkout -b maint-2.38 v2.38.0^0
  $ git branch --list 'maint-2.3[6-9]'
  * maint-2.38
    maint-2.36
    maint-2.37

So far, so good.  But it also is reasonable to want not to have to
worry about which maintenance track is the latest, by pointing a
more generic-sounding 'maint' branch at it, by doing:

  $ git symbolic-ref refs/heads/maint refs/heads/maint-2.38

which would allow you to say "whichever it is, check out the latest
maintenance track", by doing:

  $ git checkout maint
  $ git branch --show-current
  maint-2.38

It is arguably better to say that we are on 'maint-2.38' rather than
on 'maint', and "git merge/pull" would record "into maint-2.38" and
not "into maint", so I think what we have is a good behaviour.

One thing that is slightly irritating, however, is that I do not
think there is a good way (other than "cat .git/HEAD") to learn that
you checked out 'maint' to get into that state.  Just like the output
of "git branch --show-current" shows above, "git symbolic-ref HEAD"
would report 'refs/heads/maint-2.38', bypassing the intermediate
symbolic ref at 'refs/heads/maint' that is pointed at by HEAD.

The internal resolve_ref() API already has the necessary support for
stopping after resolving a single level of a symbolic-ref, and we
can expose it by adding a "--[no-]recurse" option to the command.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-09 12:31:24 -07:00
413bc6d20a git.c: improve code readability in cmd_main()
Check for an error condition whose body unconditionally exists
first, and then perform the special casing of "version" and "help"
as part of the preparation for the "normal codepath".  This makes
the code simpler to read.

Signed-off-by: Daniel Sonbolian <dsal3389@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-08 22:11:37 -07:00
f7669676d0 dir: use fspathncmp() in pl_hashmap_cmp()
Call fspathncmp() instead of open-coding it.  This shortens the code and
makes it less repetitive.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-08 22:09:03 -07:00
bcfc82bd48 branch: description for non-existent branch errors
When the repository does not yet have commits, some errors describe that
there is no branch:

    $ git init -b first

    $ git branch --edit-description first
    error: No branch named 'first'.

    $ git branch --set-upstream-to=upstream
    fatal: branch 'first' does not exist

    $ git branch -c second
    error: refname refs/heads/first not found
    fatal: Branch copy failed

That "first" branch is unborn but to say it doesn't exists is confusing.

Options "-c" (copy) and "-m" (rename) show the same error when the
origin branch doesn't exists:

    $ git branch -c non-existent-branch second
    error: refname refs/heads/non-existent-branch not found
    fatal: Branch copy failed

    $ git branch -m non-existent-branch second
    error: refname refs/heads/non-existent-branch not found
    fatal: Branch rename failed

Note that "--edit-description" without an explicit argument is already
considering the _empty repository_ circumstance in its error.  Also note
that "-m" on the initial branch it is an allowed operation.

Make the error descriptions for those branch operations with unborn or
non-existent branches, more informative.

This is the result of the change:

    $ git init -b first

    $ git branch --edit-description first
    error: No commit on branch 'first' yet.

    $ git branch --set-upstream-to=upstream
    fatal: No commit on branch 'first' yet.

    $ git branch -c second
    fatal: No commit on branch 'first' yet.

    $ git branch [-c/-m] non-existent-branch second
    fatal: No branch named 'non-existent-branch'.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-07 20:59:41 -07:00
bbe21b64a0 Start 2.39 cycle
The version numbers do not mean much, but we may want to call the
first one in 2023 version 3.1 or something, but let's just increment
the second digit from the previous one for this cycle.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-07 17:19:59 -07:00
9b89c08cae Merge branch 'ac/fuzzers'
Source file shuffling.

* ac/fuzzers:
  fuzz: reorganise the path for existing oss-fuzz fuzzers
2022-10-07 17:19:59 -07:00
837fdc900f Merge branch 'vd/fix-unaligned-read-index-v4'
The codepath that reads from the index v4 had unaligned memory
accesses, which has been corrected.

* vd/fix-unaligned-read-index-v4:
  read-cache: avoid misaligned reads in index v4
2022-10-07 17:19:59 -07:00
1f1f375cfe Merge branch 'es/retire-efgrep'
Prepare for GNU [ef]grep that throw warning of their uses.

* es/retire-efgrep:
  check-non-portable-shell: detect obsolescent egrep/fgrep
2022-10-07 17:19:59 -07:00
de73968e52 Merge branch 'dd/retire-efgrep'
Prepare for GNU [ef]grep that throw warning of their uses.

* dd/retire-efgrep:
  t: convert fgrep usage to "grep -F"
  t: convert egrep usage to "grep -E"
  t: remove \{m,n\} from BRE grep usage
  CodingGuidelines: allow grep -E
2022-10-07 17:19:59 -07:00
410a0e520d Merge branch 'ds/use-platform-regex-on-macos'
With a bit of header twiddling, use the native regexp library on
macOS instead of the compat/ one.

* ds/use-platform-regex-on-macos:
  grep: fix multibyte regex handling under macOS
2022-10-07 17:19:59 -07:00
3991bb73dd SubmittingPatches: use usual capitalization in the log message body
Update the description of the summary section to clarify that the
"do not capitalize" rule applies only the word after the "<area>:"
prefix of the title and nowhere else.  This hopefully will prevent
folks from writing their proposed log message in all lowercase.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-07 14:59:25 -07:00
7190b7ebf9 bundle-uri: fix technical doc issues
Two documentation issues exist in the technical docs for the bundle URI
feature.

First, there is an extraneous "the" across a linebreak, making the
nonsensical phrase "the bundle the list" which should just be "the
bundle list".

Secondly, the asciidoc update treats the string "`have`s" as starting a
"<code>" block, but the second tick is interpreted as an apostrophe
instead of a closing "</code>" tag. This causes entire sentences to be
formatted as code until the next one comes along. Simply adding a space
here does not work properly as the rendered HTML keeps that space.
Instead, restructure the sentence slightly to avoid using a plural,
allowing the HTML to render correctly.

Reported-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-07 11:00:21 -07:00
246526d019 bisect--helper: plug strvec leak
The strvec "argv" is used to build a command for run_command_v_opt(),
but never freed.  Use a constant string array instead, which doesn't
require any cleanup.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-07 10:21:18 -07:00
d5b41391a4 Git 2.38.1
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 20:00:33 -04:00
f64d4ca8d6 Sync with 2.37.4
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 20:00:04 -04:00
83d5e3341b Git 2.37.4
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 19:58:33 -04:00
f2798aa404 Sync with 2.36.3
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 19:58:16 -04:00
9a167cb786 t7527: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t7527 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 19:57:52 -04:00
fcdaa211e6 Git 2.36.3
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:45:10 -04:00
58612f82b6 Sync with 2.35.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:44:44 -04:00
868154bb1c Git 2.35.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:44:02 -04:00
ac8a1db867 Sync with 2.34.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:43:37 -04:00
be85cfc4db Git 2.34.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:43:08 -04:00
478a426f14 Sync with 2.33.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:42:55 -04:00
7800e1dccf Git 2.33.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:42:27 -04:00
3957f3c84e Sync with 2.32.4
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:42:02 -04:00
af778cd9be Git 2.32.4
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:41:15 -04:00
9cbd2827c5 Sync with 2.31.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:40:44 -04:00
ecf9b4a443 Git 2.31.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:39:26 -04:00
122512967e Sync with 2.30.6
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:39:15 -04:00
abd4d67ab0 Git 2.30.6
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:38:16 -04:00
69c5f17f11 attr: drop DEBUG_ATTR code
Since its inception in d0bfd026a8 (Add basic infrastructure to assign
attributes to paths, 2007-04-12), the attribute code carries a little
bit of debug code that is conditionally compiled only when DEBUG_ATTR is
set. But since you have to know about it and make a special build of Git
to use it, it's not clear that it's helping anyone (and there are very
few mentions of it on the list over the years).

Meanwhile, it causes slight headaches. Since it's not built as part of a
regular compile, it's subject to bitrot. E.g., this was dealt with in
712efb1a42 (attr: make it build with DEBUG_ATTR again, 2013-01-15), and
it currently fails to build with DEVELOPER=1 since e810e06357 (attr:
tighten const correctness with git_attr and match_attr, 2017-01-27).

And it causes confusion with -Wunused-parameter; the "what" parameter of
fill_one() is unused in a normal build, but needed in a debug build.

Let's just get rid of this code (and the now-useless parameter).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-06 09:59:17 -07:00
116761ba9c commit: avoid writing to global in option callback
The callback function for --trailer writes directly to the global
trailer_args and ignores opt->value completely. This is OK, since that's
where we expect to find the value. But it does mean the option
declaration isn't as clear. E.g., we have:

    OPT_BOOL(0, "reset-author", &renew_authorship, ...),
    OPT_CALLBACK_F(0, "trailer", NULL, ..., opt_pass_trailer)

In the first one we can see where the result will be stored, but in the
second, we get only NULL, and you have to go read the callback.

Let's pass &trailer_args, and use it in the callback. As a bonus, this
silences a -Wunused-parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-06 09:58:06 -07:00
7faba18a9a multi-pack-index: avoid writing to global in option callback
We declare the --object-dir option like:

  OPT_CALLBACK(0, "object-dir", &opts.object_dir, ...);

but the pointer to opts.object_dir is completely unused. Instead, the
callback writes directly to a global. Which fortunately happens to be
opts.object_dir. So everything works as expected, but it's unnecessarily
confusing.

Instead, let's have the callback write to the option value pointer that
has been passed in. This also quiets a -Wunused-parameter warning (since
we don't otherwise look at "opt").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-06 09:56:51 -07:00
6823c19888 test-submodule: inline resolve_relative_url() function
The resolve_relative_url() function takes argc and argv parameters; it
then reads up to 3 elements of argv without looking at argc at all. At
first glance, this seems like a bug. But it has only one caller,
cmd__submodule_resolve_relative_url(), which does confirm that argc is
3.

The main reason this is a separate function is that it was moved from
library code in 96a28a9bc6 (submodule--helper: move
"resolve-relative-url-test" to a test-tool, 2022-09-01).

We can make this code simpler and more obviously safe by just inlining
the function in its caller. As a bonus, this silences a
-Wunused-parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-06 09:56:28 -07:00
7a2d8ea47e t/lib-httpd: pass LANG and LC_ALL to Apache
t5411 starts a web server with no explicit language setting, so it uses
the system default.  Ten of its tests expect it to return error messages
containing the prefix "fatal: ", emitted by die().  This prefix can be
localized since a1fd2cf8cd (i18n: mark message helpers prefix for
translation, 2022-06-21), however.  As a result these ten tests break
for me on a system with LANG="de_DE.UTF-8" because the web server sends
localized messages with "Schwerwiegend: " instead of "fatal: ".

Fix these tests by passing LANG and LC_ALL to the web server, which are
set to "C" by t/test-lib.sh, to get untranslated messages on both sides.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-06 09:16:26 -07:00
7c07f36ad2 git-compat-util.h: GCC deprecated message arg only in GCC 4.5+
https://gcc.gnu.org/gcc-4.5/changes.html says

  The deprecated attribute now takes an optional string argument, for
  example, __attribute__((deprecated("text string"))), that will be
  printed together with the deprecation warning.

While GCC 4.5 is already 12 years old, git checks for even older
versions in places. Let's not needlessly break older compilers when
a small and simple fix is readily available.

Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Signed-off-by: Alejandro R Sedeño <asedeno@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 19:09:59 -07:00
ebb6c16607 Makefile: clarify runtime relative gitexecdir
"git" built with RUNTIME_PREFIX flag turned on could figure out
gitexecdir and other paths as relative to "git" executable.

However, in the section specifies gitexecdir, RUNTIME_PREFIX wasn't
mentioned, thus users may wrongly assume that "git" always locates
gitexecdir as relative path to the executable.

Let's clarify that only "git" built with RUNTIME_PREFIX will locate
gitexecdir as relative path.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 19:06:01 -07:00
d9fcaeece2 t5537: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t5537 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-05 20:19:15 -04:00
541607d934 t3206: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t3206 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-05 20:19:08 -04:00
b004c90282 gc: simplify maintenance_task_pack_refs()
Pass a constant string array directly to run_command_v_opt() instead of
copying it into a strvec first.  This shortens the code and avoids heap
allocations.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 12:46:27 -07:00
edbf9a2e20 mergetool.txt: typofix 'overwriten' -> 'overwritten'
Signed-off-by: Noah Betzen <noah@nezteb.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 12:25:56 -07:00
301f1e3ac1 promisor-remote: die upon failing fetch
In a partial clone, an attempt to read a missing object results in an
attempt to fetch that single object. In order to avoid multiple
sequential fetches, which would occur when multiple objects are missing
(which is the typical case), some commands have been taught to prefetch
in a batch: such a command would, in a partial clone, notice that
several objects that it will eventually need are missing, and call
promisor_remote_get_direct() with all such objects at once.

When this batch prefetch fails, these commands fall back to the
sequential fetches. But at $DAYJOB we have noticed that this results in
a bad user experience: a command would take unexpectedly long to finish
(and possibly use up a lot of bandwidth) if the batch prefetch would
fail for some intermittent reason, but all subsequent fetches would
work. It would be a better user experience for such a command would
just fail.

Therefore, make it a fatal error if the prefetch fails and at least one
object being fetched is known to be a promisor object. (The latter
criterion is to make sure that we are not misleading the user that such
an object would be present from the promisor remote. For example, a
missing object may be a result of repository corruption and not because
it is expectedly missing due to the repository being a partial clone.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:06:53 -07:00
00057bf14c promisor-remote: remove a return value
No caller of promisor_remote_get_direct() is checking its return value,
so remove it.

Not checking the return value means that the user would not know
whether the failure of reading an object is due to the promisor remote
not supplying the object or because of local repository corruption, but
this will be fixed in a subsequent patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:06:52 -07:00
5aa9e3262e fsmonitor: add documentation for allowRemote and socketDir options
Add documentation for 'fsmonitor.allowRemote' and 'fsmonitor.socketDir'.
Call-out experimental nature of 'fsmonitor.allowRemote' and limited
filesystem support for 'fsmonitor.socketDir'.

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:05:23 -07:00
25c2cab08f fsmonitor: check for compatability before communicating with fsmonitor
If fsmonitor is not in a compatible state, warn with an appropriate message.

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:05:23 -07:00
12fd27df79 fsmonitor: deal with synthetic firmlinks on macOS
Starting with macOS 10.15 (Catalina), Apple introduced a new feature
called 'firmlinks' in order to separate the boot volume into two
volumes, one read-only and one writable but still present them to the
user as a single volume. Along with this change, Apple removed the
ability to create symlinks in the root directory and replaced them with
'synthetic firmlinks'. See 'man synthetic.conf'

When FSEevents reports the path of changed files, if the path involves
a synthetic firmlink, the path is reported from the point of the
synthetic firmlink and not the real path. For example:

Real path:
/System/Volumes/Data/network/working/directory/foo.txt

Synthetic firmlink:
/network -> /System/Volumes/Data/network

FSEvents path:
/network/working/directory/foo.txt

This causes the FSEvents path to not match against the worktree
directory.

There are several ways in which synthetic firmlinks can be created:
they can be defined in /etc/synthetic.conf, the automounter can create
them, and there may be other means. Simply reading /etc/synthetic.conf
is insufficient. No matter what process creates synthetic firmlinks,
they all get created in the root directory.

Therefore, in order to deal with synthetic firmlinks, the root directory
is scanned and the first possible synthetic firmink that, when resolved,
is a prefix of the worktree is used to map FSEvents paths to worktree
paths.

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:05:23 -07:00
8f44976882 fsmonitor: avoid socket location check if using hook
If monitoring is done via fsmonitor hook rather than IPC there is no
need to check if the location of the Unix Domain socket (UDS) file is
on a remote filesystem.

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:05:23 -07:00
6beb2688d3 fsmonitor: relocate socket file if .git directory is remote
If the .git directory is on a remote filesystem, create the socket
file in 'fsmonitor.socketDir' if it is defined, else create it in $HOME.

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:05:22 -07:00
508c1a572d fsmonitor: refactor filesystem checks to common interface
Provide a common interface for getting basic filesystem information
including filesystem type and whether the filesystem is remote.

Refactor existing code for getting basic filesystem info and detecting
remote file systems to the new interface.

Refactor filesystem checks to leverage new interface. For macOS,
error-out if the Unix Domain socket (UDS) file is on a remote
filesystem.

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 11:05:22 -07:00
36fb0d07d8 ssh signing: return an error when signature cannot be read
If the signature file cannot be read we print an error message but do
not return an error to the caller. In practice it seems unlikely that
the file would be unreadable if the call to ssh-keygen succeeds.

The unlink_or_warn() call is moved to the end of the function so that
we always try and remove the signature file. This isn't strictly
necessary at the moment but it protects us against any extra code
being added between trying to read the signature file and the cleanup
at the end of the function in the future. unlink_or_warn() only prints
a warning if it exists and cannot be removed.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-05 10:21:52 -07:00
45350aeb11 sequencer: detect author name errors in read_author_script()
As we parse the author-script file, we check for missing or duplicate
lines for GIT_AUTHOR_NAME, etc. But after reading the whole file, our
final error conditional checks "date_i" twice and "name_i" not at all.
This not only leads to us failing to abort, but we may do an
out-of-bounds read on the string_list array.

The bug goes back to 442c36bd08 (am: improve author-script error
reporting, 2018-10-31), though the code was soon after moved to this
spot by bcd33ec25f (add read_author_script() to libgit, 2018-10-31).
It was presumably just a typo in 442c36bd08.

We'll add test coverage for all the error cases here, though only the
GIT_AUTHOR_NAME ones fail (even in a vanilla build they segfault
consistently, but certainly with SANITIZE=address).

Reported-by: Michael V. Scovetta <michael.scovetta@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-03 11:05:53 -07:00
3ef1494685 mailinfo -b: fix an out of bounds access
To remove bracketed strings containing "PATCH" from the subject line
cleanup_subject() scans the subject for the opening bracket using an
offset from the beginning of the line. It then searches for the
closing bracket with strchr(). To calculate the length of the
bracketed string it unfortunately adds rather than subtracts the
offset from the result of strchr(). This leads to an out of bounds
access in memmem() when looking to see if the brackets contain
"PATCH".

We have tests that trigger this bug that were added in ae52d57f0b
(t5100: add some more mailinfo tests, 2017-05-31). The commit message
mentions that they are marked test_expect_failure as they trigger an
assertion in strbuf_splice(). While it is reassuring that
strbuf_splice() detects the problem and dies in retrospect that should
perhaps have warranted a little more investigation. The bug was
introduced by 17635fc900 (mailinfo: -b option keeps [bracketed]
strings that is not a [PATCH] marker, 2009-07-15). I think the reason
it has survived so long is that '-b' is not a popular option and
without it the offset is always zero.

This was found by the address sanitizer while I was cleaning up the
test_todo idea in [1].

[1] https://lore.kernel.org/git/db558292-2783-3270-4824-43757822a389@gmail.com/

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-03 09:05:07 -07:00
3dcec76d9d Git 2.38
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-02 08:43:56 -07:00
c03bee6e9f Merge tag 'l10n-2.38.0-rnd3' of https://github.com/git-l10n/git-po
l10n-2.38.0-rnd3

* tag 'l10n-2.38.0-rnd3' of https://github.com/git-l10n/git-po: (25 commits)
  l10n: zh_TW.po: Git 2.38.0, round 3
  l10n: fr: v2.38.0 round 3
  l10n: Update Catalan translation
  l10n: de.po: update German translation
  l10n: zh_CN: 2.38.0 round 3
  l10n: tr: v2.38.0 3rd round
  l10n: bg.po: Updated Bulgarian translation (5484t)
  l10n: po-id for 2.38 (round 3)
  l10n: es: update translation
  l10n: sv.po: Update Swedish translation (5484t0f0u)
  l10n: Update Catalan translation
  l10n: fr: don't say that merge is "the default strategy"
  l10n: zh_CN v2.38.0 rounds 1 & 2
  l10n: po-id for 2.38 (round 2)
  l10n: tr: v2.38.0 round 2
  l10n: bg.po: Updated Bulgarian translation (5484t)
  l10n: fr: v2.38.0 round 2
  l10n: fr: v2.38 round 1
  l10n: fr: The word 'branche' is only feminine
  l10n: Update Catalan translation
  ...
2022-10-02 08:24:32 -07:00
a79c6b6081 diff: support ^! for merges
revision.c::handle_revision_arg_1() resolves <rev>^! by first adding the
negated parents and then <rev> itself.  builtin_diff_combined() expects
the first tree to be the merge and the remaining ones to be the parents,
though.  This mismatch results in bogus diff output.

Remember the first tree that doesn't belong to a parent and use it
instead of blindly picking the first one.  This makes "git diff <rev>^!"
consistent with "git show <rev>^!".

Reported-by: Tim Jaacks <tim.jaacks@garz-fricke.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-01 15:58:38 -07:00
9f91da752f revisions.txt: unspecify order of resolved parts of ^!
gitrevisions(7) says that <rev>^! resolves to <rev> and then all the
parents of <rev>.  revision.c::handle_revision_arg_1() actually adds
all parents first, then <rev>.  Change the documentation to leave the
order unspecified, to avoid misleading readers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-01 15:58:36 -07:00
793c21182e revision: use strtol_i() for exclude_parent
Avoid silent overflow of the int exclude_parent by using the appropriate
function, strtol_i(), to parse its value.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-01 15:58:33 -07:00
dedb2883ce l10n: zh_TW.po: Git 2.38.0, round 3
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2022-10-01 19:10:41 +08:00
8a7bfa0fd3 t7814: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t7814 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:31:40 -04:00
59f2f80280 t5537: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t5537 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:31:36 -04:00
c193e6bbee t5516: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t5516 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:31:34 -04:00
e175fb5767 t3207: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t3207 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:31:31 -04:00
ef374dd9b8 t2080: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:30:45 -04:00
092d3a2bf9 t1092: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:30:43 -04:00
067aa8fb41 t2080: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:27:18 -04:00
4a7dab5ce4 t1092: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:27:14 -04:00
22613b25ec tmp-objdir: skip clean up when handling a signal
In the tmp-objdir api, tmp_objdir_create will create a temporary
directory but also register signal handlers responsible for removing
the directory's contents and the directory itself. However, the
function responsible for recursively removing the contents and
directory, remove_dir_recurse() calls opendir(3) and closedir(3).
This can be problematic because these functions allocate and free
memory, which are not async-signal-safe functions. This can lead to
deadlocks.

One place we call tmp_objdir_create() is in git-receive-pack, where
we create a temporary quarantine directory "incoming". Incoming
objects will be written to this directory before they get moved to
the object directory.

We have observed this code leading to a deadlock:

	Thread 1 (Thread 0x7f621ba0b200 (LWP 326305)):
	#0  __lll_lock_wait_private (futex=futex@entry=0x7f621bbf8b80
		<main_arena>) at ./lowlevellock.c:35
	#1  0x00007f621baa635b in __GI___libc_malloc
		(bytes=bytes@entry=32816) at malloc.c:3064
	#2  0x00007f621bae9f49 in __alloc_dir (statp=0x7fff2ea7ed60,
		flags=0, close_fd=true, fd=5)
		at ../sysdeps/posix/opendir.c:118
	#3  opendir_tail (fd=5) at ../sysdeps/posix/opendir.c:69
	#4  __opendir (name=<optimized out>)
		at ../sysdeps/posix/opendir.c:92
	#5  0x0000557c19c77de1 in remove_dir_recurse ()
	git#6  0x0000557c19d81a4f in remove_tmp_objdir_on_signal ()
	#7  <signal handler called>
	git#8  _int_malloc (av=av@entry=0x7f621bbf8b80 <main_arena>,
		bytes=bytes@entry=7160) at malloc.c:4116
	git#9  0x00007f621baa62c9 in __GI___libc_malloc (bytes=7160)
		at malloc.c:3066
	git#10 0x00007f621bd1e987 in inflateInit2_ ()
		from /opt/gitlab/embedded/lib/libz.so.1
	git#11 0x0000557c19dbe5f4 in git_inflate_init ()
	git#12 0x0000557c19cee02a in unpack_compressed_entry ()
	git#13 0x0000557c19cf08cb in unpack_entry ()
	git#14 0x0000557c19cf0f32 in packed_object_info ()
	git#15 0x0000557c19cd68cd in do_oid_object_info_extended ()
	git#16 0x0000557c19cd6e2b in read_object_file_extended ()
	git#17 0x0000557c19cdec2f in parse_object ()
	git#18 0x0000557c19c34977 in lookup_commit_reference_gently ()
	git#19 0x0000557c19d69309 in mark_uninteresting ()
	git#20 0x0000557c19d2d180 in do_for_each_repo_ref_iterator ()
	git#21 0x0000557c19d21678 in for_each_ref ()
	git#22 0x0000557c19d6a94f in assign_shallow_commits_to_refs ()
	git#23 0x0000557c19bc02b2 in cmd_receive_pack ()
	git#24 0x0000557c19b29fdd in handle_builtin ()
	git#25 0x0000557c19b2a526 in cmd_main ()
	git#26 0x0000557c19b28ea2 in main ()

Since we can't do the cleanup in a portable and signal-safe way, skip
the cleanup when we're handling a signal.

This means that when signal handling, the temporary directory may not
get cleaned up properly. This is mitigated by b3cecf49ea (tmp-objdir: new
API for creating temporary writable databases, 2021-12-06) which changed
the default name and allows gc to clean up these temporary directories.

In the event of a normal exit, we should still be cleaning up via the
atexit() handler.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-30 21:26:58 -07:00
0ca6ead81e alias.c: reject too-long cmdline strings in split_cmdline()
This function improperly uses an int to represent the number of entries
in the resulting argument array. This allows a malicious actor to
intentionally overflow the return value, leading to arbitrary heap
writes.

Because the resulting argv array is typically passed to execv(), it may
be possible to leverage this attack to gain remote code execution on a
victim machine. This was almost certainly the case for certain
configurations of git-shell until the previous commit limited the size
of input it would accept. Other calls to split_cmdline() are typically
limited by the size of argv the OS is willing to hand us, so are
similarly protected.

So this is not strictly fixing a known vulnerability, but is a hardening
of the function that is worth doing to protect against possible unknown
vulnerabilities.

One approach to fixing this would be modifying the signature of
`split_cmdline()` to look something like:

    int split_cmdline(char *cmdline, const char ***argv, size_t *argc);

Where the return value of `split_cmdline()` is negative for errors, and
zero otherwise. If non-NULL, the `*argc` pointer is modified to contain
the size of the `**argv` array.

But this implies an absurdly large `argv` array, which more than likely
larger than the system's argument limit. So even if split_cmdline()
allowed this, it would fail immediately afterwards when we called
execv(). So instead of converting all of `split_cmdline()`'s callers to
work with `size_t` types in this patch, instead pursue the minimal fix
here to prevent ever returning an array with more than INT_MAX entries
in it.

Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
71ad7fe1bc shell: limit size of interactive commands
When git-shell is run in interactive mode (which must be enabled by
creating $HOME/git-shell-commands), it reads commands from stdin, one
per line, and executes them.

We read the commands with git_read_line_interactively(), which uses a
strbuf under the hood. That means we'll accept an input of arbitrary
size (limited only by how much heap we can allocate). That creates two
problems:

  - the rest of the code is not prepared to handle large inputs. The
    most serious issue here is that split_cmdline() uses "int" for most
    of its types, which can lead to integer overflow and out-of-bounds
    array reads and writes. But even with that fixed, we assume that we
    can feed the command name to snprintf() (via xstrfmt()), which is
    stuck for historical reasons using "int", and causes it to fail (and
    even trigger a BUG() call).

  - since the point of git-shell is to take input from untrusted or
    semi-trusted clients, it's a mild denial-of-service. We'll allocate
    as many bytes as the client sends us (actually twice as many, since
    we immediately duplicate the buffer).

We can fix both by just limiting the amount of per-command input we're
willing to receive.

We should also fix split_cmdline(), of course, which is an accident
waiting to happen, but that can come on top. Most calls to
split_cmdline(), including the other one in git-shell, are OK because
they are reading from an OS-provided argv, which is limited in practice.
This patch should eliminate the immediate vulnerabilities.

I picked 4MB as an arbitrary limit. It's big enough that nobody should
ever run into it in practice (since the point is to run the commands via
exec, we're subject to OS limits which are typically much lower). But
it's small enough that allocating it isn't that big a deal.

The code is mostly just swapping out fgets() for the strbuf call, but we
have to add a few niceties like flushing and trimming line endings. We
could simplify things further by putting the buffer on the stack, but
4MB is probably a bit much there. Note that we'll _always_ allocate 4MB,
which for normal, non-malicious requests is more than we would before
this patch. But on the other hand, other git programs are happy to use
96MB for a delta cache. And since we'd never touch most of those pages,
on a lazy-allocating OS like Linux they won't even get allocated to
actual RAM.

The ideal would be a version of strbuf_getline() that accepted a maximum
value. But for a minimal vulnerability fix, let's keep things localized
and simple. We can always refactor further on top.

The included test fails in an obvious way with ASan or UBSan (which
notice the integer overflow and out-of-bounds reads). Without them, it
fails in a less obvious way: we may segfault, or we may try to xstrfmt()
a long string, leading to a BUG(). Either way, it fails reliably before
this patch, and passes with it. Note that we don't need an EXPENSIVE
prereq on it. It does take 10-15s to fail before this patch, but with
the new limit, we fail almost immediately (and the perl process
generating 2GB of data exits via SIGPIPE).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
32696a4cbe shell: add basic tests
We have no tests of even basic functionality of git-shell. Let's add a
couple of obvious ones. This will serve as a framework for adding tests
for new things we fix, as well as making sure we don't screw anything up
too badly while doing so.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
a1d4f67c12 transport: make protocol.file.allow be "user" by default
An earlier patch discussed and fixed a scenario where Git could be used
as a vector to exfiltrate sensitive data through a Docker container when
a potential victim clones a suspicious repository with local submodules
that contain symlinks.

That security hole has since been plugged, but a similar one still
exists.  Instead of convincing a would-be victim to clone an embedded
submodule via the "file" protocol, an attacker could convince an
individual to clone a repository that has a submodule pointing to a
valid path on the victim's filesystem.

For example, if an individual (with username "foo") has their home
directory ("/home/foo") stored as a Git repository, then an attacker
could exfiltrate data by convincing a victim to clone a malicious
repository containing a submodule pointing at "/home/foo/.git" with
`--recurse-submodules`. Doing so would expose any sensitive contents in
stored in "/home/foo" tracked in Git.

For systems (such as Docker) that consider everything outside of the
immediate top-level working directory containing a Dockerfile as
inaccessible to the container (with the exception of volume mounts, and
so on), this is a violation of trust by exposing unexpected contents in
the working copy.

To mitigate the likelihood of this kind of attack, adjust the "file://"
protocol's default policy to be "user" to prevent commands that execute
without user input (including recursive submodule initialization) from
taking place by default.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
f4a32a550f t/t9NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that interact with submodules a handful of times use
`test_config_global`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
0d3beb71da t/t7NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
0f21b8f468 t/t6NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
225d2d50cc t/t5NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
ac7e57fa28 t/t4NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
f8d510ed0b t/t3NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
99f4abb8da t/2NNNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
8a96dbcb33 t/t1NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
7de0c306f7 t/lib-submodule-update.sh: allow local submodules
To prepare for changing the default value of `protocol.file.allow` to
"user", update the `prolog()` function in lib-submodule-update to allow
submodules to be cloned over the file protocol.

This is used by a handful of submodule-related test scripts, which
themselves will have to tweak the value of `protocol.file.allow` in
certain locations. Those will be done in subsequent commits.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
6f054f9fb3 builtin/clone.c: disallow --local clones with symlinks
When cloning a repository with `--local`, Git relies on either making a
hardlink or copy to every file in the "objects" directory of the source
repository. This is done through the callpath `cmd_clone()` ->
`clone_local()` -> `copy_or_link_directory()`.

The way this optimization works is by enumerating every file and
directory recursively in the source repository's `$GIT_DIR/objects`
directory, and then either making a copy or hardlink of each file. The
only exception to this rule is when copying the "alternates" file, in
which case paths are rewritten to be absolute before writing a new
"alternates" file in the destination repo.

One quirk of this implementation is that it dereferences symlinks when
cloning. This behavior was most recently modified in 36596fd2df (clone:
better handle symlinked files at .git/objects/, 2019-07-10), which
attempted to support `--local` clones of repositories with symlinks in
their objects directory in a platform-independent way.

Unfortunately, this behavior of dereferencing symlinks (that is,
creating a hardlink or copy of the source's link target in the
destination repository) can be used as a component in attacking a
victim by inadvertently exposing the contents of file stored outside of
the repository.

Take, for example, a repository that stores a Dockerfile and is used to
build Docker images. When building an image, Docker copies the directory
contents into the VM, and then instructs the VM to execute the
Dockerfile at the root of the copied directory. This protects against
directory traversal attacks by copying symbolic links as-is without
dereferencing them.

That is, if a user has a symlink pointing at their private key material
(where the symlink is present in the same directory as the Dockerfile,
but the key itself is present outside of that directory), the key is
unreadable to a Docker image, since the link will appear broken from the
container's point of view.

This behavior enables an attack whereby a victim is convinced to clone a
repository containing an embedded submodule (with a URL like
"file:///proc/self/cwd/path/to/submodule") which has a symlink pointing
at a path containing sensitive information on the victim's machine. If a
user is tricked into doing this, the contents at the destination of
those symbolic links are exposed to the Docker image at runtime.

One approach to preventing this behavior is to recreate symlinks in the
destination repository. But this is problematic, since symlinking the
objects directory are not well-supported. (One potential problem is that
when sharing, e.g. a "pack" directory via symlinks, different writers
performing garbage collection may consider different sets of objects to
be reachable, enabling a situation whereby garbage collecting one
repository may remove reachable objects in another repository).

Instead, prohibit the local clone optimization when any symlinks are
present in the `$GIT_DIR/objects` directory of the source repository.
Users may clone the repository again by prepending the "file://" scheme
to their clone URL, or by adding the `--no-local` option to their `git
clone` invocation.

The directory iterator used by `copy_or_link_directory()` must no longer
dereference symlinks (i.e., it *must* call `lstat()` instead of `stat()`
in order to discover whether or not there are symlinks present). This has
no bearing on the overall behavior, since we will immediately `die()` on
encounter a symlink.

Note that t5604.33 suggests that we do support local clones with
symbolic links in the source repository's objects directory, but this
was likely unintentional, or at least did not take into consideration
the problem with sharing parts of the objects directory with symbolic
links at the time. Update this test to reflect which options are and
aren't supported.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
d7f69b76ec Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.38 (round 3)
2022-10-01 10:02:03 +08:00
e288b3de35 branch: do not fail a no-op --edit-desc
Imagine running "git branch --edit-description" while on a branch
without the branch description, and then exit the editor after
emptying the edit buffer, which is the way to tell the command that
you changed your mind and you do not want the description after all.

The command should just happily oblige, adding no branch description
for the current branch, and exit successfully.  But it fails to do
so:

    $ git init -b main
    $ git commit --allow-empty -m commit
    $ GIT_EDITOR=: git branch --edit-description
    fatal: could not unset 'branch.main.description'

The end result is OK in that the configuration variable does not
exist in the resulting repository, but we should do better.  If we
know we didn't have a description, and if we are asked not to have a
description by the editor, we can just return doing nothing.

This of course introduces TOCTOU.  If you add a branch description
to the same branch from another window, while you had the editor
open to edit the description, and then exit the editor without
writing anything there, we'd end up not removing the description you
added in the other window.  But you are fooling yourself in your own
repository at that point, and if it hurts, you'd be better off not
doing so ;-).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-30 11:13:51 -07:00
5e7c8b75e7 test-lib: have SANITIZE=leak imply TEST_NO_MALLOC_CHECK
Since 131b94a10a (test-lib.sh: Use GLIBC_TUNABLES instead of
MALLOC_CHECK_ on glibc >= 2.34, 2022-03-04) compiling with
SANITIZE=leak has missed reporting some leaks. The old MALLOC_CHECK
method used before glibc 2.34 seems to have been (mostly?) compatible
with it, but after 131b94a10a e.g. running:

	TEST_NO_MALLOC_CHECK=1 make SANITIZE=leak test T=t6437-submodule-merge.sh

Would report a leak in builtin/commit.c, but this would not:

	TEST_NO_MALLOC_CHECK= make SANITIZE=leak test T=t6437-submodule-merge.sh

Since the interaction is clearly breaking the SANITIZE=leak mode,
let's mark them as explicitly incompatible.

A related regression for SANITIZE=address was fixed in
067109a5e7 (tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK,
2022-04-09).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-29 08:37:45 -07:00
21cefac967 Merge branch 'l10n-de-2.38-rnd3' of github.com:ralfth/git
* 'l10n-de-2.38-rnd3' of github.com:ralfth/git:
  l10n: de.po: update German translation
2022-09-29 18:54:12 +08:00
2a905f8fa8 push: improve grammar of branch.autoSetupMerge advice
"upstream branches" is plural but "name" and "local branch" are
singular. Make them all singular. And because we're talking about a
hypothetical branch that doesn't exist yet, use the future tense.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-28 19:03:10 -07:00
48bf511320 Merge branch 'fr_2.38_rnd3' of github.com:jnavila/git
* 'fr_2.38_rnd3' of github.com:jnavila/git:
  l10n: fr: v2.38.0 round 3
2022-09-29 08:00:30 +08:00
08f41b8171 Merge branch 'catalan' of github.com:Softcatala/git-po
* 'catalan' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2022-09-29 07:59:44 +08:00
48fe8e6a63 l10n: fr: v2.38.0 round 3
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2022-09-28 21:46:22 +02:00
4a6ed30f96 read-cache: avoid misaligned reads in index v4
The process for reading the index into memory from disk is to first read its
contents into a single memory-mapped file buffer (type 'char *'), then
sequentially convert each on-disk index entry into a corresponding incore
'cache_entry'. To access the contents of the on-disk entry for processing, a
moving pointer within the memory-mapped file is cast to type 'struct
ondisk_cache_entry *'.

In index v4, the entries in the on-disk index file are written *without*
aligning their first byte to a 4-byte boundary; entries are a variable
length (depending on the entry name and whether or not extended flags are
used). As a result, casting the 'char *' buffer pointer to 'struct
ondisk_cache_entry *' then accessing its contents in a 'SANITIZE=undefined'
build can trigger the following error:

  read-cache.c:1886:46: runtime error: member access within misaligned
  address <address> for type 'struct ondisk_cache_entry', which requires 4
  byte alignment

Avoid this error by reading fields directly from the 'char *' buffer, using
the 'offsetof' individual fields in 'struct ondisk_cache_entry'.
Additionally, add documentation describing why the new approach avoids the
misaligned address error, as well as advice on how to improve the
implementation in the future.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-28 10:32:18 -07:00
42fe2b951a l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2022-09-28 19:05:55 +02:00
92481d1b26 merge-ort: return early when failing to write a blob
In the previous commit, we fixed a segmentation fault when a tree object
could not be written.

However, before the tree object is written, `merge-ort` wants to write
out a blob object (except in cases where the merge results in a blob
that already exists in the database). And this can fail, too, but we
ignore that write failure so far.

Let's pay close attention and error out early if the blob could not be
written. This reduces the error output of t4301.25 ("merge-ort fails
gracefully in a read-only repository") from:

	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add greeting to database
	error: insufficient permission for adding an object to repository database ./objects
	fatal: failure to merge

to:

	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	fatal: failure to merge

This is _not_ just a cosmetic change: Even though one might assume that
the operation would have failed anyway at the point when the new tree
object is written (and the corresponding tree object _will_ be new if it
contains a blob that is new), but that is not so: As pointed out by
Elijah Newren, when Git has previously been allowed to add loose objects
via `sudo` calls, it is very possible that the blob object cannot be
written (because the corresponding `.git/objects/??/` directory may be
owned by `root`) but the tree object can be written (because the
corresponding objects directory is owned by the current user). This
would result in a corrupt repository because it is missing the blob
object, and with this here patch we prevent that.

Note: This patch adjusts two variable declarations from `unsigned` to
`int` because their purpose is to hold the return value of
`handle_content_merge()`, which is of type `int`. The existing users of
those variables are only interested whether that variable is zero or
non-zero, therefore this type change does not affect the existing code.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-28 08:49:35 -07:00
0b55d930a6 merge-ort: fix segmentation fault in read-only repositories
If the blob/tree objects cannot be written, we really need the merge
operations to fail, and not to continue (and then try to access the tree
object which is however still set to `NULL`).

Let's stop ignoring the return value of `write_object_file()` and
`write_tree()` and set `clean = -1` in the error case.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-28 08:49:27 -07:00
92e51feec5 l10n: de.po: update German translation
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2022-09-28 17:15:53 +02:00
b796ca1cd4 l10n: zh_CN: 2.38.0 round 3
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2022-09-28 15:51:07 +01:00
37db9416c4 Merge branch 'turkish' of github.com:bitigchi/git-po
* 'turkish' of github.com:bitigchi/git-po:
  l10n: tr: v2.38.0 3rd round
2022-09-28 20:54:29 +08:00
8d500614f7 Merge branch 'master' of github.com:alshopov/git-po
* 'master' of github.com:alshopov/git-po:
  l10n: bg.po: Updated Bulgarian translation (5484t)
2022-09-28 20:52:34 +08:00
2c30dfa7d7 l10n: tr: v2.38.0 3rd round
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2022-09-28 12:32:13 +03:00
88fda53a16 l10n: bg.po: Updated Bulgarian translation (5484t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2022-09-28 11:07:07 +02:00
55b1c1ab1c l10n: po-id for 2.38 (round 3)
Update following components:

  * sequencer.c
  * wt-status.c

Translate following new components:

  * compat/compiler.h
  * compat/disk.h
  * compat/fsmonitor/fsm-health-win32.c
  * compat/fsmonitor/fsm-listen-darwin.c
  * compat/fsmonitor/fsm-listen-win32.c
  * compat/fsmonitor/fsm-settings-win32.c
  * compat/mingw.c
  * compat/obstack.c
  * compat/regex/regcomp.c
  * compat/simple-ipc/ipc-unix-socket.c
  * compat/simple-ipc/ipc-win32.c
  * compat/terminal.c
  * convert.c
  * entry.c
  * environment.c
  * exec-cmd.c
  * git-merge-octopus.sh
  * git-sh-setup.sh
  * list-objects-filter-options.c
  * list-objects-filter-options.h
  * list-objects.c
  * lockfile.c
  * ls-refs.c
  * mailinfo.c
  * name-hash.c
  * notes-merge.c
  * notes-utils.c
  * pkt-line.c
  * preload-index.c
  * pretty.c
  * promisor-remote.c
  * protocol-caps.c
  * read-cache.c
  * scalar.c
  * transport-helper.c
  * transport.c
  * tree-walk.c
  * urlmatch.c
  * walker.c
  * wrapper.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2022-09-28 15:06:14 +07:00
9af6cb88b6 l10n: es: update translation
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
2022-09-27 22:56:55 -06:00
4ff58e9690 Merge branch 'main' of github.com:git/git
* 'main' of github.com:git/git:
  Git 2.38-rc2
  pack-bitmap: remove trace2 region from hot path
2022-09-28 08:03:38 +08:00
bcd6bc478a Git 2.38-rc2
We have small updates since -rc1 but none of them is about a new
thing and there is no updates to the release notes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-27 11:25:52 -07:00
d151f0cce7 string-list: document iterator behavior on NULL input
The for_each_string_list_item() macro takes a string_list and
automatically constructs a for loop to iterate over its contents. This
macro will segfault if the list is non-NULL.

We cannot change the macro to be careful around NULL values because
there are many callers that use the address of a local variable, which
will never be NULL and will cause compile errors with -Werror=address.

For now, leave a documentation comment to try to avoid mistakes in the
future where a caller does not check for a NULL list.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-27 09:32:26 -07:00
50a044f1e4 gc: replace config subprocesses with API calls
The 'git maintenance [un]register' commands set or unset the multi-
valued maintenance.repo config key with the absolute path of the current
repository. These are set in the global config file.

Instead of calling a subcommand and creating a new process, create the
proper API calls to git_config_set_multivar_in_file_gently(). It
requires loading the filename for the global config file (and erroring
out if now $HOME value is set). We also need to be careful about using
CONFIG_REGEX_NONE when adding the value and using
CONFIG_FLAGS_FIXED_VALUE when removing the value. In both cases, we
check that the value already exists (this check already existed for
'unregister').

Also, remove the transparent translation of the error code from the
config API to the exit code of 'git maintenance'. Instead, use die() to
recover from failures at that level. In the case of 'unregister
--force', allow the CONFIG_NOTHING_SET error code to be a success. This
allows a possible race where another process removes the config value.
The end result is that the config value is not set anymore, so we can
treat this as a success.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-27 09:32:26 -07:00
d871b6c6c6 scalar: make 'unregister' idempotent
The 'scalar unregister' command removes a repository from the list of
registered Scalar repositories and removes it from the list of
repositories registered for background maintenance. If the repository
was not already registered for background maintenance, then the command
fails, even if the repository was still registered as a Scalar
repository.

After using 'scalar clone' or 'scalar register', the repository would be
enrolled in background maintenance since those commands run 'git
maintenance start'. If the user runs 'git maintenance unregister' on
that repository, then it is still in the list of repositories which get
new config updates from 'scalar reconfigure'. The 'scalar unregister'
command would fail since 'git maintenance unregister' would fail.

Further, the add_or_remove_enlistment() method in scalar.c already has
this idempotent nature built in as an expectation since it returns zero
when the scalar.repo list already has the proper containment of the
repository.

The previous change added the 'git maintenance unregister --force'
option, so use it within 'scalar unregister' to make it idempotent.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-27 09:32:26 -07:00
1ebe6b0297 maintenance: add 'unregister --force'
The 'git maintenance unregister' subcommand has a step that removes the
current repository from the multi-valued maitenance.repo config key.
This fails if the repository is not listed in that key. This makes
running 'git maintenance unregister' twice result in a failure in the
second instance.

This failure exit code is helpful, but its message is not. Add a new
die() message that explicitly calls out the failure due to the
repository not being registered.

In some cases, users may want to run 'git maintenance unregister' just
to make sure that background jobs will not start on this repository, but
they do not want to check to see if it is registered first. Add a new
'--force' option that will siltently succeed if the repository is not
already registered.

Also add an extra test of 'git maintenance unregister' at a point where
there are no registered repositories. This should fail without --force.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-27 09:32:25 -07:00
2a7d63a245 Merge branch 'ds/bitmap-lookup-remove-tracing'
Perf-fix.

* ds/bitmap-lookup-remove-tracing:
  pack-bitmap: remove trace2 region from hot path
2022-09-26 21:46:51 -07:00
89a1ab8fb5 pack-bitmap: remove trace2 region from hot path
The trace2 region around the call to lazy_bitmap_for_commit() in
bitmap_for_commit() was added in 28cd730680 (pack-bitmap: prepare to
read lookup table extension, 2022-08-14). While adding trace2 regions is
typically helpful for tracking performance, this method is called
possibly thousands of times as a commit walk explores commit history
looking for a matching bitmap. When trace2 output is enabled, this
region is emitted many times and performance is throttled by that
output.

For now, remove these regions entirely.

This is a critical path, and it would be valuable to measure that the
time spent in bitmap_for_commit() does not increase when using the
commit lookup table. The best way to do that would be to use a mechanism
that sums the time spent in a region and reports a single value at the
end of the process. This technique was introduced but not merged by [1]
so maybe this example presents some justification to revisit that
approach.

[1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/

To help with the 'git blame' output in this region, add a comment that
warns against adding a trace2 region. Delete a test from t5310 that used
that trace output to check that this lookup optimization was activated.
To create this kind of test again in the future, the stopwatch traces
mentioned earlier could be used as a signal that we activated this code
path.

Helpedy-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-26 12:09:18 -07:00
abcac2e19f ref-filter.c: fix a leak in get_head_description
In 2708ce62d2 (branch: sort detached HEAD based on a flag, 2021-01-07) a
call to wt_status_state_free_buffers, responsible of freeing the
resources that could be allocated in the local struct wt_status_state
state, was eliminated.

The call to wt_status_state_free_buffers was introduced in 962dd7ebc3
(wt-status: introduce wt_status_state_free_buffers(), 2020-09-27).  This
commit brings back that call in get_head_description.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Reviewed-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-26 11:14:49 -07:00
3e367a5f2f sequencer: avoid dropping fixup commit that targets self via commit-ish
Commit 68d5d03bc4 (rebase: teach --autosquash to match on sha1 in
addition to message, 2010-11-04) taught autosquash to recognize
subjects like "fixup! 7a235b" where 7a235b is an OID-prefix. It
actually did more than advertised: 7a235b can be an arbitrary
commit-ish (as long as it's not trailed by spaces).

Accidental(?) use of this secret feature revealed a bug where we
would silently drop a fixup commit. The bug can also be triggered
when using an OID-prefix but that's unlikely in practice.

Let the commit with subject "fixup! main" be the tip of the "main"
branch. When computing the fixup target for this commit, we find
the commit itself. This is wrong because, by definition, a fixup
target must be an earlier commit in the todo list. We wrongly find
the current commit because we added it to the todo list prematurely.
Avoid these fixup-cycles by only adding the current commit to the
todo list after we have finished looking for the fixup target.

Reported-by: Erik Cervin Edin <erik@cervined.in>
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-26 10:11:57 -07:00
33ccfd1e5b l10n: sv.po: Update Swedish translation (5484t0f0u)
Also fix a couple of typos.

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2022-09-26 06:36:23 +01:00
6c9165c07a l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2022-09-25 21:04:22 +02:00
54e1f9f66d Merge branch 'main' of github.com:git/git
* 'main' of github.com:git/git:
  cmd-list.perl: fix identifying man sections
  pack-bitmap: improve grammar of "xor chain" error message
2022-09-24 21:51:06 +08:00
456a75f814 Merge branch 'fr_quickfix' of github.com:jnavila/git
* 'fr_quickfix' of github.com:jnavila/git:
  l10n: fr: don't say that merge is "the default strategy"
2022-09-24 21:12:37 +08:00
9865dce557 Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.38 (round 2)
2022-09-24 21:09:22 +08:00
1d8177c6fa Merge branch 'turkish' of github.com:bitigchi/git-po
* 'turkish' of github.com:bitigchi/git-po:
  l10n: tr: v2.38.0 round 2
2022-09-24 21:08:11 +08:00
d1e76d5ddc l10n: fr: don't say that merge is "the default strategy"
The text of this message was changed in commit
71076d0edd to avoid making any
suggestion about which strategy is better for the situation at hand.
Update the Franch translation to match.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
2022-09-23 20:42:58 +02:00
4fd6c5e444 Merge branch 'ac/bitmap-lookup-table'
Grammofix.

* ac/bitmap-lookup-table:
  pack-bitmap: improve grammar of "xor chain" error message
2022-09-23 11:07:49 -07:00
0d14f80f94 Merge branch 'ma/scalar-to-main-fix'
Fix manpage generation.

* ma/scalar-to-main-fix:
  cmd-list.perl: fix identifying man sections
2022-09-23 11:07:48 -07:00
32c6fff4b8 cmd-list.perl: fix identifying man sections
We attribute each documentation text file to a man section by finding a
line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
("scalar: add to 'git help -a' command list", 2022-09-02) updated this
logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
it forgot to account for the fact that after the updated regex has found
a match, the man section is no longer to be found in `$1` but now lives
in `$2`.

This makes our git(1) manpage look as follows:

  Main porcelain commands
       git-add(git)
           Add file contents to the index.

  [...]

       gitk(git)
           The Git repository browser.

       scalar(scalar)
           A tool for managing large Git repositories.

Restore the man sections by not capturing the (git|scalar) part of the
match into `$1`.

As noted by Ævar [1], we could even match any "foo" rather than just
"gitfoo" and "scalarfoo", but that's a larger change. For now, just fix
the regression in cc75e556a9.

[1] https://lore.kernel.org/git/220923.86wn9u4joo.gmgdl@evledraar.gmail.com/#t

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-23 10:01:07 -07:00
7cae7627c4 builtin/grep.c: integrate with sparse index
Turn on sparse index and remove ensure_full_index().

Before this patch, `git-grep` utilizes the ensure_full_index() method to
expand the index and search all the entries. Because this method
requires walking all the trees and constructing the index, it is the
slow part within the whole command.

To achieve better performance, this patch uses grep_tree() to search the
sparse directory entries and get rid of the ensure_full_index() method.

Why grep_tree() is a better choice over ensure_full_index()?

1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks
   into every sparse-directory entry (represented by a tree) recursively
   when looping over the index, and the result of doing so matches the
   result of expanding the index.

2) grep_tree() utilizes pathspecs to limit the scope of searching.
   ensure_full_index() always expands the index, which means it will
   always walk all the trees and blobs in the repo without caring if
   the user only wants a subset of the content, i.e. using a pathspec.
   On the other hand, grep_tree() will only search the contents that
   match the pathspec, and thus possibly walking fewer trees.

3) grep_tree() does not construct and copy back a new index, while
   ensure_full_index() does. This also saves some time.

----------------
Performance test

- Summary:

p2000 tests demonstrate a ~71% execution time reduction for
`git grep --cached bogus -- "f2/f1/f1/*"` using tree-walking logic.
However, notice that this result varies depending on the pathspec
given. See below "Command used for testing" for more details.

Test                              HEAD~   HEAD
-------------------------------------------------------
2000.78: git grep ... (full-v3)   0.35    0.39 (≈)
2000.79: git grep ... (full-v4)   0.36    0.30 (≈)
2000.80: git grep ... (sparse-v3) 0.88    0.23 (-73.8%)
2000.81: git grep ... (sparse-v4) 0.83    0.26 (-68.6%)

- Command used for testing:

	git grep --cached bogus -- "f2/f1/f1/*"

The reason for specifying a pathspec is that, if we don't specify a
pathspec, then grep_tree() will walk all the trees and blobs to find the
pattern, and the time consumed doing so is not too different from using
the original ensure_full_index() method, which also spends most of the
time walking trees. However, when a pathspec is specified, this latest
logic will only walk the area of trees enclosed by the pathspec, and the
time consumed is reasonably a lot less.

Generally speaking, because the performance gain is acheived by walking
less trees, which are specified by the pathspec, the HEAD time v.s.
HEAD~ time in sparse-v[3|4], should be proportional to
"pathspec enclosed area" v.s. "all area", respectively. Namely, the
wider the <pathspec> is encompassing, the less the performance
difference between HEAD~ and HEAD, and vice versa.

That is, if we don't specify a pathspec, the performance difference [1]
is indistinguishable: both methods walk all the trees and take generally
same amount of time (even with the index construction time included for
ensure_full_index()).

[1] Performance test result without pathspec (hence walking all trees):

	Command used:

		git grep --cached bogus

	Test                                HEAD~  HEAD
	---------------------------------------------------
	2000.78: git grep ... (full-v3)     6.17   5.19 (≈)
	2000.79: git grep ... (full-v4)     6.19   5.46 (≈)
	2000.80: git grep ... (sparse-v3)   6.57   6.44 (≈)
	2000.81: git grep ... (sparse-v4)   6.65   6.28 (≈)

--------------------------
NEEDSWORK about submodules

There are a few NEEDSWORKs that belong to improvements beyond this
topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for
more context. The other two NEEDSWORKs in t1092 are also relative.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-23 09:41:27 -07:00
711340c797 pack-bitmap: improve grammar of "xor chain" error message
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-23 08:54:05 -07:00
2b521630f9 check-non-portable-shell: detect obsolescent egrep/fgrep
GNU grep deprecated `egrep` and `fgrep` with release 2.5.3 in 2007.
As of release 3.8 in 2022, those commands warn[1] that they are
obsolescent. Now that all the Git test scripts have been scrubbed of
uses of `egrep` and `fgrep`, make `check-non-portable-shell` complain
about them to prevent new instances from creeping back into the project.

[1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-23 08:31:13 -07:00
75fc96d57e Merge branch 'dd/retire-efgrep' into es/retire-efgrep
* dd/retire-efgrep:
  t: convert fgrep usage to "grep -F"
  t: convert egrep usage to "grep -E"
  t: remove \{m,n\} from BRE grep usage
  CodingGuidelines: allow grep -E
2022-09-23 08:31:04 -07:00
d5be499eed l10n: zh_CN v2.38.0 rounds 1 & 2
Reviewed-by: Jiang Xin <worldhello.net@gmail.com>
Reviewed-by: Li Linchao <lilinchao@oschina.cn>
Reviewed-by: 依云 <lilydjwg@gmail.com>
Signed-off-by: Fangyi Zhou <me@fangyi.io>
2022-09-23 14:53:24 +01:00
d4df8609f3 l10n: po-id for 2.38 (round 2)
Update following components:

  * branch.c
  * builtin/log.c
  * builtin/rebase.c
  * builtin/remote.c
  * builtin/reset.c
  * builtin/rev-list.c
  * builtin/rev-parse.c
  * builtin/revert.c
  * builtin/sparse-checkout.c
  * builtin/submodule--helper.c
  * command-list.h
  * help.c
  * merge.c

Translate following new components:

  * builtin/check-attr.c
  * builtin/check-ignore.c
  * builtin/check-mailmap.c
  * builtin/column.c
  * builtin/credential-cache--daemon.c
  * builtin/credential-cache.c
  * builtin/credential-store.c
  * builtin/diagnose.c
  * builtin/env--helper.c
  * builtin/fsmonitor--daemon.c
  * builtin/interpret-trailers.c
  * builtin/mailinfo.c
  * builtin/mailsplit.c
  * builtin/mktag.c
  * builtin/mktree.c
  * builtin/pack-redundant.c
  * builtin/replace.c
  * builtin/rerere.c
  * builtin/stripspace.c
  * bulk-checkin.c
  * commit.c
  * credential.c
  * fsmonitor-ipc.c
  * fsmonitor-settings.c
  * http-fetch.c
  * http.c

Also remove unused strings.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2022-09-23 20:02:42 +07:00
20f5a4f114 l10n: tr: v2.38.0 round 2
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2022-09-23 13:10:04 +03:00
471ae3e297 l10n: bg.po: Updated Bulgarian translation (5484t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2022-09-23 11:21:03 +02:00
f2d1418080 Merge branch 'fr_v2.38_rnd2' of github.com:jnavila/git
* 'fr_v2.38_rnd2' of github.com:jnavila/git:
  l10n: fr: v2.38.0 round 2
  l10n: fr: v2.38 round 1
  l10n: fr: The word 'branche' is only feminine
2022-09-23 17:06:12 +08:00
f5e09d5711 Merge branch 'catalan' of github.com:Softcatala/git-po
* 'catalan' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2022-09-23 16:58:14 +08:00
e3be58c005 Merge branch 'l10n-de-2.38' of github.com:ralfth/git
* 'l10n-de-2.38' of github.com:ralfth/git:
  l10n: de.po: update German translation
2022-09-23 16:51:23 +08:00
eb0d781094 Merge branch 'main' of github.com:git/git
* 'main' of github.com:git/git:
  list-objects-filter: initialize sub-filter structs
  Git 2.38-rc1
  Final batch before -rc1
  builtin/diagnose.c: don't translate the two mode values
  t/Makefile: remove 'test-results' on 'make clean'
  gc: don't translate literal commands
  Documentation: clean up various typos in technical docs
  Documentation: clean up a few misspelled word typos
  version: fix builtin linking & documentation
  diagnose: add to command-list.txt
  Documentation: add ReviewingGuidelines
  commit-graph: Fix missing closedir in expire_commit_graphs
  diagnose.c: refactor to safely use 'd_type'
  help: fix doubled words in explanation for developer interfaces
  api docs: link to html version of api-trace2
  docs: fix a few recently broken links
  reftable: use a pointer for pq_entry param
2022-09-23 16:50:32 +08:00
4b79ee4b0c Merge branch 'jk/list-objects-filter-cleanup'
Fix uninitialized memory access in a recent fix-up that is already
in -rc1.

* jk/list-objects-filter-cleanup:
  list-objects-filter: initialize sub-filter structs
2022-09-22 15:30:47 -07:00
630a6429a7 osxkeychain: clarify that we ignore unknown lines
Like in all the other credential helpers, the osxkeychain helper
ignores unknown credential lines.

Add a comment (a la the other helpers) to make it clear and explicit
that this is the desired behaviour.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 14:21:04 -07:00
6ea87d97af netrc: ignore unknown lines (do not die)
Contrary to the documentation on credential helpers, as well as the help
text for git-credential-netrc itself, this helper will `die` when
presented with an unknown property/attribute/token.

Correct the behaviour here by skipping and ignoring any tokens that are
unknown. This means all helpers in the tree are consistent and ignore
any unknown credential properties/attributes.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 14:20:59 -07:00
d695804983 wincred: ignore unknown lines (do not die)
It is the expectation that credential helpers be liberal in what they
accept and conservative in what they return, to allow for future growth
and evolution of the protocol/interaction.

All of the other helpers (store, cache, osxkeychain, libsecret,
gnome-keyring) except `netrc` currently ignore any credential lines
that are not recognised, whereas the Windows helper (wincred) instead
dies.

Fix the discrepancy and ignore unknown lines in the wincred helper.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 14:20:37 -07:00
5a97b38109 remote: handle rename of remote without fetch refspec
We return an error when trying to rename a remote that has no fetch
refspec:

  $ git config --unset-all remote.origin.fetch
  $ git remote rename origin foo
  fatal: could not unset 'remote.foo.fetch'

To make things even more confusing, we actually _do_ complete the config
modification, via git_config_rename_section(). After that we try to
rewrite the fetch refspec (to say refs/remotes/foo instead of origin).
But our call to git_config_set_multivar() to remove the existing entries
fails, since there aren't any, and it calls die().

We could fix this by using the "gently" form of the config call, and
checking the error code. But there is an even simpler fix: if we know
that there are no refspecs to rewrite, then we can skip that part
entirely.

Reported-by: John A. Leuenhagen <john@zlima12.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 12:59:52 -07:00
3b910d6e29 clone: allow "--bare" with "-o"
We explicitly forbid the combination of "--bare" with "-o", but there
doesn't seem to be any good reason to do so. The original logic came as
part of e6489a1bdf (clone: do not accept more than one -o option.,
2006-01-22), but that commit does not give any reason.

Furthermore, the equivalent combination via config is allowed:

  git -c clone.defaultRemoteName=foo clone ...

and works as expected. It may be that this combination was considered
useless, because a bare clone does not set remote.origin.fetch (and
hence there is no refs/remotes/origin hierarchy). But it does set
remote.origin.url, and that name is visible to the user via "git fetch
origin", etc.

Let's allow the options to be used together, and switch the "forbid"
test in t5606 to check that we use the requested name. That test came
much later in 349cff76de (clone: add tests for --template and some
disallowed option pairs, 2020-09-29), and does not offer any logic
beyond "let's test what the code currently does".

Reported-by: John A. Leuenhagen <john@zlima12.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 12:57:03 -07:00
d5e81315d2 l10n: fr: v2.38.0 round 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2022-09-22 21:52:26 +02:00
77532d041a l10n: fr: v2.38 round 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2022-09-22 21:52:26 +02:00
97db13f26c l10n: fr: The word 'branche' is only feminine
Signed-off-by: hbossot <hbossot@profideo.com>
2022-09-22 21:52:04 +02:00
4eaed7c2f2 list-objects-filter: initialize sub-filter structs
Since commit c54980ab83 (list-objects-filter: convert filter_spec to a
strbuf, 2022-09-11), building with SANITIZE=undefined triggers an error
in t5616.

The problem is that we end up with a strbuf that has been
zero-initialized instead of via STRBUF_INIT. Feeding that strbuf to
strbuf_addbuf() in list_objects_filter_copy() means we will call memcpy
like:

   memcpy(some_actual_buffer, NULL, 0);

This works on most systems because we're copying zero bytes, but it is
technically undefined behavior to ever pass NULL to memcpy.

Even though c54980ab83 is where the bug manifests, that is only because
we switched away from a string_list, which is OK with being
zero-initialized (though it may cause other problems by not duplicating
the strings, it happened to be OK in this instance).

The actual bug is caused by the commit before that, 2a01bdedf8
(list-objects-filter: add and use initializers, 2022-09-11). There we
consistently initialize the top-level filter structs, but we forgot the
dynamically allocated ones we stick in filter_options->sub when creating
combined filters.

Note that we need to fix two spots here: where we parse a "combine:"
filter, but also where we transform from a single-filter into a combined
one after seeing multiple "--filter" options. In the second spot, we'll
do some minor refactoring to avoid repeating our very-long array index.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 12:43:04 -07:00
51b27747e5 parse_object_buffer(): respect save_commit_buffer
If the global variable "save_commit_buffer" is set to 0, then
parse_commit() will throw away the commit object data after parsing it,
rather than sticking it into a commit slab. This goes all the way back
to 60ab26de99 ([PATCH] Avoid wasting memory in git-rev-list,
2005-09-15).

But there's another code path which may similarly stash the buffer:
parse_object_buffer(). This is where we end up if we parse a commit via
parse_object(), and it's used directly in a few other code paths like
git-fsck.

The original goal of 60ab26de99 was avoiding extra memory usage for
rev-list. And there it's not all that important to catch parse_object().
We use that function only for looking at the tips of the traversal, and
the majority of the commits are parsed by following parent links, where
we use parse_commit() directly. So we were wasting some memory, but only
a small portion.

It's much easier to see the effect with fsck. Since we now turn off
save_commit_buffer by default there, we _should_ be able to drop the
freeing of the commit buffer in fsck_obj(). But if we do so (taking the
first hunk of this patch without the rest), then the peak heap of "git
fsck" in a clone of git.git goes from 136MB to 194MB. Teaching
parse_object_buffer() to respect save_commit_buffer brings that down to
134.5MB (it's hard to tell from massif's output, but I suspect the
savings comes from avoiding the overhead of the mostly-empty commit
slab).

Other programs should see a small improvement. Both "rev-list --all" and
"fsck --connectivity-only" improve by a few hundred kilobytes, as they'd
avoid loading the tip objects of their traversals.

Most importantly, no code should be hurt by doing this. Any program that
turns off save_commit_buffer is already making the assumption that any
commit it sees may need to have its object data loaded on demand, as it
doesn't know which ones were parsed by parse_commit() versus
parse_object(). Not to mention that anything parsed by the commit graph
may be in the same boat, even if save_commit_buffer was not disabled.

This should be the only spot that needs to be fixed. Grepping for
set_commit_buffer() shows that this and parse_commit() are the only
relevant calls.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 11:40:47 -07:00
069e445256 fsck: turn off save_commit_buffer
When parsing a commit, the default behavior is to stuff the original
buffer into a commit_slab (which takes ownership of it). But for a tool
like fsck, this isn't useful. While we may look at the buffer further as
part of fsck_commit(), we'll always do so through a separate pointer;
attaching the buffer to the slab doesn't help.

Worse, it means we have to remember to free the commit buffer in all
call paths. We do so in fsck_obj(), which covers a regular "git fsck".
But with "--connectivity-only", we forget to do so in both
traverse_one_object(), which covers reachable objects, and
mark_unreachable_referents(), which covers unreachable ones. As a
result, that mode ends up storing an uncompressed copy of every commit
on the heap at once.

We could teach the code paths for --connectivity-only to also free
commit buffers. But there's an even easier fix: we can just turn off the
save_commit_buffer flag, and then we won't attach them to the commits in
the first place.

This reduces the peak heap of running "git fsck --connectivity-only" in
a clone of linux.git from ~2GB to ~1GB. According to massif, the
remaining memory goes where you'd expect: the object structs themselves,
the obj_hash containing them, and the delta base cache.

Note that we'll leave the call to free commit buffers in fsck_obj() for
now; it's not quite redundant because of a related bug that we'll fix in
a subsequent commit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 11:40:11 -07:00
fbce4fa9ae fsck: free tree buffers after walking unreachable objects
After calling fsck_walk(), a tree object struct may be left in the
parsed state, with the full tree contents available via tree->buffer.
It's the responsibility of the caller to free these when it's done with
the object to avoid having many trees allocated at once.

In a regular "git fsck", we hit fsck_walk() only from fsck_obj(), which
does call free_tree_buffer(). Likewise for "--connectivity-only", we see
most objects via traverse_one_object(), which makes a similar call.

The exception is in mark_unreachable_referents(). When using both
"--connectivity-only" and "--dangling" (the latter of which is the
default), we walk all of the unreachable objects, and there we forget to
free. Most cases would not notice this, because they don't have a lot of
unreachable objects, but you can make a pathological case like this:

  git clone --bare /path/to/linux.git repo.git
  cd repo.git
  rm packed-refs ;# now everything is unreachable!
  git fsck --connectivity-only

That ends up with peak heap usage ~18GB, which is (not coincidentally)
close to the size of all uncompressed trees in the repository. After
this patch, the peak heap is only ~2GB.

A few things to note:

  - it might seem like fsck_walk(), if it is parsing the trees, should
    be responsible for freeing them. But the situation is quite tricky.
    In the non-connectivity mode, after we call fsck_walk() we then
    proceed with fsck_object() which actually does the type-specific
    sanity checks on the object contents. We do pass our own separate
    buffer to fsck_object(), but there's a catch: our earlier call to
    parse_object_buffer() may have attached that buffer to the object
    struct! So by freeing it, we leave the rest of the code with a
    dangling pointer.

    Likewise, the call to fsck_walk() in index-pack is subtle. It
    attaches a buffer to the tree object that must not be freed! And
    so rather than calling free_tree_buffer(), it actually detaches it
    by setting tree->buffer to NULL.

    These cases would _probably_ be fixable by having fsck_walk() free
    the tree buffer only when it was the one who allocated it via
    parse_tree(). But that would still leave the callers responsible for
    freeing other cases, so they wouldn't be simplified. While the
    current semantics for fsck_walk() make it easy to accidentally leak
    in new callers, at least they are simple to explain, and it's not a
    function that's likely to get a lot of new call-sites.

    And in any case, it's probably sensible to fix the leak first with
    this simple patch, and try any more complicated refactoring
    separately.

  - a careful reader may notice that fsck_obj() also frees commit
    buffers, but neither the call in traverse_one_object() nor the one
    touched in this patch does so. And indeed, this is another problem
    for --connectivity-only (and accounts for most of the 2GB heap after
    this patch), but it's one we'll fix in a separate commit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 11:30:06 -07:00
aa923f75a6 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2022-09-22 18:30:42 +02:00
9e17cd5c05 l10n: de.po: update German translation
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Reviewed-by: Phillip Szelat <phillip.szelat@gmail.com>
2022-09-22 17:23:13 +02:00
1b3d6e17fe Git 2.38-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 15:27:03 -07:00
04cc66fe8c Merge branch 'sg/parse-options-subcommand'
Fix messages incorrectly marked for translation.

* sg/parse-options-subcommand:
  gc: don't translate literal commands
2022-09-21 15:27:03 -07:00
4140830d25 Merge branch 'js/typofix'
* js/typofix:
  Documentation: clean up various typos in technical docs
  Documentation: clean up a few misspelled word typos
2022-09-21 15:27:02 -07:00
17df9d3849 Merge branch 'sg/clean-test-results'
"make clean" stopped cleaning the test results directory as a side
effect of a topic that has nothing to do with "make clean", which
has been corrected.

* sg/clean-test-results:
  t/Makefile: remove 'test-results' on 'make clean'
2022-09-21 15:27:02 -07:00
2cf2ae9dd6 Merge branch 'vd/check-docs-fixes'
Build fix.

* vd/check-docs-fixes:
  version: fix builtin linking & documentation
  diagnose: add to command-list.txt
2022-09-21 15:27:02 -07:00
ac45db1e75 Merge branch 'vd/doc-reviewing-guidelines'
Just like we have coding guidelines, we now have guidelines for
reviewers.

* vd/doc-reviewing-guidelines:
  Documentation: add ReviewingGuidelines
2022-09-21 15:27:02 -07:00
86c108a8a2 Merge branch 'vd/scalar-generalize-diagnose'
Portability fix.

* vd/scalar-generalize-diagnose:
  builtin/diagnose.c: don't translate the two mode values
  diagnose.c: refactor to safely use 'd_type'
2022-09-21 15:27:01 -07:00
370d3a06a3 Final batch before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 14:23:14 -07:00
dd37e5607f Merge branch 'fz/help-doublofix'
Typofix for topic already in -rc0.

* fz/help-doublofix:
  help: fix doubled words in explanation for developer interfaces
2022-09-21 14:23:14 -07:00
8c88f75909 Merge branch 'tz/tech-docs-to-help-fix'
Docfix for topic already in -rc0.

* tz/tech-docs-to-help-fix:
  api docs: link to html version of api-trace2
  docs: fix a few recently broken links
2022-09-21 14:23:14 -07:00
3239100b5a Merge branch 'ml/commit-graph-expire-dir-leak-fix'
A result from opendir() was leaking in the commit-graph expiration
codepath, which has been plugged.

* ml/commit-graph-expire-dir-leak-fix:
  commit-graph: Fix missing closedir in expire_commit_graphs
2022-09-21 14:23:14 -07:00
f73ad8f75f Merge branch 'ec/reftable-pass-pq-entry-by-reference'
Small code clean-up in reftable implementation.

* ec/reftable-pass-pq-entry-by-reference:
  reftable: use a pointer for pq_entry param
2022-09-21 14:23:13 -07:00
02cb8b9ee3 fsmonitor--daemon: don't translate literal commands
These commands have no placeholders to be translated.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:56:42 -07:00
d956fa8082 builtin/diagnose.c: don't translate the two mode values
These strings are not translatable in the diagnose_options array in
diagnose.c. Don't translate them in builtin/diagnose.c either.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:53:35 -07:00
d11b875197 t/Makefile: remove 'test-results' on 'make clean'
The 't/test-results' directory and its contents are by-products of the
test process, so 'make clean' should remove them, but, alas, this has
been broken since fee65b194d (t/Makefile: don't remove test-results in
"clean-except-prove-cache", 2022-07-28).

The 'clean' target in 't/Makefile' was not directly responsible for
removing the 'test-results' directory, but relied on its dependency
'clean-except-prove-cache' to do that [1].  ee65b194d broke this,
because it only removed the 'rm -r test-results' command from the
'clean-except-prove-cache' target instead of moving it to the 'clean'
target, resulting in stray 't/test-results' directories.

Add that missing cleanup command to 't/Makefile', and to all
sub-Makefiles touched by that commit as well.

[1] 60f26f6348 (t/Makefile: retain cache t/.prove across prove runs,
                2012-05-02)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:32:13 -07:00
37eb90f79a t: convert fgrep usage to "grep -F"
Despite POSIX states that:

> The old egrep and fgrep commands are likely to be supported for many
> years to come as implementation extensions, allowing historical
> applications to operate unmodified.

GNU grep 3.8 started to warn[1]:

> The egrep and fgrep commands, which have been deprecated since
> release 2.5.3 (2007), now warn that they are obsolescent and should
> be replaced by grep -E and grep -F.

Prepare for their removal in the future.

[1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:00:19 -07:00
81580fa06d t: convert egrep usage to "grep -E"
Despite POSIX states that:

> The old egrep and fgrep commands are likely to be supported for many
> years to come as implementation extensions, allowing historical
> applications to operate unmodified.

GNU grep 3.8 started to warn[1]:

> The egrep and fgrep commands, which have been deprecated since
> release 2.5.3 (2007), now warn that they are obsolescent and should
> be replaced by grep -E and grep -F.

Prepare for their removal in the future.

[1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:00:18 -07:00
a764c37bad t: remove \{m,n\} from BRE grep usage
The CodingGuidelines says we should avoid \{m,n\} in BRE usage.
And their usages in our code base is limited, and subjectively
hard to read.

Replace them with ERE.

Except for "0\{40\}" which would be changed to "$ZERO_OID",
which is a better value for testing with:
GIT_TEST_DEFAULT_HASH=sha256

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:00:18 -07:00
2e092725e6 CodingGuidelines: allow grep -E
Despite forbidden by CodingGuidelines, our usage of 'grep -E' has been
increased over the years, and noone has come and complained.

Let's lift the restriction.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:00:18 -07:00
8b74492135 gc: don't translate literal commands
The command you type is still "git maintenance" even in other languages.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:43:10 -07:00
bbb0c357b8 Documentation: clean up various typos in technical docs
Used GNU "aspell check <filename>" to review various technical
documentation files with the default aspell dictionary. Ignored
false-positives between american and british english.

Signed-off-by: Jacob Stopak <jacob@initialcommit.io>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:28:36 -07:00
72991ff558 Documentation: clean up a few misspelled word typos
Used GNU "aspell check <filename>" to review various documentation
files with the default aspell dictionary. Ignored false-positives
between american and british english.

Signed-off-by: Jacob Stopak <jacob@initialcommit.io>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:28:35 -07:00
b62ad5681f midx.c: avoid cruft packs with non-zero repack --batch-size
Apply similar treatment with respect to cruft packs as in a few commits
ago to `repack` with a non-zero `--batch-size`.

Since the case of a non-zero `--batch-size` is handled separately (in
`fill_included_packs_batch()` instead of `fill_included_packs_all()`), a
separate fix must be applied for this case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:21:47 -07:00
0a8e561492 midx.c: remove unnecessary loop condition
The fill_included_packs_batch() routine is responsible for aggregating
objects in packs with a non-zero value for the `--batch-size` option of
the `git multi-pack-index repack` sub-command.

Since this routine is explicitly called only when `--batch-size` is
non-zero, there is no point in checking that this is the case in our
loop condition.

Remove the unnecessary part of this condition to avoid confusion.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:21:47 -07:00
cb6c48cbbc midx.c: replace xcalloc() with CALLOC_ARRAY()
Replace a direct invocation of Git's `xcalloc()` wrapper with the
`CALLOC_ARRAY()` macro instead.

The latter is preferred since it is more conventional in Git's codebase,
but also because it automatically picks the correct value for the record
size.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:21:46 -07:00
d9f7721450 midx.c: avoid cruft packs with repack --batch-size=0
The `repack` sub-command of the `git multi-pack-index` builtin creates a
new pack aggregating smaller packs contained in the MIDX up to some
given `--batch-size`.

When `--batch-size=0`, this instructs the MIDX builtin to repack
everything contained in the MIDX into a single pack.

In similar spirit as a previous commit, it is undesirable to repack the
contents of a cruft pack in this step. Teach `repack` to ignore any
cruft pack(s) when `--batch-size=0` for the same reason(s).

(The case of a non-zero `--batch-size` will be handled in a subsequent
commit).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:21:46 -07:00
757d457907 midx.c: prevent expire from removing the cruft pack
The `expire` sub-command unlinks any packs that are (a) contained in the
MIDX, but (b) have no objects referenced by the MIDX.

This sub-command ignores `.keep` packs, which remain on-disk even if
they have no objects referenced by the MIDX. Cruft packs, however,
aren't given the same treatment: if none of the objects contained in the
cruft pack are selected from the cruft pack by the MIDX, then the cruft
pack is eligible to be expired.

This is less than desireable, since the cruft pack has important
metadata about the individual object mtimes, which is useful to
determine how quickly an object should age out of the repository when
pruning.

Ordinarily, we wouldn't expect the contents of a cruft pack to
duplicated across non-cruft packs (and we'd expect to see the MIDX
select all cruft objects from other sources even less often). But
nonetheless, it is still possible to trick the `expire` sub-command into
removing the `.mtimes` file in this circumstance.

Teach the `expire` sub-command to ignore cruft packs in the same manner
as it does `.keep` packs, in order to keep their metadata around, even
when they are unreferenced by the MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:21:46 -07:00
2a91b35fce Documentation/git-multi-pack-index.txt: clarify expire behavior
The `expire` sub-command of `git multi-pack-index` will never expire
`.keep` packs, regardless of whether or not any of their objects were
selected in the MIDX.

This has always been the case since 19575c7c8e (multi-pack-index:
implement 'expire' subcommand, 2019-06-10), which came after cff9711616
(multi-pack-index: prepare for 'expire' subcommand, 2019-06-10), when
this documentation was originally written.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:21:46 -07:00
2699542824 Documentation/git-multi-pack-index.txt: fix typo
Remove the extra space character between "tracked" and "by", which dates
back to when this paragraph was originally written in cff9711616
(multi-pack-index: prepare for 'expire' subcommand, 2019-06-10).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 10:21:46 -07:00
82958c3c6d Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: po-id for 2.38 (round 1)
2022-09-21 08:14:37 +08:00
2e2f4dd1e6 Merge branch 'main' of github.com:git/git
* 'main' of github.com:git/git: (45 commits)
  A bit more of remaining topics before -rc1
  t1800: correct test to handle Cygwin
  chainlint: colorize problem annotations and test delimiters
  ls-files: fix black space in error message
  list-objects-filter: convert filter_spec to a strbuf
  list-objects-filter: add and use initializers
  list-objects-filter: handle null default filter spec
  list-objects-filter: don't memset after releasing filter struct
  builtin/mv.c: fix possible segfault in add_slash()
  Documentation/technical: include Scalar technical doc
  t/perf: add 'GIT_PERF_USE_SCALAR' run option
  t/perf: add Scalar performance tests
  scalar-clone: add test coverage
  scalar: add to 'git help -a' command list
  scalar: implement the `help` subcommand
  git help: special-case `scalar`
  scalar: include in standard Git build & installation
  scalar: fix command documentation section header
  t: retire unused chainlint.sed
  t/Makefile: teach `make test` and `make prove` to run chainlint.pl
  ...
2022-09-21 08:13:27 +08:00
03f47f2ac5 l10n: po-id for 2.38 (round 1)
Update following components:

  * add-patch.c
  * advice.c
  * builtin/add.c
  * builtin/am.c
  * builtin/clone.c
  * builtin/gc.c
  * builtin/help.c
  * builtin/ls-files.c
  * builtin/merge.c
  * diff.c
  * merge-ort.c
  * merge-tree.c
  * object-file.c
  * pack-bitmap.c
  * remote.c
  * revision.c
  * setup.c

Translate following new components:

  * builtin/bugreport.c
  * builtin/checkout--worker.c
  * builtin/checkout-index.c
  * builtin/commit-graph.c
  * builtin/fmt-merge-msg.c
  * builtin/for-each-ref.c
  * builtin/merge-file.c
  * builtin/merge-recursive.c
  * builtin/range-diff.c
  * bundle-uri.c
  * chunk-format.c
  * color.c
  * command-list.h
  * commit-graph.c
  * delta-islands.c
  * diagnose.c
  * diff-lib.c
  * diff-no-index.c
  * diffcore-order.c
  * diffcore-rename.c
  * diffcore-rotate.c
  * dir.c
  * editor.c
  * for-each-repo.c
  * parse-options-cb.c
  * parse-options.c
  * parse-options.h
  * path.c
  * pathspec.c
  * prune-packed.c
  * range-diff.c
  * ref-filter.c
  * ref-filter.h
  * remote-curl.c
  * replace-object.c
  * rerere.h
  * run-command.c
  * unpack-trees.c
  * usage.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2022-09-20 16:48:53 +07:00
9b1dc1c9d8 version: fix builtin linking & documentation
Like most builtins, 'version' is documented in a corresponding
'Documentation/git-version.txt' and can be invoked with 'git version'.
However, the 'check-docs' Makefile target showed that it was "removed but
documented: git-version." This was cause by the fact that it is not built as
a standalone 'git-version' executable, therefore appearing "removed" to
'check-docs'.

Without a precedent for documented builtins that aren't built into an
executable *or* any clear reason why a standalone 'git-version' shouldn't
exist, the 'check-docs' error appears to correctly identify an issue. To
correct that mismatch, add 'git-version' to the 'BUILT_INS' list in the root
Makefile (indicating that the 'cmd_version()' function appears in a file
that is *not* 'builtin/version.c'). Additionally, to avoid the "no link"
message in 'check-docs', list 'git-version' as an "ancilliaryinterrogator"
(like 'git help') in 'command-list.txt'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 17:28:25 -07:00
89c8048855 diagnose: add to command-list.txt
Add 'git diagnose' as an "ancilliaryinterrogator" (like 'git bugreport') to
'command-list.txt' in order to have it show up in 'git help -a' and avoid
the "no link" warning message from the 'check-docs' Makefile target.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 17:27:26 -07:00
e01b851923 Documentation: add ReviewingGuidelines
Add a reviewing guidelines document including advice and common terminology
used in Git mailing list reviews. The document is included in the
'TECH_DOCS' list in order to include it in Git's published documentation.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 14:36:08 -07:00
dda7228a83 A bit more of remaining topics before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 14:35:25 -07:00
279ebd4761 Merge branch 'ad/t1800-cygwin'
Test fix.

* ad/t1800-cygwin:
  t1800: correct test to handle Cygwin
2022-09-19 14:35:25 -07:00
42bf77c7d0 Merge branch 'vd/scalar-to-main'
Hoist the remainder of "scalar" out of contrib/ to the main part of
the codebase.

* vd/scalar-to-main:
  Documentation/technical: include Scalar technical doc
  t/perf: add 'GIT_PERF_USE_SCALAR' run option
  t/perf: add Scalar performance tests
  scalar-clone: add test coverage
  scalar: add to 'git help -a' command list
  scalar: implement the `help` subcommand
  git help: special-case `scalar`
  scalar: include in standard Git build & installation
  scalar: fix command documentation section header
2022-09-19 14:35:25 -07:00
9d58241ee4 Merge branch 'es/chainlint'
Revamp chainlint script for our tests.

* es/chainlint:
  chainlint: colorize problem annotations and test delimiters
  t: retire unused chainlint.sed
  t/Makefile: teach `make test` and `make prove` to run chainlint.pl
  test-lib: replace chainlint.sed with chainlint.pl
  test-lib: retire "lint harder" optimization hack
  t/chainlint: add more chainlint.pl self-tests
  chainlint.pl: allow `|| echo` to signal failure upstream of a pipe
  chainlint.pl: complain about loops lacking explicit failure handling
  chainlint.pl: don't flag broken &&-chain if failure indicated explicitly
  chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly
  chainlint.pl: don't require `&` background command to end with `&&`
  t/Makefile: apply chainlint.pl to existing self-tests
  chainlint.pl: don't require `return|exit|continue` to end with `&&`
  chainlint.pl: validate test scripts in parallel
  chainlint.pl: add parser to identify test definitions
  chainlint.pl: add parser to validate tests
  chainlint.pl: add POSIX shell parser
  chainlint.pl: add POSIX shell lexical analyzer
  t: add skeleton chainlint.pl
2022-09-19 14:35:24 -07:00
298a958224 Merge branch 'jk/list-objects-filter-cleanup'
A couple of bugfixes with code clean-up.

* jk/list-objects-filter-cleanup:
  list-objects-filter: convert filter_spec to a strbuf
  list-objects-filter: add and use initializers
  list-objects-filter: handle null default filter spec
  list-objects-filter: don't memset after releasing filter struct
2022-09-19 14:35:24 -07:00
f876b5a686 Merge branch 'zh/ls-files-format'
Typofix in the UI of a topic that has graduated to 'master'.

* zh/ls-files-format:
  ls-files: fix black space in error message
2022-09-19 14:35:24 -07:00
339517b035 Merge branch 'sy/mv-out-of-cone'
"git mv A B" in a sparsely populated working tree can be asked to
move a path from a directory that is "in cone" to another directory
that is "out of cone".  Handling of such a case has been improved.

* sy/mv-out-of-cone:
  builtin/mv.c: fix possible segfault in add_slash()
  mv: check overwrite for in-to-out move
  advice.h: add advise_on_moving_dirty_path()
  mv: cleanup empty WORKING_DIRECTORY
  mv: from in-cone to out-of-cone
  mv: remove BOTH from enum update_mode
  mv: check if <destination> is a SKIP_WORKTREE_DIR
  mv: free the with_slash in check_dir_in_index()
  mv: rename check_dir_in_index() to empty_dir_has_sparse_contents()
  t7002: add tests for moving from in-cone to out-of-cone
2022-09-19 14:35:23 -07:00
71e5473493 refs: unify parse_worktree_ref() and ref_type()
The logic to handle worktree refs (worktrees/NAME/REF and
main-worktree/REF) existed in two places:

* ref_type() in refs.c

* parse_worktree_ref() in worktree.c

Collapse this logic together in one function parse_worktree_ref():
this avoids having to cross-check the result of parse_worktree_ref()
and ref_type().

Introduce enum ref_worktree_type, which is slightly different from
enum ref_type. The latter is a misleading name (one would think that
'ref_type' would have the symref option).

Instead, enum ref_worktree_type only makes explicit how a refname
relates to a worktree. From this point of view, HEAD and
refs/bisect/abc are the same: they specify the current worktree
implicitly.

The files-backend must avoid packing refs/bisect/* and friends into
packed-refs, so expose is_per_worktree_ref() separately.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 11:11:11 -07:00
12f1ae5324 commit-graph: Fix missing closedir in expire_commit_graphs
The function calls opendir() but missing the corresponding
closedir() before exit the function.
Add missing closedir() to fix it.

Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 10:42:38 -07:00
cb98e1d50a diagnose.c: refactor to safely use 'd_type'
Refactor usage of the 'd_type' property of 'struct dirent' in 'diagnose.c'
to instead utilize the compatibility macro 'DTYPE()'. On systems where
'd_type' is not present in 'struct dirent', this macro will always return
'DT_UNKNOWN'. In that case, instead fall back on using the 'stat.st_mode' to
determine whether the dirent points to a dir, file, or link.

Additionally, add a test to 't0092-diagnose.sh' to verify that files (e.g.,
loose objects) are counted properly.

Note that the new function 'get_dtype()' is based on 'resolve_dtype()' in
'dir.c' (which itself was refactored from a prior 'get_dtype()' in
ad6f2157f9 (dir: restructure in a way to avoid passing around a struct
dirent, 2020-01-16)), but differs in that it is meant for use on arbitrary
files, such as those inside the '.git' dir. Because of this, it does not
search the index for a matching entry to derive the 'd_type'.

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 10:25:01 -07:00
6713bfc70c fuzz: reorganise the path for existing oss-fuzz fuzzers
In order to provide a better organisation for oss-fuzz fuzzers and
to avoid top-level clustters in the git repository when more fuzzers
are introduced, move the existing fuzzer-related sources to their
own oss-fuzz/ hierarchy.  Grouping the fuzzers into their own
directory, separate their application on fuzz-testing from the core
functionalities of the git code, prvides better and tidier structure
the oss-fuzz fuzzing library to manage, locate, build and execute
those fuzzers for fuzz-testing purposes in future development.

Signed-off-by: Arthur Chan <arthur.chan@adalogics.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 09:34:35 -07:00
a275db6dec Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5482t0f0u)
2022-09-19 10:50:10 +08:00
c1eb12601c l10n: bg.po: Updated Bulgarian translation (5482t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2022-09-18 21:34:22 +02:00
ef926c6f53 l10n: sv.po: Update Swedish translation (5482t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2022-09-16 22:03:08 +01:00
365891d6a3 l10n: tr: Update translations for v2.38.0 round #1
Signed-off-by: Emir SARI <emir_sari@icloud.com>
2022-09-16 22:26:06 +03:00
819fb68222 environ: GIT_INDEX_VERSION affects not just a new repository
The variable is consulted whenever we write the index file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 09:45:22 -07:00
b724df6b55 environ: simplify description of GIT_INDEX_FILE
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 09:45:21 -07:00
c34a6bd291 diff-merges: clarify log.diffMerges documentation
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 09:21:44 -07:00
563005ecbf diff-merges: cleanup set_diff_merges()
Get rid of special-casing of 'suppress' in set_diff_merges(). Instead
set 'merges_need_diff' flag correctly in every option handling
function.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 09:21:43 -07:00
c7c4f7608a diff-merges: cleanup func_by_opt()
Get rid of unneeded "else" statements in func_by_opt().

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 09:21:40 -07:00
225e815ef2 help: fix doubled words in explanation for developer interfaces
Signed-off-by: Fangyi Zhou <me@fangyi.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 09:20:11 -07:00
4945f046c7 api docs: link to html version of api-trace2
In f6d25d7878 (api docs: document that BUG() emits a trace2 error event,
2021-04-13), a link to the plain text version of api-trace2 was added in
`technical/api-error-handling.txt`.

All of our other `link:`s point to the html versions.  Do the same here.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 08:39:11 -07:00
086eaab8da docs: fix a few recently broken links
Some links were broken in the recent move of various technical docs
c0f6dd49f1 (Merge branch 'ab/tech-docs-to-help', 2022-08-14).

Fix them.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 08:38:03 -07:00
d3fa443f97 Git 2.38-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15 16:09:47 -07:00
ca20a44bc5 Merge branch 'jk/proto-v2-ref-prefix-fix'
"git fetch" over protocol v2 sent an incorrect ref prefix request
to the server and made "git pull" with configured fetch refspec
that does not cover the remote branch to merge with fail, which has
been corrected.

* jk/proto-v2-ref-prefix-fix:
  fetch: add branch.*.merge to default ref-prefix extension
  fetch: stop checking for NULL transport->remote in do_fetch()
2022-09-15 16:09:47 -07:00
b7f39a3fe6 Merge branch 'rs/add-p-worktree-mode-prompt-fix'
Fix another UI regression in the reimplemented "add -p".

* rs/add-p-worktree-mode-prompt-fix:
  add -p: fix worktree patch mode prompts
2022-09-15 16:09:46 -07:00
5ff02db75b Merge branch 'js/typofix'
Typofix.

* js/typofix:
  Documentation: fix various repeat word typos
2022-09-15 16:09:46 -07:00
d878d83ff0 Merge branch 'en/remerge-diff-fixes'
Fix a few "git log --remerge-diff" bugs.

* en/remerge-diff-fixes:
  diff: fix filtering of merge commits under --remerge-diff
  diff: fix filtering of additional headers under --remerge-diff
  diff: have submodule_format logic avoid additional diff headers
2022-09-15 16:09:46 -07:00
fd01795beb environ: GIT_FLUSH should be made a usual Boolean
This uses atoi() and checks if the result is not zero to decide what
to do.  Turning it into the usual Boolean environment variable to
use git_env_bool() would not break those who have been using "set to
0, or set to non-zero, that can be parsed with atoi()" values, but
will match the expectation of those who expected "true" to mean
"yes".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15 11:34:51 -07:00
80f0b3f397 environ: explain Boolean environment variables
Many environment variables use the git_env_bool() API to parse their
values, and allow the usual "true/yes/on are true, false/no/off are
false. In addition non-zero numbers are true and zero is false.  An
empty string is also false." set of values.

Mark them as such, and consistently say "true" or "false", instead
of random mixes of '1', '0', 'yes', 'true', etc. in their
description.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15 11:34:51 -07:00
29491ca5fd environ: document GIT_SSL_NO_VERIFY
Even though the name of the environment variable is mentioned in
"git config --help" from http.sslVerify, there is no description for
it.  Add one.

Note that this is not a usual Boolean environment variable whose
value can be yes/true/on vs no/false/off; the existence of it is
enough to trigger the feature named by the variable.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15 11:34:51 -07:00
c18eecbe5c reftable: use a pointer for pq_entry param
The speed of the merged_iter_pqueue_add() can be improved by using a
pointer to the pq_entry struct, which is 96 bytes. Since the pq_entry
param is worked directly on the stack and does not currently have a
pointer to it, the merged_iter_pqueue_add() function is slightly
slower.

References to pq_entry in reftable have typically included pointers,
such as both of the params for pq_less().

Since we are working with pointers in the pq_entry param, as keenly
pointed out, the pq_entry param has also been made into a const since
the contents of the pq_entry param are copied and not manipulated.

Signed-off-by: Elijah Conners <business@elijahpepe.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15 11:32:37 -07:00
255a6f91ae t1800: correct test to handle Cygwin
On Cygwin, when failing to spawn a process using start_command, Git
outputs the same error as on Linux systems, rather than using the
GIT_WINDOWS_NATIVE-specific error output.  The WINDOWS test prerequisite
is set in both Cygwin and native Windows environments, which means it's
not appropriate to use to anticipate the error output from
start_command.  Instead, use the MINGW test prerequisite, which is only
set for Git in native Windows environments, and not for Cygwin.

Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15 10:29:51 -07:00
12fc4ad89e diff.c: use utf8_strwidth() to count display width
When unicode filenames (encoded in UTF-8) are used, the visible width
on the screen is not the same as strlen().

For example, `git log --stat` may produce an output like this:

[snip the header]

 Arger.txt  | 1 +
 Ärger.txt | 1 +
 2 files changed, 2 insertions(+)

A side note: the original report was about cyrillic filenames.
After some investigations it turned out that
a) This is not a problem with "ambiguous characters" in unicode
b) The same problem exists for all unicode code points (so we
  can use Latin based Umlauts for demonstrations below)

The 'Ä' takes the same space on the screen as the 'A'.
But needs one more byte in memory, so the the `git log --stat` output
for "Arger.txt" (!) gets mis-aligned:
The maximum length is derived from "Ärger.txt", 10 bytes in memory,
9 positions on the screen. That is why "Arger.txt" gets one extra ' '
for aligment, it needs 9 bytes in memory.
If there was a file "Ö", it would be correctly aligned by chance,
but "Öhö" would not.

The solution is of course, to use utf8_strwidth() instead of strlen()
when dealing with the width on screen.

And then there is another problem, code like this:
strbuf_addf(&out, "%-*s", len, name);
(or using the underlying snprintf() function) does not align the
buffer to a minimum of len measured in screen-width, but uses the
memory count.

One could be tempted to wish that snprintf() was UTF-8 aware.
That doesn't seem to be the case anywhere (tested on Linux and Mac),
probably snprintf() uses the "bytes in memory"/strlen() approach to be
compatible with older versions and this will never change.

The basic idea is to change code in diff.c like this
strbuf_addf(&out, "%-*s", len, name);

into something like this:
int padding = len - utf8_strwidth(name);
if (padding < 0)
	padding = 0;
strbuf_addf(&out, " %s%*s", name, padding, "");

The real change is slighty bigger, as it, as well, integrates two calls
of strbuf_addf() into one.

Tests:
Two things need to be tested:
 - The calculation of the maximum width
 - The calculation of padding

The name "textfile" is changed into "tëxtfilë", both have a width of 8.
If strlen() was used, to get the maximum width, the shorter "binfile" would
have been mis-aligned:
 binfile    | [snip]
 tëxtfilë | [snip]

If only "binfile" would be renamed into "binfilë":
 binfilë | [snip]
 textfile | [snip]

In order to verify that the width is calculated correctly everywhere,
"binfile" is renamed into "binfilë", giving 1 bytes more in strlen()
"tëxtfile" is renamed into "tëxtfilë", 2 byte more in strlen().

The updated t4012-diff-binary.sh checks the correct aligment:
 binfilë  | [snip]
 tëxtfilë | [snip]

Reported-by: Alexander Meshcheryakov <alexander.s.m@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-14 13:48:18 -07:00
36f8e7ed7d Prepare for 2.38-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-14 12:56:41 -07:00
08d61c7061 Merge branch 'jk/plug-list-object-filter-leaks'
The code that manages list-object-filter structure, used in partial
clones, leaked the instances, which has been plugged.

* jk/plug-list-object-filter-leaks:
  prepare_repo_settings(): plug leak of config values
  list_objects_filter_options: plug leak of filter_spec strings
  transport: free filter options in disconnect_git()
  transport: deep-copy object-filter struct for fetch-pack
  list_objects_filter_copy(): deep-copy sparse_oid_name field
2022-09-14 12:56:40 -07:00
b563638d2c Merge branch 'ab/submodule-helper-leakfix'
Plugging leaks in submodule--helper.

* ab/submodule-helper-leakfix:
  submodule--helper: fix a configure_added_submodule() leak
  submodule--helper: free rest of "displaypath" in "struct update_data"
  submodule--helper: free some "displaypath" in "struct update_data"
  submodule--helper: fix a memory leak in print_status()
  submodule--helper: fix a leak in module_add()
  submodule--helper: fix obscure leak in module_add()
  submodule--helper: fix "reference" leak
  submodule--helper: fix a memory leak in get_default_remote_submodule()
  submodule--helper: fix a leak with repo_clear()
  submodule--helper: fix "sm_path" and other "module_cb_list" leaks
  submodule--helper: fix "errmsg_str" memory leak
  submodule--helper: add and use *_release() functions
  submodule--helper: don't leak {run,capture}_command() cp.dir argument
  submodule--helper: "struct pathspec" memory leak in module_update()
  submodule--helper: fix most "struct pathspec" memory leaks
  submodule--helper: fix trivial get_default_remote_submodule() leak
  submodule--helper: fix a leak in "clone_submodule"
2022-09-14 12:56:40 -07:00
7a54d74045 Merge branch 'ab/dedup-config-and-command-docs'
Share the text used to explain configuration variables used by "git
<subcmd>" in "git help <subcmd>" with the text from "git help config".

* ab/dedup-config-and-command-docs:
  docs: add CONFIGURATION sections that fuzzy map to built-ins
  docs: add CONFIGURATION sections that map to a built-in
  log docs: de-duplicate configuration sections
  difftool docs: de-duplicate configuration sections
  notes docs: de-duplicate and combine configuration sections
  apply docs: de-duplicate configuration sections
  send-email docs: de-duplicate configuration sections
  grep docs: de-duplicate configuration sections
  docs: add and use include template for config/* includes
2022-09-14 12:56:40 -07:00
dd407f1c7c Merge branch 'ab/unused-annotation'
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.

* ab/unused-annotation:
  git-compat-util.h: use "deprecated" for UNUSED variables
  git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14 12:56:39 -07:00
a6b42ec0c6 Merge branch 'jk/unused-annotation'
Annotate function parameters that are not used (but cannot be
removed for structural reasons), to prepare us to later compile
with -Wunused warning turned on.

* jk/unused-annotation:
  is_path_owned_by_current_uid(): mark "report" parameter as unused
  run-command: mark unused async callback parameters
  mark unused read_tree_recursive() callback parameters
  hashmap: mark unused callback parameters
  config: mark unused callback parameters
  streaming: mark unused virtual method parameters
  transport: mark bundle transport_options as unused
  refs: mark unused virtual method parameters
  refs: mark unused reflog callback parameters
  refs: mark unused each_ref_fn parameters
  git-compat-util: add UNUSED macro
2022-09-14 12:56:39 -07:00
f6f0ee247f add -p: fix worktree patch mode prompts
cee6cb7300 (built-in add -p: implement the "worktree" patch modes,
2019-12-21) added the worktree patch modes to the built-in add -p.
Its commit message claims to be a port of 2f0896ec3a (restore:
support --patch, 2019-04-25), which did the same for the script
git-add--interactive.perl.

The script mentioned only the worktree in its prompt messages in
worktree mode, while the built-in mentions the worktree and also the
index, even though the command doesn't actually affect the index.

2c8bd8471a (checkout -p: handle new files correctly, 2020-05-27)
added new prompt messages for addition that also mention the index in
worktree mode in the built-in, but not in the script.

Correct these prompts to state that only the worktree will be affected.

Reported-by: David Plumpton <david.plumpton@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-14 11:14:38 -07:00
e188ec3a73 Sync with 'maint' 2022-09-13 12:23:48 -07:00
a0feb8611d Merge a handful of topics from the 'master' front
As the 'master' front will soon tag a preview and then release
candidates for 2.38, it is unknown if we are going to issue another
maintenance release on the 2.37.x track, but as we have accumulated
enough material there, let's prepare a draft for it.

Even if we end up not tagging 2.37.4, it would help motivated distro
packagers to maintain their slightly older and "more stable" versions.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-13 12:22:59 -07:00
2c75b3255b Merge branch 'en/merge-unstash-only-on-clean-merge' into maint
The auto-stashed local changes created by "git merge --autostash"
was mixed into a conflicted state left in the working tree, which
has been corrected.

* en/merge-unstash-only-on-clean-merge:
  merge: only apply autostash when appropriate
2022-09-13 12:21:11 -07:00
4f06dfde7a Merge branch 'ds/github-actions-use-newer-ubuntu' into maint
Update the version of Ubuntu used for GitHub Actions CI from 18.04
to 22.04.

* ds/github-actions-use-newer-ubuntu:
  ci: update 'static-analysis' to Ubuntu 22.04
2022-09-13 12:21:10 -07:00
37317ab40b Merge branch 'ad/preload-plug-memleak' into maint
The preload-index codepath made copies of pathspec to give to
multiple threads, which were left leaked.

* ad/preload-plug-memleak:
  preload-index: fix memleak
2022-09-13 12:21:10 -07:00
c61614e30f Merge branch 'sg/xcalloc-cocci-fix' into maint
xcalloc(), imitating calloc(), takes "number of elements of the
array", and "size of a single element", in this order.  A call that
does not follow this ordering has been corrected.

* sg/xcalloc-cocci-fix:
  promisor-remote: fix xcalloc() argument order
2022-09-13 12:21:09 -07:00
aa31cb8974 Merge branch 'jk/pipe-command-nonblock' into maint
Fix deadlocks between main Git process and subprocess spawned via
the pipe_command() API, that can kill "git add -p" that was
reimplemented in C recently.

* jk/pipe-command-nonblock:
  pipe_command(): mark stdin descriptor as non-blocking
  pipe_command(): handle ENOSPC when writing to a pipe
  pipe_command(): avoid xwrite() for writing to pipe
  git-compat-util: make MAX_IO_SIZE define globally available
  nonblock: support Windows
  compat: add function to enable nonblocking pipes
2022-09-13 12:21:08 -07:00
72869e750b Merge branch 'jk/is-promisor-object-keep-tree-in-use' into maint
An earlier optimization discarded a tree-object buffer that is
still in use, which has been corrected.

* jk/is-promisor-object-keep-tree-in-use:
  is_promisor_object(): fix use-after-free of tree buffer
2022-09-13 12:21:07 -07:00
21dd13e025 The twentieth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-13 11:38:25 -07:00
de1fee2f1e Merge branch 'ow/rev-parse-parseopt-fix'
The parser in the script interface to parse-options in "git
rev-parse" has been updated to diagnose a bogus input correctly.

* ow/rev-parse-parseopt-fix:
  rev-parse --parseopt: detect missing opt-spec
2022-09-13 11:38:25 -07:00
e4ffba458f Merge branch 'js/builtin-add-p-portability-fix'
More fixes to "add -p"

* js/builtin-add-p-portability-fix:
  t6132(NO_PERL): do not run the scripted `add -p`
  t3701: test the built-in `add -i` regardless of NO_PERL
  add -p: avoid ambiguous signed/unsigned comparison
2022-09-13 11:38:24 -07:00
76ffa818c7 Merge branch 'sg/parse-options-subcommand'
The codepath for the OPT_SUBCOMMAND facility has been cleaned up.

* sg/parse-options-subcommand:
  notes, remote: show unknown subcommands between `'
  notes: simplify default operation mode arguments check
  test-parse-options.c: fix style of comparison with zero
  test-parse-options.c: don't use for loop initial declaration
  t0040-parse-options: remove leftover debugging
2022-09-13 11:38:24 -07:00
655e494047 Merge branch 'jk/rev-list-verify-objects-fix'
"git rev-list --verify-objects" ought to inspect the contents of
objects and notice corrupted ones, but it didn't when the commit
graph is in use, which has been corrected.

* jk/rev-list-verify-objects-fix:
  rev-list: disable commit graph with --verify-objects
  lookup_commit_in_graph(): use prepare_commit_graph() to check for graph
2022-09-13 11:38:24 -07:00
8b2f027e20 Merge branch 'jk/upload-pack-skip-hash-check'
The server side that responds to "git fetch" and "git clone"
request has been optimized by allowing it to send objects in its
object store without recomputing and validating the object names.

* jk/upload-pack-skip-hash-check:
  t1060: check partial clone of misnamed blob
  parse_object(): check commit-graph when skip_hash set
  upload-pack: skip parse-object re-hashing of "want" objects
  parse_object(): allow skipping hash check
2022-09-13 11:38:23 -07:00
e0574c4fd1 Merge branch 'rs/diff-no-index-cleanup'
"git diff --no-index A B" managed its the pathnames of its two
input files rather haphazardly, sometimes leaking them.  The
command line argument processing has been straightened out to clean
it up.

* rs/diff-no-index-cleanup:
  diff-no-index: simplify argv index calculation
  diff-no-index: release prefixed filenames
  diff-no-index: release strbuf on queue error
2022-09-13 11:38:23 -07:00
f322e9f51b Merge branch 'ab/submodule-helper-prep'
Code clean-up of "git submodule--helper".

* ab/submodule-helper-prep: (33 commits)
  submodule--helper: fix bad config API usage
  submodule--helper: libify even more "die" paths for module_update()
  submodule--helper: libify more "die" paths for module_update()
  submodule--helper: check repo{_submodule,}_init() return values
  submodule--helper: libify "must_die_on_failure" code paths (for die)
  submodule--helper update: don't override 'checkout' exit code
  submodule--helper: libify "must_die_on_failure" code paths
  submodule--helper: libify determine_submodule_update_strategy()
  submodule--helper: don't exit() on failure, return
  submodule--helper: use "code" in run_update_command()
  submodule API: don't handle SM_..{UNSPECIFIED,COMMAND} in to_string()
  submodule--helper: don't call submodule_strategy_to_string() in BUG()
  submodule--helper: add missing braces to "else" arm
  submodule--helper: return "ret", not "1" from update_submodule()
  submodule--helper: rename "int res" to "int ret"
  submodule--helper: don't redundantly check "else if (res)"
  submodule--helper: refactor "errmsg_str" to be a "struct strbuf"
  submodule--helper: add "const" to passed "struct update_data"
  submodule--helper: add "const" to copy of "update_data"
  submodule--helper: add "const" to passed "module_clone_data"
  ...
2022-09-13 11:38:23 -07:00
0479138645 Merge branch 'ed/fsmonitor-on-network-disk'
The built-in fsmonitor refuses to work on a network mounted
repositories; a configuration knob for users to override this has
been introduced.

* ed/fsmonitor-on-network-disk:
  fsmonitor: option to allow fsmonitor to run against network-mounted repos
2022-09-13 11:38:23 -07:00
7c04aa7390 chainlint: colorize problem annotations and test delimiters
When `chainlint.pl` detects problems in a test definition, it emits the
test definition with "?!FOO?!" annotations highlighting the problems it
discovered. For instance, given this problematic test:

    test_expect_success 'discombobulate frobnitz' '
        git frob babble &&
        (echo balderdash; echo gnabgib) >expect &&
        for i in three two one
        do
            git nitfol $i
        done >actual
        test_cmp expect actual
    '

chainlint.pl will output:

    # chainlint: t1234-confusing.sh
    # chainlint: discombobulate frobnitz
    git frob babble &&
    (echo balderdash ; ?!AMP?! echo gnabgib) >expect &&
    for i in three two one
    do
    git nitfol $i ?!LOOP?!
    done >actual ?!AMP?!
    test_cmp expect actual

in which it may be difficult to spot the "?!FOO?!" annotations. The
problem is compounded when multiple tests, possibly in multiple
scripts, fail "linting", in which case it may be difficult to spot the
"# chainlint:" lines which delimit one problematic test from another.

To ameliorate this potential problem, colorize the "?!FOO?!" annotations
in order to quickly draw the test author's attention to the problem
spots, and colorize the "# chainlint:" lines to help the author identify
the name of each script and each problematic test.

Colorization is disabled automatically if output is not directed to a
terminal or if NO_COLOR environment variable is set. The implementation
is specific to Unix (it employs `tput` if available) but works equally
well in the Git for Windows development environment which emulates Unix
sufficiently.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 21:33:58 -07:00
c9dba103dd Documentation: fix various repeat word typos
Inspired by 24966cd982 ("doc: fix repeated words", 08-09-2019),
I ran "egrep -R "\<([a-zA-Z]+)\> \<\1\>" ./Documentation/*" to
find current cases of repeated words such as "the the" that were
quite clearly typos.

There were many false positives reported, such as "really really"
or valid uses of "that that" which I left alone.

Signed-off-by: Jacob Stopak <jacob@initialcommit.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 11:04:55 -07:00
746aae3dd1 ls-files: fix black space in error message
ce74de9(ls-files: introduce "--format" option) miss
a space between two words incorrectly, it leads to
wrong i10n messages. So fix it by adding a space at
the end of the error message.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 09:25:40 -07:00
c54980ab83 list-objects-filter: convert filter_spec to a strbuf
Originally, the filter_spec field was just a string pointer. In
cf9ceb5a12 (list-objects-filter-options: make filter_spec a string_list,
2019-06-27) it became a string_list, but that commit notes:

    A strbuf would seem to be a more natural choice for this object, but
    it unfortunately requires initialization besides just zero'ing out
    the memory.  This results in all container structs, and all
    containers of those structs, etc., to also require initialization.
    Initializing them all would be more cumbersome that simply using a
    string_list, which behaves properly when its contents are zero'd.

Now that we've changed the struct to require non-zero initialization
anyway (ironically, because string_list also needed non-zero
initialization to avoid leaks), we can now convert to that more natural
type.

This makes the list_objects_filter_spec() function much less awkward, as
it had to collapse the string_list to a single-entry list on the fly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 08:38:59 -07:00
2a01bdedf8 list-objects-filter: add and use initializers
In 7e2619d8ff (list_objects_filter_options: plug leak of filter_spec
strings, 2022-09-08), we noted that the filter_spec string_list was
inconsistent in how it handled memory ownership of strings stored in the
list. The fix there was a bit of a band-aid to set the "strdup_strings"
variable right before adding anything.

That works OK, and it lets the users of the API continue to
zero-initialize the struct. But it makes the code a bit hard to follow
and accident-prone, as any other spots appending the filter_spec need to
think about whether to set the strdup_strings value, too (there's one
such spot in partial_clone_get_default_filter_spec(), which is probably
a possible memory leak).

So let's do that full cleanup now. We'll introduce a
LIST_OBJECTS_FILTER_INIT macro and matching function, and use them as
appropriate (though it is for the "_options" struct, this matches the
corresponding list_objects_filter_release() function).

This is harder than it seems! Many other structs, like
git_transport_data, embed the filter struct. So they need to initialize
it themselves even if the rest of the enclosing struct is OK with
zero-initialization. I found all of the relevant spots by grepping
manually for declarations of list_objects_filter_options. And then doing
so recursively for structs which embed it, and ones which embed those,
and so on.

I'm pretty sure I got everything, but there's no change that would alert
the compiler if any topics in flight added new declarations. To catch
this case, we now double-check in the parsing function that things were
initialized as expected and BUG() if appropriate.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 08:38:59 -07:00
aff4bfcf0a list-objects-filter: handle null default filter spec
When we have a remote.*.promisor config variable, we know that we're in
a partial clone. Usually there's a matching remote.*.partialclonefilter
option, which tells us which filter to use with the remote. If that
option is missing, we skip setting up the filter at all. But something
funny happens: we stick a NULL entry into the string_list storing the
text filter spec.

This is a weird state, and could possibly segfault if anybody called
called list_objects_filter_spec(), etc. In practice, nobody does,
because filter->choice will still be LOFC_DISABLED, so code generally
realizes there's no filter to use. And the string_list itself is OK,
because it starts in non-dup mode until we actually parse a filter spec.
So it blindly stores the NULL without even looking at it.

But it's probably worth avoiding this confused state. It's an accident
waiting to happen, and it will be a problem if we replace the lazy
initialization from 7e2619d8ff (list_objects_filter_options: plug leak
of filter_spec strings, 2022-09-08) with a real initialization function.

The history is a little interesting here, as the bug was introduced
during the merge resolution in 627b826834 (Merge branch
'md/list-objects-filter-combo', 2019-09-18).

The original logic comes from cac1137dc4 (list-objects: check if filter
is NULL before using, 2018-06-11), where we had a single string via
core.partialCloneFilter, and a simple NULL check was sufficient. And it
even added a test in t0410 that covers this situation.

Later, that was expanded to allow per-remote filters in fa3d1b63e8
(promisor-remote: parse remote.*.partialclonefilter, 2019-06-25). After
that commit, we get a promisor struct with a partial_clone_filter
string, which could be NULL. The commit checks only that the struct
pointer is non-NULL, which is enough. It may pass NULL to
gently_parse_list_objects_filter(), but that function is smart enough to
consider it a noop.

But in parallel, cf9ceb5a12 (list-objects-filter-options: make
filter_spec a string_list, 2019-06-27) added a new line of code: before
we call gently_parse_list_objets_filter(), we append the filter spec to
the string_list. By itself that was OK, since we'd have returned early
if the string was NULL.

When the two were merged in 627b826834, the result is that we return
early only if the struct is NULL, but not the string. And we append to
the string_list, meaning we may append NULL.

The solution is to return early if either is NULL, as it would mean we
don't have a configured filter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 08:38:59 -07:00
e40d906449 list-objects-filter: don't memset after releasing filter struct
If we see an error while parsing a "combine" filter, we call
list_objects_filter_release() to free any allocated memory,
and then use memset() to return the struct to a known state. But the
release function already does that reinitializing. Doing it again is
pointless.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 08:38:58 -07:00
7522bb9bc9 Merge branch 'jk/plug-list-object-filter-leaks' into jk/list-objects-filter-cleanup
* jk/plug-list-object-filter-leaks:
  prepare_repo_settings(): plug leak of config values
  list_objects_filter_options: plug leak of filter_spec strings
  transport: free filter options in disconnect_git()
  transport: deep-copy object-filter struct for fetch-pack
  list_objects_filter_copy(): deep-copy sparse_oid_name field
2022-09-12 08:38:47 -07:00
7ead46810b builtin/mv.c: fix possible segfault in add_slash()
A possible segfault was introduced in c08830de41 (mv: check if
<destination> is a SKIP_WORKTREE_DIR, 2022-08-09).

When running t7001 with SANITIZE=address, problem appears when running:

	git mv path1/path2/ .
or
	git mv directory ../
or
	any <destination> that makes dest_path[0] an empty string.

The add_slash() call could segfault when path argument to it is an empty
string, because it makes an out-of-bounds read to decide if an extra
slash '/' needs to be appended to it.

As add_slash() is used to make sure that a valid pathname to a file in
the given directory can be made by appending a filename after the value
returned from it, if path is an empty string, we want to return it
as-is.  The path to a file "F" in the top-level of the working tree
(i.e. path=="") is formed by appending "F" after "" (i.e. path) without
any slash in between.

So, just like the case where a non-empty path already ends with a slash,
return an empty path as-is.

Reported-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-09 15:49:53 -07:00
dd3f6c4cae The nineteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-09 12:02:26 -07:00
fe3939bc2a Merge branch 'vd/sparse-reset-checkout-fixes'
Segfault fix-up to an earlier fix to the topic to teach "git reset"
and "git checkout" work better in a sparse checkout.

* vd/sparse-reset-checkout-fixes:
  unpack-trees: fix sparse directory recursion check
2022-09-09 12:02:26 -07:00
fd1ec82547 Merge branch 'ab/retire-ppc-sha1'
Remove the assembly version of SHA-1 implementation for PPC.

* ab/retire-ppc-sha1:
  Makefile: use $(OBJECTS) instead of $(C_OBJ)
  Makefile + hash.h: remove PPC_SHA1 implementation
2022-09-09 12:02:25 -07:00
00b0199c51 Merge branch 'cc/doc-trailer-whitespace-rules'
Doc update.

* cc/doc-trailer-whitespace-rules:
  Documentation: clarify whitespace rules for trailers
2022-09-09 12:02:25 -07:00
0e2a4764ed Merge branch 'jc/format-patch-force-in-body-from'
"git format-patch --from=<ident>" can be told to add an in-body
"From:" line even for commits that are authored by the given
<ident> with "--force-in-body-from"option.

* jc/format-patch-force-in-body-from:
  format-patch: learn format.forceInBodyFrom configuration variable
  format-patch: allow forcing the use of in-body From: header
  pretty: separate out the logic to decide the use of in-body from
2022-09-09 12:02:25 -07:00
428dce9f4d Merge branch 'js/range-diff-with-pathspec'
Allow passing a pathspec to "git range-diff".

* js/range-diff-with-pathspec:
  range-diff: optionally accept pathspecs
  range-diff: consistently validate the arguments
  range-diff: reorder argument handling
2022-09-09 12:02:25 -07:00
526c4906f8 Merge branch 'jk/tempfile-active-flag-cleanup'
Code clean-up.

* jk/tempfile-active-flag-cleanup:
  tempfile: update comment describing state transitions
  tempfile: drop active flag
2022-09-09 12:02:24 -07:00
fb094cb583 Merge branch 'js/add-p-diff-parsing-fix'
Those who use diff-so-fancy as the diff-filter noticed a regression
or two in the code that parses the diff output in the built-in
version of "add -p", which has been corrected.

* js/add-p-diff-parsing-fix:
  add -p: ignore dirty submodules
  add -p: gracefully handle unparseable hunk headers in colored diffs
  add -p: detect more mismatches between plain vs colored diffs
2022-09-09 12:02:24 -07:00
f20b9c36d0 rev-parse --parseopt: detect missing opt-spec
After 2d893dff4c (rev-parse --parseopt: allow [*=?!] in argument hints,
2015-07-14) updated the parser, a line in parseopts's input can start
with one of the flag characters and be erroneously parsed as a opt-spec
where the short name of the option is the flag character itself and the
long name is after the end of the string. This makes Git want to
allocate SIZE_MAX bytes of memory at this line:

    o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2);

Since s and sb.buf are equal the second argument is -2 (except unsigned)
and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX.

Avoid this by checking whether a flag character was found in the zeroth
position.

Reported-by: Ingy dot Net <ingy@ingy.net>
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 14:55:07 -07:00
49ca2fba39 fetch: add branch.*.merge to default ref-prefix extension
When running "git pull" with no arguments, we'll do a default "git
fetch" and then try to merge the branch specified by the branch.*.merge
config. There's code in get_ref_map() to treat that "merge" branch as
something we want to fetch, even if it is not otherwise covered by the
default refspec.

This works fine with the v0 protocol, as the server tells us about all
of the refs, and get_ref_map() is the ultimate decider of what we fetch.

But in the v2 protocol, we send the ref-prefix extension to the server,
asking it to limit the ref advertisement. And we only tell it about the
default refspec for the remote; we don't mention the branch.*.merge
config at all.

This usually doesn't matter, because the default refspec matches
"refs/heads/*", which covers all branches. But if you explicitly use a
narrow refspec, then "git pull" on some branches may fail. The server
doesn't advertise the branch, so we don't fetch it, and "git pull"
thinks that it went away upstream.

We can fix this by including any branch.*.merge entries for the current
branch in the list of ref-prefixes we pass to the server. This only
needs to happen when using the default configured refspec (since
command-line refspecs are already added, and take precedence in deciding
what we fetch). We don't otherwise need to replicate any of the "what to
fetch" logic in get_ref_map(). These ref-prefixes are an optimization,
so it's OK if we tell the server to advertise the branch.*.merge ref,
even if we're not going to pull it. We'll just choose not to fetch it.

The test here is based on one constructed by Johannes. I modified the
branch names to trigger the ref-prefix issue (and be more descriptive),
and to confirm that "git pull" actually updated the local ref, which
should be more robust than just checking stderr.

Reported-by: Lana Deere <lana.deere@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 13:10:37 -07:00
080bc4990f fetch: stop checking for NULL transport->remote in do_fetch()
This field will never be NULL; if it were, we'd segfault earlier in the
function when we unconditionally check transport->remote->fetch_tags.
Likewise, many other functions dereference it unconditionally.

This is a small simplification, but it will make things easier as we
extend this conditional in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 13:10:32 -07:00
66eede4a37 prepare_repo_settings(): plug leak of config values
We call repo_config_get_string() for fetch.negotiationAlgorithm, which
allocates a copy of the string, but we never free it.

We could add a call to free(), but there's an even simpler solution: we
don't need the string to persist beyond a few strcasecmp() calls, so we
can instead use the "_tmp" variant which gives us a const pointer to the
cached value.

We need to switch the type of "strval" to "const char *" for this to
work, which affects a similar call that checks core.untrackedCache. But
it's in the same boat! It doesn't actually need the value to persist
beyond a maybe_bool() check (though it does remember to correctly free
the string afterwards). So we can simplify it at the same time.

Note that this core.untrackedCache check arguably should be using
repo_config_get_maybe_bool(), but there are some subtle behavior
changes. E.g., it doesn't currently allow a value-less "true". Arguably
it should, but let's avoid lumping further changes in what should be a
simple leak cleanup.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 11:11:28 -07:00
7e2619d8ff list_objects_filter_options: plug leak of filter_spec strings
The list_objects_filter_options struct contains a string_list to store
the filter_spec. Because we allow the options struct to be
zero-initialized by callers, the string_list's strdup_strings field is
generally not set.

Because we don't want to depend on the memory lifetimes of any strings
passed in to the list-objects API, everything we add to the string_list
is duplicated (either via xstrdup(), or via strbuf_detach()). So far so
good, but now we have a problem at cleanup time: when we clear the
list, the string_list API doesn't realize that it needs to free all of
those strings, and we leak them.

One option would be to set strdup_strings right before clearing the
list. But this is tricky for two reasons:

  1. There's one spot, in partial_clone_get_default_filter_spec(),
     that fails to duplicate its argument. We could fix that, but...

  2. We clear the list in a surprising number of places. As you might
     expect, we do so in list_objects_filter_release(). But we also
     clear and rewrite it in expand_list_objects_filter_spec(),
     list_objects_filter_spec(), and transform_to_combine_type().
     We'd have to put the same hack in all of those spots.

Instead, let's just set strdup_strings before adding anything. That lets
us drop the extra manual xstrdup() calls, fixes the spot mentioned
in (1) above that _should_ be duplicating, and future-proofs further
calls. We do have to switch the strbuf_detach() calls to use the nodup
form, but that's an easy change, and the resulting code more clearly
shows the expected ownership transfer.

This also resolves a weird inconsistency: when we make a deep copy with
list_objects_filter_copy(), it initializes the copy's filter_spec with
string_list_init_dup(). So the copy frees its string_list memory
correctly, but accidentally leaks the extra manual-xstrdup()'d strings!

There is one hiccup, though. In an ideal world, everyone would allocate
the list_objects_filter_options struct with an initializer which used
STRING_LIST_INIT_DUP under the hood. But there are a bunch of existing
callers which think that zero-initializing is good enough. We can leave
them as-is by noting that the list is always initially populated via
parse_list_objects_filter(). So we can just initialize the
strdup_strings flag there.

This is arguably a band-aid, but it works reliably. And it doesn't make
anything harder if we want to switch all the callers later to a new
LIST_OBJECTS_FILTER_INIT.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 11:08:23 -07:00
dd49699d12 transport: free filter options in disconnect_git()
If a user of the transport API calls transport_set_option() with
TRANS_OPT_LIST_OBJECTS_FILTER, it doesn't pass a struct, but rather a
string with the filter-spec, which the transport code then stores in its
own list_objects_filter_options struct.

When the caller is done and we call transport_disconnect(), the contents
of that filter struct are then leaked. We should release it before
freeing the transport struct.

Another way to solve this would be for transport_set_option() to pass a
pointer to the struct. But that's awkward, because there's a generic
transport-option interface that always takes a string. Plus it opens up
questions of memory lifetimes; by storing its own filter-options struct,
the transport code remains self-contained.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 11:07:58 -07:00
3f0e86a158 transport: deep-copy object-filter struct for fetch-pack
When the transport code for the git protocol calls into fetch_pack(), it
has to fill out a fetch_pack_args struct that is mostly taken from the
transport options. We pass along any object-filter data by doing a
struct assignment of the list_objects_filter_options struct. But doing
so isn't safe; it contains allocated pointers in its filter_spec
string_list, which could lead to a double-free if one side mutates or
frees the string_list.

And indeed, the fetch-pack code does clear and rewrite the list via
expand_list_objects_filter_spec(), leaving the transport code with
dangling pointers.

This hasn't been a problem so far, though, because the transport code
doesn't look further at the filter struct. But it should, because in
some cases (when fetch-pack doesn't rewrite the list), it ends up
leaking the string_list.

So let's start by turning this shallow copy into a deep one, which
should let us fix the transport leak in a subsequent patch. Likewise,
we'll free the deep copy we made here when we're done with it (to avoid
leaking).

Note that it would also work to pass fetch-pack a pointer to our filter
struct, rather than a copy. But it's awkward for fetch-pack to take a
pointer in its arg struct; the actual git-fetch-pack command allocates a
fetch_pack_args struct on the stack and expects it to contain the filter
options. It could be rewritten to avoid this, but a deep copy serves our
purposes just as well.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 11:06:14 -07:00
3fbfbbb7e3 list_objects_filter_copy(): deep-copy sparse_oid_name field
The purpose of our copy function is to do a deep copy of each field so
that the source and destination structs become independent. We correctly
copy the filter_spec string list, but we forgot the sparse_oid_name
field. By doing a shallow copy of the pointer, that puts us at risk for
a use-after-free if one or both of the structs is cleaned up.

I don't think this can be triggered in practice, because we tend to leak
the structs rather than actually clean them up. But this should
future-proof us for plugging those leaks.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 11:05:46 -07:00
945ed00957 t1060: check partial clone of misnamed blob
A recent commit (upload-pack: skip parse-object re-hashing of "want"
objects, 2022-09-06) loosened the behavior of upload-pack so that it
does not verify the sha1 of objects it receives directly via "want"
requests. The existing corruption tests in t1060 aren't affected by
this: the corruptions are blobs reachable from commits, and the client
requests the commits.

The more interesting case here is a partial clone, where the client will
directly ask for the corrupted blob when it does an on-demand fetch of
the filtered object. And that is not covered at all, so let's add a
test.

It's important here that we use the "misnamed" corruption and not
"bit-error". The latter is sufficiently corrupted that upload-pack
cannot even figure out the type of the object, so it bails identically
both before and after the recent change. But with "misnamed", with the
hash-checks enabled it sees the problem (though the error messages are a
bit confusing because of the inability to create a "struct object" to
store the flags):

  error: hash mismatch d95f3ad14dee633a758d2e331151e950dd13e4ed
  fatal: git upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed
  fatal: remote error: upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed

After the change to skip the hash check, the server side happily sends
the bogus object, but the client correctly realizes that it did not get
the necessary data:

  remote: Enumerating objects: 1, done.
  remote: Counting objects: 100% (1/1), done.
  remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
  Receiving objects: 100% (1/1), 49 bytes | 49.00 KiB/s, done.
  fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
  error: [...]/misnamed did not send all necessary objects

which is exactly what we expect to happen.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 15:08:51 -07:00
2b43dd0eb5 diff-no-index: simplify argv index calculation
Since 16bb3d714d (diff --no-index: use parse_options() instead of
diff_opt_parse(), 2019-03-24) argc must be 2 if we reach the loop, i.e.
argc - 2 == 0.  Remove that inconsequential term.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:36:43 -07:00
07a6f94a6d diff-no-index: release prefixed filenames
Callers of prefix_filename() are responsible for freeing its result.
Remember the returned strings and release them to appease leak checkers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:34:03 -07:00
fffe7d81a4 diff-no-index: release strbuf on queue error
The strbuf is small and we are about to exit, so we could leave its
cleanup to the OS.  If we release it explicitly at all, however, then we
should do it on early exit as well.  Move the strbuf_release call to a
new cleanup section at the end and make sure all execution paths go
through it.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:33:28 -07:00
9a8c3c4a5f parse_object(): check commit-graph when skip_hash set
If the caller told us that they don't care about us checking the object
hash, then we're free to implement any optimizations that get us the
parsed value more quickly. An obvious one is to check the commit graph
before loading an object from disk. And in fact, both of the callers who
pass in this flag are already doing so before they call parse_object()!

So we can simplify those callers, as well as any possible future ones,
by moving the logic into parse_object().

There are two subtle things to note in the diff, but neither has any
impact in practice:

  - it seems least-surprising here to do the graph lookup on the
    git-replace'd oid, rather than the original. This is in theory a
    change of behavior from the earlier code, as neither caller did a
    replace lookup itself. But in practice it doesn't matter, as we
    disable the commit graph entirely if there are any replace refs.

  - the caller in get_reference() passes the skip_hash flag only if
    revs->verify_objects isn't set, whereas it would look in the commit
    graph unconditionally. In practice this should not matter as we
    should disable the commit graph entirely when using verify_objects
    (and that was done recently in another patch).

So this should be a pure cleanup with no behavior change.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:27:02 -07:00
0bc2557951 upload-pack: skip parse-object re-hashing of "want" objects
Imagine we have a history with commit C pointing to a large blob B.
If a client asks us for C, we can generally serve both objects to them
without accessing the uncompressed contents of B. In upload-pack, we
figure out which commits we have and what the client has, and feed those
tips to pack-objects. In pack-objects, we traverse the commits and trees
(or use bitmaps!) to find the set of objects needed, but we never open
up B. When we serve it to the client, we can often pass the compressed
bytes directly from the on-disk packfile over the wire.

But if a client asks us directly for B, perhaps because they are doing
an on-demand fetch to fill in the missing blob of a partial clone, we
end up much slower. Upload-pack calls parse_object() on the oid we
receive, which opens up the object and re-checks its hash (even though
if it were a commit, we might skip this parse entirely in favor of the
commit graph!). And then we feed the oid directly to pack-objects, which
again calls parse_object() and opens the object. And then finally, when
we write out the result, we may send bytes straight from disk, but only
after having unnecessarily uncompressed and computed the sha1 of the
object twice!

This patch teaches both code paths to use the new SKIP_HASH_CHECK flag
for parse_object(). You can see the speed-up in p5600, which does a
blob:none clone followed by a checkout. The savings for git.git are
modest:

  Test                          HEAD^             HEAD
  ----------------------------------------------------------------------
  5600.3: checkout of result    2.23(4.19+0.24)   1.72(3.79+0.18) -22.9%

But the savings scale with the number of bytes. So on a repository like
linux.git with more files, we see more improvement (in both absolute and
relative numbers):

  Test                          HEAD^                HEAD
  ----------------------------------------------------------------------------
  5600.3: checkout of result    51.62(77.26+2.76)    34.86(61.41+2.63) -32.5%

And here's an even more extreme case. This is the android gradle-plugin
repository, whose tip checkout has ~3.7GB of files:

  Test                          HEAD^               HEAD
  --------------------------------------------------------------------------
  5600.3: checkout of result    79.51(90.84+5.55)   40.28(51.88+5.67) -49.3%

Keep in mind that these timings are of the whole checkout operation. So
they count the client indexing the pack and actually writing out the
files. If we want to see just the server's view, we can hack up the
GIT_TRACE_PACKET output from those operations and replay it via
upload-pack. For the gradle example, that gives me:

  Benchmark 1: GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input
    Time (mean ± σ):     50.884 s ±  0.239 s    [User: 51.450 s, System: 1.726 s]
    Range (min … max):   50.608 s … 51.025 s    3 runs

  Benchmark 2: GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input
    Time (mean ± σ):      9.728 s ±  0.112 s    [User: 10.466 s, System: 1.535 s]
    Range (min … max):    9.618 s …  9.842 s    3 runs

  Summary
    'GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input' ran
      5.23 ± 0.07 times faster than 'GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input'

So a server would see an 80% reduction in CPU serving the initial
checkout of a partial clone for this repository. Or possibly even more
depending on the packing; most of the time spent in the faster one were
objects we had to open during the write phase.

In both cases skipping the extra hashing on the server should be pretty
safe. The client doesn't trust the server anyway, so it will re-hash all
of the objects via index-pack. There is one thing to note, though: the
change in get_reference() affects not just pack-objects, but rev-list,
git-log, etc. We could use a flag to limit to index-pack here, but we
may already skip hash checks in this instance. For commits, we'd skip
anything we load via the commit-graph. And while before this commit we
would check a blob fed directly to rev-list on the command-line, we'd
skip checking that same blob if we found it by traversing a tree.

The exception for both is if --verify-objects is used. In that case,
we'll skip this optimization, and the new test makes sure we do this
correctly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:20:02 -07:00
c868d8e91f parse_object(): allow skipping hash check
The parse_object() function checks the object hash of any object it
parses. This is a nice feature, as it means we may catch bit corruption
during normal use, rather than waiting for specific fsck operations.

But it also can be slow. It's particularly noticeable for blobs, where
except for the hash check, we could return without loading the object
contents at all. Now one may wonder what is the point of calling
parse_object() on a blob in the first place then, but usually it's not
intentional: we were fed an oid from somewhere, don't know the type, and
want an object struct. For commits and trees, the parsing is usually
helpful; we're about to look at the contents anyway. But this is less
true for blobs, where we may be collecting them as part of a
reachability traversal, etc, and don't actually care what's in them. And
blobs, of course, tend to be larger.

We don't want to just throw out the hash-checks for blobs, though. We do
depend on them in some circumstances (e.g., rev-list --verify-objects
uses parse_object() to check them). It's only the callers that know
how they're going to use the result. And so we can help them by
providing a special flag to skip the hash check.

We could just apply this to blobs, as they're going to be the main
source of performance improvement. But if a caller doesn't care about
checking the hash, we might as well skip it for other object types, too.
Even though we can't avoid reading the object contents, we can still
skip the actual hash computation.

If this seems like it is making Git a little bit less safe against
corruption, it may be. But it's part of a series of tradeoffs we're
already making. For instance, "rev-list --objects" does not open the
contents of blobs it prints. And when a commit graph is present, we skip
opening most commits entirely. The important thing will be to use this
flag in cases where it's safe to skip the check. For instance, when
serving a pack for a fetch, we know the client will fully index the
objects and do a connectivity check itself. There's little to be gained
from the server side re-hashing a blob itself. And indeed, most of the
time we don't! The revision machinery won't open up a blob reached by
traversal, but only one requested directly with a "want" line. So
applied properly, this new feature shouldn't make anything less safe in
practice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:18:57 -07:00
dd834d75ca notes, remote: show unknown subcommands between `'
Update the "unknown subcommand" error message in 'git notes' and 'git
remote' to wrap the offending argument between `', to make it
consistent with the "unknown switch/option/subcommand" error messages
in parse-options.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
1c7c25aef1 notes: simplify default operation mode arguments check
'git notes' has a default operation mode, but when invoked without a
subcommand it doesn't accept any arguments (although the 'list'
subcommand implementing the default operation mode does accept
arguments).  The condition checking this ended up a bit awkward, so
let's make it clearer.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
45bec2ead2 test-parse-options.c: fix style of comparison with zero
The preferred style is '!argc' instead of 'argc == 0'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
6983f4e3b2 test-parse-options.c: don't use for loop initial declaration
We would like to eventually use for loop initial declarations in our
codebase, but we are not there yet.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
9a22b4d907 t0040-parse-options: remove leftover debugging
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
9274dea3d9 docs: add CONFIGURATION sections that fuzzy map to built-ins
Add a CONFIGURATION section to the documentation of various built-ins,
for those cases where the relevant config/NAME.txt doesn't map only to
one git-NAME.txt. In particular:

 * config/blame.txt: used by git-{blame,annotate}.txt. Since the
   git-annotate(1) documentation refers to git-blame(1) don't add a
   "CONFIGURATION" section to git-annotate(1), only to git-blame(1).

 * config/checkout.txt: maps to both git-checkout.txt and
   git-switch.txt (but nothing else).

 * config/init.txt: should be included in git-init(1) and
   git-clone(1).

 * config/column.txt: We should ideally mention the relevant subset of
   this in git-{branch,clean,status,tag}.txt, but let's punt on it for
   now. We will when we eventually split these sort of files into
   e.g. config/column.txt and
   config/column/{branch,clean,status,tag}.txt, with the former
   including the latter set.

Things that are being left out, and why:

 * config/{remote,remotes,credential}.txt: Configuration that affects
   how we talk to remote repositories is harder to untangle. We'll need
   to include some of this in git-{fetch,remote,push,ls-remote}.txt
   etc., but some of those only use a small subset of these
   options. Let's leave this for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:07 -07:00
16f6b0d1aa docs: add CONFIGURATION sections that map to a built-in
Add a CONFIGURATION section to the documentation of various built-ins,
for those cases where the relevant config/NAME.txt describes
configuration that is only used by the relevant built-in documented in
git-NAME.txt. Subsequent commits will handle more complex cases.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:06 -07:00
00c80534f6 log docs: de-duplicate configuration sections
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:06 -07:00
2a9dfdf260 difftool docs: de-duplicate configuration sections
Include the "config/difftool.txt" file in "git-difftool.txt", and move
the relevant part of git-difftool(1) configuration from
"config/diff.txt" to config/difftool.txt".

Doing this is slightly odd, as we usually discuss configuration in
alphabetical order, but by doing it we're able to include the full set
of configuration used by git-difftool(1) (and only that configuration)
in its own documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:06 -07:00
5bd277e2e2 notes docs: de-duplicate and combine configuration sections
Combine the various "notes" configuration sections spread across
Documentation/config/notes.txt and Documentation/git-notes.txt to live
in the former, and to be included in the latter.

We'll now forward link from "git notes" to the "CONFIGURATION" section
below, rather than to "git-config(1)" when discussing configuration
variables that are (also) discussed in that section.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:06 -07:00
416fed246f apply docs: de-duplicate configuration sections
The wording is not identical to Documentation/config/apply.txt, but
that version is better.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:05 -07:00
bac1d52cfe send-email docs: de-duplicate configuration sections
De-duplicate the discussion of "send-email" configuration, such that
the "git-config(1)" manual page becomes the source of truth, and
"git-send-email(1)" includes the relevant part.

Most commands that suffered from such duplication had diverging text
discussing the same variables, but in this case some config was also
only discussed in one or the other.

This is mostly a move-only change, the exception is a minor rewording
of changing wording like "see above" to "see linkgit:git-config[1]",
as well as a clarification about the big section of command-line
option tweaking config being discussed in git-send-email(1)'s main
docs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:05 -07:00
a2811dd7c4 grep docs: de-duplicate configuration sections
Include the "config/grep.txt" file in "git-grep.txt", instead of
repeating an almost identical description of the "grep" configuration
variables in two places.

There is no loss of information here that isn't shown in the addition
to "grep.txt". This change was made by copying the contents of
"git-grep.txt"'s version over the "grep.txt" version. Aside from the
change "grep.txt" being made here the two were identical.

This documentation started being copy/pasted around in
b22520a37c (grep: allow -E and -n to be turned on by default via
configuration, 2011-03-30). After that in e.g. 6453f7b348 (grep: add
grep.fullName config variable, 2014-03-17) they started drifting
apart, with only grep.fullName being described in the command
documentation.

In 434e6e753f (config.txt: move grep.* to a separate file,
2018-10-27) we gained the include, but didn't do this next step, let's
do it now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:05 -07:00
18d89fe25c docs: add and use include template for config/* includes
In b6a8d09f6d (gc docs: include the "gc.*" section from "config" in
"gc", 2019-04-07) the "git gc" documentation was made to include the
config/gc.txt in its "CONFIGURATION" section. We do that in several
other places, but "git gc" was the only one with a blurb above the
include to orient the reader.

We don't want readers to carefully scrutinize "git-config(1)" and
"git-gc(1)" looking for discrepancies, instead we should tell them
that the latter includes a part of the former.

This change formalizes that wording in two new templates to be
included, one for the "git gc" case where the entire section is
included from "git-config(1)", and another for when the inclusion of
"git-config(1)" follows discussion unique to that documentation. In
order to use that re-arrange the order of those being discussed in the
"git-merge(1)" documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:46:05 -07:00
b27ccae34b rev-list: disable commit graph with --verify-objects
Since the point of --verify-objects is to actually load and checksum the
bytes of each object, optimizing out reads using the commit graph runs
contrary to our goal.

The most targeted way to implement this would be for the revision
traversal code to check revs->verify_objects and avoid using the commit
graph. But it's difficult to be sure we've hit all of the correct spots.
For instance, I started this patch by writing the first of the included
test cases, where the corrupted commit is directly on rev-list's command
line. And that is easy to fix by teaching get_reference() to check
revs->verify_objects before calling lookup_commit_in_graph().

But that doesn't cover the second test case: when we traverse to a
corrupted commit, we'd parse the parent in process_parents(). So we'd
need to check there, too. And it keeps going. In handle_commit() we
sometimes parses commits, too, though I couldn't figure out a way to
trigger it that did not already parse via get_reference() or tag
peeling. And try_to_simplify_commit() has its own parse call, and so on.

So it seems like the safest thing is to just disable the commit graph
for the whole process when we see the --verify-objects option. We can do
that either in builtin/rev-list.c, where we use the option, or in
revision.c, where we parse it. There are some subtleties:

  - putting it in rev-list.c is less surprising in some ways, because
    there we know we are just doing a single traversal. In a command
    which does multiple traversals in a single process, it's rather
    unexpected to globally disable the commit graph.

  - putting it in revision.c is less surprising in some ways, because
    the caller does not have to remember to disable the graph
    themselves. But this is already tricky! The verify_objects flag in
    rev_info doesn't do anything by itself. The caller has to provide an
    object callback which does the right thing.

  - for that reason, in practice nobody but rev-list uses this option in
    the first place. So the distinction is probably not important either
    way. Arguably it should just be an option of rev-list, and not the
    general revision machinery; right now you can run "git log
    --verify-objects", but it does not actually do anything useful.

  - checking for a parsed revs.verify_objects flag in rev-list.c is too
    late. By that time we've already passed the arguments to
    setup_revisions(), which will have parsed the commits using the
    graph.

So this commit disables the graph as soon as we see the option in
revision.c. That's a pretty broad hammer, but it does what we want, and
in practice nobody but rev-list is using this flag anyway.

The tests cover both the "tip" and "parent" cases. Obviously our hammer
hits them both in this case, but it's good to check both in case
somebody later tries the more focused approach.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:44:30 -07:00
d6045294a9 lookup_commit_in_graph(): use prepare_commit_graph() to check for graph
We exit early from lookup_commit_in_graph() if the commit_graph pointer
is NULL, under the assumption that we don't have a graph to look at. But
the graph pointer is lazy-loaded; if no other code happens to have
called prepare_commit_graph(), we'll incorrectly assume that one isn't
available at all.

This has a pretty small performance impact in practice, because the
fallback will generally be to call parse_object() instead. That ends up
in parse_commit_buffer(), which loads the graph data itself. So the
first commit we see won't use the graph, but subsequent ones will. Since
using the graph is just an optimization there's generally no
user-visible difference, but if you instrument rev-list like so:

  diff --git a/revision.c b/revision.c
  index ee702e498a..63c488ffb6 100644
  --- a/revision.c
  +++ b/revision.c
  @@ -381,6 +381,9 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
           * parsing commit data from disk.
           */
          commit = lookup_commit_in_graph(revs->repo, oid);
  +       warning("%s %s in commit graph",
  +               commit ? "found" : "did not find",
  +               name);
          if (commit)
                  object = &commit->object;
          else

and run (in git.git):

  git commit-graph write --reachable
  git rev-list origin/master origin/next >/dev/null

you'll see that we fail to find the first one:

  warning: did not find origin/master in commit graph
  warning: found origin/next in commit graph

After this patch, you'll see that we find both:

  warning: found origin/master in commit graph
  warning: found origin/next in commit graph

Even though the performance implication is small here, there are two
important reasons to do this:

  - it's downright confusing if you are hunting a bug triggered by the
    use of the commit graph. It may or may not trigger depending on the
    number and ordering of tips you ask for.

  - prepare_commit_graph() has other policy logic, too. In particular,
    if we've loaded a commit graph and then disabled the graph via
    disable_commit_graph(), that should take precedence.

    I'm not sure if this can trigger bad behavior in practice. The only
    caller there is upload-pack's deepen_by_rev_list(), which should be
    avoiding the commit graph for its traversal tips, but probably
    wasn't before this patch. Whether you could come up with a case
    where that mattered is unclear. Still, this is obviously the right
    thing to be doing.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:44:28 -07:00
79f2338b37 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-05 18:33:41 -07:00
27fb520ef2 Merge branch 'jk/test-crontab-fixes'
Test helper fix.

* jk/test-crontab-fixes:
  test-crontab: minor memory and error handling fixes
2022-09-05 18:33:41 -07:00
fcbc8743ef Merge branch 'en/test-without-test-create-repo'
Test clean-up.

* en/test-without-test-create-repo:
  t64xx: convert 'test_create_repo' to 'git init'
2022-09-05 18:33:41 -07:00
56785a3fad Merge branch 'bc/gc-crontab-fix'
FreeBSD portability fix for "git maintenance" that spawns "crontab"
to schedule tasks.

* bc/gc-crontab-fix:
  gc: use temporary file for editing crontab
2022-09-05 18:33:41 -07:00
2d88021919 Merge branch 'es/t4301-sed-portability-fix'
Test clean-up.

* es/t4301-sed-portability-fix:
  t4301: emit blank line in more idiomatic fashion
  t4301: fix broken &&-chains and add missing loop termination
  t4301: account for behavior differences between sed implementations
2022-09-05 18:33:40 -07:00
5784d201da Merge branch 'rs/test-mergesort'
Optimization of a test-helper command.

* rs/test-mergesort:
  test-mergesort: use mem_pool for sort input
  test-mergesort: read sort input all at once
2022-09-05 18:33:40 -07:00
b5d2e9924f Merge branch 'rs/tempfile-cleanup-race-fix'
The clean-up of temporary files created via mks_tempfile_dt() was
racy and attempted to unlink() the leading directory when signals
are involved, which has been corrected.

* rs/tempfile-cleanup-race-fix:
  tempfile: avoid directory cleanup race
2022-09-05 18:33:40 -07:00
3fe0121479 Merge branch 'ac/bitmap-lookup-table'
The pack bitmap file gained a bitmap-lookup table to speed up
locating the necessary bitmap for a given commit.

* ac/bitmap-lookup-table:
  pack-bitmap-write: drop unused pack_idx_entry parameters
  bitmap-lookup-table: add performance tests for lookup table
  pack-bitmap: prepare to read lookup table extension
  pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests
  pack-bitmap-write.c: write lookup table extension
  bitmap: move `get commit positions` code to `bitmap_writer_finish`
  Documentation/technical: describe bitmap lookup table extension
2022-09-05 18:33:39 -07:00
cf98b69053 Merge branch 'tb/midx-with-changing-preferred-pack-fix'
Multi-pack index got corrupted when preferred pack changed from one
pack to another in a certain way, which has been corrected.

* tb/midx-with-changing-preferred-pack-fix:
  midx.c: avoid adding preferred objects twice
  midx.c: include preferred pack correctly with existing MIDX
  midx.c: extract `midx_fanout_add_pack_fanout()`
  midx.c: extract `midx_fanout_add_midx_fanout()`
  midx.c: extract `struct midx_fanout`
  t/lib-bitmap.sh: avoid silencing stderr
  t5326: demonstrate potential bitmap corruption
2022-09-05 18:33:39 -07:00
9eb7a73158 Documentation/technical: include Scalar technical doc
Include 'Documentation/technical/scalar.txt' alongside the other HTML
technical docs when installing them.

Now that the document is intended as a widely-accessible reference, remove
the internal work-in-progress roadmap from the document. Those details
should no longer be needed to guide Scalar's development and, if they were
left, they could fall out-of-date and be misleading to readers.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
ba1b117eec t/perf: add 'GIT_PERF_USE_SCALAR' run option
Add a 'GIT_PERF_USE_SCALAR' environment variable (and corresponding perf
config 'useScalar') to register a repository created with any of:

* test_perf_fresh_repo
* test_perf_default_repo
* test_perf_large_repo

as a Scalar enlistment. This is intended to allow a developer to test the
impact of Scalar on already-defined performance scenarios.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
e2809233d1 t/perf: add Scalar performance tests
Create 'p9210-scalar.sh' for testing Scalar performance and comparing
performance of Git operations in Scalar registrations and standard
repositories. Example results:

Test                                                   this tree
------------------------------------------------------------------------
9210.2: scalar clone                                   14.82(18.00+3.63)
9210.3: git clone                                      26.15(36.67+6.90)
9210.4: git status (scalar)                            0.04(0.01+0.01)
9210.5: git status (non-scalar)                        0.10(0.02+0.11)
9210.6: test_commit --append --no-tag A (scalar)       0.08(0.02+0.03)
9210.7: test_commit --append --no-tag A (non-scalar)   0.13(0.03+0.11)

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
14b4e7e5a4 scalar-clone: add test coverage
Create a new test file ('t9211-scalar-clone.sh') to exercise the options and
behavior of the 'scalar clone' command. Each test clones to a unique target
location and cleans up the cloned repo only when the test passes. This
ensures that failed tests' artifacts are captured in CI artifacts for
further debugging.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
cc75e556a9 scalar: add to 'git help -a' command list
Add 'scalar' as a 'mainporcelain' command in the Git command list. Update
the regex in 'cmd-list.perl' used to match the first line of command
documentation to find 'scalar(1)'.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
951759d3a5 scalar: implement the help subcommand
It is merely handing off to `git help scalar`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
dd9603e228 git help: special-case scalar
With this commit, `git help scalar` will open the appropriate manual
or HTML page (instead of looking for `gitscalar`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
7b5c93c6c6 scalar: include in standard Git build & installation
Move 'scalar' out of 'contrib/' and into the root of the Git tree. The goal
of this change is to build 'scalar' as part of the standard Git build &
install processes.

This patch includes both the physical move of Scalar's files out of
'contrib/' ('scalar.c', 'scalar.txt', and 't9xxx-scalar.sh'), and the
changes to the build definitions in 'Makefile' and 'CMakelists.txt' to
accommodate the new program.

At a high level, Scalar is built so that:
- there is a 'scalar-objs' target (similar to those created in 029bac01a8
  (Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets,
  2021-02-23)) for debugging purposes.
- it appears in the root of the install directory (rather than the
  gitexecdir).
- it is included in the 'bin-wrappers/' directory for use in tests.
- it receives a platform-specific executable suffix (e.g., '.exe'), if
  applicable.
- 'scalar.txt' is installed as 'man1' documentation.
- the 'clean' target removes the 'scalar' executable.

Additionally, update the root level '.gitignore' file to ignore the Scalar
executable.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:55 -07:00
b6faef396f scalar: fix command documentation section header
Rename the last section header in 'contrib/scalar/scalar.txt' from "Scalar"
to "GIT". The linting rules of the 'documentation' CI build enforce the
existence of a "GIT" section in command documentation. Although 'scalar.txt'
is not yet checked, it will be in a future patch.

Here, changing the header name is more appropriate than making a
Scalar-specific exception to the linting rule. The existing "Scalar" section
contains only a link back to the main Git documentation, essentially the
same as the "GIT" section in builtin documentation. Changing the section
name further clarifies the Scalar-Git association and maintains consistency
with the rest of Git.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:55 -07:00
037f8ea6d9 unpack-trees: fix sparse directory recursion check
Ensure 'is_sparse_directory_entry()' receives a valid 'name_entry *' if one
exists in the list of tree(s) being unpacked in 'unpack_callback()'.

Currently, 'is_sparse_directory_entry()' is called with the first
'name_entry' in the 'names' list of entries on 'unpack_callback()'. However,
this entry may be empty even when other elements of 'names' are not (such as
when switching from an orphan branch back to a "normal" branch). As a
result, 'is_sparse_directory_entry()' could incorrectly indicate that a
sparse directory is *not* actually sparse because the name of the index
entry does not match the (empty) 'name_entry' path.

Fix the issue by using the existing 'name_entry *p' value in
'unpack_callback()', which points to the first non-empty entry in 'names'.
Because 'p' is 'const', also update 'is_sparse_directory_entry()'s
'name_entry *' argument to be 'const'.

Finally, add a regression test case.

Reported-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:43:09 -07:00
67360b75c6 diff: fix filtering of merge commits under --remerge-diff
Commit 95433eeed9 ("diff: add ability to insert additional headers for
paths", 2022-02-02) introduced the possibility of additional headers.
Because there could be conflicts with no content differences (e.g. a
modify/delete conflict resolved in favor of taking the modified file
as-is), that commit also modified the diff_queue_is_empty() and
diff_flush_patch() logic to ensure these headers were included even if
there was no associated content diff.

However, the added logic was a bit inconsistent between these two
functions.  diff_queue_is_empty() overlooked the fact that the
additional headers strmap could be non-NULL and empty, which would cause
it to display commits that should have been filtered out.

Fix the diff_queue_is_empty() logic to also account for
additional_path_headers being empty.

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:22:25 -07:00
71a146dc70 diff: fix filtering of additional headers under --remerge-diff
Commit 95433eeed9 ("diff: add ability to insert additional headers for
paths", 2022-02-02) introduced the possibility of additional headers.
Because there could be conflicts with no content differences (e.g. a
modify/delete conflict resolved in favor of taking the modified file
as-is), that commit also modified the diff_queue_is_empty() and
diff_flush_patch() logic to ensure these headers were included even if
there was no associated content diff.

However, when the pickaxe is active, we really only want the remerge
conflict headers to be shown when there is an associated content diff.
Adjust the logic in these two functions accordingly.

This also removes the TEST_PASSES_SANITIZE_LEAK=true declaration from
t4069, as there is apparently some kind of memory leak with the pickaxe
code.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:22:25 -07:00
9b08091cb7 diff: have submodule_format logic avoid additional diff headers
Commit 95433eeed9 ("diff: add ability to insert additional headers for
paths", 2022-02-02) introduced the possibility of additional headers,
created in create_filepairs_for_header_only_notifications().  These are
represented by inserting additional pairs in diff_queued_diff which
always have a mode of 0 and a null_oid.  When these were added, one
code path was noted to assume that at least one of the diff_filespecs
in the pair were valid, and that codepath was corrected.

The submodule_format handling is another codepath with the same issue;
it would operate on these additional headers and attempt to display them
as submodule changes.  Prevent that by explicitly checking for "phoney"
filepairs (i.e. filepairs with both modes being 0).

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:22:25 -07:00
fe4c750fb1 submodule--helper: fix a configure_added_submodule() leak
Fix config API a memory leak added in a452128a36 (submodule--helper:
introduce add-config subcommand, 2021-08-06) by using the *_tmp()
variant of git_config_get_string().

In this case we're only checking whether
the (repo|git)_config_get_string() call is telling us that the
"submodule.active" key exists.

As with the preceding commit we'll find many other such patterns in
the codebase if we go fishing. E.g. "git gc" leaks in the code added
in 61f7a383d3 (maintenance: use 'incremental' strategy by default,
2020-10-15). Similar code in "git gc" added in
b08ff1fee0 (maintenance: add --schedule option and config,
2020-09-11) doesn't leak, but we could avoid the malloc() & free() in
that case.

A coccinelle rule to find those would find and fix some leaks, and
cases where we're doing needless malloc() + free()'s but only care
about the key existence, or are copying
the (repo|git)_config_get_string() return value right away.

But as with the preceding commit let's punt on all of that for now,
and just narrowly fix this specific case in submodule--helper.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
4c4d3e7c0a submodule--helper: free rest of "displaypath" in "struct update_data"
Fix a leak in code added in c51f8f94e5 (submodule--helper: run update
procedures from C, 2021-08-24), we clobber the "displaypath" member of
the passed-in "struct update_data" both so that die() messages in this
update_submodule() function itself can use it, and for the
run_update_procedure() called within this function.

Fix a leak in code added in 51f8f94e5b (submodule--helper: run update
procedures from C, 2021-08-24). We'd always clobber the old
"displaypath" member of the previously passed-in "struct update_data".

A better fix for this would be to remove the "displaypath" member from
the "struct update_data" entirely. Along with "oid", "suboid",
"just_cloned" and "sm_path" it's managing members that mainly need to
be passed between 1-3 stack frames of functions adjacent to this
code. But doing so would be a much larger change (I have it locally,
and fully untangling that in an incremental way is a 10 patch
journey).

So let's go for this much more isolated fix suggested by Glen. We
FREE_AND_NULL() the "update_data->displaypath", the "AND_NULL()" part
of that is needed due to the later "free(ud->displaypath)" in
"update_data_release()" introduced in the preceding commit

Moving ensure_core_worktree() out of update_submodule() may not be
strictly required, but in doing so we are left with the exact same
ordering as before, making this a smaller functional change.

Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
d40c42e06b submodule--helper: free some "displaypath" in "struct update_data"
Make the update_data_release() function free "displaypath" member when
appropriate. The "displaypath" member is always ours, the "const" on
the "char *" was wrong to begin with.

This leaves a leak of "displaypath" in update_submodule(), which as
we'll see in subsequent commits is harder to deal with than this
trivial fix.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
25b6a95d03 submodule--helper: fix a memory leak in print_status()
Fix a leak in print_status(), the compute_rev_name() function
implemented in this file will return a strbuf_detach()'d value, or
NULL.

This leak has existed since this code was added in
a9f8a37584 (submodule: port submodule subcommand 'status' from shell
to C, 2017-10-06), but in 0b5e2ea7cf (submodule--helper: don't print
null in 'submodule status', 2018-04-18) we added a "const"
intermediate variable for the return value, that "const" should be
removed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
623bd7d154 submodule--helper: fix a leak in module_add()
Fix a leak in module_path(), since a6226fd772 (submodule--helper:
convert the bulk of cmd_add() to C, 2021-08-10), we've been freeing
add_data.sm_path, but in this case we clobbered it, and didn't free
the value we clobbered.

This makes test 28 of "t/t7400-submodule-basic.sh" ("submodule add in
subdirectory") pass when we're compiled with SANITIZE=leak..

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
4e83605d38 submodule--helper: fix obscure leak in module_add()
Fix an obscure leak in module_add(), if the "git add" command we were
piping to failed we'd fail to strbuf_release(&sb). This fixes a leak
introduced in a6226fd772 (submodule--helper: convert the bulk of
cmd_add() to C, 2021-08-10).

In fixing it move to a "goto cleanup" pattern, and since we need to
introduce a "ret" variable to do that let's also get rid of the
intermediate "exit_code" variable. The initialization to "-1" in
a6226fd772 has always been redundant, we'd only use the "exit_code"
value after assigning the return value of pipe_command() to it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
4c81ee9669 submodule--helper: fix "reference" leak
Fix leaks in the "reference" variable declared in add_submodule() and
module_clone().

In preceding commits this variable was refactored out of the "struct
module_clone_data", but the leak has been with us since
31224cbdc7 (clone: recursive and reference option triggers submodule
alternates, 2016-08-17) and 8c8195e9c3 (submodule--helper: introduce
add-clone subcommand, 2021-07-10).

Those commits added an xstrdup()'d member of the
STRING_LIST_INIT_NODUP'd "struct string_list". We need to free()
those, but not the ones we get from argv, let's make use of the "util"
member, if it has a pointer it's the pointer we'll need to free,
otherwise it'll be NULL (i.e. from argv).

Note that the free() of the "util" member is needed in both
module_clone() and add_submodule(). The module_clone() function itself
doesn't populate the "util" pointer as add_submodule() does, but
module_clone() is upstream of the
add_possible_reference_from_superproject() caller we're modifying
here, which does do that.

This does preclude the use of the "util" pointer for any other reasons
for now, but that's OK. If we ever need to use it for something else
we could turn it into a small "struct" with an optional "to_free"
member, and switch to using string_list_clear_func().

Alternatively we could have another "struct string_list to_free" which
would keep a copy of the strings we've dup'd to free(). But for now
this is perfectly adequate.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
ae3ef94d9b submodule--helper: fix a memory leak in get_default_remote_submodule()
Fix a memory leak in the get_default_remote_submodule() function added
in a77c3fcb5e (submodule--helper: get remote names from any
repository, 2022-03-04), we need to repo_clear() the submodule we
initialize.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
17af0a8444 submodule--helper: fix a leak with repo_clear()
Call repo_clear() in ensure_core_worktree() to free the "struct
repository". Fixes a leak that's been here since
74d4731da1 (submodule--helper: replace connect-gitdir-workingtree by
ensure-core-worktree, 2018-08-13).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
980416e469 submodule--helper: fix "sm_path" and other "module_cb_list" leaks
Fix leaks in "struct module_cb_list" and the "struct module_cb" which
it contains, these fix leaks in e83e3333b5 (submodule: port submodule
subcommand 'summary' from shell to C, 2020-08-13).

The "sm_path" should always have been a "char *", not a "const
char *", we always create it with xstrdup().

We can't mark any tests passing passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true" as a result of this change, but
"t7401-submodule-summary.sh" gets closer to passing as a result of
this change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
61adac6c4b submodule--helper: fix "errmsg_str" memory leak
Fix a memory leak introduced in e83e3333b5 (submodule: port submodule
subcommand 'summary' from shell to C, 2020-08-13), we sometimes append
to the "errmsg", and need to free the "struct strbuf".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
87a683482a submodule--helper: add and use *_release() functions
Add release functions for "struct module_list", "struct
submodule_update_clone" and "struct update_data".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
0a4d31537d submodule--helper: don't leak {run,capture}_command() cp.dir argument
Fix a memory leak in c51f8f94e5 (submodule--helper: run update
procedures from C, 2021-08-24) and 3c3558f095 (submodule--helper: run
update using child process struct, 2022-03-15) by not allocating
memory in the first place.

The "dir" member of "struct child_process" will not be modified by
that API, and it's declared to be "const char *". So let's not
needlessly duplicate these strings.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
4b9d12460d submodule--helper: "struct pathspec" memory leak in module_update()
The module_update() function calls module_list_compute() twice, which
in turn will reset the "struct pathspec" passed to it. Let's instead
track two of them, and clear them both.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
8fb201d4da submodule--helper: fix most "struct pathspec" memory leaks
Call clear_pathspec() at the end of various functions that work with
and allocate a "struct pathspec".

In some cases the zero-initialization here isn't strictly needed, but
as we're moving to a "goto cleanup" pattern let's make sure that it's
safe to call clear_pathspec(), we don't want the data to be
uninitialized.

E.g. for module_foreach() we can see from looking at
module_list_compute() that if it returns non-zero that the "pathspec"
will always have been initialized. But relying on that both assumes
knowledge about parse_pathspec(), and would set up a fragile pattern
going forward.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
d76260e60a submodule--helper: fix trivial get_default_remote_submodule() leak
Fix a leak in code added in 1012a5cbc3 (submodule--helper
run-update-procedure: learn --remote, 2022-03-04), we need to free()
the xstrdup()'d string. This gets e.g. t/t7419-submodule-set-branch.sh
closer to passing under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
e77b3da6bb submodule--helper: fix a leak in "clone_submodule"
Fix a memory leak of the "clone_data_path" variable that we copy or
derive from the "struct module_clone_data" in clone_submodule(). This
code was refactored in preceding commits, but the leak has been with
us since f8eaa0ba98 (submodule--helper, module_clone: always operate
on absolute paths, 2016-03-31).

For the "else" case we don't need to xstrdup() the "clone_data->path",
and we don't need to free our own "clone_data_path". We can therefore
assign the "clone_data->path" to our own "clone_data_path" right away,
and only override it (and remember to free it!) if we need to
xstrfmt() a replacement.

In the case of the module_clone() caller it's from "argv", and doesn't
need to be free'd, and in the case of the add_submodule() caller we
get a pointer to "sm_path", which doesn't need to be directly free'd
either.

Fixing this leak makes several tests pass, so let's mark them as
passing with TEST_PASSES_SANITIZE_LEAK=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
5647d743e3 Merge branch 'ab/submodule-helper-prep' into ab/submodule-helper-leakfix
* ab/submodule-helper-prep: (33 commits)
  submodule--helper: fix bad config API usage
  submodule--helper: libify even more "die" paths for module_update()
  submodule--helper: libify more "die" paths for module_update()
  submodule--helper: check repo{_submodule,}_init() return values
  submodule--helper: libify "must_die_on_failure" code paths (for die)
  submodule--helper update: don't override 'checkout' exit code
  submodule--helper: libify "must_die_on_failure" code paths
  submodule--helper: libify determine_submodule_update_strategy()
  submodule--helper: don't exit() on failure, return
  submodule--helper: use "code" in run_update_command()
  submodule API: don't handle SM_..{UNSPECIFIED,COMMAND} in to_string()
  submodule--helper: don't call submodule_strategy_to_string() in BUG()
  submodule--helper: add missing braces to "else" arm
  submodule--helper: return "ret", not "1" from update_submodule()
  submodule--helper: rename "int res" to "int ret"
  submodule--helper: don't redundantly check "else if (res)"
  submodule--helper: refactor "errmsg_str" to be a "struct strbuf"
  submodule--helper: add "const" to passed "struct update_data"
  submodule--helper: add "const" to copy of "update_data"
  submodule--helper: add "const" to passed "module_clone_data"
  ...
2022-09-02 09:17:17 -07:00
d4a492f4ad submodule--helper: fix bad config API usage
Fix bad config API usage added in a452128a36 (submodule--helper:
introduce add-config subcommand, 2021-08-06). After
git_config_get_string() returns successfully we know the "char **dest"
will be non-NULL.

A coccinelle patch that transforms this turns up a couple of other
such issues, one in fetch-pack.c, and another in upload-pack.c:

	@@
	identifier F =~ "^(repo|git)_config_get_string(_tmp)?$";
	identifier V;
	@@
	  !F(..., &V)
	- && (V)

But let's focus narrowly on submodule--helper for now, we can fix
those some other time.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:25 -07:00
86e16ed3a9 submodule--helper: libify even more "die" paths for module_update()
As noted in a preceding commit the get_default_remote_submodule() and
remote_submodule_branch() functions would invoke die(), and thus leave
update_submodule() only partially lib-ified. We've addressed the
former of those in a preceding commit, let's now address the latter.

In addition to lib-ifying the function this fixes a potential (but
obscure) segfault introduced by a logic error in
1012a5cbc3 (submodule--helper run-update-procedure: learn --remote,
2022-03-04):

We were assuming that remote_submodule_branch() would always return
non-NULL, but if the submodule_from_path() call in that function fails
we'll return NULL. See its introduction in
92bbe7ccf1 (submodule--helper: add remote-branch helper,
2016-08-03). I.e. we'd previously have segfaulted in the xstrfmt()
call in update_submodule() seen in the context.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:25 -07:00
f5373deabd submodule--helper: libify more "die" paths for module_update()
As noted in a preceding commit the get_default_remote_submodule() and
remote_submodule_branch() functions would invoke die(), and thus leave
update_submodule() only partially lib-ified. Let's address the former
of those cases.

Change the functions to return an int exit code (non-zero on failure),
while leaving the get_default_remote() function for the callers that
still want the die() semantics.

This change addresses 1/2 of the "die" issue in these two lines in
update_submodule():

	char *remote_name = get_default_remote_submodule(update_data->sm_path);
	const char *branch = remote_submodule_branch(update_data->sm_path);

We can safely remove the "!default_remote" case from sync_submodule(),
because our get_default_remote_submodule() function now returns a
die_message() on failure, so we can have it and other callers check if
the exit code should be non-zero instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:25 -07:00
1e8697b5c4 submodule--helper: check repo{_submodule,}_init() return values
Fix code added in ce125d431a (submodule: extract path to submodule
gitdir func, 2021-09-15) and a77c3fcb5e (submodule--helper: get
remote names from any repository, 2022-03-04) which failed to check
the return values of repo_init() and repo_submodule_init(). If we
failed to initialize the repository or submodule we could segfault
when trying to access the invalid repository structs.

Let's also check that these were the only such logic errors in the
codebase by making use of the "warn_unused_result" attribute. This is
valid as of GCC 3.4.0 (and clang will catch it via its faking of
__GNUC__ ).

As the comment being added to git-compat-util.h we're piggy-backing on
the LAST_ARG_MUST_BE_NULL version check out of lazyness. See
9fe3edc47f (Add the LAST_ARG_MUST_BE_NULL macro, 2013-07-18) for its
addition. The marginal benefit of covering gcc 3.4.0..4.0.0 is
near-zero (or zero) at this point. It mostly matters that we catch
this somewhere.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
ac350155de submodule--helper: libify "must_die_on_failure" code paths (for die)
Continue the libification of codepaths that previously relied on
"must_die_on_failure". In these cases we've always been early aborting
by calling die(), but as we know that these codepaths will properly
handle return codes of 128 to mean an early abort let's have them use
die_message() instead.

This still isn't a complete migration away from die() for these
codepaths, in particular this code in update_submodule() will still call die() in some cases:

	char *remote_name = get_default_remote_submodule(update_data->sm_path);
	const char *branch = remote_submodule_branch(update_data->sm_path);

But as that code is used by other callers than the "update" code let's
leave converting it for a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
a03c01de2f submodule--helper update: don't override 'checkout' exit code
When "git submodule update" runs it might call "checkout", "merge",
"rebase", or a custom command. Ever since run_update_command() was
added in c51f8f94e5 (submodule--helper: run update procedures from C,
2021-08-24) we'd either exit immediately if the
"submodule.<name>.update" method failed, or in the case of "checkout"
continue trying to update other submodules.

This code used to use the magical "2" return code, but in
55b3f12cb5 (submodule update: use die_message(), 2022-03-15) it was
made to exit(128), which in preceding commits has been changed to
return that 128 code to the top-level.

Let's "libify" this code even more by not having it arbitrarily
override the return code. In practice this doesn't change anything as
the code "git checkout" would return on any normal failure is "1", but
we'll now in principle properly abort the operation if "git checkout"
were to exit with 128.

It would make sense to follow-up this change with a change to allow
the "submodule.<name>.update = !..." (SM_UPDATE_COMMAND) method the
same liberties as "checkout", and perhaps to do the same with a failed
"merge" or "rebase". But let's leave that for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
d905d4432f submodule--helper: libify "must_die_on_failure" code paths
In preceding commits the codepaths around update_submodules() were
changed from using exit() or die() to ferrying up a
"must_die_on_failure" in the cases where we'd exit(), and in most
cases where we'd die().

We needed to do this this to ensure that we'd early exit or otherwise
abort the update_submodules() processing before it was completed.

Now that those preceding changes have shown that we've converted those
paths, we can remove the remaining "ret == 128" special-cases, leaving
the only such special-case in update_submodules(). I.e. we now know
after having gone through the various codepaths that we were only
returning 128 if we meant to early abort.

In update_submodules() we'll for now set any non-zero non-128 exit
codes to "1", but will start ferrying up the exit code as-is in a
subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
484f9150e6 submodule--helper: libify determine_submodule_update_strategy()
Libify the determine_submodule_update_strategy() by having it invoke
die_message() rather than die(), and returning the code die_message()
returns on failure.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
2cb9294b99 submodule--helper: don't exit() on failure, return
Change code downstream of module_update() to short-circuit and return
to the top-level on failure, rather than calling exit().

To do so we need to diligently check whether we "must_die_on_failure",
which is a pattern started in c51f8f94e5 (submodule--helper: run
update procedures from C, 2021-08-24), but which hadn't been completed
to the point where we could avoid calling exit() here.

This introduces no functional changes, but makes it easier to both
call these routines as a library in the future, and to eventually
avoid leaking memory.

This and similar control flow in submodule--helper.c could be made
simpler by properly "libifying" it, i.e. to have it consistently
return -1 on failures, and to early return on any non-success.

But let's leave that larger project for now, and (mostly) emulate what
were doing with the "exit(128)" before this change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
6870cdc32a submodule--helper: use "code" in run_update_command()
Apply some DRY principles in run_update_command() and don't have two
"switch" statements over "ud->update_strategy.type" determine the same
thing.

First we were setting "must_die_on_failure = 1" in all cases except
"SM_UPDATE_CHECKOUT" (and we'd BUG(...) out on the rest). This code
was added in c51f8f94e5 (submodule--helper: run update procedures
from C, 2021-08-24).

Then we'd duplicate same "switch" logic when we were using the
"must_die_on_failure" variable.

Let's instead have the "case" branches in that inner "switch"
determine whether or not the "update must continue" by picking an exit
code.

This also mostly avoids hardcoding the "128" exit code, instead we can
make use of the return value of the die_message() function, which
we've been calling here since 55b3f12cb5 (submodule update: use
die_message(), 2022-03-15). We're still hardcoding it to determine if
we "exit()", but subsequent commit(s) will address that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
b9dd63ffe2 submodule API: don't handle SM_..{UNSPECIFIED,COMMAND} in to_string()
Change the submodule_strategy_to_string() function added in
3604242f08 (submodule: port init from shell to C, 2016-04-15) to
really return a "const char *". In the "SM_UPDATE_COMMAND" case it
would return a strbuf_detach().

Furthermore, this function would return NULL on SM_UPDATE_UNSPECIFIED,
so it wasn't safe to xstrdup() its return value in the general case,
or to use it in a sprintf() format as the code removed in the
preceding commit did.

But its callers would never call it with either SM_UPDATE_UNSPECIFIED
or SM_UPDATE_COMMAND. Let's have its behavior reflect how its only
user expects it to behave, and BUG() out on the rest.

By doing this we can also stop needlessly xstrdup()-ing and free()-ing
the memory for the config we're setting. We can instead always use
constant strings. We can also use the *_tmp() variant of
git_config_get_string().

Let's also rename this submodule_strategy_to_string() function to
submodule_update_type_to_string(). Now that it's only tasked with
returning a string version of the "enum submodule_update_type type".
Before it would look at the "command" field in "struct
submodule_update_strategy".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
08c2e778d6 submodule--helper: don't call submodule_strategy_to_string() in BUG()
Don't call submodule_strategy_to_string() in a BUG() message. These
calls added in c51f8f94e5 (submodule--helper: run update procedures
from C, 2021-08-24) don't need the extra information
submodule_strategy_to_string() gives us, as we'll never reach the
SM_UPDATE_COMMAND case here.

That case is the only one where we'd get any information beyond the
straightforward number-to-string mapping.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
96a907376b submodule--helper: add missing braces to "else" arm
Add missing braces to an "else" arm in init_submodule(), this
stylistic change makes this code conform to the CodingGuidelines, and
makes a subsequent commit smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
0b917a9f5c submodule--helper: return "ret", not "1" from update_submodule()
Amend the update_submodule() function to return the failing "ret" on
error, instead of overriding it with "1".

This code was added in b3c5f5cb04 (submodule: move core cmd_update()
logic to C, 2022-03-15), and this change ends up not making a
difference as this function is only called in update_submodules(). If
we return non-zero here we'll always in turn return "1" in
module_update().

But if we didn't do that and returned any other non-zero exit code in
update_submodules() we'd fail the test that's being amended
here. We're still testing the status quo here.

This change makes subsequent refactoring of update_submodule() easier,
as we'll no longer need to worry about clobbering the "ret" we get
from the run_command().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
addda284cb submodule--helper: rename "int res" to "int ret"
Rename the "res" variable added in b3c5f5cb04 (submodule: move core
cmd_update() logic to C, 2022-03-15) to "ret", which is the convention
in the rest of this file.

Eventual follow-up commits will change the code in update_submodule()
to a "goto cleanup" pattern, let's have the post image look consistent
with the rest. For update_submodules() let's also use a "ret" for
consistency, that use was also added in b3c5f5cb04. We'll be
modifying that codepath in subsequent commits.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
b0bff0be54 submodule--helper: don't redundantly check "else if (res)"
The "res" variable must be true at this point in update_submodule(),
as just a few lines above this we've unconditionally:

	if (!res)
		return 0;

So we don't need to guard the "return 1" with an "else if (res)", we
can return unconditionally at this point. See b3c5f5cb04 (submodule:
move core cmd_update() logic to C, 2022-03-15) for the initial
introduction of this code, this check of "res" has always been
redundant.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
9d02f9499f submodule--helper: refactor "errmsg_str" to be a "struct strbuf"
Refactor code added in e83e3333b5 (submodule: port submodule
subcommand 'summary' from shell to C, 2020-08-13) so that "errmsg" and
"errmsg_str" are folded into one. The distinction between the empty
string and NULL is something that's tested for by
e.g. "t/t7401-submodule-summary.sh".

This is in preparation for fixing a memory leak the "struct strbuf" in
the pre-image.

Let's also pass a "const char *" to print_submodule_summary(), as it
should not be modifying the "errmsg".

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
a253be682f submodule--helper: add "const" to passed "struct update_data"
Add a "const" to the "struct update_data" passed to
run_update_procedure(), which it in turn passes along (peeled) to
is_tip_reachable() and fetch_in_submodule()).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
1da635b84d submodule--helper: add "const" to copy of "update_data"
Add a "const" to the copy of "struct update_data" that's tracked by
the "struct submodule_update_clone", as it neither owns nor modifies
it.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
6fac5b2f35 submodule--helper: add "const" to passed "module_clone_data"
Add "const" to the "struct module_clone_data" that we pass to
clone_submodule(), which makes the ownership clear, and stops us from
clobbering the "clone_data->path".

We still need to add to the "reference" member, which is a "struct
string_list". Let's do this by having clone_submodule() create its
own, and copy the contents over, allowing us to pass it as a
separate parameter.

This new "struct string_list" still leaks memory, just as the "struct
module_clone_data" did before. let's not fix that for now, to fix that
we'll need to add some "goto cleanup" to the relevant code. That will
eventually be done in follow-up commits, this change makes it easier
to fix the memory leak.

The scope of the new "reference" variable in add_submodule() could be
narrowed to the "else" block, but as we'll eventually free it with a
"goto cleanup" let's declare it at the start of the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
9bdf5277d5 submodule--helper: move "sb" in clone_submodule() to its own scope
Refactor the only remaining use of a "struct strbuf sb" in
clone_submodule() to live in its own scope. This makes the code
clearer by limiting its lifetime.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
21496b4c60 submodule--helper: use xstrfmt() in clone_submodule()
Use xstrfmt() in clone_submodule() instead of a "struct strbuf" in two
cases where we weren't getting anything out of using the "struct
strbuf".

This changes code that was was added along with other uses of "struct
strbuf" in this function in ee8838d157 (submodule: rewrite
`module_clone` shell function in C, 2015-09-08).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
4b82d75b51 submodule--helper: replace memset() with { 0 }-initialization
Use the less verbose { 0 }-initialization syntax rather than memset()
in builtin/submodule--helper.c, this doesn't make a difference in
terms of behavior, but as we're about to modify adjacent code makes
this more consistent, and lets us avoid worrying about when the
memset() happens v.s. a "goto cleanup".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
0b83b2b03a submodule--helper style: add \n\n after variable declarations
Since the preceding commit fixed style issues with \n\n among the
declared variables let's fix the minor stylistic issues with those
variables not being consistently followed by a \n\n.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
e2d5c886bf submodule--helper style: don't separate declared variables with \n\n
The usual style in the codebase is to separate declared variables with
a single newline, not two, let's adjust this code to conform to
that. This makes the eventual addition of various "int ret" variables
more consistent.

In doing this the comment added in 2964d6e5e1 (submodule: port
subcommand 'set-branch' from shell to C, 2020-06-02) might become
ambiguous to some, although it should be clear what it's referring to,
let's move it above the 'OPT_NOOP_NOARG('q', "quiet")' to make that
clearer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
96a28a9bc6 submodule--helper: move "resolve-relative-url-test" to a test-tool
As its name suggests the "resolve-relative-url-test" has never been
used outside of the test suite, see 63e95beb08 (submodule: port
resolve_relative_url from shell to C, 2016-04-15) for its original
addition.

Perhaps it would make sense to drop this code entirely, as we feel
that we've got enough indirect test coverage, but let's leave that
question to a possible follow-up change. For now let's keep the test
coverage this gives us.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
85321a346b submodule--helper: move "check-name" to a test-tool
Move the "check-name" helper to a test-tool, since
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10) it has only been used by this test, not git-submodule.sh.

As noted with its introduction in 0383bbb901 (submodule-config:
verify submodule names as paths, 2018-04-30) the intent of
t7450-bad-git-dotfiles.sh has always been to unit test the
check_submodule_name() function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
9fb2a970e9 submodule--helper: move "is-active" to a test-tool
Create a new "test-tool submodule" and move the "is-active" subcommand
over to it. It was added in 5c2bd8b77a (submodule--helper: add
is-active subcommand, 2017-03-16), since
a452128a36 (submodule--helper: introduce add-config subcommand,
2021-08-06) it hasn't been used by git-submodule.sh.

Since we're creating a command dispatch similar to test-tool.c itself
let's split out the "struct test_cmd" into a new test-tool-utils.h,
which both this new code and test-tool.c itself can use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
255a1ae5da test-tool submodule-config: remove unused "--url" handling
No test has used this "--url" parameter since the test code that made
use of it was removed in 32bc548329 (submodule-config: remove support
for overlaying repository config, 2017-08-03).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
31955475d1 submodule--helper: remove unused "list" helper
Remove the "submodule--helper list" sub-command, which hasn't been
used by git-submodule.sh since 2964d6e5e1 (submodule: port subcommand
'set-branch' from shell to C, 2020-06-02).

There was a test added in 2b56bb7a87 (submodule helper list: respect
correct path prefix, 2016-02-24) which relied on it, but the right
thing to do here is to delete that test as well.

That test was regression testing the "list" subcommand itself. We're
not getting anything useful from the "list | cut -f2" invocation that
we couldn't get from "foreach 'echo $sm_path'".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
76d63ddc46 submodule--helper: remove unused "name" helper
The "name" helper has not been used since e83e3333b5 (submodule: port
submodule subcommand 'summary' from shell to C, 2020-08-13).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
59378e3355 submodule tests: test for "add <repository> <abs-path>"
Add a missing test for ""add <repository> <path>" where "<path>" is an
absolute path. This tests code added in [1] and later turned into an
"else" branch in clone_submodule() in [2] that's never been tested.

This needs to be skipped on WINDOWS because all of $PWD, $(pwd) and
the "$(pwd -P)" we get via "$submodurl" would fail in CI with e.g.:

	fatal: could not create directory 'D:/a/git/git/t/trash
	directory.t7400-submodule-basic/.git/modules/D:/a/git/git/t/trash
	directory.t7400-submodule-basic/add-abs'

I.e. we can't handle these sorts of paths in this context on that
platform.

I'm not sure where we run into the edges of "$PWD" behavior on
Windows (see [1] for a previous loose end on the topic), but for the
purposes of this test it's sufficient that we test this on other
platforms.

1. ee8838d157 (submodule: rewrite `module_clone` shell function in C,
   2015-09-08)
2. f8eaa0ba98 (submodule--helper, module_clone: always operate on
   absolute paths, 2016-03-31)

1. https://lore.kernel.org/git/220630.86edz6c75c.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:22 -07:00
89bc7b5c01 submodule tests: test usage behavior
Test what exit code and output we emit on "git submodule -h", how we
handle "--" when no subcommand is specified, and how the top-level
"--recursive" option is handled.

For "-h" this doesn't make sense, but let's test for it so that any
subsequent eventual behavior change will become clear.

For "--" this follows up on 68cabbfda3 (submodule: document default
behavior, 2019-02-15) and tests that "status" doesn't support
the "--" delimiter. There's no intrinsically good reason not to
support that. We behave this way due to edge cases in
git-submodule.sh's implementation, but as with "-h" let's assert our
current long-standing behavior for now.

For "--recursive" the exclusion of it from the top-level appears to
have been an omission in 15fc56a853 (git submodule foreach: Add
--recursive to recurse into nested submodules, 2009-08-19), there
doesn't seem to be a reason not to support it alongside "--quiet" and
"--cached", but let's likewise assert our existing behavior for now.

I.e. as long as "status" is optional it would make sense to support
all of its options when it's omitted, but we only do that with
"--quiet" and "--cached", and curiously omit "--recursive".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:22 -07:00
be1a02a17e The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 13:40:19 -07:00
624a936234 Merge branch 'en/merge-multi-strategies'
The code that implements multi-strategy support in "git merge" has
been clean-up a bit.

* en/merge-multi-strategies:
  merge: small code readability improvement
  merge: cleanup confusing logic for handling successful merges
2022-09-01 13:40:19 -07:00
014a9ea207 Merge branch 'en/t4301-more-merge-tree-tests'
More tests to protect the current behaviour of "merge-tree" before
it gets further updated.

* en/t4301-more-merge-tree-tests:
  t4301: add more interesting merge-tree testcases
2022-09-01 13:40:19 -07:00
3a4779086d Merge branch 'en/merge-unstash-only-on-clean-merge'
The auto-stashed local changes created by "git merge --autostash"
was mixed into a conflicted state left in the working tree, which
has been corrected.

* en/merge-unstash-only-on-clean-merge:
  merge: only apply autostash when appropriate
2022-09-01 13:40:18 -07:00
d528044c83 Merge branch 'sg/parse-options-subcommand'
Introduce the "subcommand" mode to parse-options API and update the
command line parser of Git commands with subcommands.

* sg/parse-options-subcommand: (23 commits)
  remote: run "remote rm" argv through parse_options()
  maintenance: add parse-options boilerplate for subcommands
  pass subcommand "prefix" arguments to parse_options()
  builtin/worktree.c: let parse-options parse subcommands
  builtin/stash.c: let parse-options parse subcommands
  builtin/sparse-checkout.c: let parse-options parse subcommands
  builtin/remote.c: let parse-options parse subcommands
  builtin/reflog.c: let parse-options parse subcommands
  builtin/notes.c: let parse-options parse subcommands
  builtin/multi-pack-index.c: let parse-options parse subcommands
  builtin/hook.c: let parse-options parse subcommands
  builtin/gc.c: let parse-options parse 'git maintenance's subcommands
  builtin/commit-graph.c: let parse-options parse subcommands
  builtin/bundle.c: let parse-options parse subcommands
  parse-options: add support for parsing subcommands
  parse-options: drop leading space from '--git-completion-helper' output
  parse-options: clarify the limitations of PARSE_OPT_NODASH
  parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
  api-parse-options.txt: fix description of OPT_CMDMODE
  t0040-parse-options: test parse_options() with various 'parse_opt_flags'
  ...
2022-09-01 13:40:18 -07:00
68ef0425d9 Merge branch 'ds/bundle-uri-clone'
Implement "git clone --bundle-uri".

* ds/bundle-uri-clone:
  clone: warn on failure to repo_init()
  clone: --bundle-uri cannot be combined with --depth
  bundle-uri: add support for http(s):// and file://
  clone: add --bundle-uri option
  bundle-uri: create basic file-copy logic
  remote-curl: add 'get' capability
2022-09-01 13:40:17 -07:00
9ff7eb8c88 git-compat-util.h: use "deprecated" for UNUSED variables
As noted in the preceding commit our "UNUSED" macro was no longer
protecting against actual use of the "unused" variables, which it was
previously doing by renaming the variable.

Let's instead use the "deprecated" attribute to accomplish that
goal. As [1] rightly notes this has the drawback that compiling with
"-Wno-deprecated-declarations" will silence any such uses. I think the
trade-off is worth it as:

 * We can consider that a feature, as e.g. backporting certain patches
   might use a now "unused" parameter, and the person doing that might
   want to silence it with DEVOPTS=no-error.

 * This way we play nicely with coccinelle, and any other dumb(er)
   parser of C (such as syntax highlighters).

 * Not every single compilation of git needs to catch "used but
   declared unused" parameters. It's sufficient that the default "make
   DEVELOPER=1" will do so, and that the "static-analysis" CI job will
   catch it.

1. https://lore.kernel.org/git/YwCtkwjWdJVHHZV0@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:49:49 -07:00
5cf88fd8b0 git-compat-util.h: use "UNUSED", not "UNUSED(var)"
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75de (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.

Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.

This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b24034754 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.

1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:49:48 -07:00
fb41727b7e t: retire unused chainlint.sed
Retire chainlint.sed since it has been replaced by a more accurate and
functional &&-chain "linter", thus is no longer used.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
69b9924b87 t/Makefile: teach make test and make prove to run chainlint.pl
Unlike chainlint.sed which "lints" a single test body at a time, thus is
invoked once per test, chainlint.pl can check all test bodies in all
test scripts with a single invocation. As such, it is akin to other bulk
"linters" run by the Makefile, such as `test-lint-shell-syntax`,
`test-lint-duplicates`, etc.

Therefore, teach `make test` and `make prove` to invoke chainlint.pl
along with the other bulk linters. Also, since the single chainlint.pl
invocation by `make test` or `make prove` has already checked all tests
in all scripts, instruct the individual test scripts not to run
chainlint.pl on themselves unnecessarily.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
23a14f3016 test-lib: replace chainlint.sed with chainlint.pl
By automatically invoking chainlint.sed upon each test it runs,
`test_run_` in test-lib.sh ensures that broken &&-chains will be
detected early as tests are modified or new are tests created since it
is typical to run a test script manually (i.e. `./t1234-test-script.sh`)
during test development. Now that the implementation of chainlint.pl is
complete, modify test-lib.sh to invoke it automatically instead of
chainlint.sed each time a test script is run.

This change reduces the number of "linter" invocations from 26800+ (once
per test run) down to 1050+ (once per test script), however, a
subsequent change will drop the number of invocations to 1 per `make
test`, thus fully realizing the benefit of the new linter.

Note that the "magic exit code 117" &&-chain checker added by bb79af9d09
(t/test-lib: introduce --chain-lint option, 2015-03-20) which is built
into t/test-lib.sh is retained since it has near zero-cost and
(theoretically) may catch a broken &&-chain not caught by chainlint.pl.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
9fd911237f test-lib: retire "lint harder" optimization hack
`test_run_` in test-lib.sh "lints" the body of a test by sending it down
a `sed chainlint.sed | grep` pipeline; this happens once for each test
run by a test script. Although this pipeline may seem relatively cheap
in isolation, it can become expensive when invoked 26800+ times by `make
test`, once for each test run, despite the existence of only 16500+ test
definitions across all tests scripts.

This difference in the number of tests defined in the scripts (16500+)
and the number of tests actually run by `make test` (26800+) is
explained by the fact that some test scripts run a very large number of
small tests, all driven by a series of functions/loops which fill in the
test bodies. This means that certain test definitions are being linted
repeatedly (tens or hundreds of times) unnecessarily. To avoid such
unnecessary work, 2d86a96220 (t: avoid sed-based chain-linting in some
expensive cases, 2021-05-13) added an optimization hack which allows
individual scripts to manually suppress the unnecessary repeated linting
of the same test definition.

However, unlike chainlint.sed which checks a test body as the test is
run, chainlint.pl checks each test definition just once, no matter how
many times the test is run, thus the sort of optimization hack
introduced by 2d86a96220 is no longer needed and can be retired.
Therefore, revert 2d86a96220.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
56066523ed t/chainlint: add more chainlint.pl self-tests
During the development of chainlint.pl, numerous new self-tests were
created to verify correct functioning beyond the checks already
represented by the existing self-tests. The new checks fall into several
categories:

* behavior of the lexical analyzer for complex cases, such as line
  splicing, token pasting, entering and exiting string contexts inside
  and outside of test script bodies; for instance:

    test_expect_success 'title' '
      x=$(echo "something" |
        sed -e '\''s/\\/\\\\/g'\'' -e '\''s/[[/.*^$]/\\&/g'\''
    '

* behavior of the parser for all compound grammatical constructs, such
  as `if...fi`, `case...esac`, `while...done`, `{...}`, etc., and for
  other legal shell grammatical constructs not covered by existing
  chainlint.sed self-tests, as well as complex cases, such as:

    OUT=$( ((large_git 1>&3) | :) 3>&1 ) &&

* detection of problems, such as &&-chain breakage, from top-level to
  any depth since the existing self-tests do not cover any top-level
  context and only cover subshells one level deep due to limitations of
  chainlint.sed

* address blind spots in chainlint.sed (such as not detecting a broken
  &&-chain on a one-line for-loop in a subshell[1]) which chainlint.pl
  correctly detects

* real-world cases which tripped up chainlint.pl during its development

[1]: https://lore.kernel.org/git/dce35a47012fecc6edc11c68e91dbb485c5bc36f.1661663880.git.gitgitgadget@gmail.com/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
ae0c55abf8 chainlint.pl: allow || echo to signal failure upstream of a pipe
The use of `|| return` (or `|| exit`) to signal failure within a loop
isn't effective when the loop is upstream of a pipe since the pipe
swallows all upstream exit codes and returns only the exit code of the
final command in the pipeline.

To work around this limitation, tests may adopt an alternative strategy
of signaling failure by emitting text which would never be emitted in
the non-failing case. For instance:

    while condition
    do
        command1 &&
        command2 ||
        echo "impossible text"
    done |
    sort >actual &&

Such usage indicates deliberate thought about failure cases by the test
author, thus flagging them as missing `|| return` (or `|| exit`) is not
helpful. Therefore, take this case into consideration when checking for
explicit loop termination.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
fd4094c3ca chainlint.pl: complain about loops lacking explicit failure handling
Shell `for` and `while` loops do not terminate automatically just
because a command fails within the loop body. Instead, the loop
continues to iterate and eventually returns the exit status of the final
command of the final iteration, which may not be the command which
failed, thus it is possible for failures to go undetected. Consequently,
it is important for test authors to explicitly handle failure within the
loop body by terminating the loop manually upon failure. This can be
done by returning a non-zero exit code from within the loop body
(i.e. `|| return 1`) or exiting (i.e. `|| exit 1`) if the loop is within
a subshell, or by manually checking `$?` and taking some appropriate
action. Therefore, add logic to detect and complain about loops which
lack explicit `return` or `exit`, or `$?` check.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
832c68b3c2 chainlint.pl: don't flag broken &&-chain if failure indicated explicitly
There are quite a few tests which print an error messages and then
explicitly signal failure with `false`, `return 1`, or `exit 1` as the
final command in an `if` branch. In these cases, the tests don't bother
maintaining the &&-chain between `echo` and the explicit "test failed"
indicator. Since such constructs are manually signaling failure, their
&&-chain breakage is legitimate and safe -- both for the command
immediately preceding `false`, `return`, or `exit`, as well as for all
preceding commands in the `if` branch. Therefore, stop flagging &&-chain
breakage in these sorts of cases.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
a8f30ee050 chainlint.pl: don't flag broken &&-chain if $? handled explicitly
There are cases in which tests capture and check a command's exit code
explicitly without employing test_expect_code(). They do so by
intentionally breaking the &&-chain since it would be impossible to
capture "$?" in the failing case if the `status=$?` assignment was part
of the &&-chain. Since such constructs are manually checking the exit
code, their &&-chain breakage is legitimate and safe, thus should not be
flagged. Therefore, stop flagging &&-chain breakage in such cases.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
aabc3258a1 chainlint.pl: don't require & background command to end with &&
The exit status of the `&` asynchronous operator which starts a command
in the background is unconditionally zero, and the few places in the
test scripts which launch commands asynchronously are not interested in
the exit status of the `&` operator (though they often capture the
background command's PID). As such, there is little value in complaining
about broken &&-chain for a command launched in the background, and
doing so would only make busy-work for test authors. Therefore, take
this special case into account when checking for &&-chain breakage.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
d00113ec34 t/Makefile: apply chainlint.pl to existing self-tests
Now that chainlint.pl is functional, take advantage of the existing
chainlint self-tests to validate its operation. (While at it, stop
validating chainlint.sed against the self-tests since it will soon be
retired.)

Due to chainlint.sed implementation limitations leaking into the
self-test "expect" files, a few of them require minor adjustment to make
them compatible with chainlint.pl which does not share those
limitations.

First, because `sed` does not provide any sort of real recursion,
chainlint.sed only emulates recursion into subshells, and each level of
recursion leads to a multiplicative increase in complexity of the `sed`
rules. To avoid substantial complexity, chainlint.sed, therefore, only
emulates subshell recursion one level deep. Any subshell deeper than
that is passed through as-is, which means that &&-chains are not checked
in deeper subshells. chainlint.pl, on the other hand, employs a proper
recursive descent parser, thus checks subshells to any depth and
correctly flags broken &&-chains in deep subshells.

Second, due to sed's line-oriented nature, chainlint.sed, by necessity,
folds multi-line quoted strings into a single line. chainlint.pl, on the
other hand, employs a proper lexical analyzer which preserves quoted
strings as-is, including embedded newlines.

Furthermore, the output of chainlint.sed and chainlint.pl do not match
precisely in terms of whitespace. However, since the purpose of the
self-checks is to verify that the ?!AMP?! annotations are being
correctly added, minor whitespace differences are immaterial. For this
reason, rather than adjusting whitespace in all existing self-test
"expect" files to match the new linter's output, the `check-chainlint`
target ignores whitespace differences. Since `diff -w` is not POSIX,
`check-chainlint` attempts to employ `git diff -w`, and only falls back
to non-POSIX `diff -w` (and `-u`) if `git diff` is not available.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
35ebb1e37b chainlint.pl: don't require return|exit|continue to end with &&
In order to check for &&-chain breakage, each time TestParser encounters
a new command, it checks whether the previous command ends with `&&`,
and -- with a couple exceptions -- signals breakage if it does not. The
first exception is that a command may validly end with `||`, which is
commonly employed as `command || return 1` at the very end of a loop
body to terminate the loop early. The second is that piping one
command's output with `|` to another command does not constitute a
&&-chain break (the exit status of the pipe is the exit status of the
final command in the pipe).

However, it turns out that there are a few additional cases found in the
wild in which it is likely safe for `&&` to be missing even when other
commands follow. For instance:

    while {condition-1}
    do
        test {condition-2} || return 1 # or `exit 1` within a subshell
        more-commands
    done

    while {condition-1}
    do
        test {condition-2} || continue
        more-commands
    done

Such cases indicate deliberate thought about failure modes by the test
author, thus flagging them as breaking the &&-chain is not helpful.
Therefore, take these special cases into consideration when checking for
&&-chain breakage.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
29fb2ec384 chainlint.pl: validate test scripts in parallel
Although chainlint.pl has undergone a good deal of optimization during
its development -- increasing in speed significantly -- parsing and
validating 1050+ scripts and 16500+ tests via Perl is not exactly
instantaneous. However, perceived performance can be improved by taking
advantage of the fact that there is no interdependence between test
scripts or test definitions, thus parsing and validating can be done in
parallel. The number of available cores is determined automatically but
can be overridden via the --jobs option.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
d99ebd6d2e chainlint.pl: add parser to identify test definitions
Finish fleshing out chainlint.pl by adding ScriptParser, a parser which
scans shell scripts for tests defined by test_expect_success() and
test_expect_failure(), plucks the test body from each definition, and
passes it to TestParser for validation. It recognizes test definitions
not only at the top-level of test scripts but also tests synthesized
within compound commands such as loops and function.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
6d932e92fc chainlint.pl: add parser to validate tests
Continue fleshing out chainlint.pl by adding TestParser, a parser with
special knowledge about how Git tests should be written; for instance,
it knows that commands within a test body should be chained together
with `&&`. An upcoming parser which plucks test definitions from test
scripts will invoke TestParser for each test body it encounters.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
6594554119 chainlint.pl: add POSIX shell parser
Continue fleshing out chainlint.pl by adding a general purpose recursive
descent parser for the POSIX shell command language. Although never
invoked directly, upcoming parser subclasses will extend its
functionality for specific purposes, such as plucking test definitions
from input scripts and applying domain-specific knowledge to perform
test validation.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
7d4804731e chainlint.pl: add POSIX shell lexical analyzer
Begin fleshing out chainlint.pl by adding a lexical analyzer for the
POSIX shell command language. The sole entry point Lexer::scan_token()
returns the next token from the input. It will be called by the upcoming
shell language parser.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
b4f25b07c7 t: add skeleton chainlint.pl
Although chainlint.sed usefully identifies broken &&-chains in tests, it
has several shortcomings which include:

  * only detects &&-chain breakage in subshells (one-level deep)

  * does not check for broken top-level &&-chains; that task is left to
    the "magic exit code 117" checker built into test-lib.sh, however,
    that detection does not extend to `{...}` blocks, `$(...)`
    expressions, or compound statements such as `if...fi`,
    `while...done`, `case...esac`

  * uses heuristics, which makes it (potentially) fallible and difficult
    to tweak to handle additional real-world cases

  * written in `sed` and employs advanced `sed` operators which are
    probably not well-known to many programmers, thus the pool of people
    who can maintain it is likely small

  * manually simulates recursion into subshells which makes it much more
    difficult to reason about than, say, a traditional top-down parser

  * checks each test as the test is run, which can get expensive for
    tests which are run repeatedly by functions or loops since their
    bodies will be checked over and over (tens or hundreds of times)
    unnecessarily

To address these shortcomings, begin implementing a more functional and
precise test linter which understands shell syntax and semantics rather
than employing heuristics, thus is able to recognize structural problems
with tests beyond broken &&-chains.

The new linter is written in Perl, thus should be more accessible to a
wider audience, and is structured as a traditional top-down parser which
makes it much easier to reason about, and allows it to inspect compound
statements within test bodies to any depth.

Furthermore, it can check all test definitions in the entire project in
a single invocation rather than having to be invoked once per test, and
each test definition is checked only once no matter how many times the
test is actually run.

At this stage, the new linter is just a skeleton containing boilerplate
which handles command-line options, collects and reports statistics, and
feeds its arguments -- paths of test scripts -- to a (presently)
do-nothing script parser for validation. Subsequent changes will flesh
out the functionality.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
0a101676e5 add -p: ignore dirty submodules
Thanks to always running `diff-index` and `diff-files` with the
`--numstat` option (the latter with `--ignore-submodules=dirty`) before
even generating any real diff to parse, the Perl version of `git add -p`
simply ignored dirty submodules and does not even offer them up for
staging.

However, the built-in variant did not use that flag because it tries to
run only one `diff` command, skipping the unneeded
`diff-index`/`diff-files` invocation of the Perl variant and therefore
only faithfully recapitulates what the Perl code does once it _does_
generate and parse the real diff.

This causes a problem when running the built-in `add -p` with
`diff-so-fancy` because that diff colorizer always inserts an empty line
before the diff header to ensure that it produces 4 lines as expected by
`git add -p` (the equivalent of the non-colorized `diff`, `index`, `---`
and `+++` lines). But `git diff-files` does not produce any `index` line
for dirty submodules.

The underlying problem is not even the discrepancy in lines, but that
`git add -p` presents diffs for dirty submodules: there is nothing that
_can_ be staged for those.

Let's fix that bug, and teach the built-in `add -p` to ignore dirty
submodules, too. This _incidentally_ also fixes the `diff-so-fancy`
problem ;-)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 09:55:28 -07:00
fd3f7f619a add -p: gracefully handle unparseable hunk headers in colored diffs
In
https://lore.kernel.org/git/ecf6f5be-22ca-299f-a8f1-bda38e5ca246@gmail.com,
Phillipe Blain reported that the built-in `git add -p` command fails
when asked to use [`diff-so-fancy`][diff-so-fancy] to colorize the diff.

The reason is that this tool produces colored diffs with a hunk header
that does not contain any parseable `@@ ... @@` line range information,
and therefore we cannot detect any part in that header that comes after
the line range.

As proposed by Phillip Wood, let's take that for a clear indicator that
we should show the hunk headers verbatim. This is what the Perl version
of the interactive `add` command did, too.

[diff-so-fancy]: https://github.com/so-fancy/diff-so-fancy

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 09:55:21 -07:00
b6633a0053 add -p: detect more mismatches between plain vs colored diffs
When parsing the colored version of a diff, the interactive `add`
command really relies on the colored version having the same number of
lines as the plain (uncolored) version. That is an invariant.

We already have code to verify correctly when the colored diff has less
lines than the plain diff. Modulo an off-by-one bug: If the last diff
line has no matching colored one, the code pretends to succeed, still.

To make matters worse, when we adjusted the test in 1e4ffc765d (t3701:
adjust difffilter test, 2020-01-14), we did not catch this because `add
-p` fails for a _different_ reason: it does not find any colored hunk
header that contains a parseable line range.

If we change the test case so that the line range _can_ be parsed, the
bug is exposed.

Let's address all of the above by

- fixing the off-by-one,

- adjusting the test case to allow `add -p` to parse the line range

- making the test case more stringent by verifying that the expected
  error message is shown

Also adjust a misleading code comment about the now-fixed code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 09:49:45 -07:00
3221597433 Makefile: use $(OBJECTS) instead of $(C_OBJ)
In the preceding commit $(C_OBJ) added in c373991375 (Makefile: list
generated object files in OBJECTS, 2010-01-26) became synonymous with
$(OBJECTS). Let's avoid the indirection and use the $(OBJECTS)
variable directly instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-31 14:44:39 -07:00
9dc523aa0e Makefile + hash.h: remove PPC_SHA1 implementation
Remove the PPC_SHA1 implementation added in a6ef3518f9 ([PATCH] PPC
assembly implementation of SHA1, 2005-04-22). When this was added
Apple consumer hardware used the PPC architecture, and the
implementation was intended to improve SHA-1 speed there.

Since it was added we've moved to using sha1collisiondetection by
default, and anyone wanting hard-rolled non-DC SHA-1 implementation
can use OpenSSL's via the OPENSSL_SHA1 knob.

The PPC_SHA1 originally originally targeted 32 bit PPC, and later the
64 bit PPC 970 (a.k.a. Apple PowerPC G5). See 926172c5e4 (block-sha1:
improve code on large-register-set machines, 2009-08-10) for a
reference about the performance on G5 (a comment in block-sha1/sha1.c
being removed here).

I can't get it to do anything but segfault on both the BE and LE POWER
machines in the GCC compile farm[1]. Anyone who's concerned about
performance on PPC these days is likely to be using the IBM POWER
processors.

There have been proposals to entirely remove non-sha1collisiondetection
implementations from the tree[2]. I think per [3] that would be a bit
overzealous. I.e. there are various set-ups git's speed is going to be
more important than the relatively implausible SHA-1 collision attack,
or where such attacks are entirely mitigated by other means (e.g. by
incoming objects being checked with DC_SHA1).

But that really doesn't apply to PPC_SHA1 in particular, which seems
to have outlived its usefulness.

As this gets rid of the only in-tree *.S assembly file we can remove
the small bits of logic from the Makefile needed to build objects
from *.S (as opposed to *.c)

The code being removed here was also throwing warnings with the
"-pedantic" flag, it could have been fixed as 544d93bc3b (block-sha1:
remove use of obsolete x86 assembly, 2022-03-10) did for block-sha1/*,
but as noted above let's remove it instead.

1. https://cfarm.tetaneutral.net/machines/list/
   Tested on gcc{110,112,135,203}, a mixture of POWER [789] ppc64 and
   ppc64le. All segfault in anything needing object
   hashing (e.g. t/t1007-hash-object.sh) when compiled with
   PPC_SHA1=Y.
2. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
3. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/

Acked-by: brian m. carlson" <sandals@crustytoothpaste.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-31 14:37:31 -07:00
0682bc43f5 test-crontab: minor memory and error handling fixes
Since ee69e7884e (gc: use temporary file for editing crontab,
2022-08-28), we now insist that "argc == 3" (and otherwise return an
error). Coverity notes that this causes some dead code:

    if (argc == 3)
          fclose(from);
    else
          fclose(to);

as we will never trigger the else. This also causes a memory leak, since
we'll never close "to".

Now that all paths require 2 arguments, we can just reorganize the
function to check argc up front, and tweak the cleanup to do the right
thing for all cases.

While we're here, we can also notice some minor problems:

  - we return a negative int via error() from what is essentially a
    main() function; we should return a positive non-zero value for
    error. Or better yet, we can just use usage(), which gives a better
    message.

  - while writing the usage message, we can note the one in the comment
    was made out of date by ee69e7884e. But it also had a typo already,
    calling the subcommand "cron" and not "crontab"

  - we didn't check for an error from fopen(), meaning we would segfault
    if the to-be-read file was missing. We can use xfopen() to catch
    this.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 14:31:37 -07:00
320fa579ec tempfile: update comment describing state transitions
Back when 1a9d15db25 (tempfile: a new module for handling temporary
files, 2015-08-10) added this comment, tempfile structs were held in
memory for the life of a process, and there were various guarantees
about which fields were valid in which states.

Since 422a21c6a0 (tempfile: remove deactivated list entries, 2017-09-05)
and 076aa2cbda (tempfile: auto-allocate tempfiles on heap, 2017-09-05),
the flow is quite different: objects come and go from the list, and
inactive ones are deallocated. And the previous commit removed the
"active" flag from the struct entirely.

Let's bring the comment up to date with the current code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 14:16:51 -07:00
77a42b3b84 tempfile: drop active flag
Our tempfile struct contains an "active" flag. Long ago, this flag was
important: tempfile structs were always allocated for the lifetime of
the program and added to a global linked list, and the active flag was
what told us whether a struct's tempfile needed to be cleaned up on
exit.

But since 422a21c6a0 (tempfile: remove deactivated list entries,
2017-09-05) and 076aa2cbda (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we actually remove items from the list, and the active flag
is generally always set to true for any allocated struct. We set it to
true in all of the creation functions, and in the normal code flow it
becomes false only in deactivate_tempfile(), which then immediately
frees the struct.

So the flag isn't performing that role anymore, and in fact makes things
more confusing. Dscho noted that delete_tempfile() is a noop for an
inactive struct. Since 076aa2cbda taught it to free the struct when
deactivating, we'd leak any struct whose active flag is unset. But in
practice it's not a leak, because again, we'll free when we unset the
flag, and never see the allocated-but-inactive state.

Can we just get rid of the flag? The answer is yes, but it requires
looking at a few other spots:

  1. I said above that the flag only becomes false before we deallocate,
     but there's one exception: when we call remove_tempfiles() from a
     signal or atexit handler, we unset the active flag as we remove
     each file. This isn't important for delete_tempfile(), as nobody
     would call it anymore, since we're exiting.

     It does in theory provide us some protection against racily
     double-removing a tempfile. If we receive a second signal while we
     are already in the cleanup routines, we'll start the cleanup loop
     again, and may visit the same tempfile. But this race already
     exists, because calling unlink() and unsetting the active flag
     aren't atomic! And it's OK in practice, because unlink() is
     idempotent (barring the unlikely event that some other process
     chooses our exact temp filename in that instant).

     So dropping the active flag widens the race a bit, but it was
     already there, and is fairly harmless in practice. If we really
     care about addressing it, the right thing is probably to block
     further signals while we're doing our cleanup (which we could
     actually do atomically).

  2. The active flag is declared as "volatile sig_atomic_t". The idea is
     that it's the final bit that gets set to tell the cleanup routines
     that the tempfile is ready to be used (or not used), and it's safe
     to receive a signal racing with regular code which adds or removes
     a tempfile from the list.

     In practice, I don't think this is buying us anything. The presence
     on the linked list is really what tells the cleanup routines to
     look at the struct. That is already marked as "volatile". It's not
     a sig_atomic_t, so it's possible that we could see a sheared write
     there as an entry is added or removed. But that is true of the
     current code, too! Before we can even look at the "active" flag,
     we'd have to follow a link to the struct itself. If we see a
     sheared write in the pointer to the struct, then we'll look at
     garbage memory anyway, and there's not much we can do.

This patch removes the active flag entirely, using presence on the
global linked list as an indicator that a tempfile ought to be cleaned
up. We are already careful to add to the list as the final step in
activating. On deactivation, we'll make sure to remove from the list as
the first step, before freeing any fields. The use of the volatile
keyword should mean that those things happen in the expected order.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 14:16:49 -07:00
78861eb58a Merge branch 'rs/tempfile-cleanup-race-fix' into jk/tempfile-active-flag-cleanup
* rs/tempfile-cleanup-race-fix:
  tempfile: avoid directory cleanup race
2022-08-30 14:16:41 -07:00
64ec8efb83 t6132(NO_PERL): do not run the scripted add -p
When using the non-built-in version of `git add -p` in a `NO_PERL`
build, we expect that invocation to fail.

However, when b02fdbc80a (pathspec: correct an empty string used as a
pathspec element, 2022-05-29) added a test case to t6132 to exercise
`git add -p`, it did not add appropriate prereqs (which admittedly did
not exist back then).

Let's specify the appropriate prereqs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 10:40:48 -07:00
7524780255 t3701: test the built-in add -i regardless of NO_PERL
The built-in `git add --interactive` does not require Perl, therefore we
can safely run these tests even when building with `NO_PERL=LetsDoThat`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 10:40:46 -07:00
cfd0163d64 add -p: avoid ambiguous signed/unsigned comparison
In the interactive `add` operation, users can choose to jump to specific
hunks, and Git will present the hunk list in that case. To avoid showing
too many lines at once, only a maximum of 21 hunks are shown, skipping
the "mode change" pseudo hunk.

The comparison performed to skip the "mode change" pseudo hunk (if any)
compares a signed integer `i` to the unsigned value `mode_change` (which
can be 0 or 1 because it is a 1-bit type).

According to section 6.3.1.8 of the C99 standard (see e.g.
https://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf), what should
happen is an automatic conversion of the "lesser" type to the "greater"
type, but since the types differ in signedness, it is ill-defined what
is the correct "usual arithmetic conversion".

Which means that Visual C's behavior can (and does) differ from GCC's:
When compiling Git using the latter, `add -p`'s `goto` command shows no
hunks by default because it casts a negative start offset to a pretty
large unsigned value, breaking the "goto hunk" test case in
`t3701-add-interactive.sh`.

Let's avoid that by converting the unsigned bit explicitly to a signed
integer.

Note: This is a long-standing bug in the Visual C build of Git, but it
has never been caught because t3701 is skipped when `NO_PERL` is set,
which is the case in the `vs-test` jobs of Git's CI runs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 10:40:42 -07:00
d42b38dfb5 Sync with Git 2.37.3 2022-08-30 10:27:16 -07:00
b46dd1726c Documentation: clarify whitespace rules for trailers
Commit e4319562bc (trailer: be stricter in parsing separators, 2016-11-02)
restricted whitespaces allowed by `git interpret-trailers` in the "token"
part of the trailers it reads. This commit didn't update the related
documentation in Documentation/git-interpret-trailers.txt though.

Also commit 60ef86a162 (trailer: support values folded to multiple lines,
2016-10-21) updated the documentation, but didn't make it clear how many
whitespace characters are allowed at the beginning of new lines in folded
values.

Let's fix both of these issues by rewriting the paragraph describing
what whitespaces are allowed by `git interpret-trailers` in the trailers
it reads.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 10:17:34 -07:00
6c8e4ee870 The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29 14:55:15 -07:00
3658170b92 Merge branch 'es/fix-chained-tests'
Fix broken "&&-" chains and failures in early iterations of a loop.

* es/fix-chained-tests:
  t5329: notice a failure within a loop
  t: detect and signal failure within loop
  t1092: fix buggy sparse "blame" test
  t2407: fix broken &&-chains in compound statement
2022-08-29 14:55:15 -07:00
0d2cf16680 Merge branch 'ds/github-actions-use-newer-ubuntu'
Update the version of Ubuntu used for GitHub Actions CI from 18.04
to 22.04.

* ds/github-actions-use-newer-ubuntu:
  ci: update 'static-analysis' to Ubuntu 22.04
2022-08-29 14:55:15 -07:00
f0deb3f2b5 Merge branch 'ad/preload-plug-memleak'
The preload-index codepath made copies of pathspec to give to
multiple threads, which were left leaked.

* ad/preload-plug-memleak:
  preload-index: fix memleak
2022-08-29 14:55:15 -07:00
edc4f6d280 Merge branch 'sg/xcalloc-cocci-fix'
xcalloc(), imitating calloc(), takes "number of elements of the
array", and "size of a single element", in this order.  A call that
does not follow this ordering has been corrected.

* sg/xcalloc-cocci-fix:
  promisor-remote: fix xcalloc() argument order
2022-08-29 14:55:14 -07:00
56ba6245a4 Merge branch 'en/ort-unused-code-removal'
Code clean-up.

* en/ort-unused-code-removal:
  merge-ort: remove code obsoleted by other changes
2022-08-29 14:55:14 -07:00
10ccb50b16 Merge branch 'tl/trace2-config-scope'
Tweak trace2 output about configuration variables.

* tl/trace2-config-scope:
  tr2: shows scope unconditionally in addition to key-value pair
  api-trace2.txt: print config key-value pair
2022-08-29 14:55:13 -07:00
25402204fe Merge branch 'vd/fix-perf-tests'
Rather trivial perf-test code fixes.

* vd/fix-perf-tests:
  p0006: fix 'read-tree' argument ordering
  p0004: fix prereq declaration
2022-08-29 14:55:13 -07:00
b014a4416a Merge branch 'mg/sequencer-untranslate-reflog'
The sequencer machinery translated messages left in the reflog by
mistake, which has been corrected.

* mg/sequencer-untranslate-reflog:
  sequencer: do not translate command names
  sequencer: do not translate parameters to error_resolve_conflict()
  sequencer: do not translate reflog messages
2022-08-29 14:55:13 -07:00
a0ab573bb1 Merge branch 'jk/unused-fixes'
Code clean-up to remove unused function parameters.

* jk/unused-fixes:
  xdiff: drop unused mmfile parameters from xdl_do_patience_diff()
  reflog: assert PARSE_OPT_NONEG in parse-options callbacks
  reftable: drop unused parameter from reader_seek_linear()
  verify_one_sparse(): drop unused parameters
  match_pathname(): drop unused "flags" parameter
  log-tree: drop unused commit param in remerge_diff()
  xdiff: drop unused mmfile parameters from xdl_do_histogram_diff()
2022-08-29 14:55:12 -07:00
a572a5d4c1 Merge branch 'jd/prompt-show-conflict'
The bash prompt (in contrib/) learned to optionally indicate when
the index is unmerged.

* jd/prompt-show-conflict:
  git-prompt: show presence of unresolved conflicts at command prompt
2022-08-29 14:55:12 -07:00
bc820cf9e6 Merge branch 'vd/scalar-enables-fsmonitor'
"scalar" now enables built-in fsmonitor on enlisted repositories,
when able.

* vd/scalar-enables-fsmonitor:
  scalar: update technical doc roadmap with FSMonitor support
  scalar unregister: stop FSMonitor daemon
  scalar: enable built-in FSMonitor on `register`
  scalar: move config setting logic into its own function
  scalar-delete: do not 'die()' in 'delete_enlistment()'
  scalar-[un]register: clearly indicate source of error
  scalar-unregister: handle error codes greater than 0
  scalar: constrain enlistment search
2022-08-29 14:55:12 -07:00
0b08ba7eb6 Merge branch 'en/ancestry-path-in-a-range'
"git rev-list --ancestry-path=C A..B" is a natural extension of
"git rev-list A..B"; instead of choosing a subset of A..B to those
that have ancestry relationship with A, it lets a subset with
ancestry relationship with C.

* en/ancestry-path-in-a-range:
  revision: allow --ancestry-path to take an argument
  t6019: modernize tests with helper
  rev-list-options.txt: fix simple typo
2022-08-29 14:55:11 -07:00
64cb4c34d1 Merge branch 'mt/rot13-in-c'
Test portability improvements.

* mt/rot13-in-c:
  tests: use the new C rot13-filter helper to avoid PERL prereq
  t0021: implementation the rot13-filter.pl script in C
  t0021: avoid grepping for a Perl-specific string at filter output
2022-08-29 14:55:11 -07:00
c068a3b8ee Merge branch 'ds/decorate-filter-tweak'
The namespaces used by "log --decorate" from "refs/" hierarchy by
default has been tightened.

* ds/decorate-filter-tweak:
  fetch: use ref_namespaces during prefetch
  maintenance: stop writing log.excludeDecoration
  log: create log.initialDecorationSet=all
  log: add --clear-decorations option
  log: add default decoration filter
  log-tree: use ref_namespaces instead of if/else-if
  refs: use ref_namespaces for replace refs base
  refs: add array of ref namespaces
  t4207: test coloring of grafted decorations
  t4207: modernize test
  refs: allow "HEAD" as decoration filter
2022-08-29 14:55:11 -07:00
d5fc07df68 format-patch: learn format.forceInBodyFrom configuration variable
As the need to use the "--force-in-body-from" option primarily is
tied to which mailing list the mails go to (and get their From:
address mangled), it is likely that a user who needs to use this
option once to interact with their upstream project needs to use it
for all patches they send out.

Add a configuration variable, suitable for setting in the local
configuration file per repository, for this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29 14:39:13 -07:00
34bc1b1045 format-patch: allow forcing the use of in-body From: header
Users may be authoring and committing their commits under the same
e-mail address they use to send their patches from, in which case
they shouldn't need to use the in-body From: line in their outgoing
e-mails.  At the receiving end, "git am" will use the address on the
"From:" header of the incoming e-mail and all should be well.

Some mailing lists, however, mangle the From: address from what the
original sender had; in such a situation, the user may want to add
the in-body "From:" header even for their own patches.

"git format-patch --[no-]force-in-body-from" was invented for such
users.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29 14:39:13 -07:00
b84d013936 pretty: separate out the logic to decide the use of in-body from
When pretty-printing the log message for a given commit in the
e-mail format (e.g. "git format-patch"), we add an in-body "From:"
header when the author identity of the commit is different from the
identity of the person whose identity appears in the header of the
e-mail (the latter is passed with them "--from" option).

Split out the logic into a helper function, as we would want to
extend the condition further.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29 14:39:13 -07:00
3871bdb7e4 t4301: emit blank line in more idiomatic fashion
The unusual use of:

    printf "\\n" >>file &&

may give readers pause, making them wonder why this form was chosen over
the more typical:

    printf "\n" >>file &&

However, even that may give pause since it is a somewhat unusual and
long-winded way of saying:

    echo >>file &&

Therefore, replace `printf` with the more idiomatic `echo`, with the
hope of eliminating a possible stumbling block for those reading the
code.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29 09:28:40 -07:00
87ed97167a t4301: fix broken &&-chains and add missing loop termination
Fix &&-chain breaks in a couple tests which went unnoticed due to blind
spots in the &&-chain linters. In particular, the "magic exit code 117"
&&-chain checker built into test-lib.sh only recognizes broken &&-chains
at the top-level; it does not work within `{...}` groups, `(...)`
subshells, `$(...)` substitutions, or within bodies of compound
statements, such as `if`, `for`, `while`, `case`, etc. Furthermore,
`chainlint.sed`, which detects broken &&-chains only in `(...)`
subshells, missed these cases (which are in subshells) because it
(surprisingly) neglects to check for intact &&-chain on single-line
`for` loops.

While at it, explicitly signal failure of commands within the `for`
loops (which might arise due to the filesystem being full or "inode"
exhaustion). This is important since failures within `for` and `while`
loops can go unnoticed if not detected and signaled manually since the
loop itself does not abort when a contained command fails, nor will a
failure necessarily be detected when the loop finishes since the loop
returns the exit code of the last command it ran on the final iteration,
which may not be the command which failed.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29 09:28:31 -07:00
ee69e7884e gc: use temporary file for editing crontab
While cron is specified by POSIX, there are a wide variety of
implementations in use.  "git maintenance" assumes that the
"crontab" command can be fed from its standard input the new
contents and the syntax to do so is not to have any filename
argument, as POSIX describes.  However, on FreeBSD, the cron
implementation requires a file name argument: if the user wants to
edit standard input, they must specify "-".

Unfortunately, POSIX systems do not have to interpret "-" on the
command line of crontab as a request to read from the standard
input.  Blindly adding "-" on the command line would not work as a
general solution.

Since POSIX tells us that cron must accept a file name argument, let's
solve this problem by specifying a temporary file instead.  This will
ensure that we work with the vast majority of implementations.

Note that because delete_tempfile closes the file for us, we should not
call fclose here on the handle, since doing so will introduce a double
free.

Reported-by: Renato Botelho <garga@FreeBSD.org>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-28 15:47:00 -07:00
969a564587 pack-bitmap-write: drop unused pack_idx_entry parameters
Our write_selected_commits_v1() function takes an array of
pack_idx_entry structs. We used to need them for computing commit
positions, but since aa30162559 (bitmap: move `get commit positions`
code to `bitmap_writer_finish`, 2022-08-14), the caller passes in a
separate array of positions for us. We can drop the unused array (and
its matching length parameter).

Likewise, when we added write_lookup_table() in 93eb41e240
(pack-bitmap-write.c: write lookup table extension, 2022-08-14), it
receives the same array of positions. So its "index" parameter was never
used at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-28 13:23:37 -07:00
c333c2ce65 test-mergesort: use mem_pool for sort input
The previous patch almost halved the number of heap allocations for the
sort subcommand.  Reduce it further by using a mem_pool for the line
objects.

Note that t/perf/run can't be used directly to compare two versions of
test-mergesort because it always runs the helpers from the checked-out
version.  So I hand-merged the results of separate runs before and with
this patch:

macOS 12.5.1 on M1:
0071.12: DEFINE_LIST_SORT unsorted     0.22(0.20+0.01)     0.21(0.19+0.01)
0071.14: DEFINE_LIST_SORT sorted       0.10(0.08+0.01)     0.10(0.08+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.10(0.08+0.01)     0.10(0.08+0.01)

Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H:
0071.12: DEFINE_LIST_SORT unsorted     0.54(0.00+0.06)     0.44(0.01+0.06)
0071.14: DEFINE_LIST_SORT sorted       0.21(0.03+0.03)     0.19(0.04+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.21(0.01+0.04)     0.19(0.04+0.04)

Debian bullseye on WSL2 on the same system:
0071.12: DEFINE_LIST_SORT unsorted     0.29(0.27+0.01)     0.22(0.19+0.02)
0071.14: DEFINE_LIST_SORT sorted       0.07(0.06+0.01)     0.06(0.04+0.02)
0071.16: DEFINE_LIST_SORT reversed     0.07(0.04+0.03)     0.06(0.04+0.02)

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-28 13:10:22 -07:00
f3e8ba2e64 test-mergesort: read sort input all at once
The sort subcommand of test-mergesort is used to test the performance of
sorting linked lists.  It reads lines from stdin, sorts them and prints
the result to stdout.  Two heap allocations are done per line: One for
the linked list item and one for the actual line string.  That imposes a
significant amount of allocation overhead.

Reduce it by doing the same as the sort subcommand of test-string-list,
namely to read the whole input file into a single buffer and then split
it in-place.

Note that t/perf/run can't be used directly to compare two versions of
test-mergesort because it always runs the helpers from the checked-out
version.  So I hand-merged the results of separate runs before and with
this patch:

macOS 12.5.1 on M1:
0071.12: DEFINE_LIST_SORT unsorted     0.23(0.20+0.01)     0.22(0.20+0.01)
0071.14: DEFINE_LIST_SORT sorted       0.12(0.10+0.01)     0.10(0.08+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.12(0.10+0.01)     0.10(0.08+0.01)

Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H:
0071.12: DEFINE_LIST_SORT unsorted     0.71(0.00+0.03)     0.54(0.00+0.06)
0071.14: DEFINE_LIST_SORT sorted       0.42(0.00+0.04)     0.21(0.03+0.03)
0071.16: DEFINE_LIST_SORT reversed     0.42(0.06+0.01)     0.21(0.01+0.04)

Debian bullseye on WSL2 on the same system:
0071.12: DEFINE_LIST_SORT unsorted     0.41(0.39+0.02)     0.29(0.27+0.01)
0071.14: DEFINE_LIST_SORT sorted       0.11(0.08+0.02)     0.07(0.06+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.11(0.08+0.02)     0.07(0.04+0.03)

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-28 13:10:20 -07:00
bcf325ae77 t4301: account for behavior differences between sed implementations
It is a common pattern in this script to write the result of
`merge-tree -z` (NUL-termination mode) to an "actual" file and then
manually append a newline to that file so that it can be diff'd easily
with a hand-crafted "expect" file which itself ends with a newline since
it has been created by standard Unix tools which terminate lines by
default. For instance:

    git merge-tree --write-tree -z ... >out &&
    printf "\\n" >>out
    anonymize_hash out >actual &&
    q_to_nul <<-EOF >expect &&
    ...
    EOF
    test_cmp expect actual

However, one test gets this backward:

    git merge-tree --write-tree -z ... >out &&
    anonymize_hash out >actual &&
    printf "\\n" >>actual

which means that, unlike all other cases, when anonymize_hash() is
called, the file being anonymized does not end with a newline. As a
result, this test fails on some platforms.

anonymize_hash() is implemented like this:

    anonymize_hash() {
        sed -e "s/[0-9a-f]\{40,\}/HASH/g" "$@"
    }

The problem arises due to differences in behavior of various `sed`
implementations when fed an incomplete line (lacking a newline).
Although most modern `sed` implementations output such a line
unmolested (i.e. without a newline), some older `sed` implementations
forcibly add a newline to the incomplete line (giving the output an
extra unexpected newline), while other very old implementations simply
swallow an incomplete line and don't emit it at all (making the output
shorter than expected).

Fix this test by manually adding the newline before passing it through
`sed`, thus ensuring identical behavior with all `sed` implementation,
and bringing the test in line with other tests in this script.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-28 13:09:46 -07:00
2987ce743d Merge branch 'en/t4301-more-merge-tree-tests' into es/t4301-sed-portability-fix
* en/t4301-more-merge-tree-tests:
  t4301: add more interesting merge-tree testcases
2022-08-28 13:08:54 -07:00
babe2e0559 tempfile: avoid directory cleanup race
The temporary directory created by mks_tempfile_dt() is deleted by first
deleting the file within, then truncating the filename strbuf and
passing the resulting string to rmdir(2).  When the cleanup routine is
invoked concurrently by a signal handler we can end up passing the now
truncated string to unlink(2), however, which could cause problems on
some systems.

Avoid that issue by remembering the directory name separately.  This way
the paths stay unchanged.  A signal handler can still race with normal
cleanup, but deleting the same files and directories twice is harmless.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-27 10:17:46 -07:00
1819ad327b grep: fix multibyte regex handling under macOS
The commit 29de20504e (Makefile: fix default regex settings on
Darwin, 2013-05-11) fixed t0070-fundamental.sh under Darwin (macOS) by
adopting Git's regex library.  However, this library is compiled with
NO_MBSUPPORT, which causes git-grep to work incorrectly on multibyte
(e.g. UTF-8) files.  Current macOS versions pass t0070-fundamental.sh
with the native macOS regex library, which also supports multibyte
characters.

Adjust the Makefile to use the native regex library, and call
setlocale(3) to set CTYPE according to the user's preference.
The setlocale call is required on all platforms, but in platforms
supporting gettext(3), setlocale was called as a side-effect of
initializing gettext.  Therefore, move the CTYPE setlocale call from
gettext.c to common-main.c and the corresponding locale.h include
into git-compat-util.h.

Thanks to the global initialization of CTYPE setlocale, the test-tool
regex command now works correctly with supported multibyte regexes, and
is used to set the MB_REGEX test prerequisite by assessing a platform's
support for them.

Signed-off-by: Diomidis Spinellis <dds@aueb.gr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 11:45:52 -07:00
07ee72db0e Sync with 'maint' 2022-08-26 11:14:11 -07:00
761416ef91 bitmap-lookup-table: add performance tests for lookup table
Add performance tests to verify the performance of lookup table.
`p5310-pack-bitmaps.sh` contain tests with and without lookup table.
`p5312-pack-bitmaps-revs.sh` contain same tests with and without
lookup table but with `pack.writeReverseIndex` enabled.

Lookup table makes Git run faster in most of the cases. Below is the
result of `t/perf/p5310-pack-bitmaps.sh`.`perf/p5326-multi-pack-bitmaps.sh`
gives similar result. The repository used in the test is linux kernel.

Test                                                    this tree
-----------------------------------------------------------------------
5310.4: enable lookup table: false                    0.01(0.00+0.00)
5310.5: repack to disk                                320.89(230.20+23.45)
5310.6: simulated clone                               14.04(5.78+1.79)
5310.7: simulated fetch                               1.95(3.05+0.20)
5310.8: pack to file (bitmap)                         44.73(20.55+7.45)
5310.9: rev-list (commits)                            0.78(0.46+0.10)
5310.10: rev-list (objects)                           4.07(3.97+0.08)
5310.11: rev-list with tag negated via --not          0.06(0.02+0.03)
         --all (objects)
5310.12: rev-list with negative tag (objects)         0.21(0.15+0.05)
5310.13: rev-list count with blob:none                0.24(0.17+0.06)
5310.14: rev-list count with blob:limit=1k            7.07(5.92+0.48)
5310.15: rev-list count with tree:0                   0.25(0.17+0.07)
5310.16: simulated partial clone                      5.67(3.28+0.64)
5310.18: clone (partial bitmap)                       16.05(8.34+1.86)
5310.19: pack to file (partial bitmap)                59.76(27.22+7.43)
5310.20: rev-list with tree filter (partial bitmap)   0.90(0.18+0.16)
5310.24: enable lookup table: true                    0.01(0.00+0.00)
5310.25: repack to disk                               319.73(229.30+23.01)
5310.26: simulated clone                              13.69(5.72+1.78)
5310.27: simulated fetch                              1.84(3.02+0.16)
5310.28: pack to file (bitmap)                        45.63(20.67+7.50)
5310.29: rev-list (commits)                           0.56(0.39+0.8)
5310.30: rev-list (objects)                           3.77(3.74+0.08)
5310.31: rev-list with tag negated via --not          0.05(0.02+0.03)
         --all (objects)
5310.32: rev-list with negative tag (objects)         0.21(0.15+0.05)
5310.33: rev-list count with blob:none                0.23(0.17+0.05)
5310.34: rev-list count with blob:limit=1k            6.65(5.72+0.40)
5310.35: rev-list count with tree:0                   0.23(0.16+0.06)
5310.36: simulated partial clone                      5.57(3.26+0.59)
5310.38: clone (partial bitmap)                       15.89(8.39+1.84)
5310.39: pack to file (partial bitmap)                58.32(27.55+7.47)
5310.40: rev-list with tree filter (partial bitmap)   0.73(0.18+0.15)

Test 4-15 are tested without using lookup table. Same tests are
repeated in 16-30 (using lookup table).

Mentored-by: Taylor Blau <me@ttaylorr.com>
Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 10:14:02 -07:00
28cd730680 pack-bitmap: prepare to read lookup table extension
Earlier change teaches Git to write bitmap lookup table. But Git
does not know how to parse them.

Teach Git to parse the existing bitmap lookup table. The older
versions of Git are not affected by it. Those versions ignore the
lookup table.

Mentored-by: Taylor Blau <me@ttaylorr.com>
Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 10:13:58 -07:00
76f14b777c pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests
Teach Git to provide a way for users to enable/disable bitmap lookup
table extension by providing a config option named 'writeBitmapLookupTable'.
Default is false.

Also add test to verify writting of lookup table.

Mentored-by: Taylor Blau <me@ttaylorr.com>
Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-Authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 10:13:54 -07:00
93eb41e240 pack-bitmap-write.c: write lookup table extension
The bitmap lookup table extension was documented by an earlier
change, but Git does not yet know how to write that extension.

Teach Git to write bitmap lookup table extension. The table contains
the list of `N` <commit_pos, offset, xor_row>` triplets. These
triplets are sorted according to their commit pos (ascending order).
The meaning of each data in the i'th triplet is given below:

  - commit_pos stores commit position (in the pack-index or midx).
    It is a 4 byte network byte order unsigned integer.

  - offset is the position (in the bitmap file) from which that
    commit's bitmap can be read.

  - xor_row is the position of the triplet in the lookup table
    whose bitmap is used to compress this bitmap, or `0xffffffff`
    if no such bitmap exists.

Mentored-by: Taylor Blau <me@ttaylorr.com>
Co-mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 10:13:50 -07:00
aa30162559 bitmap: move get commit positions code to bitmap_writer_finish
The `write_selected_commits_v1` function takes care of writing commit
positions along with their corresponding bitmaps in the disk. It is
OK because this `search commit position of a given commit` algorithm
is needed only once here. But in later changes of the `lookup table
extension series`, we need same commit positions which means we have
to run the above mentioned algorithm one more time.

Move the `search commit position of a given commit` algorithm to
`bitmap_writer_finish()` and use the `commit_positions` array
to get commit positions of their corresponding bitmaps.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 10:13:47 -07:00
e9977b12fd Documentation/technical: describe bitmap lookup table extension
When reading bitmap file, Git loads each and every bitmap one by one
even if all the bitmaps are not required. A "bitmap lookup table"
extension to the bitmap format can reduce the overhead of loading
bitmaps which stores a list of bitmapped commit id pos (in the midx
or pack, along with their offset and xor offset. This way Git can
load only the necessary bitmaps without loading the previous bitmaps.

Older versions of Git ignore the lookup table extension and don't
throw any kind of warning or error while parsing the bitmap file.

Add some information for the new "bitmap lookup table" extension in the
bitmap-format documentation.

Mentored-by: Taylor Blau <me@ttaylorr.com>
Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-Authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 10:13:33 -07:00
b75747829f range-diff: optionally accept pathspecs
The `git range-diff` command can be quite expensive, which is not a
surprise given that the underlying algorithm to match up pairs of
commits between the provided two commit ranges has a cubic runtime.

Therefore it makes sense to restrict the commit ranges as much as
possible, to reduce the amount of input to that O(N^3) algorithm.

In chatty repositories with wide trees, this is not necessarily
possible merely by choosing commit ranges wisely.

Let's give users another option to restrict the commit ranges: by
providing a pathspec. That helps in repositories with wide trees because
it is likely that the user has a good idea which subset of the tree they
are actually interested in.

Example:

	git range-diff upstream/main upstream/seen HEAD -- range-diff.c

This shows commits that are either in the local branch or in `seen`, but
not in `main`, skipping all commits that do not touch `range-diff.c`.

Note: Since we piggy-back the pathspecs onto the `other_arg` mechanism
that was introduced to be able to pass through the `--notes` option to
the revision machinery, we must now ensure that the `other_arg` array is
appended at the end (the revision range must come before the pathspecs,
if any).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 09:49:26 -07:00
0087d7dfbe range-diff: consistently validate the arguments
This patch lets `range-diff` validate the arguments not only when
invoked with one or two arguments, but also in the code path where three
arguments are handled.

While at it, we now use `usage_msg_opt*()` consistently.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 09:49:25 -07:00
edd6a31f46 range-diff: reorder argument handling
In d9c66f0b5b (range-diff: first rudimentary implementation,
2018-08-13), we introduced the argument handling of the `range-diff`
command, special-casing three different stanzas based on the argument
count.

The somewhat unorthodox order (first handling the case of 2 arguments,
then 3, then 1) was chosen for clarity: the natural argument number is 2
because that is how many revision ranges are used internally. The code
to handle three arguments is relatively trivial, so it was added next.
And finally, the code to ungarble a single symmetric range into two
separate ones was added, because it was the most complicated (the most
inelegant part being about interpreting empty sides of the symmetric
range as `HEAD`).

In preparation for allowing pathspecs in `git range-diff` invocations,
where we no longer have the luxury of using the number of arguments to
disambiguate between these three different ways to specify the commit
ranges, we need to order these cases by argument count, in descending
order.

This patch is best viewed with `--color-moved`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 09:49:23 -07:00
6693fb3f01 t64xx: convert 'test_create_repo' to 'git init'
Convert the merge-specific tests (those in the t64xx range) over to
using 'git init' instead of 'test_create_repo'.

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 09:23:03 -07:00
7c46ea0ded The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-25 14:42:32 -07:00
f00ddc9f48 Merge branch 'vd/scalar-generalize-diagnose'
The "diagnose" feature to create a zip archive for diagnostic
material has been lifted from "scalar" and made into a feature of
"git bugreport".

* vd/scalar-generalize-diagnose:
  scalar: update technical doc roadmap
  scalar-diagnose: use 'git diagnose --mode=all'
  builtin/bugreport.c: create '--diagnose' option
  builtin/diagnose.c: add '--mode' option
  builtin/diagnose.c: create 'git diagnose' builtin
  diagnose.c: add option to configure archive contents
  scalar-diagnose: move functionality to common location
  scalar-diagnose: move 'get_disk_info()' to 'compat/'
  scalar-diagnose: add directory to archiver more gently
  scalar-diagnose: avoid 32-bit overflow of size_t
  scalar-diagnose: use "$GIT_UNZIP" in test
2022-08-25 14:42:32 -07:00
a103ad6f3d Merge branch 'jk/pipe-command-nonblock'
Fix deadlocks between main Git process and subprocess spawned via
the pipe_command() API, that can kill "git add -p" that was
reimplemented in C recently.

* jk/pipe-command-nonblock:
  pipe_command(): mark stdin descriptor as non-blocking
  pipe_command(): handle ENOSPC when writing to a pipe
  pipe_command(): avoid xwrite() for writing to pipe
  git-compat-util: make MAX_IO_SIZE define globally available
  nonblock: support Windows
  compat: add function to enable nonblocking pipes
2022-08-25 14:42:32 -07:00
098b7bfaa6 Merge branch 'js/fetch-negotiation-trace'
The common ancestor negotiation exchange during a "git fetch"
session now leaves trace log.

* js/fetch-negotiation-trace:
  fetch-pack: add tracing for negotiation rounds
2022-08-25 14:42:31 -07:00
01a30a5a58 Merge branch 'jk/is-promisor-object-keep-tree-in-use'
An earlier optimization discarded a tree-object buffer that is
still in use, which has been corrected.

* jk/is-promisor-object-keep-tree-in-use:
  is_promisor_object(): fix use-after-free of tree buffer
2022-08-25 14:42:31 -07:00
df3c129e24 Merge branch 'en/submodule-merge-messages-fixes'
Further update the help messages given while merging submodules.

* en/submodule-merge-messages-fixes:
  merge-ort: provide helpful submodule update message when possible
  merge-ort: avoid surprise with new sub_flag variable
  merge-ort: remove translator lego in new "submodule conflict suggestion"
  submodule merge: update conflict error message
2022-08-25 14:42:29 -07:00
8f9d80f6c0 remote: run "remote rm" argv through parse_options()
The "git remote rm" command's option parsing is fairly primitive: it
insists on a single argument, which it treats as the remote name, and
displays a usage message otherwise.

This is OK, and maybe even convenient, as you could run:

  git remote rm --foo

to drop a remote named "--foo". But it's also weirdly unlike most of the
rest of Git, which would complain that there is no option "--foo". The
right way to spell it by our conventions is:

  git remote rm -- --foo

but this doesn't currently work.

So let's bring the command in line with the rest of Git (including its
sibling subcommands!) by feeding argv to parse_options(). We already
have an empty options array for the usage helper.

Note that we have to adjust the argc index down by one, as
parse_options() eats the program name from the start of the array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-25 09:43:30 -07:00
0d330a53f3 maintenance: add parse-options boilerplate for subcommands
Several of the git-maintenance subcommands don't take any options, so
they don't bother looking at argv at all. This means they'll silently
accept garbage, like:

  $ git maintenance register --foo
  [no output]

  $ git maintenance stop bar
  [no output]

Let's give them the basic boilerplate to detect and handle these cases:

  $ git maintenance register --foo
  error: unknown option `foo'
  usage: git maintenance register

  $ git maintenance stop bar
  usage: git maintenance stop

We could reduce the number of lines of code here a bit with a shared
helper function. But it's worth building out the boilerplate, as it may
serve as the base for adding options later.

Note one complication: maintenance_start() calls directly into
maintenance_register(), so it now needs to pass a plausible argv (we
don't care, but parse_options() is expecting there to at least be an
argv[0] program name). This is an extra line of code, but it eliminates
the need for an explanatory comment.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-25 09:43:30 -07:00
ecd2d3efe0 pass subcommand "prefix" arguments to parse_options()
Recent commits such as bf0a6b65fc (builtin/multi-pack-index.c: let
parse-options parse subcommands, 2022-08-19) converted a few functions
to match our usual argc/argv/prefix conventions, but the prefix argument
remains unused.

However, there is a good use for it: they should pass it to their own
parse_options() functions, where it may be used to adjust the value of
any filename options. In all but one of these functions, there's no
behavior change, since they don't use OPT_FILENAME. But this is an
actual fix for one option, which you can see by modifying the test suite
like so:

	diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh
	index 4fe57414c1..d0974d4371 100755
	--- a/t/t5326-multi-pack-bitmaps.sh
	+++ b/t/t5326-multi-pack-bitmaps.sh
	@@ -186,7 +186,11 @@ test_expect_success 'writing a bitmap with --refs-snapshot' '

	 		# Then again, but with a refs snapshot which only sees
	 		# refs/tags/one.
	-		git multi-pack-index write --bitmap --refs-snapshot=snapshot &&
	+		(
	+			mkdir subdir &&
	+			cd subdir &&
	+			git multi-pack-index write --bitmap --refs-snapshot=../snapshot
	+		) &&

	 		test_path_is_file $midx &&
	 		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&

I'd emphasize that this wasn't broken by bf0a6b65fc; it has been broken
all along, because the sub-function never got to see the prefix. It is
that commit which is actually enabling us to fix it (and which also
brought attention to the problem because it triggers -Wunused-parameter!)

The other functions changed here don't use OPT_FILENAME at all. In their
cases this isn't fixing anything visible, but it's following the usual
pattern and future-proofing them against somebody adding new options and
being surprised.

I didn't include a test for the one visible case above. We don't
generally test routine parse-options behavior for individual options.
The challenge here was finding the problem, and now that this has been
done, it's not likely to regress. Likewise, we could apply the patch
above to cover it "for free" but it makes reading the rest of the test
unnecessarily complicated.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-25 09:43:29 -07:00
f677f62970 Merge branch 'ds/bundle-uri-clone' into ds/bundle-uri-3
* ds/bundle-uri-clone:
  clone: warn on failure to repo_init()
  clone: --bundle-uri cannot be combined with --depth
  bundle-uri: add support for http(s):// and file://
  clone: add --bundle-uri option
  bundle-uri: create basic file-copy logic
  remote-curl: add 'get' capability
2022-08-24 16:05:16 -07:00
9905e80b0f t5329: notice a failure within a loop
We try to write "|| return 1" (or "|| exit 1" in a subshell) at the
end of a sequence of &&-chained command in a loop of our tests, so
that a failure of any step during the earlier iteration of the loop
can properly be caught.

There is one loop in this test script that is used to compute the
expected result, that will be later compared with an actual output
produced by the "test-tool pack-mtimes" command.  This particular
loop, however, is placed on the upstream side of a pipe, whose
non-zero exit code does not get noticed.

Emit a line that will never be produced by the "test-tool pack-mtimes"
to cause the later comparison to fail.  As we use test_cmp to compare
this "expected output" file with the "actual output", the "error
message" we are emitting into the expected output stream will stand
out and shown to the tester.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 14:24:17 -07:00
ef46584831 ci: update 'static-analysis' to Ubuntu 22.04
GitHub Actions scheduled a brownout of Ubuntu 18.04, which canceled all
runs of the 'static-analysis' job in our CI runs. Update to 22.04 to
avoid this as the brownout later turns into a complete deprecation.

The use of 18.04 was set in d051ed77ee (.github/workflows/main.yml: run
static-analysis on bionic, 2021-02-08) due to the lack of Coccinelle
being available on 20.04 (which continues today).

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 13:02:12 -07:00
3c4dbf556f t4301: add more interesting merge-tree testcases
This adds several tests of `merge-tree -z` extended conflict output
behavior to the testsuite, including some tests adapted from t6422.
These tests mark current behavior, not necessarily optimal behavior.  In
particular, some path_msg() calls might want to include additional
paths.

These testcases also make something clear about the <Conflicted file>
info section of the output.  That section consists of a sequence of
lines of the form
    <mode> <object> <stage> <filename>
where <stage> is always greater than 0 (since each line comes from a
conflicted file).  The lines correspond to conflicts that would be
placed in the index if we were doing a merge in a working tree.  It is
perhaps natural to assume that for any given line, the <object> and
<filename> correspond to a single <revision>:<filename> pair from one of
the commits being merged (or from the merge base).  This is true for
simple conflicts.  However, these testcases make it clear that this is
not the case in general.  For example, <object> may be the hash of a
three-way content merge of three different files (and with different
filenames).

The tests no longer pass under TEST_PASSES_SANITIZE_LEAK; it appears
that doing a directory rename with "git mv", among other possible
problems, triggers issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 09:42:36 -07:00
ae15fd4116 merge: small code readability improvement
After our loop through the selected strategies, we compare best_strategy
to wt_strategy.  This is fine, but the fact that the code setting
best_strategy sets it to use_strategies[i]->name requires a little bit
of extra checking to determine that at the time of setting, that's the
same as wt_strategy.  Just setting best_strategy to wt_strategy makes it
a little easier to verify what the loop is doing, at least for this
reader.

Further, use_strategies[i]->name is used in a number of places, where we
could just use wt_strategy.  The latter takes less time for this reader
to parse (one variable name instead of three), so just use wt_strategy
to make the code slightly faster for human readers to parse.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 09:25:59 -07:00
5b1d30cabf merge: cleanup confusing logic for handling successful merges
builtin/merge.c has a loop over the specified strategies, where if they
all fail with conflicts, it picks the one with the least number of
conflicts.

In the codepath that finds a successful merge, if an automatic commit
was wanted, the code breaks out of the above loop, which makes sense.
However, if the user requested there be no automatic commit, the loop
would continue.  That seems weird; --no-commit should not affect the
choice of merge strategy, but the code as written makes one think it
does.  However, since the loop itself embeds "!merge_was_ok" as a
condition on continuing to loop, it actually would also exit early if
--no-commit was specified, it just exited from a different location.

Restructure the code slightly to make it clear that the loop will
immediately exit whenever we find a merge strategy that is successful.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 09:10:27 -07:00
d3a9295ada merge: only apply autostash when appropriate
If a merge failed and we are leaving conflicts in the working directory
for the user to resolve, we should not attempt to apply any autostash.

Further, if we fail to apply the autostash (because either the merge
failed, or the user requested --no-commit), then we should instruct the
user how to apply it later.

Add a testcase verifying we have corrected this behavior.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 09:08:32 -07:00
c4bbd9bb8f promisor-remote: fix xcalloc() argument order
Pass the number of elements first and their size second, as expected
by xcalloc().

Patch generated with:

  make SPATCH_FLAGS=--recursive-includes contrib/coccinelle/xcalloc.cocci.patch

Our default SPATCH_FLAGS ('--all-includes') doesn't catch this
transformation by default, unless used in combination with a large-ish
SPATCH_BATCH_SIZE which happens to put 'promisor-remote.c' with a file
that includes 'repository.h' directly in the same batch.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 08:50:39 -07:00
65da938916 clone: warn on failure to repo_init()
The --bundle-uri option was added in 5556891961 (clone: add
--bundle-uri option, 2022-08-09), but this also introduced a call to
repo_init() whose return value was ignored. Fix that ignored value by
warning that the bundle URI process could not continue if it failed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-24 08:46:36 -07:00
23578904da preload-index: fix memleak
Fix a memory leak occuring in case of pathspec copy in preload_index.

Direct leak of 8 byte(s) in 8 object(s) allocated from:
    #0 0x7f0a353ead47 in __interceptor_malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/libasan.so.6+0xb5d47)
    #1 0x55750995e840 in do_xmalloc /home/anthony/src/c/git/wrapper.c:51
    #2 0x55750995e840 in xmalloc /home/anthony/src/c/git/wrapper.c:72
    #3 0x55750970f824 in copy_pathspec /home/anthony/src/c/git/pathspec.c:684
    #4 0x557509717278 in preload_index /home/anthony/src/c/git/preload-index.c:135
    #5 0x55750975f21e in refresh_index /home/anthony/src/c/git/read-cache.c:1633
    #6 0x55750915b926 in cmd_status builtin/commit.c:1547
    #7 0x5575090e1680 in run_builtin /home/anthony/src/c/git/git.c:466
    #8 0x5575090e1680 in handle_builtin /home/anthony/src/c/git/git.c:720
    #9 0x5575090e284a in run_argv /home/anthony/src/c/git/git.c:787
    #10 0x5575090e284a in cmd_main /home/anthony/src/c/git/git.c:920
    #11 0x5575090dbf82 in main /home/anthony/src/c/git/common-main.c:56
    #12 0x7f0a348230ab  (/lib64/libc.so.6+0x290ab)

Signed-off-by: Anthony Delannoy <anthony.2lannoy@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 15:08:30 -07:00
99e4d084ff midx.c: avoid adding preferred objects twice
The last commit changes the behavior of midx.c's `get_sorted_objects()`
function to handle the case of writing a MIDX bitmap while reusing an
existing MIDX and changing the identity of the preferred pack
separately.

As part of this change, all objects from the (new) preferred pack are
added to the fanout table in a separate pass. Since these copies of the
objects all have their preferred bits set, any duplicates will be
resolved in their favor.

Importantly, this includes any copies of those same objects that come
from the existing MIDX. We know at the time of adding them that they'll
be redundant if their source pack is the (new) preferred one, so we can
avoid adding them to the list in this case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 13:04:22 -07:00
cdf517be06 midx.c: include preferred pack correctly with existing MIDX
This patch resolves an issue where the object order used to generate a
MIDX bitmap would violate an invariant that all of the preferred pack's
objects are represented by that pack in the MIDX.

The problem arises when reusing an existing MIDX while generating a new
one, and occurs specifically when the identity of the preferred pack
changes from one MIDX to another, along with a few other conditions:

    - the new preferred pack must also be present in the existing MIDX

    - the new preferred pack must *not* have been the preferred pack in
      the existing MIDX

    - most importantly, there must be at least one object present in the
      physical preferred pack (ie., it shows up in that pack's index)
      but was selected from a *different* pack when the previous MIDX
      was generated

When the above conditions are all met, we end up (incorrectly)
discarding copies of some objects in the pack selected as the preferred
pack. This is because `get_sorted_entries()` adds objects to its list
by doing the following at each fanout level:

    - first, adding all objects from that fanout level from an existing
      MIDX

    - then, adding all objects from that fanout level in each pack *not*
      included in the existing MIDX

So if some object was not selected from the to-be-preferred pack when
writing the previous MIDX, then we will never consider it as a candidate
when generating the new MIDX. This means that it's possible for the
preferred pack to not include all of its objects in the MIDX's
pseudo-pack object order, which is an invariant violation of that order.

Resolve this by adding all objects from the preferred pack separately
when it appears in the existing MIDX (if one was present). This will
duplicate objects from that pack that *did* appear in the MIDX, but this
is fine, since get_sorted_entries() already handles duplicates. (A
future optimization in this area could avoid adding copies of objects
that we know already existing in the MIDX.)

Note that we no longer need to compute the preferred-ness of objects
added from the MIDX, since we only want to select the preferred objects
from a single source. (We could still mark these preferred bits, but
doing so is redundant and unnecessary).

This resolves the bug demonstrated by t5326.174 ("preferred pack change
with existing MIDX bitmap").

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 13:04:22 -07:00
1d6f4c6408 midx.c: extract midx_fanout_add_pack_fanout()
Extract a routine to add all objects whose object ID's first byte is
`cur_fanout` from a given pack (identified by its index into the `struct
pack_info` array maintained by the MIDX writing routine).

Unlike the previous extraction (for `midx_fanout_add_midx_fanout()`),
this function will be called twice, once for all new packs, and again
for the preferred pack (if it appears in an existing MIDX). The latter
change is to resolve the bug described a few patches ago, and will be
made in the subsequent commit.

Similar to the previous refactoring, this function also enhances the
readability of its caller in `get_sorted_entries()`.

Its functionality is unchanged in this commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 13:04:22 -07:00
852c530102 midx.c: extract midx_fanout_add_midx_fanout()
Extract a routine to add all objects whose object ID's first byte is
`cur_fanout` from an existing MIDX.

This function will only be called once, so extracting it is purely
cosmetic to improve the readability of `get_sorted_entries()` (its sole
caller) below.

The functionality is unchanged in this commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 13:04:22 -07:00
989d9cbd5c midx.c: extract struct midx_fanout
To build up a list of objects (along with their packs, and the offsets
within those packs that each object appears at), the MIDX code
implements `get_sorted_entries()` which builds up a list of candidates,
sorts them, and then removes duplicate entries.

To do this, it keeps an array of `pack_midx_entry` structures that it
builds up once for each fanout level (ie., for all possible values of
the first byte of each object's ID).

This array is a function-local variable of `get_sorted_entries()`. Since
it uses the ALLOC_GROW() macro, having the `alloc_fanout` variable also
be local to that function, and only modified within that function is
convenient.

However, subsequent changes will extract the two ways this array is
filled (from a pack at some fanout value, and from an existing MIDX at
some fanout value) into separate functions. Instead of passing around
pointers to the entries array, along with `nr_fanout` and
`alloc_fanout`, encapsulate these three into a structure instead. Then
pass around a pointer to this structure instead.

This patch does not yet extract the above two functions, but sets us up
to begin doing so in the following commit. For now, the implementation
of get_sorted_entries() is only modified to replace `entries_by_fanout`
with `fanout.entries`, `nr_fanout` with `fanout.nr`, and so on.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 13:04:22 -07:00
0b6203c4ef t/lib-bitmap.sh: avoid silencing stderr
The midx_bitmap_partial_tests() function is responsible for setting up a
state where some (but not all) packs in the repository are covered by a
MIDX (and bitmap).

This function has redirected the `git multi-pack-index write --bitmap`'s
stderr to a file "err" since its introduction back in c51f5a6437 (t5326:
test multi-pack bitmap behavior, 2021-08-31).

This was likely a stray change left over from a slightly different
version of this test, since the file "err" is never read after being
written. This leads to confusingly-missing output, especially when the
contents of stderr are important.

Resolve this confusion by avoiding silencing stderr in this case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 13:04:22 -07:00
65168c42df t5326: demonstrate potential bitmap corruption
It is possible to generate a corrupt MIDX bitmap when certain conditions
are met. This happens when the preferred pack "P" changes to one (say,
"Q") that:

  - "Q" has objects included in an existing MIDX,
  - but "Q" is different than "P",
  - and "Q" and "P" have some objects in common

When this is the case, not all objects from "Q" will be selected from
"Q" (ie., the generated MIDX will represent them as coming from a
different pack), despite "Q" being preferred.

This is an invariant violation, since all objects contained in the
MIDX's preferred pack are supposed to originate from the preferred pack.
In other words, all duplicate objects are resolved in favor of the copy
that comes from the MIDX's preferred pack, if any.

This violation results in a corrupt object order, which cannot be
interpreted by the pack-bitmap code, leading to broken clones and other
defects.

This test demonstrates the above problem by constructing a minimal
reproduction, and showing that the final `git clone` invocation fails.

The reproduction is mostly straightforward, except that the new pack
generated between MIDX writes (which is necessary in order to prevent
that operation from being a noop) must sort ahead of all existing packs
in order to prevent a different pack (neither "P" nor "Q") from
appearing as preferred (meaning all its objects appear in order at the
beginning of the pseudo-pack order).

Subsequent commits will first refactor the midx.c::get_sorted_entries()
function, and then fix this bug.

Reported-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 13:04:21 -07:00
0e66bc1b21 t: detect and signal failure within loop
Failures within `for` and `while` loops can go unnoticed if not detected
and signaled manually since the loop itself does not abort when a
contained command fails, nor will a failure necessarily be detected when
the loop finishes since the loop returns the exit code of the last
command it ran on the final iteration, which may not be the command
which failed. Therefore, detect and signal failures manually within
loops using the idiom `|| return 1` (or `|| exit 1` within subshells).

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 12:53:02 -07:00
625ff5c320 t1092: fix buggy sparse "blame" test
This test wants to verify that `git blame` errors out when asked to
blame a file _not_ in the sparse checkout. However, the very first file
it asks to blame _is_ present in the checkout, thus `test_must_fail git
blame $file` gives an unexpected result (the "blame" succeeds). This
problem went unnoticed because the test invokes `test_must_fail git
blame $file` in loop but forgets to break out of the loop early upon
failure, thus the failure gets swallowed.

Fix the test by having it not ask to blame a file present in the sparse
checkout, and instead only blame files not present, as intended. While
at it, also add the missing `|| return 1` which allowed this bug to go
unnoticed.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 12:53:02 -07:00
308cbaa082 t2407: fix broken &&-chains in compound statement
The breaks in the &&-chain in this test went unnoticed because the
"magic exit code 117" &&-chain checker built into test-lib.sh only
recognizes broken &&-chains at the top-level; it does not work within
`{...}` groups, `(...)` subshells, `$(...)` substitutions, or within
bodies of compound statements, such as `if`, `for`, `while`, `case`,
etc. Furthermore, `chainlint.sed` detects broken &&-chains only in
`(...)` subshells. Thus, the &&-chain breaks in this test fall into the
blind spots of the &&-chain linters.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22 12:53:02 -07:00
12a58f9014 xdiff: drop unused mmfile parameters from xdl_do_patience_diff()
The entry point to the patience-diff algorithm takes two mmfile_t
structs with the original file contents, but it doesn't actually do
anything useful with them. This is similar to the case recently cleaned
up in the histogram code via f1d019071e (xdiff: drop unused mmfile
parameters from xdl_do_histogram_diff(), 2022-08-19), but there's a bit
more subtlety going on.

We pass them into the recursive patience_diff(), which in turn passes
them into fill_hashmap(), which stuffs the pointers into a struct. But
the only thing which reads the struct fields is our recursion into
patience_diff()!

So it's unlikely that something like -Wunused-parameter could find this
case: it would have to detect the circular dependency caused by the
recursion (not to mention tracing across struct field assignments).

But once found, it's easy to have the compiler confirm what's going on:

  1. Drop the "file1" and "file2" fields from the hashmap struct
     definition. Remove the assignments in fill_hashmap(), and
     temporarily substitute NULL in the recursive call to
     patience_diff(). Compiling shows that no other code touched those
     fields.

  2. Now fill_hashmap() will trigger -Wunused-parameter. Drop "file1"
     and "file2" from its definition and callsite.

  3. Now patience_diff() will trigger -Wunused-parameter. Drop them
     there, too. One of the callsites is the recursion with our
     NULL values, so those temporary values go away.

  4. Now xdl_do_patience_diff() will trigger -Wunused-parameter. Drop
     them there. And we're done.

Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-20 14:14:55 -07:00
ee610f00e2 reflog: assert PARSE_OPT_NONEG in parse-options callbacks
In the spirit of 517fe807d6 (assert NOARG/NONEG behavior of
parse-options callbacks, 2018-11-05), this asserts that our callbacks
were invoked using the right flags (since otherwise they'd segfault on
the NULL arg). Both cases are already correct here, so this is mostly
about annotating the functions, and appeasing -Wunused-parameters.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-20 14:14:55 -07:00
21a40847ed reftable: drop unused parameter from reader_seek_linear()
The reader code passes around a "struct reftable_reader" context
variable. But the seek function doesn't need it; the table iterator we
already get is sufficient.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-20 14:14:55 -07:00
5db8e59cf1 verify_one_sparse(): drop unused parameters
This function has never used its repository or cache_tree parameters
since it was introduced in 9ad2d5ea71 (sparse-index: loose integration
with cache_tree_verify(), 2021-03-30).

As that commit notes, it may eventually be extended further, and that
might require looking at more data. But we can easily add them back if
necessary (and the repository is even included in the index_state these
days already). In the mean time, dropping them makes the code shorter
and appeases -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-20 14:14:17 -07:00
77b9e85c0f p0006: fix 'read-tree' argument ordering
In the 'p0006' test "read-tree br_base br_ballast", move the '-n' flag used
in 'git read-tree' ahead of its positional arguments.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 14:35:30 -07:00
59c72303dd p0004: fix prereq declaration
Fix multi-threaded 'p0004' test's use of the 'REPO_BIG_ENOUGH_FOR_MULTI'
prerequisite. Unlike normal 't/' tests, 't/perf/' tests need to have their
prerequisites declared with the '--prereq' flag.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 14:35:28 -07:00
629444ad45 sequencer: do not translate command names
When action_name is used to denote a command `git %s` do not translate
since command names are never translated.

Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 13:46:37 -07:00
1c8dfc3674 sequencer: do not translate parameters to error_resolve_conflict()
`error_resolve_conflict()` checks the untranslated action_name
parameter, so pass it as is.

Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 13:44:48 -07:00
5670e0ec15 sequencer: do not translate reflog messages
Traditionally, reflog messages were never translated, in particular not
on storage.

Due to the switch of more parts of git to the sequencer, old changes in
the sequencer code may lead to recent changes in git's behaviour. E.g.:
c28cbc5ea6 ("sequencer: mark action_name() for translation", 2016-10-21)
marked several uses of `action_name()` for translation. Recently, this
lead to a partially translated reflog:

`rebase: fast-forward` is translated (e.g. in de to `Rebase: Vorspulen`)
whereas other reflog entries such as `rebase (pick):` remain
untranslated as they should be.

Change the relevant line in the sequencer so that this reflog entry
remains untranslated, as well.

Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 13:43:35 -07:00
77651c032c match_pathname(): drop unused "flags" parameter
This field has not been used since the function was introduced in
b559263216 (exclude: split pathname matching code into a separate
function, 2012-10-15), though there was a brief period where it was
erroneously used and then reverted in ed4958477b (dir: fix pattern
matching on dirs, 2021-09-24) and 5ceb663e92 (dir: fix
directory-matching bug, 2021-11-02).

It's possible we'd eventually add a flag that makes it useful here, but
there are only a handful of callers. It would be easy to add back if
necessary, and in the meantime this makes the function interface less
misleading.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:20:56 -07:00
e2841f706e log-tree: drop unused commit param in remerge_diff()
This function has never used its "commit" parameter since it was added
in db757e8b8d (show, log: provide a --remerge-diff capability,
2022-02-02).

This makes sense; we already have separate parameters for the parents
(which lets us redo the merge) and the oid of the result tree (which we
can then diff against the remerge result).

Let's drop the unused parameter in the name of clarity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:20:43 -07:00
f1d019071e xdiff: drop unused mmfile parameters from xdl_do_histogram_diff()
These are no longer used since 9df0fc3d57 (xdiff: fix a memory leak,
2022-02-16), as the caller is expected to call xdl_prepare_env() itself.
After that change the histogram code only examines the prepared
xdfenv_t, not the original buffers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:20:24 -07:00
776515ef8b is_path_owned_by_current_uid(): mark "report" parameter as unused
In the non-Windows version of this function, we never have any errors to
report, and thus the "report" parameter is unused. But we can't drop it,
because we have to maintain function call compatibility with the version
in compat/mingw.h, which does use this parameter.

Note that there's an extra level of indirection here; the common
function is actually is_path_owned_by_current_user, which is a macro
pointing to "by_current_uid" or "by_current_sid", depending on the
platform. So an alternative here is to eat the unused parameter in the
macro, since -Wunused-parameter doesn't complain about macros. But I
think the UNUSED() annotation is less obfuscated for somebody reading
the code later.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:56 -07:00
e5e056b21d run-command: mark unused async callback parameters
The start_async(), etc, functions need a "proc" callback that conforms
to a particular interface. Not every callback needs every parameter
(e.g., the caller might not even ask to open an input descriptor, in
which case there is no point in the callback looking at it). Let's mark
these for -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:56 -07:00
555ff1c8a4 mark unused read_tree_recursive() callback parameters
We pass a callback to read_tree_recursive(), but not every callback
needs every parameter. Let's mark the unused ones to satisfy
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:56 -07:00
02c3c59e62 hashmap: mark unused callback parameters
Hashmap comparison functions must conform to a particular callback
interface, but many don't use all of their parameters. Especially the
void cmp_data pointer, but some do not use keydata either (because they
can easily form a full struct to pass when doing lookups). Let's mark
these to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:55 -07:00
783a86c142 config: mark unused callback parameters
The callback passed to git_config() must conform to a particular
interface. But most callbacks don't actually look at the extra "void
*data" parameter. Let's mark the unused parameters to make
-Wunused-parameter happy.

Note there's one unusual case here in get_remote_default() where we
actually ignore the "value" parameter. That's because it's only checking
whether the option is found at all, and not parsing its value.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:55 -07:00
9f5a9de7c8 streaming: mark unused virtual method parameters
Streaming "open" functions need to conform to the same virtual function
interface, but not every implementation needs every parameter. Mark the
unused ones as such to appease -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:55 -07:00
f7d5741279 transport: mark bundle transport_options as unused
get_refs_from_bundle() is a virtual function which must match the
signature of other transports, but it doesn't look at its
transport_options at all. This isn't a bug, because not all transports
necessarily support all options. Let's mark it as unused to appease
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:55 -07:00
7718827a2d refs: mark unused virtual method parameters
The refs code uses various polymorphic types (e.g., loose vs packed
ref_stores, abstracted iterators). Not every virtual function or
callback needs all of its parameters. Let's mark the unused ones to
quiet -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:55 -07:00
c006e9fa59 refs: mark unused reflog callback parameters
Functions used with for_each_reflog_ent() need to conform to a
particular interface, but not every function needs all of the
parameters. Mark the unused ones to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:54 -07:00
63e14ee2d6 refs: mark unused each_ref_fn parameters
Functions used with for_each_ref(), etc, need to conform to the
each_ref_fn interface. But most of them don't need every parameter;
let's annotate the unused ones to quiet -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:54 -07:00
9b24034754 git-compat-util: add UNUSED macro
In preparation for compiling with -Wunused-parameter, we'd like to be
able to annotate some function parameters as false positives (e.g.,
parameters which must exist to conform to a callback interface).

Ideally our annotation will:

  - be portable, turning into nothing on platforms which don't support
    it

  - be easy to read, without looking too syntactically odd or taking
    attention away from the rest of the parameters

  - help us notice when a parameter marked as unused is actually used,
    which keeps our annotations accurate. In theory a compiler could
    tell us this easily, but gcc has no such warning. Clang has
    -Wused-but-marked-unused, but it triggers false positives with our
    MAYBE_UNUSED annotation (e.g., for commit-slab functions)

This patch introduces an UNUSED() macro which takes the parameter name
as an argument. That lets us tweak the name in such a way that we'll
notice if somebody tries to use it. It looks like this in use:

  int some_ref_cb(const char *refname,
                  const struct object_id *UNUSED(oid),
                  int UNUSED(flags),
                  void *UNUSED(data))
  {
        printf("got refname %s", refname);
        return 0;
  }

Because the unused parameter names are rewritten behind the scenes to
UNUSED_oid, etc, adding code like:

  printf("oid is %s", oid_to_hex(oid));

will fail compilation with "oid undeclared". Sadly, the "did you mean"
feature of modern compilers is not generally smart enough to suggest the
"unused" name. If we used a very short prefix like U_oid, that does
convince gcc to say "did you mean", but since the "U_" in the suggestion
isn't much of a hint, it doesn't really help. In practice, a look at the
function definition usually makes the problem pretty obvious.

Note that we have to put the definition of UNUSED early in
git-compat-util.h, because it will eventually be used for some compat
functions themselves (both directly here and in mingw.h).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:54 -07:00
398c4ff582 builtin/worktree.c: let parse-options parse subcommands
'git worktree' parses its subcommands with a long list of if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:16 -07:00
2b057d97d7 builtin/stash.c: let parse-options parse subcommands
'git stash' parses its subcommands with a long list of if-else if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
and listing subcommands for Bash completion.

Note that the push_stash() function implementing the 'push' subcommand
accepts an extra flag parameter to indicate whether push was assumed,
so add a wrapper function with the standard subcommand function
signature.

Note also that this change "hides" the '-h' option in 'git stash push
-h' from the parse_option() call in cmd_stash(), as it comes after the
subcommand.  Consequently, from now on it will emit the usage of the
'push' subcommand instead of the usage of 'git stash'.  We had a
failing test for this case, which can now be flipped to expect
success.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:16 -07:00
1c3502b198 builtin/sparse-checkout.c: let parse-options parse subcommands
'git sparse-checkout' parses its subcommands with a couple of if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

Note that some of the functions implementing each subcommand only
accept the 'argc' and '**argv' parameters, so add a (unused) '*prefix'
parameter to make them match the type expected by parse-options, and
thus avoid casting function pointers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:16 -07:00
b26a412f1e builtin/remote.c: let parse-options parse subcommands
'git remote' parses its subcommands with a long list of if-else if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling unknown subcommands, and listing subcommands for Bash
completion.  Make sure that the default operation mode doesn't accept
any arguments; and while at it remove the capitalization of the error
message and adjust the test checking it accordingly.

Note that 'git remote' has both 'remove' and 'rm' subcommands, and the
former is preferred [1], so hide the latter for completion.

Note also that the functions implementing each subcommand only accept
the 'argc' and '**argv' parameters, so add a (unused) '*prefix'
parameter to make them match the type expected by parse-options, and
thus avoid casting a bunch of function pointers.

[1] e17dba8fe1 (remote: prefer subcommand name 'remove' to 'rm',
    2012-09-06)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:16 -07:00
729b97332b builtin/reflog.c: let parse-options parse subcommands
'git reflog' parses its subcommands with a couple of if-else if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
and listing subcommands for Bash completion.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:16 -07:00
54ef7676ba builtin/notes.c: let parse-options parse subcommands
'git notes' parses its subcommands with a long list of if-else if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling unknown subcommands, and listing subcommands for Bash
completion.  Make sure that the default operation mode doesn't accept
any arguments.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
bf0a6b65fc builtin/multi-pack-index.c: let parse-options parse subcommands
'git multi-pack-index' parses its subcommands with a couple of if-else
if statements.  parse-options has just learned to parse subcommands,
so let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

Note that the functions implementing each subcommand only accept the
'argc' and '**argv' parameters, so add a (unused) '*prefix' parameter
to make them match the type expected by parse-options, and thus avoid
casting function pointers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
f83736ce9d builtin/hook.c: let parse-options parse subcommands
'git hook' parses its currently only subcommand with an if statement.
parse-options has just learned to parse subcommands, so let's use that
facility instead, with the benefits of shorter code, handling missing
or unknown subcommands, and listing subcommands for Bash completion.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
0350954482 builtin/gc.c: let parse-options parse 'git maintenance's subcommands
'git maintenanze' parses its subcommands with a couple of if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

This change makes 'git maintenance' consistent with other commands in
that the help text shown for '-h' goes to standard output, not error,
in the exit code and error message on unknown subcommand, and the
error message on missing subcommand.  There is a test checking these,
which is now updated accordingly.

Note that some of the functions implementing each subcommand don't
accept any parameters, so add the (unused) 'argc', '**argv' and
'*prefix' parameters to make them match the type expected by
parse-options, and thus avoid casting function pointers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
1c3b05170a builtin/commit-graph.c: let parse-options parse subcommands
'git commit-graph' parses its subcommands with an if-else if
statement.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

Note that the functions implementing each subcommand only accept the
'argc' and '**argv' parameters, so add a (unused) '*prefix' parameter
to make them match the type expected by parse-options, and thus avoid
casting function pointers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
aef7d75e58 builtin/bundle.c: let parse-options parse subcommands
'git bundle' parses its subcommands with a couple of if-else if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
fa83cc834d parse-options: add support for parsing subcommands
Several Git commands have subcommands to implement mutually exclusive
"operation modes", and they usually parse their subcommand argument
with a bunch of if-else if statements.

Teach parse-options to handle subcommands as well, which will result
in shorter and simpler code with consistent error handling and error
messages on unknown or missing subcommand, and it will also make
possible for our Bash completion script to handle subcommands
programmatically.

The approach is guided by the following observations:

  - Most subcommands [1] are implemented in dedicated functions, and
    most of those functions [2] either have a signature matching the
    'int cmd_foo(int argc, const char **argc, const char *prefix)'
    signature of builtin commands or can be trivially converted to
    that signature, because they miss only that last prefix parameter
    or have no parameters at all.

  - Subcommand arguments only have long form, and they have no double
    dash prefix, no negated form, and no description, and they don't
    take any arguments, and can't be abbreviated.

  - There must be exactly one subcommand among the arguments, or zero
    if the command has a default operation mode.

  - All arguments following the subcommand are considered to be
    arguments of the subcommand, and, conversely, arguments meant for
    the subcommand may not preceed the subcommand.

So in the end subcommand declaration and parsing would look something
like this:

    parse_opt_subcommand_fn *fn = NULL;
    struct option builtin_commit_graph_options[] = {
        OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"),
                   N_("the object directory to store the graph")),
        OPT_SUBCOMMAND("verify", &fn, graph_verify),
        OPT_SUBCOMMAND("write", &fn, graph_write),
        OPT_END(),
    };
    argc = parse_options(argc, argv, prefix, options,
                         builtin_commit_graph_usage, 0);
    return fn(argc, argv, prefix);

Here each OPT_SUBCOMMAND specifies the name of the subcommand and the
function implementing it, and the address of the same 'fn' subcommand
function pointer.  parse_options() then processes the arguments until
it finds the first argument matching one of the subcommands, sets 'fn'
to the function associated with that subcommand, and returns, leaving
the rest of the arguments unprocessed.  If none of the listed
subcommands is found among the arguments, parse_options() will show
usage and abort.

If a command has a default operation mode, 'fn' should be initialized
to the function implementing that mode, and parse_options() should be
invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag.  In this case
parse_options() won't error out when not finding any subcommands, but
will return leaving 'fn' unchanged.  Note that if that default
operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT
flag is necessary as well (otherwise parse_options() would error out
upon seeing the unknown option meant to the default operation mode).

Some thoughts about the implementation:

  - The same pointer to 'fn' must be specified as 'value' for each
    OPT_SUBCOMMAND, because there can be only one set of mutually
    exclusive subcommands; parse_options() will BUG() otherwise.

    There are other ways to tell parse_options() where to put the
    function associated with the subcommand given on the command line,
    but I didn't like them:

      - Change parse_options()'s signature by adding a pointer to
        subcommand function to be set to the function associated with
        the given subcommand, affecting all callsites, even those that
        don't have subcommands.

      - Introduce a specific parse_options_and_subcommand() variant
        with that extra funcion parameter.

  - I decided against automatically calling the subcommand function
    from within parse_options(), because:

      - There are commands that have to perform additional actions
        after option parsing but before calling the function
        implementing the specified subcommand.

      - The return code of the subcommand is usually the return code
        of the git command, but preserving the return code of the
        automatically called subcommand function would have made the
        API awkward.

  - Also add a OPT_SUBCOMMAND_F() variant to allow specifying an
    option flag: we have two subcommands that are purposefully
    excluded from completion ('git remote rm' and 'git stash save'),
    so they'll have to be specified with the PARSE_OPT_NOCOMPLETE
    flag.

  - Some of the 'parse_opt_flags' don't make sense with subcommands,
    and using them is probably just an oversight or misunderstanding.
    Therefore parse_options() will BUG() when invoked with any of the
    following flags while the options array contains at least one
    OPT_SUBCOMMAND:

      - PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing
        arguments when encountering a "--" argument, so it doesn't
        make sense to expect and keep one before a subcommand, because
        it would prevent the parsing of the subcommand.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash
        might be meaningful for the command's default operation mode,
        e.g. to disambiguate refs and pathspecs.

      - PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag
        tells parse_options() to stop as soon as it encouners a
        non-option argument, but subcommands are by definition not
        options...  so how could they be parsed, then?!

      - PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any
        unknown --options and then pass them to a different command or
        subsystem.  Surely if a command has subcommands, then this
        functionality should rather be delegated to one of those
        subcommands, and not performed by the command itself.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass
        --options to the default operation mode.

  - If the command with subcommands has a default operation mode, then
    all arguments to the command must preceed the arguments of the
    subcommand.

    AFAICT we don't have any commands where this makes a difference,
    because in those commands either only the command accepts any
    arguments ('notes' and 'remote'), or only the default subcommand
    ('reflog' and 'stash'), but never both.

  - The 'argv' array passed to subcommand functions currently starts
    with the name of the subcommand.  Keep this behavior.  AFAICT no
    subcommand functions depend on the actual content of 'argv[0]',
    but the parse_options() call handling their options expects that
    the options start at argv[1].

  - To support handling subcommands programmatically in our Bash
    completion script, 'git cmd --git-completion-helper' will now list
    both subcommands and regular --options, if any.  This means that
    the completion script will have to separate subcommands (i.e.
    words without a double dash prefix) from --options on its own, but
    that's rather easy to do, and it's not much work either, because
    the number of subcommands a command might have is rather low, and
    those commands accept only a single --option or none at all.  An
    alternative would be to introduce a separate option that lists
    only subcommands, but then the completion script would need not
    one but two git invocations and command substitutions for commands
    with subcommands.

    Note that this change doesn't affect the behavior of our Bash
    completion script, because when completing the --option of a
    command with subcommands, e.g. for 'git notes --<TAB>', then all
    subcommands will be filtered out anyway, as none of them will
    match the word to be completed starting with that double dash
    prefix.

[1] Except 'git rerere', because many of its subcommands are
    implemented in the bodies of the if-else if statements parsing the
    command's subcommand argument.

[2] Except 'credential', 'credential-store' and 'fsmonitor--daemon',
    because some of the functions implementing their subcommands take
    special parameters.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
dc9f98832b parse-options: drop leading space from '--git-completion-helper' output
The output of 'git <cmd> --git-completion-helper' always starts with a
space, e.g.:

  $ git config --git-completion-helper
   --global --system --local [...]

This doesn't matter for the completion script, because field splitting
discards that space anyway.

However, later patches in this series will teach parse-options to
handle subcommands, and subcommands will be included in the completion
helper output as well.  This will make the loop printing options (and
subcommands) a tad more complex, so I wanted to test the result.  The
test would have to account for the presence of that leading space,
which bugged my OCD, so let's get rid of it.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
a9126b92a2 parse-options: clarify the limitations of PARSE_OPT_NODASH
Update the comment documenting 'struct option' to clarify that
PARSE_OPT_NODASH can only be an argumentless short option; see
51a9949eda (parseopt: add PARSE_OPT_NODASH, 2009-05-07).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
99d86d60e5 parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown
arguments instead of erroring out".  This is a bit misleading, as this
flag only applies to unknown --options, while non-option arguments are
kept even without this flag.

Update the description to clarify this, and rename the flag to
PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at
the flag name.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
80882bc5e7 api-parse-options.txt: fix description of OPT_CMDMODE
The description of the 'OPT_CMDMODE' macro states that "enum_val is
set to int_var when ...", but it's the other way around, 'int_var' is
set to 'enum_val'.  Fix this.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
c1b117d31c t0040-parse-options: test parse_options() with various 'parse_opt_flags'
In 't0040-parse-options.sh' we thoroughly test the parsing of all
types and forms of options, but in all those tests parse_options() is
always invoked with a 0 flags parameter.

Add a few tests to demonstrate how various 'enum parse_opt_flags'
values are supposed to influence option parsing.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
31a66c1964 t5505-remote.sh: check the behavior without a subcommand
'git remote' without a subcommand defaults to listing all remotes and
doesn't accept any arguments except the '-v|--verbose' option.

We are about to teach parse-options to handle subcommands, and update
'git remote' to make use of that new feature.  So let's add some tests
to make sure that the upcoming changes don't inadvertently change the
behavior in these cases.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:13 -07:00
9e4658d5c6 t3301-notes.sh: check that default operation mode doesn't take arguments
'git notes' without a subcommand defaults to listing all notes and
doesn't accept any arguments.

We are about to teach parse-options to handle subcommands, and update
'git notes' to make use of that new feature.  So let's add a test to
make sure that the upcoming changes don't inadvertenly change the
behavior in this corner case.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:13 -07:00
66fa6e8ed8 git.c: update NO_PARSEOPT markings
Our Bash completion script can complete --options for commands using
parse-options even when that command doesn't have a dedicated
completion function, but to do so the completion script must know
which commands use parse-options and which don't.  Therefore, commands
not using parse-options are marked in 'git.c's command list with the
NO_PARSEOPT flag.

Update this list, and remove this flag from the commands that by now
use parse-options.

After this change we can TAB complete --options of the plumbing
commands 'commit-tree', 'mailinfo' and 'mktag'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:13 -07:00
e03acd0d4a git-prompt: show presence of unresolved conflicts at command prompt
If GIT_PS1_SHOWCONFLICTSTATE is set to "yes", show the word "CONFLICT"
on the command prompt when there are unresolved conflicts.

Example prompt: (main|CONFLICT)

Signed-off-by: Justin Donnelly <justinrdonnelly@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 10:58:40 -07:00
257418c590 revision: allow --ancestry-path to take an argument
We have long allowed users to run e.g.
    git log --ancestry-path master..seen
which shows all commits which satisfy all three of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have master as an ancestor

This commit allows another variant:
    git log --ancestry-path=$TOPIC master..seen
which shows all commits which satisfy all of these criteria:
  * are an ancestor of seen
  * are not an ancestor of master
  * have $TOPIC in their ancestry-path
that last bullet can be defined as commits meeting any of these
criteria:
    * are an ancestor of $TOPIC
    * have $TOPIC as an ancestor
    * are $TOPIC

This also allows multiple --ancestry-path arguments, which can be
used to find commits with any of the given topics in their ancestry
path.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 10:45:08 -07:00
1838e21cff t6019: modernize tests with helper
The tests in t6019 are repetitive, so create a helper that greatly
simplifies the test script.

In addition, update the common pattern that places 'git rev-list' on the
left side of a pipe, which can hide some exit codes. Send the output to
a 'raw' file that is then consumed by other tools so the Git exit code
is verified as zero.  And since we're using --format anyway, switch to
`git log`, so that we get the desired format and can avoid using sed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 10:45:07 -07:00
11ea33ce44 rev-list-options.txt: fix simple typo
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 10:45:07 -07:00
ff033db7a8 merge-ort: remove code obsoleted by other changes
Commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE handling with
conflicted entries", 2021-03-20) added some code for merge-ort to handle
conflicted and skip_worktree entries in general.  Included in this was
an ugly hack for dealing with present-despite-skipped entries and a
testcase (t6428.2) specific to that hack, since at that time users could
accidentally get files into that state when using a sparse checkout.

However, with the merging of 82386b4496 ("Merge branch
'en/present-despite-skipped'", 2022-03-09), that class of problems was
addressed globally and in a much cleaner way.  As such, the
present-despite-skipped hack in merge-ort is no longer needed and can
simply be removed.

No additional testcase is needed here; t6428.2 was written to test the
necessary functionality and is being kept.  The fact that this test
continues to pass despite the code being removed shows that the extra
code is no longer necessary.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 10:43:41 -07:00
8e2841890a scalar: update technical doc roadmap with FSMonitor support
Update the Scalar roadmap to reflect completion of enabling the built-in
FSMonitor in Scalar.

Note that implementation of 'scalar help' was moved to the final set of
changes to move Scalar out of 'contrib/'. This is due to a dependency on
changes to 'git help', as all changes to the main Git tree *exclusively*
implemented to support Scalar are part of that series.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:32 -07:00
ec4c23116b scalar unregister: stop FSMonitor daemon
Especially on Windows, we will need to stop that daemon, just in case
that the directory needs to be removed (the daemon would otherwise hold
a handle to that directory, preventing it from being deleted).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:32 -07:00
3f1917dc60 scalar: enable built-in FSMonitor on register
Using the built-in FSMonitor makes many common commands quite a bit
faster. So let's teach the `scalar register` command to enable the
built-in FSMonitor and kick-start the fsmonitor--daemon process (for
convenience).

For simplicity, we only support the built-in FSMonitor (and no external
file system monitor such as e.g. Watchman).

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:32 -07:00
d934a11c71 scalar: move config setting logic into its own function
Create function 'set_scalar_config()' to contain the logic used in setting
Scalar-defined Git config settings, including how to handle reconfiguring &
overwriting existing values. This function allows future patches to set
config values in parts of 'scalar.c' other than 'set_recommended_config()'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:32 -07:00
9b24bb9205 scalar-delete: do not 'die()' in 'delete_enlistment()'
Rather than exiting with 'die()' when 'delete_enlistment()' encounters an
error, return an error code with the appropriate message. There's no need
for an abrupt exit with 'die()' in 'delete_enlistment()' because its only
caller ('cmd_delete()') properly cleans up allocated resources and returns
the 'delete_enlistment()' return value as its own exit code.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:32 -07:00
d2a79bc953 scalar-[un]register: clearly indicate source of error
When a step in 'register_dir()' or 'unregister_dir()' fails, indicate which
step failed with an error message, rather than silently assigning a nonzero
return code.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:32 -07:00
adedcee811 scalar-unregister: handle error codes greater than 0
When 'scalar unregister' tries to disable maintenance and remove an
enlistment, ensure that the return value is nonzero if either operation
produces *any* nonzero return value, not just when they return a value less
than 0.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:31 -07:00
65f6a9eb0b scalar: constrain enlistment search
Make the search for repository and enlistment root in
'setup_enlistment_directory()' more constrained to simplify behavior and
adhere to 'GIT_CEILING_DIRECTORIES'.

Previously, 'setup_enlistment_directory()' would check whether the provided
path (or current working directory) '<dir>' or its subdirectory '<dir>/src'
was a repository root. If not, the process would repeat on the parent of
'<dir>' until the repository was found or it reached the root of the
filesystem. This meant that a user could specify a path *anywhere* inside an
enlistment (including paths not in the repository contained within the
enlistment) and it would be found.

The downside to this process is that the search would not account for
'GIT_CEILING_DIRECTORIES', so the upward search could result in modifying
repository contents past 'GIT_CEILING_DIRECTORIES'. Similarly, operations
like 'scalar delete' could end up unintentionally deleting the parent of a
repo if its root was named 'src'.

To make this 'setup_enlistment_directory()' both adhere to
'GIT_CEILING_DIRECTORIES' and avoid unwanted deletions, the search for an
enlistment directory is simplified to:

- if '<dir>/src' is a repository root, '<dir>' is the enlistment root
- if '<dir>' is either the repository root or contained within a repository,
  the repository root is the enlistment root

Now, only 'setup_git_directory()' (called by 'setup_enlistment_directory()')
searches upwards from the 'scalar' specified path, enforcing
'GIT_CEILING_DIRECTORIES' in the process. Additionally, 'scalar delete
<dir>/src' will not delete '<dir>' (if users would like to delete it, they
can still specify the enlistment root with 'scalar delete <dir>'). This is
true of any 'scalar' operation; users can invoke 'scalar' on the enlistment
root, but paths must otherwise be inside the repository to be valid.

To help clarify the updated behavior, new tests are added to
't9099-scalar.sh'.

Finally, this change leaves 'strbuf_parent_directory()' with only a single,
WIN32-specific caller in 'delete_enlistment()'. Rather than wrap
'strbuf_parent_directory()' in '#ifdef WIN32' to avoid the "unused function"
compiler error, move the contents of 'strbuf_parent_directory()' into
'delete_enlistment()' and remove the function.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 21:35:31 -07:00
795ea8776b The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 13:07:05 -07:00
fddd8b4801 Merge branch 'll/disk-usage-humanise'
"git rev-list --disk-usage" learned to take an optional value
"human" to show the reported value in human-readable format, like
"3.40MiB".

* ll/disk-usage-humanise:
  rev-list: support human-readable output for `--disk-usage`
2022-08-18 13:07:05 -07:00
9b9445cfde Merge branch 'sy/sparse-rm'
"git rm" has become more aware of the sparse-index feature.

* sy/sparse-rm:
  rm: integrate with sparse-index
  rm: expand the index only when necessary
  pathspec.h: move pathspec_needs_expanded_index() from reset.c to here
  t1092: add tests for `git-rm`
2022-08-18 13:07:05 -07:00
80ffc849bd Merge branch 'vd/sparse-reset-checkout-fixes'
Fixes to sparse index compatibility work for "reset" and "checkout"
commands.

* vd/sparse-reset-checkout-fixes:
  unpack-trees: unpack new trees as sparse directories
  cache.h: create 'index_name_pos_sparse()'
  oneway_diff: handle removed sparse directories
  checkout: fix nested sparse directory diff in sparse index
2022-08-18 13:07:04 -07:00
0d133a3dcf Merge branch 'ds/bundle-uri-more'
The "bundle URI" design gets documented.

* ds/bundle-uri-more:
  bundle-uri: add example bundle organization
  docs: document bundle URI standard
2022-08-18 13:07:04 -07:00
363a193c3a Merge branch 'jk/fsck-tree-mode-bits-fix'
"git fsck" reads mode from tree objects but canonicalizes the mode
before passing it to the logic to check object sanity, which has
hid broken tree objects from the checking logic.  This has been
corrected, but to help exiting projects with broken tree objects
that they cannot fix retroactively, the severity of anomalies this
code detects has been demoted to "info" for now.

* jk/fsck-tree-mode-bits-fix:
  fsck: downgrade tree badFilemode to "info"
  fsck: actually detect bad file modes in trees
  tree-walk: add a mechanism for getting non-canonicalized modes
2022-08-18 13:07:04 -07:00
4d8074bf8e Merge branch 'fc/vimdiff-layout-vimdiff3-fix'
"vimdiff3" regression fix.

* fc/vimdiff-layout-vimdiff3-fix:
  mergetools: vimdiff: simplify tabfirst
  mergetools: vimdiff: fix single window layouts
  mergetools: vimdiff: rework tab logic
  mergetools: vimdiff: fix for diffopt
  mergetools: vimdiff: silence annoying messages
  mergetools: vimdiff: make vimdiff3 actually work
  mergetools: vimdiff: fix comment
2022-08-18 13:07:04 -07:00
58ded4a4dc Merge branch 'po/doc-add-renormalize'
Documentation for "git add --renormalize" has been improved.

* po/doc-add-renormalize:
  doc add: renormalize is not idempotent for CRCRLF
2022-08-18 13:07:03 -07:00
565577ed88 merge-ort: provide helpful submodule update message when possible
In commit 4057523a40 ("submodule merge: update conflict error message",
2022-08-04), a more detailed message was provided when submodules
conflict, in order to help users know how to resolve those conflicts.
There were a couple situations for which a different message would be
more appropriate, but that commit left handling those for future work.
Unfortunately, that commit would check if any submodules were of the
type that it didn't know how to explain, and, if so, would avoid
providing the more detailed explanation even for the submodules it did
know how to explain.

Change this to have the code print the helpful messages for the subset
of submodules it knows how to explain.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 09:49:30 -07:00
34ce504a33 merge-ort: avoid surprise with new sub_flag variable
Commit 4057523a40 ("submodule merge: update conflict error message",
2022-08-04) added a sub_flag variable that is used to store a value from
enum conflict_and_info_types, but initializes it with a value of -1 that
does not correspond to any of the conflict_and_info_types.  The code may
never set it to a valid value and yet still use it, which can be
surprising when reading over the code at first.  Initialize it instead
to the generic CONFLICT_SUBMODULE_FAILED_TO_MERGE value, which is still
distinct from the two values we need to special case.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 09:49:30 -07:00
a5834b775b merge-ort: remove translator lego in new "submodule conflict suggestion"
In commit 4057523a40 ("submodule merge: update conflict error message",
2022-08-04), the new "submodule conflict suggestion" code was
translating 6 different pieces of the new message and then used
carefully crafted logic to allow stitching it back together with special
formatting.  Keep the components of the message together as much as
possible, so that:
  * we reduce the number of things translators have to translate
  * translators have more control over the format of the output
  * the code is much easier for developers to understand too

Also, reformat some comments running beyond the 80th column while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-18 09:49:30 -07:00
716c1f649e pipe_command(): mark stdin descriptor as non-blocking
Our pipe_command() helper lets you both write to and read from a child
process on its stdin/stdout. It's supposed to work without deadlocks
because we use poll() to check when descriptors are ready for reading or
writing. But there's a bug: if both the data to be written and the data
to be read back exceed the pipe buffer, we'll deadlock.

The issue is that the code assumes that if you have, say, a 2MB buffer
to write and poll() tells you that the pipe descriptor is ready for
writing, that calling:

  write(cmd->in, buf, 2*1024*1024);

will do a partial write, filling the pipe buffer and then returning what
it did write. And that is what it would do on a socket, but not for a
pipe. When writing to a pipe, at least on Linux, it will block waiting
for the child process to read() more. And now we have a potential
deadlock, because the child may be writing back to us, waiting for us to
read() ourselves.

An easy way to trigger this is:

  git -c add.interactive.useBuiltin=true \
      -c interactive.diffFilter=cat \
      checkout -p HEAD~200

The diff against HEAD~200 will be big, and the filter wants to write all
of it back to us (obviously this is a dummy filter, but in the real
world something like diff-highlight would similarly stream back a big
output).

If you set add.interactive.useBuiltin to false, the problem goes away,
because now we're not using pipe_command() anymore (instead, that part
happens in perl). But this isn't a bug in the interactive code at all.
It's the underlying pipe_command() code which is broken, and has been
all along.

We presumably didn't notice because most calls only do input _or_
output, not both. And the few that do both, like gpg calls, may have
large inputs or outputs, but never both at the same time (e.g., consider
signing, which has a large payload but a small signature comes back).

The obvious fix is to put the descriptor into non-blocking mode, and
indeed, that makes the problem go away. Callers shouldn't need to
care, because they never see the descriptor (they hand us a buffer to
feed into it).

The included test fails reliably on Linux without this patch. Curiously,
it doesn't fail in our Windows CI environment, but has been reported to
do so for individual developers. It should pass in any environment after
this patch (courtesy of the compat/ layers added in the last few
commits).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:41 -07:00
c6d3cce6f3 pipe_command(): handle ENOSPC when writing to a pipe
When write() to a non-blocking pipe fails because the buffer is full,
POSIX says we should see EAGAIN. But our mingw_write() compat layer on
Windows actually returns ENOSPC for this case. This is probably
something we want to correct, but given that we don't plan to use
non-blocking descriptors in a lot of places, we can work around it by
just catching ENOSPC alongside EAGAIN. If we ever do fix mingw_write(),
then this patch can be reverted.

We don't actually use a non-blocking pipe yet, so this is still just
preparation.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:41 -07:00
14eab817e4 pipe_command(): avoid xwrite() for writing to pipe
If xwrite() sees an EAGAIN response, it will loop forever until the
write succeeds (or encounters a real error). This is due to ef1cf0167a
(xwrite: poll on non-blocking FDs, 2016-06-26), with the idea that we
won't be surprised by a descriptor unexpectedly set as non-blocking.

But that will make things awkward when we do want a non-blocking
descriptor, and a future patch will switch pipe_command() to using one.
In that case, looping on EAGAIN is bad, because the process on the other
end of the pipe may be waiting on us before doing another read() on the
pipe, which would mean we deadlock.

In practice we're not supposed to ever see EAGAIN here, since poll()
will have just told us the descriptor is ready for writing. But our
Windows emulation of poll() will always return "ready" for writing to a
pipe descriptor! This is due to 94f4d01932 (mingw: workaround for hangs
when sending STDIN, 2020-02-17).

Our best bet in that case is to keep handling other descriptors, as any
read() we do may allow the child command to make forward progress (i.e.,
its write() finishes, and then it read()s from its stdin, freeing up
space in the pipe buffer). This means we might busy-loop between poll()
and write() on Windows if the child command is slow to read our input,
but it's much better than the alternative of deadlocking.

In practice, this busy-looping should be rare:

  - for small inputs, we'll just write the whole thing in a single
    write() anyway, non-blocking or not

  - for larger inputs where the child reads input and then processes it
    before writing (e.g., gpg verifying a signature), we may make a few
    extra write() calls that get EAGAIN during the initial write, but
    once it has taken in the whole input, we'll correctly block waiting
    to read back the data.

  - for larger inputs where the child process is streaming output back
    (like a diff filter), we'll likewise see some extra EAGAINs, but
    most of them will be followed immediately by a read(), which will
    let the child command make forward progress.

Of course it won't happen at all for now, since we don't yet use a
non-blocking pipe. This is just preparation for when we do.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:40 -07:00
ec4f39b233 git-compat-util: make MAX_IO_SIZE define globally available
We define MAX_IO_SIZE within wrapper.c, but it's useful for any code
that wants to do a raw write() for whatever reason (say, because they
want different EAGAIN handling). Let's make it available everywhere.

The alternative would be adding xwrite_foo() variants to give callers
more options. But there's really no reason MAX_IO_SIZE needs to be
abstracted away, so this give callers the most flexibility.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:40 -07:00
24b56ae4ae nonblock: support Windows
Implement enable_pipe_nonblock() using the Windows API. This works only
for pipes, but that is sufficient for this limited interface. Despite
the API calls used, it handles both "named" and anonymous pipes from our
pipe() emulation.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:40 -07:00
10f743389c compat: add function to enable nonblocking pipes
We'd like to be able to make some of our pipes nonblocking so that
poll() can be used effectively, but O_NONBLOCK isn't portable. Let's
introduce a compat wrapper so this can be abstracted for each platform.

The interface is as narrow as possible to let platforms do what's
natural there (rather than having to implement fcntl() and a fake
O_NONBLOCK for example, or having to handle other types of descriptors).

The next commit will add Windows support, at which point we should be
covering all platforms in practice. But if we do find some other
platform without O_NONBLOCK, we'll return ENOSYS. Arguably we could just
trigger a build-time #error in this case, which would catch the problem
earlier. But since we're not planning to use this compat wrapper in many
code paths, a seldom-seen runtime error may be friendlier for such a
platform than blocking compilation completely. Our test suite would
still notice it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:40 -07:00
a29263cf5f fetch-pack: add tracing for negotiation rounds
Currently, negotiation for V0/V1/V2 fetch have trace2 regions covering
the entire negotiation process. However, we'd like additional data, such
as timing for each round of negotiation or the number of "haves" in each
round. Additionally, "independent negotiation" (AKA push negotiation)
has no tracing at all. Having this data would allow us to compare the
performance of the various negotation implementations, and to debug
unexpectedly slow fetch & push sessions.

Add per-round trace2 regions for all negotiation implementations (V0+V1,
V2, and independent negotiation), as well as an overall region for
independent negotiation. Add trace2 data logging for the number of haves
and "in vain" objects for each round, and for the total number of rounds
once negotiation completes.  Finally, add a few checks into various
tests to verify that the number of rounds is logged as expected.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-15 09:17:03 -07:00
9bf691b78c The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-14 23:19:28 -07:00
7fac7b563b Merge branch 'js/safe-directory-plus'
Platform-specific code that determines if a directory is OK to use
as a repository has been taught to report more details, especially
on Windows.

* js/safe-directory-plus:
  mingw: handle a file owned by the Administrators group correctly
  mingw: be more informative when ownership check fails on FAT32
  mingw: provide details about unsafe directories' ownership
  setup: prepare for more detailed "dubious ownership" messages
  setup: fix some formatting
2022-08-14 23:19:28 -07:00
7d0a1c8895 Merge branch 'pw/use-glibc-tunable-for-malloc-optim'
Avoid repeatedly running getconf to ask libc version in the test
suite, and instead just as it once per script.

* pw/use-glibc-tunable-for-malloc-optim:
  tests: cache glibc version check
2022-08-14 23:19:28 -07:00
c0f6dd49f1 Merge branch 'ab/tech-docs-to-help'
Expose a lot of "tech docs" via "git help" interface.

* ab/tech-docs-to-help:
  docs: move http-protocol docs to man section 5
  docs: move cruft pack docs to gitformat-pack
  docs: move pack format docs to man section 5
  docs: move signature docs to man section 5
  docs: move index format docs to man section 5
  docs: move protocol-related docs to man section 5
  docs: move commit-graph format docs to man section 5
  git docs: add a category for file formats, protocols and interfaces
  git docs: add a category for user-facing file, repo and command UX
  git help doc: use "<doc>" instead of "<guide>"
  help.c: remove common category behavior from drop_prefix() behavior
  help.c: refactor drop_prefix() to use a "switch" statement"
2022-08-14 23:19:28 -07:00
3adacc2817 Merge branch 'jc/rerere-autoupdate-doc'
Update documentation on the "--[no-]rerere-autoupdate" option.

* jc/rerere-autoupdate-doc:
  doc: clarify rerere-autoupdate
  doc: consolidate --rerere-autoupdate description
2022-08-14 23:19:27 -07:00
d86ac14dd7 Merge branch 'ab/hooks-regression-fix'
A follow-up fix to a fix for a regression in 2.36.

* ab/hooks-regression-fix:
  hook API: don't segfault on strbuf_addf() to NULL "out"
2022-08-14 23:19:27 -07:00
4d1d843be7 tests: use the new C rot13-filter helper to avoid PERL prereq
The previous commit implemented a C version of the t0021/rot13-filter.pl
script. Let's use this new C helper to eliminate the PERL prereq from
various tests, and also remove the superseded Perl script.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-14 22:57:12 -07:00
52917a998e t0021: implementation the rot13-filter.pl script in C
This script is currently used by three test files: t0021-conversion.sh,
t2080-parallel-checkout-basics.sh, and
t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL
dependency at these tests, let's convert the script to a C test-tool
command. The following commit will take care of actually modifying the
said tests to use the new C helper and removing the Perl script.

The Perl script flushes the log file handler after each write. As
commented in [1], this seems to be an early design decision that was
later reconsidered, but possibly ended up being left in the code by
accident:

	>> +$debug->flush();
	>
	> Isn't $debug flushed automatically?

	Maybe, but autoflush is not explicitly enabled. I will
	enable it again (I disabled it because of Eric's comment
	but I re-read the comment and he is only talking about
	pipes).

Anyways, this behavior is not really needed for the tests and the
flush() calls make the code slightly larger, so let's avoid them
altogether in the new C version.

[1]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-14 22:57:12 -07:00
bed8947751 t0021: avoid grepping for a Perl-specific string at filter output
This test sets the t0021/rot13-filter.pl script as a long-running
process filter for a git checkout command. It then expects the filter to
fail producing a specific error message at stderr. In the following
commits we are going to replace the script with a C test-tool helper,
but the test currently expects the error message in a Perl-specific
format. That is, when you call `die <msg>` in Perl, it emits
"<msg> at - line 1." In preparation for the conversion, let's avoid the
Perl-specific part and only grep for <msg> itself.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-14 22:57:11 -07:00
1490d7d82d is_promisor_object(): fix use-after-free of tree buffer
Since commit fcc07e980b (is_promisor_object(): free tree buffer after
parsing, 2021-04-13), we'll always free the buffers attached to a
"struct tree" after searching them for promisor links. But there's an
important case where we don't want to do so: if somebody else is already
using the tree!

This can happen during a "rev-list --missing=allow-promisor" traversal
in a partial clone that is missing one or more trees or blobs. The
backtrace for the free looks like this:

      #1 free_tree_buffer tree.c:147
      #2 add_promisor_object packfile.c:2250
      #3 for_each_object_in_pack packfile.c:2190
      #4 for_each_packed_object packfile.c:2215
      #5 is_promisor_object packfile.c:2272
      #6 finish_object__ma builtin/rev-list.c:245
      #7 finish_object builtin/rev-list.c:261
      #8 show_object builtin/rev-list.c:274
      #9 process_blob list-objects.c:63
      #10 process_tree_contents list-objects.c:145
      #11 process_tree list-objects.c:201
      #12 traverse_trees_and_blobs list-objects.c:344
      [...]

We're in the middle of walking through the entries of a tree object via
process_tree_contents(). We see a blob (or it could even be another tree
entry) that we don't have, so we call is_promisor_object() to check it.
That function loops over all of the objects in the promisor packfile,
including the tree we're currently walking. When we're done with it
there, we free the tree buffer. But as we return to the walk in
process_tree_contents(), it's still holding on to a pointer to that
buffer, via its tree_desc iterator, and it accesses the freed memory.

Even a trivial use of "--missing=allow-promisor" triggers this problem,
as the included test demonstrates (it's just a vanilla --blob:none
clone).

We can detect this case by only freeing the tree buffer if it was
allocated on our behalf. This is a little tricky since that happens
inside parse_object(), and it doesn't tell us whether the object was
already parsed, or whether it allocated the buffer itself. But by
checking for an already-parsed tree beforehand, we can distinguish the
two cases.

That feels a little hacky, and does incur an extra lookup in the
object-hash table. But that cost is fairly minimal compared to actually
loading objects (and since we're iterating the whole pack here, we're
likely to be loading most objects, rather than reusing cached results).

It may also be a good direction for this function in general, as there
are other possible optimizations that rely on doing some analysis before
parsing:

  - we could detect blobs and avoid reading their contents; they can't
    link to other objects, but parse_object() doesn't know that we don't
    care about checking their hashes.

  - we could avoid allocating object structs entirely for most objects
    (since we really only need them in the oidset), which would save
    some memory.

  - promisor commits could use the commit-graph rather than loading the
    object from disk

This commit doesn't do any of those optimizations, but I think it argues
that this direction is reasonable, rather than relying on parse_object()
and trying to teach it to give us more information about whether it
parsed.

The included test fails reliably under SANITIZE=address just when
running "rev-list --missing=allow-promisor". Checking the output isn't
strictly necessary to detect the bug, but it seems like a reasonable
addition given the general lack of coverage for "allow-promisor" in the
test suite.

Reported-by: Andrew Olsen <andrew.olsen@koordinates.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-14 18:03:36 -07:00
43370b1e91 scalar: update technical doc roadmap
Update the Scalar roadmap to reflect the completion of generalizing 'scalar
diagnose' into 'git diagnose' and 'git bugreport --diagnose'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:03 -07:00
672196a307 scalar-diagnose: use 'git diagnose --mode=all'
Replace implementation of 'scalar diagnose' with an internal invocation of
'git diagnose --mode=all'. This simplifies the implementation of
'cmd_diagnose' by making it a direct alias of 'git diagnose' and removes
some code in 'scalar.c' that is duplicated in 'builtin/diagnose.c'. The
simplicity of the alias also sets up a clean deprecation path for 'scalar
diagnose' (in favor of 'git diagnose'), if that is desired in the future.

This introduces one minor change to the output of 'scalar diagnose', which
is that the prefix of the created zip archive is changed from 'scalar_' to
'git-diagnostics-'.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
aac0e8ffee builtin/bugreport.c: create '--diagnose' option
Create a '--diagnose' option for 'git bugreport' to collect additional
information about the repository and write it to a zipped archive.

The '--diagnose' option behaves effectively as an alias for simultaneously
running 'git bugreport' and 'git diagnose'. In the documentation, users are
explicitly recommended to attach the diagnostics alongside a bug report to
provide additional context to readers, ideally reducing some back-and-forth
between reporters and those debugging the issue.

Note that '--diagnose' may take an optional string arg (either 'stats' or
'all'). If specified without the arg, the behavior corresponds to running
'git diagnose' without '--mode'. As with 'git diagnose', this default is
intended to help reduce unintentional leaking of sensitive information).
Users can also explicitly specify '--diagnose=(stats|all)' to generate the
respective archive created by 'git diagnose --mode=(stats|all)'.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
7ecf193f7d builtin/diagnose.c: add '--mode' option
Create '--mode=<mode>' option in 'git diagnose' to allow users to optionally
select non-default diagnostic information to include in the output archive.
Additionally, document the currently-available modes, emphasizing the
importance of not sharing a '--mode=all' archive publicly due to the
presence of sensitive information.

Note that the option parsing callback - 'option_parse_diagnose()' - is added
to 'diagnose.c' rather than 'builtin/diagnose.c' so that it may be reused in
future callers configuring a diagnostics archive.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
6783fd3cef builtin/diagnose.c: create 'git diagnose' builtin
Create a 'git diagnose' builtin to generate a standalone zip archive of
repository diagnostics.

The "diagnose" functionality was originally implemented for Scalar in
aa5c79a331 (scalar: implement `scalar diagnose`, 2022-05-28). However, the
diagnostics gathered are not specific to Scalar-cloned repositories and
can be useful when diagnosing issues in any Git repository.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
33cba726f0 diagnose.c: add option to configure archive contents
Update 'create_diagnostics_archive()' to take an argument 'mode'. When
archiving diagnostics for a repository, 'mode' is used to selectively
include/exclude information based on its value. The initial options for
'mode' are:

* DIAGNOSE_NONE: do not collect any diagnostics or create an archive
  (no-op).
* DIAGNOSE_STATS: collect basic repository metadata (Git version, repo path,
  filesystem available space) as well as sizing and count statistics for the
  repository's objects and packfiles.
* DIAGNOSE_ALL: collect basic repository metadata, sizing/count statistics,
  and copies of the '.git', '.git/hooks', '.git/info', '.git/logs', and
  '.git/objects/info' directories.

These modes are introduced to provide users the option to collect
diagnostics without the sensitive information included in copies of '.git'
dir contents. At the moment, only 'scalar diagnose' uses
'create_diagnostics_archive()' (with a hardcoded 'DIAGNOSE_ALL' mode to
match existing functionality), but more callers will be introduced in
subsequent patches.

Finally, refactor from a hardcoded set of 'add_directory_to_archiver()'
calls to iterative invocations gated by 'DIAGNOSE_ALL'. This allows for
easier future modification of the set of directories to archive and improves
error reporting when 'add_directory_to_archiver()' fails.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
bb2c34956a scalar-diagnose: move functionality to common location
Move the core functionality of 'scalar diagnose' into a new 'diagnose.[c,h]'
library to prepare for new callers in the main Git tree generating
diagnostic archives. These callers will be introduced in subsequent patches.

While this patch appears large, it is mostly made up of moving code out of
'scalar.c' and into 'diagnose.c'. Specifically, the functions

- dir_file_stats_objects()
- dir_file_stats()
- count_files()
- loose_objs_stats()
- add_directory_to_archiver()

are all copied verbatim from 'scalar.c'. The 'create_diagnostics_archive()'
function is a mostly identical (partial) copy of 'cmd_diagnose()', with the
primary changes being that 'zip_path' is an input and "Enlistment root" is
corrected to "Repository root" in the archiver log.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
435a2535b7 scalar-diagnose: move 'get_disk_info()' to 'compat/'
Move 'get_disk_info()' function into 'compat/'. Although Scalar-specific
code is generally not part of the main Git tree, 'get_disk_info()' will be
used in subsequent patches by additional callers beyond 'scalar diagnose'.
This patch prepares for that change, at which point this platform-specific
code should be part of 'compat/' as a matter of convention.

The function is copied *mostly* verbatim, with two exceptions:

* '#ifdef WIN32' is replaced with '#ifdef GIT_WINDOWS_NATIVE' to allow
  'statvfs' to be used with Cygwin.
* the 'struct strbuf buf' and 'int res' (as well as their corresponding
  cleanup & return) are moved outside of the '#ifdef' block.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
ba307a5046 scalar-diagnose: add directory to archiver more gently
If a directory added to the 'scalar diagnose' archiver does not exist, warn
and return 0 from 'add_directory_to_archiver()' rather than failing with a
fatal error. This handles a failure edge case where the '.git/logs' has not
yet been created when running 'scalar diagnose', but extends to any
situation where a directory may be missing in the '.git' dir.

Now, when a directory is missing a warning is captured in the diagnostic
logs. This provides a user with more complete information than if 'scalar
diagnose' simply failed with an error.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
91be401945 scalar-diagnose: avoid 32-bit overflow of size_t
Avoid 32-bit size_t overflow when reporting the available disk space in
'get_disk_info' by casting the block size and available block count to
'off_t' before multiplying them. Without this change, 'st_mult' would
(correctly) report a size_t overflow on 32-bit systems at or exceeding 2^32
bytes of available space.

Note that 'off_t' is a 64-bit integer even on 32-bit systems due to the
inclusion of '#define _FILE_OFFSET_BITS 64' in 'git-compat-util.h' (see
b97e911643 (Support for large files on 32bit systems., 2007-02-17)).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
81ad551343 scalar-diagnose: use "$GIT_UNZIP" in test
Use the "$GIT_UNZIP" test variable rather than verbatim 'unzip' to unzip the
'scalar diagnose' archive. Using "$GIT_UNZIP" is needed to run the Scalar
tests on systems where 'unzip' is not in the system path.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
afa70145a2 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:19:08 -07:00
83489a5b20 Merge branch 'ab/plug-revisions-leak'
Plug a bit more leaks in the revisions API.

* ab/plug-revisions-leak:
  revisions API: don't leak memory on argv elements that need free()-ing
  bisect.c: partially fix bisect_rev_setup() memory leak
  log: refactor "rev.pending" code in cmd_show()
  log: fix a memory leak in "git show <revision>..."
  test-fast-rebase helper: use release_revisions() (again)
  bisect.c: add missing "goto" for release_revisions()
2022-08-12 13:19:08 -07:00
657c7403a3 Merge branch 'ab/leak-check'
Extend SANITIZE=leak checking and declare more tests "currently leak-free".

* ab/leak-check:
  CI: use "GIT_TEST_SANITIZE_LEAK_LOG=true" in linux-leaks
  upload-pack: fix a memory leak in create_pack_file()
  leak tests: mark passing SANITIZE=leak tests as leak-free
  leak tests: don't skip some tests under SANITIZE=leak
  test-lib: have the "check" mode for SANITIZE=leak consider leak logs
  test-lib: add a GIT_TEST_PASSING_SANITIZE_LEAK=check mode
  test-lib: simplify by removing test_external
  tests: move copy/pasted PERL + Test::More checks to a lib-perl.sh
  t/Makefile: don't remove test-results in "clean-except-prove-cache"
  test-lib: add a SANITIZE=leak logging mode
  t/README: reword the "GIT_TEST_PASSING_SANITIZE_LEAK" description
  test-lib: add a --invert-exit-code switch
  test-lib: fix GIT_EXIT_OK logic errors, use BAIL_OUT
  test-lib: don't set GIT_EXIT_OK before calling test_atexit_handler
  test-lib: use $1, not $@ in test_known_broken_{ok,failure}_
2022-08-12 13:19:08 -07:00
f0e9754a27 Merge branch 'gc/git-reflog-doc-markup'
Doc mark-up fix.

* gc/git-reflog-doc-markup:
  Documentation/git-reflog: remove unneeded \ from \{
2022-08-12 13:19:08 -07:00
8faaf690f7 Merge branch 'lt/symbolic-ref-sanity'
"git symbolic-ref symref non..sen..se" is now diagnosed as an error.

* lt/symbolic-ref-sanity:
  symbolic-ref: refuse to set syntactically invalid target
2022-08-12 13:19:08 -07:00
35ae40ead3 tr2: shows scope unconditionally in addition to key-value pair
When we specify GIT_TRACE2_CONFIG_PARAMS or trace2.configparams,
trace2 will prints "interesting" config values to log. Sometimes,
when a config set in multiple scope files, the following output
looks like (the irrelevant fields are omitted here as "..."):

...| def_param    |  ...  | core.multipackindex:false
...| def_param    |  ...  | core.multipackindex:false
...| def_param    |  ...  | core.multipackindex:false

As the log shows, even each config in different scope is dumped, but
we don't know which scope it comes from. Therefore, it's better to
add the scope names as well to make them be more recognizable. For
example, when execute:

    $ GIT_TRACE2_PERF=1 \
    > GIT_TRACE2_CONFIG_PARAMS=core.multipackIndex \
    > git rev-list --test-bitmap HEAD"

The following is the ouput (the irrelevant fields are omitted here
as "..."):

Format normal:
... git.c:461 ... def_param scope:system core.multipackindex=false
... git.c:461 ... def_param scope:global core.multipackindex=false
... git.c:461 ... def_param scope:local core.multipackindex=false

Format perf:

... | def_param    | ... | scope:system | core.multipackindex:false
... | def_param    | ... | scope:global | core.multipackindex:false
... | def_param    | ... | scope:local  | core.multipackindex:false

Format event:

{"event":"def_param", ... ,"scope":"system","param":"core.multipackindex","value":"false"}
{"event":"def_param", ... ,"scope":"global","param":"core.multipackindex","value":"false"}
{"event":"def_param", ... ,"scope":"local","param":"core.multipackindex","value":"false"}

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 21:05:00 -07:00
050d0dc241 api-trace2.txt: print config key-value pair
It's supported to print "interesting" config key-value paire
to tr2 log by setting "GIT_TRACE2_CONFIG_PARAMS" environment
variable and the "trace2.configparam" config, let's add the
related docs in Documentaion/technical/api-trace2.txt.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 21:05:00 -07:00
85dc0da6dc fsmonitor: option to allow fsmonitor to run against network-mounted repos
Though perhaps not common, there are use cases where users have large,
network-mounted repos. Having the ability to run fsmonitor against
network paths would benefit those users.

Most modern Samba-based filers have the necessary support to enable
fsmonitor on network-mounted repos. As a first step towards enabling
fsmonitor to work against network-mounted repos, introduce a
configuration option, 'fsmonitor.allowRemote'. Setting this option to
true will override the default behavior (erroring-out) when a
network-mounted repo is detected by fsmonitor.

Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 21:03:09 -07:00
9096451acd rev-list: support human-readable output for --disk-usage
The '--disk-usage' option for git-rev-list was introduced in 16950f8384
(rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
This is very useful for people inspect their git repo's objects usage
infomation, but the resulting number is quit hard for a human to read.

Teach git rev-list to output a human readable result when using
'--disk-usage'.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 13:45:23 -07:00
5502f77b69 Sync with Git 2.37.2 2022-08-10 21:57:59 -07:00
e21e663cd1 clone: --bundle-uri cannot be combined with --depth
A previous change added the '--bundle-uri' option, but did not check
if the --depth parameter was included. Since bundles are not compatible
with shallow clones, provide an error message to the user who is
attempting this combination.

I am leaving this as its own change, separate from the one that
implements '--bundle-uri', because this is more of an advisory for the
user. There is nothing wrong with bootstrapping with bundles and then
fetching a shallow clone. However, that is likely going to involve too
much work for the client _and_ the server. The client will download all
of this bundle information containing the full history of the
repository only to ignore most of it. The server will get a shallow
fetch request, but with a list of haves that might cause a more painful
computation of that shallow pack-file.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
59c1752ab6 bundle-uri: add support for http(s):// and file://
The previous change created the 'git clone --bundle-uri=<uri>' option.
Currently, <uri> must be a filename.

Update copy_uri_to_file() to first inspect the URI for an HTTP(S) prefix
and use git-remote-https as the way to download the data at that URI.
Otherwise, check to see if file:// is present and modify the prefix
accordingly.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
5556891961 clone: add --bundle-uri option
Cloning a remote repository is one of the most expensive operations in
Git. The server can spend a lot of CPU time generating a pack-file for
the client's request. The amount of data can clog the network for a long
time, and the Git protocol is not resumable. For users with poor network
connections or are located far away from the origin server, this can be
especially painful.

Add a new '--bundle-uri' option to 'git clone' to bootstrap a clone from
a bundle. If the user is aware of a bundle server, then they can tell
Git to bootstrap the new repository with these bundles before fetching
the remaining objects from the origin server.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
53a50892be bundle-uri: create basic file-copy logic
Before implementing a way to fetch bundles into a repository, create the
basic logic. Assume that the URI is actually a file path. Future logic
will make this more careful to other protocols.

For now, we also only succeed if the content at the URI is a bundle
file, not a bundle list. Bundle lists will be implemented in a future
change.

Note that the discovery of a temporary filename is slightly racy because
the odb_mkstemp() relies on the temporary file not existing. With the
current implementation being limited to file copies, we could replace
the copy_file() with copy_fd(). The tricky part comes in future changes
that send the filename to 'git remote-https' and its 'get' capability.
At that point, we need the file descriptor closed _and_ the file
unlinked. If we were to keep the file descriptor open for the sake of
normal file copies, then we would pollute the rest of the code for
little benefit. This is especially the case because we expect that most
bundle URI use will be based on HTTPS instead of file copies.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
b5624a4474 remote-curl: add 'get' capability
A future change will want a way to download a file over HTTP(S) using
the simplest of download mechanisms. We do not want to assume that the
server on the other side understands anything about the Git protocol but
could be a simple static web server.

Create the new 'get' capability for the remote helpers which advertises
that the 'get' command is avalable. A caller can send a line containing
'get <url> <path>' to download the file at <url> into the file at
<path>.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
d06ed85dcb bundle-uri: add example bundle organization
The previous change introduced the bundle URI design document. It
creates a flexible set of options that allow bundle providers many ways
to organize Git object data and speed up clones and fetches. It is
particularly important that we have flexibility so we can apply future
advancements as new ideas for efficiently organizing Git data are
discovered.

However, the design document does not provide even an example of how
bundles could be organized, and that makes it difficult to envision how
the feature should work at the end of the implementation plan.

Add a section that details how a bundle provider could work, including
using the Git server advertisement for multiple geo-distributed servers.
This organization is based on the GVFS Cache Servers which have
successfully used similar ideas to provide fast object access and
reduced server load for very large repositories.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:03:17 -07:00
2da14fad8f docs: document bundle URI standard
Introduce the idea of bundle URIs to the Git codebase through an
aspirational design document. This document includes the full design
intended to include the feature in its fully-implemented form. This will
take several steps as detailed in the Implementation Plan section.

By committing this document now, it can be used to motivate changes
necessary to reach these final goals. The design can still be altered as
new information is discovered.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:03:11 -07:00
da6fe05b3d mv: check overwrite for in-to-out move
Add checking logic for overwriting when moving from in-cone to
out-of-cone. It is the index version of the original overwrite logic.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:50 -07:00
5efd533ed8 advice.h: add advise_on_moving_dirty_path()
Add an advice.

When the user use `git mv --sparse <dirty-path> <destination>`, Git
will warn the user to use `git add --sparse <paths>` then use
`git sparse-checkout reapply` to apply the sparsity rules.

Add a few lines to previous "move dirty path" tests so we can test
this new advice is working.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:50 -07:00
b6f51e3db9 mv: cleanup empty WORKING_DIRECTORY
Originally, moving from-in-to-out may leave an empty <source>
directory on-disk (this kind of directory is marked as
WORKING_DIRECTORY).

Cleanup such directories if they are empty (don't have any entries
under them).

Modify two tests that take <source> as WORKING_DIRECTORY to test
this behavior.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:49 -07:00
5784db1b22 mv: from in-cone to out-of-cone
Originally, moving an in-cone <source> to an out-of-cone <destination>
was not possible, mainly because such <destination> is a directory that
is not present in the working tree.

Change the behavior so that we can move an in-cone <source> to
out-of-cone <destination> when --sparse is supplied.

Notice that <destination> can also be an out-of-cone file path, rather
than a directory.

Such <source> can be either clean or dirty, and moving it results in
different behaviors:

A clean move should move <source> to <destination> in the index (do
*not* create <destination> in the worktree), then delete <source> from
the worktree.

A dirty move should move the <source> to the <destination>, both in the
working tree and the index, but should *not* remove the resulted path
from the working tree and should *not* turn on its CE_SKIP_WORKTREE bit.

Optional reading
================

We are strict about cone mode when <destination> is a file path.
The reason is that some of the previous tests that use no-cone mode in
t7002 are keep breaking, mainly because the `dst_mode = SPARSE;` line
added in this patch.

Most features developed in both "from-out-to-in" and "from-in-to-out"
only care about cone mode situation, as no-cone mode is becoming
irrelevant. And because assigning `SPARSE` to `dst_mode` when the
repo is in no-cone mode causes miscellaneous bugs, we should just leave
this new functionality to be exclusive cone mode and save some time.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:49 -07:00
9284c3ce26 mv: remove BOTH from enum update_mode
Since BOTH is not used anywhere in the code and its meaning is unclear,
remove it.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:49 -07:00
c08830de41 mv: check if <destination> is a SKIP_WORKTREE_DIR
Originally, <destination> is assumed to be in the working tree. If it is
not found as a directory, then it is determined to be either a regular file
path, or error out if used under the second form (move into a directory)
of 'git-mv'. Such behavior is not ideal, mainly because Git does not
look into the index for <destination>, which could potentially be a
SKIP_WORKTREE_DIR, which we need to determine for the later "moving from
in-cone to out-of-cone" patch.

Change the logic so that Git first check if <destination> is a directory
with all its contents sparsified (a SKIP_WORKTREE_DIR).

If <destination> is such a sparse directory, then we should modify the
index the same way as we would if this were a non-sparse directory. We
must be careful to ensure that the <destination> is marked with
SKIP_WORKTREE_DIR.

Also add a `dst_w_slash` to reuse the result from `add_slash()`, which
was everywhere and can be simplified.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:49 -07:00
d57690a9c8 mv: free the with_slash in check_dir_in_index()
with_slash may be a malloc'd pointer, and when it is, free it.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:49 -07:00
72e59ba19e mv: rename check_dir_in_index() to empty_dir_has_sparse_contents()
Method check_dir_in_index() introduced in b91a2b6594 (mv: add
check_dir_in_index() and solve general dir check issue, 2022-06-30)
does not describe its intent and behavior well.

Change its name to empty_dir_has_sparse_contents(), which more
precisely describes its purpose.

Reverse the return values, check_dir_in_index() return 0 for success
and 1 for failure; reverse the values so empty_dir_has_sparse_contents()
return 1 for success and 0 for failure. These values are more intuitive
because 1 usually means "has" and 0 means "not found".

Also modify the documentation to better align with the method's
intent and behavior.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:49 -07:00
5506683dea t7002: add tests for moving from in-cone to out-of-cone
Add corresponding tests to test that user can move an in-cone <source>
to out-of-cone <destination> when --sparse is supplied.

Such <source> can be either clean or dirty, and moving it results in
different behaviors:

A clean move should move <source> to <destination> in the index (do
*not* create <destination> in the worktree), then delete <source> from
the worktree.

A dirty move should move the <source> to the <destination>, both in the
working tree and the index, but should *not* remove the resulted path
from the working tree and should *not* turn on its CE_SKIP_WORKTREE bit.

Also make sure that if <destination> exists in the index (existing
check for if <destination> is in the worktree is not enough in
in-to-out moves), warn user against the overwrite. And Git should force
the overwrite when supplied with -f or --force.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 13:57:48 -07:00
ede241c715 rm: integrate with sparse-index
Enable the sparse index within the `git-rm` command.

The `p2000` tests demonstrate a ~92% execution time reduction for
'git rm' using a sparse index.

Test                              HEAD~1            HEAD
--------------------------------------------------------------------------
2000.74: git rm ... (full-v3)     0.41(0.37+0.05)   0.43(0.36+0.07) +4.9%
2000.75: git rm ... (full-v4)     0.38(0.34+0.05)   0.39(0.35+0.05) +2.6%
2000.76: git rm ... (sparse-v3)   0.57(0.56+0.01)   0.05(0.05+0.00) -91.2%
2000.77: git rm ... (sparse-v4)   0.57(0.55+0.02)   0.03(0.03+0.00) -94.7%

----
Also, normalize a behavioral difference of `git-rm` under sparse-index.
See related discussion [1].

`git-rm` a sparse-directory entry within a sparse-index enabled repo
behaves differently from a sparse directory within a sparse-checkout
enabled repo.

For example, in a sparse-index repo, where 'folder1' is a
sparse-directory entry, `git rm -r --sparse folder1` provides this:

        rm 'folder1/'

Whereas in a sparse-checkout repo *without* sparse-index, doing so
provides this:

        rm 'folder1/0/0/0'
        rm 'folder1/0/1'
        rm 'folder1/a'

Because `git rm` a sparse-directory entry does not need to expand the
index, therefore we should accept the current behavior, which is faster
than "expand the sparse-directory entry to match the sparse-checkout
situation".

Modify a previous test so such difference is not considered as an error.

[1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
bcf96cfca6 rm: expand the index only when necessary
Remove the `ensure_full_index()` method so `git-rm` does not always
expand the index when the expansion is unnecessary, i.e. when
<pathspec> does not have any possibilities to match anything outside
of sparse-checkout definition.

Expand the index when the <pathspec> needs an expanded index, i.e. the
<pathspec> contains wildcard that may need a full-index or the
<pathspec> is simply outside of sparse-checkout definition.

Notice that the test 'rm pathspec expands index when necessary' in
t1092 *is* testing this code change behavior, though it will be marked
as 'test_expect_success' only in the next patch, where we officially
mark `command_requires_full_index = 0`, so the index does not expand
unless we tell it to do so.

Notice that because we also want `ensure_full_index` to record the
stdout and stderr from Git command, a corresponding modification
is also included in this patch. The reason we want the "sparse-index-out"
and "sparse-index-err", is that we need to make sure there is no error
from Git command itself, so we can rely on the `test_region` result
and determine if the index is expanded or not.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
b29ad38322 pathspec.h: move pathspec_needs_expanded_index() from reset.c to here
Method pathspec_needs_expanded_index() in reset.c from 4d1cfc1351
(reset: make --mixed sparse-aware, 2021-11-29) is reusable when we
need to verify if the index needs to be expanded when the command
is utilizing a pathspec rather than a literal path.

Move it to pathspec.h for reusability.

Add a few items to the function so it can better serve its purpose as
a standalone public function:

* Add a check in front so if the index is not sparse, return early since
  no expansion is needed.

* It now takes an arbitrary 'struct index_state' pointer instead of
  using `the_index` and `active_cache`.

* Add documentation to the function.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
ba808251aa t1092: add tests for git-rm
Add tests for `git-rm`, make sure it behaves as expected when
<pathspec> is both inside or outside of sparse-checkout definition.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
3f61790678 Merge branch 'vd/sparse-reset-checkout-fixes' into sy/sparse-rm
* vd/sparse-reset-checkout-fixes:
  unpack-trees: unpack new trees as sparse directories
  cache.h: create 'index_name_pos_sparse()'
  oneway_diff: handle removed sparse directories
  checkout: fix nested sparse directory diff in sparse index
2022-08-08 13:23:06 -07:00
c50926e1f4 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:13:14 -07:00
bac92b1f39 Merge branch 'js/ort-clean-up-after-failed-merge'
Plug memory leaks in the failure code path in the "merge-ort" merge
strategy backend.

* js/ort-clean-up-after-failed-merge:
  merge-ort: do leave trace2 region even if checkout fails
  merge-ort: clean up after failed merge
2022-08-08 13:13:14 -07:00
b9654bee99 Merge branch 'jk/struct-zero-init-with-older-gcc'
Older gcc with -Wall complains about the universal zero initializer
"struct s = { 0 };" idiom, which makes developers' lives
inconvenient (as -Werror is enabled by DEVELOPER=YesPlease).  The
build procedure has been tweaked to help these compilers.

* jk/struct-zero-init-with-older-gcc:
  config.mak.dev: squelch -Wno-missing-braces for older gcc
2022-08-08 13:13:14 -07:00
1b53bea29a Merge branch 'js/t5351-freebsd-fix'
Some tests assumed that core.fsyncMethod=batch is supported
everywhere, which broke FreeBSD.

* js/t5351-freebsd-fix:
  t5351: avoid using `test_cmp` for binary data
  t5351: avoid relying on `core.fsyncMethod = batch` to be supported
2022-08-08 13:13:14 -07:00
6c5fbd866c Merge branch 'js/lstat-mingw-enotdir-fix'
Fix to lstat() emulation on Windows.

* js/lstat-mingw-enotdir-fix:
  lstat(mingw): correctly detect ENOTDIR scenarios
2022-08-08 13:13:14 -07:00
8dfa09f49f Merge branch 'js/mingw-with-python'
Conditionally allow building Python interpreter on Windows

* js/mingw-with-python:
  mingw: remove unneeded `NO_CURL` directive
  mingw: remove unneeded `NO_GETTEXT` directive
  windows: include the Python bits when building Git for Windows
2022-08-08 13:13:13 -07:00
6d97f440e5 Merge branch 'ca/unignore-local-installation-on-windows'
Fix build procedure for Windows that uses CMake so that it can pick
up the shell interpreter from local installation location.

* ca/unignore-local-installation-on-windows:
  cmake: support local installations of git
2022-08-08 13:13:13 -07:00
679aad9e82 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 15:55:00 -07:00
95d1613a9f Sync with 'maint' 2022-08-05 15:54:46 -07:00
1e92768aa1 Merge branch 'tb/cat-file-z'
Operating modes like "--batch" of "git cat-file" command learned to
take NUL-terminated input, instead of one-item-per-line.

* tb/cat-file-z:
  builtin/cat-file.c: support NUL-delimited input with `-z`
  t1006: extract --batch-command inputs to variables
2022-08-05 15:52:14 -07:00
3a4d71f52f Merge branch 'jt/fetch-pack-trace2-filter-spec'
"git fetch" client logs the partial clone filter used in the trace2
output.

* jt/fetch-pack-trace2-filter-spec:
  fetch-pack: write effective filter to trace2
2022-08-05 15:52:14 -07:00
dcdcc375a4 Merge branch 'jr/gitweb-title-shortening'
Gitweb had legacy URL shortener that is specific to the way
projects hosted on kernel.org used to (but no longer) work, which
has been removed.

* jr/gitweb-title-shortening:
  gitweb: remove title shortening heuristics
2022-08-05 15:52:14 -07:00
ac7f41fb8c Merge branch 'gc/bare-repo-discovery'
Fix-up for what has been merged to 'master' recently.

* gc/bare-repo-discovery:
  config.c: NULL check when reading protected config
2022-08-05 15:52:14 -07:00
992f25d713 fetch: use ref_namespaces during prefetch
The "refs/prefetch/" namespace is used by 'git fetch --prefetch' as a
replacement of the destination of the refpsec for a remote. Git also
removes refspecs that include tags.

Instead of using string literals for the 'refs/tags/ and
'refs/prefetch/' namespaces, use the entries in the ref_namespaces
array.

This kind of change could be done in many places around the codebase,
but we are isolating only to this change because of the way the
refs/prefetch/ namespace somewhat motivated the creation of the
ref_namespaces array.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:13 -07:00
863a8ae97b maintenance: stop writing log.excludeDecoration
This reverts commit 96eaffebbf (maintenance: set
log.excludeDecoration durin prefetch, 2021-01-19).

The previous change created a default decoration filter that does not
include refs/prefetch/, so this modification of the config is no longer
needed.

One issue that can happen from this point on is that users who ran the
prefetch task on previous versions of Git will still have a
log.excludeDecoration value and that will prevent the new default
decoration filter from being active. Thus, when we add the refs/bundle/
namespace as part of the bundle URI feature, those users will see
refs/bundle/ decorations.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:13 -07:00
3e103ed23f log: create log.initialDecorationSet=all
The previous change introduced the --clear-decorations option for users
who do not want their decorations limited to a narrow set of ref
namespaces.

Add a config option that is equivalent to specifying --clear-decorations
by default.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
748706d713 log: add --clear-decorations option
The previous changes introduced a new default ref filter for decorations
in the 'git log' command. This can be overridden using
--decorate-refs=HEAD and --decorate-refs=refs/, but that is cumbersome
for users.

Instead, add a --clear-decorations option that resets all previous
filters to a blank filter that accepts all refs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
92156291ca log: add default decoration filter
When a user runs 'git log', they expect a certain set of helpful
decorations. This includes:

* The HEAD ref
* Branches (refs/heads/)
* Stashes (refs/stash)
* Tags (refs/tags/)
* Remote branches (refs/remotes/)
* Replace refs (refs/replace/ or $GIT_REPLACE_REF_BASE)

Each of these namespaces was selected due to existing test cases that
verify these namespaces appear in the decorations. In particular,
stashes and replace refs can have custom colors from the
color.decorate.<slot> config option.

While one test checks for a decoration from notes, it only applies to
the tip of refs/notes/commit (or its configured ref name). Notes form
their own kind of decoration instead. Modify the expected output for the
tests in t4013 that expect this note decoration.  There are several
tests throughout the codebase that verify that --decorate-refs,
--decorate-refs-exclude, and log.excludeDecoration work as designed and
the tests continue to pass without intervention.

However, there are other refs that are less helpful to show as
decoration:

* Prefetch refs (refs/prefetch/)
* Rebase refs (refs/rebase-merge/ and refs/rebase-apply/)
* Bundle refs (refs/bundle/) [!]

[!] The bundle refs are part of a parallel series that bootstraps a repo
    from a bundle file, storing the bundle's refs into the repo's
    refs/bundle/ namespace.

In the case of prefetch refs, 96eaffebbf (maintenance: set
log.excludeDecoration durin prefetch, 2021-01-19) added logic to add
refs/prefetch/ to the log.excludeDecoration config option. Additional
feedback pointed out that having such a side-effect can be confusing and
perhaps not helpful to users. Instead, we should hide these ref
namespaces that are being used by Git for internal reasons but are not
helpful for the users to see.

The way to provide a seamless user experience without setting the config
is to modify the default decoration filters to match our expectation of
what refs the user actually wants to see.

In builtin/log.c, after parsing the --decorate-refs and
--decorate-refs-exclude options from the command-line, call
set_default_decoration_filter(). This method populates the exclusions
from log.excludeDecoration, then checks if the list of pattern
modifications are empty. If none are specified, then the default set is
restricted to the set of inclusions mentioned earlier (HEAD, branches,
etc.).  A previous change introduced the ref_namespaces array, which
includes all of these currently-used namespaces. The 'decoration' value
is non-zero when that namespace is associated with a special coloring
and fits into the list of "expected" decorations as described above,
which makes the implementation of this filter very simple.

Note that the logic in ref_filter_match() in log-tree.c follows this
matching pattern:

 1. If there are exclusion patterns and the ref matches one, then ignore
    the decoration.

 2. If there are inclusion patterns and the ref matches one, then
    definitely include the decoration.

 3. If there are config-based exclusions from log.excludeDecoration and
    the ref matches one, then ignore the decoration.

With this logic in mind, we need to ensure that we do not populate our
new defaults if any of these filters are manually set. Specifically, if
a user runs

	git -c log.excludeDecoration=HEAD log

then we expect the HEAD decoration to not appear. If we left the default
inclusions in the set, then HEAD would match that inclusion before
reaching the config-based exclusions.

A potential alternative would be to check the list of default inclusions
at the end, after the config-based exclusions. This would still create a
behavior change for some uses of --decorate-refs-exclude=<X>, and could
be overwritten somewhat with --decorate-refs=refs/ and
--decorate-refs=HEAD. However, it no longer becomes possible to include
refs outside of the defaults while also excluding some using
log.excludeDecoration.

Another alternative would be to exclude the known namespaces that are
not intended to be shown. This would reduce the visible effect of the
change for expert users who use their own custom ref namespaces. The
implementation change would be very simple to swap due to our use of
ref_namespaces:

	int i;
	struct string_list *exclude = decoration_filter->exclude_ref_pattern;

	/*
	 * No command-line or config options were given, so
	 * populate with sensible defaults.
	 */
	for (i = 0; i < NAMESPACE__COUNT; i++) {
		if (ref_namespaces[i].decoration)
			continue;

		string_list_append(exclude, ref_namespaces[i].ref);
	}

The main downside of this approach is that we expect to add new hidden
namespaces in the future, and that means that Git versions will be less
stable in how they behave as those namespaces are added.

It is critical that we provide ways for expert users to disable this
behavior change via command-line options and config keys. These changes
will be implemented in a future change.

Add a test that checks that the defaults are not added when
--decorate-refs is specified. We verify this by showing that HEAD is not
included as it normally would.  Also add a test that shows that the
default filter avoids the unwanted decorations from refs/prefetch,
refs/rebase-merge,
and refs/bundle.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
94d421b8af log-tree: use ref_namespaces instead of if/else-if
The add_ref_decoration() method uses an if/else-if chain to determine if
a ref matches a known ref namespace that has a special decoration
category. That decoration type is later used to assign a color when
writing to stdout.

The newly-added ref_namespaces array contains all namespaces, along
with information about their decoration type. Check this array instead
of this if/else-if chain. This reduces our dependency on string literals
being embedded in the decoration logic.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
97e61e0f9c refs: use ref_namespaces for replace refs base
The git_replace_ref_base global is used to store the value of the
GIT_REPLACE_REF_BASE environment variable or the default of
"refs/replace/". This is initialized within setup_git_env().

The ref_namespaces array is a new centralized location for information
such as the ref namespace used for replace refs. Instead of having this
namespace stored in two places, use the ref_namespaces array instead.

For simplicity, create a local git_replace_ref_base variable wherever
the global was previously used.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
b9342b3fd6 refs: add array of ref namespaces
Git interprets different meanings to different refs based on their
names. Some meanings are cosmetic, like how refs in  'refs/remotes/*'
are colored differently from refs in 'refs/heads/*'. Others are more
critical, such as how replace refs are interpreted.

Before making behavior changes based on ref namespaces, collect all
known ref namespaces into a array of ref_namespace_info structs. This
array is indexed by the new ref_namespace enum for quick access.

As of this change, this array is purely documentation. Future changes
will add dependencies on this array.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
5797b13919 t4207: test coloring of grafted decorations
The color.decorate.<slot> config option added the 'grafted' slot in
09c4ba410b (log-tree: allow to customize 'grafted' color, 2018-05-26)
but included no tests for this behavior. When modifying some logic
around decorations, this ref namespace was ignored and could have been
lost as a default namespace for 'git log' decorations by default.

Add two tests to t4207 that check that the replaced objects are
correctly decorated. Use "black" as the color since it is distinct from
the other colors already in the test. The first test uses regular
replace-objects while the second creates a commit graft.

Be sure to test both modes with GIT_REPLACE_REF_BASE unset and set to an
alternative base.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:11 -07:00
b004521aa6 t4207: modernize test
Before adding new tests to t4207-log-decoration-colors.sh, update the
existing test to use modern test conventions. This includes:

1. Use lowercase in test names.

2. Keep all test setup inside the test_expect_success blocks. We need
   to be careful about left whitespace in the broken lines of the input
   file.

3. Do not use 'git' commands on the left side of a pipe.

4. Create a cmp_filtered_decorations helper to perform the 'log', 'sed',
   and test_decode_color manipulations. Move the '--all' option to be an
   argument so we can change that value in future tests.

5. Modify the 'sed' command to use a simpler form that is more
   portable.

The next change will introduce new tests usinge these new conventions.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:11 -07:00
b877e617e6 refs: allow "HEAD" as decoration filter
The normalize_glob_ref() method was introduced in 65516f586b (log:
add option to choose which refs to decorate, 2017-11-21) to help with
decoration filters such as --decorate-refs=<filter> and
--decorate-refs-exclude=<filter>. The method has not been used anywhere
else.

At the moment, it is impossible to specify HEAD as a decoration filter
since normalize_glob_ref() prepends "refs/" to the filter if it isn't
already there.

Allow adding HEAD as a decoration filter by allowing the exact string
"HEAD" to not be prepended with "refs/". Add a test in t4202-log.sh that
would previously fail since the HEAD decoration would exist in the
output.

It is sufficient to only cover "HEAD" here and not include other special
refs like REBASE_HEAD. This is because HEAD is the only ref outside of
refs/* that is added to the list of decorations. However, we may want to
special-case these other refs in normalize_glob_ref() in the future.
Leave a NEEDSWORK comment for now.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:11 -07:00
1e2320161d docs: move http-protocol docs to man section 5
Continue the move of existing Documentation/technical/* protocol and
file-format documentation into our main documentation space by moving
the http-protocol.txt documentation over. I'm renaming it to
"protocol-http" to be consistent with other things in the new
gitformat-protocol-* namespace.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:24 -07:00
6b6029dd1d docs: move cruft pack docs to gitformat-pack
Integrate the cruft packs documentation initially added in
3d89a8c118 (Documentation/technical: add cruft-packs.txt, 2022-05-20)
to the newly created "gitformat-pack" documentation.

Like the "bitmap-format" added before it in
0d4455a3ab (documentation: add documentation for the bitmap format,
2013-11-14) the "cruft-packs" were documented in their own file.

As the diff move detection will show there is no change to
"Documentation/technical/cruft-packs.txt" here except to move it, and
to "indent" the existing sections by adding an extra "=" to them.

We could similarly convert the "bitmap-format.txt", but let's leave it
for now due to a conflict with the in-flight ac/bitmap-lookup-table
series.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:24 -07:00
977c47b46d docs: move pack format docs to man section 5
Continue the move of existing Documentation/technical/* protocol and
file-format documentation into our main documentation space by moving
the various documentation pertaining to the *.pack format and related
files, and updating things that refer to it to link to the new
location.

By moving these we can properly link from the newly created
gitformat-commit-graph to a gitformat-chunk-format page.

Integrating "Documentation/technical/bitmap-format.txt" and
"Documentation/technical/cruft-packs.txt" might logically be part of
this change, but as those cover parts of the wider "pack
format" (including associated files) that's documented outside of
"Documentation/technical/pack-format.txt" let's leave those for now,
subsequent commit(s) will address those.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:24 -07:00
20516890dc docs: move signature docs to man section 5
Continue the move of existing Documentation/technical/* protocol and
file-format documentation into our main documentation space by moving
the signature format documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:24 -07:00
00d3e8d7dd docs: move index format docs to man section 5
Continue the move of existing Documentation/technical/* protocol and
file-format documentation into our main documentation space by moving
the index format documentation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
5db921054e docs: move protocol-related docs to man section 5
Continue the move of existing Documentation/technical/* protocol and
file-format documentation into our main documentation space. By moving
the things that discuss the protocol we can properly link from
e.g. lsrefs.unborn and protocol.version documentation to a manpage we
build by default.

So far we have been using the "gitformat-" prefix for the
documentation we've been moving over from Documentation/technical/*,
but for protocol documentation let's use "gitprotocol-*".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
8cbace93d2 docs: move commit-graph format docs to man section 5
Continue the move of existing Documentation/technical/* protocol and
file-format documentation into our main documentation space.

By moving the documentation for the commit-graph format into man
section 5 and the new "developerinterfaces" category. This change is
split from subsequent commits due to the relatively large amount of
ASCIIDOC formatting changes that are required.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
844739ba27 git docs: add a category for file formats, protocols and interfaces
Create a new "File formats, protocols and other developer interfaces"
section in the main "git help git" manual page and start moving the
documentation that now lives in "Documentation/technical/*.git" over
to it. This complements the newly added and adjacent "Repository,
command and file interfaces" section.

This makes the technical documentation more accessible and
discoverable. Before this we wouldn't install it by default, and had
no ability to build man page versions of them. The links to them from
our existing documentation link to the generated HTML version of these
docs.

So let's start moving those over, starting with just the
"bundle-format.txt" documentation added in 7378ec90e1 (doc: describe
Git bundle format, 2020-02-07). We'll now have a new
gitformat-bundle(5) man page. Subsequent commits will move more git
internal format documentation over.

Unfortunately the syntax of the current Documentation/technical/*.txt
is not the same (when it comes to section headings etc.) as our
Documentation/*.txt documentation, so change the relevant bits of
syntax as we're moving this over.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
d976c5100f git docs: add a category for user-facing file, repo and command UX
Create a new "Repository, command and file interfaces" section in the
main "git help git" manual page. Move things that belong under this
new criteria from the generic "Guides" section.

The "Guides" section was added in f442f28a81 (git.txt: add list of
guides, 2020-08-05). It makes sense to have e.g. "giteveryday(7)" and
"gitfaq(7)" listed under "Guides".

But placing e.g. "gitignore(5)" in it is stretching the meaning of
what a "guide" is, ideally that section should list things similar to
"giteveryday(7)" and "gitcore-tutorial(7)".

An alternate name that was considered for this new section was "User
formats", for consistency with the nomenclature used for man section 5
in general. My man(1) lists it as "File formats and conventions,
e.g. /etc/passwd".

So calling this "git help --formats" or "git help --user-formats"
would make sense for e.g. gitignore(5), but would be stretching it
somewhat for githooks(5), and would seem really suspect for the likes
of gitcli(7).

Let's instead pick a name that's closer to the generic term "User
interface", which is really what this documentation discusses: General
user-interface documentation that doesn't obviously belong elsewhere.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
dba1e5392f git help doc: use "<doc>" instead of "<guide>"
Replace the use of "<guide>" originally introduced (as "GUIDE") in
a133737b80 (doc: include --guide option description for "git help",
2013-04-02) with the more generic "<doc>". The "<doc>" placeholder is
more generic, and one we'll be able to use as we introduce new
documentation categories.

Let's also add "<doc>" to the "git help -h" output, when it was made
to use parse_option() in in 41eb33bd0c (help: use parseopt,
2008-02-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
936b8eb6c8 help.c: remove common category behavior from drop_prefix() behavior
Change the behavior of the "git" prefix stripping for CAT_guide so
that we don't try to strip the "git-" prefix in that case. We should
be stripping either "git" or "git-" depending on the category. This
change makes it easier to add extra "category" conditions in
subsequent commits.

Before this we'd in principle strip a "git-" prefix from a "guide" in
command-list.txt, in practice we have no such entry there. As we don't
have any entry that looks like "git-foo" in command-list.txt this
changes nothing in practice, but it makes the intent of the code
clearer. In that hypothetical case we'd now strip it down to "-foo",
not "foo".

When this code was added in cfb22a02ab (help: use command-list.h for
common command list, 2018-05-10) the only entries in command-list.txt
that didn't begin with "git-" were "gitweb" and "gitk".

Then when the "guides" special-case was added in 1b81d8cb19 (help:
use command-list.txt for the source of guides, 2018-05-20) we had the
various "git" (not "git-") prefixed "guide" entries, which the
"CAT_guide" case handles.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
2f8b3ea662 help.c: refactor drop_prefix() to use a "switch" statement"
Refactor the drop_prefix() function in in help.c to make it easier to
strip prefixes from categories that aren't "CAT_guide". There are no
functional changes here, by doing this we make a subsequent functional
change's diff smaller.

As before we first try to strip "git-" unconditionally, if that works
we'll return the stripped string. Then we'll strip "git" if the
command is in "CAT_guide".

This means that we'd in principle strip "git-foo" down to "foo" if
it's in CAT_guide. That doesn't make much sense, and we don't have
such an entry in command-list.txt, but let's preserve that behavior
for now.

While we're at it remove a stray newline that had been added after the
"return name;" statement.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
4057523a40 submodule merge: update conflict error message
When attempting to merge in a superproject with conflicting submodule
pointers that cannot be fast-forwarded or trivially resolved, the merge
fails and Git prints an error message that accurately describes the
failure, but does not provide steps for the user to resolve the error.

Git is left in a conflicted state, which requires the user to:
 1. merge submodules or update submodules to an already existing
	commit that reflects the merge
 2. add submodules changes to the superproject
 3. finish merging superproject
These steps are non-obvious for newer submodule users to figure out
based on the error message and neither `git submodule status` nor `git
status` provide any useful pointers.

Update error message to provide steps to resolve submodule merge
conflict. Future work could involve adding an advice flag to the
message. Although the message is long, it also has the id of the
submodule commit that needs to be merged, which could be useful
information for the user.

Additionally, 5 merge failures that resulted in an early return have
been updated to reflect the status of the merge.
1. Null merge base (null o): CONFLICT_SUBMODULE_NULL_MERGE_BASE added
   as a new conflict type and will print updated error message.
2. Null merge side a (null a): BUG(). See [1] for discussion
3. Null merge side b (null b): BUG(). See [1] for discussion
4. Submodule not checked out: added NEEDSWORK bit
5. Submodule commits not present: added NEEDSWORK bit
The errors with a NEEDSWORK bit deserve a more detailed explanation of
how to resolve them. See [2] for more context.

[1] https://lore.kernel.org/git/CABPp-BE0qGwUy80dmVszkJQ+tcpfLRW0OZyErymzhZ9+HWY1mw@mail.gmail.com/
[2] https://lore.kernel.org/git/xmqqpmhjjwo9.fsf@gitster.g/

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 13:43:07 -07:00
cb54fc93e4 doc: clarify rerere-autoupdate
The "--[no-]rerere-autoupdate" option controls what happens _after_
the rerere mechanism kicks in to reuse recorded resolutions and does
not prevent from the rerere mechanism to trigger in the first place.

It is unclear in the current text if "--no-rerere-autoupdate" stops
the auto-resolution.  Rewrite the sentence to clarify.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 13:57:25 -07:00
0dbc715ae0 doc: consolidate --rerere-autoupdate description
The `--rerere-autoupdate` option is shared across 5 commands, and
are described the same way because it works exactly the same way in
these commands.

Create a separate file and include it from the help pages for these
commands, so that we can improve the description at one place to
improve all of them at once, and keep them in sync.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 13:47:11 -07:00
4af7188bc9 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 13:36:09 -07:00
30c6495e1e Merge branch 'jc/string-list-cleanup'
Code clean-up.

* jc/string-list-cleanup:
  builtin/remote.c: use the right kind of STRING_LIST_INIT
2022-08-03 13:36:09 -07:00
966ff64a30 Merge branch 'en/merge-restore-to-pristine'
When "git merge" finds that it cannot perform a merge, it should
restore the working tree to the state before the command was
initiated, but in some corner cases it didn't.

* en/merge-restore-to-pristine:
  merge: do not exit restore_state() prematurely
  merge: ensure we can actually restore pre-merge state
  merge: make restore_state() restore staged state too
  merge: fix save_state() to work when there are stat-dirty files
  merge: do not abort early if one strategy fails to handle the merge
  merge: abort if index does not match HEAD for trivial merges
  merge-resolve: abort if index does not match HEAD
  merge-ort-wrappers: make printed message match the one from recursive
2022-08-03 13:36:09 -07:00
4e0d160bbc Merge branch 'rs/mergesort'
Make our mergesort implementation type-safe.

* rs/mergesort:
  mergesort: remove llist_mergesort()
  packfile: use DEFINE_LIST_SORT
  fetch-pack: use DEFINE_LIST_SORT
  commit: use DEFINE_LIST_SORT
  blame: use DEFINE_LIST_SORT
  test-mergesort: use DEFINE_LIST_SORT
  test-mergesort: use DEFINE_LIST_SORT_DEBUG
  mergesort: add macros for typed sort of linked lists
  mergesort: tighten merge loop
  mergesort: unify ranks loops
2022-08-03 13:36:09 -07:00
87098a047b Merge branch 'sa/cat-file-mailmap'
"git cat-file" learned an option to use the mailmap when showing
commit and tag objects.

* sa/cat-file-mailmap:
  cat-file: add mailmap support
  ident: rename commit_rewrite_person() to apply_mailmap_to_header()
  ident: move commit_rewrite_person() to ident.c
  revision: improve commit_rewrite_person()
2022-08-03 13:36:08 -07:00
8e56affcb5 Merge branch 'zh/ls-files-format'
"git ls-files" learns the "--format" option to tweak its output.

* zh/ls-files-format:
  ls-files: introduce "--format" option
2022-08-03 13:36:08 -07:00
37e4bdd5ee Merge branch 'tb/commit-graph-genv2-upgrade-fix'
There was a bug in the codepath to upgrade generation information
in commit-graph from v1 to v2 format, which has been corrected.

* tb/commit-graph-genv2-upgrade-fix:
  commit-graph: fix corrupt upgrade from generation v1 to v2
  commit-graph: introduce `repo_find_commit_pos_in_graph()`
  t5318: demonstrate commit-graph generation v2 corruption
2022-08-03 13:36:08 -07:00
f1a0db23ad Merge branch 'tk/untracked-cache-with-uall'
Fix for a bug that makes write-tree to fail to write out a
non-existent index as a tree, introduced in 2.37.

* tk/untracked-cache-with-uall:
  read-cache: make `do_read_index()` always set up `istate->repo`
2022-08-03 13:36:07 -07:00
0f609558fc Merge branch 'pw/xdiff-alloc'
Add a level of redirection to array allocation API in xdiff part,
to make it easier to share with the libgit2 project.

* pw/xdiff-alloc:
  xdiff: introduce XDL_ALLOC_GROW()
  xdiff: introduce XDL_CALLOC_ARRAY()
  xdiff: introduce xdl_calloc
  xdiff: introduce XDL_ALLOC_ARRAY()
2022-08-03 13:36:07 -07:00
acbec18d8e Merge branch 'ds/midx-with-less-memory'
The codepath to write multi-pack index has been taught to release a
large chunk of memory that holds an array of objects in the packs,
as soon as it is done with the array, to reduce memory consumption.

* ds/midx-with-less-memory:
  write_midx_bitmap(): drop unused refs_snapshot parameter
  midx: reduce memory pressure while writing bitmaps
  midx: extract bitmap write setup
  pack-bitmap-write: use const for hashes
2022-08-03 13:36:06 -07:00
f92dbdbc6a revisions API: don't leak memory on argv elements that need free()-ing
Add a "free_removed_argv_elements" member to "struct
setup_revision_opt", and use it to fix several memory leaks.

We have various memory leaks in APIs that take and munge "const
char **argv", e.g. parse_options(). Sometimes these APIs are given the
"argv" we get to the "main" function, in which case we don't leak
memory, but other times we're giving it the "v" member of a "struct
strvec" we created.

There's several potential ways to fix those sort of leaks, we could
add a "nodup" mode to "struct strvec", which would work for the cases
where we push constant strings to it. But that wouldn't work as soon
as we used strvec_pushf(), or otherwise needed to duplicate or create
a string for that "struct strvec".

Let's instead make it the responsibility of the revisions API. If it's
going to clobber elements of argv it can also free() them, which it
will now do if instructed to do so via "free_removed_argv_elements".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 11:12:36 -07:00
57efebb9b9 bisect.c: partially fix bisect_rev_setup() memory leak
Partially fix the memory leak noted in in 8a534b6124 (bisect: use
argv_array API, 2011-09-13), which added the "XXX" comment seen in the
context. We can partially fix it by having the bisect_rev_setup()
function take a "struct strvec", rather than constructing it.

As the comment notes we need to keep the construct "rev_argv" around
while the "struct rev_info" is around, which as seen in the newly
added "strvec_clear()" calls here we do after "release_revisions()".

This "partially" fixes the memory leak because we're leaking the "--"
added to the "rev_argv" here still, which will be addressed in a
subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 11:01:03 -07:00
f89d085b3f log: refactor "rev.pending" code in cmd_show()
Refactor the juggling of "rev.pending" and our replacement for it
amended in the preceding commit so that:

 * We use an "unsigned int" instead of an "int" for "i", this matches
   the types of "struct rev_info" itself.

 * We don't need the "count" and "objects" variables introduced in
   5d7eeee2ac (git-show: grok blobs, trees and tags, too, 2006-12-14).

   They were originally added since we'd clobber rev.pending in the
   loop without restoring it. Since the preceding commit we are
   restoring it when we handle OBJ_COMMIT, so the main for-loop can
   refer to "rev.pending" didrectly.

 * We use the "memcpy a &blank" idiom introduced in
   5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
   macro, 2021-07-01).

   This is more obvious than relying on us enumerating all of the
   relevant members of the "struct object_array" that we need to
   clear.

 * We comment on why we don't need an object_array_clear() here, see
   the analysis in [1].

1. https://lore.kernel.org/git/YuQtJ2DxNKX%2Fy70N@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:54:20 -07:00
055e57b7b2 log: fix a memory leak in "git show <revision>..."
Fix a memory leak in code added in 5d7eeee2ac (git-show: grok blobs,
trees and tags, too, 2006-12-14). As we iterate over a "<revision>..."
command-line and encounter ad OBJ_COMMIT we want to use our "struct
rev_info", but with a "pending" array of one element: the one commit
we're showing in the loop.

To do this 5d7eeee2ac saved away a pointer to rev.pending.objects and
rev.pending.nr for its iteration. We'd then clobber those (and alloc)
when we needed to show an OBJ_COMMIT.

We'd therefore leak the "rev.pending" we started out with, and only
free the new "rev.pending" in the "OBJ_COMMIT" case arm as
prepare_revision_walk() would draw it down.

Let's fix this memory leak. Now when we encounter an OBJ_COMMIT we
save away the "rev.pending" before clearing it. We then add a single
commit to it, which our indirect invocation of prepare_revision_walk()
will remove. After that we restore the "rev.pending".

Our "rev.pending" will then get free'd by the release_revisions()
added in f6bfea0ad0 (revisions API users: use release_revisions() in
builtin/log.c, 2022-04-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:16:28 -07:00
c541e77cf8 test-fast-rebase helper: use release_revisions() (again)
Fix a bug in 0139c58ab9 (revisions API users: add "goto cleanup" for
release_revisions(), 2022-04-13), in that commit a release_revisions()
call was added to this function, but it never did anything due to this
TODO memset() added in fe1a21d526 (fast-rebase: demonstrate
merge-ort's API via new test-tool command, 2020-10-29).

Simply removing the memset() will fix the "cmdline" which can be seen
when running t5520-pull.sh.

This sort of thing could be detected automatically with a rule similar
to the unused.cocci merged in 7fa60d2a5b6 (Merge branch
'ab/cocci-unused' into next, 2022-07-11). The following rule on top
would catch the case being fixed here:

	@@
	type T;
	identifier I;
	identifier REL1 =~ "^[a-z_]*_(release|reset|clear|free)$";
	identifier REL2 =~ "^(release|clear|free)_[a-z_]*$";
	@@

	- memset(\( I \| &I \), 0, ...);
	  ... when != \( I \| &I \)
	(
	  \( REL1 \| REL2 \)( \( I \| &I \), ...);
	|
	  \( REL1 \| REL2 \)( \( &I \| I \) );
	)
	  ... when != \( I \| &I \)

That rule should arguably use only &I, not I (as we might be passed a
pointer). The distinction would matter if anyone cared about the
side-effects of a memset() followed by release() of a pointer to a
variable passed into the function.

As such a pattern would be at best very confusing, and most likely
point to buggy code as in this case, the above rule is probably fine
as-is.

But as this rule only found one such bug in the entire codebase let's
not add it to contrib/coccinelle/unused.cocci for now, we can always
dig it up in the future if it's deemed useful.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:13:50 -07:00
c5365e93fd bisect.c: add missing "goto" for release_revisions()
Add a missing "goto cleanup", this fixes a bug in
f196c1e908 (revisions API users: use release_revisions() needing
REV_INFO_INIT, 2022-04-13).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:12:12 -07:00
04ede97211 symbolic-ref: refuse to set syntactically invalid target
You can feed absolute garbage to symbolic-ref as a target like:

  git symbolic-ref HEAD refs/heads/foo..bar

While this doesn't technically break the repo entirely (our "is it a git
directory" detector looks only for "refs/" at the start), we would never
resolve such a ref, as the ".." is invalid within a refname.

Let's flag these as invalid at creation time to help the caller realize
that what they're asking for is bogus.

A few notes:

  - We use REFNAME_ALLOW_ONELEVEL here, which lets:

     git update-ref refs/heads/foo FETCH_HEAD

    continue to work. It's unclear whether anybody wants to do something
    so odd, but it does work now, so this is erring on the conservative
    side. There's a test to make sure we didn't accidentally break this,
    but don't take that test as an endorsement that it's a good idea, or
    something we might not change in the future.

  - The test in t4202-log.sh checks how we handle such an invalid ref on
    the reading side, so it has to be updated to touch the HEAD file
    directly.

  - We need to keep our HEAD-specific check for "does it start with
    refs/". The ALLOW_ONELEVEL flag means we won't be enforcing that for
    other refs, but HEAD is special here because of the checks in
    validate_headref().

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-01 12:17:13 -07:00
350dc9f0e8 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-01 09:58:39 -07:00
d02cc45c7a Merge branch 'mt/pkt-line-comment-tweak'
In-code comment clarification.
source: <6a14443c101fa132498297af6d7a483520688d75.1658488203.git.matheus.bernardino@usp.br>

* mt/pkt-line-comment-tweak:
  pkt-line.h: move comment closer to the associated code
2022-08-01 09:58:39 -07:00
acdb1e1053 Merge branch 'mt/checkout-count-fix'
"git checkout" miscounted the paths it updated, which has been
corrected.
source: <cover.1657799213.git.matheus.bernardino@usp.br>

* mt/checkout-count-fix:
  checkout: fix two bugs on the final count of updated entries
  checkout: show bug about failed entries being included in final report
  checkout: document bug where delayed checkout counts entries twice
2022-08-01 09:58:38 -07:00
f0f9a033ed Merge branch 'cl/rerere-train-with-no-sign'
"rerere-train" script (in contrib/) used to honor commit.gpgSign
while recreating the throw-away merges.
source: <PH7PR14MB5594A27B9295E95ACA4D6A69CE8F9@PH7PR14MB5594.namprd14.prod.outlook.com>

* cl/rerere-train-with-no-sign:
  contrib/rerere-train: avoid useless gpg sign in training
2022-08-01 09:58:38 -07:00
3d8e3dc4fc Merge branch 'ds/rebase-update-ref'
"git rebase -i" learns to update branches whose tip appear in the
rebased range with "--update-refs" option.
source: <pull.1247.v5.git.1658255624.gitgitgadget@gmail.com>

* ds/rebase-update-ref:
  sequencer: notify user of --update-refs activity
  sequencer: ignore HEAD ref under --update-refs
  rebase: add rebase.updateRefs config option
  sequencer: rewrite update-refs as user edits todo list
  rebase: update refs from 'update-ref' commands
  rebase: add --update-refs option
  sequencer: add update-ref command
  sequencer: define array with enum values
  rebase-interactive: update 'merge' description
  branch: consider refs under 'update-refs'
  t2407: test branches currently using apply backend
  t2407: test bisect and rebase as black-boxes
2022-08-01 09:58:38 -07:00
e59acea3f0 Merge branch 'kk/p4-client-name-encoding-fix'
"git p4" did not handle non-ASCII client name well, which has been
corrected.
source: <pull.1285.v3.git.git.1658394440.gitgitgadget@gmail.com>

* kk/p4-client-name-encoding-fix:
  git-p4: refactoring of p4CmdList()
  git-p4: fix bug with encoding of p4 client name
2022-08-01 09:58:37 -07:00
32ed3314c1 t5351: avoid using test_cmp for binary data
The `test_cmp` function is meant to provide nicer output than `cmp` when
expected and actual output of Git commands disagree. The implicit
assumption is that the output is line-based and human readable.

However, aaf81223f4 (unpack-objects: use stream_loose_object() to
unpack large objects, 2022-06-11) introduced a call that compares the
contents of pack files, which are distinctly not line-based nor human
readable.

This causes problems because on Windows, we hand off to the Bash
function `mingw_test_cmp` that compares the lines while ignoring line
ending differences. And this Bash function spends an insane amount of
cycles trying to read in that binary pack file, so that it is almost
indistinguishable from an infinite loop.

For example, t5351 took 1486 seconds in the CI run at
https://github.com/git/git/runs/7398490747?check_suite_focus=true#step:5:171,
to complete. And yes, that is almost half an hour.

Since Git's tests already use `cmp` consistently when comparing pack
files, let's change this instance to use `cmp` instead of `test_cmp`,
too, and fix that performance problem.

Now t5351 takes all of 22 seconds.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-29 09:09:07 -07:00
ce50f1f3ac t5351: avoid relying on core.fsyncMethod = batch to be supported
On FreeBSD, this mode is not supported. But since 3a251bac0d (trace2:
only include "fsync" events if we git_fsync(), 2022-07-18) t5351 will
fail if this mode is unsupported.

Let's address this in the minimal fashion, by detecting that that mode
is unsupported and expecting a different count of hardware flushes in
that case.

This fixes the CI/PR builds on FreeBSD again.

Note: A better way would be to test only what is relevant in t5351.6
"unpack big object in stream (core.fsyncmethod=batch)" again instead of
blindly comparing the output against some exact text. But that would
pretty much revert the idea of above-mentioned commit, and that commit
was _just_ accepted into Git's main branch so one must assume that it
was intentional.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-29 09:08:57 -07:00
c24feabcfb CI: use "GIT_TEST_SANITIZE_LEAK_LOG=true" in linux-leaks
As noted in a preceding commit the leak checking done by
"GIT_TEST_PASSING_SANITIZE_LEAK=true" (added in [1]) is incomplete
without combining it with "GIT_TEST_SANITIZE_LEAK_LOG=true".

Let's run our CI with that, to ensure that we catch cases where our
tests are missing the abort() exit code resulting from a leak for
whatever reason. The reasons for that are discussed in detail in a
preceding commit.

1. 956d2e4639 (tests: add a test mode for SANITIZE=leak, run it in
   CI, 2021-09-23)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
c68d5dbc94 upload-pack: fix a memory leak in create_pack_file()
Fix a memory leak that's been reported by some versions of "gcc" since
"output_state" became malloc'd in 55a9651d26 (upload-pack.c: increase
output buffer size, 2021-12-14).

In e75d2f7f73 (revisions API: have release_revisions() release
"filter", 2022-04-13) it was correctly marked as leak-free, the only
path through this function that doesn't reach the free(output_state)
is if we "goto fail", and that will invoke "die()".

Such leaks are not included with SANITIZE=leak (but e.g. valgrind will
still report them), but under some gcc optimization (I have not been
able to reproduce it with "clang") we'll report a leak here
anyway. E.g. gcc v12 with "-O2" and above will trigger it, but not
clang v13 with any "-On".

The GitHub CI would also run into this leak if the "linux-leaks" job
was made to run with "GIT_TEST_SANITIZE_LEAK_LOG=true".

See [1] for a past case where gcc had similar trouble analyzing leaks
involving a die() invocation in the function.

1. https://lore.kernel.org/git/patch-v3-5.6-9a44204c4c9-20211022T175227Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
3e3b9321ca leak tests: mark passing SANITIZE=leak tests as leak-free
Mark those remaining tests that pass when run under SANITIZE=leak with
TEST_PASSES_SANITIZE_LEAK=true, these were either omitted in
f346fcb62a (Merge branch 'ab/mark-leak-free-tests-even-more',
2021-12-15) and 5a4f8381b6 (Merge branch 'ab/mark-leak-free-tests',
2021-10-25), or have had their memory leaks fixed since then.

With this change there's now a a one-to-one mapping between those
tests that we have opted-in via "TEST_PASSES_SANITIZE_LEAK=true", and
those that pass with the new "check" mode:

	GIT_TEST_PASSING_SANITIZE_LEAK=check \
	GIT_TEST_SANITIZE_LEAK_LOG=true \
	make test SANITIZE=leak

Note that the "GIT_TEST_SANITIZE_LEAK_LOG=true" is needed due to the
edge cases noted in a preceding commit, i.e. in some cases we'd pass
the test itself, but still have outstanding leaks due to ignored exit
codes.

The "GIT_TEST_SANITIZE_LEAK_LOG=true" corrects for that, we're only
marking those tests as passing that really don't have any leaks,
whether that was reflected in their exit code or not.

Note that the change here to "t9100-git-svn-basic.sh" is marking that
test as passing under SANITIZE=leak, we're removing a
"TEST_FAILS_SANITIZE_LEAK=true" line, not
"TEST_PASSES_SANITIZE_LEAK=true". See 7a98d9ab00 (revisions API: have
release_revisions() release "cmdline", 2022-04-13) for the
introduction of that t/lib-git-svn.sh-specific variable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
96ecf699aa leak tests: don't skip some tests under SANITIZE=leak
The '!SANITIZE_LEAK' prerequisite added in 956d2e4639 (tests: add a
test mode for SANITIZE=leak, run it in CI, 2021-09-23) has been used
in various tests to skip individual tests in otherwise leak-free
tests.

Let's change the cases that have become leak-free since then to run
under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
faececa53f test-lib: have the "check" mode for SANITIZE=leak consider leak logs
As noted in previous on-list discussions[1] we have various tests that
will falsely report being leak-free because we're missing the relevant
exit code from LSAN as summarized below.

We should fix those issues, but in the meantime and as an additional
sanity check we can and should consider our own ASAN logs before
reporting that a test is leak-free.

Before this compiling with SANITIZE=leak and running:

    ./t6407-merge-binary.sh

Will exit successfully, now we'll get an error and an informative
message on:

    GIT_TEST_SANITIZE_LEAK_LOG=true ./t6407-merge-binary.sh

Even better, as noted in the updated t/README we'll now error out when
combined with the "check" mode:

    GIT_TEST_PASSING_SANITIZE_LEAK=check \
    GIT_TEST_SANITIZE_LEAK_LOG=true \
	./t4058-diff-duplicates.sh

Why do we miss these leaks? Because:

 * We have leaks inside "test_expect_failure" blocks, which by design
   will not distinguish a "normal" failure from an abort() or
   segfault. See [1] for a discussion of it shortcomings.

 * We have "git" invocations outside of "test_expect_success",
   e.g. setup code in the main body of the test, or in test helper
   functions that don't use &&-chaining.

 * Our tests will otherwise catch segfaults and abort(), but if we
   invoke a command that invokes another command it needs to ferry the
   exit code up to us.

   Notably a command that e.g. might invoke "git pack-objects" might
   itself exit with status 128 if that "pack-objects" segfaults or
   abort()'s. If the test invoking the parent command(s) is using
   "test_must_fail" we'll consider it an expected "ok" failure.

 * run-command.c doesn't (but probably should) ferry up such exit
   codes, so for e.g. "git push" tests where we expect a failure and an
   underlying "git" command fails we won't ferry up the segfault or
   abort exit code.

 * We have gitweb.perl and some other perl code ignoring return values
   from close(), i.e. ignoring exit codes from "git rev-parse" et al.

 * We have in-tree shellscripts like "git-merge-one-file.sh" invoking
   git commands, they'll usually return their own exit codes on "git"
   failure, rather then ferrying up segfault or abort() exit code.

   E.g. these invocations in git-merge-one-file.sh leak, but aren't
   reflected in the "git merge" exit code:

	src1=$(git unpack-file $2)
	src2=$(git unpack-file $3)

   That case would be easily "fixed" by adding a line like this after
   each assignment:

	test $? -ne 0 && exit $?

   But we'd then in e.g. "t6407-merge-binary.sh" run into
   write_tree_trivial() in "builtin/merge.c" calling die() instead of
   ferrying up the relevant exit code.

Let's remove "TEST_PASSES_SANITIZE_LEAK=true" from tests we
were falsely marking as leak-free.

In the case of t6407-merge-binary.sh it was marked as leak-free in
9081a421a6 (checkout: fix "branch info" memory leaks,
2021-11-16). I'd previously removed other bad
"TEST_PASSES_SANITIZE_LEAK=true" opt-ins in the series merged in
ea05fd5fbf (Merge branch 'ab/keep-git-exit-codes-in-tests',
2022-03-16). The case of t1060-object-corruption.sh is more subtle,
and will be discussed in a subsequent commit.

1. https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
e92684e1a2 test-lib: add a GIT_TEST_PASSING_SANITIZE_LEAK=check mode
Add a new "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode to the
test-lib.sh.

As noted in the updated "t/README" this compliments the existing
"GIT_TEST_PASSING_SANITIZE_LEAK=true" mode added in
956d2e4639 (tests: add a test mode for SANITIZE=leak, run it in CI,
2021-09-23).

Rather than document this all in one (even more) dense paragraph split
up the discussion of how it combines with --immediate into its own
paragraph following the discussion of
"GIT_TEST_SANITIZE_LEAK_LOG=true".

Before the removal of "test_external" in a preceding commit we would
have had to special-case t9700-perl-git.sh and t0202-gettext-perl.sh.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
5beca49a0b test-lib: simplify by removing test_external
Remove the "test_external" function added in [1]. This arguably makes
the output of t9700-perl-git.sh and friends worse. But as we'll argue
below the trade-off is worth it, since "chaining" to another TAP
emitter in test-lib.sh is more trouble than it's worth.

The new output of t9700-perl-git.sh is now:

	$ ./t9700-perl-git.sh
	ok 1 - set up test repository
	ok 2 - use t9700/test.pl to test Git.pm
	# passed all 2 test(s)
	1..2

Whereas before this change it would be:

	$ ./t9700-perl-git.sh
	ok 1 - set up test repository
	# run 1: Perl API (perl /home/avar/g/git/t/t9700/test.pl)
	ok 2 - use Git;
	[... omitting tests 3..46 from t/t9700/test.pl ...]
	ok 47 - unquote escape sequences
	1..47
	# test_external test Perl API was ok
	# test_external_without_stderr test no stderr: Perl API was ok

At the time of its addition supporting "test_external" was easy, but
when test-lib.sh itself started to emit TAP in [2] we needed to make
everything surrounding the emission of the plan consider
"test_external". I added that support in [2] so that we could run:

	prove ./t9700-perl-git.sh :: -v

But since then in [3] the door has been closed on combining
$HARNESS_ACTIVE and -v, we'll now just die:

	$ prove ./t9700-perl-git.sh :: -v
	Bailout called.  Further testing stopped:  verbose mode forbidden under TAP harness; try --verbose-log
	FAILED--Further testing stopped: verbose mode forbidden under TAP harness; try --verbose-log

So the only use of this has been that *if* we had failure in one of
these tests we could e.g. in CI see which test failed based on the
test number. Now we'll need to look at the full verbose logs to get
that same information.

I think this trade-off is acceptable given the reduction in
complexity, and it brings these tests in line with other similar
tests, e.g. the reftable tests added in [4] will be condensed down to
just one test, which invokes the C helper:

	$ ./t0032-reftable-unittest.sh
	ok 1 - unittests
	# passed all 1 test(s)
	1..1

It would still be nice to have that ":: -v" form work again, it
never *really* worked, but even though we've had edge cases test
output screwing up the TAP it mostly worked between d998bd4ab6 and
[3], so we may have been overzealous in forbidding it outright.

I have local patches which I'm planning to submit sooner than later
that get us to that goal, and in a way that isn't buggy. In the
meantime getting rid of this special case makes hacking on this area
of test-lib.sh easier, as we'll do in subsequent commits.

The switch from "perl" to "$PERL_PATH" here is because "perl" is
defined as a shell function in the test suite, see a5bf824f3b (t:
prevent '-x' tracing from interfering with test helpers' stderr,
2018-02-25). On e.g. the OSX CI the "command perl"... will be part of
the emitted stderr.

1. fb32c41008 (t/test-lib.sh: add test_external and
   test_external_without_stderr, 2008-06-19)
2. d998bd4ab6 (test-lib: Make the test_external_* functions
   TAP-aware, 2010-06-24)
3. 614fe01521 (test-lib: bail out when "-v" used under
   "prove", 2016-10-22)
4. ef8a6c6268 (reftable: utility functions, 2021-10-07)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
64f3f5a3f6 tests: move copy/pasted PERL + Test::More checks to a lib-perl.sh
Since the original "perl -MTest::More" prerequisite check was added in
[1] it's been copy/pasted in [2], [3] and [4]. As we'll be changing
these codepaths in a subsequent commit let's consolidate these.

While we're at it let's move these to a lazy prereq, and make them
conform to our usual coding style (e.g. "\nthen", not "; then").

1. e46f9c8161 (t9700: skip when Test::More is not available,
   2008-06-29)
2. 5e9637c629 (i18n: add infrastructure for translating Git with
   gettext, 2011-11-18)
3. 8d314d7afe (send-email: reduce dependencies impact on
   parse_address_line, 2015-07-07)
4. f07eeed123 (git-credential-netrc: adapt to test framework for git,
   2018-05-12)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
fee65b194d t/Makefile: don't remove test-results in "clean-except-prove-cache"
When "make test" is run with the default of "DEFAULT_TEST_TARGET=test"
we'll leave the "test-results" directory in-place, but don't do so for
the "prove" target.

The reason for this is that when 28d836c815 (test: allow running the
tests under "prove", 2010-10-14) allowed for running the tests under
"prove" there was no point in leaving the "test-results" in place.

The "prove" target provides its own summary, so we don't need to run
"aggregate-results", which is the reason we have "test-results" in the
first place. See 2d84e9fb6d (Modify test-lib.sh to output stats to
t/test-results/*, 2008-06-08).

But in a subsequent commit test-lib.sh will start emitting reports of
memory leaks in test-results/*, and it will be useful to analyze these
after the fact.

This wouldn't be a problem as failing tests will halt the removal of
the files (we'll never reach "clean-except-prove-cache" from the
"prove" target), but will be subsequently as we'll want to report a
successful run, but might still have e.g. logs of known memory leaks
in test-results/*.

So let's stop removing this, it's sufficient that "make clean" removes
it, and that "pre-clean" (which both "test" and "prove" depend on)
will remove it, i.e. we'll never have a stale "test-results" because
of this change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
366bd129dc test-lib: add a SANITIZE=leak logging mode
Add the ability to run the test suite under a new
"GIT_TEST_SANITIZE_LEAK_LOG=true" mode, when true we'll log the leaks
we find an a new "test-results/<test-name>.leak" directory.

That new path is consistent with the existing
"test-results/<test-name>.<type>" results, except that those are all
files, not directories.

We also set "log_exe_name=1" to include the name of the executable in
the filename. This gives us files like "trace.git.<pid>" instead of
the default of "trace.<pid>". I.e. we'll be able to distinguish "git"
leaks from "test-tool", "git-daemon" etc.

We then set "dedup_token_length" to non-zero ("0" is the default) to
succinctly log a token we can de-duplicate these stacktraces on. The
string is simply a one-line stack-trace with only function names up to
N frames, which we limit at "9999" as a shorthand for
"infinite" (there appears to be no way to say "no limit").

With these combined we can now easily get e.g. the top 10 leaks in the
test suite grouped by full stacktrace:

    grep -o -P -h '(?<=DEDUP_TOKEN: ).*' test-results/*.leak/trace.git.* | sort | uniq -c | sort -nr | head -n 10

Or add "grep -E -o '[^-]+'" to that to group by functions instead of
stack traces:

    grep -o -P -h '(?<=DEDUP_TOKEN: ).*' test-results/*.leak/trace.git.* | grep -E -o '[^-]+' | sort | uniq -c | sort -nr | head -n 20

This new mode requires git to be compiled with SANITIZE=leak, rather
than explaining that in the documentation let's make it
self-documenting by bailing out if the user asks for this without git
having been compiled with SANITIZE=leak, as we do with
GIT_TEST_PASSING_SANITIZE_LEAK=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
ac8e3e94e5 t/README: reword the "GIT_TEST_PASSING_SANITIZE_LEAK" description
Reword the documentation added in 956d2e4639 (tests: add a test mode
for SANITIZE=leak, run it in CI, 2021-09-23) for brevity.

The comment added in the same commit was also misleading: We skip
certain tests if SANITIZE=leak and GIT_TEST_PASSING_SANITIZE_LEAK=true,
not if we're compiled with SANITIZE=leak. Let's just remove the
comment, the control flow here is obvious enough that the code can
speak for itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
46fb057aaa test-lib: add a --invert-exit-code switch
Add the ability to have those tests that fail return 0, and those
tests that succeed return 1. This is useful e.g. to run "--stress"
tests on tests that fail 99% of the time on some setup, i.e. to smoke
out the flaky run which yielded success.

In a subsequent commit a new SANITIZE=leak mode will make use of this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:40 -07:00
25c2351d85 test-lib: fix GIT_EXIT_OK logic errors, use BAIL_OUT
Change various "exit 1" checks that happened after our "die" handler
had been set up to use BAIL_OUT instead. See 234383cd40 (test-lib.sh:
use "Bail out!" syntax on bad SANITIZE=leak use, 2021-10-14) for the
benefits of the BAIL_OUT function.

The previous use of "error" here was not a logic error, but the "exit"
without "GIT_EXIT_OK" would emit the "FATAL: Unexpected exit with code
$code" message on top of the error we wanted to emit.

Since we'd also like to stop "prove" in its tracks here, the right
thing to do is to emit a "Bail out!" message.

Let's also move the "GIT_EXIT_OK=t" assignments to just above the
"exit [01]" in "test_done". It's not OK if we exit in
e.g. finalize_test_output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:39 -07:00
e0258f15cb test-lib: don't set GIT_EXIT_OK before calling test_atexit_handler
Change the control flow in test_done so that we'll set GIT_EXIT_OK=t
after we call test_atexit_handler(). This seems to have been a mistake
in 900721e15c (test-lib: introduce 'test_atexit', 2019-03-13). It
doesn't make sense to allow our "atexit" handling to call "exit"
without us emitting the errors we'll emit without GIT_EXIT_OK=t being
set.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:39 -07:00
6d00680de2 test-lib: use $1, not $@ in test_known_broken_{ok,failure}_
Clarify that these two functions never take N arguments, they'll only
ever receive one. They've needlessly used $@ over $1 since
41ac414ea2 (Sane use of test_expect_failure, 2008-02-01).

In the future we might want to pass the test source to these, but now
that's not the case. This preparatory change helps to clarify a
follow-up change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 16:35:39 -07:00
23b219f8e3 Sync with 'maint' 2022-07-27 13:42:09 -07:00
15b17e6480 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 09:16:55 -07:00
745af8c7b5 Merge branch 'mb/p4-utf16-crlf'
"git p4" working on UTF-16 files on Windows did not implement
CRLF-to-LF conversion correctly, which has been corrected.

* mb/p4-utf16-crlf:
  git-p4: fix CR LF handling for utf16 files
2022-07-27 09:16:55 -07:00
04e340b29b Merge branch 'mb/p4-fixes'
Fix a few issues in "git p4".

* mb/p4-fixes:
  git-p4: fix error handling in P4Unshelve.renameBranch()
  git-p4: fix typo in P4Submit.applyCommit()
2022-07-27 09:16:55 -07:00
13d0c00049 Merge branch 'ds/win-syslog-compiler-fix'
Workaround for a false positive compiler warning.

* ds/win-syslog-compiler-fix:
  compat/win32: correct for incorrect compiler warning
2022-07-27 09:16:54 -07:00
eacf9b2bb6 Merge branch 'ld/osx-keychain-usage-fix'
Workaround for a compiler warning against use of die() in
osx-keychain (in contrib/).

* ld/osx-keychain-usage-fix:
  osx-keychain: fix compiler warning
2022-07-27 09:16:54 -07:00
3a03633812 Merge branch 'vd/scalar-doc'
Doc update.

* vd/scalar-doc:
  scalar: convert README.md into a technical design doc
  scalar: reword command documentation to clarify purpose
2022-07-27 09:16:54 -07:00
0c5222b6c5 Merge branch 'ds/doc-wo-whitelist'
Avoid "white/black-list" in documentation and code comments.

* ds/doc-wo-whitelist:
  transport.c: avoid "whitelist"
  t: avoid "whitelist"
  git.txt: remove redundant language
  git-cvsserver: clarify directory list
  daemon: clarify directory arguments
2022-07-27 09:16:54 -07:00
6fa54b8fb5 Merge branch 'mb/config-document-include'
Add missing documentation for "include" and "includeIf" features in
"git config" file format, which incidentally teaches the command
line completion to include them in its offerings.

* mb/config-document-include:
  config.txt: document include, includeIf
2022-07-27 09:16:53 -07:00
a88527203f Merge branch 'sg/index-format-doc-update'
Docfix.

* sg/index-format-doc-update:
  index-format.txt: remove outdated list of supported extensions
2022-07-27 09:16:53 -07:00
6a591a3173 Merge branch 'ma/sparse-checkout-cone-doc-fix'
Docfix.

* ma/sparse-checkout-cone-doc-fix:
  config/core.txt: fix minor issues for `core.sparseCheckoutCone`
2022-07-27 09:16:53 -07:00
7787a6c3ee Merge branch 'ma/t4200-update'
Test fix.

* ma/t4200-update:
  t4200: drop irrelevant code
2022-07-27 09:16:53 -07:00
cc29f89032 Merge branch 'tl/pack-bitmap-error-messages'
Tweak various messages that come from the pack-bitmap codepaths.

* tl/pack-bitmap-error-messages:
  pack-bitmap.c: continue looping when first MIDX bitmap is found
  pack-bitmap.c: using error() instead of silently returning -1
  pack-bitmap.c: do not ignore error when opening a bitmap file
  pack-bitmap.c: rename "idx_name" to "bitmap_name"
  pack-bitmap.c: mark more strings for translations
  pack-bitmap.c: fix formatting of error messages
2022-07-27 09:16:52 -07:00
7c7719ac0e Merge branch 'ab/squelch-empty-fsync-traces'
Omit fsync-related trace2 entries when their values are all zero.

* ab/squelch-empty-fsync-traces:
  trace2: only include "fsync" events if we git_fsync()
2022-07-27 09:16:52 -07:00
36d7bd19cf Merge branch 'js/commit-graph-parsing-without-repo-settings'
API tweak to make it easier to run fuzz testing on commit-graph parser.

* js/commit-graph-parsing-without-repo-settings:
  commit-graph: pass repo_settings instead of repository
2022-07-27 09:16:52 -07:00
51d1b69a53 write_midx_bitmap(): drop unused refs_snapshot parameter
The refactoring in 90b2bb710d (midx: extract bitmap write setup,
2022-07-19) hoisted our call to find_commits_for_midx_bitmap() into the
caller, which means we no longer need to see the refs_snapshot at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-27 00:02:45 -07:00
776f184893 config.c: NULL check when reading protected config
In read_protected_config(), check whether each file name is NULL before
attempting to read it, and add a BUG() call to
git_config_from_file_with_options() to make this error easier to catch
in the future.

The NULL checks mirror what do_git_config_sequence() does (which
read_protected_config() is modeled after). Without these NULL checks,
multiple tests fail with "make SANITIZE=address", e.g. in the final test
of t4010, xdg_config is NULL causing us to call fopen(NULL).

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-26 23:46:01 -07:00
1007557a4e fetch-pack: write effective filter to trace2
Administrators of a managed Git environment (like the one at $DAYJOB)
might want to quantify the performance change of fetches with and
without filters from the client's point of view, and also detect if a
server does not support it. Therefore, log the filter information being
sent to the server whenever a fetch (or clone) occurs. Note that this is
not necessarily the same as what's specified on the CLI, because during
a fetch, the configured filter is used whenever a filter is not
specified on the CLI.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-26 23:45:01 -07:00
75707da4fa gitweb: remove title shortening heuristics
Those heuristics are way outdated and too specific to the kernel project
to be useful outside of kernel.org.  Since kernel.org doesn't use gitweb
anymore and at least one project complained about incorrect behavior,
entirely remove them.

Signed-off-by: Julien Rouhaud <julien.rouhaud@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-26 23:33:17 -07:00
ce74de931d ls-files: introduce "--format" option
Add a new option "--format" that outputs index entries
informations in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

"--format" cannot used with "-s", "-o", "-k", "-t",
" --resolve-undo","--deduplicate" and "--eol".

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-23 10:53:55 -07:00
c23fc075c6 merge: do not exit restore_state() prematurely
Previously, if the user:

* Had no local changes before starting the merge
* A merge strategy makes changes to the working tree/index but returns
  with exit status 2

Then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was a null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Also, add a testcase checking this, one for which the octopus strategy
fails on the first commit it attempts to merge, and thus which it
cannot handle at all and must completely bail on (as per the "exit 2"
code path of commit 98efc8f3d8 ("octopus: allow manual resolve on the
last round.", 2006-01-13)).

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
034195ef92 merge: ensure we can actually restore pre-merge state
Merge strategies can:
  * succeed with a clean merge
  * succeed with a conflicted merge
  * fail to handle the given type of merge

If one is thinking in terms of automatic mergeability, they would use
the word "fail" instead of "succeed" for the second bullet, but I am
focusing here on ability of the merge strategy to handle the given
inputs, not on whether the given inputs are mergeable.  The third
category is about the merge strategy failing to know how to handle the
given data; examples include:

  * Passing more than 2 branches to 'recursive' or 'ort'
  * Passing 2 or fewer branches to 'octopus'
  * Trying to do more complicated merges with 'resolve' (I believe
    directory/file conflicts will cause it to bail.)
  * Octopus running into a merge conflict for any branch OTHER than
    the final one (see the "exit 2" codepath of commit 98efc8f3d8
    ("octopus: allow manual resolve on the last round.", 2006-01-13))

That final one is particularly interesting, because it shows that the
merge strategy can muck with the index and working tree, and THEN bail
and say "sorry, this strategy cannot handle this type of merge; use
something else".

Further, we do not currently expect the individual strategies to clean
up after themselves, but instead expect builtin/merge.c to do so.  For
it to be able to, it needs to save the state before trying the merge
strategy so it can have something to restore to.  Therefore, remove the
shortcut bypassing the save_state() call.

There is another bug on the restore_state() side of things, so no
testcase will be added until the next commit when we have addressed that
issue as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
aa77ce88ed merge: make restore_state() restore staged state too
There are multiple issues at play here:

  1) If `git merge` is invoked with staged changes, it should abort
     without doing any merging, and the user's working tree and index
     should be the same as before merge was invoked.
  2) Merge strategies are responsible for enforcing the index == HEAD
     requirement. (See 9822175d2b ("Ensure index matches head before
     invoking merge machinery, round N", 2019-08-17) for some history
     around this.)
  3) Merge strategies can bail saying they are not an appropriate
     handler for the merge in question (possibly allowing other
     strategies to be used instead).
  4) Merge strategies can make changes to the index and working tree,
     and have no expectation to clean up after themselves, *even* if
     they bail out and say they are not an appropriate handler for
     the merge in question.  (The `octopus` merge strategy does this,
     for example.)
  5) Because of (3) and (4), builtin/merge.c stashes state before
     trying merge strategies and restores it afterward.

Unfortunately, if users had staged changes before calling `git merge`,
builtin/merge.c could do the following:

   * stash the changes, in order to clean up after the strategies
   * try all the merge strategies in turn, each of which report they
     cannot function due to the index not matching HEAD
   * restore the changes via "git stash apply"

But that last step would have the net effect of unstaging the user's
changes.  Fix this by adding the "--index" option to "git stash apply".
While at it, also squelch the stash apply output; we already report
"Rewinding the tree to pristine..." and don't need a detailed `git
status` report afterwards.  Also while at it, switch to using strvec
so folks don't have to count the arguments to ensure we avoided an
off-by-one error, and so it's easier to add additional arguments to
the command.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
1369f1475b merge: fix save_state() to work when there are stat-dirty files
When there are stat-dirty files, but no files are modified,
`git stash create` exits with unsuccessful status.  This causes merge
to fail.  Copy some code from sequencer.c's create_autostash to refresh
the index first to avoid this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
8f240b8bbb merge: do not abort early if one strategy fails to handle the merge
builtin/merge is setup to allow multiple strategies to be specified,
and it will find the "best" result and use it.  This is defeated if
some of the merge strategies abort early when they cannot handle the
merge.  Fix the logic that calls recursive and ort to not do such an
early abort, but instead return "2" or "unhandled" so that the next
strategy can try to handle the merge.

Coming up with a testcase for this is somewhat difficult, since
recursive and ort both handle nearly any two-headed merge (there is
a separate code path that checks for non-two-headed merges and
already returns "2" for them).  So use a somewhat synthetic testcase
of having the index not match HEAD before the merge starts, since all
merge strategies will abort for that.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
e4cdfe84a0 merge: abort if index does not match HEAD for trivial merges
As noted in the last commit and the links therein (especially commit
9822175d2b ("Ensure index matches head before invoking merge machinery,
round N", 2019-08-17), we have had a very long history of problems with
failing to enforce the requirement that index matches HEAD when starting
a merge.

The "trivial merge" logic in builtin/merge.c is yet another such case
we previously missed.  Add a check for it to ensure it aborts if the
index does not match HEAD, and add a testcase where this fix is needed.

Note that the fix here would also incidentally be an alternative fix
for the testcase added in the last patch, but the fix in the last patch
is still needed when multiple merge strategies are in use, so tweak the
testcase from the previous commit so that it continues to exercise the
codepath added in the last commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
24ba8b70c9 merge-resolve: abort if index does not match HEAD
As noted in commit 9822175d2b ("Ensure index matches head before
invoking merge machinery, round N", 2019-08-17), we have had a very
long history of problems with failing to enforce the requirement that
index matches HEAD when starting a merge.  One of the commits
referenced in the long tale of issues arising from lax enforcement of
this requirement was commit 55f39cf755 ("merge: fix misleading
pre-merge check documentation", 2018-06-30), which tried to document
the requirement and noted there were some exceptions.  As mentioned in
that commit message, the `resolve` strategy was the one strategy that
did not have an explicit index matching HEAD check, and the reason it
didn't was that I wasn't able to discover any cases where the
implementation would fail to catch the problem and abort, and didn't
want to introduce unnecessary performance overhead of adding another
check.

Well, today I discovered a testcase where the implementation does not
catch the problem and so an explicit check is needed.  Add a testcase
that previously would have failed, and update git-merge-resolve.sh to
have an explicit check.  Note that the code is copied from 3ec62ad9ff
("merge-octopus: abort if index does not match HEAD", 2016-04-09), so
that we reuse the same message and avoid making translators need to
translate some new message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:22 -07:00
11f4290001 merge-ort-wrappers: make printed message match the one from recursive
When the index does not match HEAD, the merge strategies are responsible
to detect that condition and abort.  The merge-ort-wrappers had code to
implement this and meant to copy the error message from merge-recursive
but deviated in two ways, both due to the message in merge-recursive
being processed by another function that made additional changes:
  * It added an implicit "error: " prefix
  * It added an implicit trailing newline
We can get these things by making use of the error() function.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:22 -07:00
db9d67f2e9 builtin/cat-file.c: support NUL-delimited input with -z
When callers are using `cat-file` via one of the stdin-driven `--batch`
modes, all input is newline-delimited. This presents a problem when
callers wish to ask about, e.g. tree-entries that have a newline
character present in their filename.

To support this niche scenario, introduce a new `-z` mode to the
`--batch`, `--batch-check`, and `--batch-command` suite of options that
instructs `cat-file` to treat its input as NUL-delimited, allowing the
individual commands themselves to have newlines present.

The refactoring here is slightly unfortunate, since we turn loops like:

    while (strbuf_getline(&buf, stdin) != EOF)

into:

    while (1) {
        int ret;
        if (opt->nul_terminated)
            ret = strbuf_getline_nul(&input, stdin);
        else
            ret = strbuf_getline(&input, stdin);

        if (ret == EOF)
            break;
    }

It's tempting to think that we could use `strbuf_getwholeline()` and
specify either `\n` or `\0` as the terminating character. But for input
on platforms that include a CR character preceeding the LF, this
wouldn't quite be the same, since `strbuf_getline(...)` will trim any
trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.

In the future, we could clean this up further by introducing a variant
of `strbuf_getwholeline()` that addresses the aforementioned gap, but
that approach felt too heavy-handed for this pair of uses.

Some tests are added in t1006 to ensure that `cat-file` produces the
same output in `--batch`, `--batch-check`, and `--batch-command` modes
with and without the new `-z` option.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:42:06 -07:00
3639fefe7d t1006: extract --batch-command inputs to variables
A future commit will want to ensure that various `--batch`-related
options produce the same output whether their input is newline
terminated, or NUL terminated (and a to-be-implemented `-z` option
exists).

To prepare for this, extract the given input(s) into separate variables
to that their LF characters can easily be converted into NUL bytes when
testing the new `-z` mode.

This is consistent with other tests in t1006 (which these days is no
longer a shining example of our CodingGuidelines).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:42:05 -07:00
6a475b71f8 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 15:07:08 -07:00
eacae022bb Merge branch 'rs/mingw-tighten-mkstemp'
mkstemp() emulation on Windows has been improved.

* rs/mingw-tighten-mkstemp:
  mingw: avoid mktemp() in mkstemp() implementation
2022-07-22 15:04:03 -07:00
a31dbaebb1 Merge branch 'js/ci-github-workflow-markup'
A fix for a regression in test framework.

* js/ci-github-workflow-markup:
  tests: fix incorrect --write-junit-xml code
2022-07-22 15:04:03 -07:00
dd7c820d9e Merge branch 'js/shortlog-sort-stably'
"git shortlog -n" relied on the underlying qsort() to be stable,
which shouldn't have.  Fixed.

* js/shortlog-sort-stably:
  shortlog: use a stable sort
2022-07-22 15:04:02 -07:00
4483ea9a01 Merge branch 'js/vimdiff-quotepath-fix'
Variable quoting fix in the vimdiff driver of "git mergetool"

* js/vimdiff-quotepath-fix:
  mergetool(vimdiff): allow paths to contain spaces again
2022-07-22 15:04:02 -07:00
18bbc795fc Merge branch 'gc/bare-repo-discovery'
Introduce a discovery.barerepository configuration variable that
allows users to forbid discovery of bare repositories.

* gc/bare-repo-discovery:
  setup.c: create `safe.bareRepository`
  safe.directory: use git_protected_config()
  config: learn `git_protected_config()`
  Documentation: define protected configuration
  Documentation/git-config.txt: add SCOPES section
2022-07-22 15:04:02 -07:00
6df543bdfc Merge branch 'mt/checkout-count-fix' into mt/rot13-in-c
* mt/checkout-count-fix:
  checkout: fix two bugs on the final count of updated entries
  checkout: show bug about failed entries being included in final report
  checkout: document bug where delayed checkout counts entries twice
2022-07-22 14:07:39 -07:00
198551ca54 git-p4: fix error handling in P4Unshelve.renameBranch()
The error handling code path is meant to be triggered when the loop does
not exit early via "break". This fails, as the boolean variable "found",
which is used to track whether the loop was exited early, is initialized
incorrectly.

It would be possible to fix this issue by correcting the initialization,
but Python supports a for:-else: control flow construct for this exact
use case (executing code if a loop does not exit early), so it is more
idiomatic to remove the tracking variable entirely.

In addition, the error message no longer refers to a variable that does
not exist.

Signed-off-by: Moritz Baumann <moritz.baumann@sap.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-20 13:53:54 -07:00
c0d2b07460 git-p4: fix typo in P4Submit.applyCommit()
Signed-off-by: Moritz Baumann <moritz.baumann@sap.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-20 13:53:52 -07:00
e72d93e88c The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 16:40:19 -07:00
4b8cdff8ba Merge branch 'll/curl-accept-language'
Earlier, HTTP transport clients learned to tell the server side
what locale they are in by sending Accept-Language HTTP header, but
this was done only for some requests but not others.

* ll/curl-accept-language:
  remote-curl: send Accept-Language header to server
2022-07-19 16:40:19 -07:00
7c683389d6 Merge branch 'jk/diff-files-cleanup-fix'
An earlier attempt to plug leaks placed a clean-up label to jump to
at a bogus place, which as been corrected.

* jk/diff-files-cleanup-fix:
  diff-files: move misplaced cleanup label
2022-07-19 16:40:18 -07:00
2fb377e569 Merge branch 'rs/cocci-array-copy'
A coccinelle rule (in contrib/) to encourage use of COPY_ARRAY
macro has been improved.

* rs/cocci-array-copy:
  cocci: avoid normalization rules for memcpy
2022-07-19 16:40:18 -07:00
40ab711a9c Merge branch 'jk/ref-filter-discard-commit-buffer'
* jk/ref-filter-discard-commit-buffer:
  ref-filter: disable save_commit_buffer while traversing
2022-07-19 16:40:17 -07:00
cf92cb29e9 Merge branch 'jk/clone-unborn-confusion'
"git clone" from a repository with some ref whose HEAD is unborn
did not set the HEAD in the resulting repository correctly, which
has been corrected.

* jk/clone-unborn-confusion:
  clone: move unborn head creation to update_head()
  clone: use remote branch if it matches default HEAD
  clone: propagate empty remote HEAD even with other branches
  clone: drop extra newline from warning message
2022-07-19 16:40:17 -07:00
99c0d94eaa Merge branch 'hx/lookup-commit-in-graph-fix'
A corner case bug where lazily fetching objects from a promisor
remote resulted in infinite recursion has been corrected.

* hx/lookup-commit-in-graph-fix:
  t5330: remove run_with_limited_processses()
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()
2022-07-19 16:40:16 -07:00
418aef9055 Merge branch 'jc/resolve-undo'
The resolve-undo information in the index was not protected against
GC, which has been corrected.

* jc/resolve-undo:
  fsck: do not dereference NULL while checking resolve-undo data
  revision: mark blobs needed for resolve-undo as reachable
2022-07-19 16:40:16 -07:00
4611884ea8 sequencer: notify user of --update-refs activity
When the user runs 'git rebase -i --update-refs', the end message still
says only

  Successfully rebased and updated <HEAD-ref>.

Update the sequencer to collect the successful (and unsuccessful) ref
updates due to the --update-refs option, so the end message now says

  Successfully rebased and updated <HEAD-ref>.
  Updated the following refs with --update-refs:
	refs/heads/first
	refs/heads/third
  Failed to update the following refs with --update-refs:
	refs/heads/second

To test this output, we need to be very careful to format the expected
error to drop the leading tab characters. Also, we need to be aware that
the verbose output from 'git rebase' is writing progress lines which
don't use traditional newlines but clear the line after every progress
item is complete. When opening the error file in an editor, these lines
are visible, but when looking at the diff in a terminal those lines
disappear because of the characters that delete the previous characters.
Use 'sed' to clear those progress lines and clear the tabs so we can get
an exact match on our expected output.

Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
aa37f3e1d8 sequencer: ignore HEAD ref under --update-refs
When using the 'git rebase -i --update-refs' option, the todo list is
populated with 'update-ref' commands for all tip refs in the history
that is being rebased. Refs that are checked out by some worktree are
instead added as a comment to warn the user that they will not be
updated.

Until now, this included the HEAD ref, which is being updated by the
rebase process itself, regardless of the --update-refs option. Remove
the comment in this case by ignoring any decorations that match the HEAD
ref.

Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
3113fedaeb rebase: add rebase.updateRefs config option
The previous change added the --update-refs command-line option.  For
users who always want this mode, create the rebase.updateRefs config
option which behaves the same way as rebase.autoSquash does with the
--autosquash option.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
b3b1a21d1a sequencer: rewrite update-refs as user edits todo list
An interactive rebase provides opportunities for the user to edit the
todo list. The --update-refs option initializes the list with some
'update-ref <ref>' steps, but the user could add these manually.
Further, the user could add or remove these steps during pauses in the
interactive rebase.

Add a new method, todo_list_filter_update_refs(), that scans a todo_list
and compares it to the stored update-refs file. There are two actions
that can happen at this point:

1. If a '<ref>/<before>/<after>' triple in the update-refs file does not
   have a matching 'update-ref <ref>' command in the todo-list _and_ the
   <after> value is the null OID, then remove that triple. Here, the
   user removed the 'update-ref <ref>' command before it was executed,
   since if it was executed then the <after> value would store the
   commit at that position.

2. If a 'update-ref <ref>' command in the todo-list does not have a
   matching '<ref>/<before>/<after>' triple in the update-refs file,
   then insert a new one. Store the <before> value to be the current
   OID pointed at by <ref>. This is handled inside of the
   init_update_ref_record() helper method.

We can test that this works by rewriting the todo-list several times in
the course of a rebase. Check that each ref is locked or unlocked for
updates after each todo-list update. We can also verify that the ref
update fails if a concurrent process updates one of the refs after the
rebase process records the "locked" ref location.

To help these tests, add a new 'set_replace_editor' helper that will
replace the todo-list with an exact file.

Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
89fc0b53fd rebase: update refs from 'update-ref' commands
The previous change introduced the 'git rebase --update-refs' option
which added 'update-ref <ref>' commands to the todo list of an
interactive rebase.

Teach Git to record the HEAD position when reaching these 'update-ref'
commands. The ref/before/after triple is stored in the
$GIT_DIR/rebase-merge/update-refs file. A previous change parsed this
file to avoid having other processes updating the refs in that file
while the rebase is in progress.

Not only do we update the file when the sequencer reaches these
'update-ref' commands, we then update the refs themselves at the end of
the rebase sequence. If the rebase is aborted before this final step,
then the refs are not updated. The 'before' value is used to ensure that
we do not accidentally obliterate a ref that was updated concurrently
(say, by an older version of Git or a third-party tool).

Now that the 'git rebase --update-refs' command is implemented to write
to the update-refs file, we can remove the fake construction of the
update-refs file from a test in t2407-worktree-heads.sh.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
900b50c242 rebase: add --update-refs option
When working on a large feature, it can be helpful to break that feature
into multiple smaller parts that become reviewed in sequence. During
development or during review, a change to one part of the feature could
affect multiple of these parts. An interactive rebase can help adjust
the multi-part "story" of the branch.

However, if there are branches tracking the different parts of the
feature, then rebasing the entire list of commits can create commits not
reachable from those "sub branches". It can take a manual step to update
those branches.

Add a new --update-refs option to 'git rebase -i' that adds 'update-ref
<ref>' steps to the todo file whenever a commit that is being rebased is
decorated with that <ref>. At the very end, the rebase process updates
all of the listed refs to the values stored during the rebase operation.

Be sure to iterate after any squashing or fixups are placed. Update the
branch only after those squashes and fixups are complete. This allows a
--fixup commit at the tip of the feature to apply correctly to the sub
branch, even if it is fixing up the most-recent commit in that part.

This change update the documentation and builtin to accept the
--update-refs option as well as updating the todo file with the
'update-ref' commands. Tests are added to ensure that these todo
commands are added in the correct locations.

This change does _not_ include the actual behavior of tracking the
updated refs and writing the new ref values at the end of the rebase
process. That is deferred to a later change.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
a97d79163e sequencer: add update-ref command
Add the boilerplate for an "update-ref" command in the sequencer. This
connects to the current no-op do_update_ref() which will be filled in
after more connections are created.

The syntax in the todo list will be "update-ref <ref-name>" to signal
that we should store the current commit as the value for updating
<ref-name> at the end of the rebase.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:03 -07:00
d7ce9a2201 sequencer: define array with enum values
The todo_command_info array defines which strings match with which
todo_command enum values. The array is defined in the same order as the
enum values, but if one changed without the other, then we would have
unexpected results.

Make it easier to see changes to the enum and this array by using the
enum values as the indices of the array.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:03 -07:00
f57fd48d56 rebase-interactive: update 'merge' description
The 'merge' command description for the todo list documentation in an
interactive rebase has multiple lines. The lines other than the first
one start with dots ('.') while the similar multi-line documentation for
'fixup' does not. This description only appears in the comment text of
the todo file during an interactive rebase.

The 'merge' command was documented when interactive rebase was first
ported to C in 145e05ac44 (rebase -i: rewrite append_todo_help() in C,
2018-08-10). These dots might have been carried over from the previous
shell implementation.

The 'fixup' command was documented more recently in 9e3cebd97c (rebase
-i: add fixup [-C | -c] command, 2021-01-29).

Looking at the output in an editor, my personal opinion is that the dots
are unnecessary and noisy. Remove them now before adding more commands
with multi-line documentation.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:03 -07:00
aa7f2fd150 branch: consider refs under 'update-refs'
The branch_checked_out() helper helps commands like 'git branch' and
'git fetch' from overwriting refs that are currently checked out in
other worktrees.

A future update to 'git rebase' will introduce a new '--update-refs'
option which will update the local refs that point to commits that are
being rebased. To avoid collisions as the rebase completes, we want to
make the future data store for these refs to be considered by
branch_checked_out().

The data store is a plaintext file inside the 'rebase-merge' directory
for that worktree. The file lists refnames followed by two OIDs, each on
separate lines. The OIDs will be used to store the original values of
the refs and the to-be-written values as the rebase progresses, but can
be ignored at the moment.

Create a new sequencer_get_update_refs_state() method that parses this
file and populates a struct string_list with the ref-OID pairs. We can
then use this list to add to the current_checked_out_branches strmap
used by branch_checked_out().

To properly navigate to the rebase directory for a given worktree,
extract the static strbuf_worktree_gitdir() method to a public API
method.

We can test that this works without having Git write this file by
artificially creating one in our test script, at least until 'git rebase
--update-refs' is implemented and we can use it directly.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:03 -07:00
18ea595827 t2407: test branches currently using apply backend
The tests in t2407 that verify the branch_checked_out() helper in the
case of bisects and rebases were added by 9347303db89 (branch: check for
bisects and rebases, 2022-06-08). However, that commit failed to check
for rebases that are using the 'apply' backend.

Add a test that checks the apply backend. The implementation was already
correct here, but it is good to have regression tests before modifying
the implementation further.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:03 -07:00
1bec4d1dfd t2407: test bisect and rebase as black-boxes
The tests added by d2ba271aad (branch: check for bisects and rebases,
2022-06-14) modified hidden state to verify the branch_checked_out()
helper. While this indeed checks that the method implementation is _as
designed_, it doesn't show that it is _correct_. Specifically, if 'git
bisect' or 'git rebase' change their back-end for preserving refs, then
these tests do not demonstrate that drift as a bug in
branch_checked_out().

Modify the tests in t2407 to actually rely on a paused bisect or rebase.
This requires adding the !SANITIZE_LEAK prereq for tests using those
builtins. The logic is still tested for leaks in the final test which
does set up that back-end directly for an error state that should not be
possible using Git commands.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:03 -07:00
068fa54c00 midx: reduce memory pressure while writing bitmaps
We noticed that some 'git multi-pack-index write --bitmap' processes
were running with very high memory. It turns out that a lot of this
memory is required to store a list of every object in the written
multi-pack-index, with a second copy that has additional information
used for the bitmap writing logic.

Using 'valgrind --tool=massif' before this change, the following chart
shows how memory load increased and was maintained throughout the
process:

    GB
4.102^                                                             ::
     |              @  @::@@::@@::::::::@::::::@@:#:::::::::::::@@:: :
     |         :::::@@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |      :::: :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    :::: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
   0 +--------------------------------------------------------------->

It turns out that the 'struct write_midx_context' data is persisting
through the life of the process, including the 'entries' array. This
array is used last inside find_commits_for_midx_bitmap() within
write_midx_bitmap(). If we free (and nullify) the array at that point,
we can free a decent chunk of memory before the bitmap logic adds more
to the memory footprint.

Here is the massif memory load chart after this change:

    GB
3.111^      #
     |      #                              :::::::::::@::::::::::::::@
     |      #        ::::::::::::::::::::::::: : :: : @:: ::::: :: ::@
     |     @#  :::::::::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :::@#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
   0 +--------------------------------------------------------------->

The previous change introduced a refactoring of write_midx_bitmap() to
make it more clear how much of the 'struct write_midx_context' instance
is needed at different parts of the process. In addition, the following
defensive programming measures were put in place:

 1. Using FREE_AND_NULL() we will at least get a segfault from reading a
    NULL pointer instead of a use-after-free.

 2. 'entries_nr' is also set to zero to make any loop that would iterate
    over the entries be trivial.

 3. Add significant comments in write_midx_internal() to add warnings
    for future authors who might accidentally add references to this
    cleared memory.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 08:38:17 -07:00
90b2bb710d midx: extract bitmap write setup
The write_midx_bitmap() method is a long method that does a lot of
steps. It requires the write_midx_context struct for use in
prepare_midx_packing_data() and find_commits_for_midx_bitmap(), but
after that only needs the pack_order array.

This is a messy, but completely non-functional refactoring. The code is
only being moved around to reduce visibility of the write_midx_context
during the longest part of computing reachability bitmaps.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 08:38:17 -07:00
5766524956 pack-bitmap-write: use const for hashes
The next change will use a const array when calling this method. There
is no need for the non-const version, so let's do this cleanup quickly.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 08:38:17 -07:00
71a8fab31b The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 13:31:58 -07:00
afbe62d84c Merge branch 'sg/multi-pack-index-parse-options-fix'
The way "git multi-pack" uses parse-options API has been improved.

* sg/multi-pack-index-parse-options-fix:
  multi-pack-index: simplify handling of unknown --options
2022-07-18 13:31:58 -07:00
4af2138417 Merge branch 'bc/nettle-sha256'
Support for libnettle as SHA256 implementation has been added.

* bc/nettle-sha256:
  sha256: add support for Nettle
2022-07-18 13:31:58 -07:00
ba69ae876b Merge branch 'jd/gpg-interface-trust-level-string'
The code to convert between GPG trust level strings and internal
constants we use to represent them have been cleaned up.

* jd/gpg-interface-trust-level-string:
  gpg-interface: add function for converting trust level to string
2022-07-18 13:31:57 -07:00
7f8d098b1b Merge branch 'ab/cocci-unused'
Add Coccinelle rules to detect the pattern of initializing and then
finalizing a structure without using it in between at all, which
happens after code restructuring and the compilers fail to
recognize as an unused variable.

* ab/cocci-unused:
  cocci: generalize "unused" rule to cover more than "strbuf"
  cocci: add and apply a rule to find "unused" strbufs
  cocci: have "coccicheck{,-pending}" depend on "coccicheck-test"
  cocci: add a "coccicheck-test" target and test *.cocci rules
  Makefile & .gitignore: ignore & clean "git.res", not "*.res"
  Makefile: remove mandatory "spatch" arguments from SPATCH_FLAGS
2022-07-18 13:31:57 -07:00
6d003858e5 Merge branch 'gc/submodule-use-super-prefix'
Another step to rewrite more parts of "git submodule" in C.

* gc/submodule-use-super-prefix:
  submodule--helper: remove display path helper
  submodule--helper update: use --super-prefix
  submodule--helper: remove unused SUPPORT_SUPER_PREFIX flags
  submodule--helper: use correct display path helper
  submodule--helper: don't recreate recursive prefix
  submodule--helper update: use display path helper
  submodule--helper tests: add missing "display path" coverage
2022-07-18 13:31:56 -07:00
e3349f2888 Merge branch 'en/merge-dual-dir-renames-fix'
Fixes a long-standing corner case bug around directory renames in
the merge-ort strategy.

* en/merge-dual-dir-renames-fix:
  merge-ort: fix issue with dual rename and add/add conflict
  merge-ort: shuffle the computation and cleanup of potential collisions
  merge-ort: make a separate function for freeing struct collisions
  merge-ort: small cleanups of check_for_directory_rename
  t6423: add tests of dual directory rename plus add/add conflict
2022-07-18 13:31:56 -07:00
3d3874d537 Merge branch 'ab/test-without-templates'
Tweak tests so that they still work when the "git init" template
did not create .git/info directory.

* ab/test-without-templates:
  tests: don't assume a .git/info for .git/info/sparse-checkout
  tests: don't assume a .git/info for .git/info/exclude
  tests: don't assume a .git/info for .git/info/refs
  tests: don't assume a .git/info for .git/info/attributes
  tests: don't assume a .git/info for .git/info/grafts
  tests: don't depend on template-created .git/branches
  t0008: don't rely on default ".git/info/exclude"
2022-07-18 13:31:55 -07:00
48e88a4862 Merge branch 'ab/build-gitweb'
Teach "make all" to build gitweb as well.

* ab/build-gitweb:
  gitweb/Makefile: add a "NO_GITWEB" parameter
  Makefile: build 'gitweb' in the default target
  gitweb/Makefile: include in top-level Makefile
  gitweb: remove "test" and "test-installed" targets
  gitweb/Makefile: prepare to merge into top-level Makefile
  gitweb/Makefile: clear up and de-duplicate the gitweb.{css,js} vars
  gitweb/Makefile: add a $(GITWEB_ALL) variable
  gitweb/Makefile: define all .PHONY prerequisites inline
2022-07-18 13:31:55 -07:00
f63ac61fbf Merge branch 'ab/test-tool-leakfix'
Plug various memory leaks in test-tool commands.

* ab/test-tool-leakfix:
  test-tool delta: fix a memory leak
  test-tool ref-store: fix a memory leak
  test-tool bloom: fix memory leaks
  test-tool json-writer: fix memory leaks
  test-tool regex: call regfree(), fix memory leaks
  test-tool urlmatch-normalization: fix a memory leak
  test-tool {dump,scrap}-cache-tree: fix memory leaks
  test-tool path-utils: fix a memory leak
  test-tool test-hash: fix a memory leak
2022-07-18 13:31:54 -07:00
44357f64f6 Merge branch 'ab/leakfix'
Plug various memory leaks.

* ab/leakfix:
  pull: fix a "struct oid_array" memory leak
  cat-file: fix a common "struct object_context" memory leak
  gc: fix a memory leak
  checkout: avoid "struct unpack_trees_options" leak
  merge-file: fix memory leaks on error path
  merge-file: refactor for subsequent memory leak fix
  cat-file: fix a memory leak in --batch-command mode
  revert: free "struct replay_opts" members
  submodule.c: free() memory from xgetcwd()
  clone: fix memory leak in wanted_peer_refs()
  check-ref-format: fix trivial memory leak
2022-07-18 13:31:54 -07:00
f01315ef7d Merge branch 'jc/builtin-mv-move-array'
Apply Coccinelle rule to turn raw memmove() into MOVE_ARRAY() cpp
macro, which would improve maintainability and readability.

* jc/builtin-mv-move-array:
  builtin/mv.c: use the MOVE_ARRAY() macro instead of memmove()
2022-07-18 13:31:53 -07:00
2c1439231a Merge branch 'fr/vimdiff-layout-fix'
Recent update to vimdiff layout code has been made more robust
against different end-user vim settings.

* fr/vimdiff-layout-fix:
  vimdiff: make layout engine more robust against user vim settings
2022-07-18 13:31:53 -07:00
ec031da9f9 cat-file: add mailmap support
git-cat-file is used by tools like GitLab to get commit tag contents
that are then displayed to users. This content which has author,
committer or tagger information, could benefit from passing through the
mailmap mechanism before being sent or displayed.

This patch adds --[no-]use-mailmap command line option to the git
cat-file command. It also adds --[no-]mailmap option as an alias to
--[no-]use-mailmap.

This patch also introduces new test cases to test the mailmap mechanism in
git cat-file command.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 12:55:53 -07:00
66a8a95315 ident: rename commit_rewrite_person() to apply_mailmap_to_header()
commit_rewrite_person() takes a commit buffer and replaces the idents
in the header with their canonical versions using the mailmap mechanism.
The name "commit_rewrite_person()" is misleading as it doesn't convey
what kind of rewrite are we going to do to the buffer. It also doesn't
clearly mention that the function will limit itself to the header part
of the buffer. The new name, "apply_mailmap_to_header()", expresses the
functionality of the function pretty clearly.

We intend to use apply_mailmap_to_header() in git-cat-file to replace
idents in the headers of commit and tag object buffers. So, we will be
extending this function to take tag objects buffer as well and replace
idents on the tagger header using the mailmap mechanism.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 12:55:53 -07:00
dc88e349a2 ident: move commit_rewrite_person() to ident.c
commit_rewrite_person() and rewrite_ident_line() are static functions
defined in revision.c.

Their usages are as follows:
- commit_rewrite_person() takes a commit buffer and replaces the author
  and committer idents with their canonical versions using the mailmap
  mechanism
- rewrite_ident_line() takes author/committer header lines from the
  commit buffer and replaces the idents with their canonical versions
  using the mailmap mechanism.

This patch moves commit_rewrite_person() and rewrite_ident_line() to
ident.c which contains many other functions related to idents like
split_ident_line(). By moving commit_rewrite_person() to ident.c, we
also intend to use it in git-cat-file to replace committer and author
idents from the headers to their canonical versions using the mailmap
mechanism. The function is moved as is for now to make it clear that
there are no other changes, but it will be renamed in a following
commit.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 12:55:53 -07:00
e9c1b0e38c revision: improve commit_rewrite_person()
The function, commit_rewrite_person(), is designed to find and replace
an ident string in the header part, and the way it avoids a random
occurrence of "author A U Thor <author@example.com" in the text is by
insisting "author" to appear at the beginning of line by passing
"\nauthor " as "what".

The implementation also doesn't make any effort to limit itself to the
commit header by locating the blank line that appears after the header
part and stopping the search there. Also, the interface forces the
caller to make multiple calls if it wants to rewrite idents on multiple
headers. It shouldn't be the case.

To support the existing caller better, update commit_rewrite_person()
to:
- Make a single pass in the input buffer to locate headers named
  "author" and "committer" and replace idents on them.
- Stop at the end of the header, ensuring that nothing in the body of
  the commit object is modified.

The return type of the function commit_rewrite_person() has also been
changed from int to void. This has been done because the caller of the
function doesn't do anything with the return value of the function.

By simplifying the interface of the commit_rewrite_person(), we also
intend to expose it as a public function. We will also be renaming the
function in a future commit to a different name which clearly tells that
the function replaces idents in the header of the commit buffer.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 12:55:53 -07:00
5dcee7c705 pack-bitmap.c: continue looping when first MIDX bitmap is found
In "open_midx_bitmap()", we do a loop with the MIDX(es) in repo, when
the first one has been found, then will break out by a "return"
directly.

But actually, it's better to continue the loop until we have visited
both the MIDX in our repository, as well as any alternates (along with
_their_ alternates, recursively).

The reason for this is, there may exist more than one MIDX file in
a repo. The "multi_pack_index" struct is actually designed as a singly
linked list, and if a MIDX file has been already opened successfully,
then the other MIDX files will be skipped and left with a warning
"ignoring extra bitmap file." to the output.

The discussion link of community:

  https://public-inbox.org/git/YjzCTLLDCby+kJrZ@nand.local/

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:20:52 -07:00
9005eb021a pack-bitmap.c: using error() instead of silently returning -1
In "open_pack_bitmap_1()" and "open_midx_bitmap_1()", it's better to
return error() instead of "-1" when some unexpected error occurs like
"stat bitmap file failed", "bitmap header is invalid" or "checksum
mismatch", etc.

There are places where we do not replace, such as when the bitmap
does not exist (no bitmap in repository is allowed) or when another
bitmap has already been opened (in which case it should be a warning
rather than an error).

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:20:52 -07:00
6411cc08f3 pack-bitmap.c: do not ignore error when opening a bitmap file
Calls to git_open() to open the pack bitmap file and
multi-pack bitmap file do not report any error when they
fail.  These files are optional and it is not an error if
open failed due to ENOENT, but we shouldn't be ignoring
other kinds of errors.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:20:52 -07:00
349c26ff29 pack-bitmap.c: rename "idx_name" to "bitmap_name"
In "open_pack_bitmap_1()" and "open_midx_bitmap_1()" we use
a var named "idx_name" to represent the bitmap filename which
is computed by "midx_bitmap_filename()" or "pack_bitmap_filename()"
before we open it.

There may bring some confusion in this "idx_name" naming, which
might lead us to think of ".idx "or" multi-pack-index" files,
although bitmap is essentially can be understood as a kind of index,
let's define this name a little more accurate here.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:20:52 -07:00
9975975d7f pack-bitmap.c: mark more strings for translations
In pack-bitmap.c, some printed texts are translated, some are not.
Let's support the translations of the bitmap related output.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:20:52 -07:00
baf20c39a7 pack-bitmap.c: fix formatting of error messages
There are some text output issues in 'pack-bitmap.c', they exist in
die(), error() etc. This includes issues with capitalization the
first letter, newlines, error() instead of BUG(), and substitution
that don't have quotes around them.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:20:52 -07:00
72d3a5da32 scalar: convert README.md into a technical design doc
Adapt the content from 'contrib/scalar/README.md' into a design document in
'Documentation/technical/'. In addition to reformatting for asciidoc,
elaborate on the background, purpose, and design choices that went into
Scalar.

Most of this document will persist in the 'Documentation/technical/' after
Scalar has been moved out of 'contrib/' and into the root of Git. Until that
time, it will also contain a temporary "Roadmap" section detailing the
remaining series needed to finish the initial version of Scalar. The section
will be removed once Scalar is moved to the repo root, but in the meantime
serves as a guide for readers to keep up with progress on the feature.

Signed-off-by: Victoria Dye <vdye@github.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:03:56 -07:00
f22c95db53 scalar: reword command documentation to clarify purpose
Rephrase documentation to describe scalar as a "large repo management tool"
rather than an "opinionated management tool". The new description is
intended to more directly reflect the utility of scalar to better guide
users in preparation for scalar being built and installed as part of Git.

Signed-off-by: Victoria Dye <vdye@github.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 11:03:56 -07:00
3a251bac0d trace2: only include "fsync" events if we git_fsync()
Fix the overly verbose trace2 logging added in 9a4987677d (trace2:
add stats for fsync operations, 2022-03-30) (first released with
v2.36.0).

Since that change every single "git" command invocation has included
these "data" events, even though we'll only make use of these with
core.fsyncMethod=batch, and even then only have non-zero values if
we're writing object data to disk. See c0f4752ed2 (core.fsyncmethod:
batched disk flushes for loose-objects, 2022-04-04) for that feature.

As we're needing to indent the trace2_data_intmax() lines let's
introduce helper variables to ensure that our resulting lines (which
were already too) don't exceed the recommendations of the
CodingGuidelines. Doing that requires either wrapping them twice, or
introducing short throwaway variable names, let's do the latter.

The result was that e.g. "git version" would previously emit a total
of 6 trace2 events with the GIT_TRACE2_EVENT target (version, start,
cmd_ancestry, cmd_name, exit, atexit), but afterwards would emit
8. We'd emit 2 "data" events before the "exit" event.

The reason we didn't catch this was that the trace2 unit tests added
in a15860dca3 (trace2: t/helper/test-trace2, t0210.sh, t0211.sh,
t0212.sh, 2019-02-22) would omit any "data" events that weren't the
ones it cared about. Before this change to the C code 6/7 of our
"t/t0212-trace2-event.sh" tests would fail if this change was applied
to "t/t0212/parse_events.perl".

Let's make the trace2 testing more strict, and further append any new
events types we don't know about in "t/t0212/parse_events.perl". Since
we only invoke the "test-tool trace2" there's no guarantee that we'll
catch other overly verbose events in the future, but we'll at least
notice if we start emitting new events that are issues every time we
log anything with trace2's JSON target.

We exclude the "data_json" event type, we'd otherwise would fail on
both "win test" and "win+VS test" CI due to the logging added in
353d3d77f4 (trace2: collect Windows-specific process information,
2019-02-22). It looks like that logging should really be using
trace2_cmd_ancestry() instead, which was introduced later in
2f732bf15e (tr2: log parent process name, 2021-07-21), but let's
leave it for now.

The fix-up to aaf81223f4 (unpack-objects: use stream_loose_object()
to unpack large objects, 2022-06-11) is needed because we're changing
the behavior of these events as discussed above. Since we'd always
emit a "hardware-flush" event the test added in aaf81223f4 wasn't
testing anything except that this trace2 data was unconditionally
logged. Even if "core.fsyncMethod" wasn't set to "batch" we'd pass the
test.

Now we'll check the expected number of "writeout" v.s. "flush" calls
under "core.fsyncMethod=batch", but note that this doesn't actually
test if we carried out the sync using that method, on a platform where
we'd have to fall back to fsync() each of those "writeout" would
really be a "flush" (i.e. a full fsync()).

But in this case what we're testing is that the logic in
"unpack-objects" behaves as expected, not the OS-specific question of
whether we actually were able to use the "bulk" method.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 09:41:57 -07:00
0f1eb7d6e9 mergesort: remove llist_mergesort()
Now that all of its callers are gone, remove llist_mergesort().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:39 -07:00
9b9f5f6217 packfile: use DEFINE_LIST_SORT
Build a typed sort function for packed_git lists using DEFINE_LIST_SORT
instead of calling llist_mergesort().  This gets rid of the next pointer
accessor functions and their calling overhead at the cost of slightly
increased object text size.

Before:
__TEXT	__DATA	__OBJC	others	dec	hex
20218	320	0	110936	131474	20192	packfile.o

With this patch:
__TEXT	__DATA	__OBJC	others	dec	hex
20430	320	0	112619	133369	208f9	packfile.o

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:39 -07:00
6fc9fec07b fetch-pack: use DEFINE_LIST_SORT
Build a static typed ref sorting function using DEFINE_LIST_SORT along
with a typed comparison function near its only two callers instead of
having an exported version that calls llist_mergesort().  This gets rid
of the next pointer accessor functions and their calling overhead at the
cost of a slightly increased object text size.

Before:
__TEXT	__DATA	__OBJC	others	dec	hex
23231	389	0	113689	137309	2185d	fetch-pack.o
29158	80	0	146864	176102	2afe6	remote.o

With this patch:
__TEXT	__DATA	__OBJC	others	dec	hex
23591	389	0	117759	141739	229ab	fetch-pack.o
29070	80	0	145718	174868	2ab14	remote.o

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:39 -07:00
c0fb5774a6 commit: use DEFINE_LIST_SORT
Use DEFINE_LIST_SORT to build a typed sort function for commit_list
entries instead of calling llist_mergesort().  This gets rid of the next
pointer accessor functions and their calling overhead at the cost of a
slightly increased object text size.

Before:
__TEXT	__DATA	__OBJC	others	dec	hex
18795	92	0	104654	123541	1e295	commit.o

With this patch:
__TEXT	__DATA	__OBJC	others	dec	hex
18963	92	0	106094	125149	1e8dd	commit.o

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:39 -07:00
47c30f7daa blame: use DEFINE_LIST_SORT
Build a typed sort function for blame entries using DEFINE_LIST_SORT
instead of calling llist_mergesort().  This gets rid of the next pointer
accessor functions and their calling overhead at the cost of a slightly
increased object text size.

Before:
__TEXT	__DATA	__OBJC	others	dec	hex
24621	56	0	147515	172192	2a0a0	blame.o

With this patch:
__TEXT	__DATA	__OBJC	others	dec	hex
25229	56	0	151702	176987	2b35b	blame.o

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:38 -07:00
b378c2ff1e test-mergesort: use DEFINE_LIST_SORT
Build a typed sort function for the mergesort performance test tool
using DEFINE_LIST_SORT instead of calling llist_mergesort().  This gets
rid of the next pointer accessor functions and improves the performance
at the cost of a slightly higher object text size.

Before:
0071.12: llist_mergesort() unsorted    0.24(0.22+0.01)
0071.14: llist_mergesort() sorted      0.12(0.10+0.01)
0071.16: llist_mergesort() reversed    0.12(0.10+0.01)

__TEXT	__DATA	__OBJC	others	dec	hex
6407	276	0	24701	31384	7a98	t/helper/test-mergesort.o

With this patch:
0071.12: DEFINE_LIST_SORT unsorted     0.22(0.21+0.01)
0071.14: DEFINE_LIST_SORT sorted       0.11(0.10+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.11(0.10+0.01)

__TEXT	__DATA	__OBJC	others	dec	hex
6615	276	0	25832	32723	7fd3	t/helper/test-mergesort.o

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:38 -07:00
f00a039839 test-mergesort: use DEFINE_LIST_SORT_DEBUG
Define a typed sort function using DEFINE_LIST_SORT_DEBUG for the
mergesort sanity check instead of using llist_mergesort().  This gets
rid of the next pointer accessor functions and improves the performance
at the cost of slightly bigger object text.

Before:
Benchmark 1: t/helper/test-tool mergesort test
  Time (mean ± σ):     108.4 ms ±   0.2 ms    [User: 106.7 ms, System: 1.2 ms]
  Range (min … max):   108.0 ms … 108.8 ms    27 runs

__TEXT	__DATA	__OBJC	others	dec	hex
6251	276	0	23172	29699	7403	t/helper/test-mergesort.o

With this patch:
Benchmark 1: t/helper/test-tool mergesort test
  Time (mean ± σ):      94.0 ms ±   0.2 ms    [User: 92.4 ms, System: 1.1 ms]
  Range (min … max):    93.7 ms …  94.5 ms    31 runs

__TEXT	__DATA	__OBJC	others	dec	hex
6407	276	0	24701	31384	7a98	t/helper/test-mergesort.o

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:38 -07:00
318051eaeb mergesort: add macros for typed sort of linked lists
Add the macros DECLARE_LIST_SORT and DEFINE_LIST_SORT for building
type-specific functions for sorting linked lists.  The generated
function expects a typed comparison function.

The programmer provides full type information (no void pointers).  This
allows the compiler to check whether the comparison function matches the
list type.  It can also inline the "next" pointer accessor functions and
even the comparison function to get rid of the calling overhead.

Also provide a DECLARE_LIST_SORT_DEBUG macro that allows executing
custom code whenever the accessor functions are used.  It's intended to
be used by test-mergesort, which counts these operations.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:38 -07:00
848afebe56 mergesort: tighten merge loop
llist_merge() has special inner loops for taking elements from either of
the two lists to merge.  That helps consistently preferring one over the
other, for stability.  Merge the loops, swap the lists when the other
one has the next element for the result and keep track on which one to
prefer on equality.  This results in shorter code and object text:

Before:
__TEXT	__DATA	__OBJC	others	dec	hex
412	0	0	3441	3853	f0d	mergesort.o

With this patch:
__TEXT	__DATA	__OBJC	others	dec	hex
352	0	0	3516	3868	f1c	mergesort.o

Performance doesn't get worse:

Before:
0071.12: llist_mergesort() unsorted    0.24(0.22+0.01)
0071.14: llist_mergesort() sorted      0.12(0.10+0.01)
0071.16: llist_mergesort() reversed    0.12(0.10+0.01)

Benchmark 1: t/helper/test-tool mergesort test
  Time (mean ± σ):     109.2 ms ±   0.2 ms    [User: 107.5 ms, System: 1.1 ms]
  Range (min … max):   108.9 ms … 109.6 ms    27 runs

With this patch:
0071.12: llist_mergesort() unsorted    0.24(0.22+0.01)
0071.14: llist_mergesort() sorted      0.12(0.10+0.01)
0071.16: llist_mergesort() reversed    0.12(0.10+0.01)

Benchmark 1: t/helper/test-tool mergesort test
  Time (mean ± σ):     108.4 ms ±   0.2 ms    [User: 106.7 ms, System: 1.2 ms]
  Range (min … max):   108.0 ms … 108.8 ms    27 runs

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:38 -07:00
7a3775eeb4 mergesort: unify ranks loops
llist_mergesort() has a loop for adding a new element to the ranks array
and another one for rolling up said array into a single sorted list at
the end.  We can merge them, so that adding the last element rolls up
the whole array.  Handle the empty list before the main loop now because
list can't be NULL anymore inside the loop.

The result is shorter code and significantly less object text:

main:
__TEXT	__DATA	__OBJC	others	dec	hex
652	0	0	4651	5303	14b7	mergesort.o

With this patch:
__TEXT	__DATA	__OBJC	others	dec	hex
412	0	0	3441	3853	f0d	mergesort.o

Why is the change so big?  The reduction is amplified by llist_merge()
being inlined both before and after.

Performance stays basically the same:

main:
0071.12: llist_mergesort() unsorted    0.24(0.22+0.01)
0071.14: llist_mergesort() sorted      0.12(0.10+0.01)
0071.16: llist_mergesort() reversed    0.12(0.10+0.01)

Benchmark 1: t/helper/test-tool mergesort test
  Time (mean ± σ):     109.0 ms ±   0.3 ms    [User: 107.4 ms, System: 1.1 ms]
  Range (min … max):   108.7 ms … 109.6 ms    27 runs

With this patch:
0071.12: llist_mergesort() unsorted    0.24(0.22+0.01)
0071.14: llist_mergesort() sorted      0.12(0.10+0.01)
0071.16: llist_mergesort() reversed    0.12(0.10+0.01)

Benchmark 1: t/helper/test-tool mergesort test
  Time (mean ± σ):     109.2 ms ±   0.2 ms    [User: 107.5 ms, System: 1.1 ms]
  Range (min … max):   108.9 ms … 109.6 ms    27 runs

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17 15:20:38 -07:00
a92d8523ce commit-graph: pass repo_settings instead of repository
The parse_commit_graph() function takes a 'struct repository *' pointer,
but it only ever accesses config settings (either directly or through
the .settings field of the repo struct). Move all relevant config
settings into the repo_settings struct, and update parse_commit_graph()
and its existing callers so that it takes 'struct repo_settings *'
instead.

Callers of parse_commit_graph() will now need to call
prepare_repo_settings() themselves, or initialize a 'struct
repo_settings' directly.

Prior to ab14d0676c (commit-graph: pass a 'struct repository *' in more
places, 2020-09-09), parsing a commit-graph was a pure function
depending only on the contents of the commit-graph itself. Commit
ab14d0676c introduced a dependency on a `struct repository` pointer, and
later commits such as b66d84756f (commit-graph: respect
'commitGraph.readChangedPaths', 2020-09-09) added dependencies on config
settings, which were accessed through the `settings` field of the
repository pointer. This field was initialized via a call to
`prepare_repo_settings()`.

Additionally, this fixes an issue in fuzz-commit-graph: In 44c7e62
(2021-12-06, repo-settings:prepare_repo_settings only in git repos),
prepare_repo_settings was changed to issue a BUG() if it is called by a
process whose CWD is not a Git repository.

The combination of commits mentioned above broke fuzz-commit-graph,
which attempts to parse arbitrary fuzzing-engine-provided bytes as a
commit graph file. Prior to this change, parse_commit_graph() called
prepare_repo_settings(), but since we run the fuzz tests without a valid
repository, we are hitting the BUG() from 44c7e62 for every test case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:42:17 -07:00
8d1a744820 setup.c: create safe.bareRepository
There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `safe.bareRepository`, that tells Git whether
or not to die() when working with a bare repository. This config is an
enum of:

- "all": allow all bare repositories (this is the default)
- "explicit": only allow bare repositories specified via --git-dir
  or GIT_DIR.

If we want to protect users from such attacks by default, neither value
will suffice - "all" provides no protection, but "explicit" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
6061601d9f safe.directory: use git_protected_config()
Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.

As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
5b3c650777 config: learn git_protected_config()
`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`safe.bareRepository` should also be 'protected configuration only'. So,
for consistency, we'd like to have a single implementation for protected
configuration.

The primary constraints are:

1. Reading from protected configuration should be fast. Nearly all "git"
   commands inside a bare repository will read both `safe.directory` and
   `safe.bareRepository`, so we cannot afford to be slow.

2. Protected configuration must be readable when the gitdir is not
   known. `safe.directory` and `safe.bareRepository` both affect
   repository discovery and the gitdir is not known at that point [1].

The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.

Introduce git_protected_config(), which reads protected configuration
and caches them in the global configset protected_config. Then, refactor
`uploadpack.packObjectsHook` to use git_protected_config().

The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().

In light of constraint 1, this implementation can still be improved.
git_protected_config() iterates through every variable in
protected_config, which is wasteful, but it makes the conversion simple
because it matches existing patterns. We will likely implement constant
time lookup functions for protected configuration in a future series
(such functions already exist for non-protected configuration, i.e.
repo_config_get_*()).

An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.

[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
779ea9303a Documentation: define protected configuration
For security reasons, there are config variables that are only trusted
when they are specified in certain configuration scopes, which are
sometimes referred to on-list as 'protected configuration' [1]. A future
commit will introduce another such variable, so let's define our terms
so that we can have consistent documentation and implementation.

In our documentation, define 'protected configuration' as the system,
global and command config scopes. As a shorthand, I will refer to
variables that are only respected in protected configuration as
'protected configuration only', but this term is not used in the
documentation.

This definition of protected configuration is based on whether or not
Git can reasonably protect the user by ignoring the configuration scope:

- System, global and command line config are considered protected
  because an attacker who has control over any of those can do plenty of
  harm without Git, so we gain very little by ignoring those scopes.

- On the other hand, local (and similarly, worktree) config are not
  considered protected because it is relatively easy for an attacker to
  control local config, e.g.:

  - On some shared user environments, a non-admin attacker can create a
    repository high up the directory hierarchy (e.g. C:\.git on
    Windows), and a user may accidentally use it when their PS1
    automatically invokes "git" commands.

    `safe.directory` prevents attacks of this form by making sure that
    the user intended to use the shared repository. It obviously
    shouldn't be read from the repository, because that would end up
    trusting the repository that Git was supposed to reject.

  - "git upload-pack" is expected to run in repositories that may not be
    controlled by the user. We cannot ignore all config in that
    repository (because "git upload-pack" would fail), but we can limit
    the risks by ignoring `uploadpack.packObjectsHook`.

Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
following variables are intentionally excluded:

- `safe.directory` should be 'protected configuration only', but it does
  not technically fit the definition because it is not respected in the
  "command" scope. A future commit will fix this.

- `trace2.*` happens to read the same scopes as `safe.directory` because
  they share an implementation. However, this is not for security
  reasons; it is because we want to start tracing so early that
  repository-level config and "-c" are not available [2].

  This requirement is unique to `trace2.*`, so it does not makes sense
  for protected configuration to be subject to the same constraints.

[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
5f5af3735d Documentation/git-config.txt: add SCOPES section
In a subsequent commit, we will introduce "protected configuration",
which is easiest to describe in terms of configuration scopes (i.e. it's
the union of the 'system', 'global', and 'command' scopes). This
description is fine for ML discussions, but it's inadequate for end
users because we don't provide a good description of "configuration
scopes" in the public docs.

145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out
what those values mean.

Add a SCOPES section to Documentation/git-config.txt that describes the
configuration scopes, their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
9dd64cb4d3 The third batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:04:00 -07:00
361cbe6d6d Merge branch 'ab/submodule-cleanup'
Further preparation to turn git-submodule.sh into a builtin.

* ab/submodule-cleanup:
  git-sh-setup.sh: remove "say" function, change last users
  git-submodule.sh: use "$quiet", not "$GIT_QUIET"
  submodule--helper: eliminate internal "--update" option
  submodule--helper: understand --checkout, --merge and --rebase synonyms
  submodule--helper: report "submodule" as our name in some "-h" output
  submodule--helper: rename "absorb-git-dirs" to "absorbgitdirs"
  submodule update: remove "-v" option
  submodule--helper: have --require-init imply --init
  git-submodule.sh: remove unused top-level "--branch" argument
  git-submodule.sh: make the "$cached" variable a boolean
  git-submodule.sh: remove unused $prefix variable
  git-submodule.sh: remove unused sanitize_submodule_env()
2022-07-14 15:04:00 -07:00
0455aad1e3 Merge branch 'sy/mv-out-of-cone'
"git mv A B" in a sparsely populated working tree can be asked to
move a path between directories that are "in cone" (i.e. expected
to be materialized in the working tree) and "out of cone"
(i.e. expected to be hidden).  The handling of such cases has been
improved.

* sy/mv-out-of-cone:
  mv: add check_dir_in_index() and solve general dir check issue
  mv: use flags mode for update_mode
  mv: check if <destination> exists in index to handle overwriting
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: decouple if/else-if checks using goto
  mv: update sparsity after moving from out-of-cone to in-cone
  t1092: mv directory from out-of-cone to in-cone
  t7002: add tests for moving out-of-cone file/directory
2022-07-14 15:04:00 -07:00
73b9ef6ab1 Merge branch 'hx/unpack-streaming'
Allow large objects read from a packstream to be streamed into a
loose object file straight, without having to keep it in-core as a
whole.

* hx/unpack-streaming:
  unpack-objects: use stream_loose_object() to unpack large objects
  core doc: modernize core.bigFileThreshold documentation
  object-file.c: add "stream_loose_object()" to handle large object
  object-file.c: factor out deflate part of write_loose_object()
  object-file.c: refactor write_loose_object() to several steps
  unpack-objects: low memory footprint for get_data() in dry_run mode
2022-07-14 15:03:59 -07:00
be733e1200 Merge branch 'en/merge-tree'
"git merge-tree" learned a new mode where it takes two commits and
computes a tree that would result in the merge commit, if the
histories leading to these two commits were to be merged.

* en/merge-tree:
  git-merge-tree.txt: add a section on potentional usage mistakes
  merge-tree: add a --allow-unrelated-histories flag
  merge-tree: allow `ls-files -u` style info to be NUL terminated
  merge-ort: optionally produce machine-readable output
  merge-ort: store more specific conflict information
  merge-ort: make `path_messages` a strmap to a string_list
  merge-ort: store messages in a list, not in a single strbuf
  merge-tree: provide easy access to `ls-files -u` style info
  merge-tree: provide a list of which files have conflicts
  merge-ort: remove command-line-centric submodule message from merge-ort
  merge-ort: provide a merge_get_conflicted_files() helper function
  merge-tree: support including merge messages in output
  merge-ort: split out a separate display_update_messages() function
  merge-tree: implement real merges
  merge-tree: add option parsing and initial shell for real merge function
  merge-tree: move logic for existing merge into new function
  merge-tree: rename merge_trees() to trivial_merge_trees()
2022-07-14 15:03:59 -07:00
dc6315e1fc Merge branch 'gg/worktree-from-the-above'
In a non-bare repository, the behavior of Git when the
core.worktree configuration variable points at a directory that has
a repository as its subdirectory, regressed in Git 2.27 days.

* gg/worktree-from-the-above:
  dir: minor refactoring / clean-up
  dir: traverse into repository
2022-07-14 15:03:58 -07:00
4e2a4d1dd4 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-13 14:54:56 -07:00
fba8e7fa2d Merge branch 'ds/git-rebase-doc-markup'
References to commands-to-be-typed-literally in "git rebase"
documentation mark-up have been corrected.

* ds/git-rebase-doc-markup:
  git-rebase.txt: use back-ticks consistently
2022-07-13 14:54:56 -07:00
9a13943ef4 Merge branch 'tk/rev-parse-doc-clarify-at-u'
Doc update.

* tk/rev-parse-doc-clarify-at-u:
  rev-parse: documentation adjustment - mention remote tracking with @{u}
2022-07-13 14:54:55 -07:00
8c4f65e0bf Merge branch 'cl/grep-max-count'
"git grep -m<max-hits>" is a way to limit the hits shown per file.

* cl/grep-max-count:
  grep: add --max-count command line option
2022-07-13 14:54:55 -07:00
884339a15f Merge branch 'dr/i18n-die-warn-error-usage'
Give _() markings to fatal/warning/usage: labels that are shown in
front of these messages.

* dr/i18n-die-warn-error-usage:
  i18n: mark message helpers prefix for translation
2022-07-13 14:54:54 -07:00
81705c4ee6 Merge branch 'zk/push-use-bitmaps'
"git push" sometimes perform poorly when reachability bitmaps are
used, even in a repository where other operations are helped by
bitmaps.  The push.useBitmaps configuration variable is introduced
to allow disabling use of reachability bitmaps only for "git push".

* zk/push-use-bitmaps:
  send-pack.c: add config push.useBitmaps
2022-07-13 14:54:54 -07:00
33f448b5fc Merge branch 'jk/remote-show-with-negative-refspecs'
"git remote show [-n] frotz" now pays attention to negative
pathspec.

* jk/remote-show-with-negative-refspecs:
  remote: handle negative refspecs in git remote show
2022-07-13 14:54:54 -07:00
6fccbdaa51 Merge branch 'ro/mktree-allow-missing-fix'
"git mktree --missing" lazily fetched objects that are missing from
the local object store, which was totally unnecessary for the purpose
of creating the tree object(s) from its input.

* ro/mktree-allow-missing-fix:
  mktree: do not check type of remote objects
2022-07-13 14:54:53 -07:00
ee493108e5 Merge branch 'll/ls-files-tests-update'
Test update.

* ll/ls-files-tests-update:
  ls-files: update test style
2022-07-13 14:54:53 -07:00
92a25a8897 Merge branch 'ab/test-quoting-fix'
Fixes for tests when the source directory has unusual characters in
its path, e.g. whitespaces, double-quotes, etc.

* ab/test-quoting-fix:
  config tests: fix harmless but broken "rm -r" cleanup
  test-lib.sh: fix prepend_var() quoting issue
  tests: add missing double quotes to included library paths
2022-07-13 14:54:52 -07:00
db791e6e8f Merge branch 'ds/t5510-brokequote'
Test fix.

* ds/t5510-brokequote:
  t5510: replace 'origin' with URL more carefully
2022-07-13 14:54:52 -07:00
b59f04f843 Merge branch 'tb/pack-objects-remove-pahole-comment'
Comment fix.

* tb/pack-objects-remove-pahole-comment:
  pack-objects.h: remove outdated pahole results
2022-07-13 14:54:51 -07:00
8da79e7250 Merge branch 'en/t6429-test-must-be-empty-fix'
A test fix.

* en/t6429-test-must-be-empty-fix:
  t6429: fix use of non-existent function
2022-07-13 14:54:51 -07:00
7fefa1b68e Merge branch 'ds/branch-checked-out' into ds/rebase-update-ref
* ds/branch-checked-out:
  branch: drop unused worktrees variable
  fetch: stop passing around unused worktrees variable
  branch: fix branch_checked_out() leaks
  branch: use branch_checked_out() when deleting refs
  fetch: use new branch_checked_out() and add tests
  branch: check for bisects and rebases
  branch: add branch_checked_out() helper
2022-07-12 08:38:42 -07:00
f2e5255fc2 Sync with Git 2.37.1 2022-07-11 16:08:49 -07:00
55ece90cdd The first batch after Git 2.37
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11 15:38:52 -07:00
1b638216b4 Merge branch 'ds/vscode-settings'
* ds/vscode-settings:
  vscode: improve tab size and wrapping
2022-07-11 15:38:52 -07:00
6d65013bb7 Merge branch 'cr/setup-bug-typo'
Typofix in a BUG() message.

* cr/setup-bug-typo:
  setup: fix function name in a BUG() message
2022-07-11 15:38:52 -07:00
b5a2d6cc49 Merge branch 'rs/archive-with-internal-gzip'
Teach "git archive" to (optionally and then by default) avoid
spawning an external "gzip" process when creating ".tar.gz" (and
".tgz") archives.

* rs/archive-with-internal-gzip:
  archive-tar: use internal gzip by default
  archive-tar: use OS_CODE 3 (Unix) for internal gzip
  archive-tar: add internal gzip implementation
  archive-tar: factor out write_block()
  archive: rename archiver data field to filter_command
  archive: update format documentation
2022-07-11 15:38:51 -07:00
c2d01098fb Merge branch 'ds/branch-checked-out'
Introduce a helper to see if a branch is already being worked on
(hence should not be newly checked out in a working tree), which
performs much better than the existing find_shared_symref() to
replace many uses of the latter.

* ds/branch-checked-out:
  branch: drop unused worktrees variable
  fetch: stop passing around unused worktrees variable
  branch: fix branch_checked_out() leaks
  branch: use branch_checked_out() when deleting refs
  fetch: use new branch_checked_out() and add tests
  branch: check for bisects and rebases
  branch: add branch_checked_out() helper
2022-07-11 15:38:51 -07:00
2b970bc09f Merge branch 'jk/optim-promisor-object-enumeration'
Collection of what is referenced by objects in promisor packs have
been optimized to inspect these objects in the in-pack order.

* jk/optim-promisor-object-enumeration:
  is_promisor_object(): walk promisor packs in pack-order
2022-07-11 15:38:50 -07:00
5dbbdaac79 Merge branch 'ac/bitmap-format-doc'
Adjust technical/bitmap-format to be formatted by AsciiDoc, and
add some missing information to the documentation.

* ac/bitmap-format-doc:
  bitmap-format.txt: add information for trailing checksum
  bitmap-format.txt: fix some formatting issues
  bitmap-format.txt: feed the file to asciidoc to generate html
2022-07-11 15:38:50 -07:00
2c8c0b4843 Merge branch 'pb/diff-doc-raw-format'
Update "git diff/log --raw" format documentation.

* pb/diff-doc-raw-format:
  diff-index.txt: update raw output format in examples
  diff-format.txt: correct misleading wording
  diff-format.txt: dst can be 0* SHA-1 when path is deleted, too
2022-07-11 15:38:49 -07:00
96730964f8 Merge branch 'jk/revisions-doc-markup-fix'
Documentation mark-up fix.

* jk/revisions-doc-markup-fix:
  revisions.txt: escape "..." to avoid asciidoc horizontal ellipsis
2022-07-11 15:38:49 -07:00
a2d1f00bdd Merge branch 'rs/combine-diff-with-incompatible-options'
Certain diff options are currently ignored when combined-diff is
shown; mark them as incompatible with the feature.

* rs/combine-diff-with-incompatible-options:
  combine-diff: abort if --output is given
  combine-diff: abort if --ignore-matching-lines is given
2022-07-11 15:38:48 -07:00
359b01ca84 ref-filter: disable save_commit_buffer while traversing
Various ref-filter options like "--contains" or "--merged" may cause us
to traverse large segments of the history graph. It's counter-productive
to have save_commit_buffer turned on, as that will instruct the commit
code to cache in-memory the object contents for each commit we traverse.

This increases the amount of heap memory used while providing little or
no benefit, since we're not actually planning to display those commits
(which is the usual reason that tools like git-log want to keep them
around). We can easily disable this feature while ref-filter is running.
This lowers peak heap (as measured by massif) for running:

  git tag --contains 1da177e4c3

in linux.git from ~100MB to ~20MB. It also seems to improve runtime by
4-5% (600ms vs 630ms).

A few points to note:

  - it should be safe to temporarily disable save_commit_buffer like
    this. The saved buffers are accessed through get_commit_buffer(),
    which treats the saved ones like a cache, and loads on-demand from
    the object database on a cache miss. So any code that was using this
    would not be wrong, it might just incur an extra object lookup for
    some objects. But...

  - I don't think any ref-filter related code is using the cache. While
    it's true that an option like "--format=%(*contents:subject)" or
    "--sort=*authordate" will need to look at the commit contents,
    ref-filter doesn't use get_commit_buffer() to do so! It always reads
    the objects directly via read_object_file(), though it does avoid
    re-reading objects if the format can be satisfied without them.

    Timing "git tag --format=%(*authordate)" shows that we're the same
    before and after, as expected.

  - Note that all of this assumes you don't have a commit-graph file. if
    you do, then the heap usage is even lower, and the runtime is 10x
    faster. So in that sense this is not urgent, as there's a much
    better solution. But since it's such an obvious and easy win for
    fallback cases (including commits which aren't yet in the graph
    file), there's no reason not to.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11 14:27:31 -07:00
b0c4adcdd7 remote-curl: send Accept-Language header to server
Git server end's ability to accept Accept-Language header was introduced
in f18604bbf2 (http: add Accept-Language header if possible, 2015-01-28),
but this is only used by very early phase of the transfer, which is HTTP
GET request to discover references. For other phases, like POST request
in the smart HTTP, the server does not know what language the client
speaks.

Teach git client to learn end-user's preferred language and throw
accept-language header to the server side. Once the server gets this header,
it has the ability to talk to end-user with language they understand.
This would be very helpful for many non-English speakers.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11 12:24:28 -07:00
803978da49 gpg-interface: add function for converting trust level to string
Add new helper function `gpg_trust_level_to_str()` which will
convert a given member of `enum signature_trust_level` to its
corresponding string (in lowercase). For example, `TRUST_ULTIMATE`
will yield the string "ultimate".

This will abstract out some code in `pretty.c` relating to gpg
signature trust levels.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Jaydeep Das <jaydeepjd.8914@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-10 22:10:23 -07:00
cc74afb83f multi-pack-index: simplify handling of unknown --options
Although parse_options() can handle unknown --options just fine, none
of 'git multi-pack-index's subcommands rely on it, but do it on their
own: they invoke parse_options() with the PARSE_OPT_KEEP_UNKNOWN flag,
then check whether there are any unparsed arguments left, and print
usage and quit if necessary.

Drop that PARSE_OPT_KEEP_UNKNOWN flag to let parse_options() handle
unknown options instead, which has the additional benefit that it
prints not only the usage but an "error: unknown option `foo'" message
as well.

Do leave the unparsed arguments check to catch any unexpected
non-option arguments, though, e.g. 'git multi-pack-index write foo'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-10 14:53:48 -07:00
f53156f2ee cocci: avoid normalization rules for memcpy
Some of the rules for using COPY_ARRAY instead of memcpy with sizeof are
intended to reduce the number of sizeof variants to deal with.  They can
have unintended side effects if only they match, but not the one for the
COPY_ARRAY conversion at the end.

Avoid these side effects by instead using a self-contained rule for each
combination of array and pointer for source and destination which lists
all sizeof variants inline.

This lets "make contrib/coccinelle/array.cocci.patch" take 15% longer on
my machine, but gives peace of mind that no incomplete transformation
will be generated.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-10 14:52:05 -07:00
e555735836 sha256: add support for Nettle
For SHA-256, we currently have support for OpenSSL and libgcrypt because
these two libraries contain optimized implementations that can take
advantage of native processor instructions.  However, OpenSSL is not
suitable for linking against for Linux distros due to licensing
incompatibilities with the GPLv2, and libgcrypt has been less favored by
cryptographers due to some security-related implementation issues,
which, while not affecting our use of hash algorithms, has affected its
reputation.

Let's add another option that's compatible with the GPLv2, which is
Nettle.  This is an option which is generally better than libgcrypt
because on many distros GnuTLS (which uses Nettle) is used for HTTPS and
therefore as a practical matter it will be available on most systems.
As a result, prefer it over libgcrypt and our built-in implementation.

Nettle also has recently gained support for Intel's SHA-NI instructions,
which compare very favorably to other implementations, as well as
assembly implementations for when SHA-NI is not available.

A git gc on git.git sees a 12% performance improvement with Nettle over
our block SHA-256 implementation due to general assembly improvements.
With SHA-NI, the performance of raw SHA-256 on a 2 GiB file goes from
7.296 seconds with block SHA-256 to 1.523 seconds with Nettle.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-10 14:43:34 -07:00
eee227ad8e builtin/mv.c: use the MOVE_ARRAY() macro instead of memmove()
The variables 'source', 'destination', and 'submodule_gitfile' are
all of type "const char **", and an element of such an array is of
"type const char *", but these memmove() calls were written as if
these variables are of type "char **".

Once these memmove() calls are fixed to use the correct type to
compute the number of bytes to be moved, e.g.

-      memmove(source + i, source + i + 1, n * sizeof(char *));
+      memmove(source + i, source + i + 1, n * sizeof(const char *));

existing contrib/coccinelle/array.cocci rules can recognize them as
candidates for turning into MOVE_ARRAY().

While at it, use CALLOC_ARRAY() instead of xcalloc() to allocate the
modes[] array that is involved in the change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-09 18:38:57 -07:00
f7b587bf65 xdiff: introduce XDL_ALLOC_GROW()
Add a helper to grow an array. This is analogous to ALLOC_GROW() in
the rest of the codebase but returns −1 on allocation failure to
accommodate other users of libxdiff such as libgit2. It will also
return a error if the multiplication overflows while calculating the
new allocation size. Note that this keeps doubling on reallocation
like the code it is replacing rather than increasing the existing size
by half like ALLOC_GROW(). It does however copy ALLOC_GROW()'s trick
of adding a small amount to the new allocation to avoid a lot of
reallocations at small sizes.

Note that xdl_alloc_grow_helper() uses long rather than size_t for
`nr` and `alloc` to match the existing code.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08 09:34:30 -07:00
848fd5ae5b xdiff: introduce XDL_CALLOC_ARRAY()
Add a helper for allocating an array and initialize the elements to
zero. This is analogous to CALLOC_ARRAY() in the rest of the codebase
but it returns NULL on allocation failures rather than dying to
accommodate other users of libxdiff such as libgit2.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08 09:34:30 -07:00
18aae7e21e xdiff: introduce xdl_calloc
In preparation for introducing XDL_CALLOC_ARRAY() use calloc() to
obtain zeroed out memory rather than malloc() followed by memset(). To
try and keep the lines a reasonable length this commit also stops
casting the pointer returned by calloc() as this is unnecessary.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08 09:34:30 -07:00
abf04bdaa8 xdiff: introduce XDL_ALLOC_ARRAY()
Add a helper to allocate an array that automatically calculates the
allocation size. This is analogous to ALLOC_ARRAY() in the rest of the
codebase but returns NULL if the allocation fails to accommodate other
users of libxdiff such as libgit2. The helper will also return NULL if
the multiplication in the allocation calculation overflows.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08 09:34:30 -07:00
06f5f8940c cocci: generalize "unused" rule to cover more than "strbuf"
Generalize the newly added "unused.cocci" rule to find more than just
"struct strbuf", let's have it find the same unused patterns for
"struct string_list", as well as other code that uses
similar-looking *_{release,clear,free}() and {release,clear,free}_*()
functions.

We're intentionally loose in accepting e.g. a "strbuf_init(&sb)"
followed by a "string_list_clear(&sb, 0)".  It's assumed that the
compiler will catch any such invalid code, i.e. that our
constructors/destructors don't take a "void *".

See [1] for example of code that would be covered by the
"get_worktrees()" part of this rule. We'd still need work that the
series is based on (we were passing "worktrees" to a function), but
could now do the change in [1] automatically.

1. https://lore.kernel.org/git/Yq6eJFUPPTv%2Fzc0o@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
4f40f6cb73 cocci: add and apply a rule to find "unused" strbufs
Add a coccinelle rule to remove "struct strbuf" initialization
followed by calling "strbuf_release()" function, without any uses of
the strbuf in the same function.

See the tests in contrib/coccinelle/tests/unused.{c,res} for what it's
intended to find and replace.

The inclusion of "contrib/scalar/scalar.c" is because "spatch" was
manually run on it (we don't usually run spatch on contrib).

Per the "buggy code" comment we also match a strbuf_init() before the
xmalloc(), but we're not seeking to be so strict as to make checks
that the compiler will catch for us redundant. Saying we'll match
either "init" or "xmalloc" lines makes the rule simpler.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
7a9a10b10e cocci: have "coccicheck{,-pending}" depend on "coccicheck-test"
Have the newly introduced "coccicheck-test" target run implicitly when
"coccicheck" itself is run. As with e.g. the "check-chainlint"
target (see [1]) it makes sense to run this unconditionally before we
run other "spatch" rules as a basic sanity check. See

1. 803394459d (t/Makefile: add machinery to check correctness of
   chainlint.sed, 2018-07-11)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
f7ff6597a7 cocci: add a "coccicheck-test" target and test *.cocci rules
Add a "coccicheck-test" target to test our *.cocci rules, and as a
demonstration add tests for the rules added in 39ea59a257 (remove
unnecessary NULL check before free(3), 2016-10-08) and
1b83d1251e (coccinelle: add a rule to make "expression" code use
FREE_AND_NULL(), 2017-06-15).

I considered making use of the "spatch --test" option, and the choice
of a "tests" over a "t" directory is to make these tests compatible
with such a future change.

Unfortunately "spatch --test" doesn't return meaningful exit codes,
AFAICT you need to "grep" its output to see if the *.res is what you
expect. There's "--test-okfailed", but I didn't find a way to sensibly
integrate those (it relies on some in-between status files, but
doesn't help with the status codes).

Instead let's use a "--sp-file" pattern similar to the main
"coccicheck" rule, with the difference that we use and compare the
two *.res files with cmp(1).

The --very-quiet and --no-show-diff options ensure that we don't need
to pipe stdout and stderr somewhere. Unlike the "%.cocci.patch" rule
we're not using the diff.

The "cmp || git diff" is optimistically giving us better output on
failure, but even if we only have POSIX cmp and no system git
installed we'll still fail with the "cmp", just with an error message
that isn't as friendly. The "2>/dev/null" is in case we don't have a
"git" installed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
af0aa6904b Makefile & .gitignore: ignore & clean "git.res", not "*.res"
Adjust the overly broad .gitignore and "make clean" rule added in
ce39c2e04c (Provide a Windows version resource for the git
executables., 2012-05-24).

For now this is merely a correctness fix, but needed because a
subsequent commit will want to check in *.res files elsewhere in the
tree, which we shouldn't have to "git add -f".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
7b63ea5750 Makefile: remove mandatory "spatch" arguments from SPATCH_FLAGS
The "--patch ." part of SPATCH_FLAGS added in f57d11728d (coccinelle:
put sane filenames into output patches, 2018-07-23) should have been
added unconditionally to the "spatch" invocation instead, using it
isn't optional.

Let's also move the other mandatory flag to come after
$(SPATCH_FLAGS), to ensure that our "--sp-file" overrides any provided
in the environment, both --sp-file <arg> and --patch <arg> are
last-option-wins as far as spatch(1) option parsing is concerned.

The environment variable override was initially added in
a9a884aea5 (coccicheck: use --all-includes by default,
2016-09-30). In practice there's probably nobody that's using
SPATCH_FLAGS to try to intentionally break our invocations, but since
we're changing this let's make it clear what (if anything) we expect
to be overridden by user-supplied flags.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
30cc8d0f14 A regression fix for 2.37
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-02 21:56:08 -07:00
0f0bc2124b Merge branch 'js/add-i-delete'
Rewrite of "git add -i" in C that appeared in Git 2.25 didn't
correctly record a removed file to the index, which was fixed.

* js/add-i-delete:
  add --interactive: allow `update` to stage deleted files
2022-07-02 21:56:08 -07:00
b91a2b6594 mv: add check_dir_in_index() and solve general dir check issue
Originally, moving a <source> directory which is not on-disk due
to its existence outside of sparse-checkout cone, "giv mv" command
errors out with "bad source".

Add a helper check_dir_in_index() function to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.

Change the checking logic, so that such <source> directory makes
"giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
24ea81d9ac mv: use flags mode for update_mode
As suggested by Derrick [1], move the in-line definition of
"enum update_mode" to the top of the file and make it use "flags"
mode (each state is a different bit in the word).

Change the flag assignments from '=' (single assignment) to '|='
(additive). Also change flag evaluation from '==' to '&', etc.

[1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
8a26a3915f mv: check if <destination> exists in index to handle overwriting
Originally, moving a sparse file into cone can result in unwarned
overwrite of existing entry. The expected behavior is that if the
<destination> exists in the entry, user should be prompted to supply
a [-f|--force] to carry out the operation, or the operation should
fail.

Add a check mechanism to do that.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
6645b03ca5 mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
Originally, moving a <source> file which is not on-disk but exists in
index as a SKIP_WORKTREE enabled cache entry, "giv mv" command errors
out with "bad source".

Change the checking logic, so that such <source>
file makes "giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
7889755bae mv: decouple if/else-if checks using goto
Previous if/else-if chain are highly nested and hard to develop/extend.

Refactor to decouple this if/else-if chain by using goto to jump ahead.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
707fa2f76a mv: update sparsity after moving from out-of-cone to in-cone
Originally, "git mv" a sparse file from out-of-cone to
in-cone does not update the moved file's sparsity (remove its
SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
not checked out in the working tree.

Update the behavior so that:
1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
   corresponding cache entry.
2. The moved cache entry is checked out in the working tree to reflect
   the updated sparsity.

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
1143cc01b7 t1092: mv directory from out-of-cone to in-cone
Add test for "mv: add check_dir_in_index() and solve general dir check
issue" in this series.

This change tests the following:

1. mv <source> as a directory on the sparse index boundary (where it
   would be a sparse directory in a sparse index).
2. mv <source> as a directory which is deeper than the boundary (so
   the sparse index would expand in the cache_name_pos() method).

These tests can be written now for correctness, but later the first case
can be updated to use the 'ensure_not_expanded' helper in t1092.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:15 -07:00
367844e5b7 t7002: add tests for moving out-of-cone file/directory
Add corresponding tests to test following situations:

We do not have sufficient coverage of moving files outside
of a sparse-checkout cone. Create new tests covering this
behavior, keeping in mind that the user can include --sparse
(or not), move a file or directory, and the destination can
already exist in the index (in this case user can use --force
to overwrite existing entry).

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:15 -07:00
f40a693450 test-tool delta: fix a memory leak
Fix a memory leak introduced in a310d43494 ([PATCH] Deltification
library work by Nicolas Pitre., 2005-05-19), as a result we can mark
another test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:50 -07:00
34e691288d test-tool ref-store: fix a memory leak
Fix a memory leak introduced in fa099d2322 (worktree.c: kill
parse_ref() in favor of refs_resolve_ref_unsafe(), 2017-04-24), as a
result we can mark another test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:50 -07:00
9794633b4e test-tool bloom: fix memory leaks
Fix memory leaks introduced with these tests in f1294eaf7f (bloom.c:
introduce core Bloom filter constructs, 2020-03-30), as a result we
can mark almost the entirety of t0095-bloom.sh as passing with
SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true", there's still an
unrelated memory leak in "git commit" in one of the tests, let's skip
that one under SANITIZE_LEAK for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:50 -07:00
1caaa858cc test-tool json-writer: fix memory leaks
Fix memory leaks introduced with these tests in
75459410ed (json_writer: new routines to create JSON data,
2018-07-13), as a result we can mark a test as passing with
SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:50 -07:00
a20b0dc796 test-tool regex: call regfree(), fix memory leaks
Fix memory leaks in "test-tool regex" which have been there since
c91841594c (test-regex: Add a test to check for a bug in the regex
routines, 2012-09-01), as a result we can mark a test as passing with
SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true".

We could regfree() on the die() paths here, which would make some
invocations of valgrind(1) happy, but let's just target SANITIZE=leak
for now. Variables that are still reachable when we die() are not
reported as leaks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:50 -07:00
1c343e5aef test-tool urlmatch-normalization: fix a memory leak
Fix a memory leak in "test-tool urlmatch-normalization", as a result
we can mark the corresponding test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:49 -07:00
9afa46d4a6 test-tool {dump,scrap}-cache-tree: fix memory leaks
Fix memory leaks in two test-tools used by t0090-cache-tree.sh. As a
result we can mark the test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:49 -07:00
e287a5b0a4 test-tool path-utils: fix a memory leak
Fix a memory leak in "test-tool path-utils", as a result we can mark
the corresponding test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:49 -07:00
330ca8501b test-tool test-hash: fix a memory leak
Fix a memory leak in "test-tool test-hash" which has been there since
b57cbbf8a8 (test-sha1: test hashing large buffer, 2006-06-24), as a
result we can mark more tests as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 13:38:49 -07:00
ece3974ba6 pull: fix a "struct oid_array" memory leak
Fix a memory leak introduced in 44c175c7a4 (pull: error on no merge
candidates, 2015-06-18). As a result we can mark several tests as
passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true".

Removing the "int ret = 0" assignment added here in a6d7eb2c7a (pull:
optionally rebase submodules (remote submodule changes only),
2017-06-23) is not a logic error, it could always have been left
uninitialized (as "int ret"), now that we'll use the "ret" from the
upper scope we can drop the assignment in the "opt_rebase" branch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
27472b5195 cat-file: fix a common "struct object_context" memory leak
Fix a memory leak where "cat-file" will leak the "path" member. See
e5fba602e5 (textconv: support for cat_file, 2010-06-15) for the code
that introduced the offending get_oid_with_context() call (called
get_sha1_with_context() at the time).

As a result we can mark several tests as passing with SANITIZE=leak
using "TEST_PASSES_SANITIZE_LEAK=true".

As noted in dc944b65f1 (get_sha1_with_context: dynamically allocate
oc->path, 2017-05-19) callers must free the "path" member. That same
commit added the relevant free() to this function, but we weren't
catching cases where we'd return early.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
55916bba0f gc: fix a memory leak
Fix a memory leak in code added in 41abfe15d9 (maintenance: add
pack-refs task, 2021-02-09), we need to call strvec_clear() on the
"struct strvec" that we initialized.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
33d0dda633 checkout: avoid "struct unpack_trees_options" leak
In 1c41d2805e (unpack_trees_options: free messages when done,
2018-05-21) we started calling clear_unpack_trees_porcelain() on this
codepath, but missed this error path.

We could call clear_unpack_trees_porcelain() just before we error()
and return when unmerged_cache() fails, but the more correct fix is to
not have the unmerged_cache() check happen in the middle of our
"topts" setup.

Before 23cbf11b5c (merge-recursive: porcelain messages for checkout,
2010-08-11) we would not malloc() to setup our "topts", which is when
this started to leak on the error path.

Before that this code wasn't conflating the setup of "topts" and the
unmerged_cache() call in any meaningful way. The initial version in
782c2d65c2 (Build in checkout, 2008-02-07) just does a "memset" of
it, and initializes a single struct member.

Then in 8ccba008ee (unpack-trees: allow Porcelain to give different
error messages, 2008-05-17) we added the initialization of the error
message, which as noted above finally started leaking in 23cbf11b5c.

Let's fix the memory leak, and avoid future issues by initializing the
"topts" with a helper function. There are no functional changes here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
e72e12cc02 merge-file: fix memory leaks on error path
Fix a memory leak in "merge-file", we need to loop over the "mmfs"
array and free() what we've got so far when we error out. As a result
we can mark a test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
480a0e30a7 merge-file: refactor for subsequent memory leak fix
Refactor the code in builtin/merge-file.c to:

 * Use the initializer to zero out "mmfs", and use modern C syntax for
   the rest.

 * Refactor the the inner loop to use a variable and "if/else if"
   pattern followed by "return". This will make a change to change it to
   a "goto cleanup" pattern smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
d90dafbe31 cat-file: fix a memory leak in --batch-command mode
Fix a memory leak introduced in 440c705ea6 (cat-file: add
--batch-command mode, 2022-02-18). The free_cmds() function was only
called on "queued_nr" if we had a "flush" command. As the "without
flush for blob info" test added in the same commit shows we can't rely
on that, so let's call free_cmds() again at the end.

Since "nr" follows the usual pattern of being set to 0 if we've
free()'d the memory already it's OK to call it twice, even in cases
where we are doing a "flush".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
fd74ac95ac revert: free "struct replay_opts" members
Call the release_revisions() function added in
1878b5edc0 (revision.[ch]: provide and start using a
release_revisions(), 2022-04-13) in cmd_revert(), as well as freeing
the xmalloc()'d "revs" member itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:42 -07:00
bc57ba1d54 submodule.c: free() memory from xgetcwd()
Fix a memory leak in code added in bf0231c661 (rev-parse: add
--show-superproject-working-tree, 2017-03-08), we should never have
made the result of xgetcwd() a "const char *", as we return a
strbuf_detach()'d value. Let's fix that and free() it when we're done
with it.

We can't mark any tests passing passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true" as a result of this change, but
e.g. "t/t1500-rev-parse.sh" now gets closer to passing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:42 -07:00
74a06a9f21 clone: fix memory leak in wanted_peer_refs()
Fix a memory leak added in 0ec4b1650c (clone: fix ref selection in
--single-branch --branch=xxx, 2012-06-22).

Whether we get our "remote_head" from copy_ref() directly, or with a
call to guess_remote_head() it'll be the result of a copy_ref() in
either case, as guess_remote_head() is a wrapper for copy_ref() (or it
returns NULL).

We can't mark any tests passing passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true" as a result of this change, but
e.g. "t/t1500-rev-parse.sh" now gets closer to passing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:42 -07:00
99b6c45d8f check-ref-format: fix trivial memory leak
Fix a memory leak in "git check-ref-format" that's been present in the
code in one form or another since 38eedc634b (git check-ref-format
--print, 2009-10-12), the code got substantially refactored in
cfbe22f03f (check-ref-format: handle subcommands in separate
functions, 2010-08-05).

As a result we can mark a test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:42 -07:00
5ad87271cf submodule--helper: remove display path helper
All invocations of do_get_submodule_displaypath() pass
get_super_prefix() as the super_prefix arg, which is exactly the same
as get_submodule_displaypath().

Replace all calls to do_get_submodule_displaypath() with
get_submodule_displaypath(), and since it has no more callers, remove
it.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-30 22:41:46 -07:00
d7a714fddc submodule--helper update: use --super-prefix
Unlike the other subcommands, "git submodule--helper update" uses the
"--recursive-prefix" flag instead of "--super-prefix". The two flags are
otherwise identical (they only serve to compute the 'display path' of a
submodule), except that there is a dedicated helper function to get the
value of "--super-prefix".

This inconsistency exists because "git submodule update" used to pass
"--recursive-prefix" between shell and C (introduced in [1]) before
"--super-prefix" was introduced (in [2]), and for simplicity, we kept
this name when "git submodule--helper update" was created.

Remove "--recursive-prefix" and its associated code from "git
submodule--helper update", replacing it with "--super-prefix".

To use "--super-prefix", module_update is marked with
SUPPORT_SUPER_PREFIX. Note that module_clone must also be marked with
SUPPORT_SUPER_PREFIX, otherwise the "git submodule--helper clone"
subprocess will fail check because "--super-prefix" is propagated via
the environment.

[1] 48308681b0 (git submodule update: have a dedicated helper for
cloning, 2016-02-29)
[2] 74866d7579 (git: make super-prefix option, 2016-10-07)

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-30 22:41:46 -07:00
b0f8b21305 submodule--helper: remove unused SUPPORT_SUPER_PREFIX flags
Remove the SUPPORT_SUPER_PREFIX flag from "add", "init" and
"summary". For the "add" command it hasn't been used since [1],
likewise for "init" and "summary" since [2] and [3], respectively.

As implemented in 74866d7579 (git: make super-prefix option,
2016-10-07) the SUPPORT_SUPER_PREFIX flag in git.c applies for the
entire command, but as implemented in 89c8626557 (submodule helper:
support super prefix, 2016-12-08) we assert here in
cmd_submodule__helper() that we're not getting the flag unexpectedly.

1. 8c8195e9c3 (submodule--helper: introduce add-clone subcommand,
   2021-07-10)
2. 6e7c14e65c (submodule update --init: display correct path from
   submodule, 2017-01-06)
3. 1cf823d8f0 (submodule: remove unnecessary `prefix` based option
   logic, 2021-06-22)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-30 22:41:45 -07:00
58cec298f1 submodule--helper: use correct display path helper
Replace a chunk of code in update_submodule() with an equivalent
do_get_submodule_displaypath() invocation. This is already tested by
t/t7406-submodule-update.sh:'submodule update --init --recursive from
subdirectory', so no tests are added.

The two are equivalent because:

- Exactly one of recursive_prefix|prefix is non-NULL at a time; prefix
  is set at the superproject level, and recursive_prefix is set when
  recursing into submodules. There is also a BUG() statement in
  get_submodule_displaypath() that asserts that both cannot be non-NULL.

- In get_submodule_displaypath(), get_super_prefix() always returns NULL
  because "--super-prefix" is never passed. Thus calling it is
  equivalent to calling do_get_submodule_displaypath() with super_prefix
  = NULL.

Therefore:

- When recursive_prefix is non-NULL, prefix is NULL, and thus
  get_submodule_displaypath() just returns prefixed_path. This is
  identical to calling do_get_submodule_displaypath() with super_prefix
  = recursive_prefix because the return value is still the concatenation
  of recursive_prefix + update_data->sm_path.

- When prefix is non-NULL, prefixed_path = update_data->sm_path. Thus
  calling get_submodule_displaypath() with prefixed_path is equivalent
  to calling do_get_submodule_displaypath() with update_data->sm_path

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-30 22:41:45 -07:00
cb49e1e8d3 submodule--helper: don't recreate recursive prefix
update_submodule() uses duplicated code to compute
update_data->displaypath and next.recursive_prefix. The latter is just
the former with "/" appended to it, and since update_data->displaypath
not changed outside of this statement, we can just reuse the already
computed result.

We can go one step further and remove the reference to
next.recursive_prefix altogether. Since it is only used in
update_data_to_args() (to compute the "--recursive-prefix" flag for the
recursive update child process) we can just use the already computed
.displaypath value of there.

Delete the duplicated code, and remove the unnecessary reference to
next.recursive_prefix. As a bonus, this fixes a memory leak where
prefixed_path was never freed (this leak was first reported in [1]).

[1] https://lore.kernel.org/git/877a45867ae368bf9e053caedcb6cf421e02344d.1655336146.git.gitgitgadget@gmail.com

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-30 22:41:45 -07:00
618b8445d9 submodule--helper update: use display path helper
There are two locations in prepare_to_clone_next_submodule() that
manually calculate the submodule display path, but should just use
do_get_submodule_displaypath() for consistency.

Do this replacement and reorder the code slightly to avoid computing
the display path twice.

Until the preceding commit this code had never been tested, with our
newly added tests we can see that both these sites have been computing
the display path incorrectly ever since they were introduced in
48308681b0 (git submodule update: have a dedicated helper for cloning,
2016-02-29) [1]:

- The first hunk puts a "/" between recursive_prefix and ce->name, but
  recursive_prefix already ends with "/".
- The second hunk calls relative_path() on recursive_prefix and
  ce->name, but relative_path() only makes sense when both paths share
  the same base directory. This is never the case here:
  - recursive_prefix is the path from the topmost superproject to the
    current submodule
  - ce->name is the path from the root of the current submodule to its
    submodule.
  so, e.g. recursive_prefix="super" and ce->name="submodule" produces
  displayname="../super" instead of "super/submodule".

[1] I verified this by applying the tests to 48308681b0.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-30 22:41:45 -07:00
8fc36c39d9 submodule--helper tests: add missing "display path" coverage
There are two locations in prepare_to_clone_next_submodule() that
manually calculate the submodule display path. As discussed in the
next commit the "Skipping" output isn't exactly what we want, but
let's test how we behave now, before changing the existing behavior.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-30 22:41:45 -07:00
c9e221b124 Merge branch 'ab/submodule-cleanup' into gc/submodule-use-super-prefix
* ab/submodule-cleanup:
  git-sh-setup.sh: remove "say" function, change last users
  git-submodule.sh: use "$quiet", not "$GIT_QUIET"
  submodule--helper: eliminate internal "--update" option
  submodule--helper: understand --checkout, --merge and --rebase synonyms
  submodule--helper: report "submodule" as our name in some "-h" output
  submodule--helper: rename "absorb-git-dirs" to "absorbgitdirs"
  submodule update: remove "-v" option
  submodule--helper: have --require-init imply --init
  git-submodule.sh: remove unused top-level "--branch" argument
  git-submodule.sh: make the "$cached" variable a boolean
  git-submodule.sh: remove unused $prefix variable
  git-submodule.sh: remove unused sanitize_submodule_env()
2022-06-30 15:43:06 -07:00
a35258c62a gitweb/Makefile: add a "NO_GITWEB" parameter
From looking at the {Free,Net,Dragonfly}BSD packages for git[1]
they've been monkeypatching "gitweb" out of the Makefile, let's be
nicer and provide a NO_GITWEB=Y for their use.

For the "all" target this allows for optionally restoring what's been
the status quo before the preceding commit, but now we'll also behave
correctly on the subsequent "make install".

As before our installation of gitweb can be suppressed with
NO_PERL. For backwards compatibility the NO_PERL=Y flag by itself
still doesn't change whether or not we build gitweb, unlike the new
NO_GITWEB=Y flag.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:05 -07:00
d3b827408c Makefile: build 'gitweb' in the default target
Our Makefile's default target used to build 'gitweb', though
indirectly: the 'all' target depended on 'git-instaweb', which in turn
depended on 'gitweb'.  Then e25c7cc146 (Makefile: drop dependency
between git-instaweb and gitweb, 2015-05-29) removed the latter
dependency, and for good reasons (quoting its commit message):

  "1. git-instaweb has no build-time dependency on gitweb; it
      is a run-time dependency

   2. gitweb is a directory that we want to recursively make
      in. As a result, its recipe is marked .PHONY, which
      causes "make" to rebuild git-instaweb every time it is
      run."

Since then a simple 'make' doesn't build 'gitweb'.

Luckily, installing 'gitweb' is not broken: although 'make install'
doesn't depend on the 'gitweb' target, it has a dependency on the
'install-gitweb' target, which does generate all the necessary files
for 'gitweb' and installs them.  However, if someone runs 'make &&
sudo make install', then those files in the 'gitweb' directory will be
generated and owned by root, which is not nice.

List 'gitweb' as a direct dependency of the default target, so a plain
'make' will build it.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:05 -07:00
affc3b755c gitweb/Makefile: include in top-level Makefile
Include the gitweb/Makefile in the top-level Makefile rather than
calling it as a sub-Makefile. As noted in the thread starting at at
[1] (in particular [2]) we'll pay a high cost on NOOP runs of "make"
just to figure out that we have nothing to do for "make gitweb".

The "gitweb" script also isn't maintained out-of-tree, unlike
"gitk-git" or "git-gui", which both have their own "Makefile". Other
parts of it are already integrated into our main Makefiles, e.g. the
documentation is built by Documentation/Makefile since
07ea4df278 (gitweb: Add gitweb(1) manpage for gitweb itself,
2011-10-16).

1. https://lore.kernel.org/git/20220525205651.825669-1-szeder.dev@gmail.com/
2. https://lore.kernel.org/git/220526.86k0a96sv2.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:05 -07:00
27438ef5e0 gitweb: remove "test" and "test-installed" targets
Remove the special "test" targets for gitweb added in
958a846721 (gitweb/Makefile: Add 'test' and 'test-installed' targets,
2010-09-26). Unlike e.g. "contrib/scalar" and "contrib/subtree" the
"gitweb" tests themselves live in our top-level t/ directory.

It therefore doesn't make sense to maintain this indirection, no more
than it would to have a "git-send-email-test". By dropping it we'll
also free other tests to use the t95*.sh prefix.

These removed targets are unlikely to be used by anyone, and to the
extent that they are we can easily use an invocation like this
instead:

	make test T='t[0-9]*gitweb*.sh'

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:05 -07:00
b82d66eb0c gitweb/Makefile: prepare to merge into top-level Makefile
Since the "gitweb/Makefile" was split out from the top-level Makefile
in 62331ef163 (gitweb: Makefile improvements, 2010-01-30) we've kept
the inter-dependencies between the two, and worse have dealt with a
lot of duplication as a result.

In preparation for merging the two again add a MAK_DIR_GITWEB variable
to various rules in it. This will allow us to set this variable to
"gitweb/" as we include it in the top-level Makefile, which will
minimize the size of the subsequent diff.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:04 -07:00
564ebde3d3 gitweb/Makefile: clear up and de-duplicate the gitweb.{css,js} vars
Change the variable definitions for the $(GITWEB_CSS) and $(GITWEB_JS)
so that we have a clear separation between what we use as "in" files,
v.s. our "min" files. We can now make the appending to $(GITWEB_FILES)
unconditional, since $(GITWEB_{JS,CSS}) is either the "min" or
non-"min" version. This reduces the duplication within the file.

While we're at it let's initialize "GITWEB_JSLIB_FILES" as we normally
do with such variables.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:04 -07:00
1e08fa5e2b gitweb/Makefile: add a $(GITWEB_ALL) variable
Declare the targets that the "all" target depends on with a new
$(GITWEB_ALL) variable. This will help to reduce churn in subsequent
commits.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:04 -07:00
7decdb9b4a gitweb/Makefile: define all .PHONY prerequisites inline
Move the '.PHONY' definition so that it's split up and accompanies the
relevant as they're defined. This will make a subsequent diff smaller
as we'll remove some of these, and won't need to re-edit the
now-removed '.PHONY' line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:20:04 -07:00
5b893f7d81 git-sh-setup.sh: remove "say" function, change last users
Remove the "say" function, with various rewrites of the remaining
git-*.sh code to C and the preceding change to have git-submodule.sh
stop using the GIT_QUIET variable there were only four uses in
git-subtree.sh. Let's have it use an "arg_quiet" variable instead, and
move the "say" function over to it.

The only other use was a trivial message in git-instaweb.sh, since it
has never supported the --quiet option (or similar) that code added in
0b624b4cee (instaweb: restart server if already running, 2009-11-22)
can simply use "echo" instead.

The remaining in-tree hits from "say" are all for the sibling function
defined in t/test-lib.sh. It's safe to remove this function since it
has never been documented in Documentation/git-sh-setup.txt.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:18 -07:00
2eec463739 git-submodule.sh: use "$quiet", not "$GIT_QUIET"
Remove the use of the "$GIT_QUIET" variable in favor of our own
"$quiet", ever since b3c5f5cb04 (submodule: move core cmd_update()
logic to C, 2022-03-15) we have not used the "say" function in
git-sh-setup.sh, which is the only thing that's affected by using
"GIT_QUIET".

We still want to support --quiet for our own use though, but let's use
our own variable for that. Now it's obvious that we only care about
passing "--quiet" to "git submodule--helper", and not to change the
output of any "say" invocation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:18 -07:00
b788fc671b submodule--helper: eliminate internal "--update" option
Follow-up on the preceding commit which taught "git submodule--helper
update" to understand "--merge", "--checkout" and "--rebase" and use
those options instead of "--update=(rebase|merge|checkout|none)" when
the command invokes itself.

Unlike the preceding change this isn't strictly necessary to
eventually change "git-submodule.sh" so that it invokes "git
submodule--helper update" directly, but let's remove this
inconsistency in the command-line interface. We shouldn't need to
carry special synonyms for existing options in "git submodule--helper"
when that command can use the primary documented names instead.

But, as seen in the post-image this makes the control flow within
"builtin/submodule--helper.c" simpler, we can now write directly to
the "update_default" member of "struct update_data" when parsing the
options in "module_update()".

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:18 -07:00
8f12108c29 submodule--helper: understand --checkout, --merge and --rebase synonyms
Understand --checkout, --merge and --rebase synonyms for
--update={checkout,merge,rebase}, as well as the short options that
'git submodule' itself understands.

This removes a difference between the CLI API of "git submodule" and
"git submodule--helper", making it easier to make the latter an alias
for the former. See 48308681b0 (git submodule update: have a
dedicated helper for cloning, 2016-02-29) for the initial addition of
--update.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
36d45163b6 submodule--helper: report "submodule" as our name in some "-h" output
Change the user-facing "git submodule--helper" commands so that
they'll report their name as being "git submodule". To a user these
commands are internal implementation details, and it doesn't make
sense to emit usage about an internal helper when "git submodule" is
invoked with invalid options.

Before this we'd emit e.g.:

	$ git submodule absorbgitdirs --blah
	error: unknown option `blah'
	usage: git submodule--helper absorbgitdirs [<options>] [<path>...]
	[...]
And:

	$ git submodule set-url -- --
	usage: git submodule--helper set-url [--quiet] <path> <newurl>
	[...]

Now we'll start with "usage: git submodule [...]" in both of those
cases. This change does not alter the "list", "name", "clone",
"config" and "create-branch" commands, those are internal-only (as an
aside; their usage info should probably invoke BUG(...)). This only
changes the user-facing commands.

The "status", "deinit" and "update" commands are not included in this
change, because their usage information already used "submodule"
rather than "submodule--helper".

I don't think it's currently possible to emit some of this usage
information in practice, as git-submodule.sh will catch unknown
options, and e.g. it doesn't seem to be possible to get "add" to emit
its usage information from "submodule--helper".

Though that change may be superfluous now, it's also harmless, and
will allow us to eventually dispatch further into "git
submodule--helper" from git-submodule.sh, while emitting the correct
usage output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
6e556c412e submodule--helper: rename "absorb-git-dirs" to "absorbgitdirs"
Rename the "absorb-git-dirs" subcommand to "absorbgitdirs", which is
what the "git submodule" command itself has called it since the
subcommand was implemented in f6f8586140 (submodule: add
absorb-git-dir function, 2016-12-12).

Having these two be different will make it more tedious to dispatch to
eventually dispatch "git submodule--helper" directly, as we'd need to
retain this name mapping. So let's get rid of this needless
inconsistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
0d68ee723e submodule update: remove "-v" option
In e84c3cf3dc (git-submodule.sh: accept verbose flag in cmd_update to
be non-quiet, 2018-08-14) the "git submodule update" sub-command was
made to understand "-v", but the option was never documented.

The only in-tree user has been this test added in
3ad0401e9e (submodule update: silence underlying merge/rebase with
"--quiet", 2020-09-30), it wasn't per-se testing --quiet, but fixing a
bug in e84c3cf3dc: It used to set "GIT_QUIET=0" instead of unsetting
it on "-v", and thus we'd end up passing "--quiet" to "git
submodule--helper" on "-v", since the "--quiet" option was passed
using the ${parameter:+word} construct.

Furthermore, even if someone had used the "-v" option they'd only be
getting the default output. Our default in both git-submodule.sh and
"git submodule--helper" has been to be "verbose", so the only way this
option could have matter is if it were used as e.g.:

    git submodule --quiet update -v [...]

I.e. to undo the effect of a previous "--quiet" on the command-line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
d9c7f69aaa submodule--helper: have --require-init imply --init
Adjust code added in 0060fd1511 (clone --recurse-submodules: prevent
name squatting on Windows, 2019-09-12) to have the internal
--require-init option imply --init, rather than having
"git-submodule.sh" add it implicitly.

This change doesn't make any difference now, but eliminates another
special-case where "git submodule--helper update"'s behavior was
different from "git submodule update". This will make it easier to
eventually replace the cmd_update() function in git-submodule.sh.

We'll still need to keep the distinction between "--init" and
"--require-init" in git-submodule.sh. Once cmd_update() gets
re-implemented in C we'll be able to change variables and other code
related to that, but not yet.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
da3aae9e84 git-submodule.sh: remove unused top-level "--branch" argument
In 5c08dbbdf1 (git-submodule: fix subcommand parser, 2008-01-15) the
"--branch" option was supported as an option to "git submodule"
itself, i.e. "git submodule --branch" as a side-effect of its
implementation.

Then in b57e8119e6 (submodule: teach set-branch subcommand,
2019-02-08) when the "set-branch" subcommand was added the assertion
that we shouldn't have "--branch" anywhere except as an argument to
"add" and "set-branch" was copy/pasted from the adjacent check for
"--cache" added (or rather modified) in 496eeeb19b (git-submodule.sh:
avoid "test <cond> -a/-o <cond>", 2014-06-10).

But there's been a logic error in that check, which at a glance looked
like it should be supporting:

    git submodule --branch <branch> (add | set-branch) [<options>]

But due to "||" in the condition (as opposed to "&&" for "--cache") if
we have "--branch" here already we'll emit usage, even for "add" and
"set-branch".

So in addition to never having documented this form, it hasn't worked
since b57e8119e6 was released with v2.22.0.

So it's safe to remove this code. I.e. we don't want to support the
form noted above, but only:

    git submodule (add | set-branch) --branch <branch> [<options>]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
757d092797 git-submodule.sh: make the "$cached" variable a boolean
Remove the assignment of "$1" to the "$cached" variable. As seen in
the initial implementation in 70c7ac22de (Add git-submodule command,
2007-05-26) we only need to keep track of if we've seen the --cached
option, not save the "--cached" string for later use.

In 28f9af5d25 (git-submodule summary: code framework, 2008-03-11)
"$1" was assigned to it, but since there was no reason to do so let's
stop doing it. This trivial change will make it easier to reason about
an eventual change that'll remove the cmd_summary() function in favor
of dispatching to "git submodule--helper summary" directly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
960fad98e8 git-submodule.sh: remove unused $prefix variable
Remove the $prefix variable which isn't used anymore, and hasn't been
since b3c5f5cb04 (submodule: move core cmd_update() logic to C,
2022-03-15).

Before that we'd use it to invoke "git submodule--helper" with the
"--recursive-prefix" option, but since b3c5f5cb04 that "git
submodule--helper" option is only used when it invokes itself.

So the "--recursive-prefix" option is still in use, but at this point
only when the helper invokes itself during submodule recursion. See
the "--recursive-prefix" option added in
c51f8f94e5 (submodule--helper: run update procedures from C,
2021-08-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:17 -07:00
85775255f1 git-submodule.sh: remove unused sanitize_submodule_env()
The sanitize_submodule_env() function was last used before
b3c5f5cb04 (submodule: move core cmd_update() logic to C,
2022-03-15), let's remove it.

This also allows us to remove clear_local_git_env() from
git-sh-setup.sh. That function hasn't been documented in
Documentation/git-sh-setup.sh, and since 14111fc492 (git: submodule
honor -c credential.* from command line, 2016-02-29) it had only been
used in the sanitize_submodule_env() function being removed here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-28 13:13:16 -07:00
7260e87248 git-merge-tree.txt: add a section on potentional usage mistakes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
7976721d17 merge-tree: add a --allow-unrelated-histories flag
Folks may want to merge histories that have no common ancestry; provide
a flag with the same name as used by `git merge` to allow this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
7c48b27822 merge-tree: allow ls-files -u style info to be NUL terminated
Much as `git ls-files` has a -z option, let's add one to merge-tree so
that the conflict-info section can be NUL terminated (and avoid quoting
of unusual filenames).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
de90581141 merge-ort: optionally produce machine-readable output
With the new `detailed` parameter, a new mode can be triggered when
displaying the merge messages: The `detailed` mode prints NUL-delimited
fields of the following form:

	<path-count> NUL <path>... NUL <conflict-type> NUL <message>

The `<path-count>` field determines how many `<path>` fields there are.

The intention of this mode is to support server-side operations, where
worktree-less merges can lead to conflicts and depending on the type
and/or path count, the caller might know how to handle said conflict.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
cb2607759e merge-ort: store more specific conflict information
It is all fine and dandy for a regular Git command that is intended to
be run interactively to produce a bunch of messages upon an error.

However, in `merge-ort`'s case, we want to call the command e.g. in
server-side software, where the actual error messages are not quite as
interesting as machine-readable, immutable terms that describe the exact
nature of any given conflict.

With this patch, the `merge-ort` machinery records the exact type (as
specified via an `enum` value) as well as the involved path(s) together
with the conflict's message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
2715e8a931 merge-ort: make path_messages a strmap to a string_list
This allows us once again to get away with less data copying.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
6debb7527b merge-ort: store messages in a list, not in a single strbuf
To prepare for using the `merge-ort` machinery in server operations, we
cannot simply produce a free-form string that combines a variable-length
list of messages.

Instead, we need to list them one by one. The natural fit for this is a
`string_list`.

We will subsequently add even more information in the `util` attribute
of the string list items.

Based-on-a-patch-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
b520bc6caa merge-tree: provide easy access to ls-files -u style info
Much like `git merge` updates the index with information of the form
    (mode, oid, stage, name)
provide this output for conflicted files for merge-tree as well.
Provide a --name-only option for users to exclude the mode, oid, and
stage and only get the list of conflicted filenames.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
7fa3338870 merge-tree: provide a list of which files have conflicts
Callers of `git merge-tree --write-tree` will often want to know which
files had conflicts.  While they could potentially attempt to parse the
CONFLICT notices printed, those messages are not meant to be machine
readable.  Provide a simpler mechanism of just printing the files (in
the same format as `git ls-files` with quoting, but restricted to
unmerged files) in the output before the free-form messages.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
a4040cfa35 merge-ort: remove command-line-centric submodule message from merge-ort
There was one case in merge-ort that would call path_msg() multiple
times for the same logical conflict, and it was in order to give advice
about how to resolve a conflict.  This advice does not make as much
sense with remerge-diff, or with merge-tree being invoked by a GitHub
GUI for resolution of messages, and is making it hard to provide
which-logical-conflict-affects-which-paths information in a machine
parseable way to a higher level caller of merge-tree.  Let's simply
remove this informational message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
fae26ce79c merge-ort: provide a merge_get_conflicted_files() helper function
After a merge, this function allows the user to extract the same
information that would be printed by `ls-files -u`, which means
files with their mode, oid, and stage.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
a1a7811975 merge-tree: support including merge messages in output
When running `git merge-tree --write-tree`, we previously would only
return an exit status reflecting the cleanness of a merge, and print out
the toplevel tree of the resulting merge.  Merges also have
informational messages, such as:
  * "Auto-merging <PATH>"
  * "CONFLICT (content): ..."
  * "CONFLICT (file/directory)"
  * etc.
In fact, when non-content conflicts occur (such as file/directory,
modify/delete, add/add with differing modes, rename/rename (1to2),
etc.), these informational messages may be the only notification the
user gets since these conflicts are not representable in the contents
of the file.

Add a --[no-]messages option so that callers can request these messages
be included at the end of the output.  Include such messages by default
when there are conflicts, and omit them by default when the merge is
clean.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
a34edae68a merge-ort: split out a separate display_update_messages() function
This patch includes no new code; it simply moves a bunch of lines into a
new function.  As such, there are no functional changes.  This is just a
preparatory step to allow the printed messages to be handled differently
by other callers, such as in `git merge-tree --write-tree`.

(Patch best viewed with
     --color-moved --color-moved-ws=allow-indentation-change
 to see that it is a simple code movement.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:06 -07:00
1f0c3a29da merge-tree: implement real merges
This adds the ability to perform real merges rather than just trivial
merges (meaning handling three way content merges, recursive ancestor
consolidation, renames, proper directory/file conflict handling, and so
forth).  However, unlike `git merge`, the working tree and index are
left alone and no branch is updated.

The only output is:
  - the toplevel resulting tree printed on stdout
  - exit status of 0 (clean), 1 (conflicts present), anything else
    (merge could not be performed; unknown if clean or conflicted)

This output is meant to be used by some higher level script, perhaps in
a sequence of steps like this:

   NEWTREE=$(git merge-tree --write-tree $BRANCH1 $BRANCH2)
   test $? -eq 0 || die "There were conflicts..."
   NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
   git update-ref $BRANCH1 $NEWCOMMIT

Note that higher level scripts may also want to access the
conflict/warning messages normally output during a merge, or have quick
access to a list of files with conflicts.  That is not available in this
preliminary implementation, but subsequent commits will add that
ability (meaning that NEWTREE would be a lot more than a tree in the
case of conflicts).

This also marks the traditional trivial merge of merge-tree as
deprecated.  The trivial merge not only had limited applicability, the
output format was also difficult to work with (and its format
undocumented), and will generally be less performant than real merges.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:05 -07:00
6ec755a0e1 merge-tree: add option parsing and initial shell for real merge function
Let merge-tree accept a `--write-tree` parameter for choosing real
merges instead of trivial merges, and accept an optional
`--trivial-merge` option to get the traditional behavior.  Note that
these accept different numbers of arguments, though, so these names
need not actually be used.

Note that real merges differ from trivial merges in that they handle:
  - three way content merges
  - recursive ancestor consolidation
  - renames
  - proper directory/file conflict handling
  - etc.
Basically all the stuff you'd expect from `git merge`, just without
updating the index and working tree.  The initial shell added here does
nothing more than die with "real merges are not yet implemented", but
that will be fixed in subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:05 -07:00
55e48f6bf7 merge-tree: move logic for existing merge into new function
In preparation for adding a non-trivial merge capability to merge-tree,
move the existing merge logic for trivial merges into a new function.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:05 -07:00
70176b7015 merge-tree: rename merge_trees() to trivial_merge_trees()
merge-recursive.h defined its own merge_trees() function, different than
the one found in builtin/merge-tree.c.  That was okay in the past, but
we want merge-tree to be able to use the merge-ort functions, which will
end up including merge-recursive.h.  Rename the function found in
builtin/merge-tree.c to avoid the conflict.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 16:10:05 -07:00
68437ede53 grep: add --max-count command line option
This patch adds a command line option analogous to that of GNU
grep(1)'s -m / --max-count, which users might already be used to.
This makes it possible to limit the amount of matches shown in the
output while keeping the functionality of other options such as -C
(show code context) or -p (show containing function), which would be
difficult to do with a shell pipeline (e.g. head(1)).

Signed-off-by: Carlos López 00xc@protonmail.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22 13:23:29 -07:00
9bef0b1e6e branch: drop unused worktrees variable
After b489b9d9aa (branch: use branch_checked_out() when deleting refs,
2022-06-14), we no longer look at our local "worktrees" variable, since
branch_checked_out() handles it under the hood. The compiler didn't
notice the unused variable because we call functions to initialize and
free it (so it's not totally unused, it just doesn't do anything
useful).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-21 08:52:37 -07:00
b2463fc30a fetch: stop passing around unused worktrees variable
In 12d47e3b1f (fetch: use new branch_checked_out() and add tests,
2022-06-14), fetch's update_local_ref() function stopped using its
"worktrees" parameter. It doesn't need it, since the
branch_checked_out() function examines the global worktrees under the
hood.

So we can not only drop the unused parameter from that function, but
also from its entire call chain. And as we do so all the way up to
do_fetch(), we can see that nobody uses it at all, and we can drop the
local variable there entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-21 08:52:32 -07:00
82f67ee13f send-pack.c: add config push.useBitmaps
Reachability bitmaps are designed to speed up the "counting objects"
phase of generating a pack during a clone or fetch. They are not
optimized for Git clients sending a small topic branch via "git push".
In some cases (see [1]), using reachability bitmaps during "git push"
can cause significant performance regressions.

Add a new "push.useBitmaps" configuration variable to allow users to
tell "git push" not to use bitmaps. We already have "pack.bitmaps"
that controls the use of bitmaps, but a separate configuration variable
allows the reachability bitmaps to still be used in other areas,
such as "git upload-pack", while disabling it only for "git push".

[1]: https://lore.kernel.org/git/87zhoz8b9o.fsf@evledraar.gmail.com/

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-17 14:31:01 -07:00
2c80a82e34 remote: handle negative refspecs in git remote show
By default, the git remote show command will query data from remotes to
show data about what might be done on a future git fetch. This process
currently does not handle negative refspecs. This can be confusing,
because the show command will list refs as if they would be fetched. For
example if the fetch refspec "^refs/heads/pr/*", it still displays the
following:

  * remote jdk19
    Fetch URL: git@github.com:openjdk/jdk19.git
    Push  URL: git@github.com:openjdk/jdk19.git
    HEAD branch: master
    Remote branches:
      master tracked
      pr/1   new (next fetch will store in remotes/jdk19)
      pr/2   new (next fetch will store in remotes/jdk19)
      pr/3   new (next fetch will store in remotes/jdk19)
    Local ref configured for 'git push':
      master pushes to master (fast-forwardable)

Fix this by adding an additional check inside of get_ref_states. If a
ref matches one of the negative refspecs, mark it as skipped instead of
marking it as new or tracked.

With this change, we now report remote branches that are skipped due to
negative refspecs properly:

  * remote jdk19
    Fetch URL: git@github.com:openjdk/jdk19.git
    Push  URL: git@github.com:openjdk/jdk19.git
    HEAD branch: master
    Remote branches:
      master tracked
      pr/1   skipped
      pr/2   skipped
      pr/3   skipped
    Local ref configured for 'git push':
      master pushes to master (fast-forwardable)

By showing the refs as skipped, it helps clarify that these references
won't actually be fetched.

This does not properly handle refs going stale due to a newly added
negative refspec. In addition, git remote prune doesn't handle that
negative refspec case either. Fixing that requires digging into
get_stale_heads and handling the case of a ref which exists on the
remote but is omitted due to a negative refspec locally.

Add a new test case which covers the functionality above, as well as a
new expected failure indicating the poor overlap with stale refs.

Reported-by: Pavel Rappo <pavel.rappo@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-17 10:03:59 -07:00
18c08abc82 is_promisor_object(): walk promisor packs in pack-order
When we generate the list of promisor objects, we walk every pack with a
.promisor file and examine its objects for any links to other objects.
By default, for_each_packed_object() will go in pack .idx order.

This is the worst case with respect to our delta base cache. If we have
a delta chain of A->B->C->D, then visiting A may require reconstructing
both B and C, unless we also visited B recently, in which case we may
have cached its value. Because .idx order is based on sha1, it's random
with respect to the actual object contents and deltas, and thus we're
unlikely to get many cache hits.

If we instead traverse in pack order, then we get the optimal case:
packs are written to keep delta families together, and to place bases
before their children.

Even on a modest repository like git.git, this has a noticeable speedup
on p5600.4, which runs "fsck" on a partial clone with blob:none (so lots
of trees which need to be walked, and which delta well):

Test       HEAD^               HEAD
-------------------------------------------------------
5600.4:    17.87(17.83+0.04)   15.42(15.35+0.06) -13.7%

On a larger repository like linux.git, the speedup is even more
pronounced:

Test       HEAD^                 HEAD
-----------------------------------------------------------
5600.4:    322.47(322.01+0.42)   186.41(185.76+0.63) -42.2%

Any other operations that call is_promisor_object(), like "rev-list
--exclude-promisor-objects", would similarly benefit, but the
invocations in p5600 don't actually trigger any such cases.

Note that we may pay a small price to build a rev-index in-memory to do
the pack-order traversal. But it's still a big net win, and even that
small cost goes away if you are using pack.writeReverseIndex.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-16 10:03:40 -07:00
4f4be00d30 archive-tar: use internal gzip by default
Drop the dependency on gzip(1) and use our internal implementation to
create tar.gz and tgz files.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 13:19:47 -07:00
23fcf8b09f archive-tar: use OS_CODE 3 (Unix) for internal gzip
gzip(1) encodes the OS it runs on in the 10th byte of its output. It
uses the following OS_CODE values according to its tailor.h [1]:

        0 - MS-DOS
        3 - UNIX
        5 - Atari ST
        6 - OS/2
       10 - TOPS-20
       11 - Windows NT

The gzip.exe that comes with Git for Windows uses OS_CODE 3 for some
reason, so this value is used on practically all supported platforms
when generating tgz archives using gzip(1).

Zlib uses a bigger set of values according to its zutil.h [2], aligned
with section 4.4.2 of the ZIP specification, APPNOTE.txt [3]:

         0 - MS-DOS
         1 - Amiga
         3 - UNIX
         4 - VM/CMS
         5 - Atari ST
         6 - OS/2
         7 - Macintosh
         8 - Z-System
        10 - Windows NT
        11 - MVS (OS/390 - Z/OS)
        13 - Acorn Risc
        16 - BeOS
        18 - OS/400
        19 - OS X (Darwin)

Thus the internal gzip implementation in archive-tar.c sets different
OS_CODE header values on major platforms Windows and macOS.  Git for
Windows uses its own zlib-based variant since v2.20.1 by default and
thus embeds OS_CODE 10 in tgz archives.

The tar archive for a commit is generated consistently on all systems
(by the same Git version).  The OS_CODE in the gzip header does not
influence extraction.  Avoid leaking OS information and make tgz
archives constistent and reproducable (with the same Git and libz
versions) by using OS_CODE 3 everywhere.

At least on macOS 12.4 this produces the same output as gzip(1) for the
examples I tried:

   # before
   $ git -c tar.tgz.command='git archive gzip' archive --format=tgz v2.36.0 | shasum
   3abbffb40b7c63cf9b7d91afc682f11682f80759  -

   # with this patch
   $ git -c tar.tgz.command='git archive gzip' archive --format=tgz v2.36.0 | shasum
   dc6dc6ba9636d522799085d0d77ab6a110bcc141  -

   $ git archive --format=tar v2.36.0 | gzip -cn | shasum
   dc6dc6ba9636d522799085d0d77ab6a110bcc141  -

[1] https://git.savannah.gnu.org/cgit/gzip.git/tree/tailor.h
[2] https://github.com/madler/zlib/blob/master/zutil.h
[3] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 13:19:47 -07:00
76d7602631 archive-tar: add internal gzip implementation
Git uses zlib for its own object store, but calls gzip when creating tgz
archives.  Add an option to perform the gzip compression for the latter
using zlib, without depending on the external gzip binary.

Plug it in by making write_block a function pointer and switching to a
compressing variant if the filter command has the magic value "git
archive gzip".  Does that indirection slow down tar creation?  Not
really, at least not in this test:

$ hyperfine -w3 -L rev HEAD,origin/main -p 'git checkout {rev} && make' \
'./git -C ../linux archive --format=tar HEAD # {rev}'
Benchmark #1: ./git -C ../linux archive --format=tar HEAD # HEAD
  Time (mean ± σ):      4.044 s ±  0.007 s    [User: 3.901 s, System: 0.137 s]
  Range (min … max):    4.038 s …  4.059 s    10 runs

Benchmark #2: ./git -C ../linux archive --format=tar HEAD # origin/main
  Time (mean ± σ):      4.047 s ±  0.009 s    [User: 3.903 s, System: 0.138 s]
  Range (min … max):    4.038 s …  4.066 s    10 runs

How does tgz creation perform?

$ hyperfine -w3 -L command 'gzip -cn','git archive gzip' \
'./git -c tar.tgz.command="{command}" -C ../linux archive --format=tgz HEAD'
Benchmark #1: ./git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD
  Time (mean ± σ):     20.404 s ±  0.006 s    [User: 23.943 s, System: 0.401 s]
  Range (min … max):   20.395 s … 20.414 s    10 runs

Benchmark #2: ./git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD
  Time (mean ± σ):     23.807 s ±  0.023 s    [User: 23.655 s, System: 0.145 s]
  Range (min … max):   23.782 s … 23.857 s    10 runs

Summary
  './git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD' ran
    1.17 ± 0.00 times faster than './git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD'

So the internal implementation takes 17% longer on the Linux repo, but
uses 2% less CPU time.  That's because the external gzip can run in
parallel on its own processor, while the internal one works sequentially
and avoids the inter-process communication overhead.

What are the benefits?  Only an internal sequential implementation can
offer this eco mode, and it allows avoiding the gzip(1) requirement.

This implementation uses the helper functions from our zlib.c instead of
the convenient gz* functions from zlib, because the latter doesn't give
the control over the generated gzip header that the next patch requires.

Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 13:19:47 -07:00
dfce1186c6 archive-tar: factor out write_block()
All tar archive writes have the same size and are done to the same file
descriptor.  Move them to a common function, write_block(), to reduce
code duplication and make it easy to change the destination.

Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 13:19:47 -07:00
96b9e5151b archive: rename archiver data field to filter_command
The void pointer "data" in struct archiver is only used to store filter
commands to pass tar archives to, like gzip.  Rename it accordingly and
also turn it into a char pointer to document the fact that it's a string
reference.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 13:19:46 -07:00
650134a478 archive: update format documentation
Mention all formats in the --format section, use backtick quoting for
literal values throughout, clarify the description of the configuration
option.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 13:19:46 -07:00
4b6e18f5a0 branch: fix branch_checked_out() leaks
The branch_checked_out() method populates a strmap linking a refname to
a worktree that has that branch checked out. While unlikely, it is
possible that a bug or filesystem manipulation could create a scenario
where the same ref is checked out in multiple places. Further, there are
some states in an interactive rebase where HEAD and REBASE_HEAD point to
the same ref, leading to multiple insertions into the strmap. In either
case, the strmap_put() method returns the old value which is leaked.

Update branch_checked_out() to consume that pointer and free it.

Add a test in t2407 that checks this erroneous case. The test "checks
itself" by first confirming that the filesystem manipulations it makes
trigger the branch_checked_out() logic, and then sets up similar
manipulations to make it look like there are multiple worktrees pointing
to the same ref.

While TEST_PASSES_SANITIZE_LEAK would be helpful to demonstrate the
leakage and prevent it in the future, t2407 uses helpers such as 'git
clone' that cause the test to fail under that mode.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 10:47:19 -07:00
b489b9d9aa branch: use branch_checked_out() when deleting refs
This is the last current use of find_shared_symref() that can easily be
replaced by branch_checked_out(). The benefit of this switch is that the
code is a bit simpler, but also it is faster on repeated calls.

The remaining uses of find_shared_symref() are non-trivial to remove, so
we probably should not continue in that direction:

* builtin/notes.c uses find_shared_symref() with "NOTES_MERGE_REF"
  instead of "HEAD", so it doesn't have an immediate analogue with
  branch_checked_out(). Perhaps we should consider extending it to
  include that symref in addition to HEAD, BISECT_HEAD, and
  REBASE_HEAD.

* receive-pack.c checks to see if a worktree has a checkout for the ref
  that is being updated. The tricky part is that it can actually decide
  to update the worktree directly instead of just skipping the update.
  This all depends on the receive.denyCurrentBranch config option. The
  implementation currenty cares about receiving the worktree in the
  result, so the current branch_checked_out() prototype is insufficient
  currently. This is something to investigate later, though, since a
  large number of refs could be updated at the same time and using the
  strmap implementation of branch_checked_out() could be beneficial.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 10:47:19 -07:00
12d47e3b1f fetch: use new branch_checked_out() and add tests
When fetching refs from a remote, it is possible that the refspec will
cause use to overwrite a ref that is checked out in a worktree. The
existing logic in builtin/fetch.c uses a possibly-slow mechanism. Update
those sections to use the new, more efficient branch_checked_out()
helper.

These uses were not previously tested, so add a test case that can be
used for these kinds of collisions. There is only one test now, but more
tests will be added as other consumers of branch_checked_out() are
added.

Note that there are two uses in builtin/fetch.c, but only one of the
messages is tested. This is because the tested check is run before
completing the fetch, and the untested check is not reachable without
concurrent updates to the filesystem. Thus, it is beneficial to keep
that extra check for the sake of defense-in-depth. However, we should
not attempt to test the check, as the effort required is too
complicated to be worth the effort. This use in update_local_ref()
also requires a change in the error message because we no longer have
access to the worktree struct, only the path of the worktree. This error
is so rare that making a distinction between the two is not critical.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 10:47:18 -07:00
d2ba271aad branch: check for bisects and rebases
The branch_checked_out() helper was added by the previous change, but it
used an over-simplified view to check if a branch is checked out. It
only focused on the HEAD symref, but ignored whether a bisect or rebase
was happening.

Teach branch_checked_out() to check for these things, and also add tests
to ensure that we do not lose this functionality in the future.

Now that this test coverage exists, we can safely refactor
validate_new_branchname() to use branch_checked_out().

Note that we need to prepend "refs/heads/" to the 'state.branch' after
calling wt_status_check_*(). We also need to duplicate wt->path so the
value is not freed at the end of the call.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 10:47:18 -07:00
31ad6b61bd branch: add branch_checked_out() helper
The validate_new_branchname() method contains a check to see if a branch
is checked out in any non-bare worktree. This is intended to prevent a
force push that will mess up an existing checkout. This helper is not
suitable to performing just that check, because the method will die()
when the branch is checked out instead of returning an error code.

Create a new branch_checked_out() helper that performs the most basic
form of this check. To ensure we can call branch_checked_out() in a loop
with good performance, do a single preparation step that iterates over
all worktrees and stores their current HEAD branches in a strmap. The
branch_checked_out() helper can then discover these branches using a
hash lookup.

This helper is currently missing some key functionality. Namely: it
doesn't look for active rebases or bisects which mean that the branch is
"checked out" even though HEAD doesn't point to that ref. This
functionality will be added in a coming change.

We could use branch_checked_out() in validate_new_branchname(), but this
missing functionality would be a regression. However, we have no tests
that cover this case!

Add a new test script that will be expanded with these cross-worktree
ref updates. The current tests would still pass if we refactored
validate_new_branchname() to use this version of branch_checked_out().
The next change will fix that functionality and add the proper test
coverage.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15 10:47:18 -07:00
aaf81223f4 unpack-objects: use stream_loose_object() to unpack large objects
Make use of the stream_loose_object() function introduced in the
preceding commit to unpack large objects. Before this we'd need to
malloc() the size of the blob before unpacking it, which could cause
OOM with very large blobs.

We could use the new streaming interface to unpack all blobs, but
doing so would be much slower, as demonstrated e.g. with this
benchmark using git-hyperfine[0]:

	rm -rf /tmp/scalar.git &&
	git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
	mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
	git hyperfine \
		-r 2 --warmup 1 \
		-L rev origin/master,HEAD -L v "10,512,1k,1m" \
		-s 'make' \
		-p 'git init --bare dest.git' \
		-c 'rm -rf dest.git' \
		'./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'

Here we'll perform worse with lower core.bigFileThreshold settings
with this change in terms of speed, but we're getting lower memory use
in return:

	Summary
	  './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
	    1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
	    1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
	    1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
	    1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
	    1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
	    1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
	    1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'

A better benchmark to demonstrate the benefits of that this one, which
creates an artificial repo with a 1, 25, 50, 75 and 100MB blob:

	rm -rf /tmp/repo &&
	git init /tmp/repo &&
	(
		cd /tmp/repo &&
		for i in 1 25 50 75 100
		do
			dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
		done &&
		git add blob.* &&
		git commit -mblobs &&
		git gc &&
		PACK=$(echo .git/objects/pack/pack-*.pack) &&
		cp "$PACK" my.pack
	) &&
	git hyperfine \
		--show-output \
		-L rev origin/master,HEAD -L v "512,50m,100m" \
		-s 'make' \
		-p 'git init --bare dest.git' \
		-c 'rm -rf dest.git' \
		'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'

Using this test we'll always use >100MB of memory on
origin/master (around ~105MB), but max out at e.g. ~55MB if we set
core.bigFileThreshold=50m.

The relevant "Maximum resident set size" lines were manually added
below the relevant benchmark:

  '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
        Maximum resident set size (kbytes): 107080
    1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
        Maximum resident set size (kbytes): 106968
    1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
        Maximum resident set size (kbytes): 107032
    1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
        Maximum resident set size (kbytes): 107072
    1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
        Maximum resident set size (kbytes): 55704
    2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
        Maximum resident set size (kbytes): 4564

This shows that if you have enough memory this new streaming method is
slower the lower you set the streaming threshold, but the benefit is
more bounded memory use.

An earlier version of this patch introduced a new
"core.bigFileStreamingThreshold" instead of re-using the existing
"core.bigFileThreshold" variable[1]. As noted in a detailed overview
of its users in [2] using it has several different meanings.

Still, we consider it good enough to simply re-use it. While it's
possible that someone might want to e.g. consider objects "small" for
the purposes of diffing but "big" for the purposes of writing them
such use-cases are probably too obscure to worry about. We can always
split up "core.bigFileThreshold" in the future if there's a need for
that.

0. https://github.com/avar/git-hyperfine/
1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:36 -07:00
3c3ca0b0c1 core doc: modernize core.bigFileThreshold documentation
The core.bigFileThreshold documentation has been largely unchanged
since 5eef828bc0 (fast-import: Stream very large blobs directly to
pack, 2010-02-01).

But since then this setting has been expanded to affect a lot more
than that description indicated. Most notably in how "git diff" treats
them, see 6bf3b81348 (diff --stat: mark any file larger than
core.bigfilethreshold binary, 2014-08-16).

In addition to that, numerous commands and APIs make use of a
streaming mode for files above this threshold.

So let's attempt to summarize 12 years of changes in behavior, which
can be seen with:

    git log --oneline -Gbig_file_thre 5eef828bc03.. -- '*.c'

To do that turn this into a bullet-point list. The summary Han Xin
produced in [1] helped a lot, but is a bit too detailed for
documentation aimed at users. Let's instead summarize how
user-observable behavior differs, and generally describe how we tend
to stream these files in various commands.

1. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/

Helped-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:35 -07:00
2b6070ac4c object-file.c: add "stream_loose_object()" to handle large object
If we want unpack and write a loose object using "write_loose_object",
we have to feed it with a buffer with the same size of the object, which
will consume lots of memory and may cause OOM. This can be improved by
feeding data to "stream_loose_object()" in a stream.

Add a new function "stream_loose_object()", which is a stream version of
"write_loose_object()" but with a low memory footprint. We will use this
function to unpack large blob object in later commit.

Another difference with "write_loose_object()" is that we have no chance
to run "write_object_file_prepare()" to calculate the oid in advance.
In "write_loose_object()", we know the oid and we can write the
temporary file in the same directory as the final object, but for an
object with an undetermined oid, we don't know the exact directory for
the object.

Still, we need to save the temporary file we're preparing
somewhere. We'll do that in the top-level ".git/objects/"
directory (or whatever "GIT_OBJECT_DIRECTORY" is set to). Once we've
streamed it we'll know the OID, and will move it to its canonical
path.

"freshen_packed_object()" or "freshen_loose_object()" will be called
inside "stream_loose_object()" after obtaining the "oid". After the
temporary file is written, we wants to mark the object to recent and we
may find that where indeed is already the object. We should remove the
temporary and do not leave a new copy of the object.

Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:35 -07:00
21e7d88140 object-file.c: factor out deflate part of write_loose_object()
Split out the part of write_loose_object() that deals with calling
git_deflate() into a utility function, a subsequent commit will
introduce another function that'll make use of it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:35 -07:00
97a9db6ffb object-file.c: refactor write_loose_object() to several steps
When writing a large blob using "write_loose_object()", we have to pass
a buffer with the whole content of the blob, and this behavior will
consume lots of memory and may cause OOM. We will introduce a stream
version function ("stream_loose_object()") in later commit to resolve
this issue.

Before introducing that streaming function, do some refactoring on
"write_loose_object()" to reuse code for both versions.

Rewrite "write_loose_object()" as follows:

 1. Figure out a path for the (temp) object file. This step is only
    used in "write_loose_object()".

 2. Move common steps for starting to write loose objects into a new
    function "start_loose_object_common()".

 3. Compress data.

 4. Move common steps for ending zlib stream into a new function
    "end_loose_object_common()".

 5. Close fd and finalize the object file.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:35 -07:00
a1bf5ca29f unpack-objects: low memory footprint for get_data() in dry_run mode
As the name implies, "get_data(size)" will allocate and return a given
amount of memory. Allocating memory for a large blob object may cause the
system to run out of memory. Before preparing to replace calling of
"get_data()" to unpack large blob objects in latter commits, refactor
"get_data()" to reduce memory footprint for dry_run mode.

Because in dry_run mode, "get_data()" is only used to check the
integrity of data, and the returned buffer is not used at all, we can
allocate a smaller buffer and use it as zstream output. Make the function
return NULL in the dry-run mode, as no callers use the returned buffer.

The "find [...]objects/?? -type f | wc -l" test idiom being used here
is adapted from the same "find" use added to another test in
d9545c7f46 (fast-import: implement unpack limit, 2016-04-25).

Suggested-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:35 -07:00
ead74601c6 tests: don't assume a .git/info for .git/info/sparse-checkout
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/sparse-checkout to explicitly
create the directory by setting "TEST_CREATE_REPO_NO_TEMPLATE=1"
before sourcing test-lib.sh, and using the "--template=" argument to
"git clone".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 12:00:21 -07:00
1d758728fb tests: don't assume a .git/info for .git/info/exclude
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/exclude to explicitly create
the directory by setting "TEST_CREATE_REPO_NO_TEMPLATE=1" before
sourcing test-lib.sh, and using the "--template=" argument to "git
clone" and "git init".

In the case of ".git/modules/sub1/info" we deviate from the
established pattern in this and preceding commits of passing a
"--template=" and doing a "mkdir .git/info".

In that case "git checkout" will run the "submodule--helper clone",
and both e.g. "git submodule update --init" and "git checkout" do not
have a way to pass down options to the eventual "git init" or "git
clone". Let's instead assume that the submodule was populated with our
default templates, remove them, and then run the "mkdir".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 12:00:21 -07:00
ce5369e3ef tests: don't assume a .git/info for .git/info/refs
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/refs to explicitly create the
directory by using the "--template=" argument to "git init".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 12:00:21 -07:00
8da0b02d99 tests: don't assume a .git/info for .git/info/attributes
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/attributes to explicitly
create the directory by setting "TEST_CREATE_REPO_NO_TEMPLATE=1"
before sourcing test-lib.sh, and using the "--template=" argument to
"git clone".

The change here in here in t7814-grep-recurse-submodules.sh would
continue "succeeding" with only the "TEST_CREATE_REPO_NO_TEMPLATE=1"
part of this change. That's because those tests use
"test_expect_failure", so they'd "pass" without this change, as
"test_expect_failure" by design isn't discerning about what failure
conditions it'll accept.

But as we're fixing these sorts of issues across the test suite let's
fix this one too. This issue was spotted with a local merge with
another topic of mine[1], which introduces a stricter alternative to
"test_expect_failure".

1. https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 12:00:21 -07:00
93e02b6e1e tests: don't assume a .git/info for .git/info/grafts
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/grafts to explicitly create
the directory.

Do this using the new "TEST_CREATE_REPO_NO_TEMPLATE" facility, and use
"mkdir" instead of "mkdir -p" to assert that we don't have the
.git/info already. An exception to this is the "with grafts" test in
"t6001-rev-list-graft.sh". There we're modifying our ".git" state in a
for-loop, in lieu of refactoring that more extensively let's use
"mkdir -p" there.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 12:00:21 -07:00
e942292a3e tests: don't depend on template-created .git/branches
As noted in c8a58ac5a5 (Revert "Don't create the $GIT_DIR/branches
directory on init", 2009-10-31) there was an attempt long ago in
0cc5691a8b (Don't create the $GIT_DIR/branches directory on init,
2009-10-30) to get rid of the legacy "branches" directory.

We should probably get rid of its creation by removing the
"templates/branches--" file. But whatever our default behavior, our
tests should be tightened up to explicitly create the .git/branches
directory if they rely on our default templates, to make the
dependency on those templates clear.

So let's amend the two tests that would fail if .git/branches wasn't
created. To do this introduce a new "TEST_CREATE_REPO_NO_TEMPLATE"
variable, which we'll set before sourcing test-lib.sh, and change the
"git clone" and "git init" commands in the tests themselves to
explicitly pass "--template=".

This way they won't get a .git/branches in either their top-level
.git, or in the ones they create. We can then amend the tests that
rely on the ".git/branches" directory existing to create it
explicitly, and to remove it after its creation.

This new "TEST_CREATE_REPO_NO_TEMPLATE" variable is a less
heavy-handed version of the "NO_SET_GIT_TEMPLATE_DIR" variable. See
a94d305bf8 (t/t0001-init.sh: add test for 'init with init.templatedir
set', 2010-02-26) for its implementation.

Unlike "TEST_CREATE_REPO_NO_TEMPLATE", this new
"TEST_CREATE_REPO_NO_TEMPLATE" variable is narrowly scoped to what the
"git init" in test-lib.sh does, as opposed to the global effect of
"NO_SET_GIT_TEMPLATE_DIR" and the setting of "GIT_TEMPLATE_DIR" in
wrap-for-bin.sh.

I experimented with adding a new "GIT_WRAP_FOR_BIN_VIA_TEST_LIB"
variable set in test-lib.sh, which would cause wrap-for-bin.sh to not
set GIT_TEMPLATE_DIR, GITPERLLIB etc, as we set those in
test-lib.sh. I think that's a viable approach, but it would interact
e.g. with the appending feature of GITPERLLIB added in
8bade1e12e (wrap-for-bin: make bin-wrappers chainable, 2013-07-04).

Doing so would allow us to convert the tests in t0001-init.sh that now
use "NO_SET_GIT_TEMPLATE_DIR" to simply unset "GIT_TEMPLATE_DIR" in a
sub-shell before invoking "git init" or "git clone". I think that
approach is worth pursuing, but let's table it for now. Some future
wrap-for-bin.sh refactoring can try to address it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 12:00:21 -07:00
dbbb8c50f5 t0008: don't rely on default ".git/info/exclude"
Change a test added in 368aa52952 (add git-check-ignore sub-command,
2013-01-06) to clobber .git/info/exclude rather than append to
it.

These tests would break if the "templates/info--exclude" file added in
d3af621b14 (Redo the templates generation and installation.,
2005-08-06) wasn't exactly 6 lines (of only comments).

Let's instead clobber the default .git/info/excludes file, and test
only our own expected content. This is not strictly needed for
anything in this series, but is a good cleanup while we're at it.

As discussed in the preceding commit a lot of things depend on the
"info" directory being created, but this was the only test that relied
on the specific content in the "templates/info--exclude" file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 12:00:20 -07:00
fa28da0202 Merge branch 'vk/readme-typo'
Add missing punctuation.

* vk/readme-typo:
  git-gui: Fix a typo in README
2021-11-05 17:47:11 +05:30
8cf36407ca git-gui: Fix a typo in README
Add a missing punctuation.

Signed-off-by: Vipul Kumar <kumar@onenetbeyond.org>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2021-11-05 17:45:38 +05:30
3175 changed files with 260846 additions and 103392 deletions

View File

@ -1,7 +1,7 @@
env:
CIRRUS_CLONE_DEPTH: 1
freebsd_12_task:
freebsd_task:
env:
GIT_PROVE_OPTS: "--timer --jobs 10"
GIT_TEST_OPTS: "--no-chain-lint --no-bin-wrappers"
@ -9,7 +9,7 @@ freebsd_12_task:
DEFAULT_TEST_TARGET: prove
DEVELOPER: 1
freebsd_instance:
image_family: freebsd-12-3
image_family: freebsd-13-4
memory: 2G
install_script:
pkg install -y gettext gmake perl5
@ -19,4 +19,4 @@ freebsd_12_task:
build_script:
- su git -c gmake
test_script:
- su git -c 'gmake test'
- su git -c 'gmake DEFAULT_UNIT_TEST_TARGET=unit-tests-prove test unit-tests'

View File

@ -32,6 +32,9 @@ AlignConsecutiveAssignments: false
# double b = 3.14;
AlignConsecutiveDeclarations: false
# Align consecutive macro definitions.
AlignConsecutiveMacros: true
# Align escaped newlines as far left as possible
# #define A \
# int aaaa; \
@ -72,6 +75,10 @@ AlwaysBreakAfterReturnType: None
BinPackArguments: true
BinPackParameters: true
# Add no space around the bit field
# unsigned bf:2;
BitFieldColonSpacing: None
# Attach braces to surrounding context except break before braces on function
# definitions.
# void foo()
@ -83,9 +90,9 @@ BinPackParameters: true
BreakBeforeBraces: Linux
# Break after operators
# int valuve = aaaaaaaaaaaaa +
# bbbbbb -
# ccccccccccc;
# int value = aaaaaaaaaaaaa +
# bbbbbb -
# ccccccccccc;
BreakBeforeBinaryOperators: None
BreakBeforeTernaryOperators: false
@ -96,6 +103,14 @@ BreakStringLiterals: false
# Switch statement body is always indented one level more than case labels.
IndentCaseLabels: false
# Indents directives before the hash. Each level uses a single space for
# indentation.
# #if FOO
# # include <foo>
# #endif
IndentPPDirectives: AfterHash
PPIndentWidth: 1
# Don't indent a function definition or declaration if it is wrapped after the
# type
IndentWrappedFunctionNames: false
@ -108,11 +123,18 @@ PointerAlignment: Right
# x = (int32)y; not x = (int32) y;
SpaceAfterCStyleCast: false
# No space is inserted after the logical not operator
SpaceAfterLogicalNot: false
# Insert spaces before and after assignment operators
# int a = 5; not int a=5;
# a += 42; a+=42;
SpaceBeforeAssignmentOperators: true
# Spaces will be removed before case colon.
# case 1: break; not case 1 : break;
SpaceBeforeCaseColon: false
# Put a space before opening parentheses only after control statement keywords.
# void f() {
# if (true) {
@ -124,6 +146,14 @@ SpaceBeforeParens: ControlStatements
# Don't insert spaces inside empty '()'
SpaceInEmptyParentheses: false
# No space before first '[' in arrays
# int a[5][5]; not int a [5][5];
SpaceBeforeSquareBrackets: false
# No space will be inserted into {}
# while (true) {} not while (true) { }
SpaceInEmptyBlock: false
# The number of spaces before trailing line comments (// - comments).
# This does not affect trailing block comments (/* - comments).
SpacesBeforeTrailingComments: 1
@ -149,20 +179,30 @@ Cpp11BracedListStyle: false
# A list of macros that should be interpreted as foreach loops instead of as
# function calls. Taken from:
# git grep -h '^#define [^[:space:]]*for_each[^[:space:]]*(' \
# | sed "s,^#define \([^[:space:]]*for_each[^[:space:]]*\)(.*$, - '\1'," \
# | sort | uniq
# git grep -h '^#define [^[:space:]]*for_\?each[^[:space:]]*(' |
# sed "s/^#define / - '/; s/(.*$/'/" | sort | uniq
ForEachMacros:
- 'for_each_abbrev'
- 'for_each_builtin'
- 'for_each_string_list_item'
- 'for_each_ut'
- 'for_each_wanted_builtin'
- 'hashmap_for_each_entry'
- 'hashmap_for_each_entry_from'
- 'kh_foreach'
- 'kh_foreach_value'
- 'list_for_each'
- 'list_for_each_dir'
- 'list_for_each_prev'
- 'list_for_each_prev_safe'
- 'list_for_each_safe'
- 'strintmap_for_each_entry'
- 'strmap_for_each_entry'
- 'strset_for_each_entry'
# A list of macros that should be interpreted as conditionals instead of as
# function calls.
IfMacros:
- 'if_test'
# The maximum number of consecutive empty lines to keep.
MaxEmptyLinesToKeep: 1
@ -172,13 +212,14 @@ KeepEmptyLinesAtTheStartOfBlocks: false
# Penalties
# This decides what order things should be done if a line is too long
PenaltyBreakAssignment: 10
PenaltyBreakBeforeFirstCallParameter: 30
PenaltyBreakComment: 10
PenaltyBreakAssignment: 5
PenaltyBreakBeforeFirstCallParameter: 5
PenaltyBreakComment: 5
PenaltyBreakFirstLessLess: 0
PenaltyBreakString: 10
PenaltyExcessCharacter: 100
PenaltyReturnTypeOnItsOwnLine: 60
PenaltyBreakOpenParenthesis: 300
PenaltyBreakString: 5
PenaltyExcessCharacter: 10
PenaltyReturnTypeOnItsOwnLine: 300
# Don't sort #include's
SortIncludes: false

View File

@ -4,7 +4,7 @@ insert_final_newline = true
# The settings for C (*.c and *.h) files are mirrored in .clang-format. Keep
# them in sync.
[*.{c,h,sh,perl,pl,pm,txt}]
[{*.{c,h,sh,bash,perl,pl,pm,txt,adoc},config.mak.*,Makefile}]
indent_style = tab
tab_width = 8

28
.gitattributes vendored
View File

@ -1,18 +1,18 @@
* whitespace=!indent,trail,space
*.[ch] whitespace=indent,trail,space diff=cpp
*.sh whitespace=indent,trail,space eol=lf
*.perl eol=lf diff=perl
*.pl eof=lf diff=perl
*.pm eol=lf diff=perl
*.py eol=lf diff=python
*.bat eol=crlf
*.sh whitespace=indent,trail,space text eol=lf
*.perl text eol=lf diff=perl
*.pl text eof=lf diff=perl
*.pm text eol=lf diff=perl
*.py text eol=lf diff=python
*.bat text eol=crlf
CODE_OF_CONDUCT.md -whitespace
/Documentation/**/*.txt eol=lf
/command-list.txt eol=lf
/GIT-VERSION-GEN eol=lf
/mergetools/* eol=lf
/t/oid-info/* eol=lf
/Documentation/git-merge.txt conflict-marker-size=32
/Documentation/gitk.txt conflict-marker-size=32
/Documentation/user-manual.txt conflict-marker-size=32
/Documentation/**/*.adoc text eol=lf
/command-list.txt text eol=lf
/GIT-VERSION-GEN text eol=lf
/mergetools/* text eol=lf
/t/oid-info/* text eol=lf
/Documentation/git-merge.adoc conflict-marker-size=32
/Documentation/gitk.adoc conflict-marker-size=32
/Documentation/user-manual.adoc conflict-marker-size=32
/t/t????-*.sh conflict-marker-size=32

View File

@ -4,4 +4,7 @@ a mailing list (git@vger.kernel.org) for code submissions, code reviews, and
bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
to conveniently send your Pull Requests commits to our mailing list.
For a single-commit pull request, please *leave the pull request description
empty*: your commit message itself should describe your changes.
Please read the "guidelines for contributing" linked above!

34
.github/workflows/check-style.yml vendored Normal file
View File

@ -0,0 +1,34 @@
name: check-style
# Get the repository with all commits to ensure that we can analyze
# all of the commits contributed via the Pull Request.
on:
pull_request:
types: [opened, synchronize]
# Avoid unnecessary builds. Unlike the main CI jobs, these are not
# ci-configurable (but could be).
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
check-style:
env:
CC: clang
jobname: ClangFormat
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- run: ci/install-dependencies.sh
- name: git clang-format
continue-on-error: true
id: check_out
run: |
./ci/run-style-check.sh \
"${{github.event.pull_request.base.sha}}"

View File

@ -9,42 +9,24 @@ on:
pull_request:
types: [opened, synchronize]
# Avoid unnecessary builds. Unlike the main CI jobs, these are not
# ci-configurable (but could be).
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
check-whitespace:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: git log --check
id: check_out
run: |
log=
commit=
while read dash etc
do
case "${dash}" in
"---")
commit="${etc}"
;;
"")
;;
*)
if test -n "${commit}"
then
log="${log}\n${commit}"
echo ""
echo "--- ${commit}"
fi
commit=
log="${log}\n${dash} ${etc}"
echo "${dash} ${etc}"
;;
esac
done <<< $(git log --check --pretty=format:"---% h% s" ${{github.event.pull_request.base.sha}}..)
if test -n "${log}"
then
exit 2
fi
./ci/check-whitespace.sh \
"${{github.event.pull_request.base.sha}}" \
"$GITHUB_STEP_SUMMARY" \
"https://github.com/${{github.repository}}"

163
.github/workflows/coverity.yml vendored Normal file
View File

@ -0,0 +1,163 @@
name: Coverity
# This GitHub workflow automates submitting builds to Coverity Scan. To enable it,
# set the repository variable `ENABLE_COVERITY_SCAN_FOR_BRANCHES` (for details, see
# https://docs.github.com/en/actions/learn-github-actions/variables) to a JSON
# string array containing the names of the branches for which the workflow should be
# run, e.g. `["main", "next"]`.
#
# In addition, two repository secrets must be set (for details how to add secrets, see
# https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions):
# `COVERITY_SCAN_EMAIL` and `COVERITY_SCAN_TOKEN`. The former specifies the
# email to which the Coverity reports should be sent and the latter can be
# obtained from the Project Settings tab of the Coverity project).
#
# The workflow runs on `ubuntu-latest` by default. This can be overridden by setting
# the repository variable `ENABLE_COVERITY_SCAN_ON_OS` to a JSON string array specifying
# the operating systems, e.g. `["ubuntu-latest", "windows-latest"]`.
#
# By default, the builds are submitted to the Coverity project `git`. To override this,
# set the repository variable `COVERITY_PROJECT`.
on:
push:
defaults:
run:
shell: bash
jobs:
coverity:
if: contains(fromJSON(vars.ENABLE_COVERITY_SCAN_FOR_BRANCHES || '[""]'), github.ref_name)
strategy:
matrix:
os: ${{ fromJSON(vars.ENABLE_COVERITY_SCAN_ON_OS || '["ubuntu-latest"]') }}
runs-on: ${{ matrix.os }}
env:
COVERITY_PROJECT: ${{ vars.COVERITY_PROJECT || 'git' }}
COVERITY_LANGUAGE: cxx
COVERITY_PLATFORM: overridden-below
steps:
- uses: actions/checkout@v4
- name: install minimal Git for Windows SDK
if: contains(matrix.os, 'windows')
uses: git-for-windows/setup-git-for-windows-sdk@v1
- run: ci/install-dependencies.sh
if: contains(matrix.os, 'ubuntu') || contains(matrix.os, 'macos')
env:
CI_JOB_IMAGE: ${{ matrix.os }}
# The Coverity site says the tool is usually updated twice yearly, so the
# MD5 of download can be used to determine whether there's been an update.
- name: get the Coverity Build Tool hash
id: lookup
run: |
case "${{ matrix.os }}" in
*windows*)
COVERITY_PLATFORM=win64
COVERITY_TOOL_FILENAME=cov-analysis.zip
MAKEFLAGS=-j$(nproc)
;;
*macos*)
COVERITY_PLATFORM=macOSX
COVERITY_TOOL_FILENAME=cov-analysis.dmg
MAKEFLAGS=-j$(sysctl -n hw.physicalcpu)
;;
*ubuntu*)
COVERITY_PLATFORM=linux64
COVERITY_TOOL_FILENAME=cov-analysis.tgz
MAKEFLAGS=-j$(nproc)
;;
*)
echo '::error::unhandled OS ${{ matrix.os }}' >&2
exit 1
;;
esac
echo "COVERITY_PLATFORM=$COVERITY_PLATFORM" >>$GITHUB_ENV
echo "COVERITY_TOOL_FILENAME=$COVERITY_TOOL_FILENAME" >>$GITHUB_ENV
echo "MAKEFLAGS=$MAKEFLAGS" >>$GITHUB_ENV
MD5=$(curl https://scan.coverity.com/download/$COVERITY_LANGUAGE/$COVERITY_PLATFORM \
--fail \
--form token='${{ secrets.COVERITY_SCAN_TOKEN }}' \
--form project="$COVERITY_PROJECT" \
--form md5=1)
case $? in
0) ;; # okay
22) # 40x, i.e. access denied
echo "::error::incorrect token or project?" >&2
exit 1
;;
*) # other error
echo "::error::Failed to retrieve MD5" >&2
exit 1
;;
esac
echo "hash=$MD5" >>$GITHUB_OUTPUT
# Try to cache the tool to avoid downloading 1GB+ on every run.
# A cache miss will add ~30s to create, but a cache hit will save minutes.
- name: restore the Coverity Build Tool
id: cache
uses: actions/cache/restore@v4
with:
path: ${{ runner.temp }}/cov-analysis
key: cov-build-${{ env.COVERITY_LANGUAGE }}-${{ env.COVERITY_PLATFORM }}-${{ steps.lookup.outputs.hash }}
- name: download the Coverity Build Tool (${{ env.COVERITY_LANGUAGE }} / ${{ env.COVERITY_PLATFORM}})
if: steps.cache.outputs.cache-hit != 'true'
run: |
curl https://scan.coverity.com/download/$COVERITY_LANGUAGE/$COVERITY_PLATFORM \
--fail --no-progress-meter \
--output $RUNNER_TEMP/$COVERITY_TOOL_FILENAME \
--form token='${{ secrets.COVERITY_SCAN_TOKEN }}' \
--form project="$COVERITY_PROJECT"
- name: extract the Coverity Build Tool
if: steps.cache.outputs.cache-hit != 'true'
run: |
case "$COVERITY_TOOL_FILENAME" in
*.tgz)
mkdir $RUNNER_TEMP/cov-analysis &&
tar -xzf $RUNNER_TEMP/$COVERITY_TOOL_FILENAME --strip 1 -C $RUNNER_TEMP/cov-analysis
;;
*.dmg)
cd $RUNNER_TEMP &&
attach="$(hdiutil attach $COVERITY_TOOL_FILENAME)" &&
volume="$(echo "$attach" | cut -f 3 | grep /Volumes/)" &&
mkdir cov-analysis &&
cd cov-analysis &&
sh "$volume"/cov-analysis-macosx-*.sh &&
ls -l &&
hdiutil detach "$volume"
;;
*.zip)
cd $RUNNER_TEMP &&
mkdir cov-analysis-tmp &&
unzip -d cov-analysis-tmp $COVERITY_TOOL_FILENAME &&
mv cov-analysis-tmp/* cov-analysis
;;
*)
echo "::error::unhandled archive type: $COVERITY_TOOL_FILENAME" >&2
exit 1
;;
esac
- name: cache the Coverity Build Tool
if: steps.cache.outputs.cache-hit != 'true'
uses: actions/cache/save@v4
with:
path: ${{ runner.temp }}/cov-analysis
key: cov-build-${{ env.COVERITY_LANGUAGE }}-${{ env.COVERITY_PLATFORM }}-${{ steps.lookup.outputs.hash }}
- name: build with cov-build
run: |
export PATH="$RUNNER_TEMP/cov-analysis/bin:$PATH" &&
cov-configure --gcc &&
cov-build --dir cov-int make
- name: package the build
run: tar -czvf cov-int.tgz cov-int
- name: submit the build to Coverity Scan
run: |
curl \
--fail \
--form token='${{ secrets.COVERITY_SCAN_TOKEN }}' \
--form email='${{ secrets.COVERITY_SCAN_EMAIL }}' \
--form file=@cov-int.tgz \
--form version='${{ github.sha }}' \
"https://scan.coverity.com/builds?project=$COVERITY_PROJECT"

View File

@ -2,6 +2,12 @@ name: git-l10n
on: [push, pull_request_target]
# Avoid unnecessary builds. Unlike the main CI jobs, these are not
# ci-configurable (but could be).
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
git-po-helper:
if: >-
@ -23,8 +29,8 @@ jobs:
base=${{ github.event.before }}
head=${{ github.event.after }}
fi
echo "::set-output name=base::$base"
echo "::set-output name=head::$head"
echo base=$base >>$GITHUB_OUTPUT
echo head=$head >>$GITHUB_OUTPUT
- name: Run partial clone
run: |
git -c init.defaultBranch=master init --bare .
@ -57,9 +63,10 @@ jobs:
origin \
${{ github.ref }} \
$args
- uses: actions/setup-go@v2
- uses: actions/setup-go@v5
with:
go-version: '>=1.16'
cache: false
- name: Install git-po-helper
run: go install github.com/git-l10n/git-po-helper@main
- name: Install other dependencies
@ -85,14 +92,13 @@ jobs:
cat git-po-helper.out
exit $exit_code
- name: Create comment in pull request for report
uses: mshick/add-pr-comment@v1
uses: mshick/add-pr-comment@v2
if: >-
always() &&
github.event_name == 'pull_request_target' &&
env.COMMENT_BODY != ''
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
repo-token-user-login: 'github-actions[bot]'
message: >
${{ steps.check-commits.outcome == 'failure' && 'Errors and warnings' || 'Warnings' }}
found by [git-po-helper](https://github.com/git-l10n/git-po-helper#readme) in workflow

View File

@ -5,12 +5,27 @@ on: [push, pull_request]
env:
DEVELOPER: 1
# If more than one workflow run is triggered for the very same commit hash
# (which happens when multiple branches pointing to the same commit), only
# the first one is allowed to run, the second will be kept in the "queued"
# state. This allows a successful completion of the first run to be reused
# in the second run via the `skip-if-redundant` logic in the `config` job.
#
# The only caveat is that if a workflow run is triggered for the same commit
# hash that another run is already being held, that latter run will be
# canceled. For more details about the `concurrency` attribute, see:
# https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency
concurrency:
group: ${{ github.sha }}
jobs:
ci-config:
name: config
if: vars.CI_BRANCHES == '' || contains(vars.CI_BRANCHES, github.ref_name)
runs-on: ubuntu-latest
outputs:
enabled: ${{ steps.check-ref.outputs.enabled }}${{ steps.skip-if-redundant.outputs.enabled }}
skip_concurrent: ${{ steps.check-ref.outputs.skip_concurrent }}
steps:
- name: try to clone ci-config branch
run: |
@ -29,22 +44,33 @@ jobs:
name: check whether CI is enabled for ref
run: |
enabled=yes
if test -x config-repo/ci/config/allow-ref &&
! config-repo/ci/config/allow-ref '${{ github.ref }}'
if test -x config-repo/ci/config/allow-ref
then
enabled=no
echo "::warning::ci/config/allow-ref is deprecated; use CI_BRANCHES instead"
if ! config-repo/ci/config/allow-ref '${{ github.ref }}'
then
enabled=no
fi
fi
echo "::set-output name=enabled::$enabled"
skip_concurrent=yes
if test -x config-repo/ci/config/skip-concurrent &&
! config-repo/ci/config/skip-concurrent '${{ github.ref }}'
then
skip_concurrent=no
fi
echo "enabled=$enabled" >>$GITHUB_OUTPUT
echo "skip_concurrent=$skip_concurrent" >>$GITHUB_OUTPUT
- name: skip if the commit or tree was already tested
id: skip-if-redundant
uses: actions/github-script@v3
uses: actions/github-script@v7
if: steps.check-ref.outputs.enabled == 'yes'
with:
github-token: ${{secrets.GITHUB_TOKEN}}
script: |
try {
// Figure out workflow ID, commit and tree
const { data: run } = await github.actions.getWorkflowRun({
const { data: run } = await github.rest.actions.getWorkflowRun({
owner: context.repo.owner,
repo: context.repo.repo,
run_id: context.runId,
@ -54,7 +80,7 @@ jobs:
const tree_id = run.head_commit.tree_id;
// See whether there is a successful run for that commit or tree
const { data: runs } = await github.actions.listWorkflowRuns({
const { data: runs } = await github.rest.actions.listWorkflowRuns({
owner: context.repo.owner,
repo: context.repo.repo,
per_page: 500,
@ -82,8 +108,11 @@ jobs:
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
runs-on: windows-latest
concurrency:
group: windows-build-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- uses: git-for-windows/setup-git-for-windows-sdk@v1
- name: build
shell: bash
@ -94,21 +123,24 @@ jobs:
- name: zip up tracked files
run: git archive -o artifacts/tracked.tar.gz HEAD
- name: upload tracked files and build artifacts
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: windows-artifacts
path: artifacts
windows-test:
name: win test
runs-on: windows-latest
needs: [windows-build]
needs: [ci-config, windows-build]
strategy:
fail-fast: false
matrix:
nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
concurrency:
group: windows-test-${{ matrix.nr }}-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- name: download tracked files and build artifacts
uses: actions/download-artifact@v2
uses: actions/download-artifact@v4
with:
name: windows-artifacts
path: ${{github.workspace}}
@ -125,37 +157,36 @@ jobs:
run: ci/print-test-failures.sh
- name: Upload failed tests' directories
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: failed-tests-windows
name: failed-tests-windows-${{ matrix.nr }}
path: ${{env.FAILED_TEST_ARTIFACTS}}
vs-build:
name: win+VS build
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
if: github.event.repository.owner.login == 'git-for-windows' && needs.ci-config.outputs.enabled == 'yes'
env:
NO_PERL: 1
GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'"
runs-on: windows-latest
concurrency:
group: vs-build-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- uses: git-for-windows/setup-git-for-windows-sdk@v1
- name: initialize vcpkg
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
repository: 'microsoft/vcpkg'
path: 'compat/vcbuild/vcpkg'
- name: download vcpkg artifacts
shell: powershell
run: |
$urlbase = "https://dev.azure.com/git/git/_apis/build/builds"
$id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=9&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id
$downloadUrl = ((Invoke-WebRequest -UseBasicParsing "${urlbase}/$id/artifacts").content | ConvertFrom-JSON).value[0].resource.downloadUrl
(New-Object Net.WebClient).DownloadFile($downloadUrl, "compat.zip")
Expand-Archive compat.zip -DestinationPath . -Force
Remove-Item compat.zip
uses: git-for-windows/get-azure-pipelines-artifact@v0
with:
repository: git/git
definitionId: 9
- name: add msbuild to PATH
uses: microsoft/setup-msbuild@v1
uses: microsoft/setup-msbuild@v2
- name: copy dlls to root
shell: cmd
run: compat\vcbuild\vcpkg_copy_dlls.bat release
@ -177,22 +208,25 @@ jobs:
- name: zip up tracked files
run: git archive -o artifacts/tracked.tar.gz HEAD
- name: upload tracked files and build artifacts
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: vs-artifacts
path: artifacts
vs-test:
name: win+VS test
runs-on: windows-latest
needs: vs-build
needs: [ci-config, vs-build]
strategy:
fail-fast: false
matrix:
nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
concurrency:
group: vs-test-${{ matrix.nr }}-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- uses: git-for-windows/setup-git-for-windows-sdk@v1
- name: download tracked files and build artifacts
uses: actions/download-artifact@v2
uses: actions/download-artifact@v4
with:
name: vs-artifacts
path: ${{github.workspace}}
@ -210,97 +244,186 @@ jobs:
run: ci/print-test-failures.sh
- name: Upload failed tests' directories
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: failed-tests-windows
name: failed-tests-windows-vs-${{ matrix.nr }}
path: ${{env.FAILED_TEST_ARTIFACTS}}
windows-meson-build:
name: win+Meson build
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
runs-on: windows-latest
concurrency:
group: windows-meson-build-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- name: Set up dependencies
shell: pwsh
run: pip install meson ninja
- name: Setup
shell: pwsh
run: meson setup build -Dperl=disabled -Dcredential_helpers=wincred
- name: Compile
shell: pwsh
run: meson compile -C build
- name: Upload build artifacts
uses: actions/upload-artifact@v4
with:
name: windows-meson-artifacts
path: build
windows-meson-test:
name: win+Meson test
runs-on: windows-latest
needs: [ci-config, windows-meson-build]
strategy:
fail-fast: false
matrix:
nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
concurrency:
group: windows-meson-test-${{ matrix.nr }}-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- name: Set up dependencies
shell: pwsh
run: pip install meson ninja
- name: Download build artifacts
uses: actions/download-artifact@v4
with:
name: windows-meson-artifacts
path: build
- name: Test
shell: pwsh
run: meson test -C build --list | Select-Object -Skip 1 | Select-String .* | Group-Object -Property { $_.LineNumber % 10 } | Where-Object Name -EQ ${{ matrix.nr }} | ForEach-Object { meson test -C build --no-rebuild --print-errorlogs $_.Group }
regular:
name: ${{matrix.vector.jobname}} (${{matrix.vector.pool}})
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
concurrency:
group: ${{ matrix.vector.jobname }}-${{ matrix.vector.pool }}-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
strategy:
fail-fast: false
matrix:
vector:
- jobname: linux-clang
cc: clang
pool: ubuntu-latest
- jobname: linux-sha256
cc: clang
os: ubuntu
pool: ubuntu-latest
- jobname: linux-gcc
cc: gcc
cc_package: gcc-8
pool: ubuntu-latest
- jobname: linux-TEST-vars
cc: gcc
os: ubuntu
cc_package: gcc-8
pool: ubuntu-latest
- jobname: osx-clang
cc: clang
pool: macos-latest
pool: macos-13
- jobname: osx-reftable
cc: clang
pool: macos-13
- jobname: osx-gcc
cc: gcc
cc_package: gcc-9
pool: macos-latest
- jobname: linux-gcc-default
cc: gcc
pool: ubuntu-latest
- jobname: linux-leaks
cc: gcc
pool: ubuntu-latest
cc: gcc-13
pool: macos-13
- jobname: osx-meson
cc: clang
pool: macos-13
env:
CC: ${{matrix.vector.cc}}
CC_PACKAGE: ${{matrix.vector.cc_package}}
jobname: ${{matrix.vector.jobname}}
runs_on_pool: ${{matrix.vector.pool}}
CI_JOB_IMAGE: ${{matrix.vector.pool}}
TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t
runs-on: ${{matrix.vector.pool}}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- run: ci/install-dependencies.sh
- run: ci/run-build-and-tests.sh
- name: print test failures
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
shell: bash
run: ci/print-test-failures.sh
- name: Upload failed tests' directories
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: failed-tests-${{matrix.vector.jobname}}
path: ${{env.FAILED_TEST_ARTIFACTS}}
fuzz-smoke-test:
name: fuzz smoke test
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
env:
CC: clang
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: ci/install-dependencies.sh
- run: ci/run-build-and-minimal-fuzzers.sh
dockerized:
name: ${{matrix.vector.jobname}} (${{matrix.vector.image}})
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
concurrency:
group: dockerized-${{ matrix.vector.jobname }}-${{ matrix.vector.image }}-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
strategy:
fail-fast: false
matrix:
vector:
- jobname: linux-musl
image: alpine
- jobname: linux-sha256
image: ubuntu:rolling
cc: clang
- jobname: linux-reftable
image: ubuntu:rolling
cc: clang
- jobname: linux-TEST-vars
image: ubuntu:20.04
cc: gcc
cc_package: gcc-8
- jobname: linux-breaking-changes
cc: gcc
image: ubuntu:rolling
- jobname: linux-leaks
image: ubuntu:rolling
cc: gcc
- jobname: linux-reftable-leaks
image: ubuntu:rolling
cc: gcc
- jobname: linux-asan-ubsan
image: ubuntu:rolling
cc: clang
- jobname: linux-meson
image: ubuntu:rolling
cc: gcc
- jobname: linux-musl-meson
image: alpine:latest
# Supported until 2025-04-02.
- jobname: linux32
os: ubuntu32
image: daald/ubuntu32:xenial
image: i386/ubuntu:focal
- jobname: pedantic
image: fedora
image: fedora:latest
# A RHEL 8 compatible distro. Supported until 2029-05-31.
- jobname: almalinux-8
image: almalinux:8
# Supported until 2026-08-31.
- jobname: debian-11
image: debian:11
env:
jobname: ${{matrix.vector.jobname}}
CC: ${{matrix.vector.cc}}
CI_JOB_IMAGE: ${{matrix.vector.image}}
runs-on: ubuntu-latest
container: ${{matrix.vector.image}}
steps:
- uses: actions/checkout@v1
- run: ci/install-docker-dependencies.sh
- run: ci/run-build-and-tests.sh
- name: prepare libc6 for actions
if: matrix.vector.jobname == 'linux32'
run: apt -q update && apt -q -y install libc6-amd64 lib64stdc++6
- uses: actions/checkout@v4
- run: ci/install-dependencies.sh
- run: useradd builder --create-home
- run: chown -R builder .
- run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
- name: print test failures
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
shell: bash
run: ci/print-test-failures.sh
run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh
- name: Upload failed tests' directories
if: failure() && env.FAILED_TEST_ARTIFACTS != ''
uses: actions/upload-artifact@v1
uses: actions/upload-artifact@v4
with:
name: failed-tests-${{matrix.vector.jobname}}
path: ${{env.FAILED_TEST_ARTIFACTS}}
@ -309,9 +432,12 @@ jobs:
if: needs.ci-config.outputs.enabled == 'yes'
env:
jobname: StaticAnalysis
runs-on: ubuntu-18.04
runs-on: ubuntu-22.04
concurrency:
group: static-analysis-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- run: ci/install-dependencies.sh
- run: ci/run-static-analysis.sh
- run: ci/check-directional-formatting.bash
@ -321,6 +447,9 @@ jobs:
env:
jobname: sparse
runs-on: ubuntu-20.04
concurrency:
group: sparse-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
steps:
- name: Download a current `sparse` package
# Ubuntu's `sparse` version is too old for us
@ -331,7 +460,7 @@ jobs:
artifact: sparse-20.04
- name: Install the current `sparse` package
run: sudo dpkg -i sparse-20.04/sparse_*.deb
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: Install other dependencies
run: ci/install-dependencies.sh
- run: make sparse
@ -339,10 +468,13 @@ jobs:
name: documentation
needs: ci-config
if: needs.ci-config.outputs.enabled == 'yes'
concurrency:
group: documentation-${{ github.ref }}
cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
env:
jobname: Documentation
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- run: ci/install-dependencies.sh
- run: ci/test-documentation.sh

23
.gitignore vendored
View File

@ -1,7 +1,5 @@
/fuzz-commit-graph
/fuzz_corpora
/fuzz-pack-headers
/fuzz-pack-idx
/GIT-BUILD-DIR
/GIT-BUILD-OPTIONS
/GIT-CFLAGS
/GIT-LDFLAGS
@ -10,19 +8,19 @@
/GIT-PERL-HEADER
/GIT-PYTHON-VARS
/GIT-SCRIPT-DEFINES
/GIT-SPATCH-DEFINES
/GIT-TEST-SUITES
/GIT-USER-AGENT
/GIT-VERSION-FILE
/bin-wrappers/
/git
/git-add
/git-add--interactive
/git-am
/git-annotate
/git-apply
/git-archimport
/git-archive
/git-backfill
/git-bisect
/git-bisect--helper
/git-blame
/git-branch
/git-bugreport
@ -53,6 +51,7 @@
/git-cvsimport
/git-cvsserver
/git-daemon
/git-diagnose
/git-diff
/git-diff-files
/git-diff-index
@ -60,7 +59,6 @@
/git-difftool
/git-difftool--helper
/git-describe
/git-env--helper
/git-fast-export
/git-fast-import
/git-fetch
@ -129,6 +127,7 @@
/git-rebase
/git-receive-pack
/git-reflog
/git-refs
/git-remote
/git-remote-http
/git-remote-https
@ -138,6 +137,7 @@
/git-remote-ext
/git-repack
/git-replace
/git-replay
/git-request-pull
/git-rerere
/git-reset
@ -180,11 +180,14 @@
/git-verify-commit
/git-verify-pack
/git-verify-tag
/git-version
/git-web--browse
/git-whatchanged
/git-worktree
/git-write-tree
/scalar
/git-core-*/?*
/git.res
/gitweb/GITWEB-BUILD-OPTIONS
/gitweb/gitweb.cgi
/gitweb/static/gitweb.js
@ -192,9 +195,11 @@
/config-list.h
/command-list.h
/hook-list.h
/version-def.h
*.tar.gz
*.dsc
*.deb
/git.rc
/git.spec
*.exe
*.[aos]
@ -222,10 +227,10 @@
/TAGS
/cscope*
/compile_commands.json
/.cache/
*.hcc
*.obj
*.lib
*.res
*.sln
*.sp
*.suo
@ -246,3 +251,5 @@ Release/
/git.VC.db
*.dSYM
/contrib/buildsystems/out
/contrib/libgit-rs/target
/contrib/libgit-sys/target

253
.gitlab-ci.yml Normal file
View File

@ -0,0 +1,253 @@
default:
timeout: 2h
stages:
- build
- test
- analyze
workflow:
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_TAG
- if: $CI_COMMIT_REF_PROTECTED == "true"
test:linux:
image: $image
stage: test
needs: [ ]
tags:
- saas-linux-medium-amd64
variables:
CUSTOM_PATH: "/custom"
TEST_OUTPUT_DIRECTORY: "/tmp/test-output"
before_script:
- ./ci/install-dependencies.sh
script:
- useradd builder --create-home
- chown -R builder "${CI_PROJECT_DIR}"
- sudo --preserve-env --set-home --user=builder ./ci/run-build-and-tests.sh
after_script:
- |
if test "$CI_JOB_STATUS" != 'success'
then
sudo --preserve-env --set-home --user=builder ./ci/print-test-failures.sh
mv "$TEST_OUTPUT_DIRECTORY"/failed-test-artifacts t/
fi
parallel:
matrix:
- jobname: linux-sha256
image: ubuntu:rolling
CC: clang
- jobname: linux-reftable
image: ubuntu:rolling
CC: clang
- jobname: linux-breaking-changes
image: ubuntu:20.04
CC: gcc
- jobname: linux-TEST-vars
image: ubuntu:20.04
CC: gcc
CC_PACKAGE: gcc-8
- jobname: linux-leaks
image: ubuntu:rolling
CC: gcc
- jobname: linux-reftable-leaks
image: ubuntu:rolling
CC: gcc
- jobname: linux-asan-ubsan
image: ubuntu:rolling
CC: clang
- jobname: pedantic
image: fedora:latest
- jobname: linux-musl-meson
image: alpine:latest
- jobname: linux32
image: i386/ubuntu:20.04
- jobname: linux-meson
image: ubuntu:rolling
CC: gcc
artifacts:
paths:
- t/failed-test-artifacts
when: on_failure
test:osx:
image: $image
stage: test
needs: [ ]
tags:
- saas-macos-medium-m1
variables:
TEST_OUTPUT_DIRECTORY: "/Volumes/RAMDisk"
before_script:
# Create a 4GB RAM disk that we use to store test output on. This small hack
# significantly speeds up tests by more than a factor of 2 because the
# macOS runners use network-attached storage as disks, which is _really_
# slow with the many small writes that our tests do.
- sudo diskutil apfs create $(hdiutil attach -nomount ram://8192000) RAMDisk
- ./ci/install-dependencies.sh
script:
- ./ci/run-build-and-tests.sh
after_script:
- |
if test "$CI_JOB_STATUS" != 'success'
then
./ci/print-test-failures.sh
mv "$TEST_OUTPUT_DIRECTORY"/failed-test-artifacts t/
fi
parallel:
matrix:
- jobname: osx-clang
image: macos-14-xcode-15
CC: clang
- jobname: osx-reftable
image: macos-14-xcode-15
CC: clang
- jobname: osx-meson
image: macos-14-xcode-15
CC: clang
artifacts:
paths:
- t/failed-test-artifacts
when: on_failure
build:mingw64:
stage: build
tags:
- saas-windows-medium-amd64
variables:
NO_PERL: 1
before_script:
- ./ci/install-sdk.ps1 -directory "git-sdk"
script:
- git-sdk/usr/bin/bash.exe -l -c 'ci/make-test-artifacts.sh artifacts'
artifacts:
paths:
- artifacts
- git-sdk
test:mingw64:
stage: test
tags:
- saas-windows-medium-amd64
needs:
- job: "build:mingw64"
artifacts: true
before_script:
- git-sdk/usr/bin/bash.exe -l -c 'tar xf artifacts/artifacts.tar.gz'
- New-Item -Path .git/info -ItemType Directory
- New-Item .git/info/exclude -ItemType File -Value "/git-sdk"
script:
- git-sdk/usr/bin/bash.exe -l -c "ci/run-test-slice.sh $CI_NODE_INDEX $CI_NODE_TOTAL"
after_script:
- git-sdk/usr/bin/bash.exe -l -c 'ci/print-test-failures.sh'
parallel: 10
.msvc-meson:
tags:
- saas-windows-medium-amd64
before_script:
- choco install -y git meson ninja openssl
- Import-Module $env:ChocolateyInstall\helpers\chocolateyProfile.psm1
- refreshenv
# The certificate store for Python on Windows is broken and fails to fetch
# certificates, see https://bugs.python.org/issue36011. This seems to
# mostly be an issue with how the GitLab image is set up as it is a
# non-issue on GitHub Actions. Work around the issue by importing
# cetrificates manually.
- Invoke-WebRequest https://curl.haxx.se/ca/cacert.pem -OutFile cacert.pem
- openssl pkcs12 -export -nokeys -in cacert.pem -out certs.pfx -passout "pass:"
- Import-PfxCertificate -CertStoreLocation Cert:\LocalMachine\Root -FilePath certs.pfx
build:msvc-meson:
extends: .msvc-meson
stage: build
script:
- meson setup build -Dperl=disabled -Dbackend_max_links=1 -Dcredential_helpers=wincred
- meson compile -C build
artifacts:
paths:
- build
test:msvc-meson:
extends: .msvc-meson
stage: test
when: manual
timeout: 6h
needs:
- job: "build:msvc-meson"
artifacts: true
script:
- meson test -C build --list | Select-Object -Skip 1 | Select-String .* | Group-Object -Property { $_.LineNumber % $Env:CI_NODE_TOTAL + 1 } | Where-Object Name -EQ $Env:CI_NODE_INDEX | ForEach-Object { meson test -C build --no-rebuild --print-errorlogs $_.Group; if (!$?) { exit $LASTEXITCODE } }
parallel: 10
test:fuzz-smoke-tests:
image: ubuntu:latest
stage: test
needs: [ ]
variables:
CC: clang
before_script:
- ./ci/install-dependencies.sh
script:
- ./ci/run-build-and-minimal-fuzzers.sh
static-analysis:
image: ubuntu:22.04
stage: analyze
needs: [ ]
variables:
jobname: StaticAnalysis
before_script:
- ./ci/install-dependencies.sh
script:
- ./ci/run-static-analysis.sh
- ./ci/check-directional-formatting.bash
check-whitespace:
image: ubuntu:latest
stage: analyze
needs: [ ]
before_script:
- ./ci/install-dependencies.sh
# Since $CI_MERGE_REQUEST_TARGET_BRANCH_SHA is only defined for merged
# pipelines, we fallback to $CI_MERGE_REQUEST_DIFF_BASE_SHA, which should
# be defined in all pipelines.
script:
- |
R=${CI_MERGE_REQUEST_TARGET_BRANCH_SHA:-${CI_MERGE_REQUEST_DIFF_BASE_SHA:?}} || exit
./ci/check-whitespace.sh "$R"
rules:
- if: $CI_PIPELINE_SOURCE == 'merge_request_event'
check-style:
image: ubuntu:latest
stage: analyze
needs: [ ]
allow_failure: true
variables:
CC: clang
jobname: ClangFormat
before_script:
- ./ci/install-dependencies.sh
# Since $CI_MERGE_REQUEST_TARGET_BRANCH_SHA is only defined for merged
# pipelines, we fallback to $CI_MERGE_REQUEST_DIFF_BASE_SHA, which should
# be defined in all pipelines.
script:
- |
R=${CI_MERGE_REQUEST_TARGET_BRANCH_SHA:-${CI_MERGE_REQUEST_DIFF_BASE_SHA:?}} || exit
./ci/run-style-check.sh "$R"
rules:
- if: $CI_PIPELINE_SOURCE == 'merge_request_event'
documentation:
image: ubuntu:latest
stage: analyze
needs: [ ]
variables:
jobname: Documentation
before_script:
- ./ci/install-dependencies.sh
script:
- ./ci/test-documentation.sh

View File

@ -59,12 +59,13 @@ David Reiss <dreiss@facebook.com> <dreiss@dreiss-vmware.(none)>
David S. Miller <davem@davemloft.net>
David Turner <novalis@novalis.org> <dturner@twopensource.com>
David Turner <novalis@novalis.org> <dturner@twosigma.com>
Derrick Stolee <derrickstolee@github.com> <stolee@gmail.com>
Derrick Stolee <derrickstolee@github.com> Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Derrick Stolee <derrickstolee@github.com> <dstolee@microsoft.com>
Derrick Stolee <stolee@gmail.com> <derrickstolee@github.com>
Derrick Stolee <stolee@gmail.com> Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Derrick Stolee <stolee@gmail.com> <dstolee@microsoft.com>
Deskin Miller <deskinm@umich.edu>
Đoàn Trần Công Danh <congdanhqx@gmail.com> Doan Tran Cong Danh
Dirk Süsserott <newsletter@dirk.my1.cc>
Emily Shaffer <nasamuffin@google.com> <emilyshaffer@google.com>
Eric Blake <eblake@redhat.com> <ebb9@byu.net>
Eric Hanchrow <eric.hanchrow@gmail.com> <offby1@blarg.net>
Eric S. Raymond <esr@thyrsus.com>
@ -79,6 +80,7 @@ Frank Lichtenheld <frank@lichtenheld.de> <flichtenheld@astaro.com>
Fredrik Kuivinen <frekui@gmail.com> <freku045@student.liu.se>
Frédéric Heitzmann <frederic.heitzmann@gmail.com>
Garry Dolley <gdolley@ucla.edu> <gdolley@arpnetworks.com>
Glen Choo <glencbz@gmail.com> <chooglen@google.com>
Greg Price <price@mit.edu> <price@MIT.EDU>
Greg Price <price@mit.edu> <price@ksplice.com>
Heiko Voigt <hvoigt@hvoigt.net> <git-list@hvoigt.net>
@ -150,6 +152,7 @@ Lars Doelle <lars.doelle@on-line ! de>
Lars Doelle <lars.doelle@on-line.de>
Lars Noschinski <lars@public.noschinski.de> <lars.noschinski@rwth-aachen.de>
Li Hong <leehong@pku.edu.cn>
Linus Arver <linus@ucla.edu> <linusa@google.com>
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@evo.osdl.org>
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@g5.osdl.org>
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@osdl.org>
@ -165,6 +168,7 @@ Mark Rada <marada@uwaterloo.ca>
Martin Langhoff <martin@laptop.org> <martin@catalyst.net.nz>
Martin von Zweigbergk <martinvonz@gmail.com> <martin.von.zweigbergk@gmail.com>
Masaya Suzuki <masayasuzuki@google.com> <draftcode@gmail.com>
Matheus Tavares <matheus.tavb@gmail.com> <matheus.bernardino@usp.br>
Matt Draisey <matt@draisey.ca> <mattdraisey@sympatico.ca>
Matt Kraai <kraai@ftbfs.org> <matt.kraai@amo.abbott.com>
Matt McCutchen <matt@mattmccutchen.net> <hashproduct@gmail.com>
@ -253,6 +257,7 @@ Stefan Naewe <stefan.naewe@gmail.com> <stefan.naewe@googlemail.com>
Stefan Sperling <stsp@elego.de> <stsp@stsp.name>
Štěpán Němec <stepnem@gmail.com> <stepan.nemec@gmail.com>
Stephen Boyd <bebarino@gmail.com> <sboyd@codeaurora.org>
Stephen P. Smith <ishchis2@gmail.com> <ischis2@cox.net>
Steven Drake <sdrake@xnet.co.nz> <sdrake@ihug.co.nz>
Steven Grimm <koreth@midwinter.com> <sgrimm@sgrimm-mbp.local>
Steven Grimm <koreth@midwinter.com> koreth@midwinter.com

View File

@ -130,11 +130,11 @@ This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].
Community Impact Guidelines were inspired by
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
at [https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org

View File

@ -1 +0,0 @@
*.txt whitespace

View File

@ -6,13 +6,15 @@
*.pdf
git.info
gitman.info
howto-index.txt
howto-index.adoc
doc.dep
cmds-*.txt
mergetools-*.txt
manpage-base-url.xsl
SubmittingPatches.txt
cmds-*.adoc
mergetools-*.adoc
SubmittingPatches.adoc
tmp-doc-diff/
tmp-meson-diff/
GIT-ASCIIDOCFLAGS
/.build/
/GIT-EXCLUDED-PROGRAMS
/asciidoc.conf
/asciidoctor-extensions.rb

View File

@ -0,0 +1,198 @@
= Upcoming breaking changes
The Git project aims to ensure backwards compatibility to the best extent
possible. Minor releases will not break backwards compatibility unless there is
a very strong reason to do so, like for example a security vulnerability.
Regardless of that, due to the age of the Git project, it is only natural to
accumulate a backlog of backwards-incompatible changes that will eventually be
required to keep the project aligned with a changing world. These changes fall
into several categories:
* Changes to long established defaults.
* Concepts that have been replaced with a superior design.
* Concepts, commands, configuration or options that have been lacking in major
ways and that cannot be fixed and which will thus be removed without any
replacement.
Explicitly not included in this list are fixes to minor bugs that may cause a
change in user-visible behavior.
The Git project irregularly releases breaking versions that deliberately break
backwards compatibility with older versions. This is done to ensure that Git
remains relevant, safe and maintainable going forward. The release cadence of
breaking versions is typically measured in multiple years. We had the following
major breaking releases in the past:
* Git 1.6.0, released in August 2008.
* Git 2.0, released in May 2014.
We use <major>.<minor> release numbers these days, starting from Git 2.0. For
future releases, our plan is to increment <major> in the release number when we
make the next breaking release. Before Git 2.0, the release numbers were
1.<major>.<minor> with the intention to increment <major> for "usual" breaking
releases, reserving the jump to Git 2.0 for really large backward-compatibility
breaking changes.
The intent of this document is to track upcoming deprecations for future
breaking releases. Furthermore, this document also tracks what will _not_ be
deprecated. This is done such that the outcome of discussions document both
when the discussion favors deprecation, but also when it rejects a deprecation.
Items should have a clear summary of the reasons why we do or do not want to
make the described change that can be easily understood without having to read
the mailing list discussions. If there are alternatives to the changed feature,
those alternatives should be pointed out to our users.
All items should be accompanied by references to relevant mailing list threads
where the deprecation was discussed. These references use message-IDs, which
can visited via
https://lore.kernel.org/git/$message_id/
to see the message and its surrounding discussion. Such a reference is there to
make it easier for you to find how the project reached consensus on the
described item back then.
This is a living document as the environment surrounding the project changes
over time. If circumstances change, an earlier decision to deprecate or change
something may need to be revisited from time to time. So do not take items on
this list to mean "it is settled, do not waste our time bringing it up again".
== Procedure
Discussing the desire to make breaking changes, declaring that breaking
changes are made at a certain version boundary, and recording these
decisions in this document, are necessary but not sufficient.
Because such changes are expected to be numerous, and the design and
implementation of them are expected to span over time, they have to
be deployable trivially at such a version boundary, prepared over long
time.
The breaking changes MUST be guarded with the a compile-time switch,
WITH_BREAKING_CHANGES, to help this process. When built with it,
the resulting Git binary together with its documentation would
behave as if these breaking changes slated for the next big version
boundary are already in effect. We also have a CI job to exercise
the work-in-progress version of Git with these breaking changes.
== Git 3.0
The following subsections document upcoming breaking changes for Git 3.0. There
is no planned release date for this breaking version yet.
Proposed changes and removals only include items which are "ready" to be done.
In other words, this is not supposed to be a wishlist of features that should
be changed to or replaced in case the alternative was implemented already.
=== Changes
* The default hash function for new repositories will be changed from "sha1"
to "sha256". SHA-1 has been deprecated by NIST in 2011 and is nowadays
recommended against in FIPS 140-2 and similar certifications. Furthermore,
there are practical attacks on SHA-1 that weaken its cryptographic properties:
+
** The SHAppening (2015). The first demonstration of a practical attack
against SHA-1 with 2^57 operations.
** SHAttered (2017). Generation of two valid PDF files with 2^63 operations.
** Birthday-Near-Collision (2019). This attack allows for chosen prefix
attacks with 2^68 operations.
** Shambles (2020). This attack allows for chosen prefix attacks with 2^63
operations.
+
While we have protections in place against known attacks, it is expected
that more attacks against SHA-1 will be found by future research. Paired
with the ever-growing capability of hardware, it is only a matter of time
before SHA-1 will be considered broken completely. We want to be prepared
and will thus change the default hash algorithm to "sha256" for newly
initialized repositories.
+
An important requirement for this change is that the ecosystem is ready to
support the "sha256" object format. This includes popular Git libraries,
applications and forges.
+
There is no plan to deprecate the "sha1" object format at this point in time.
+
Cf. <2f5de416-04ba-c23d-1e0b-83bb655829a7@zombino.com>,
<20170223155046.e7nxivfwqqoprsqj@LykOS.localdomain>,
<CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@mail.gmail.com>.
=== Removals
* Support for grafting commits has long been superseded by git-replace(1).
Grafts are inferior to replacement refs:
+
** Grafts are a local-only mechanism and cannot be shared across
repositories.
** Grafts can lead to hard-to-diagnose problems when transferring objects
between repositories.
+
The grafting mechanism has been marked as outdated since e650d0643b (docs: mark
info/grafts as outdated, 2014-03-05) and will be removed.
+
Cf. <20140304174806.GA11561@sigill.intra.peff.net>.
* The git-pack-redundant(1) command can be used to remove redundant pack files.
The subcommand is unusably slow and the reason why nobody reports it as a
performance bug is suspected to be the absence of users. We have nominated
the command for removal and have started to emit a user-visible warning in
c3b58472be (pack-redundant: gauge the usage before proposing its removal,
2020-08-25) whenever the command is executed.
+
So far there was a single complaint about somebody still using the command, but
that complaint did not cause us to reverse course. On the contrary, we have
doubled down on the deprecation and starting with 4406522b76 (pack-redundant:
escalate deprecation warning to an error, 2023-03-23), the command dies unless
the user passes the `--i-still-use-this` option.
+
There have not been any subsequent complaints, so this command will finally be
removed.
+
Cf. <xmqq1rjuz6n3.fsf_-_@gitster.c.googlers.com>,
<CAKvOHKAFXQwt4D8yUCCkf_TQL79mYaJ=KAKhtpDNTvHJFuX1NA@mail.gmail.com>,
<20230323204047.GA9290@coredump.intra.peff.net>,
* Support for storing shorthands for remote URLs in "$GIT_COMMON_DIR/branches/"
and "$GIT_COMMON_DIR/remotes/" has been long superseded by storing remotes in
the repository configuration.
+
The mechanism has originally been introduced in f170e4b39d ([PATCH] fetch/pull:
short-hand notation for remote repositories., 2005-07-16) and was superseded by
6687f8fea2 ([PATCH] Use .git/remote/origin, not .git/branches/origin.,
2005-08-20), where we switched from ".git/branches/" to ".git/remotes/". That
commit already mentions an upcoming deprecation of the ".git/branches/"
directory, and starting with a1d4aa7424 (Add repository-layout document.,
2005-09-01) we have also marked this layout as deprecated. Eventually we also
started to migrate away from ".git/remotes/" in favor of config-based remotes,
and we have marked the directory as legacy in 3d3d282146 (Documentation:
Grammar correction, wording fixes and cleanup, 2011-08-23)
+
As our documentation mentions, these directories are unlikely to be used in
modern repositories and most users aren't even aware of these mechanisms. They
have been deprecated for almost 20 years and 14 years respectively, and we are
not aware of any active users that have complained about this deprecation.
Furthermore, the ".git/branches/" directory is nowadays misleadingly named and
may cause confusion as "branches" are almost exclusively used in the context of
references.
+
These features will be removed.
== Superseded features that will not be deprecated
Some features have gained newer replacements that aim to improve the design in
certain ways. The fact that there is a replacement does not automatically mean
that the old way of doing things will eventually be removed. This section tracks
those features with newer alternatives.
* The features git-checkout(1) offers are covered by the pair of commands
git-restore(1) and git-switch(1). Because the use of git-checkout(1) is still
widespread, and it is not expected that this will change anytime soon, all
three commands will stay.
+
This decision may get revisited in case we ever figure out that there are
almost no users of any of the commands anymore.
+
Cf. <xmqqttjazwwa.fsf@gitster.g>,
<xmqqleeubork.fsf@gitster.g>,
<112b6568912a6de6672bf5592c3a718e@manjaro.org>.

View File

@ -1,5 +1,5 @@
Like other projects, we also have some guidelines to keep to the
code. For Git in general, a few rough rules are:
Like other projects, we also have some guidelines for our code. For
Git in general, a few rough rules are:
- Most importantly, we never say "It's in POSIX; we'll happily
ignore your needs should your system not conform to it."
@ -24,7 +24,7 @@ code. For Git in general, a few rough rules are:
"Once it _is_ in the tree, it's not really worth the patch noise to
go and fix it up."
Cf. http://lkml.iu.edu/hypermail/linux/kernel/1001.3/01069.html
Cf. https://lore.kernel.org/all/20100126160632.3bdbe172.akpm@linux-foundation.org/
- Log messages to explain your changes are as important as the
changes themselves. Clearly written code and in-code comments
@ -40,11 +40,11 @@ As for more concrete guidelines, just imitate the existing code
contributing to). It is always preferable to match the _local_
convention. New code added to Git suite is expected to match
the overall style of existing code. Modifications to existing
code is expected to match the style the surrounding code already
code are expected to match the style the surrounding code already
uses (even if it doesn't match the overall style of existing code).
But if you must have a list of rules, here are some language
specific ones. Note that Documentation/ToolsForGit.txt document
specific ones. Note that Documentation/ToolsForGit.adoc document
has a collection of tips to help you use some external tools
to conform to these guidelines.
@ -162,8 +162,6 @@ For shell scripts specifically (not exhaustive):
- We do not use \{m,n\};
- We do not use -E;
- We do not use ? or + (which are \{0,1\} and \{1,\}
respectively in BRE) but that goes without saying as these
are ERE elements not BRE (note that \? and \+ are not even part
@ -187,8 +185,55 @@ For shell scripts specifically (not exhaustive):
- Even though "local" is not part of POSIX, we make heavy use of it
in our test suite. We do not use it in scripted Porcelains, and
hopefully nobody starts using "local" before they are reimplemented
in C ;-)
hopefully nobody starts using "local" before all shells that matter
support it (notably, ksh from AT&T Research does not support it yet).
- Some versions of shell do not understand "export variable=value",
so we write "variable=value" and then "export variable" on two
separate lines.
- Some versions of dash have broken variable assignment when prefixed
with "local", "export", and "readonly", in that the value to be
assigned goes through field splitting at $IFS unless quoted.
(incorrect)
local variable=$value
local variable=$(command args)
(correct)
local variable="$value"
local variable="$(command args)"
- The common construct
VAR=VAL command args
to temporarily set and export environment variable VAR only while
"command args" is running is handy, but this triggers an
unspecified behaviour according to POSIX when used for a command
that is not an external command (like shell functions). Indeed,
dash 0.5.10.2-6 on Ubuntu 20.04, /bin/sh on FreeBSD 13, and AT&T
ksh all make a temporary assignment without exporting the variable,
in such a case. As it does not work portably across shells, do not
use this syntax for shell functions. A common workaround is to do
an explicit export in a subshell, like so:
(incorrect)
VAR=VAL func args
(correct)
(
VAR=VAL &&
export VAR &&
func args
)
but be careful that the effect "func" makes to the variables in the
current shell will be lost across the subshell boundary.
- Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g.
"\xc2\xa2") in printf format strings, since hexadecimal escape
sequences are not portable.
For C programs:
@ -196,6 +241,16 @@ For C programs:
- We use tabs to indent, and interpret tabs as taking up to
8 spaces.
- Nested C preprocessor directives are indented after the hash by one
space per nesting level.
#if FOO
# include <foo.h>
# if BAR
# include <bar.h>
# endif
#endif
- We try to keep to at most 80 characters per line.
- As a Git developer we assume you have a reasonably modern compiler
@ -203,11 +258,28 @@ For C programs:
ensure your patch is clear of all compiler warnings we care about,
by e.g. "echo DEVELOPER=1 >>config.mak".
- When using DEVELOPER=1 mode, you may see warnings from the compiler
like "error: unused parameter 'foo' [-Werror=unused-parameter]",
which indicates that a function ignores its argument. If the unused
parameter can't be removed (e.g., because the function is used as a
callback and has to match a certain interface), you can annotate
the individual parameters with the UNUSED (or MAYBE_UNUSED)
keyword, like "int foo UNUSED".
- We try to support a wide range of C compilers to compile Git with,
including old ones. You should not use features from newer C
including old ones. As of Git v2.35.0 Git requires C99 (we check
"__STDC_VERSION__"). You should not use features from a newer C
standard, even if your compiler groks them.
There are a few exceptions to this guideline:
New C99 features have been phased in gradually, if something's new
in C99 but not used yet don't assume that it's safe to use, some
compilers we target have only partial support for it. These are
considered safe to use:
. since around 2007 with 2b6854c863a, we have been using
initializer elements which are not computable at load time. E.g.:
const char *args[] = { "constant", variable, NULL };
. since early 2012 with e1327023ea, we have been using an enum
definition whose last element is followed by a comma. This, like
@ -223,17 +295,25 @@ For C programs:
. since early 2021 with 765dc168882, we have been using variadic
macros, mostly for printf-like trace and debug macros.
These used to be forbidden, but we have not heard any breakage
report, and they are assumed to be safe.
. since late 2021 with 44ba10d6, we have had variables declared in
the for loop "for (int i = 0; i < 10; i++)".
New C99 features that we cannot use yet:
. %z and %zu as a printf() argument for a size_t (the %z being for
the POSIX-specific ssize_t). Instead you should use
printf("%"PRIuMAX, (uintmax_t)v). These days the MSVC version we
rely on supports %z, but the C library used by MinGW does not.
. Shorthand like ".a.b = *c" in struct initializations is known to
trip up an older IBM XLC version, use ".a = { .b = *c }" instead.
See the 33665d98 (reftable: make assignments portable to AIX xlc
v12.01, 2022-03-28).
- Variables have to be declared at the beginning of the block, before
the first statement (i.e. -Wdeclaration-after-statement).
- Declaring a variable in the for loop "for (int i = 0; i < 10; i++)"
is still not allowed in this codebase. We are in the process of
allowing it by waiting to see that 44ba10d6 (revision: use C99
declaration of variable in for() loop, 2021-11-14) does not get
complaints. Let's revisit this around November 2022.
the first statement (i.e. -Wdeclaration-after-statement). It is
encouraged to have a blank line between the end of the declarations
and the first statement in the block.
- NULL pointers shall be written as NULL, not as 0.
@ -253,6 +333,13 @@ For C programs:
while( condition )
func (bar+1);
- A binary operator (other than ",") and ternary conditional "?:"
have a space on each side of the operator to separate it from its
operands. E.g. "A + 1", not "A+1".
- A unary operator (other than "." and "->") have no space between it
and its operand. E.g. "(char *)ptr", not "(char *) ptr".
- Do not explicitly compare an integral value with constant 0 or '\0',
or a pointer value with constant NULL. For instance, to validate that
counted array <ptr, cnt> is initialized but has no elements, write:
@ -429,8 +516,41 @@ For C programs:
detail.
- The first #include in C files, except in platform specific compat/
implementations, must be either "git-compat-util.h", "cache.h" or
"builtin.h". You do not have to include more than one of these.
implementations and sha1dc/, must be <git-compat-util.h>. This
header file insulates other header files and source files from
platform differences, like which system header files must be
included in what order, and what C preprocessor feature macros must
be defined to trigger certain features we expect out of the system.
A collorary to this is that C files should not directly include
system header files themselves.
There are some exceptions, because certain group of files that
implement an API all have to include the same header file that
defines the API and it is convenient to include <git-compat-util.h>
there. Namely:
- the implementation of the built-in commands in the "builtin/"
directory that include "builtin.h" for the cmd_foo() prototype
definition,
- the test helper programs in the "t/helper/" directory that include
"t/helper/test-tool.h" for the cmd__foo() prototype definition,
- the xdiff implementation in the "xdiff/" directory that includes
"xdiff/xinclude.h" for the xdiff machinery internals,
- the unit test programs in "t/unit-tests/" directory that include
"t/unit-tests/test-lib.h" that gives them the unit-tests
framework, and
- the source files that implement reftable in the "reftable/"
directory that include "reftable/system.h" for the reftable
internals,
are allowed to assume that they do not have to include
<git-compat-util.h> themselves, as it is included as the first
'#include' in these header files. These headers must be the first
header file to be "#include"d in them, though.
- A C file must directly include the header files that declare the
functions and the types it uses, except for the functions and types
@ -463,13 +583,63 @@ For C programs:
Run `GIT_DEBUGGER=1 ./bin-wrappers/git foo` to simply use gdb as is, or
run `GIT_DEBUGGER="<debugger> <debugger-args>" ./bin-wrappers/git foo` to
use your own debugger and arguments. Example: `GIT_DEBUGGER="ddd --gdb"
./bin-wrappers/git log` (See `wrap-for-bin.sh`.)
./bin-wrappers/git log` (See `bin-wrappers/wrap-for-bin.sh`.)
- The primary data structure that a subsystem 'S' deals with is called
`struct S`. Functions that operate on `struct S` are named
`S_<verb>()` and should generally receive a pointer to `struct S` as
first parameter. E.g.
struct strbuf;
void strbuf_add(struct strbuf *buf, ...);
void strbuf_reset(struct strbuf *buf);
is preferred over:
struct strbuf;
void add_string(struct strbuf *buf, ...);
void reset_strbuf(struct strbuf *buf);
- There are several common idiomatic names for functions performing
specific tasks on a structure `S`:
- `S_init()` initializes a structure without allocating the
structure itself.
- `S_release()` releases a structure's contents without freeing the
structure.
- `S_clear()` is equivalent to `S_release()` followed by `S_init()`
such that the structure is directly usable after clearing it. When
`S_clear()` is provided, `S_init()` shall not allocate resources
that need to be released again.
- `S_free()` releases a structure's contents and frees the
structure.
- Function names should be clear and descriptive, accurately reflecting
their purpose or behavior. Arbitrary suffixes that do not add meaningful
context can lead to confusion, particularly for newcomers to the codebase.
Historically, the '_1' suffix has been used in situations where:
- A function handles one element among a group that requires similar
processing.
- A recursive function has been separated from its setup phase.
The '_1' suffix can be used as a concise way to indicate these specific
cases. However, it is recommended to find a more descriptive name wherever
possible to improve the readability and maintainability of the code.
For Perl programs:
- Most of the C guidelines above apply.
- We try to support Perl 5.8 and later ("use Perl 5.008").
- We try to support Perl 5.8.1 and later ("use Perl 5.008001").
- use strict and use warnings are strongly preferred.
@ -497,7 +667,7 @@ For Perl programs:
For Python scripts:
- We follow PEP-8 (http://www.python.org/dev/peps/pep-0008/).
- We follow PEP-8 (https://peps.python.org/pep-0008/).
- As a minimum, we aim to be compatible with Python 2.7.
@ -533,16 +703,30 @@ Program Output
Error Messages
- Do not end error messages with a full stop.
- Do not end a single-sentence error message with a full stop.
- Do not capitalize the first word, only because it is the first word
in the message ("unable to open %s", not "Unable to open %s"). But
in the message ("unable to open '%s'", not "Unable to open '%s'"). But
"SHA-3 not supported" is fine, because the reason the first word is
capitalized is not because it is at the beginning of the sentence,
but because the word would be spelled in capital letters even when
it appeared in the middle of the sentence.
- Say what the error is first ("cannot open %s", not "%s: cannot open")
- Say what the error is first ("cannot open '%s'", not "%s: cannot open").
- Enclose the subject of an error inside a pair of single quotes,
e.g. `die(_("unable to open '%s'"), path)`.
- Unless there is a compelling reason not to, error messages from
porcelain commands should be marked for translation, e.g.
`die(_("bad revision %s"), revision)`.
- Error messages from the plumbing commands are sometimes meant for
machine consumption and should not be marked for translation,
e.g., `die("bad revision %s", revision)`.
- BUG("message") are for communicating the specific error to developers,
thus should not be translated.
Externally Visible Names
@ -557,7 +741,7 @@ Externally Visible Names
. The variable name describes the effect of tweaking this knob.
The section and variable names that consist of multiple words are
formed by concatenating the words without punctuations (e.g. `-`),
formed by concatenating the words without punctuation marks (e.g. `-`),
and are broken using bumpyCaps in documentation as a hint to the
reader.
@ -571,7 +755,7 @@ Externally Visible Names
Writing Documentation:
Most (if not all) of the documentation pages are written in the
AsciiDoc format in *.txt files (e.g. Documentation/git.txt), and
AsciiDoc format in *.adoc files (e.g. Documentation/git.adoc), and
processed into HTML and manpages (e.g. git.html and git.1 in the
same directory).
@ -591,30 +775,30 @@ Writing Documentation:
- Prefer succinctness and matter-of-factly describing functionality
in the abstract. E.g.
--short:: Emit output in the short-format.
`--short`:: Emit output in the short-format.
and avoid something like these overly verbose alternatives:
--short:: Use this to emit output in the short-format.
--short:: You can use this to get output in the short-format.
--short:: A user who prefers shorter output could....
--short:: Should a person and/or program want shorter output, he
she/they/it can...
`--short`:: Use this to emit output in the short-format.
`--short`:: You can use this to get output in the short-format.
`--short`:: A user who prefers shorter output could....
`--short`:: Should a person and/or program want shorter output, he
she/they/it can...
This practice often eliminates the need to involve human actors in
your description, but it is a good practice regardless of the
avoidance of gendered pronouns.
- When it becomes awkward to stick to this style, prefer "you" when
addressing the the hypothetical user, and possibly "we" when
addressing the hypothetical user, and possibly "we" when
discussing how the program might react to the user. E.g.
You can use this option instead of --xyz, but we might remove
You can use this option instead of `--xyz`, but we might remove
support for it in future versions.
while keeping in mind that you can probably be less verbose, e.g.
Use this instead of --xyz. This option might be removed in future
Use this instead of `--xyz`. This option might be removed in future
versions.
- If you still need to refer to an example person that is
@ -632,70 +816,12 @@ Writing Documentation:
The same general rule as for code applies -- imitate the existing
conventions.
A few commented examples follow to provide reference when writing or
modifying command usage strings and synopsis sections in the manual
pages:
Placeholders are spelled in lowercase and enclosed in angle brackets:
<file>
--sort=<key>
--abbrev[=<n>]
Markup:
If a placeholder has multiple words, they are separated by dashes:
<new-branch-name>
--template=<template-directory>
Possibility of multiple occurrences is indicated by three dots:
<file>...
(One or more of <file>.)
Optional parts are enclosed in square brackets:
[<extra>]
(Zero or one <extra>.)
--exec-path[=<path>]
(Option with an optional argument. Note that the "=" is inside the
brackets.)
[<patch>...]
(Zero or more of <patch>. Note that the dots are inside, not
outside the brackets.)
Multiple alternatives are indicated with vertical bars:
[-q | --quiet]
[--utf8 | --no-utf8]
Parentheses are used for grouping:
[(<rev> | <range>)...]
(Any number of either <rev> or <range>. Parens are needed to make
it clear that "..." pertains to both <rev> and <range>.)
[(-p <parent>)...]
(Any number of option -p, each with one <parent> argument.)
git remote set-head <name> (-a | -d | <branch>)
(One and only one of "-a", "-d" or "<branch>" _must_ (no square
brackets) be provided.)
And a somewhat more contrived example:
--diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]
Here "=" is outside the brackets, because "--diff-filter=" is a
valid usage. "*" has its own pair of brackets, because it can
(optionally) be specified only when one or more of the letters is
also provided.
A note on notation:
Use 'git' (all lowercase) when talking about commands i.e. something
the user would type into a shell and use 'Git' (uppercase first letter)
when talking about the version control system and its properties.
A few commented examples follow to provide reference when writing or
modifying paragraphs or option/command explanations that contain options
or commands:
Literal examples (e.g. use of command-line options, command names,
Literal parts (e.g. use of command-line options, command names,
branch names, URLs, pathnames (files and directories), configuration and
environment variables) must be typeset in monospace (i.e. wrapped with
environment variables) must be typeset as verbatim (i.e. wrapped with
backticks):
`--pretty=oneline`
`git rev-list`
@ -704,6 +830,7 @@ Writing Documentation:
`.git/config`
`GIT_DIR`
`HEAD`
`umask`(2)
An environment variable must be prefixed with "$" only when referring to its
value and not when referring to the variable itself, in this case there is
@ -720,6 +847,99 @@ Writing Documentation:
Incorrect:
`\--pretty=oneline`
Placeholders are spelled in lowercase and enclosed in
angle brackets surrounded by underscores:
_<file>_
_<commit>_
If a placeholder has multiple words, they are separated by dashes:
_<new-branch-name>_
_<template-directory>_
When needed, use a distinctive identifier for placeholders, usually
made of a qualification and a type:
_<git-dir>_
_<key-id>_
Git's Asciidoc processor has been tailored to treat backticked text
as complex synopsis. When literal and placeholders are mixed, you can
use the backtick notation which will take care of correctly typesetting
the content.
`--jobs <n>`
`--sort=<key>`
`<directory>/.git`
`remote.<name>.mirror`
`ssh://[<user>@]<host>[:<port>]/<path-to-git-repo>`
As a side effect, backquoted placeholders are correctly typeset, but
this style is not recommended.
Synopsis Syntax
The synopsis (a paragraph with [synopsis] attribute) is automatically
formatted by the toolchain and does not need typesetting.
A few commented examples follow to provide reference when writing or
modifying command usage strings and synopsis sections in the manual
pages:
Possibility of multiple occurrences is indicated by three dots:
<file>...
(One or more of <file>.)
Optional parts are enclosed in square brackets:
[<file>...]
(Zero or more of <file>.)
An optional parameter needs to be typeset with unconstrained pairs
[<repository>]
--exec-path[=<path>]
(Option with an optional argument. Note that the "=" is inside the
brackets.)
[<patch>...]
(Zero or more of <patch>. Note that the dots are inside, not
outside the brackets.)
Multiple alternatives are indicated with vertical bars:
[-q | --quiet]
[--utf8 | --no-utf8]
Use spacing around "|" token(s), but not immediately after opening or
before closing a [] or () pair:
Do: [-q | --quiet]
Don't: [-q|--quiet]
Don't use spacing around "|" tokens when they're used to separate the
alternate arguments of an option:
Do: --track[=(direct|inherit)]
Don't: --track[=(direct | inherit)]
Parentheses are used for grouping:
[(<rev>|<range>)...]
(Any number of either <rev> or <range>. Parens are needed to make
it clear that "..." pertains to both <rev> and <range>.)
[(-p <parent>)...]
(Any number of option -p, each with one <parent> argument.)
git remote set-head <name> (-a|-d|<branch>)
(One and only one of "-a", "-d" or "<branch>" _must_ (no square
brackets) be provided.)
And a somewhat more contrived example:
--diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]
Here "=" is outside the brackets, because "--diff-filter=" is a
valid usage. "*" has its own pair of brackets, because it can
(optionally) be specified only when one or more of the letters is
also provided.
A note on notation:
Use 'git' (all lowercase) when talking about commands i.e. something
the user would type into a shell and use 'Git' (uppercase first letter)
when talking about the version control system and its properties.
If some place in the documentation needs to typeset a command usage
example with inline substitutions, it is fine to use +monospaced and
inline substituted text+ instead of `monospaced literal text`, and with

View File

@ -0,0 +1,74 @@
Decision-Making Process in the Git Project
==========================================
Introduction
------------
This document describes the current decision-making process in the Git
project. It is a descriptive rather than prescriptive doc; that is, we want to
describe how things work in practice rather than explicitly recommending any
particular process or changes to the current process.
Here we document how the project makes decisions for discussions
(with or without patches), in scale larger than an individual patch
series (which is fully covered by the SubmittingPatches document).
Larger Discussions (with patches)
---------------------------------
As with discussions on an individual patch series, starting a larger-scale
discussion often begins by sending a patch or series to the list. This might
take the form of an initial design doc, with implementation following in later
iterations of the series (for example,
link:https://lore.kernel.org/git/0169ce6fb9ccafc089b74ae406db0d1a8ff8ac65.1688165272.git.steadmon@google.com/[adding unit tests] or
link:https://lore.kernel.org/git/20200420235310.94493-1-emilyshaffer@google.com/[config-based hooks]),
or it might include a full implementation from the beginning.
In either case, discussion progresses the same way for an individual patch series,
until consensus is reached or the topic is dropped.
Larger Discussions (without patches)
------------------------------------
Occasionally, larger discussions might occur without an associated patch series.
These may be very large-scale technical decisions that are beyond the scope of
even a single large patch series, or they may be more open-ended,
policy-oriented discussions (examples:
link:https://lore.kernel.org/git/ZZ77NQkSuiRxRDwt@nand.local/[introducing Rust]
or link:https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com/[improving submodule UX]).
In either case, discussion progresses as described above for general patch series.
For larger discussions without a patch series or other concrete implementation,
it may be hard to judge when consensus has been reached, as there are not any
official guidelines. If discussion stalls at this point, it may be helpful to
restart discussion with an RFC patch series (such as a partial, unfinished
implementation or proof of concept) that can be more easily debated.
When consensus is reached that it is a good idea, the original
proposer is expected to coordinate the effort to make it happen,
with help from others who were involved in the discussion, as
needed.
For decisions that require code changes, it is often the case that the original
proposer will follow up with a patch series, although it is also common for
other interested parties to provide an implementation (or parts of the
implementation, for very large changes).
For non-technical decisions such as community norms or processes, it is up to
the community as a whole to implement and sustain agreed-upon changes.
The project leadership committee (PLC) may help the implementation of
policy decisions.
Other Discussion Venues
-----------------------
Occasionally decision proposals are presented off-list, e.g. at the semi-regular
Contributors' Summit. While higher-bandwidth face-to-face discussion is often
useful for quickly reaching consensus among attendees, generally we expect to
summarize the discussion in notes that can later be presented on-list. For an
example, see the thread
link:https://lore.kernel.org/git/AC2EB721-2979-43FD-922D-C5076A57F24B@jramsay.com.au/[Notes
from Git Contributor Summit, Los Angeles (April 5, 2020)] by James Ramsay.
We prefer that "official" discussion happens on the list so that the full
community has opportunity to engage in discussion. This also means that the
mailing list archives contain a more-or-less complete history of project
discussions and decisions.

View File

@ -1,6 +1,11 @@
# The default target of this Makefile is...
all::
# Import tree-wide shared Makefile behavior and libraries
include ../shared.mak
.PHONY: FORCE
# Guard against environment variables
MAN1_TXT =
MAN5_TXT =
@ -15,42 +20,56 @@ OBSOLETE_HTML =
-include GIT-EXCLUDED-PROGRAMS
MAN1_TXT += $(filter-out \
$(patsubst %,%.txt,$(EXCLUDED_PROGRAMS)) \
$(addsuffix .txt, $(ARTICLES) $(SP_ARTICLES)), \
$(wildcard git-*.txt))
MAN1_TXT += git.txt
MAN1_TXT += gitk.txt
MAN1_TXT += gitweb.txt
$(patsubst %,%.adoc,$(EXCLUDED_PROGRAMS)) \
$(addsuffix .adoc, $(ARTICLES) $(SP_ARTICLES)), \
$(wildcard git-*.adoc))
MAN1_TXT += git.adoc
MAN1_TXT += gitk.adoc
MAN1_TXT += gitweb.adoc
MAN1_TXT += scalar.adoc
# man5 / man7 guides (note: new guides should also be added to command-list.txt)
MAN5_TXT += gitattributes.txt
MAN5_TXT += githooks.txt
MAN5_TXT += gitignore.txt
MAN5_TXT += gitmailmap.txt
MAN5_TXT += gitmodules.txt
MAN5_TXT += gitrepository-layout.txt
MAN5_TXT += gitweb.conf.txt
MAN5_TXT += gitattributes.adoc
MAN5_TXT += gitformat-bundle.adoc
MAN5_TXT += gitformat-chunk.adoc
MAN5_TXT += gitformat-commit-graph.adoc
MAN5_TXT += gitformat-index.adoc
MAN5_TXT += gitformat-pack.adoc
MAN5_TXT += gitformat-signature.adoc
MAN5_TXT += githooks.adoc
MAN5_TXT += gitignore.adoc
MAN5_TXT += gitmailmap.adoc
MAN5_TXT += gitmodules.adoc
MAN5_TXT += gitprotocol-capabilities.adoc
MAN5_TXT += gitprotocol-common.adoc
MAN5_TXT += gitprotocol-http.adoc
MAN5_TXT += gitprotocol-pack.adoc
MAN5_TXT += gitprotocol-v2.adoc
MAN5_TXT += gitrepository-layout.adoc
MAN5_TXT += gitweb.conf.adoc
MAN7_TXT += gitcli.txt
MAN7_TXT += gitcore-tutorial.txt
MAN7_TXT += gitcredentials.txt
MAN7_TXT += gitcvs-migration.txt
MAN7_TXT += gitdiffcore.txt
MAN7_TXT += giteveryday.txt
MAN7_TXT += gitfaq.txt
MAN7_TXT += gitglossary.txt
MAN7_TXT += gitnamespaces.txt
MAN7_TXT += gitremote-helpers.txt
MAN7_TXT += gitrevisions.txt
MAN7_TXT += gitsubmodules.txt
MAN7_TXT += gittutorial-2.txt
MAN7_TXT += gittutorial.txt
MAN7_TXT += gitworkflows.txt
MAN7_TXT += gitcli.adoc
MAN7_TXT += gitcore-tutorial.adoc
MAN7_TXT += gitcredentials.adoc
MAN7_TXT += gitcvs-migration.adoc
MAN7_TXT += gitdiffcore.adoc
MAN7_TXT += giteveryday.adoc
MAN7_TXT += gitfaq.adoc
MAN7_TXT += gitglossary.adoc
MAN7_TXT += gitpacking.adoc
MAN7_TXT += gitnamespaces.adoc
MAN7_TXT += gitremote-helpers.adoc
MAN7_TXT += gitrevisions.adoc
MAN7_TXT += gitsubmodules.adoc
MAN7_TXT += gittutorial-2.adoc
MAN7_TXT += gittutorial.adoc
MAN7_TXT += gitworkflows.adoc
HOWTO_TXT += $(wildcard howto/*.txt)
HOWTO_TXT += $(wildcard howto/*.adoc)
DOC_DEP_TXT += $(wildcard *.txt)
DOC_DEP_TXT += $(wildcard config/*.txt)
DOC_DEP_TXT += $(wildcard *.adoc)
DOC_DEP_TXT += $(wildcard config/*.adoc)
DOC_DEP_TXT += $(wildcard includes/*.adoc)
ifdef MAN_FILTER
MAN_TXT = $(filter $(MAN_FILTER),$(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT))
@ -59,8 +78,8 @@ MAN_TXT = $(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT)
MAN_FILTER = $(MAN_TXT)
endif
MAN_XML = $(patsubst %.txt,%.xml,$(MAN_TXT))
MAN_HTML = $(patsubst %.txt,%.html,$(MAN_TXT))
MAN_XML = $(patsubst %.adoc,%.xml,$(MAN_TXT))
MAN_HTML = $(patsubst %.adoc,%.html,$(MAN_TXT))
GIT_MAN_REF = master
OBSOLETE_HTML += everyday.html
@ -87,35 +106,32 @@ SP_ARTICLES += howto/rebase-from-internal-branch
SP_ARTICLES += howto/keep-canonical-history-correct
SP_ARTICLES += howto/maintain-git
SP_ARTICLES += howto/coordinate-embargoed-releases
API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))
API_DOCS = $(patsubst %.adoc,%,$(filter-out technical/api-index-skel.adoc technical/api-index.adoc, $(wildcard technical/api-*.adoc)))
SP_ARTICLES += $(API_DOCS)
TECH_DOCS += DecisionMaking
TECH_DOCS += ReviewingGuidelines
TECH_DOCS += MyFirstContribution
TECH_DOCS += MyFirstObjectWalk
TECH_DOCS += SubmittingPatches
TECH_DOCS += ToolsForGit
TECH_DOCS += technical/bitmap-format
TECH_DOCS += technical/bundle-format
TECH_DOCS += technical/cruft-packs
TECH_DOCS += technical/build-systems
TECH_DOCS += technical/bundle-uri
TECH_DOCS += technical/hash-function-transition
TECH_DOCS += technical/http-protocol
TECH_DOCS += technical/index-format
TECH_DOCS += technical/long-running-process-protocol
TECH_DOCS += technical/multi-pack-index
TECH_DOCS += technical/pack-format
TECH_DOCS += technical/pack-heuristics
TECH_DOCS += technical/pack-protocol
TECH_DOCS += technical/parallel-checkout
TECH_DOCS += technical/partial-clone
TECH_DOCS += technical/protocol-capabilities
TECH_DOCS += technical/protocol-common
TECH_DOCS += technical/protocol-v2
TECH_DOCS += technical/platform-support
TECH_DOCS += technical/racy-git
TECH_DOCS += technical/reftable
TECH_DOCS += technical/scalar
TECH_DOCS += technical/send-pack-pipeline
TECH_DOCS += technical/shallow
TECH_DOCS += technical/signature-format
TECH_DOCS += technical/trivial-merge
TECH_DOCS += technical/unit-tests
SP_ARTICLES += $(TECH_DOCS)
SP_ARTICLES += technical/api-index
@ -123,9 +139,9 @@ ARTICLES_HTML += $(patsubst %,%.html,$(ARTICLES) $(SP_ARTICLES))
HTML_FILTER ?= $(ARTICLES_HTML) $(OBSOLETE_HTML)
DOC_HTML = $(MAN_HTML) $(filter $(HTML_FILTER),$(ARTICLES_HTML) $(OBSOLETE_HTML))
DOC_MAN1 = $(patsubst %.txt,%.1,$(filter $(MAN_FILTER),$(MAN1_TXT)))
DOC_MAN5 = $(patsubst %.txt,%.5,$(filter $(MAN_FILTER),$(MAN5_TXT)))
DOC_MAN7 = $(patsubst %.txt,%.7,$(filter $(MAN_FILTER),$(MAN7_TXT)))
DOC_MAN1 = $(patsubst %.adoc,%.1,$(filter $(MAN_FILTER),$(MAN1_TXT)))
DOC_MAN5 = $(patsubst %.adoc,%.5,$(filter $(MAN_FILTER),$(MAN5_TXT)))
DOC_MAN7 = $(patsubst %.adoc,%.7,$(filter $(MAN_FILTER),$(MAN7_TXT)))
prefix ?= $(HOME)
bindir ?= $(prefix)/bin
@ -143,9 +159,7 @@ ASCIIDOC_EXTRA =
ASCIIDOC_HTML = xhtml11
ASCIIDOC_DOCBOOK = docbook
ASCIIDOC_CONF = -f asciidoc.conf
ASCIIDOC_COMMON = $(ASCIIDOC) $(ASCIIDOC_EXTRA) $(ASCIIDOC_CONF) \
-amanversion=$(GIT_VERSION) \
-amanmanual='Git Manual' -amansource='Git'
ASCIIDOC_COMMON = $(ASCIIDOC) $(ASCIIDOC_EXTRA) $(ASCIIDOC_CONF)
ASCIIDOC_DEPS = asciidoc.conf GIT-ASCIIDOCFLAGS
TXT_TO_HTML = $(ASCIIDOC_COMMON) -b $(ASCIIDOC_HTML)
TXT_TO_XML = $(ASCIIDOC_COMMON) -b $(ASCIIDOC_DOCBOOK)
@ -170,6 +184,10 @@ endif
-include ../config.mak.autogen
-include ../config.mak
# Set GIT_VERSION_OVERRIDE such that version_gen knows to substitute
# GIT_VERSION in case it was set by the user.
GIT_VERSION_OVERRIDE := $(GIT_VERSION)
ifndef NO_MAN_BOLD_LITERAL
XMLTO_EXTRA += -m manpage-bold-literal.xsl
endif
@ -183,15 +201,7 @@ endif
ifndef MAN_BASE_URL
MAN_BASE_URL = file://$(htmldir)/
endif
XMLTO_EXTRA += -m manpage-base-url.xsl
# If your target system uses GNU groff, it may try to render
# apostrophes as a "pretty" apostrophe using unicode. This breaks
# cut&paste, so you should set GNU_ROFF to force them to be ASCII
# apostrophes. Unfortunately does not work with non-GNU roff.
ifdef GNU_ROFF
XMLTO_EXTRA += -m manpage-quote-apos.xsl
endif
XMLTO_EXTRA += --stringparam man.base.url.for.relative.links='$(MAN_BASE_URL)'
ifdef USE_ASCIIDOCTOR
ASCIIDOC = asciidoctor
@ -201,16 +211,30 @@ ASCIIDOC_DOCBOOK = docbook5
ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
ASCIIDOC_EXTRA += -adocinfo=shared
ASCIIDOC_DEPS = asciidoctor-extensions.rb GIT-ASCIIDOCFLAGS
DBLATEX_COMMON =
XMLTO_EXTRA += --skip-validation
XMLTO_EXTRA += -x manpage.xsl
asciidoctor-extensions.rb: asciidoctor-extensions.rb.in FORCE
$(QUIET_GEN)$(call version_gen,"$(shell pwd)/..",$<,$@)
else
asciidoc.conf: asciidoc.conf.in FORCE
$(QUIET_GEN)$(call version_gen,"$(shell pwd)/..",$<,$@)
endif
ifdef WITH_BREAKING_CHANGES
ASCIIDOC_EXTRA += -awith-breaking-changes
endif
ASCIIDOC_DEPS += docinfo.html
SHELL_PATH ?= $(SHELL)
# Shell quote;
SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
ASCIIDOC_EXTRA += -abuild_dir='$(shell pwd)'
ifdef DEFAULT_PAGER
DEFAULT_PAGER_SQ = $(subst ','\'',$(DEFAULT_PAGER))
ASCIIDOC_EXTRA += -a 'git-default-pager=$(DEFAULT_PAGER_SQ)'
@ -221,7 +245,7 @@ DEFAULT_EDITOR_SQ = $(subst ','\'',$(DEFAULT_EDITOR))
ASCIIDOC_EXTRA += -a 'git-default-editor=$(DEFAULT_EDITOR_SQ)'
endif
all: html man
all:: html man
html: $(DOC_HTML)
@ -261,57 +285,46 @@ install-pdf: pdf
install-html: html
'$(SHELL_PATH_SQ)' ./install-webdoc.sh $(DESTDIR)$(htmldir)
../GIT-VERSION-FILE: FORCE
$(QUIET_SUBDIR0)../ $(QUIET_SUBDIR1) GIT-VERSION-FILE
ifneq ($(filter-out lint-docs clean,$(MAKECMDGOALS)),)
-include ../GIT-VERSION-FILE
endif
mergetools_txt = mergetools-diff.adoc mergetools-merge.adoc
#
# Determine "include::" file references in asciidoc files.
#
docdep_prereqs = \
mergetools-list.made $(mergetools_txt) \
$(mergetools_txt) \
cmd-list.made $(cmds_txt)
doc.dep : $(docdep_prereqs) $(DOC_DEP_TXT) build-docdep.perl
$(QUIET_GEN)$(PERL_PATH) ./build-docdep.perl >$@ $(QUIET_STDERR)
$(QUIET_GEN)$(PERL_PATH) ./build-docdep.perl "$(shell pwd)" >$@ $(QUIET_STDERR)
ifneq ($(MAKECMDGOALS),clean)
-include doc.dep
endif
cmds_txt = cmds-ancillaryinterrogators.txt \
cmds-ancillarymanipulators.txt \
cmds-mainporcelain.txt \
cmds-plumbinginterrogators.txt \
cmds-plumbingmanipulators.txt \
cmds-synchingrepositories.txt \
cmds-synchelpers.txt \
cmds-guide.txt \
cmds-purehelpers.txt \
cmds-foreignscminterface.txt
cmds_txt = cmds-ancillaryinterrogators.adoc \
cmds-ancillarymanipulators.adoc \
cmds-mainporcelain.adoc \
cmds-plumbinginterrogators.adoc \
cmds-plumbingmanipulators.adoc \
cmds-synchingrepositories.adoc \
cmds-synchelpers.adoc \
cmds-guide.adoc \
cmds-developerinterfaces.adoc \
cmds-userinterfaces.adoc \
cmds-purehelpers.adoc \
cmds-foreignscminterface.adoc
$(cmds_txt): cmd-list.made
cmd-list.made: cmd-list.perl ../command-list.txt $(MAN1_TXT)
$(QUIET_GEN)$(PERL_PATH) ./cmd-list.perl ../command-list.txt $(cmds_txt) $(QUIET_STDERR) && \
$(QUIET_GEN)$(PERL_PATH) ./cmd-list.perl .. . $(cmds_txt) && \
date >$@
mergetools_txt = mergetools-diff.txt mergetools-merge.txt
$(mergetools_txt): mergetools-list.made
mergetools-list.made: ../git-mergetool--lib.sh $(wildcard ../mergetools/*)
$(QUIET_GEN) \
$(SHELL_PATH) -c 'MERGE_TOOLS_DIR=../mergetools && TOOL_MODE=diff && \
. ../git-mergetool--lib.sh && \
show_tool_names can_diff' | sed -e "s/\([a-z0-9]*\)/\`\1\`;;/" >mergetools-diff.txt && \
$(SHELL_PATH) -c 'MERGE_TOOLS_DIR=../mergetools && TOOL_MODE=merge && \
. ../git-mergetool--lib.sh && \
show_tool_names can_merge' | sed -e "s/\([a-z0-9]*\)/\`\1\`;;/" >mergetools-merge.txt && \
date >$@
mergetools-%.adoc: generate-mergetool-list.sh ../git-mergetool--lib.sh $(wildcard ../mergetools/*)
mergetools-diff.adoc:
$(QUIET_GEN)$(SHELL_PATH) ./generate-mergetool-list.sh .. diff $@
mergetools-merge.adoc:
$(QUIET_GEN)$(SHELL_PATH) ./generate-mergetool-list.sh .. merge $@
TRACK_ASCIIDOCFLAGS = $(subst ','\'',$(ASCIIDOC_COMMON):$(ASCIIDOC_HTML):$(ASCIIDOC_DOCBOOK))
@ -327,41 +340,49 @@ clean:
$(RM) *.xml *.xml+ *.html *.html+ *.1 *.5 *.7
$(RM) *.texi *.texi+ *.texi++ git.info gitman.info
$(RM) *.pdf
$(RM) howto-index.txt howto/*.html doc.dep
$(RM) technical/*.html technical/api-index.txt
$(RM) SubmittingPatches.txt
$(RM) howto-index.adoc howto/*.html doc.dep
$(RM) technical/*.html technical/api-index.adoc
$(RM) SubmittingPatches.adoc
$(RM) $(cmds_txt) $(mergetools_txt) *.made
$(RM) manpage-base-url.xsl
$(RM) GIT-ASCIIDOCFLAGS
$(RM) asciidoc.conf asciidoctor-extensions.rb
$(RM) -rf tmp-meson-diff
$(MAN_HTML): %.html : %.txt $(ASCIIDOC_DEPS)
docinfo.html: docinfo-html.in
$(QUIET_GEN)$(RM) $@ && cat $< >$@
$(MAN_HTML): %.html : %.adoc $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) -d manpage -o $@ $<
$(OBSOLETE_HTML): %.html : %.txto $(ASCIIDOC_DEPS)
$(OBSOLETE_HTML): %.html : %.adoco $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) -o $@ $<
manpage-base-url.xsl: manpage-base-url.xsl.in
$(QUIET_GEN)sed "s|@@MAN_BASE_URL@@|$(MAN_BASE_URL)|" $< > $@
manpage-prereqs := $(wildcard manpage*.xsl)
manpage-cmd = $(QUIET_XMLTO)$(XMLTO) -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $<
%.1 %.5 %.7 : %.xml manpage-base-url.xsl $(wildcard manpage*.xsl)
$(QUIET_XMLTO)$(XMLTO) -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $<
%.1 : %.xml $(manpage-prereqs)
$(manpage-cmd)
%.5 : %.xml $(manpage-prereqs)
$(manpage-cmd)
%.7 : %.xml $(manpage-prereqs)
$(manpage-cmd)
%.xml : %.txt $(ASCIIDOC_DEPS)
%.xml : %.adoc $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_XML) -d manpage -o $@ $<
user-manual.xml: user-manual.txt user-manual.conf asciidoctor-extensions.rb GIT-ASCIIDOCFLAGS
user-manual.xml: user-manual.adoc $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_XML) -d book -o $@ $<
technical/api-index.txt: technical/api-index-skel.txt \
technical/api-index.sh $(patsubst %,%.txt,$(API_DOCS))
$(QUIET_GEN)cd technical && '$(SHELL_PATH_SQ)' ./api-index.sh
technical/api-index.adoc: technical/api-index-skel.adoc \
technical/api-index.sh $(patsubst %,%.adoc,$(API_DOCS))
$(QUIET_GEN)'$(SHELL_PATH_SQ)' technical/api-index.sh ./technical ./technical/api-index.adoc
technical/%.html: ASCIIDOC_EXTRA += -a git-relative-html-prefix=../
$(patsubst %,%.html,$(API_DOCS) technical/api-index $(TECH_DOCS)): %.html : %.txt \
asciidoc.conf GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) $*.txt
$(patsubst %,%.html,$(API_DOCS) technical/api-index $(TECH_DOCS)): %.html : %.adoc \
$(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) $*.adoc
SubmittingPatches.txt: SubmittingPatches
SubmittingPatches.adoc: SubmittingPatches
$(QUIET_GEN) cp $< $@
XSLT = docbook.xsl
@ -395,19 +416,19 @@ gitman.texi: $(MAN_XML) cat-texi.perl texi.xsl
gitman.info: gitman.texi
$(QUIET_MAKEINFO)$(MAKEINFO) --no-split --no-validate $<
$(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml
$(patsubst %.adoc,%.texi,$(MAN_TXT)): %.texi : %.xml
$(QUIET_DB2TEXI)$(DOCBOOK2X_TEXI) --to-stdout $*.xml >$@
howto-index.txt: howto-index.sh $(HOWTO_TXT)
$(QUIET_GEN)'$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(HOWTO_TXT)) >$@
howto-index.adoc: howto/howto-index.sh $(HOWTO_TXT)
$(QUIET_GEN)'$(SHELL_PATH_SQ)' ./howto/howto-index.sh $(sort $(HOWTO_TXT)) >$@
$(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) $*.txt
$(patsubst %,%.html,$(ARTICLES)) : %.html : %.adoc $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC)$(TXT_TO_HTML) $*.adoc
WEBDOC_DEST = /pub/software/scm/git/docs
howto/%.html: ASCIIDOC_EXTRA += -a git-relative-html-prefix=../
$(patsubst %.txt,%.html,$(HOWTO_TXT)): %.html : %.txt GIT-ASCIIDOCFLAGS
$(patsubst %.adoc,%.html,$(HOWTO_TXT)): %.html : %.adoc $(ASCIIDOC_DEPS)
$(QUIET_ASCIIDOC) \
sed -e '1,/^$$/d' $< | \
$(TXT_TO_HTML) - >$@
@ -438,9 +459,9 @@ print-man1:
@for i in $(MAN1_TXT); do echo $$i; done
## Lint: gitlink
LINT_DOCS_GITLINK = $(patsubst %.txt,.build/lint-docs/gitlink/%.ok,$(HOWTO_TXT) $(DOC_DEP_TXT))
LINT_DOCS_GITLINK = $(patsubst %.adoc,.build/lint-docs/gitlink/%.ok,$(HOWTO_TXT) $(DOC_DEP_TXT))
$(LINT_DOCS_GITLINK): lint-gitlink.perl
$(LINT_DOCS_GITLINK): .build/lint-docs/gitlink/%.ok: %.txt
$(LINT_DOCS_GITLINK): .build/lint-docs/gitlink/%.ok: %.adoc
$(call mkdir_p_parent_template)
$(QUIET_LINT_GITLINK)$(PERL_PATH) lint-gitlink.perl \
$< \
@ -452,27 +473,57 @@ $(LINT_DOCS_GITLINK): .build/lint-docs/gitlink/%.ok: %.txt
lint-docs-gitlink: $(LINT_DOCS_GITLINK)
## Lint: man-end-blurb
LINT_DOCS_MAN_END_BLURB = $(patsubst %.txt,.build/lint-docs/man-end-blurb/%.ok,$(MAN_TXT))
LINT_DOCS_MAN_END_BLURB = $(patsubst %.adoc,.build/lint-docs/man-end-blurb/%.ok,$(MAN_TXT))
$(LINT_DOCS_MAN_END_BLURB): lint-man-end-blurb.perl
$(LINT_DOCS_MAN_END_BLURB): .build/lint-docs/man-end-blurb/%.ok: %.txt
$(LINT_DOCS_MAN_END_BLURB): .build/lint-docs/man-end-blurb/%.ok: %.adoc
$(call mkdir_p_parent_template)
$(QUIET_LINT_MANEND)$(PERL_PATH) lint-man-end-blurb.perl $< >$@
.PHONY: lint-docs-man-end-blurb
## Lint: man-section-order
LINT_DOCS_MAN_SECTION_ORDER = $(patsubst %.txt,.build/lint-docs/man-section-order/%.ok,$(MAN_TXT))
LINT_DOCS_MAN_SECTION_ORDER = $(patsubst %.adoc,.build/lint-docs/man-section-order/%.ok,$(MAN_TXT))
$(LINT_DOCS_MAN_SECTION_ORDER): lint-man-section-order.perl
$(LINT_DOCS_MAN_SECTION_ORDER): .build/lint-docs/man-section-order/%.ok: %.txt
$(LINT_DOCS_MAN_SECTION_ORDER): .build/lint-docs/man-section-order/%.ok: %.adoc
$(call mkdir_p_parent_template)
$(QUIET_LINT_MANSEC)$(PERL_PATH) lint-man-section-order.perl $< >$@
.PHONY: lint-docs-man-section-order
lint-docs-man-section-order: $(LINT_DOCS_MAN_SECTION_ORDER)
.PHONY: lint-docs-fsck-msgids
LINT_DOCS_FSCK_MSGIDS = .build/lint-docs/fsck-msgids.ok
$(LINT_DOCS_FSCK_MSGIDS): lint-fsck-msgids.perl
$(LINT_DOCS_FSCK_MSGIDS): ../fsck.h fsck-msgids.adoc
$(call mkdir_p_parent_template)
$(QUIET_GEN)$(PERL_PATH) lint-fsck-msgids.perl \
../fsck.h fsck-msgids.adoc $@
lint-docs-fsck-msgids: $(LINT_DOCS_FSCK_MSGIDS)
lint-docs-manpages:
$(QUIET_GEN)./lint-manpages.sh
.PHONY: lint-docs-meson
lint-docs-meson:
@# awk acts up when trying to match single quotes, so we use \047 instead.
@mkdir -p tmp-meson-diff && \
awk "/^manpages = {$$/ {flag=1 ; next } /^}$$/ { flag=0 } flag { gsub(/^ \047/, \"\"); gsub(/\047 : [157],\$$/, \"\"); print }" meson.build | \
grep -v -e '#' -e '^$$' | \
sort >tmp-meson-diff/meson.adoc && \
ls git*.adoc scalar.adoc | grep -v -e git-bisect-lk2009.adoc -e git-tools.adoc >tmp-meson-diff/actual.adoc && \
if ! cmp tmp-meson-diff/meson.adoc tmp-meson-diff/actual.adoc; then \
echo "Meson man pages differ from actual man pages:"; \
diff -u tmp-meson-diff/meson.adoc tmp-meson-diff/actual.adoc; \
exit 1; \
fi
## Lint: list of targets above
.PHONY: lint-docs
lint-docs: lint-docs-fsck-msgids
lint-docs: lint-docs-gitlink
lint-docs: lint-docs-man-end-blurb
lint-docs: lint-docs-man-section-order
lint-docs: lint-docs-manpages
lint-docs: lint-docs-meson
ifeq ($(wildcard po/Makefile),po/Makefile)
doc-l10n install-l10n::

View File

@ -21,7 +21,7 @@ This tutorial aims to summarize the following documents, but the reader may find
useful additional context:
- `Documentation/SubmittingPatches`
- `Documentation/howto/new-command.txt`
- `Documentation/howto/new-command.adoc`
[[getting-help]]
=== Getting Help
@ -35,8 +35,9 @@ announcements, design discussions, and more take place. Those interested in
contributing are welcome to post questions here. The Git list requires
plain-text-only emails and prefers inline and bottom-posting when replying to
mail; you will be CC'd in all replies to you. Optionally, you can subscribe to
the list by sending an email to majordomo@vger.kernel.org with "subscribe git"
in the body. The https://lore.kernel.org/git[archive] of this mailing list is
the list by sending an email to <git+subscribe@vger.kernel.org>
(see https://subspace.kernel.org/subscribing.html for details).
The https://lore.kernel.org/git[archive] of this mailing list is
available to view in a browser.
==== https://groups.google.com/forum/#!forum/git-mentoring[git-mentoring@googlegroups.com]
@ -160,10 +161,11 @@ in order to keep the declarations alphabetically sorted:
int cmd_psuh(int argc, const char **argv, const char *prefix);
----
Be sure to `#include "builtin.h"` in your `psuh.c`.
Be sure to `#include "builtin.h"` in your `psuh.c`. You'll also need to
`#include "gettext.h"` to use functions related to printing output text.
Go ahead and add some throwaway printf to that function. This is a decent
starting point as we can now add build rules and register the command.
Go ahead and add some throwaway printf to the `cmd_psuh` function. This is a
decent starting point as we can now add build rules and register the command.
NOTE: Your throwaway text, as well as much of the text you will be adding over
the course of this tutorial, is user-facing. That means it needs to be
@ -329,7 +331,7 @@ function body:
apply standard precedence rules. `git_config_get_string_tmp()` will look up
a specific key ("user.name") and give you the value. There are a number of
single-key lookup functions like this one; you can see them all (and more info
about how to use `git_config()`) in `Documentation/technical/api-config.txt`.
about how to use `git_config()`) in `Documentation/technical/api-config.adoc`.
You should see that the name printed matches the one you see when you run:
@ -459,10 +461,10 @@ $ ./bin-wrappers/git help psuh
Your new command is undocumented! Let's fix that.
Take a look at `Documentation/git-*.txt`. These are the manpages for the
Take a look at `Documentation/git-*.adoc`. These are the manpages for the
subcommands that Git knows about. You can open these up and take a look to get
acquainted with the format, but then go ahead and make a new file
`Documentation/git-psuh.txt`. Like with most of the documentation in the Git
`Documentation/git-psuh.adoc`. Like with most of the documentation in the Git
project, help pages are written with AsciiDoc (see CodingGuidelines, "Writing
Documentation" section). Use the following template to fill out your own
manpage:
@ -541,7 +543,7 @@ Try and run `./bin-wrappers/git psuh -h`. Your command should crash at the end.
That's because `-h` is a special case which your command should handle by
printing usage.
Take a look at `Documentation/technical/api-parse-options.txt`. This is a handy
Take a look at `Documentation/technical/api-parse-options.adoc`. This is a handy
tool for pulling out options you need to be able to handle, and it takes a
usage string.
@ -736,7 +738,7 @@ the {lore}[Git mailing list archive]:
2022-02-21 1:43 ` John Cai
2022-02-21 1:50 ` Taylor Blau
2022-02-23 19:50 ` John Cai
2022-02-18 20:00 ` // other replies ellided
2022-02-18 20:00 ` // other replies elided
2022-02-18 18:40 ` [PATCH 2/3] reflog: call reflog_delete from reflog.c John Cai via GitGitGadget
2022-02-18 19:15 ` Ævar Arnfjörð Bjarmason
2022-02-18 20:26 ` Junio C Hamano
@ -832,7 +834,7 @@ Johannes Schindelin to make life as a Git contributor easier for those used to
the GitHub PR workflow. It allows contributors to open pull requests against its
mirror of the Git project, and does some magic to turn the PR into a set of
emails and send them out for you. It also runs the Git continuous integration
suite for you. It's documented at http://gitgitgadget.github.io.
suite for you. It's documented at https://gitgitgadget.github.io/.
[[create-fork]]
=== Forking `git/git` on GitHub
@ -1086,14 +1088,14 @@ This gives reviewers a summary of what they're in for when reviewing your topic.
The one generated for `psuh` from the sample implementation looks like this:
----
Documentation/git-psuh.txt | 40 +++++++++++++++++++++
Makefile | 1 +
builtin.h | 1 +
builtin/psuh.c | 73 ++++++++++++++++++++++++++++++++++++++
git.c | 1 +
t/t9999-psuh-tutorial.sh | 12 +++++++
Documentation/git-psuh.adoc | 40 +++++++++++++++++++++
Makefile | 1 +
builtin.h | 1 +
builtin/psuh.c | 73 ++++++++++++++++++++++++++++++++++++++
git.c | 1 +
t/t9999-psuh-tutorial.sh | 12 +++++++
6 files changed, 128 insertions(+)
create mode 100644 Documentation/git-psuh.txt
create mode 100644 Documentation/git-psuh.adoc
create mode 100644 builtin/psuh.c
create mode 100755 t/t9999-psuh-tutorial.sh
----
@ -1114,6 +1116,15 @@ $ git send-email --to=target@example.com psuh/*.patch
NOTE: Check `git help send-email` for some other options which you may find
valuable, such as changing the Reply-to address or adding more CC and BCC lines.
:contrib-scripts: footnoteref:[contrib-scripts,Scripts under `contrib/` are +
not part of the core `git` binary and must be called directly. Clone the Git +
codebase and run `perl contrib/contacts/git-contacts`.]
NOTE: If you're not sure whom to CC, running `contrib/contacts/git-contacts` can
list potential reviewers. In addition, you can do `git send-email
--cc-cmd='perl contrib/contacts/git-contacts' feature/*.patch`{contrib-scripts} to
automatically pass this list of emails to `send-email`.
NOTE: When you are sending a real patch, it will go to git@vger.kernel.org - but
please don't send your patchset from the tutorial to the real mailing list! For
now, you can send it to yourself, to make sure you understand how it will look.
@ -1160,32 +1171,32 @@ all named like `v2-000n-my-commit-subject.patch`. `-v2` will also format
your patches by prefixing them with "[PATCH v2]" instead of "[PATCH]",
and your range-diff will be prefaced with "Range-diff against v1".
Afer you run this command, `format-patch` will output the patches to the `psuh/`
After you run this command, `format-patch` will output the patches to the `psuh/`
directory, alongside the v1 patches. Using a single directory makes it easy to
refer to the old v1 patches while proofreading the v2 patches, but you will need
to be careful to send out only the v2 patches. We will use a pattern like
"psuh/v2-*.patch" (not "psuh/*.patch", which would match v1 and v2 patches).
`psuh/v2-*.patch` (not `psuh/*.patch`, which would match v1 and v2 patches).
Edit your cover letter again. Now is a good time to mention what's different
between your last version and now, if it's something significant. You do not
need the exact same body in your second cover letter; focus on explaining to
reviewers the changes you've made that may not be as visible.
You will also need to go and find the Message-Id of your previous cover letter.
You will also need to go and find the Message-ID of your previous cover letter.
You can either note it when you send the first series, from the output of `git
send-email`, or you can look it up on the
https://lore.kernel.org/git[mailing list]. Find your cover letter in the
archives, click on it, then click "permalink" or "raw" to reveal the Message-Id
archives, click on it, then click "permalink" or "raw" to reveal the Message-ID
header. It should match:
----
Message-Id: <foo.12345.author@example.com>
Message-ID: <foo.12345.author@example.com>
----
Your Message-Id is `<foo.12345.author@example.com>`. This example will be used
below as well; make sure to replace it with the correct Message-Id for your
**previous cover letter** - that is, if you're sending v2, use the Message-Id
from v1; if you're sending v3, use the Message-Id from v2.
Your Message-ID is `<foo.12345.author@example.com>`. This example will be used
below as well; make sure to replace it with the correct Message-ID for your
**previous cover letter** - that is, if you're sending v2, use the Message-ID
from v1; if you're sending v3, use the Message-ID from v2.
While you're looking at the email, you should also note who is CC'd, as it's
common practice in the mailing list to keep all CCs on a thread. You can add
@ -1256,6 +1267,38 @@ index 88f126184c..38da593a60 100644
[[now-what]]
== My Patch Got Emailed - Now What?
Please give reviewers enough time to process your initial patch before
sending an updated version. That is, resist the temptation to send a new
version immediately, because others may have already started reviewing
your initial version.
While waiting for review comments, you may find mistakes in your initial
patch, or perhaps realize a different and better way to achieve the goal
of the patch. In this case you may communicate your findings to other
reviewers as follows:
- If the mistakes you found are minor, send a reply to your patch as if
you were a reviewer and mention that you will fix them in an
updated version.
- On the other hand, if you think you want to change the course so
drastically that reviews on the initial patch would be a waste of
time (for everyone involved), retract the patch immediately with
a reply like "I am working on a much better approach, so please
ignore this patch and wait for the updated version."
Now, the above is a good practice if you sent your initial patch
prematurely without polish. But a better approach of course is to avoid
sending your patch prematurely in the first place.
Please be considerate of the time needed by reviewers to examine each
new version of your patch. Rather than seeing the initial version right
now (followed by several "oops, I like this version better than the
previous one" patches over 2 days), reviewers would strongly prefer if a
single polished version came 2 days later instead, and that version with
fewer mistakes were the only one they would need to review.
[[reviewing]]
=== Responding to Reviews

View File

@ -15,7 +15,7 @@ revision walk is used for operations like `git log`.
=== Related Reading
- `Documentation/user-manual.txt` under "Hacking Git" contains some coverage of
- `Documentation/user-manual.adoc` under "Hacking Git" contains some coverage of
the revision walker in its various incarnations.
- `revision.h`
- https://eagain.net/articles/git-for-computer-scientists/[Git for Computer Scientists]
@ -41,6 +41,7 @@ Open up a new file `builtin/walken.c` and set up the command handler:
*/
#include "builtin.h"
#include "trace.h"
int cmd_walken(int argc, const char **argv, const char *prefix)
{
@ -49,12 +50,13 @@ int cmd_walken(int argc, const char **argv, const char *prefix)
}
----
NOTE: `trace_printf()` differs from `printf()` in that it can be turned on or
off at runtime. For the purposes of this tutorial, we will write `walken` as
though it is intended for use as a "plumbing" command: that is, a command which
is used primarily in scripts, rather than interactively by humans (a "porcelain"
command). So we will send our debug output to `trace_printf()` instead. When
running, enable trace output by setting the environment variable `GIT_TRACE`.
NOTE: `trace_printf()`, defined in `trace.h`, differs from `printf()` in
that it can be turned on or off at runtime. For the purposes of this
tutorial, we will write `walken` as though it is intended for use as
a "plumbing" command: that is, a command which is used primarily in
scripts, rather than interactively by humans (a "porcelain" command).
So we will send our debug output to `trace_printf()` instead.
When running, enable trace output by setting the environment variable `GIT_TRACE`.
Add usage text and `-h` handling, like all subcommands should consistently do
(our test suite will notice and complain if you fail to do so).
@ -110,7 +112,7 @@ $ GIT_TRACE=1 ./bin-wrappers/git walken
----
NOTE: For a more exhaustive overview of the new command process, take a look at
`Documentation/MyFirstContribution.txt`.
`Documentation/MyFirstContribution.adoc`.
NOTE: A reference implementation can be found at
https://github.com/nasamuffin/git/tree/revwalk.
@ -124,13 +126,13 @@ parameters provided by the user over the CLI.
`nr` represents the number of `rev_cmdline_entry` present in the array.
`alloc` is used by the `ALLOC_GROW` macro. Check `cache.h` - this variable is
`alloc` is used by the `ALLOC_GROW` macro. Check `alloc.h` - this variable is
used to track the allocated size of the list.
Per entry, we find:
`item` is the object provided upon which to base the object walk. Items in Git
can be blobs, trees, commits, or tags. (See `Documentation/gittutorial-2.txt`.)
can be blobs, trees, commits, or tags. (See `Documentation/gittutorial-2.adoc`.)
`name` is the object ID (OID) of the object - a hex string you may be familiar
with from using Git to organize your source in the past. Check the tutorial
@ -139,7 +141,7 @@ from.
`whence` indicates some information about what to do with the parents of the
specified object. We'll explore this flag more later on; take a look at
`Documentation/revisions.txt` to get an idea of what could set the `whence`
`Documentation/revisions.adoc` to get an idea of what could set the `whence`
value.
`flags` are used to hint the beginning of the revision walk and are the first
@ -151,7 +153,7 @@ can be used during the walk, as well.
This one is quite a bit longer, and many fields are only used during the walk
by `revision.c` - not configuration options. Most of the configurable flags in
`struct rev_info` have a mirror in `Documentation/rev-list-options.txt`. It's a
`struct rev_info` have a mirror in `Documentation/rev-list-options.adoc`. It's a
good idea to take some time and read through that document.
== Basic Commit Walk
@ -208,13 +210,14 @@ We'll also need to include the `config.h` header:
...
static int git_walken_config(const char *var, const char *value, void *cb)
static int git_walken_config(const char *var, const char *value,
const struct config_context *ctx, void *cb)
{
/*
* For now, we don't have any custom configuration, so fall back to
* the default config.
*/
return git_default_config(var, value, cb);
return git_default_config(var, value, ctx, cb);
}
----
@ -341,6 +344,10 @@ the walk loop below the `prepare_revision_walk()` call within your
`walken_commit_walk()`:
----
#include "pretty.h"
...
static void walken_commit_walk(struct rev_info *rev)
{
struct commit *commit;
@ -383,10 +390,11 @@ modifying `rev_info.grep_filter`, which is a `struct grep_opt`.
First some setup. Add `grep_config()` to `git_walken_config()`:
----
static int git_walken_config(const char *var, const char *value, void *cb)
static int git_walken_config(const char *var, const char *value,
const struct config_context *ctx, void *cb)
{
grep_config(var, value, cb);
return git_default_config(var, value, cb);
grep_config(var, value, ctx, cb);
return git_default_config(var, value, ctx, cb);
}
----
@ -517,7 +525,7 @@ about each one.
We can base our work on an example. `git pack-objects` prepares all kinds of
objects for packing into a bitmap or packfile. The work we are interested in
resides in `builtins/pack-objects.c:get_object_list()`; examination of that
resides in `builtin/pack-objects.c:get_object_list()`; examination of that
function shows that the all-object walk is being performed by
`traverse_commit_list()` or `traverse_commit_list_filtered()`. Those two
functions reside in `list-objects.c`; examining the source shows that, despite
@ -534,7 +542,7 @@ the arguments to `traverse_commit_list()`.
- `void *show_data`: A context buffer which is passed in turn to `show_commit`
and `show_object`.
In addition, `traverse_commit_list_filtered()` has an additional paramter:
In addition, `traverse_commit_list_filtered()` has an additional parameter:
- `struct oidset *omitted`: A linked-list of object IDs which the provided
filter caused to be omitted.
@ -702,7 +710,7 @@ objects grows along with the Git project.
=== Adding a Filter
There are a handful of filters that we can apply to the object walk laid out in
`Documentation/rev-list-options.txt`. These filters are typically useful for
`Documentation/rev-list-options.adoc`. These filters are typically useful for
operations such as creating packfiles or performing a partial clone. They are
defined in `list-objects-filter-options.h`. For the purposes of this tutorial we
will use the "tree:1" filter, which causes the walk to omit all trees and blobs
@ -726,8 +734,8 @@ walk we've just performed:
} else {
trace_printf(
_("Filtered object walk with filterspec 'tree:1'.\n"));
CALLOC_ARRAY(rev->filter, 1);
parse_list_objects_filter(rev->filter, "tree:1");
parse_list_objects_filter(&rev->filter, "tree:1");
}
traverse_commit_list(rev, walken_show_commit,
walken_show_object, NULL);
@ -746,14 +754,20 @@ points to the same tree object as its grandparent.)
=== Counting Omitted Objects
We also have the capability to enumerate all objects which were omitted by a
filter, like with `git log --filter=<spec> --filter-print-omitted`. Asking
`traverse_commit_list_filtered()` to populate the `omitted` list means that our
object walk does not perform any better than an unfiltered object walk; all
reachable objects are walked in order to populate the list.
filter, like with `git log --filter=<spec> --filter-print-omitted`. To do this,
change `traverse_commit_list()` to `traverse_commit_list_filtered()`, which is
able to populate an `omitted` list. Asking for this list of filtered objects
may cause performance degradations, however, because in this case, despite
filtering objects, the possibly much larger set of all reachable objects must
be processed in order to populate that list.
First, add the `struct oidset` and related items we will use to iterate it:
----
#include "oidset.h"
...
static void walken_object_walk(
...
@ -766,8 +780,9 @@ static void walken_object_walk(
...
----
Modify the call to `traverse_commit_list_filtered()` to include your `omitted`
object:
Replace the call to `traverse_commit_list()` with
`traverse_commit_list_filtered()` and pass a pointer to the `omitted` oidset
defined and initialized above:
----
...
@ -805,6 +820,10 @@ just walks of commits. First, we'll make our handlers chattier - modify
go:
----
#include "hex.h"
...
static void walken_show_commit(struct commit *cmt, void *buf)
{
trace_printf("commit: %s\n", oid_to_hex(&cmt->object.oid));
@ -829,7 +848,7 @@ those lines without having to recompile.
With only that change, run again (but save yourself some scrollback):
----
$ GIT_TRACE=1 ./bin-wrappers/git walken | head -n 10
$ GIT_TRACE=1 ./bin-wrappers/git walken 2>&1 | head -n 10
----
Take a look at the top commit with `git show` and the object ID you printed; it
@ -857,7 +876,7 @@ of the first handful:
----
$ make
$ GIT_TRACE=1 ./bin-wrappers git walken | tail -n 10
$ GIT_TRACE=1 ./bin-wrappers/git walken 2>&1 | tail -n 10
----
The last commit object given should have the same OID as the one we saw at the

View File

@ -10,7 +10,7 @@ To ease the transition plan, the receiving repository of such a
push running this release will issue a big warning when the
configuration variable is missing. Please refer to:
http://git.or.cz/gitwiki/GitFaq#non-bare
https://archive.kernel.org/oldwiki/git.wiki.kernel.org/index.php/GitFaq.html#non-bare
https://lore.kernel.org/git/7vbptlsuyv.fsf@gitster.siamese.dyndns.org/
for more details on the reason why this change is needed and the

View File

@ -10,7 +10,7 @@ To ease the transition plan, the receiving repository of such a
push running this release will issue a big warning when the
configuration variable is missing. Please refer to:
http://git.or.cz/gitwiki/GitFaq#non-bare
https://archive.kernel.org/oldwiki/git.wiki.kernel.org/index.php/GitFaq.html#non-bare
https://lore.kernel.org/git/7vbptlsuyv.fsf@gitster.siamese.dyndns.org/
for more details on the reason why this change is needed and the

Some files were not shown because too many files have changed in this diff Show More