Commit Graph

71893 Commits

Author SHA1 Message Date
6c26da8404 credential: erase all matching credentials
`credential reject` sends the erase action to each helper, but the
exact behaviour of erase isn't specified in documentation or tests.
Some helpers (such as credential-store and credential-libsecret) delete
all matching credentials, others (such as credential-cache) delete at
most one matching credential.

Test that helpers erase all matching credentials. This behaviour is
easiest to reason about. Users expect that `echo
"url=https://example.com" | git credential reject` or `echo
"url=https://example.com\nusername=tim" | git credential reject` erase
all matching credentials.

Fix credential-cache.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 13:26:41 -07:00
aeb21ce22e credential: avoid erasing distinct password
Test that credential helpers do not erase a password distinct from the
input. Such calls can happen when multiple credential helpers are
configured.

Fixes for credential-cache and credential-store.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 13:26:39 -07:00
c40f0b7877 revision: handle pseudo-opts in --stdin mode
While both git-rev-list(1) and git-log(1) support `--stdin`, it only
accepts commits and files. Most notably, it is impossible to pass any of
the pseudo-opts like `--all`, `--glob=` or others via stdin.

This makes it hard to use this function in certain scripted scenarios,
like when one wants to support queries against specific revisions, but
also against reference patterns. While this is theoretically possible by
using arguments, this may run into issues once we hit platform limits
with sufficiently large queries. And because `--stdin` cannot handle
pseudo-opts, the only alternative would be to use a mixture of arguments
and standard input, which is cumbersome.

Implement support for handling pseudo-opts in both commands to support
this usecase better. One notable restriction here is that `--stdin` only
supports "stuck" arguments in the form of `--glob=foo`. This is because
"unstuck" arguments would also require us to read the next line, which
would add quite some complexity to the code. This restriction should be
fine for scripted usage though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 12:09:31 -07:00
af37a209ad revision: small readability improvement for reading from stdin
The code that reads lines from standard input manually compares whether
the read line matches "--", which is a bit awkward to read. Furthermore,
we're about to extend the code to also support reading pseudo-options
via standard input, requiring more elaborate handling of lines with a
leading dash.

Refactor the code by hoisting out the check for "--" outside of the
block that checks for a leading dash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 12:09:31 -07:00
cc8045018d revision: reorder read_revisions_from_stdin()
Reorder `read_revisions_from_stdin()` so that we can start using
`handle_revision_pseudo_opt()` without a forward declaration in a
subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 12:09:31 -07:00
3744ffcbcd ls-tree: fix documentation of %x format placeholder
ls-tree --format expands %x followed by two hexadecimal digits to the
character indicated by that hexadecimal number, e.g.:

   $ git ls-tree --format=%x41 HEAD | head -1
   A

It rejects % followed by a hexadecimal digit, e.g.:

   $ git ls-tree --format=%41 HEAD | head -1
   fatal: bad ls-tree format: element '41' does not start with '('

This functionality is provided by strbuf_expand_literal_cb(), which has
not been changed since it was factored out by fd2015b323 (strbuf:
separate callback for strbuf_expand:ing literals, 2019-01-28).

Adjust the documentation accordingly.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 11:19:02 -07:00
d57fa7fc73 doc: trailer: add more examples in DESCRIPTION
Be more up-front about what trailers are in practice with examples, to
give the reader a visual cue while they go on to read the rest of the
description.

Also add an example for multiline values.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
eda2c44c8b doc: trailer: mention 'key' in DESCRIPTION
The 'key' option is used frequently in the examples at the bottom but
there is no mention of it in the description.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
dc8937fbb9 doc: trailer.<token>.command: emphasize deprecation
This puts the deprecation notice up front, instead of leaving it to the
next paragraph.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
8e80f2916b doc: trailer: use angle brackets for <token> and <value>
We already use angle brackets elsewhere, so this makes things more
consistent.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:20 -07:00
74a50fbd7f doc: trailer: remove redundant phrasing
The phrase "many rules" gets essentially repeated again with "many other
rules", so remove this repetition.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
229d6ab6bf doc: trailer: examples: avoid the word "message" by itself
Previously, "message" could mean the input, output, commit message, or
"internal body text inside the commit message" (in the EXAMPLES
section). Avoid overloading this term by using the appropriate meanings
explicitly.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
94f15fe5d5 doc: trailer: drop "commit message part" phrasing
The command can take inputs that are either just a commit message, or
an email-like output such as git-format-patch which includes a commit
message, "---" divider, and patch part. The existing explanation blends
these two inputs together in the first sentence

    This command reads some patches or commit messages

which then necessitates using the "commit message part" phrasing (as
opposed to just "commit message") because the input is ambiguous per the
above definition.

This change separates the two input types and explains them separately,
and so there is no longer a need to use the "commit message part"
phrase.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
00432a36e2 doc: trailer: swap verb order
This matches the order already used in the NAME section.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
bfb5f57bb3 doc: trailer: fix grammar
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 21:42:19 -07:00
f0b68f0546 CodingGuidelines: use octal escapes, not hex
Extend the shell-scripting section of CodingGuidelines to suggest octal
escape sequences (e.g. "\302\242") over hexadecimal (e.g. "\xc2\xa2")
since the latter can be a source of portability problems.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 14:44:19 -07:00
5768478edc diff-lib: honor override_submodule_config flag bit
When `diff.ignoreSubmodules = all` is set and submodule commits are
manually staged (e.g. via `git-update-index`), `git-commit` should
record the commit  with updated submodules.

`index_differs_from` is called from `prepare_to_commit` with flags set to
`override_submodule_config = 1`. `index_differs_from` then merges the
default diff flags and passed flags.

When `diff.ignoreSubmodules` is set to "all", `flags` ends up having
both `override_submodule_config` and `ignore_submodules` set to 1. This
results in `git-commit` ignoring staged commits.

This patch restores original `flags.ignore_submodule` if
`flags.override_submodule_config` is set.

Signed-off-by: Josip Sokcevic <sokcevic@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-14 11:28:12 -07:00
d7d8841f67 Start the 2.42 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-13 12:29:46 -07:00
32fe7fff0c Merge branch 'zh/ls-files-format-atoms'
Some atoms that can be used in "--format=<format>" for "git ls-tree"
were not supported by "git ls-files", even though they were relevant
in the context of the latter.

* zh/ls-files-format-atoms:
  ls-files: align format atoms with ls-tree
2023-06-13 12:29:46 -07:00
ca9c063c18 Merge branch 'sl/diff-tree-sparse'
"git diff-tree" has been taught to take advantage of the
sparse-index feature.

* sl/diff-tree-sparse:
  diff-tree: integrate with sparse index
2023-06-13 12:29:46 -07:00
e490bea8a6 Merge branch 'jk/format-patch-message-id-unleak'
Leakfix.

* jk/format-patch-message-id-unleak:
  format-patch: free elements of rev.ref_message_ids list
  format-patch: free rev.message_id when exiting
2023-06-13 12:29:46 -07:00
cbc882ea38 Merge branch 'jc/pack-ref-exclude-include'
"git pack-refs" learns "--include" and "--exclude" to tweak the ref
hierarchy to be packed using pattern matching.

* jc/pack-ref-exclude-include:
  pack-refs: teach pack-refs --include option
  pack-refs: teach --exclude option to exclude refs from being packed
  docs: clarify git-pack-refs --all will pack all refs
2023-06-13 12:29:45 -07:00
ebd07c9f7e Merge branch 'sa/doc-ls-remote'
Doc update.

* sa/doc-ls-remote:
  ls-remote doc: document the output format
  ls-remote doc: explain what each example does
  ls-remote doc: show peeled tags in examples
  ls-remote doc: remove redundant --tags example
  show-branch doc: say <ref>, not <reference>
  show-ref doc: update for internal consistency
2023-06-13 12:29:45 -07:00
4c7d878df6 Merge branch 'gc/doc-cocci-updates'
Update documentation regarding Coccinelle patches.

* gc/doc-cocci-updates:
  cocci: codify authoring and reviewing practices
  cocci: add headings to and reword README
2023-06-13 12:29:45 -07:00
6901ffe80c Merge branch 'jc/diff-s-with-other-options'
The "-s" (silent, squelch) option of the "diff" family of commands
did not interact with other options that specify the output format
well.  This has been cleaned up so that it will clear all the
formatting options given before.

* jc/diff-s-with-other-options:
  diff: fix interaction between the "-s" option and other options
2023-06-13 12:29:45 -07:00
6d2a88c728 Merge branch 'kh/keep-tag-editmsg-upon-failure'
"git tag" learned to leave the "$GIT_DIR/TAG_EDITMSG" file when the
command failed, so that the user can salvage what they typed.

* kh/keep-tag-editmsg-upon-failure:
  tag: keep the message file in case ref transaction fails
  t/t7004-tag: add regression test for successful tag creation
  doc: tag: document `TAG_EDITMSG`
2023-06-13 12:29:44 -07:00
861c56f6f9 branch: fix a leak in setup_tracking
The commit d3115660b4 (branch: add flags and config to inherit tracking,
2021-12-20) replaced in "struct tracking", the member "char *src" by a
new "struct string_list *srcs".

This caused a modification in find_tracked_branch().  The string
returned by remote_find_tracking(), previously assigned to "src", is now
added to the string_list "srcs".

That string_list is initialized with STRING_LIST_INIT_DUP, which means
that what is added is not the given string, but a duplicate.  Therefore,
the string returned by remote_find_tracking() is leaked.

The leak can be reviewed with:

   $ git branch foo
   $ git remote add local .
   $ git fetch local
   $ git branch --track bar local/foo

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in find_tracked_branch branch.c
       ... in for_each_remote remote.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Let's fix the leak, using the "_nodup" API to add to the string_list.
This way, the string itself will be added instead of being strdup()'d.
And when the string_list is cleared, the string will be freed.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:06:13 -07:00
caee1d669c branch: fix a leak in check_tracking_branch
Let's fix a leak we have in check_tracking_branch() since the function
was introduced in 41c21f22d0 (branch.c: Validate tracking branches with
refspecs instead of refs/remotes/*, 2013-04-21).

The leak can be reviewed with:

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --track bar local/foo

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in check_tracking_branch branch.c
       ... in for_each_remote remote.c
       ... in validate_remote_tracking_branch branch.c
       ... in dwim_branch_start branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:06:03 -07:00
a88a3d7cd7 branch: fix a leak in inherit_tracking
In d3115660b4 (branch: add flags and config to inherit tracking,
2021-12-20) a new option was introduced to allow creating a new branch,
inheriting the tracking of another branch.

The new code, strdup()'d the remote_name of the existing branch, but
unfortunately it was not freed, producing a leak.

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --track bar local/foo
   branch 'bar' set up to track 'local/foo'.
   $ git branch --track=inherit baz bar
   branch 'baz' set up to track 'local/foo'.

   Direct leak of 6 byte(s) in 1 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in inherit_tracking branch.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Actually, the string we're strdup()'ing is from the struct branch
returned by get_branch().  Which, in turn, retrieves the string from the
global "struct repository".  This makes perfectly valid to use the
string throughout the entire execution of create_branch().  There is no
need to duplicate it.

Let's fix the leak, removing the strdup().

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:05:44 -07:00
1533bda770 branch: fix a leak in dwim_and_setup_tracking
In e89f151db1 (branch: move --set-upstream-to behavior to
dwim_and_setup_tracking(), 2022-01-28) the string returned by
dwim_branch_start() was not freed, resulting in a memory leak.

It can be reviewed with:

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --set-upstream-to local/foo foo

   Direct leak of 23 byte(s) in 1 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in expand_ref refs.c
       ... in repo_dwim_ref refs.c
       ... in dwim_branch_start branch.c
       ... in dwim_and_setup_tracking branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Let's free it now.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:05:09 -07:00
4689101a40 remote: fix a leak in query_matches_negative_refspec
In c0192df630 (refspec: add support for negative refspecs, 2020-09-30)
query_matches_negative_refspec() was introduced.

The function was implemented as a two-loop process, where the former
loop accumulates and the latter evaluates.  To accumulate, a string_list
is used.

Within the first loop, there are three cases where a string is added to
the string_list.  Two of them add strings that do not need to be
freed.  But in the third case, the string added is returned by
match_name_with_pattern(), which needs to be freed.

The string_list is initialized with STRING_LIST_INIT_NODUP, i.e.  when
cleared, the strings added are not freed.  Therefore, the string
returned by match_name_with_pattern() is not freed, so we have a leak.

   $ git remote add local .
   $ git update-ref refs/remotes/local/foo HEAD
   $ git branch --track bar local/foo

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_matches_negative_refspec remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in find_tracked_branch branch.c
       ... in for_each_remote remote.c
       ... in setup_tracking branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

   Direct leak of 24 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_add strbuf.c
       ... in match_name_with_pattern remote.c
       ... in query_matches_negative_refspec remote.c
       ... in query_refspecs remote.c
       ... in remote_find_tracking remote.c
       ... in check_tracking_branch branch.c
       ... in for_each_remote remote.c
       ... in validate_remote_tracking_branch branch.c
       ... in dwim_branch_start branch.c
       ... in create_branch branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

An interesting point to note is that while string_list_append() is used
in the first two cases described, string_list_append_nodup() is used in
the third.  This seems to indicate an intention to delegate the
responsibility for freeing the string, to the string_list.  As if the
string_list had been initialized with STRING_LIST_INIT_DUP, i.e.  the
strings are strdup()'d when added (except if the "_nodup" API is used)
and freed when cleared.

Switching to STRING_LIST_INIT_DUP fixes the leak and probably is what we
wanted to do originally.  Let's do it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:04:28 -07:00
003c1f1171 config: fix a leak in git_config_copy_or_rename_section_in_file
In 52d59cc645 (branch: add a --copy (-c) option to go with --move (-m),
2017-06-18) a new strbuf variable was introduced, but not released.

Thus, when copying a branch that has any configuration, we have a
leak.

   $ git branch foo
   $ git config branch.foo.some-key some_value
   $ git branch -c foo bar

   Direct leak of 65 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in strbuf_grow strbuf.c
       ... in strbuf_vaddf strbuf.c
       ... in strbuf_addf strbuf.c
       ... in store_create_section config.c
       ... in git_config_copy_or_rename_section_in_file config.c
       ... in git_config_copy_section_in_file config.c
       ... in git_config_copy_section config.c
       ... in copy_or_rename_branch builtin/branch.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

Let's fix that leak.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 15:04:16 -07:00
06f3867865 pack-bitmap.c: gracefully degrade on failure to load MIDX'd pack
When opening a MIDX bitmap, we the pack-bitmap machinery eagerly calls
`prepare_midx_pack()` on each of the packs contained in the MIDX. This
is done in order to populate the array of `struct packed_git *`s held by
the MIDX, which we need later on in `load_reverse_index()`, since it
calls `load_pack_revindex()` on each of the MIDX'd packs, and requires
that the caller provide a pointer to a `struct packed_git`.

When opening one of these packs fails, the pack-bitmap code will `die()`
indicating that it can't open one of the packs in the MIDX. This
indicates that the MIDX is somehow broken with respect to the current
state of the repository. When this is the case, we indeed cannot make
use of the MIDX bitmap to speed up reachability traversals.

However, it does not mean that we can't perform reachability traversals
at all. In other failure modes, that same function calls `warning()` and
then returns -1, indicating to its caller (`open_bitmap()`) that we
should either look for a pack bitmap if one is available, or perform
normal object traversal without using bitmaps at all.

There is no reason why this case should cause us to die. If we instead
continued (by jumping to `cleanup` as this patch does) and avoid using
bitmaps altogether, we may again try and query the MIDX, which will also
fail. But when trying to call `fill_midx_entry()` fails, it also returns
a signal of its failure, and prompts the caller to try and locate the
object elsewhere.

In other words, the normal object traversal machinery works fine in the
presence of a corrupt MIDX, so there is no reason that the MIDX bitmap
machinery should abort in that case when we could easily continue.

Note that we *could* in theory try again to load a MIDX bitmap after
calling `reprepare_packed_git()`. Even though the `prepare_packed_git()`
code is careful to avoid adding a pack that we already have,
`prepare_midx_pack()` is not. So if we got part of the way through
calling `prepare_midx_pack()` on a stale MIDX, and then tried again on a
fresh MIDX that contains some of the same packs, we would end up with a
loop through the `->next` pointer.

For now, let's do the simplest thing possible and fallback to the
non-bitmap code when we detect a stale MIDX so that the complete fix as
above can be implemented carefully.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 14:19:31 -07:00
4dc16e2cb0 gc: introduce gc.recentObjectsHook
This patch introduces a new multi-valued configuration option,
`gc.recentObjectsHook` as a means to mark certain objects as recent (and
thus exempt from garbage collection), regardless of their age.

When performing a garbage collection operation on a repository with
unreachable objects, Git makes its decision on what to do with those
object(s) based on how recent the objects are or not. Generally speaking,
unreachable-but-recent objects stay in the repository, and older objects
are discarded.

However, we have no convenient way to keep certain precious, unreachable
objects around in the repository, even if they have aged out and would
be pruned. Our options today consist of:

  - Point references at the reachability tips of any objects you
    consider precious, which may be undesirable or infeasible if there
    are many such objects.

  - Track them via the reflog, which may be undesirable since the
    reflog's lifetime is limited to that of the reference it's tracking
    (and callers may want to keep those unreachable objects around for
    longer).

  - Extend the grace period, which may keep around other objects that
    the caller *does* want to discard.

  - Manually modify the mtimes of objects you want to keep. If those
    objects are already loose, this is easy enough to do (you can just
    enumerate and `touch -m` each one).

    But if they are packed, you will either end up modifying the mtimes
    of *all* objects in that pack, or be forced to write out a loose
    copy of that object, both of which may be undesirable. Even worse,
    if they are in a cruft pack, that requires modifying its `*.mtimes`
    file by hand, since there is no exposed plumbing for this.

  - Force the caller to construct the pack of objects they want
    to keep themselves, and then mark the pack as kept by adding a
    ".keep" file. This works, but is burdensome for the caller, and
    having extra packs is awkward as you roll forward your cruft pack.

This patch introduces a new option to the above list via the
`gc.recentObjectsHook` configuration, which allows the caller to
specify a program (or set of programs) whose output is treated as a set
of objects to treat as recent, regardless of their true age.

The implementation is straightforward. Git enumerates recent objects via
`add_unseen_recent_objects_to_traversal()`, which enumerates loose and
packed objects, and eventually calls add_recent_object() on any objects
for which `want_recent_object()`'s conditions are met.

This patch modifies the recency condition from simply "is the mtime of
this object more recent than the cutoff?" to "[...] or, is this object
mentioned by at least one `gc.recentObjectsHook`?".

Depending on whether or not we are generating a cruft pack, this allows
the caller to do one of two things:

  - If generating a cruft pack, the caller is able to retain additional
    objects via the cruft pack, even if they would have otherwise been
    pruned due to their age.

  - If not generating a cruft pack, the caller is likewise able to
    retain additional objects as loose.

A potential alternative here is to introduce a new mode to alter the
contents of the reachable pack instead of the cruft one. One could
imagine a new option to `pack-objects`, say `--extra-reachable-tips`
that does the same thing as above, adding the visited set of objects
along the traversal to the pack.

But this has the unfortunate side-effect of altering the reachability
closure of that pack. If parts of the unreachable object graph mentioned
by one or more of the "extra reachable tips" programs is not closed,
then the resulting pack won't be either. This makes it impossible in the
general case to write out reachability bitmaps for that pack, since
closure is a requirement there.

Instead, keep these unreachable objects in the cruft pack (or set of
unreachable, loose objects) instead, to ensure that we can continue to
have a pack containing just reachable objects, which is always safe to
write a bitmap over.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 14:12:20 -07:00
01e9ca4a40 reachable.c: extract obj_is_recent()
When enumerating objects in order to add recent ones (i.e. those whose
mtime is strictly newer than the cutoff) as tips of a reachability
traversal, `add_recent_object()` discards objects which do not meet the
recency criteria.

The subsequent commit will make checking whether or not an object is
recent also consult the list of hooks in `pack.recentHook`. Isolate this
check in its own function to keep the additional complexity outside of
`add_recent_object()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 14:08:51 -07:00
73320e49ad builtin/repack.c: only collect fully-formed packs
To partition the set of packs based on which ones are "kept" (either
they have a .keep file, or were otherwise marked via the `--keep-pack`
option) and "non-kept" ones (anything else), `git repack` uses its
`collect_pack_filenames()` function.

Ordinarily, we would rely on a convenience function such as
`get_all_packs()` to enumerate and partition the set of packs. But
`collect_pack_filenames()` uses `readdir()` directly to read the
contents of the "$GIT_DIR/objects/pack" directory, and adds each entry
ending in ".pack" to the appropriate list (either kept, or non-kept as
above).

This is subtly racy, since `collect_pack_filenames()` may see a pack
that is not fully staged (i.e., it is missing its ".idx" file).
Ordinarily, this doesn't cause a problem. But it can cause issues when
generating a cruft pack.

This is because `git repack` feeds (among other things) the list of
existing kept packs down to `git pack-objects --cruft` to indicate that
any kept packs will not be removed from the repository (so that the
cruft pack machinery can avoid packing objects that appear in those
packs as cruft).

But `read_cruft_objects()` lists packfiles by calling `get_all_packs()`.
So if a ".pack" file exists (necessary to get that pack to appear to
`collect_pack_filenames()`), but doesn't have a corresponding ".idx"
file (necessary to get that pack to appear via `get_all_packs()`), we'll
complain with:

    fatal: could not find pack '.tmp-5841-pack-a6b0150558609c323c496ced21de6f4b66589260.pack'

Fix the above by teaching `collect_pack_filenames()` to only collect
packs with their corresponding `*.idx` files in place, indicating that
those packs have been fully staged.

There are a couple of things worth noting:

  - Since each entry in the `extra_keep` list (which contains the
    `--keep-pack` names) has a `*.pack` suffix, we'll have to swap the
    suffix from ".pack" to ".idx", and compare that instead.

  - Since we use the the `fname_kept_list` to figure out which packs to
    delete (with `git repack -d`), we would have previously deleted a
    `*.pack` with no index (since the existince of a ".pack" file is
    necessary and sufficient to include that pack in the list of
    existing non-kept packs).

    Now we will leave it alone (since that pack won't appear in the
    list). This is far more correct behavior, since we don't want
    to race with a pack being staged. Deleting a partially staged pack
    is unlikely, however, since the window of time between staging a
    pack and moving its .idx file into place is miniscule.

    Note that this window does *not* include the time it takes to
    receive and index the pack, since the incoming data goes into
    "$GIT_DIR/objects/tmp_pack_XXXXXX", which does not end in ".pack"
    and is thus ignored by collect_pack_filenames().

In the future, this function should probably be rewritten as a callback
to `for_each_file_in_pack_dir()`, but this is the simplest change we
could do in the short-term.

Reported-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:54:38 -07:00
548afb0d9a docs: typofixes
These were found with an automated CLI tool [1]. Only the
"Documentation" subfolder (and not source code files) was considered
because the docs are user-facing.

[1]: https://crates.io/crates/typos-cli

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:52:51 -07:00
78e56cff69 t/lib-gpg: require GPGSSH for GPGSSH_VERIFYTIME prereq
The GPGSSH_VERIFYTIME prequeq makes use of "${GNUPGHOME}" but does not
create it.  Require GPGSSH which creates the "${GNUPGHOME}" directory.

Additionally, it makes sense to require GPGSSH in GPGSSH_VERIFYTIME
because the latter builds on the former.  If we can't use GPGSSH,
there's little point in checking whether GPGSSH_VERIFYTIME is usable.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:50:39 -07:00
787cb8a48a strbuf: remove global variable
As a library that only interacts with other primitives, strbuf should
not utilize the comment_line_char global variable within its
functions. Therefore, add an additional parameter for functions that use
comment_line_char and refactor callers to pass it in instead.
strbuf_stripspace() removes the skip_comments boolean and checks if
comment_line_char is a non-NUL character to determine whether to skip
comments or not.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
aba0706832 path: move related function to path
Move path-related function from strbuf.[ch] to path.[ch] so that strbuf
is focused on string manipulation routines with minimal dependencies.

repository.h is no longer a necessary dependency after moving this
function out.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
f94018506c object-name: move related functions to object-name
Move object-name-related functions from strbuf.[ch] to object-name.[ch]
so that strbuf is focused on string manipulation routines with minimal
dependencies.

dir.h relied on the forward declration of the repository struct in
strbuf.h. Since that is removed in this patch, add the forward
declaration to dir.h.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
f89854362c credential-store: move related functions to credential-store file
is_rfc3986_unreserved() and is_rfc3986_reserved_or_unreserved() are only
called from builtin/credential-store.c and they are only relevant to that
file so move those functions and make them static.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
5d1344b497 abspath: move related functions to abspath
Move abspath-related functions from strbuf.[ch] to abspath.[ch] so that
strbuf is focused on string manipulation routines with minimal
dependencies.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:35 -07:00
16b171fda0 strbuf: clarify dependency
refs.h was once needed but is no longer so as of 6bab74e7fb ("strbuf:
move strbuf_branchname to sha1_name.c", 2010-11-06). strbuf.h was
included thru refs.h, so removing refs.h requires strbuf.h to be added
back.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:35 -07:00
4557779660 strbuf: clarify API boundary
strbuf, as a generic and widely used structure across the codebase,
should be limited as a library to only interact with primitives. Add
documentation so future functions can appropriately be placed. Older
functions that do not follow this boundary should eventually be moved or
refactored.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:35 -07:00
9c7d1b057f repository: create read_replace_refs setting
The 'read_replace_refs' global specifies whether or not we should
respect the references of the form 'refs/replace/<oid>' to replace which
object we look up when asking for '<oid>'. This global has caused issues
when it is not initialized properly, such as in b6551feadf (merge-tree:
load default git config, 2023-05-10).

To make this more robust, move its config-based initialization out of
git_default_config and into prepare_repo_settings(). This provides a
repository-scoped version of the 'read_replace_refs' global.

The global still has its purpose: it is disabled process-wide by the
GIT_NO_REPLACE_OBJECTS environment variable or by a call to
disable_replace_refs() in some specific Git commands.

Since we already encapsulated the use of the constant inside
replace_refs_enabled(), we can perform the initialization inside that
method, if necessary. This solves the problem of forgetting to check the
config, as we will check it before returning this value.

Due to this encapsulation, the global can move to be static within
replace-object.c.

There is an interesting behavior change possible here: we now have a
repository-scoped understanding of this config value. Thus, if there was
a command that recurses into submodules and might follow replace refs,
then it would now respect the core.useReplaceRefs config value in each
repository.

'git grep --recurse-submodules' is such a command that recurses into
submodules in-process. We can demonstrate the granularity of this config
value via a test in t7814.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
f1178380ac replace-objects: create wrapper around setting
The 'read_replace_objects' constant is initialized by git_default_config
(if core.useReplaceRefs is disabled) and within setup_git_env (if
GIT_NO_REPLACE_OBJECTS) is set. To ensure that this variable cannot be
set accidentally in other places, wrap it in a replace_refs_enabled()
method.

Since we still assign this global in config.c, we are not able to remove
the global scope of this variable and make it a static within
replace-object.c. This will happen in a later change which will also
prevent the variable from being read before it is initialized.

Centralizing read access to the variable is an important first step.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
d24eda4e03 repository: create disable_replace_refs()
Several builtins depend on being able to disable the replace references
so we actually operate on each object individually. These currently do
so by directly mutating the 'read_replace_refs' global.

A future change will move this global into a different place, so it will
be necessary to change all of these lines. However, we can simplify that
transition by abstracting the purpose of these global assignments with a
method call.

We will need to keep this read_replace_refs global forever, as we want
to make sure that we never use replace refs throughout the life of the
process if this method is called. Future changes may present a
repository-scoped version of the variable to represent that repository's
core.useReplaceRefs config value, but a zero-valued read_replace_refs
will always override such a setting.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
f79e18849b cat-file: add option '-Z' that delimits input and output with NUL
In db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with
`-z`, 2022-07-22), we have introduced a new mode to read the input via
NUL-delimited records instead of newline-delimited records. This allows
the user to query for revisions that have newlines in their path
component. While unusual, such queries are perfectly valid and thus it
is clear that we should be able to support them properly.

Unfortunately, the commit only changed the input to be NUL-delimited,
but didn't change the output at the same time. While this is fine for
queries that are processed successfully, it is less so for queries that
aren't. In the case of missing commits for example the result can become
entirely unparsable:

```
$ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" |
    git cat-file --batch -z
7ce4f05bae blob 10
1234567890

commit missing
```

This is of course a crafted query that is intentionally gaming the
deficiency, but more benign queries that contain newlines would have
similar problems.

Ideally, we should have also changed the output to be NUL-delimited when
`-z` is specified to avoid this problem. As the input is NUL-delimited,
it is clear that the output in this case cannot ever contain NUL
characters by itself. Furthermore, Git does not allow NUL characters in
revisions anyway, further stressing the point that using NUL-delimited
output is safe. The only exception is of course the object data itself,
but as git-cat-file(1) prints the size of the object data clients should
read until that specified size has been consumed.

But even though `-z` has only been introduced a few releases ago in Git
v2.38.0, changing the output format retroactively to also NUL-delimit
output would be a backwards incompatible change. And while one could
make the argument that the output is inherently broken already, we need
to assume that there are existing users out there that use it just fine
given that revisions containing newlines are quite exotic.

Instead, introduce a new option `-Z` that switches to NUL-delimited
input and output. While this new option could arguably only switch the
output format to be NUL-delimited, the consequence would be that users
have to always specify both `-z` and `-Z` when the input may contain
newlines. On the other hand, if the user knows that there never will be
newlines in the input, they don't have to use either of those options.
There is thus no usecase that would warrant treating input and output
format separately, which is why we instead opt to "do the right thing"
and have `-Z` mean to NUL-terminate both formats.

The old `-z` option is marked as deprecated with a hint that its output
may become unparsable. It is thus hidden both from the synopsis as well
as the command's help output.

Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:46 -07:00
3217f52a49 cat-file: simplify reading from standard input
The batch modes of git-cat-file(1) read queries from stantard input that
are either newline- or NUL-delimited. This code was introduced via
db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`,
2022-07-22), which notes that:

"""
The refactoring here is slightly unfortunate, since we turn loops like:

     while (strbuf_getline(&buf, stdin) != EOF)

 into:

     while (1) {
         int ret;
         if (opt->nul_terminated)
             ret = strbuf_getline_nul(&input, stdin);
         else
             ret = strbuf_getline(&input, stdin);

         if (ret == EOF)
             break;
     }
"""

The commit proposed introducing a helper function that is easier to use,
which is just what we have done in the preceding commit. Refactor the
code to use this new helper to simplify the loop.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:24 -07:00