Commit Graph

76201 Commits

Author SHA1 Message Date
87c01003cd bundle: add bundle verification options type
When `unbundle()` is invoked, fsck verification may be configured by
passing the `VERIFY_BUNDLE_FSCK` flag. This mechanism allows fsck checks
on the bundle to be enabled or disabled entirely. To facilitate more
fine-grained fsck configuration, additional context must be provided to
`unbundle()`.

Introduce the `unbundle_opts` type, which wraps the existing
`verify_bundle_flags`, to facilitate future extension of `unbundle()`
configuration. Also update `unbundle()` and its call sites to accept
this new options type instead of the flags directly. The end behavior is
functionally the same, but allows for the set of configurable options to
be extended. This is leveraged in a subsequent commit to enable fsck
message severity configuration.
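
A minimal sketch of the options-struct pattern described above. Only the
names `unbundle_opts`, `verify_bundle_flags` and `VERIFY_BUNDLE_FSCK` come
from this message; the field name `flags`, the flag's numeric value and the
helper `wants_fsck()` are assumptions made for illustration and may not
match Git's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

enum verify_bundle_flags {
    VERIFY_BUNDLE_FSCK = (1 << 1), /* assumed value, illustration only */
};

/*
 * Wrapping the flags in a struct means new settings can be added
 * later as extra fields without changing every call site again.
 */
struct unbundle_opts {
    enum verify_bundle_flags flags; /* hypothetical field name */
};

/* Hypothetical helper: reports whether fsck checks were requested. */
static int wants_fsck(const struct unbundle_opts *opts)
{
    assert(opts != NULL);
    return (opts->flags & VERIFY_BUNDLE_FSCK) != 0;
}
```

A caller would fill in a `struct unbundle_opts` and pass its address where
it previously passed the bare flags.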

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-28 12:07:57 +09:00
761e62a09a Merge branch 'bf/set-head-symref' into bf/fetch-set-head-config
* bf/set-head-symref:
  fetch set_head: handle mirrored bare repositories
  fetch: set remote/HEAD if it does not exist
  refs: add create_only option to refs_update_symref_extended
  refs: add TRANSACTION_CREATE_EXISTS error
  remote set-head: better output for --auto
  remote set-head: refactor for readability
  refs: atomically record overwritten ref in update_symref
  refs: standardize output of refs_read_symbolic_ref
  t/t5505-remote: test failure of set-head
  t/t5505-remote: set default branch to main
2024-11-27 22:49:05 +09:00
cc01bad4a9 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-27 07:57:10 +09:00
4a611ee7eb Merge branch 'kn/ref-transaction-hook-with-reflog'
The ref-transaction hook triggered for reflog updates, which has
been corrected.

* kn/ref-transaction-hook-with-reflog:
  refs: don't invoke reference-transaction hook for reflogs
2024-11-27 07:57:10 +09:00
1f3d9b9814 Merge branch 'jt/index-pack-allow-promisor-only-while-fetching'
We now ensure "index-pack" is used with the "--promisor" option
only during a "git fetch".

* jt/index-pack-allow-promisor-only-while-fetching:
  index-pack: teach --promisor to forbid pack name
2024-11-27 07:57:09 +09:00
8eaa06590f Merge branch 'en/fast-import-avoid-self-replace'
"git fast-import" can be tricked into a replace ref that maps an
object to itself, which is a useless thing to do.

* en/fast-import-avoid-self-replace:
  fast-import: avoid making replace refs point to themselves
2024-11-27 07:57:08 +09:00
89ceab7b4c Merge branch 'kh/trailer-in-glossary'
Doc updates.

* kh/trailer-in-glossary:
  Documentation/glossary: describe "trailer"
2024-11-27 07:57:07 +09:00
f670d811e2 Merge branch 'jk/gcc15'
GCC 15 compatibility updates.

* jk/gcc15:
  object-file: inline empty tree and blob literals
  object-file: treat cached_object values as const
  object-file: drop oid field from find_cached_object() return value
  object-file: move empty_tree struct into find_cached_object()
  object-file: drop confusing oid initializer of empty_tree struct
  object-file: prefer array-of-bytes initializer for hash literals
2024-11-27 07:57:06 +09:00
93905d3b70 Merge branch 'bc/c23'
C23 compatibility updates.

* bc/c23:
  reflog: rename unreachable
  index-pack: rename struct thread_local
2024-11-27 07:57:05 +09:00
87fc668ce5 Merge branch 'ps/clar-build-improvement'
Fix for clar unit tests to support CMake build.

* ps/clar-build-improvement:
  Makefile: let clar header targets depend on their scripts
  cmake: use verbatim arguments when invoking clar commands
  cmake: use SH_EXE to execute clar scripts
  t/unit-tests: convert "clar-generate.awk" into a shell script
2024-11-27 07:57:04 +09:00
c515230dcf Merge branch 'kh/bundle-docs'
Documentation for "git bundle" saw improvements to more prominently
call out the use of '--all' when creating bundles.

* kh/bundle-docs:
  Documentation/git-bundle.txt: discuss naïve backups
  Documentation/git-bundle.txt: mention --all in spec. refs
  Documentation/git-bundle.txt: remove old `--all` example
  Documentation/git-bundle.txt: mention full backup example
2024-11-27 07:57:03 +09:00
e1fbebe347 Git 2.47.2
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:02 +01:00
3fad508c3f Sync with 2.46.3
* maint-2.46:
  Git 2.46.3
  Git 2.45.3
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:15:02 +01:00
5c21db3a0d Git 2.46.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:01 +01:00
67809f7c4c Sync with 2.45.3
* maint-2.45:
  Git 2.45.3
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:15:01 +01:00
2f323bb162 Git 2.44.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:00 +01:00
fc16eb306c Git 2.45.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:15:00 +01:00
99cb64c31a Sync with 2.44.3
* maint-2.44:
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:15:00 +01:00
14799610a8 Sync with 2.43.6
* maint-2.43:
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:59 +01:00
664d4fa692 Git 2.43.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:59 +01:00
c39c2d29e6 Sync with 2.42.4
* maint-2.42:
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:59 +01:00
54ddf17f82 Git 2.42.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:58 +01:00
102e0e6daa Sync with 2.41.3
* maint-2.41:
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:58 +01:00
6fd641a521 Git 2.41.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:58 +01:00
676cddebf9 Sync with 2.40.4
* maint-2.40:
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2024-11-26 22:14:57 +01:00
54a3711a9d Git 2.40.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:57 +01:00
08756131a3 Merge branch 'disallow-control-characters-in-credential-urls-by-default'
This addresses two vulnerabilities:

- CVE-2024-50349:

	Printing unsanitized URLs when asking for credentials made the
	user susceptible to crafted URLs (e.g. in recursive clones) that
	mislead the user into typing in passwords for trusted sites that
	would then be sent to untrusted sites instead.

- CVE-2024-52006

	Git may pass on Carriage Returns via the credential protocol to
	credential helpers which use line-reading functions that
	interpret said Carriage Returns as line endings, even though Git
	did not intend that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 22:14:45 +01:00
b6318cf23a ref-cache: fix invalid free operation in free_ref_entry
In cfd971520e (refs: keep track of unresolved reference value in
iterators, 2024-08-09), we added a new field "referent" into the "struct
ref" structure. In order to free the "referent", we unconditionally
freed the "referent" by simply adding a "free" statement.

However, this is incorrect: whether the ref entry is a directory or a
loose ref, we always execute the following statement:

  free(entry->u.value.referent);

This does not make sense. We should never access the "entry->u.value"
field when "entry" is a directory. However, the change obviously doesn't
break the tests. Let's analyze why.

The anonymous union in the "ref_entry" has two members: one is "struct
ref_value", another is "struct ref_dir". On a 64-bit machine, the size
of "struct ref_dir" is 32 bytes, which is smaller than the 48-byte size
of "struct ref_value". And the offset of "referent" field in "struct
ref_value" is 40 bytes. So, whenever we create a new "ref_entry" for a
directory, we will leave the offset from 40 bytes to 48 bytes untouched,
which means the value for this memory is zero (NULL). It's OK to free a
NULL pointer, but this is merely a coincidence of memory layout.

To fix this issue, we now ensure that "free(entry->u.value.referent)" is
only called when "entry->flag" indicates that it represents a loose
reference and not a directory to avoid the invalid memory operation.
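
The guarded free can be sketched as follows. The struct layouts are heavily
simplified, the `REF_DIR` value is assumed, and the return value exists only
so the behavior can be observed; this is a stand-in for the idea, not the
code in ref-cache.c.

```c
#include <assert.h>
#include <stdlib.h>

#define REF_DIR 0x01 /* assumed flag value, illustration only */

struct ref_value { char *referent; }; /* simplified */
struct ref_dir { int nr; };           /* simplified */

struct ref_entry {
    unsigned flag;
    union {
        struct ref_value value; /* meaningful only when REF_DIR is unset */
        struct ref_dir subdir;  /* meaningful only when REF_DIR is set */
    } u;
};

/*
 * Frees the entry. The referent is released only for non-directory
 * entries, so the union member that is not in use is never read.
 * Returns 1 if the referent was released, 0 for a directory.
 */
static int free_ref_entry(struct ref_entry *entry)
{
    int released = 0;

    assert(entry != NULL);
    if (!(entry->flag & REF_DIR)) {
        free(entry->u.value.referent);
        released = 1;
    }
    free(entry);
    return released;
}
```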

Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-27 04:34:37 +09:00
b01b9b81d3 credential: disallow Carriage Returns in the protocol by default
While Git has documented that the credential protocol is line-based,
with newlines as terminators, the exact shape of a newline has not been
documented.

From Git's perspective, which is firmly rooted in the Linux ecosystem,
it is clear that "a newline" means a Line Feed character.

However, even Git's credential protocol respects Windows line endings
(a Carriage Return character followed by a Line Feed character, "CR/LF")
by virtue of using `strbuf_getline()`.

There is a third category of line endings that has been used originally
by MacOS, and that is respected by the default line readers of .NET and
node.js: bare Carriage Returns.

Git cannot handle those, and what is worse: Git's remedy against
CVE-2020-5260 does not catch when credential helpers are used that
interpret bare Carriage Returns as newlines.

Git Credential Manager addressed this as CVE-2024-50338, but other
credential helpers may still be vulnerable. So let's not only disallow
Line Feed characters as part of the values in the credential protocol,
but also disallow Carriage Return characters.

In the unlikely event that a credential helper relies on Carriage
Returns in the protocol, introduce an escape hatch via the
`credential.protectProtocol` config setting.
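
As a rough sketch of the check described above: the function name
`value_is_protocol_safe()` and its `allow_cr` parameter are hypothetical,
with `allow_cr` standing in for the escape hatch. It is not Git's
implementation.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if `value` may be sent as a credential protocol value,
 * 0 otherwise.
 */
static int value_is_protocol_safe(const char *value, int allow_cr)
{
    assert(value != NULL);
    if (strchr(value, '\n'))
        return 0; /* a Line Feed always ends a protocol line */
    if (!allow_cr && strchr(value, '\r'))
        return 0; /* some helpers treat a bare CR as a line ending */
    return 1;
}
```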

This addresses CVE-2024-52006.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 20:24:04 +01:00
7725b8100f credential: sanitize the user prompt
When asking the user interactively for credentials, we want to avoid
misleading them e.g. via control sequences that pretend that the URL
targets a trusted host when it does not.

While Git learned, over the course of the preceding commits, to disallow
URLs containing URL-encoded control characters by default, credential
helpers are still allowed to specify values very freely (apart from Line
Feed and NUL characters, anything is allowed), and this would allow,
say, a username containing control characters to be specified that would
then be displayed in the interactive terminal prompt asking the user for
the password, potentially sending those control characters directly to
the terminal. This is undesirable because control characters can be used
to mislead users to divulge secret information to untrusted sites.

To prevent such an attack vector, let's add a `git_prompt()` that forces
the displayed text to be sanitized, i.e. displaying question marks
instead of control characters.
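
The sanitizing step could look roughly like this. `sanitize_prompt()` is a
hypothetical helper written for this illustration; it is not the
`git_prompt()` code itself, and it relies on `iscntrl()` in the default C
locale to decide what counts as a control character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Overwrites every control character in `text` with '?', in place.
 * Returns the number of characters that were replaced.
 */
static size_t sanitize_prompt(char *text)
{
    size_t replaced = 0;

    assert(text != NULL);
    for (; *text; text++) {
        if (iscntrl((unsigned char)*text)) {
            *text = '?';
            replaced++;
        }
    }
    return replaced;
}
```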

Note: While this commit's diff changes a lot of `user@host` strings to
`user%40host`, which may look suspicious on the surface, there is a good
reason for that: this string specifies a user name, not a
<username>@<hostname> combination! In the context of t5541, the actual
combination looks like this: `user%40@127.0.0.1:5541`. Therefore, these
string replacements document a net improvement introduced by this
commit, as `user@host@127.0.0.1` could have left readers wondering where
the user name ends and where the host name begins.

Hinted-at-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 20:24:04 +01:00
c903985bf7 credential_format(): also encode <host>[:<port>]
An upcoming change wants to sanitize the credential password prompt
where a URL is displayed that may potentially come from a `.gitmodules`
file. To this end, the `credential_format()` function is employed.

To sanitize the host name (and optional port) part of the URL, we need a
new mode of the `strbuf_add_percentencode()` function because the
current mode is both too strict and too lenient: too strict because it
encodes `:`, `[` and `]` (which should be left unencoded in
`<host>:<port>` and in IPv6 addresses), and too lenient because it does
not encode invalid host name characters `/`, `_` and `~`.

So let's introduce and use a new mode specifically to encode the host
name and optional port part of a URI, leaving alpha-numerical
characters, periods, colons and brackets alone and encoding all others.
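
A sketch of the encoding rule exactly as worded above: alphanumerics,
periods, colons and brackets pass through, everything else is
percent-encoded. The function name, the fixed output buffer and the
lowercase hex digits are assumptions; the sketch follows the sentence
literally, so any further characters the real mode may leave unencoded are
not modeled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the encoded form of `host` into `out`, which holds `cap` bytes.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int encode_host_and_port(const char *host, char *out, size_t cap)
{
    size_t n = 0;

    assert(host != NULL && out != NULL);
    if (cap == 0)
        return -1;
    for (; *host; host++) {
        unsigned char ch = (unsigned char)*host;

        if (isalnum(ch) || strchr(".:[]", ch)) {
            if (n + 2 > cap)
                return -1;
            out[n++] = (char)ch;
        } else {
            if (n + 4 > cap)
                return -1;
            snprintf(out + n, 4, "%%%02x", ch); /* e.g. '_' becomes "%5f" */
            n += 3;
        }
    }
    out[n] = '\0';
    return 0;
}
```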

This only leads to a change of behavior for URLs that contain invalid
host names.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-11-26 20:24:00 +01:00
7cf65e2660 refs/reftable: reuse iterators when reading refs
When reading references the reftable backend has to:

  1. Create a new ref iterator.

  2. Seek the iterator to the record we're searching for.

  3. Read the record.

We cannot really avoid the last two steps, but re-creating the iterator
every single time we want to read a reference is kind of expensive and a
waste of resources. We couldn't help it in the past though because it
was not possible to reuse iterators. But starting with 5bf96e0c39
(reftable/generic: move seeking of records into the iterator,
2024-05-13) we have split up the iterator lifecycle such that creating
the iterator and seeking are two different concerns.

Refactor the code such that we cache iterators in the reftable backend.
This cache is invalidated whenever the respective stack is reloaded such
that we know to recreate the iterator in that case. This leads to a
sizeable speedup when creating many refs, which requires a lot of random
reference reads:

    Benchmark 1: update-ref: create many refs (refcount = 100000, revision = master)
      Time (mean ± σ):      1.793 s ±  0.010 s    [User: 0.954 s, System: 0.835 s]
      Range (min … max):    1.781 s …  1.811 s    10 runs

    Benchmark 2: update-ref: create many refs (refcount = 100000, revision = HEAD)
      Time (mean ± σ):      1.680 s ±  0.013 s    [User: 0.846 s, System: 0.831 s]
      Range (min … max):    1.664 s …  1.702 s    10 runs

    Summary
      update-ref: create many refs (refcount = 100000, revision = HEAD) ran
        1.07 ± 0.01 times faster than update-ref: create many refs (refcount = 100000, revision = master)

While 7% is not a huge win, you have to consider that the benchmark is
_writing_ data, so _reading_ references is only one part of what we do.
Flame graphs show that we spend around 40% of our time reading refs, so
the speedup when reading refs is roughly 2.5x that. I could not
find better benchmarks where we perform a lot of random ref reads.

You can also see a sizeable impact on memory usage when creating 100k
references. Before this change:

    HEAP SUMMARY:
        in use at exit: 19,112,538 bytes in 200,170 blocks
      total heap usage: 8,400,426 allocs, 8,200,256 frees, 454,367,048 bytes allocated

After this change:

    HEAP SUMMARY:
        in use at exit: 674,416 bytes in 169 blocks
      total heap usage: 7,929,872 allocs, 7,929,703 frees, 281,509,985 bytes allocated

As an additional factor, this refactoring opens up the possibility for
more performance optimizations in how we re-seek iterators. Any change
that allows us to optimize re-seeking by e.g. reusing data structures
would thus also directly speed up random reads.
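
The caching idea described in this message can be sketched as below. All
type and function names are hypothetical stand-ins; the real backend caches
reftable iterators, whereas this toy only counts how often an iterator is
created and how often it is seeked.

```c
#include <assert.h>
#include <stddef.h>

struct ref_iter { int seeks; }; /* hypothetical stand-in for an iterator */

struct backend_cache {
    struct ref_iter it;
    int valid;     /* cleared whenever the stack is reloaded */
    int creations; /* how many times the iterator was (re)created */
};

/* Called when the underlying stack was reloaded. */
static void cache_invalidate(struct backend_cache *c)
{
    c->valid = 0;
}

/* Returns the cached iterator, creating it only when necessary. */
static struct ref_iter *cache_get_iter(struct backend_cache *c)
{
    assert(c != NULL);
    if (!c->valid) {
        c->it.seeks = 0; /* "create" a fresh iterator */
        c->creations++;
        c->valid = 1;
    }
    return &c->it;
}

/* Reading a ref now only needs the seek and read steps. */
static void read_ref(struct backend_cache *c)
{
    cache_get_iter(c)->seeks++;
}
```

Three reads in a row create the iterator once; a read after
`cache_invalidate()` creates it a second time.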

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:38 +09:00
9d471b9dfe reftable/merged: drain priority queue on reseek
In 5bf96e0c39 (reftable/generic: move seeking of records into the
iterator, 2024-05-13) we have refactored the reftable codebase such that
iterators can be initialized once and then re-seeked multiple times.
This feature is used by 1869525066 (refs/reftable: wire up support for
exclude patterns, 2024-09-16) in order to skip records based on exclude
patterns provided by the caller.

The logic to re-seek the merged iterator is insufficient though because
we don't drain the priority queue on a re-seek. This means that the
queue may contain stale entries and thus reading the next record in the
queue will return the wrong entry. While this is an obvious bug, it is
harmless in the context of above exclude patterns:

  - If the queue contained stale entries that match the pattern then the
    caller would already know to filter out such refs. This is because
    our codebase is prepared to handle backends that don't have a way to
    efficiently implement exclude patterns.

  - If the queue contained stale entries that don't match the pattern
    we'd eventually filter out any duplicates. This is because the
    reftable code discards items with the same ref name and sorts any
    remaining entries properly.

So things happen to work in this context regardless of the bug, and
there is no other use case yet where we re-seek iterators. We're about
to introduce a caching mechanism though where iterators are reused by
the reftable backend, and that will expose the bug.

Fix the issue by draining the priority queue when seeking and add a
testcase that surfaces the issue.
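
To illustrate why the drain matters, here is a toy merged iterator over
sorted integer arrays. Integers stand in for ref names, a linear scan
stands in for the priority queue, and every name is hypothetical; it only
mirrors the shape of the fix.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBS 4

struct sub_iter { const int *vals; size_t len, pos; };
struct entry { int val; size_t sub; };

struct merged_iter {
    struct sub_iter subs[MAX_SUBS];
    size_t nr_subs;
    struct entry queue[MAX_SUBS]; /* at most one entry per sub-iterator */
    size_t queue_len;
};

/* Queues the next value of sub-iterator `sub`, if it has one left. */
static void push_next(struct merged_iter *mi, size_t sub)
{
    struct sub_iter *s = &mi->subs[sub];

    if (s->pos < s->len) {
        assert(mi->queue_len < MAX_SUBS);
        mi->queue[mi->queue_len].val = s->vals[s->pos++];
        mi->queue[mi->queue_len].sub = sub;
        mi->queue_len++;
    }
}

/* Positions every sub-iterator at the first value >= `want`. */
static void merged_seek(struct merged_iter *mi, int want)
{
    mi->queue_len = 0; /* drain entries left over from an earlier seek */
    for (size_t i = 0; i < mi->nr_subs; i++) {
        struct sub_iter *s = &mi->subs[i];

        s->pos = 0;
        while (s->pos < s->len && s->vals[s->pos] < want)
            s->pos++;
        push_next(mi, i);
    }
}

/* Returns the smallest queued value, or -1 once everything is consumed. */
static int merged_next(struct merged_iter *mi)
{
    size_t min = 0;
    struct entry e;

    if (!mi->queue_len)
        return -1;
    for (size_t i = 1; i < mi->queue_len; i++)
        if (mi->queue[i].val < mi->queue[min].val)
            min = i;
    e = mi->queue[min];
    mi->queue[min] = mi->queue[--mi->queue_len];
    push_next(mi, e.sub);
    return e.val;
}
```

Without the reset at the top of `merged_seek()`, entries queued by an
earlier seek would remain in the queue and could be returned, or overflow
it, after the next seek.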

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:38 +09:00
eb22c1b46b reftable/stack: add mechanism to notify callers on reload
Reftable stacks are reloaded in two cases:

  - When calling `reftable_stack_reload()`, if the stat-cache tells us
    that the stack has been modified.

  - When committing a reftable addition.

While callers can figure out the second case, they do not have a
mechanism to figure out whether `reftable_stack_reload()` led to an
actual reload of the on-disk data. All they can do is thus to assume
that data is always being reloaded in that case.

Improve the situation by introducing a new `on_reload()` callback to the
reftable options. If provided, the function will be invoked every time
the stack has indeed been reloaded. This allows callers to invalidate
data that depends on the current stack data.
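
A sketch of such a notification hook. Only the name `on_reload` is taken
from this message; the payload pointer, the version counters that stand in
for the stat-cache, and all other names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct stack_options {
    void (*on_reload)(void *payload); /* optional, may be NULL */
    void *on_reload_payload;          /* hypothetical field */
};

struct stack {
    struct stack_options opts;
    int on_disk_version; /* stand-in for what the stat-cache detects */
    int loaded_version;
};

/*
 * Reloads only if the on-disk data changed, and notifies the caller
 * when it did. Returns 1 if a reload happened, 0 otherwise.
 */
static int stack_reload(struct stack *st)
{
    assert(st != NULL);
    if (st->loaded_version == st->on_disk_version)
        return 0;
    st->loaded_version = st->on_disk_version;
    if (st->opts.on_reload)
        st->opts.on_reload(st->opts.on_reload_payload);
    return 1;
}

/* Example callback: bumps a counter owned by the caller. */
static void count_reload(void *payload)
{
    (*(int *)payload)++;
}
```

A caller that keeps derived data, such as a cached iterator, would discard
it inside the callback instead of incrementing a counter.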

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:38 +09:00
96e7cb83b6 refs/reftable: refactor reflog expiry to use reftable backend
Refactor the callback function that expires reflog entries in the
reftable backend to use `reftable_backend_read_ref()` instead of
accessing the reftable stack directly. This ensures that the function
will benefit from the new caching layer that we're about to introduce.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
ad6c41f4b7 refs/reftable: refactor reading symbolic refs to use reftable backend
Refactor the callback function that reads symbolic references in the
reftable backend to use `reftable_backend_read_ref()` instead of
accessing the reftable stack directly. This ensures that the function
will benefit from the new caching layer that we're about to introduce.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
27fdf8f4ed refs/reftable: read references via struct reftable_backend
Refactor `read_ref_without_reload()` to accept `struct reftable_backend`
as parameter instead of `struct reftable_stack`. Rename the function to
`reftable_backend_read_ref()` to clarify its scope and move it close to
other functions operating on `struct reftable_backend`.

This change allows us to implement an additional caching layer when
reading refs where we can reuse reftable iterators.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
3ec8022bb0 refs/reftable: figure out hash via reftable_stack
The function `read_ref_without_reload()` accepts a ref store as input
only so that we can figure out the hash function used by it. This is
duplicate information though because the reftable stack knows about its
hash function, too.

Drop the superfluous parameter to simplify the calling convention a bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:37 +09:00
c9f76fc7d1 reftable/stack: add accessor for the hash ID
Add an accessor function that allows callers to access the hash ID of a
reftable stack. This function will be used in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:36 +09:00
46b5f67019 refs/reftable: handle reloading stacks in the reftable backend
When accessing a stack we almost always have to reload the stack before
reading data from it. This is mostly because Git does not have a
notification mechanism for when underlying data has been changed, and
thus we are forced to opportunistically reload the stack every single
time to account for any changes that may have happened concurrently.

Handle the reload internally in `backend_for()`. For one, this forces
callsites to think about whether or not they need to reload the stack.
Second, it makes the logic to access stacks more self-contained by
letting each `struct reftable_backend` manage itself.

Update callsites where we don't reload the stack to document why we
don't. In some cases it's unclear whether it is the right thing to do in
the first place, but fixing that is outside of the scope of this patch
series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:36 +09:00
ad0986c676 refs/reftable: encapsulate reftable stack
The reftable ref store needs to keep track of multiple stacks, one for
the main worktree and an arbitrary number of stacks for worktrees. This
is done by storing pointers to `struct reftable_stack`, which we then
access directly.

Wrap the stack in a new `struct reftable_backend`. This will allow us to
attach more data to each respective stack in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:36 +09:00
6f33d8e255 builtin: pass repository to sub commands
In 9b1cb5070f (builtin: add a repository parameter for builtin
functions, 2024-09-13) the repository was passed down to all builtin
commands. This allowed the repository to be passed down to lower layers
without depending on the global `the_repository` variable.

Continue this work by also passing down the repository parameter from
the command to sub-commands. This will help pass down the repository to
other subsystems and clean up usage of global variables like
'the_repository' and 'the_hash_algo'.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:36:08 +09:00
4a2790a257 fast-import: disallow "." and ".." path components
If a user specified e.g.
   M 100644 :1 ../some-file
then fast-import previously would happily create a git history where
there is a tree in the top-level directory named "..", and with a file
inside that directory named "some-file".  The top-level ".." directory
causes problems.  While git checkout will die with errors and fsck will
report hasDotdot problems, the user is going to have problems trying to
remove the problematic file.  Simply avoid creating this bad history in
the first place.
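
The kind of check this implies can be sketched as follows.
`path_components_ok()` is a hypothetical helper, not the function
fast-import uses, and it deliberately ignores other path problems such as
empty components.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if no '/'-separated component of `path` is "." or "..",
 * 0 otherwise.
 */
static int path_components_ok(const char *path)
{
    assert(path != NULL);
    while (*path) {
        size_t len = strcspn(path, "/"); /* length of this component */

        if ((len == 1 && path[0] == '.') ||
            (len == 2 && path[0] == '.' && path[1] == '.'))
            return 0;
        path += len;
        if (*path == '/')
            path++;
    }
    return 1;
}
```

With this check, the `../some-file` path from the example is rejected,
while names that merely start with dots, such as `..hidden`, are accepted.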

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:30:04 +09:00
5f9f7fafb7 bisect: address Coverity warning about potential double free
Coverity has started to warn about a potential double-free in
`find_bisection()`. This warning is triggered because we may modify the
list head of the passed-in `commit_list` in case it is an UNINTERESTING
commit, but still call `free_commit_list()` on the original variable
that points to the now-freed head in the case where `do_find_bisection()`
returns a `NULL` pointer.

As far as I can see, this double free cannot happen in practice, as
`do_find_bisection()` only returns a `NULL` pointer when it was passed a
`NULL` input. So in order to trigger the double free we would have to
call `find_bisection()` with a commit list that only consists of
UNINTERESTING commits, but I have not been able to construct a case
where that happens.

Drop the `else` branch entirely as it seems to be a no-op anyway.
Another option might be to instead call `free_commit_list()` on `list`,
which is the modified version of `commit_list` and thus wouldn't cause a
double free. But as mentioned, I couldn't come up with any case where a
passed-in non-NULL list becomes empty, so this shouldn't be necessary.
And if it ever does become necessary we'd notice anyway via the leak
sanitizer.

Interestingly enough we did not have a single test exercising this
branch: all tests pass just fine even when replacing it with a call to
`BUG()`. Add a test that exercises it.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:22:24 +09:00
c6c977e82b Merge branch 'ps/leakfixes-part-10' into ps/bisect-double-free-fix
* ps/leakfixes-part-10: (27 commits)
  t: remove TEST_PASSES_SANITIZE_LEAK annotations
  test-lib: unconditionally enable leak checking
  t: remove unneeded !SANITIZE_LEAK prerequisites
  t: mark some tests as leak free
  t5601: work around leak sanitizer issue
  git-compat-util: drop now-unused `UNLEAK()` macro
  global: drop `UNLEAK()` annotation
  t/helper: fix leaking commit graph in "read-graph" subcommand
  builtin/branch: fix leaking sorting options
  builtin/init-db: fix leaking directory paths
  builtin/help: fix leaks in `check_git_cmd()`
  help: fix leaking return value from `help_unknown_cmd()`
  help: fix leaking `struct cmdnames`
  help: refactor to not use globals for reading config
  builtin/sparse-checkout: fix leaking sanitized patterns
  split-index: fix memory leak in `move_cache_to_base_index()`
  git: refactor builtin handling to use a `struct strvec`
  git: refactor alias handling to use a `struct strvec`
  strvec: introduce new `strvec_splice()` function
  line-log: fix leak when rewriting commit parents
  ...
2024-11-26 10:21:58 +09:00
7e2f377b03 sequencer: comment commit messages properly
The rebase todo editor has commands like `fixup -c` which affect
the commit messages of the rebased commits.[1]  For example:

    pick hash1 <msg>
    fixup hash2 <msg>
    fixup -c hash3 <msg>

This says that hash2 and hash3 should be squashed into hash1 and
that hash3’s commit message should be used for the resulting commit.
So the user is presented with an editor where the two first commit
messages are commented out and the third is not.  However this does
not work if `core.commentChar`/`core.commentString` is in use since
the comment char is hardcoded (#) in this `sequencer.c` function.
As a result the first commit message will not be commented out.

† 1: See 9e3cebd97c (rebase -i: add fixup [-C | -c] command,
    2021-01-29)
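
The general shape of the fix, using the configured comment string instead
of a literal `#`, might look like this. `comment_line()` is a hypothetical
helper and the buffer handling is simplified; it is not the code in
`sequencer.c`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes `line`, prefixed by `comment_str` and a space, into `out`,
 * which holds `cap` bytes. Returns 0 on success, -1 on truncation.
 */
static int comment_line(const char *comment_str, const char *line,
                        char *out, size_t cap)
{
    int n;

    assert(comment_str != NULL && line != NULL && out != NULL);
    n = snprintf(out, cap, "%s %s", comment_str, line);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

Passing the value of `core.commentString` as `comment_str` keeps the
commented-out lines consistent with what the rest of the editor buffer
treats as a comment.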

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Co-authored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:05:08 +09:00
515d034f8d sequencer: comment --reference subject line properly
`git revert --reference <commit>` leaves behind a comment in the
first line:[1]

    # *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***

Meaning that the commit will just consist of the next line if the user
exits the editor directly:

    This reverts commit <--format=reference commit>

But the comment char here is hardcoded (#), which means that the
comment line will inadvertently be included in the commit message if
`core.commentChar`/`core.commentString` is in use.

† 1: See 43966ab315 (revert: optionally refer to commit in the
    "reference" format, 2022-05-26)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:05:08 +09:00
94304b9f48 sequencer: comment checked-out branch properly
`git rebase --update-refs` does not insert commands for dependent/sub-
branches which are checked out.[1]  Instead it leaves a comment about
that fact.  The comment char is hardcoded (#).  In turn the comment
line gets interpreted as an invalid command when `core.commentChar`/
`core.commentString` is in use.

† 1: See 900b50c242 (rebase: add --update-refs option, 2022-07-19)

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 10:05:08 +09:00
ef46ad0815 reftable: rename scratch buffer
Both `struct block_writer` and `struct reftable_writer` have a `buf`
member that is being reused to optimize the number of allocations.
Rename the variable to `scratch` to clarify its intent and provide a
comment explaining why it exists.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 08:39:38 +09:00
0f5762b043 refs: adapt initial_transaction flag to be unsigned
The `initial_transaction` flag is tracked as a signed integer, but we
typically pass around flags via unsigned integers. Adapt the type
accordingly.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 08:39:38 +09:00