Code clean-up to reduce inter-function communication inside
builtin/config.c done via the use of global variables.
* ps/builtin-config-cleanup: (21 commits)
builtin/config: pass data between callbacks via local variables
builtin/config: convert flags to a local variable
builtin/config: track "fixed value" option via flags only
builtin/config: convert `key` to a local variable
builtin/config: convert `key_regexp` to a local variable
builtin/config: convert `regexp` to a local variable
builtin/config: convert `value_pattern` to a local variable
builtin/config: convert `do_not_match` to a local variable
builtin/config: move `respect_includes_opt` into location options
builtin/config: move default value into display options
builtin/config: move type options into display options
builtin/config: move display options into local variables
builtin/config: move location options into local variables
builtin/config: refactor functions to have common exit paths
config: make the config source const
builtin/config: check for writeability after source is set up
builtin/config: move actions into `cmd_config_actions()`
builtin/config: move legacy options into `cmd_config()`
builtin/config: move subcommand options into `cmd_config()`
builtin/config: move legacy mode into its own function
...
Now that memory ownership rules around `git_config_string()` and
`git_config_pathname()` are clearer, it also got easier to spot that
the returned memory needs to be free'd. Plug a subset of those cases and
mark now-passing tests as leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `check_roundtrip_encoding` variable is tracked in a `const char *`
even though it may contain allocated strings at times. The result is
that those strings may be leaking because we never free them.
Refactor the code to always store allocated strings in this variable.
The default value is handled in `check_roundtrip()` now, which is the
only user of the variable.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fixes/2.45.1/2.44:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.43:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.42:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.41:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.40:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Revert overly aggressive "layered defence" that went into 2.45.1
and friends, which broke "git-lfs", "git-annex", and other use
cases, so that we can rebuild necessary counterparts in the open.
* jc/fix-2.45.1-and-friends-for-2.39:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Future commits will want to parse a double-precision floating point
value from configuration, but we have no way to parse such a value prior
to this patch.
The core of the routine is implemented in git_parse_double(). Unlike
git_parse_unsigned() and git_parse_signed(), however, the function
implemented here only works on type "double", and not related types like
"float", or "long double".
This is because "float" and "long double" use different functions to
convert from ASCII strings to floating point values (strtof() and
strtold(), respectively). Likewise, there is no pointer type that can
assign to any of these values (except for "void *"), so the only way to
define this trio of functions would be with a macro expansion that is
parameterized over the floating point type and conversion function.
That is all doable, but likely to be overkill given our current needs,
which is only to parse double-precision floats.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This defense-in-depth was intended to protect the clone operation
against future escalations where bugs in `git clone` would allow
attackers to write arbitrary files in the `.git/` directory would allow
for Remote Code Execution attacks via maliciously-placed hooks.
However, it turns out that the `core.hooksPath` protection has
unintentional side effects so severe that they do not justify the
benefit of the protections. For example, it has been reported in
https://lore.kernel.org/git/FAFA34CB-9732-4A0A-87FB-BDB272E6AEE8@alchemists.io/
that the following invocation, which is intended to make `git clone`
safer, is itself broken by that protective measure:
git clone --config core.hooksPath=/dev/null <url>
Since it turns out that the benefit does not justify the cost, let's revert
20f3588efc (core.hooksPath: add some protection while cloning,
2024-03-30).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The refs API lost functions that implicitly assumes to work on the
primary ref_store by forcing the callers to pass a ref_store as an
argument.
* ps/refs-without-the-repository:
refs: remove functions without ref store
cocci: apply rules to rewrite callers of "refs" interfaces
cocci: introduce rules to transform "refs" to pass ref store
refs: add `exclude_patterns` parameter to `for_each_fullref_in()`
refs: introduce missing functions that accept a `struct ref_store`
The `struct git_config_source` passed to `config_with_options()` is
never modified. Let's mark it as `const` to clarify.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ownership of memory returned when preparing a comment string is
quite intricate: when the returned value is different than the passed
value, then the caller is responsible to free the memory. This is quite
subtle, and it's even easier to miss because the returned value is in
fact a `const char *`.
Adapt the function to always return either `NULL` or a newly allocated
string. The function is called at most once per git-config(1), so it's
not like this micro-optimization really matters. Thus, callers are now
always responsible for freeing the value.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.44: (41 commits)
Git 2.44.1
Git 2.43.4
Git 2.42.2
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
...
* maint-2.43: (40 commits)
Git 2.43.4
Git 2.42.2
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
...
* maint-2.42: (39 commits)
Git 2.42.2
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
...
* maint-2.41: (38 commits)
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
docs: document security issues around untrusted .git dirs
...
* maint-2.40: (39 commits)
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
docs: document security issues around untrusted .git dirs
upload-pack: disable lazy-fetching by default
...
* maint-2.39: (38 commits)
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
docs: document security issues around untrusted .git dirs
upload-pack: disable lazy-fetching by default
fetch/clone: detect dubious ownership of local repositories
...
Quite frequently, when vulnerabilities were found in Git's (quite
complex) clone machinery, a relatively common way to escalate the
severity was to trick Git into running a hook which is actually a script
that has just been laid on disk as part of that clone. This constitutes
a Remote Code Execution vulnerability, the highest severity observed in
Git's vulnerabilities so far.
Some previously-fixed vulnerabilities allowed malicious repositories to
be crafted such that Git would check out files not in the worktree, but
in, say, a submodule's `<git>/hooks/` directory.
A vulnerability that "merely" allows to modify the Git config would
allow a related attack vector, to manipulate Git into looking in the
worktree for hooks, e.g. redirecting the location where Git looks for
hooks, via setting `core.hooksPath` (which would be classified as
CWE-427: Uncontrolled Search Path Element and CWE-114: Process Control,
for more details see https://cwe.mitre.org/data/definitions/427.html and
https://cwe.mitre.org/data/definitions/114.html).
To prevent that attack vector, let's error out and complain loudly if an
active `core.hooksPath` configuration is seen in the repository-local
Git config during a `git clone`.
There is one caveat: This changes Git's behavior in a slightly
backwards-incompatible manner. While it is probably a rare scenario (if
it exists at all) to configure `core.hooksPath` via a config in the Git
templates, it _is_ conceivable that some valid setup requires this to
work. In the hopefully very unlikely case that a user runs into this,
there is an escape hatch: set the `GIT_CLONE_PROTECTION_ACTIVE=false`
environment variable. Obviously, this should be done only with utmost
caution.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The variable that holds the value read from the core.excludefile
configuration variable used to leak, which has been corrected.
* jc/unleak-core-excludesfile:
config: do not leak excludes_file
The excludes_file variable is marked "const char *", but all the
assignments to it are made with a piece of memory allocated just
for it, and the variable is responsible for owning it.
When "core.excludesfile" is read, the code just lost the previous
value, leaking memory. Plug it.
The real problem is that the variable is mistyped; our convention
is to never make a variable that owns the piece of memory pointed
by it as "const". Fixing that would reduce the chance of this kind
of bug happening, and also would make it unnecessary to cast the
constness away while free()ing it, but that would be a much larger
follow-up effort.
Reported-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
core.commentChar used to be limited to a single byte, but has been
updated to allow an arbitrary multi-byte sequence.
* jk/core-comment-string:
config: add core.commentString
config: allow multi-byte core.commentChar
environment: drop comment_line_char compatibility macro
wt-status: drop custom comment-char stringification
sequencer: handle multi-byte comment characters when writing todo list
find multi-byte comment chars in unterminated buffers
find multi-byte comment chars in NUL-terminated strings
prefer comment_line_str to comment_line_char for printing
strbuf: accept a comment string for strbuf_add_commented_lines()
strbuf: accept a comment string for strbuf_commented_addf()
strbuf: accept a comment string for strbuf_stripspace()
environment: store comment_line_char as a string
strbuf: avoid shadowing global comment_line_char name
commit: refactor base-case of adjust_comment_line_char()
strbuf: avoid static variables in strbuf_add_commented_lines()
strbuf: simplify comment-handling in add_lines() helper
config: forbid newline as core.commentChar
"git config" learned "--comment=<message>" option to leave a
comment immediately after the "variable = value" on the same line
in the configuration file.
* rs/config-comment:
config: allow tweaking whitespace between value and comment
config: fix --comment formatting
config: add --comment option to add a comment
"git config" corrupted literal HT characters written in the
configuration file as part of a value, which has been corrected.
* ds/config-internal-whitespace-fix:
config.txt: describe handling of whitespace further
t1300: add more tests for whitespace and inline comments
config: really keep value-internal whitespace verbatim
config: minor addition of whitespace
The core.commentChar code recently learned to accept more than a
single ASCII character. But using it is annoying with multiple versions
of Git, since older ones will reject it outright:
$ git.v2.44.0 -c core.commentchar=foo stripspace -s
error: core.commentChar should only be one ASCII character
fatal: unable to parse 'core.commentchar' from command-line config
Let's add an alias core.commentString. That's arguably a better name
anyway, since we now can handle strings, and it makes it possible to
have a config that works reasonably with both old and new versions of
Git (see the example in the documentation).
This is strictly an alias, so there's not much point in adding duplicate
tests; I added a single one to t0030 that exercises the alias code.
Note also that the error messages for invalid values will now show the
variable the config parser handed us, and thus will be normalized to
lowercase (rather than camelcase). A few tests in t0030 are adjusted to
match.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in function parse_value() that prevented whitespace characters
(i.e. spaces and horizontal tabs) found inside configuration option values
from being parsed and returned in their original form. The bug caused any
number of consecutive whitespace characters to be wrongly "squashed" into
the same number of space characters.
This bug was introduced back in July 2009, in commit ebdaae372b ("config:
Keep inner whitespace verbatim").
Further investigation showed that setting a configuration value, by invoking
git-config(1), converts value-internal horizontal tabs into "\t" escape
sequences, which the buggy value-parsing logic in function parse_value()
didn't "squash" into spaces. That's why the test included in the ebdaae37
commit passed, which presumably made the bug remain undetected for this long.
On the other hand, value-internal literal horizontal tab characters, found in
a configuration file edited by hand, do get "squashed" by the value-parsing
logic, so the right choice was to fix this bug by making the value-internal
whitespace characters preserved verbatim.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In general, binary operators should be enclosed in a pair of leading and
trailing space (SP) characters. Thus, clean up one spotted expression that
for some reason had a "bunched up" operator.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extending the previous step, this allows the whitespace placed after
the value before the "# comment message" to be tweaked by tweaking
the preprocessing rule to:
* If the given comment string begins with one or more whitespace
characters followed by '#', it is passed intact.
* If the given comment string begins with '#', a Space is
prepended.
* Otherwise, " # " (Space, '#', Space) is prefixed.
* A string with LF in it cannot be used as a comment string.
Unlike the previous step, which unconditionally added a space after
the value before writing the "# comment string", because the above
preprocessing already gives a whitespace before the '#', the
resulting string is written immediately after copying the value.
And the sanity checking rule becomes
* comment string after the above massaging that comes into
git_config_set_multivar_in_file_gently() must
- begin with zero or more whitespace characters followed by '#'.
- not have a LF in it.
I personally think this is over-engineered, but since I thought
things through anyway, here it is in the patch form. The logic to
tweak end-user supplied comment string is encapsulated in a new
helper function, git_config_prepare_comment_string(), so if new
front-end callers would want to use the same massaging rules, it is
easily reused.
Unfortunately I do not think of a way to tweak the preprocessing
rules further to optionally allow having no blank after the value,
i.e. to produce
[section]
variable = value#comment
(which is a valid way to say section.variable=value, by the way)
without sacrificing the ergonomics for the more usual case, so this
time I really stop here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git adds comments itself (like "rebase -i" todo list and
"commit -e" log message editor), it always gives a comment
introducer "#" followed by a Space before the message, except for
the recently introduced "git config --comment", where the users are
forced to say " this is my comment" if they want to add their
comment in this usual format; otherwise their comment string will
end up without a space after the "#".
Make it more ergonomic, while keeping it possible to also use this
unusual style, by massaging the comment string at the UI layer with
a set of simple rules:
* If the given comment string begins with '#', it is passed intact.
* Otherwise, "# " is prefixed.
* A string with LF in it cannot be used as a comment string.
Right now there is only one "front-end" that accepts end-user
comment string and calls the underlying machinery to add or modify
configuration file with comments, but to make sure that the future
callers perform similar massaging as they see fit, add a sanity
check logic in git_config_set_multivar_in_file_gently(), which is
the single choke point in the codepaths that consumes the comment
string.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the ability to append comments to modifications
made using git-config. Example usage:
git config --comment "changed via script" \
--add safe.directory /home/alice/repo.git
based on the proposed patch, the output produced is:
[safe]
directory = /home/alice/repo.git #changed via script
Users need to be able to distinguish between config entries made
using automation and entries made by a human. Automation can add
comments containing a URL pointing to explanations for the change
made, avoiding questions from users as to why their config file
was changed by a third party.
The implementation ensures that a # character is unconditionally
prepended to the provided comment string, and that the comment
text is appended as a suffix to the changed key-value-pair in the
same line of text. Multi-line comments (i.e. comments containing
linefeed) are rejected as errors, causing Git to exit without
making changes.
Comments are aimed at humans who inspect or change their Git
config using a pager or editor. Comments are not meant to be
read or displayed by git-config at a later time.
Signed-off-by: Ralph Seichter <github@seichter.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that all of the code handles multi-byte comment characters, it's
safe to allow users to set them.
There is one special case I kept: we still will not allow an empty
string for the commentChar. While it might make sense in some contexts
(e.g., output where you don't want any comment prefix), there are plenty
where it will behave badly (e.g., all of our starts_with() checks will
indicate that every line is a comment!). It might be reasonable to
assign some meaningful semantics, but it would probably involve checking
how each site behaves. In the interim let's forbid it and we can loosen
things later.
Likewise, the "commentChar cannot be a newline" rule is now extended to
"it cannot contain a newline" (for the same reason: it can confuse our
parsing loops).
Since comment_line_str is used in many parts of the code, it's hard to
cover all possibilities with tests. We can convert the existing
double-semicolon prefix test to show that "git status" works. And we'll
give it a more challenging case in t7507, where we confirm that
git-commit strips out the commit template along with any --verbose text
when reading the edited commit message back in. That covers the basics,
though it's possible there could be issues in more exotic spots (e.g.,
the sequencer todo list uses its own code).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We'd like to eventually support multi-byte comment prefixes, but the
comment_line_char variable is referenced in many spots, making the
transition difficult.
Let's start by storing the character in a NUL-terminated string. That
will let us switch code over incrementally to the string format, and we
can easily support the existing code with a macro wrapper (since we'll
continue to allow only a single-byte prefix, this will behave
identically).
Once all references to the "char" variable have been converted, we can
drop it and enable longer strings.
We'll still have to touch all of the spots that create or set the
variable in this patch, but there are only a few (reading the config,
and the "auto" character selector).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we usually look for a comment char while parsing line-oriented
files, setting core.commentChar to a single newline can confuse our code
quite a bit. For example, using it with "git commit" causes us to fail
to recognize any of the template as comments, including it in the config
message. Which kind of makes sense, since the template content is on its
own line (so no line can "start" with a newline). In other spots I would
not be surprised if you can create more mischief (e.g., violating loop
assumptions) but I didn't dig into it.
Since comment characters are a local preference, to some degree this is
a case of "if it hurts, don't do it". But given that this would be a
silly and pointless thing to do, and that it makes it harder to reason
about code parsing comment lines, let's just forbid it.
There are other cases that are perhaps questionable (e.g., setting the
comment char to a single space), but they seem to behave reasonably (at
least a simple "git commit" will correctly identify and strip the
template lines). So I haven't worried about going on a hunt for every
stupid thing a user might do to themselves, and just focused on the most
confusing case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
Factor out code that retrieves the global config file so that we can use
it in `gc.c` as well.
Use the old name from the previous commit since this function acts
functionally the same as `git_system_config` but for “global”.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename this function to a more descriptive name since we want to use the
existing name for a new function.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove unneeded newlines according to `clang-format`.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this code path was recently converted to check for a NULL value,
it now behaves exactly like git_config_string(). We can shorten the code
a bit by using that helper.
Note that git_config_string() takes a const pointer, but our storage
variable is non-const. We're better off making this "const", though,
since the default value points to a string literal (and thus it would be
an error if anybody tried to write to it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you feed nonsense config like:
git -c core.checkstat=foobar status
we'll silently ignore the unknown value, rather than reporting an error.
This goes all the way back to c08e4d5b5c (Enable minimal stat checking,
2013-01-22).
Detecting and complaining now is technically a backwards-incompatible
change, but I don't think anybody has any reason to use an invalid value
here. There are no historical values we'd want to allow for backwards
compatibility or anything like that. We are better off loudly telling
the user that their config may not be doing what they expect.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the config parser sees an "implicit" bool like:
[core]
someVariable
it passes NULL to the config callback. Any callback code which expects a
string must check for NULL. This usually happens via helpers like
git_config_string(), etc, but some custom code forgets to do so and will
segfault.
These are all fairly vanilla cases where the solution is just the usual
pattern of:
if (!value)
return config_error_nonbool(var);
though note that in a few cases we have to split initializers like:
int some_var = initializer();
into:
int some_var;
if (!value)
return config_error_nonbool(var);
some_var = initializer();
There are still some broken instances after this patch, which I'll
address on their own in individual patches after this one.
Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The attribute subsystem learned to honor `attr.tree` configuration
that specifies which tree to read the .gitattributes files from.
* jc/attr-tree-config:
attr: add attr.tree for setting the treeish to read attributes from
attr: read attributes from HEAD when bare repo
44451a2 (attr: teach "--attr-source=<tree>" global option to "git",
2023-05-06) provided the ability to pass in a treeish as the attr
source. In the context of serving Git repositories as bare repos like we
do at GitLab however, it would be easier to point --attr-source to HEAD
for all commands by setting it once.
Add a new config attr.tree that allows this.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shuffle some bits across headers and sources to prepare for
libification effort.
* cw/prelim-cleanup:
parse: separate out parsing functions from config.h
config: correct bad boolean env value error message
wrapper: reduce scope of remove_or_warn()
hex-ll: separate out non-hash-algo functions
The files config.{h,c} contain functions that have to do with parsing,
but not config.
In order to further reduce all-in-one headers, separate out functions in
config.c that do not operate on config into its own file, parse.h,
and update the include directives in the .c files that need only such
functions accordingly.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An incorrectly defined boolean environment value would result in the
following error message:
bad boolean config value '%s' for '%s'
This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of our tree traversal algorithms use recursion to visit sub-trees.
For pathologically large trees, this can cause us to run out of stack
space and abort in an uncontrolled way. Let's put our own limit here so
that we can fail gracefully rather than segfaulting.
In similar cases where we recursed along the commit graph, we rewrote
the algorithms to avoid recursion and keep any stack data on the heap.
But the commit graph is meant to grow without bound, whereas it's not an
imposition to put a limit on the maximum size of tree we'll handle.
And this has a bonus side effect: coupled with a limit on individual
tree entry names, this limits the total size of a path we may encounter.
This gives us an extra protection against code handling long path names
which may suffer from integer overflows in the size (which could then be
exploited by malicious trees).
The default of 4096 is set to be much longer than anybody would care
about in the real world. Even with single-letter interior tree names
(like "a/b/c"), such a path is at least 8191 bytes. While most operating
systems will let you create such a path incrementally, trying to
reference the whole thing in a system call (as Git would do when
actually trying to access it) will result in ENAMETOOLONG. Coupled with
the recent fsck.largePathname warning, the maximum total pathname Git
will handle is (by default) 16MB.
This config option doesn't do anything yet; future patches will convert
various algorithms to respect the limit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
Reduce reliance on a global state in the config reading API.
* gc/config-context:
config: pass source to config_parser_event_fn_t
config: add kvi.path, use it to evaluate includes
config.c: remove config_reader from configsets
config: pass kvi to die_bad_number()
trace2: plumb config kvi
config.c: pass ctx with CLI config
config: pass ctx with config files
config.c: pass ctx in configsets
config: add ctx arg to config_fn_t
urlmatch.h: use config_fn_t type
config: inline git_color_default_config
Partially revert a sanity check that the rest of the config code
was not ready, to avoid triggering it in a corner case.
* gc/config-partial-submodule-kvi-fix:
config: don't BUG when both kvi and source are set
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Header files cleanup.
* en/header-split-cache-h-part-3: (28 commits)
fsmonitor-ll.h: split this header out of fsmonitor.h
hash-ll, hashmap: move oidhash() to hash-ll
object-store-ll.h: split this header out of object-store.h
khash: name the structs that khash declares
merge-ll: rename from ll-merge
git-compat-util.h: remove unneccessary include of wildmatch.h
builtin.h: remove unneccessary includes
list-objects-filter-options.h: remove unneccessary include
diff.h: remove unnecessary include of oidset.h
repository: remove unnecessary include of path.h
log-tree: replace include of revision.h with simple forward declaration
cache.h: remove this no-longer-used header
read-cache*.h: move declarations for read-cache.c functions from cache.h
repository.h: move declaration of the_index from cache.h
merge.h: move declarations for merge.c from cache.h
diff.h: move declaration for global in diff.c from cache.h
preload-index.h: move declarations for preload-index.c from elsewhere
sparse-index.h: move declarations for sparse-index.c from cache.h
name-hash.h: move declarations for name-hash.c from cache.h
run-command.h: move declarations for run-command.c from cache.h
...
..so that the callback can use a "struct config_source" parameter
instead of "config_reader.source". "struct config_source" is internal to
config.c, so we are adding a pointer to a struct defined in config.c
into a public function signature defined in config.h, but this is okay
because this function has only ever been (and probably ever will be)
used internally by config.c.
As a result, the_reader isn't used anywhere, so "struct config_reader"
is obsolete (it was only intended to be used with the_reader). Remove
them.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Include directives are evaluated using the path of the config file. To
reduce the dependence on "config_reader.source", add a new
"key_value_info.path" member and use that instead of
"config_source.path". This allows us to remove a "struct config_reader
*" field from "struct config_include_data", which will subsequently
allow us to remove "struct config_reader" entirely.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the last usage of "struct config_reader" from configsets by
copying the "kvi" arg instead of recomputing "kvi" from
config_reader.source. Since we no longer need to pass both "struct
config_reader" and "struct config_set" in a single "void *cb", remove
"struct configset_add_data" too.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a code path starting from trace2_def_param_fl() that eventually
calls current_config_scope(), and thus it needs to have "kvi" plumbed
through it. Additional plumbing is also needed to get "kvi" to
trace2_def_param_fl(), which gets called by two code paths:
- Through tr2_cfg_cb(), which is a config callback, so it trivially
receives "kvi" via the "struct config_context ctx" parameter.
- Through tr2_list_env_vars_fl(), which is a high level function that
lists environment variables for tracing. This has been secretly
behaving like git_config_from_parameters() (in that it parses config
from environment variables/the CLI), but does not set config source
information.
Teach tr2_list_env_vars_fl() to be well-behaved by using
kvi_from_param(), which is used elsewhere for CLI/environment
variable-based config.
As a result, current_config_scope() has no more callers, so remove it.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass config_context when parsing CLI config. To provide the .kvi member,
refactor out kvi_from_param() from the logic that caches CLI config in
configsets. Now that config_context and config_context.kvi is always
present when config machinery calls config callbacks, plumb "kvi" so
that we can remove all calls of current_config_scope() except for
trace2/*.c (which will be handled in a later commit), and remove all
other current_config_*() (the functions themselves and their calls).
Note that this results in .kvi containing a different, more complete
set of information than the mocked up "struct config_source" in
git_config_from_parameters().
Plumbing "kvi" reveals a few places where we've been doing the wrong
thing:
* git_config_parse_parameter() hasn't been setting config source
information, so plumb "kvi" there too.
* Several sites in builtin/config.c have been calling current_config_*()
functions outside of config callbacks (indirectly, via the
format_config() helper), which means they're reading state that isn't
set correctly:
* "git config --get-urlmatch --show-scope" iterates config to collect
values, but then attempts to display the scope after config
iteration, causing the "unknown" scope to be shown instead of the
config file's scope. It's clear that this wasn't intended: we knew
that "--get-urlmatch" couldn't show config source metadata, which is
why "--show-origin" was marked incompatible with "--get-urlmatch"
when it was introduced [1]. It was most likely a mistake that we
allowed "--show-scope" to sneak through.
Fix this by copying the "kvi" value in the collection phase so that
it can be read back later. This means that we can now support "git
config --get-urlmatch --show-origin", but that is left unchanged
for now.
* "git config --default" doesn't have config source metadata when
displaying the default value, so "--show-scope" also results in
"unknown", and "--show-origin" results in a BUG(). Fix this by
treating the default value as if it came from the command line (e.g.
like we do with "git -c" or "git config --file"), using
kvi_from_param().
[1] https://lore.kernel.org/git/20160205112001.GA13397@sigill.intra.peff.net/
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass config_context to config_callbacks when parsing config files. To
provide the .kvi member, refactor out the configset logic that caches
"struct config_source" and "enum config_scope" as a "struct
key_value_info". Make the "enum config_scope" available to the config
file machinery by plumbing an additional arg through
git_config_from_file_with_options().
We do not exercise ctx yet because the remaining current_config_*()
callers may be used with config_with_options(), which may read config
from parameters, but parameters don't pass ctx yet.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass config_context to config callbacks in configset_iter(), trivially
setting the .kvi member to the cached key_value_info. Then, in config
callbacks that are only used with configsets, use the .kvi member to
replace calls to current_config_*(), and delete current_config_line()
because it has no remaining callers.
This leaves builtin/config.c and config.c as the only remaining users of
current_config_*().
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When iterating through config, we read config source metadata from
global values - either a "struct config_source + enum config_scope"
or a "struct key_value_info", using the current_config* functions. Prior
to the series starting from 0c60285147 (config.c: create config_reader
and the_reader, 2023-03-28), we weren't very picky about which values we
should read in which situation; we did note that both groups of values
generally shouldn't be set together, but if both were set,
current_config* preferentially reads key_value_info. When that series
added more structure, we enforced that either the former (when parsing a
config source) can be set, or the latter (when iterating a config set),
but *never* both at the same time. See 9828453ff0 (config.c: remove
current_config_kvi, 2023-03-28) and 5cdf18e7cd (config.c: remove
current_parsing_scope, 2023-03-28).
That was a good simplifying constraint that helped us reason about the
global state, but it turns out that there is at least one situation
where we need both to be set at the same time: in a blobless partial
clone where .gitmodules is missing. "git fetch" in such a repo will
start a config parse over .gitmodules (setting the config_source), and
Git will attempt to lazy-fetch it from the promisor remote. However,
when we try to read the promisor configuration, we start iterating a
config set (setting the key_value_info), and we BUG() out because that's
not allowed any more.
Teaching config_reader to gracefully handle this is somewhat
complicated, but fortunately, there are proposed changes to the config.c
machinery to get rid of this global state, and make the BUG() obsolete
[1]. We should rely on that as the eventual solution, and avoid doing
yet another refactor in the meantime.
Therefore, fix the bug by removing the BUG() check. We're reverting to
an older, less safe state, but that's generally okay since
key_value_info is always preferentially read, so we'd always read the
correct values when we iterate a config set in the middle of a config
parse (like we are here). The reverse would be wrong, but extremely
unlikely to happen since very few callers parse config without going
through a config set.
[1] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leakfixes
* rj/leakfixes:
tests: mark as passing with SANITIZE=leak
config: fix a leak in git_config_copy_or_rename_section_in_file
branch: fix a leak in cmd_branch
branch: fix a leak in setup_tracking
rev-parse: fix a leak with --abbrev-ref
branch: fix a leak in setup_tracking
branch: fix a leak in check_tracking_branch
branch: fix a leak in inherit_tracking
branch: fix a leak in dwim_and_setup_tracking
remote: fix a leak in query_matches_negative_refspec
config: fix a leak in git_config_copy_or_rename_section_in_file
Introduce a mechanism to disable replace refs globally and per
repository.
* ds/disable-replace-refs:
repository: create read_replace_refs setting
replace-objects: create wrapper around setting
repository: create disable_replace_refs()
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The include of wildmatch.h in git-compat-util.h was added in cebcab189a
(Makefile: add USE_WILDMATCH to use wildmatch as fnmatch, 2013-01-01) as
a way to be able to compile-time force any calls to fnmatch() to instead
invoke wildmatch(). The defines and inline function were removed in
70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15),
and this include in git-compat-util.h has been unnecessary ever since.
Remove the include from git-compat-util.h, but add it to the .c files
that had omitted the direct #include they needed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h. Also move some related inline
functions from cache.h to read-cache.h. The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A branch can have its configuration spread over several configuration
sections. This situation was already foreseen in 52d59cc645 (branch:
add a --copy (-c) option to go with --move (-m), 2017-06-18) when
"branch -c" was introduced.
Unfortunately, a leak was also introduced:
$ git branch foo
$ cat >> .git/config <<EOF
[branch "foo"]
some-key-a = a value
[branch "foo"]
some-key-b = b value
[branch "foo"]
some-key-c = c value
EOF
$ git branch -c foo bar
Direct leak of 130 byte(s) in 2 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_vaddf strbuf.c
... in strbuf_addf strbuf.c
... in store_create_section config.c
... in git_config_copy_or_rename_section_in_file config.c
... in git_config_copy_section_in_file config.c
... in git_config_copy_section config.c
... in copy_or_rename_branch builtin/branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Let's fix it.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 52d59cc645 (branch: add a --copy (-c) option to go with --move (-m),
2017-06-18) a new strbuf variable was introduced, but not released.
Thus, when copying a branch that has any configuration, we have a
leak.
$ git branch foo
$ git config branch.foo.some-key some_value
$ git branch -c foo bar
Direct leak of 65 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_vaddf strbuf.c
... in strbuf_addf strbuf.c
... in store_create_section config.c
... in git_config_copy_or_rename_section_in_file config.c
... in git_config_copy_section_in_file config.c
... in git_config_copy_section config.c
... in copy_or_rename_branch builtin/branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Let's fix that leak.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'read_replace_refs' global specifies whether or not we should
respect the references of the form 'refs/replace/<oid>' to replace which
object we look up when asking for '<oid>'. This global has caused issues
when it is not initialized properly, such as in b6551feadf (merge-tree:
load default git config, 2023-05-10).
To make this more robust, move its config-based initialization out of
git_default_config and into prepare_repo_settings(). This provides a
repository-scoped version of the 'read_replace_refs' global.
The global still has its purpose: it is disabled process-wide by the
GIT_NO_REPLACE_OBJECTS environment variable or by a call to
disable_replace_refs() in some specific Git commands.
Since we already encapsulated the use of the constant inside
replace_refs_enabled(), we can perform the initialization inside that
method, if necessary. This solves the problem of forgetting to check the
config, as we will check it before returning this value.
Due to this encapsulation, the global can move to be static within
replace-object.c.
There is an interesting behavior change possible here: we now have a
repository-scoped understanding of this config value. Thus, if there was
a command that recurses into submodules and might follow replace refs,
then it would now respect the core.useReplaceRefs config value in each
repository.
'git grep --recurse-submodules' is such a command that recurses into
submodules in-process. We can demonstrate the granularity of this config
value via a test in t7814.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move 'repository_format_worktree_config' out of the global scope and into
the 'repository' struct. This change is similar to how
'repository_format_partial_clone' was moved in ebaf3bcf1a (repository: move
global r_f_p_c to repo struct, 2021-06-17), adding it to the 'repository'
struct and updating 'setup.c' & 'repository.c' functions to assign the value
appropriately.
The primary goal of this change is to be able to load the worktree config of
a submodule depending on whether that submodule - not its superproject - has
'extensions.worktreeConfig' enabled. To ensure 'do_git_config_sequence()'
has access to the newly repo-scoped configuration, add a 'struct repository'
argument to 'do_git_config_sequence()' and pass it the 'repo' value from
'config_with_options()'.
Finally, add/update tests in 't3007-ls-files-recurse-submodules.sh' to
verify 'extensions.worktreeConfig' is read an used independently by
superprojects and submodules.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a 'struct repository' argument to 'config_with_options()' and remove the
'repo' field from 'struct git_config_source'.
A 'struct repository' instance was originally added to the config source in
e3e8bf046e (submodule-config: pass repo upon blob config read, 2021-08-16)
to improve how submodule blob config content was accessed. At the time, this
was the only use for a 'repository' instance, so it was naturally added only
where it was needed: to 'struct git_config_source'. However, in upcoming
patches, 'config_with_options()' will need the repository instance to access
extension information (regardless of whether a 'config_source' exists). To
make the 'struct repository' instance more easily accessible, move it into
the function's arguments.
Update all callers of 'config_with_options()' to pass the appropriate 'repo'
value:
* in 'builtin/config.c', use 'the_repository'
* in 'submodule--config.c', use the 'repo' arg in 'config_from_gitmodules()'
* in 'read_[very_]early_config()' & 'read_protected_config()', set 'repo' to
NULL (repository instances aren't available there)
* in 'populate_remote_urls()', use the repo instance that has been added to
the 'struct config_include_data'
* in 'repo_read_config()', use the given 'repo' arg
Finally, note that this patch eliminates the fallback to 'the_repository'
that previously existed for the 'config_source' repo instance if it was
NULL. The fallback is no longer necessary, as the 'repo' is set explicitly
in all cases where it is needed.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'do_git_config_sequence()' to read the worktree config from
'config.worktree' in 'opts->git_dir' rather than the gitdir of
'the_repository'.
The worktree config is loaded from the path returned by
'git_pathdup("config.worktree")', the 'config.worktree' relative to the
gitdir of 'the_repository'. If loading the config for a submodule, this path
is incorrect, since 'the_repository' is the superproject. 'opts->git_dir' is
the gitdir of the submodule being configured, so the config file in that
location should be read instead.
To ensure the use of 'opts->git_dir' is safe, require that 'opts->git_dir'
is set if-and-only-if 'opts->commondir' is set (rather than "only-if" as it
is now). In all current usage of 'config_options', these values are set
together, so the stricter check does not change any behavior.
Finally, add tests to 't3007-ls-files-recurse-submodules.sh' to verify the
corrected config is loaded. Use 'ls-files' to test this because, unlike some
other '--recurse-submodules' commands, 'ls-files' parses the config of the
submodule in the same process as the superproject (via 'show_submodule()' ->
'repo_read_index()' -> 'prepare_repo_settings()'). As a result,
'the_repository' points to the config of the superproject but the
commondir/gitdir in the config sequence will be that of the submodule,
providing the exact scenario needed to verify this patch.
The first test ('--recurse-submodules parses submodule repo config') checks
that the submodule's *repo* config is read when running 'ls-files' on the
superproject; this confirms already-working behavior, serving as a reference
for how worktree config parsing should behave. The second test
('--recurse-submodules parses submodule worktree config') tests the same
scenario as the previous but instead using the *worktree* config,
demonstrating the corrected behavior. The 'test_config' helper is extended
for this case so that it properly applies the '--worktree' option to the
configure/unconfigure operations it performs.
Note that, although the submodule worktree config is now parsed instead of
the superproject's, 'extensions.worktreeConfig' in the superproject still
controls whether or not the worktree config is enabled at all in the
submodule. This will be fixed in a later patch.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More header clean-up.
* en/header-split-cache-h-part-2: (22 commits)
reftable: ensure git-compat-util.h is the first (indirect) include
diff.h: reduce unnecessary includes
object-store.h: reduce unnecessary includes
commit.h: reduce unnecessary includes
fsmonitor: reduce includes of cache.h
cache.h: remove unnecessary headers
treewide: remove cache.h inclusion due to previous changes
cache,tree: move basic name compare functions from read-cache to tree
cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
hash-ll.h: split out of hash.h to remove dependency on repository.h
tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
dir.h: move DTYPE defines from cache.h
versioncmp.h: move declarations for versioncmp.c functions from cache.h
ws.h: move declarations for ws.c functions from cache.h
match-trees.h: move declarations for match-trees.c functions from cache.h
pkt-line.h: move declarations for pkt-line.c functions from cache.h
base85.h: move declarations for base85.c functions from cache.h
copy.h: move declarations for copy.c functions from cache.h
server-info.h: move declarations for server-info.c functions from cache.h
packfile.h: move pack_window and pack_entry from cache.h
...
Header clean-up.
* en/header-split-cache-h: (24 commits)
protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
mailmap, quote: move declarations of global vars to correct unit
treewide: reduce includes of cache.h in other headers
treewide: remove double forward declaration of read_in_full
cache.h: remove unnecessary includes
treewide: remove cache.h inclusion due to pager.h changes
pager.h: move declarations for pager.c functions from cache.h
treewide: remove cache.h inclusion due to editor.h changes
editor: move editor-related functions and declarations into common file
treewide: remove cache.h inclusion due to object.h changes
object.h: move some inline functions and defines from cache.h
treewide: remove cache.h inclusion due to object-file.h changes
object-file.h: move declarations for object-file.c functions from cache.h
treewide: remove cache.h inclusion due to git-zlib changes
git-zlib: move declarations for git-zlib functions from cache.h
treewide: remove cache.h inclusion due to object-name.h changes
object-name.h: move declarations for object-name.c functions from cache.h
treewide: remove unnecessary cache.h inclusion
treewide: be explicit about dependence on mem-pool.h
treewide: be explicit about dependence on oid-array.h
...
* maint-2.39: (34 commits)
Git 2.39.3
Git 2.38.5
Git 2.37.7
Git 2.36.6
Git 2.35.8
Makefile: force -O0 when compiling with SANITIZE=leak
Git 2.34.8
Git 2.33.8
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
range-diff: use ssize_t for parsed "len" in read_patches()
...
* maint-2.38: (32 commits)
Git 2.38.5
Git 2.37.7
Git 2.36.6
Git 2.35.8
Git 2.34.8
Git 2.33.8
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
...
* maint-2.37: (31 commits)
Git 2.37.7
Git 2.36.6
Git 2.35.8
Git 2.34.8
Git 2.33.8
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
...
* maint-2.36: (30 commits)
Git 2.36.6
Git 2.35.8
Git 2.34.8
Git 2.33.8
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
...
* maint-2.35: (29 commits)
Git 2.35.8
Git 2.34.8
Git 2.33.8
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
http: support CURLOPT_PROTOCOLS_STR
...
* maint-2.34: (28 commits)
Git 2.34.8
Git 2.33.8
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
...
* maint-2.33: (27 commits)
Git 2.33.8
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
...
* maint-2.32: (26 commits)
Git 2.32.7
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
ci: install python on ubuntu
...
* maint-2.31: (25 commits)
Git 2.31.8
tests: avoid using `test_i18ncmp`
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
ci: install python on ubuntu
ci: use the same version of p4 on both Linux and macOS
...
* maint-2.30: (23 commits)
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
ci: install python on ubuntu
ci: use the same version of p4 on both Linux and macOS
ci: remove the pipe after "p4 -V" to catch errors
github-actions: run gcc-8 on ubuntu-20.04 image
...
As a defense-in-depth measure to guard against any potentially-unknown
buffer overflows in `copy_or_rename_section_in_file()`, refuse to work
with overly-long lines in a gitconfig.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
There are a couple of spots within `copy_or_rename_section_in_file()`
that incorrectly use an `int` to track an offset within a string, which
may truncate or wrap around to a negative value.
Historically it was impossible to have a line longer than 1024 bytes
anyway, since we used fgets() with a fixed-size buffer of exactly that
length. But the recent change to use a strbuf permits us to read lines
of arbitrary length, so it's possible for a malicious input to cause us
to overflow past INT_MAX and do an out-of-bounds array read.
Practically speaking, however, this should never happen, since it
requires 2GB section names or values, which are unrealistic in
non-malicious circumstances.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When renaming (or deleting) a section of configuration, Git uses the
function `git_config_copy_or_rename_section_in_file()` to rewrite the
configuration file after applying the rename or deletion to the given
section.
To do this, Git repeatedly calls `fgets()` to read the existing
configuration data into a fixed size buffer.
When the configuration value under `old_name` exceeds the size of the
buffer, we will call `fgets()` an additional time even if there is no
newline in the configuration file, since our read length is capped at
`sizeof(buf)`.
If the first character of the buffer (after zero or more characters
satisfying `isspace()`) is a '[', Git will incorrectly treat it as
beginning a new section when the original section is being removed. In
other words, a configuration value satisfying this criteria can
incorrectly be considered as a new secftion instead of a variable in the
original section.
Avoid this issue by using a variable-width buffer in the form of a
strbuf rather than a fixed-with region on the stack. A couple of small
points worth noting:
- Using a strbuf will cause us to allocate arbitrary sizes to match
the length of each line. In practice, we don't expect any
reasonable configuration files to have lines that long, and a
bandaid will be introduced in a later patch to ensure that this is
the case.
- We are using strbuf_getwholeline() here instead of strbuf_getline()
in order to match `fgets()`'s behavior of leaving the trailing LF
character on the buffer (as well as a trailing NUL).
This could be changed later, but using strbuf_getwholeline() changes
the least about this function's implementation, so it is picked as
the safest path.
- It is temping to want to replace the loop to skip over characters
matching isspace() at the beginning of the buffer with a convenience
function like `strbuf_ltrim()`. But this is the wrong approach for a
couple of reasons:
First, it involves a potentially large and expensive `memmove()`
which we would like to avoid. Second, and more importantly, we also
*do* want to preserve those spaces to avoid changing the output of
other sections.
In all, this patch is a minimal replacement of the fixed-width buffer in
`git_config_copy_or_rename_section_in_file()` to instead use a `struct
strbuf`.
Reported-by: André Baptista <andre@ethiack.com>
Reported-by: Vítor Pinho <vitor@ethiack.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>