Compare commits

..

925 Commits

Author SHA1 Message Date
af778cd9be Git 2.32.4
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:41:15 -04:00
9cbd2827c5 Sync with 2.31.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:40:44 -04:00
ecf9b4a443 Git 2.31.5
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:39:26 -04:00
122512967e Sync with 2.30.6
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:39:15 -04:00
abd4d67ab0 Git 2.30.6
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:38:16 -04:00
067aa8fb41 t2080: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:27:18 -04:00
4a7dab5ce4 t1092: prepare for changing protocol.file.allow
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:27:14 -04:00
0ca6ead81e alias.c: reject too-long cmdline strings in split_cmdline()
This function improperly uses an int to represent the number of entries
in the resulting argument array. This allows a malicious actor to
intentionally overflow the return value, leading to arbitrary heap
writes.

Because the resulting argv array is typically passed to execv(), it may
be possible to leverage this attack to gain remote code execution on a
victim machine. This was almost certainly the case for certain
configurations of git-shell until the previous commit limited the size
of input it would accept. Other calls to split_cmdline() are typically
limited by the size of argv the OS is willing to hand us, so are
similarly protected.

So this is not strictly fixing a known vulnerability, but is a hardening
of the function that is worth doing to protect against possible unknown
vulnerabilities.

One approach to fixing this would be modifying the signature of
`split_cmdline()` to look something like:

    int split_cmdline(char *cmdline, const char ***argv, size_t *argc);

Where the return value of `split_cmdline()` is negative for errors, and
zero otherwise. If non-NULL, the `*argc` pointer is modified to contain
the size of the `**argv` array.

But this implies an absurdly large `argv` array, which more than likely
larger than the system's argument limit. So even if split_cmdline()
allowed this, it would fail immediately afterwards when we called
execv(). So instead of converting all of `split_cmdline()`'s callers to
work with `size_t` types in this patch, instead pursue the minimal fix
here to prevent ever returning an array with more than INT_MAX entries
in it.

Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
71ad7fe1bc shell: limit size of interactive commands
When git-shell is run in interactive mode (which must be enabled by
creating $HOME/git-shell-commands), it reads commands from stdin, one
per line, and executes them.

We read the commands with git_read_line_interactively(), which uses a
strbuf under the hood. That means we'll accept an input of arbitrary
size (limited only by how much heap we can allocate). That creates two
problems:

  - the rest of the code is not prepared to handle large inputs. The
    most serious issue here is that split_cmdline() uses "int" for most
    of its types, which can lead to integer overflow and out-of-bounds
    array reads and writes. But even with that fixed, we assume that we
    can feed the command name to snprintf() (via xstrfmt()), which is
    stuck for historical reasons using "int", and causes it to fail (and
    even trigger a BUG() call).

  - since the point of git-shell is to take input from untrusted or
    semi-trusted clients, it's a mild denial-of-service. We'll allocate
    as many bytes as the client sends us (actually twice as many, since
    we immediately duplicate the buffer).

We can fix both by just limiting the amount of per-command input we're
willing to receive.

We should also fix split_cmdline(), of course, which is an accident
waiting to happen, but that can come on top. Most calls to
split_cmdline(), including the other one in git-shell, are OK because
they are reading from an OS-provided argv, which is limited in practice.
This patch should eliminate the immediate vulnerabilities.

I picked 4MB as an arbitrary limit. It's big enough that nobody should
ever run into it in practice (since the point is to run the commands via
exec, we're subject to OS limits which are typically much lower). But
it's small enough that allocating it isn't that big a deal.

The code is mostly just swapping out fgets() for the strbuf call, but we
have to add a few niceties like flushing and trimming line endings. We
could simplify things further by putting the buffer on the stack, but
4MB is probably a bit much there. Note that we'll _always_ allocate 4MB,
which for normal, non-malicious requests is more than we would before
this patch. But on the other hand, other git programs are happy to use
96MB for a delta cache. And since we'd never touch most of those pages,
on a lazy-allocating OS like Linux they won't even get allocated to
actual RAM.

The ideal would be a version of strbuf_getline() that accepted a maximum
value. But for a minimal vulnerability fix, let's keep things localized
and simple. We can always refactor further on top.

The included test fails in an obvious way with ASan or UBSan (which
notice the integer overflow and out-of-bounds reads). Without them, it
fails in a less obvious way: we may segfault, or we may try to xstrfmt()
a long string, leading to a BUG(). Either way, it fails reliably before
this patch, and passes with it. Note that we don't need an EXPENSIVE
prereq on it. It does take 10-15s to fail before this patch, but with
the new limit, we fail almost immediately (and the perl process
generating 2GB of data exits via SIGPIPE).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
32696a4cbe shell: add basic tests
We have no tests of even basic functionality of git-shell. Let's add a
couple of obvious ones. This will serve as a framework for adding tests
for new things we fix, as well as making sure we don't screw anything up
too badly while doing so.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
a1d4f67c12 transport: make protocol.file.allow be "user" by default
An earlier patch discussed and fixed a scenario where Git could be used
as a vector to exfiltrate sensitive data through a Docker container when
a potential victim clones a suspicious repository with local submodules
that contain symlinks.

That security hole has since been plugged, but a similar one still
exists.  Instead of convincing a would-be victim to clone an embedded
submodule via the "file" protocol, an attacker could convince an
individual to clone a repository that has a submodule pointing to a
valid path on the victim's filesystem.

For example, if an individual (with username "foo") has their home
directory ("/home/foo") stored as a Git repository, then an attacker
could exfiltrate data by convincing a victim to clone a malicious
repository containing a submodule pointing at "/home/foo/.git" with
`--recurse-submodules`. Doing so would expose any sensitive contents in
stored in "/home/foo" tracked in Git.

For systems (such as Docker) that consider everything outside of the
immediate top-level working directory containing a Dockerfile as
inaccessible to the container (with the exception of volume mounts, and
so on), this is a violation of trust by exposing unexpected contents in
the working copy.

To mitigate the likelihood of this kind of attack, adjust the "file://"
protocol's default policy to be "user" to prevent commands that execute
without user input (including recursive submodule initialization) from
taking place by default.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
f4a32a550f t/t9NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that interact with submodules a handful of times use
`test_config_global`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
0d3beb71da t/t7NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
0f21b8f468 t/t6NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
225d2d50cc t/t5NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
ac7e57fa28 t/t4NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
f8d510ed0b t/t3NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
99f4abb8da t/2NNNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
8a96dbcb33 t/t1NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
7de0c306f7 t/lib-submodule-update.sh: allow local submodules
To prepare for changing the default value of `protocol.file.allow` to
"user", update the `prolog()` function in lib-submodule-update to allow
submodules to be cloned over the file protocol.

This is used by a handful of submodule-related test scripts, which
themselves will have to tweak the value of `protocol.file.allow` in
certain locations. Those will be done in subsequent commits.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
6f054f9fb3 builtin/clone.c: disallow --local clones with symlinks
When cloning a repository with `--local`, Git relies on either making a
hardlink or copy to every file in the "objects" directory of the source
repository. This is done through the callpath `cmd_clone()` ->
`clone_local()` -> `copy_or_link_directory()`.

The way this optimization works is by enumerating every file and
directory recursively in the source repository's `$GIT_DIR/objects`
directory, and then either making a copy or hardlink of each file. The
only exception to this rule is when copying the "alternates" file, in
which case paths are rewritten to be absolute before writing a new
"alternates" file in the destination repo.

One quirk of this implementation is that it dereferences symlinks when
cloning. This behavior was most recently modified in 36596fd2df (clone:
better handle symlinked files at .git/objects/, 2019-07-10), which
attempted to support `--local` clones of repositories with symlinks in
their objects directory in a platform-independent way.

Unfortunately, this behavior of dereferencing symlinks (that is,
creating a hardlink or copy of the source's link target in the
destination repository) can be used as a component in attacking a
victim by inadvertently exposing the contents of file stored outside of
the repository.

Take, for example, a repository that stores a Dockerfile and is used to
build Docker images. When building an image, Docker copies the directory
contents into the VM, and then instructs the VM to execute the
Dockerfile at the root of the copied directory. This protects against
directory traversal attacks by copying symbolic links as-is without
dereferencing them.

That is, if a user has a symlink pointing at their private key material
(where the symlink is present in the same directory as the Dockerfile,
but the key itself is present outside of that directory), the key is
unreadable to a Docker image, since the link will appear broken from the
container's point of view.

This behavior enables an attack whereby a victim is convinced to clone a
repository containing an embedded submodule (with a URL like
"file:///proc/self/cwd/path/to/submodule") which has a symlink pointing
at a path containing sensitive information on the victim's machine. If a
user is tricked into doing this, the contents at the destination of
those symbolic links are exposed to the Docker image at runtime.

One approach to preventing this behavior is to recreate symlinks in the
destination repository. But this is problematic, since symlinking the
objects directory are not well-supported. (One potential problem is that
when sharing, e.g. a "pack" directory via symlinks, different writers
performing garbage collection may consider different sets of objects to
be reachable, enabling a situation whereby garbage collecting one
repository may remove reachable objects in another repository).

Instead, prohibit the local clone optimization when any symlinks are
present in the `$GIT_DIR/objects` directory of the source repository.
Users may clone the repository again by prepending the "file://" scheme
to their clone URL, or by adding the `--no-local` option to their `git
clone` invocation.

The directory iterator used by `copy_or_link_directory()` must no longer
dereference symlinks (i.e., it *must* call `lstat()` instead of `stat()`
in order to discover whether or not there are symlinks present). This has
no bearing on the overall behavior, since we will immediately `die()` on
encounter a symlink.

Note that t5604.33 suggests that we do support local clones with
symbolic links in the source repository's objects directory, but this
was likely unintentional, or at least did not take into consideration
the problem with sharing parts of the objects directory with symbolic
links at the time. Update this test to reflect which options are and
aren't supported.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
656d9a24f6 Git 2.32.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-06-23 12:35:32 +02:00
fc0c773028 Sync with 2.31.4
* maint-2.31:
  Git 2.31.4
  Git 2.30.5
  setup: tighten ownership checks post CVE-2022-24765
  git-compat-util: allow root to access both SUDO_UID and root owned
  t0034: add negative tests and allow git init to mostly work under sudo
  git-compat-util: avoid failing dir ownership checks if running privileged
  t: regression git needs safe.directory when using sudo
2022-06-23 12:35:30 +02:00
5b1c746c35 Git 2.31.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-06-23 12:35:25 +02:00
2f8809f9a1 Sync with 2.30.5
* maint-2.30:
  Git 2.30.5
  setup: tighten ownership checks post CVE-2022-24765
  git-compat-util: allow root to access both SUDO_UID and root owned
  t0034: add negative tests and allow git init to mostly work under sudo
  git-compat-util: avoid failing dir ownership checks if running privileged
  t: regression git needs safe.directory when using sudo
2022-06-23 12:35:23 +02:00
88b7be68a4 Git 2.30.5
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-06-23 12:31:05 +02:00
3b0bf27049 setup: tighten ownership checks post CVE-2022-24765
8959555cee (setup_git_directory(): add an owner check for the top-level
directory, 2022-03-02), adds a function to check for ownership of
repositories using a directory that is representative of it, and ways to
add exempt a specific repository from said check if needed, but that
check didn't account for owership of the gitdir, or (when used) the
gitfile that points to that gitdir.

An attacker could create a git repository in a directory that they can
write into but that is owned by the victim to work around the fix that
was introduced with CVE-2022-24765 to potentially run code as the
victim.

An example that could result in privilege escalation to root in *NIX would
be to set a repository in a shared tmp directory by doing (for example):

  $ git -C /tmp init

To avoid that, extend the ensure_valid_ownership function to be able to
check for all three paths.

This will have the side effect of tripling the number of stat() calls
when a repository is detected, but the effect is expected to be likely
minimal, as it is done only once during the directory walk in which Git
looks for a repository.

Additionally make sure to resolve the gitfile (if one was used) to find
the relevant gitdir for checking.

While at it change the message printed on failure so it is clear we are
referring to the repository by its worktree (or gitdir if it is bare) and
not to a specific directory.

Helped-by: Junio C Hamano <junio@pobox.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-06-23 12:31:05 +02:00
b779214eaf Merge branch 'cb/path-owner-check-with-sudo'
With a recent update to refuse access to repositories of other
people by default, "sudo make install" and "sudo git describe"
stopped working.  This series intends to loosen it while keeping
the safety.

* cb/path-owner-check-with-sudo:
  t0034: add negative tests and allow git init to mostly work under sudo
  git-compat-util: avoid failing dir ownership checks if running privileged
  t: regression git needs safe.directory when using sudo

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-06-23 12:31:04 +02:00
6b11e3d52e git-compat-util: allow root to access both SUDO_UID and root owned
Previous changes introduced a regression which will prevent root for
accessing repositories owned by thyself if using sudo because SUDO_UID
takes precedence.

Loosen that restriction by allowing root to access repositories owned
by both uid by default and without having to add a safe.directory
exception.

A previous workaround that was documented in the tests is no longer
needed so it has been removed together with its specially crafted
prerequisite.

Helped-by: Johanness Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-17 14:03:08 -07:00
b9063afda1 t0034: add negative tests and allow git init to mostly work under sudo
Add a support library that provides one function that can be used
to run a "scriplet" of commands through sudo and that helps invoking
sudo in the slightly awkward way that is required to ensure it doesn't
block the call (if shell was allowed as tested in the prerequisite)
and it doesn't run the command through a different shell than the one
we intended.

Add additional negative tests as suggested by Junio and that use a
new workspace that is owned by root.

Document a regression that was introduced by previous commits where
root won't be able anymore to access directories they own unless
SUDO_UID is removed from their environment.

The tests document additional ways that this new restriction could
be worked around and the documentation explains why it might be instead
considered a feature, but a "fix" is planned for a future change.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-12 18:12:23 -07:00
ae9abbb63e git-compat-util: avoid failing dir ownership checks if running privileged
bdc77d1d68 (Add a function to determine whether a path is owned by the
current user, 2022-03-02) checks for the effective uid of the running
process using geteuid() but didn't account for cases where that user was
root (because git was invoked through sudo or a compatible tool) and the
original uid that repository trusted for its config was no longer known,
therefore failing the following otherwise safe call:

  guy@renard ~/Software/uncrustify $ sudo git describe --always --dirty
  [sudo] password for guy:
  fatal: unsafe repository ('/home/guy/Software/uncrustify' is owned by someone else)

Attempt to detect those cases by using the environment variables that
those tools create to keep track of the original user id, and do the
ownership check using that instead.

This assumes the environment the user is running on after going
privileged can't be tampered with, and also adds code to restrict that
the new behavior only applies if running as root, therefore keeping the
most common case, which runs unprivileged, from changing, but because of
that, it will miss cases where sudo (or an equivalent) was used to change
to another unprivileged user or where the equivalent tool used to raise
privileges didn't track the original id in a sudo compatible way.

Because of compatibility with sudo, the code assumes that uid_t is an
unsigned integer type (which is not required by the standard) but is used
that way in their codebase to generate SUDO_UID.  In systems where uid_t
is signed, sudo might be also patched to NOT be unsigned and that might
be able to trigger an edge case and a bug (as described in the code), but
it is considered unlikely to happen and even if it does, the code would
just mostly fail safely, so there was no attempt either to detect it or
prevent it by the code, which is something that might change in the future,
based on expected user feedback.

Reported-by: Guy Maurel <guy.j@maurel.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Randall Becker <rsbecker@nexbridge.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-12 18:12:23 -07:00
5f1a3fec8c t: regression git needs safe.directory when using sudo
Originally reported after release of v2.35.2 (and other maint branches)
for CVE-2022-24765 and blocking otherwise harmless commands that were
done using sudo in a repository that was owned by the user.

Add a new test script with very basic support to allow running git
commands through sudo, so a reproduction could be implemented and that
uses only `git status` as a proxy of the issue reported.

Note that because of the way sudo interacts with the system, a much
more complete integration with the test framework will require a lot
more work and that was therefore intentionally punted for now.

The current implementation requires the execution of a special cleanup
function which should always be kept as the last "test" or otherwise
the standard cleanup functions will fail because they can't remove
the root owned directories that are used.  This also means that if
failures are found while running, the specifics of the failure might
not be kept for further debugging and if the test was interrupted, it
will be necessary to clean the working directory manually before
restarting by running:

  $ sudo rm -rf trash\ directory.t0034-root-safe-directory/

The test file also uses at least one initial "setup" test that creates
a parallel execution directory under the "root" sub directory, which
should be used as top level directory for all repositories that are
used in this test file.  Unlike all other tests the repository provided
by the test framework should go unused.

Special care should be taken when invoking commands through sudo, since
the environment is otherwise independent from what the test framework
setup and might have changed the values for HOME, SHELL and dropped
several relevant environment variables for your test.  Indeed `git status`
was used as a proxy because it doesn't even require commits in the
repository to work and usually doesn't require much from the environment
to run, but a future patch will add calls to `git init` and that will
fail to honor the default branch name, unless that setting is NOT
provided through an environment variable (which means even a CI run
could fail that test if enabled incorrectly).

A new SUDO prerequisite is provided that does some sanity checking
to make sure the sudo command that will be used allows for passwordless
execution as root without restrictions and doesn't mess with git's
execution path.  This matches what is provided by the macOS agents that
are used as part of GitHub actions and probably nowhere else.

Most of those characteristics make this test mostly only suitable for
CI, but it might be executed locally if special care is taken to provide
for all of them in the local configuration and maybe making use of the
sudo credential cache by first invoking sudo, entering your password if
needed, and then invoking the test with:

  $ GIT_TEST_ALLOW_SUDO=YES ./t0034-root-safe-directory.sh

If it fails to run, then it means your local setup wouldn't work for the
test because of the configuration sudo has or other system settings, and
things that might help are to comment out sudo's secure_path config, and
make sure that the account you are using has no restrictions on the
commands it can run through sudo, just like is provided for the user in
the CI.

For example (assuming a username of marta for you) something probably
similar to the following entry in your /etc/sudoers (or equivalent) file:

  marta	ALL=(ALL:ALL) NOPASSWD: ALL

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-12 18:12:23 -07:00
1530434434 Git 2.32.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 15:21:26 -07:00
09f66d65f8 Git 2.31.3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 15:21:08 -07:00
17083c79ae Git 2.30.4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 13:31:29 -07:00
0f85c4a30b setup: opt-out of check with safe.directory=*
With the addition of the safe.directory in 8959555ce
(setup_git_directory(): add an owner check for the top-level directory,
2022-03-02) released in v2.35.2, we are receiving feedback from a
variety of users about the feature.

Some users have a very large list of shared repositories and find it
cumbersome to add this config for every one of them.

In a more difficult case, certain workflows involve running Git commands
within containers. The container boundary prevents any global or system
config from communicating `safe.directory` values from the host into the
container. Further, the container almost always runs as a different user
than the owner of the directory in the host.

To simplify the reactions necessary for these users, extend the
definition of the safe.directory config value to include a possible '*'
value. This value implies that all directories are safe, providing a
single setting to opt-out of this protection.

Note that an empty assignment of safe.directory clears all previous
values, and this is already the case with the "if (!value || !*value)"
condition.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 12:42:51 -07:00
bb50ec3cc3 setup: fix safe.directory key not being checked
It seems that nothing is ever checking to make sure the safe directories
in the configs actually have the key safe.directory, so some unrelated
config that has a value with a certain directory would also make it a
safe directory.

Signed-off-by: Matheus Valadares <me@m28.io>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 12:42:51 -07:00
e47363e5a8 t0033: add tests for safe.directory
It is difficult to change the ownership on a directory in our test
suite, so insert a new GIT_TEST_ASSUME_DIFFERENT_OWNER environment
variable to trick Git into thinking we are in a differently-owned
directory. This allows us to test that the config is parsed correctly.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 12:42:49 -07:00
9bcd7a8eca Git 2.32.1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-24 00:31:29 +01:00
201b0c7af6 Sync with 2.31.2
* maint-2.31:
  Git 2.31.2
  Git 2.30.3
  setup_git_directory(): add an owner check for the top-level directory
  Add a function to determine whether a path is owned by the current user
2022-03-24 00:31:28 +01:00
44de39c45c Git 2.31.2
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-24 00:24:29 +01:00
6a2381a3e5 Sync with 2.30.3
* maint-2.30:
  Git 2.30.3
  setup_git_directory(): add an owner check for the top-level directory
  Add a function to determine whether a path is owned by the current user
2022-03-24 00:24:29 +01:00
cb95038137 Git 2.30.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-24 00:22:17 +01:00
fdcad5a53e Fix GIT_CEILING_DIRECTORIES with C:\ and the likes
When determining the length of the longest ancestor of a given path with
respect to to e.g. `GIT_CEILING_DIRECTORIES`, we special-case the root
directory by returning 0 (i.e. we pretend that the path `/` does not end
in a slash by virtually stripping it).

That is the correct behavior because when normalizing paths, the root
directory is special: all other directory paths have their trailing
slash stripped, but not the root directory's path (because it would
become the empty string, which is not a legal path).

However, this special-casing of the root directory in
`longest_ancestor_length()` completely forgets about Windows-style root
directories, e.g. `C:\`. These _also_ get normalized with a trailing
slash (because `C:` would actually refer to the current directory on
that drive, not necessarily to its root directory).

In fc56c7b34b (mingw: accomodate t0060-path-utils for MSYS2,
2016-01-27), we almost got it right. We noticed that
`longest_ancestor_length()` expects a slash _after_ the matched prefix,
and if the prefix already ends in a slash, the normalized path won't
ever match and -1 is returned.

But then that commit went astray: The correct fix is not to adjust the
_tests_ to expect an incorrect -1 when that function is fed a prefix
that ends in a slash, but instead to treat such a prefix as if the
trailing slash had been removed.

Likewise, that function needs to handle the case where it is fed a path
that ends in a slash (not only a prefix that ends in a slash): if it
matches the prefix (plus trailing slash), we still need to verify that
the path does not end there, otherwise the prefix is not actually an
ancestor of the path but identical to it (and we need to return -1 in
that case).

With these two adjustments, we no longer need to play games in t0060
where we only add `$rootoff` if the passed prefix is different from the
MSYS2 pseudo root, instead we also add it for the MSYS2 pseudo root
itself. We do have to be careful to skip that logic entirely for Windows
paths, though, because they do are not subject to that MSYS2 pseudo root
treatment.

This patch fixes the scenario where a user has set
`GIT_CEILING_DIRECTORIES=C:\`, which would be ignored otherwise.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-24 00:21:08 +01:00
8959555cee setup_git_directory(): add an owner check for the top-level directory
It poses a security risk to search for a git directory outside of the
directories owned by the current user.

For example, it is common e.g. in computer pools of educational
institutes to have a "scratch" space: a mounted disk with plenty of
space that is regularly swiped where any authenticated user can create
a directory to do their work. Merely navigating to such a space with a
Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/`
can lead to a compromised account.

The same holds true in multi-user setups running Windows, as `C:\` is
writable to every authenticated user by default.

To plug this vulnerability, we stop Git from accepting top-level
directories owned by someone other than the current user. We avoid
looking at the ownership of each and every directories between the
current and the top-level one (if there are any between) to avoid
introducing a performance bottleneck.

This new default behavior is obviously incompatible with the concept of
shared repositories, where we expect the top-level directory to be owned
by only one of its legitimate users. To re-enable that use case, we add
support for adding exceptions from the new default behavior via the
config setting `safe.directory`.

The `safe.directory` config setting is only respected in the system and
global configs, not from repository configs or via the command-line, and
can have multiple values to allow for multiple shared repositories.

We are particularly careful to provide a helpful message to any user
trying to use a shared repository.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-21 13:16:26 +01:00
bdc77d1d68 Add a function to determine whether a path is owned by the current user
This function will be used in the next commit to prevent
`setup_git_directory()` from discovering a repository in a directory
that is owned by someone other than the current user.

Note: We cannot simply use `st.st_uid` on Windows just like we do on
Linux and other Unix-like platforms: according to
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/stat-functions
this field is always zero on Windows (because Windows' idea of a user ID
does not fit into a single numerical value). Therefore, we have to do
something a little involved to replicate the same functionality there.

Also note: On Windows, a user's home directory is not actually owned by
said user, but by the administrator. For all practical purposes, it is
under the user's control, though, therefore we pretend that it is owned
by the user.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-21 13:16:26 +01:00
2a9a5862e5 Merge branch 'cb/mingw-gmtime-r'
Build fix on Windows.

* cb/mingw-gmtime-r:
  mingw: avoid fallback for {local,gm}time_r()

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-17 12:52:12 +01:00
6e7ad1e4c2 mingw: avoid fallback for {local,gm}time_r()
mingw-w64's pthread_unistd.h had a bug that mistakenly (because there is
no support for the *lockfile() functions required[1]) defined
_POSIX_THREAD_SAFE_FUNCTIONS and that was being worked around since
3ecd153a3b (compat/mingw: support MSys2-based MinGW build, 2016-01-14).

The bug was fixed in winphtreads, but as a side effect, leaves the
reentrant functions from time.h no longer visible and therefore breaks
the build.

Since the intention all along was to avoid using the fallback functions,
formalize the use of POSIX by setting the corresponding feature flag and
compile out the implementation for the fallback functions.

[1] https://unix.org/whitepapers/reentrant.html

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-17 12:52:12 +01:00
ebf3c04b26 Git 2.32
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-06 15:40:01 +09:00
15664a5f35 Merge tag 'l10n-2.32.0-rnd1.1' of git://github.com/git-l10n/git-po
l10n-2.32.0-rnd1.1

* tag 'l10n-2.32.0-rnd1.1' of git://github.com/git-l10n/git-po: (25 commits)
  l10n: es: 2.32.0 round 1
  l10n: zh_CN: for git v2.32.0 l10n round 1
  l10n: Update Catalan translation
  l10n: de.po: Update German translation for Git v2.32.0
  l10n: README: note on fuzzy translations
  l10n: README: document l10n conventions
  l10n: README: document "core translation"
  l10n: README: document git-po-helper
  l10n: README: add file extention ".md"
  l10n: pt_PT: add Portuguese translations part 3
  l10n: bg.po: Updated Bulgarian translation (5204t)
  l10n: id: po-id for 2.32.0 (round 1)
  l10n: vi.po(5204t): Updated Vietnamese translation for v2.32.0
  l10n: zh_TW.po: localized
  l10n: zh_TW.po: v2.32.0 round 1 (11 untranslated)
  l10n: sv.po: Update Swedish translation (5204t0f0u)
  l10n: fix typos in po/TEAMS
  l10n: fr: v2.32.0 round 1
  l10n: tr: v2.32.0-r1
  l10n: fr: fixed inconsistencies
  ...
2021-06-06 15:39:21 +09:00
0d3505e286 Merge branch 'rs/parallel-checkout-test-fix'
Test fix.

* rs/parallel-checkout-test-fix:
  parallel-checkout: avoid dash local bug in tests
2021-06-06 15:39:10 +09:00
0481af98ba Merge branch 'jc/fsync-can-fail-with-eintr'
Last minute portability fix.

* jc/fsync-can-fail-with-eintr:
  fsync(): be prepared to see EINTR
2021-06-06 15:39:10 +09:00
ebee5580ca parallel-checkout: avoid dash local bug in tests
Dash bug https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097
lets the shell erroneously perform field splitting on the expansion of a
command substitution during declaration of a local variable.  It causes
the parallel-checkout tests to fail e.g. when running them with
/bin/dash on MacOS 11.4, where they error out like this:

   ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name

That's because the output of wc -l contains leading spaces and the
returned number of lines is treated as another variable to declare, i.e.
as in "local workers= 0".

Work around it by enclosing the command substitution in quotes.

Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-06 10:40:26 +09:00
8e02217e10 l10n: es: 2.32.0 round 1
Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>
2021-06-05 20:06:23 -05:00
33b62fba4d l10n: zh_CN: for git v2.32.0 l10n round 1
Translate 126 new messages (5204t0f0u) for git 2.32.0.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-05 22:45:18 +08:00
de65c76e55 Merge branch 'fix_typo' of github.com:e-yes/git
* 'fix_typo' of github.com:e-yes/git:
  l10n: ru.po: fix typo in Russian translation
2021-06-05 21:30:30 +08:00
cccdfd2243 fsync(): be prepared to see EINTR
Some platforms, like NonStop do not automatically restart fsync()
when interrupted by a signal, even when that signal is setup with
SA_RESTART.

This can lead to test breakage, e.g., where "--progress" is used,
thus SIGALRM is sent often, and can interrupt an fsync() syscall.

Make sure we deal with such a case by retrying the syscall
ourselves.  Luckily, we call fsync() fron a single wrapper,
fsync_or_die(), so the fix is fairly isolated.

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
[jc: the above two did most of the work---I just tied the loose end]
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-05 22:13:40 +09:00
d5e7f9632f Merge branch 'pt-PT' of github.com:git-l10n-pt-PT/git-po
* 'pt-PT' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: add Portuguese translations part 3
  l10n: pt_PT: add Portuguese translations part 2
2021-06-04 18:59:17 +08:00
a2bb98ba76 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-06-04 06:58:05 +02:00
94d17948af l10n: de.po: Update German translation for Git v2.32.0
Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>
2021-06-02 19:24:10 +02:00
c09b6306c6 Git 2.32-rc3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 12:51:09 +09:00
0b18023d00 contrib/completion: fix zsh completion regression from 59d85a2a05
A recent change to make git-completion.bash use $__git_cmd_idx
in more places broke a number of completions on zsh because it
modified __git_main but did not update __git_zsh_main.

Notably, completions for "add", "branch", "mv" and "push" were
broken as a result of this change.

In addition to the undefined variable usage, "git mv <tab>" also
prints the following error:

	__git_count_arguments:7: bad math expression:
	operand expected at `"1"'

	_git_mv:[:7: unknown condition: -gt

Remove the quotes around $__git_cmd_idx in __git_count_arguments
and set __git_cmd_idx=1 early in __git_zsh_main to fix the
regressions from 59d85a2a05.

This was tested on zsh 5.7.1 (x86_64-apple-darwin19.0).

Suggested-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Acked-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 12:49:40 +09:00
3714fbcb45 l10n: README: note on fuzzy translations
Fuzzy translation problem can occur when updating translations.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
69c13a7880 l10n: README: document l10n conventions
Document the conventions that l10n contributors must follow.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
2fb9d2596f l10n: README: document "core translation"
Contributor for a new language must complete translations of a small set
of l10n messages.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
6d09c53001 l10n: README: document git-po-helper
Document the PO helper program (git-po-helper) with installation and
basic usage.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
e54b271529 l10n: README: add file extention ".md"
Add file extension ".md" to "po/README" to help to display this markdown
file properly.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-02 11:21:26 +08:00
ed125c4f07 Merge branch 'ab/fsck-api-cleanup'
Last minute compilation fix.

* ab/fsck-api-cleanup:
  builtin/fsck.c: don't conflate "int" and "enum" in callback
2021-06-02 07:34:27 +09:00
28abf260a5 builtin/fsck.c: don't conflate "int" and "enum" in callback
Fix a warning on AIX's xlc compiler that's been emitted since my
a1aad71601 (fsck.h: use "enum object_type" instead of "int",
2021-03-28):

    "builtin/fsck.c", line 805.32: 1506-068 (W) Operation between
    types "int(*)(struct object*,enum object_type,void*,struct
    fsck_options*)" and "int(*)(struct object*,int,void*,struct
    fsck_options*)" is not allowed.

I.e. it complains about us assigning a function with a prototype "int"
where we're expecting "enum object_type".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 05:59:15 +09:00
ed1e674f4d l10n: pt_PT: add Portuguese translations part 3
* Correct malformed strings
* Transforming 'não' (no) into affirmative

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-06-01 11:45:52 +01:00
3d33e36c47 Merge branch 'l10n/zh_TW/21-05-20' of github.com:l10n-tw/git-po
* 'l10n/zh_TW/21-05-20' of github.com:l10n-tw/git-po:
  l10n: zh_TW.po: localized
  l10n: zh_TW.po: v2.32.0 round 1 (11 untranslated)
2021-05-30 21:40:59 +08:00
e94005634c Merge branch 'master' of github.com:Softcatala/git-po
* 'master' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2021-05-30 20:45:10 +08:00
fe1c18ba4d l10n: bg.po: Updated Bulgarian translation (5204t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-05-28 17:45:58 +02:00
4e42405f00 Git 2.32-rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-28 13:05:29 +09:00
329d63e7be Merge branch 'en/dir-traversal'
Fix-up to a topic that is already in 'master'.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  Revert "dir: update stale description of treat_directory()"
  Revert "dir: introduce readdir_skip_dot_and_dotdot() helper"
2021-05-28 13:03:00 +09:00
906fc557b7 dir: introduce readdir_skip_dot_and_dotdot() helper
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d->d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 14:02:37 +09:00
eef814828f dir: update stale description of treat_directory()
The documentation comment for treat_directory() was originally written
in 095952 (Teach directory traversal about subprojects, 2007-04-11)
which was before the 'struct dir_struct' split its bitfield of named
options into a 'flags' enum in 7c4c97c0 (Turn the flags in struct
dir_struct into a single variable, 2009-02-16). When those flags
changed, the comment became stale, since members like
'show_other_directories' transitioned into flags like
DIR_SHOW_OTHER_DIRECTORIES.

Update the comments for treat_directory() to use these flag names rather
than the old member names.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 14:02:37 +09:00
2c9f1bfdb4 Revert "dir: update stale description of treat_directory()"
This reverts commit 4e689d8171,
to be replaced with a reworked version.
2021-05-27 14:02:37 +09:00
1df046bcff Revert "dir: introduce readdir_skip_dot_and_dotdot() helper"
This reverts commit b548f0f156,
to be replaced with a reworked version.
2021-05-27 14:02:37 +09:00
5afd72a96f Merge branch 'ab/pack-linkage-fix'
"ld" on Solaris fails to link some test helpers, which has been
worked around by reshuffling the inline function definitions from a
header file to a source file that is the only user of them.

* ab/pack-linkage-fix:
  pack-objects: move static inline from a header to the sole consumer
2021-05-27 12:36:58 +09:00
2f0ca41349 Merge branch 'mt/t2080-cp-symlink-fix'
Test portability fix.

* mt/t2080-cp-symlink-fix:
  t2080: fix cp invocation to copy symlinks instead of following them
2021-05-27 12:36:57 +09:00
f4d715b0ac Merge branch 'ab/send-email-inline-hooks-path'
Code simplification.

* ab/send-email-inline-hooks-path:
  send-email: move "hooks_path" invocation to git-send-email.perl
  send-email: don't needlessly abs_path() the core.hooksPath
2021-05-27 12:36:57 +09:00
1accb34ce0 Merge branch 'ds/t1092-fix-flake-from-progress'
Workaround flaky tests introduced recently.

* ds/t1092-fix-flake-from-progress:
  t1092: revert the "-1" hack for emulating "no progress meter"
  t1092: use GIT_PROGRESS_DELAY for consistent results
2021-05-27 12:36:57 +09:00
7d089fb9b7 pack-objects: move static inline from a header to the sole consumer
Move the code that is only used in builtin/pack-objects.c out of
pack-objects.h.

This fixes an issue where Solaris's SunCC hasn't been able to compile
git since 483fa7f42d (t/helper/test-bitmap.c: initial commit,
2021-03-31).

The real origin of that issue is that in 898eba5e63 (pack-objects:
refer to delta objects by index instead of pointer, 2018-04-14)
utility functions only needed by builtin/pack-objects.c were added to
pack-objects.h. Since then the header has been used in a few other
places, but 483fa7f42d was the first time it was used by test helper.

Since Solaris is stricter about linking and the oe_get_size_slow()
function lives in builtin/pack-objects.c the build started failing
with:

    Undefined                       first referenced
     symbol                             in file
    oe_get_size_slow                    t/helper/test-bitmap.o
    ld: fatal: symbol referencing errors. No output written to t/helper/test-tool

On other platforms this is presumably OK because the compiler and/or
linker detects that the "static inline" functions that reference
oe_get_size_slow() aren't used.

Let's solve this by moving the relevant code from pack-objects.h to
builtin/pack-objects.c. This is almost entirely a code-only move, but
because of the early macro definitions in that file referencing some
of these inline functions we need to move the definition of "static
struct packing_data to_pack" earlier, and declare these inline
functions above the macros.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 12:14:41 +09:00
c2529290f0 Merge branch 'fr_next' of github.com:jnavila/git
* 'fr_next' of github.com:jnavila/git:
  l10n: fr: v2.32.0 round 1
  l10n: fr: fixed inconsistencies
  l10n: fr.po fixed inconsistencies
2021-05-27 10:28:50 +08:00
ea08db7473 t2080: fix cp invocation to copy symlinks instead of following them
t2080 makes a few copies of a test repository and later performs a
branch switch on each one of the copies to verify that parallel checkout
and sequential checkout produce the same results. However, the
repository is copied with `cp -R` which, on some systems, defaults to
following symlinks on the directory hierarchy and copying their target
files instead of copying the symlinks themselves. AIX is one example of
system where this happens. Because the symlinks are not preserved, the
copied repositories have paths that do not match what is in the index,
causing git to abort the checkout operation that we want to test. This
makes the test fail on these systems.

Fix this by copying the repository with the POSIX flag '-P', which
forces cp to copy the symlinks instead of following them. Note that we
already use this flag for other cp invocations in our test suite (see
t7001). With this change, t2080 now passes on AIX.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 09:04:49 +09:00
7cbc0455cc send-email: move "hooks_path" invocation to git-send-email.perl
Move the newly added "hooks_path" API in Git.pm to its only user in
git-send-email.perl. This was added in c8243933c7 (git-send-email:
Respect core.hooksPath setting, 2021-03-23), meaning that it hasn't
yet made it into a non-rc release of git.

The consensus with Git.pm is that we need to be considerate of
out-of-tree users who treat it as a public documented interface. We
should therefore be less willing to add new functionality to it, least
we be stuck supporting it after our own uses for it disappear.

In this case the git-send-email.perl hook invocation will probably be
replaced by a future "git hook run" command, and in the commit
preceding this one the "hooks_path" become nothing but a trivial
wrapper for "rev-parse --git-path hooks" anyway (with no
Cwd::abs_path() call), so let's just inline this command in
git-send-email.perl itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 09:00:59 +09:00
2815326f09 send-email: don't needlessly abs_path() the core.hooksPath
In c8243933c7 (git-send-email: Respect core.hooksPath setting,
2021-03-23) we started supporting core.hooksPath in "send-email". It's
been reported that on Windows[1] doing this by calling abs_path()
results in different canonicalizations of the absolute path.

This wasn't an issue in c8243933c7 itself, but was revealed by my
ea7811b37e (git-send-email: improve --validate error output,
2021-04-06) when we started emitting the path to the hook, which was
previously only internal to git-send-email.perl.

The just-landed 53753a37d0 (t9001-send-email.sh: fix expected
absolute paths on Windows, 2021-05-24) narrowly fixed this issue, but
I believe we can do better here. We should not be relying on whatever
changes Perl's abs_path() makes to the path "rev-parse --git-path
hooks" hands to us. Let's instead trust it, and hand it to Perl's
system() in git-send-email.perl. It will handle either a relative or
absolute path.

So let's revert most of 53753a37d0 and just have "hooks_path" return
what we get from "rev-parse" directly without modification. This has
the added benefit of making the error message friendlier in the common
case, we'll no longer print an absolute path for repository-local hook
errors.

1. http://lore.kernel.org/git/bb30fe2b-cd75-4782-24a6-08bb002a0367@kdbg.org

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 09:00:57 +09:00
a96355d84c t1092: revert the "-1" hack for emulating "no progress meter"
This looked like a good idea, but it seems to break tests on 32-bit
builds rather badly.  Revert to just use "100 thousands must be big
enough" for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-26 06:23:58 +09:00
1df318be63 l10n: id: po-id for 2.32.0 (round 1)
Translate following components:

  * builtin/add.c
  * worktree.c
  * builtin/branch.c
  * builtin/commit.c
  * builtin/merge.c
  * builtin/rebase.c
  * builtin/pull.c
  * diff.c
  * add-interactive.c
  * builtin/log.c
  * builtin/stash.c
  * builtin/tag.c
  * config.c
  * builtin/config.c
  * reset.c
  * builtin/remote.c
  * builtin/rm.c
  * builtin/mv.c
  * builtin/clean.c
  * builtin/help.c
  * archive.c
  * submodule.c
  * builtin/submodule--helper.c
  * submodule-config.c

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-05-25 18:08:16 +07:00
5d5b147345 Merge branch 'mt/init-template-userpath-fix'
Regression fix.

* mt/init-template-userpath-fix:
  init: fix bug regarding ~/ expansion in init.templateDir
2021-05-25 16:21:20 +09:00
d9929cbb08 Merge branch 'jt/send-email-validate-errors-fix'
Fix a test breakage.

* jt/send-email-validate-errors-fix:
  t9001-send-email.sh: fix expected absolute paths on Windows
2021-05-25 16:21:19 +09:00
53cb2103ce Merge branch 'ab/send-email-validate-errors-fix'
* ab/send-email-validate-errors-fix:
  send-email: fix missing error message regression
2021-05-25 16:21:19 +09:00
e2b05746e1 t1092: use GIT_PROGRESS_DELAY for consistent results
The t1092-sparse-checkout-compatibility.sh tests compare the stdout and
stderr for several Git commands across both full checkouts, sparse
checkouts with a full index, and sparse checkouts with a sparse index.
Since these are direct comparisons, sometimes a progress indicator can
flush at unpredictable points, especially on slower machines. This
causes the tests to be flaky.

One standard way to avoid this is to add GIT_PROGRESS_DELAY=0 to the Git
commands that are run, as this will force every progress indicator
created with start_progress_delay() to be created immediately. However,
there are some progress indicators that are created in the case of a
full index that are not created with a sparse index. Moreover, their
values may be different as those indexes have a different number of
entries.

Instead, use GIT_PROGRESS_DELAY=-1 (which will turn into UINT_MAX)
to ensure that any reasonable machine running these tests would
never display delayed progress indicators.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 15:30:33 +09:00
a185dd58ec init: fix bug regarding ~/ expansion in init.templateDir
We used to read the init.templateDir setting at builtin/init-db.c using
a git_config() callback that, in turn, called git_config_pathname(). To
simplify the config reading logic at this file and plug a memory leak,
this was replaced by a direct call to git_config_get_value() at
e4de4502e6 ("init: remove git_init_db_config() while fixing leaks",
2021-03-14). However, this function doesn't provide path expanding
semantics, like git_config_pathname() does, so paths with '~/' and
'~user/' are treated literally. This makes 'git init' fail to handle
init.templateDir paths using these constructs:

	$ git config init.templateDir '~/templates_dir'
	$ git init
	'warning: templates not found in ~/templates_dir'

Replace the git_config_get_value() call by git_config_get_pathname(),
which does the '~/' and '~user/' expansions. Also add a regression test.
Note that unlike git_config_get_value(), the config cache does not own
the memory for the path returned by git_config_get_pathname(), so we
must free() it.

Reported on IRC by rkta.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 13:22:08 +09:00
5b719b7552 send-email: fix missing error message regression
Fix a regression with the "the editor exited uncleanly, aborting
everything" error message going missing after my
d21616c039 (git-send-email: refactor duplicate $? checks into a
function, 2021-04-06).

I introduced a $msg variable, but did not actually use it. This caused
us to miss the optional error message supplied by the "do_edit"
codepath. Fix that, and add tests to check that this works.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 09:52:42 +09:00
53753a37d0 t9001-send-email.sh: fix expected absolute paths on Windows
Git for Windows is a native Windows program that works with native
absolute paths in the drive letter style C:\dir. The auxiliary
infrastructure is based on MSYS2, which uses POSIX style /C/dir.

When we test for output of absolute paths produced by git.exe, we
usally have to expect C:\dir style paths. To produce such expected
paths, we have to use $(pwd) in the test scripts; the alternative,
$PWD, produces a POSIX style path. ($PWD is a shell variable, and the
shell is bash, an MSYS2 program, and operates in the POSIX realm.)

There are two recently added tests that were written to expect C:\dir
paths. The output that is tested is produced by `git send-email`, but
behind the scenes, this is a Perl script, which also works in the
POSIX realm and produces /C/dir style output.

In the first test case that is changed here, replace $(pwd) by $PWD
so that the expected path is constructed using /C/dir style.

The second test case sets core.hooksPath to an absolute path. Since
the test script talks to native git.exe, it is supposed to place a
C:/dir style path into the configuration; therefore, keep $(pwd).
When this configuration value is consumed by the Perl script, it is
transformed to /C/dir style by the MSYS2 layer and echoed back in
this form in the error message. Hence, do use $PWD for the expected
value.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 09:45:17 +09:00
11998a0364 l10n: vi.po(5204t): Updated Vietnamese translation for v2.32.0
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-05-24 13:54:03 +07:00
beded618b2 l10n: zh_TW.po: localized
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-05-23 15:31:29 +08:00
de88ac70f3 Git 2.32-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-22 18:29:01 +09:00
378c7c6ad4 Merge branch 'dl/stash-show-untracked-fixup'
Another brown paper bag inconsistency fix for a new feature
introduced during this cycle.

* dl/stash-show-untracked-fixup:
  stash show: use stash.showIncludeUntracked even when diff options given
2021-05-22 18:29:01 +09:00
6aae0e2ad2 Merge branch 'jh/simple-ipc-sans-pthread'
The "simple-ipc" did not compile without pthreads support, but the
build procedure was not properly account for it.

* jh/simple-ipc-sans-pthread:
  simple-ipc: correct ifdefs when NO_PTHREADS is defined
2021-05-22 18:29:01 +09:00
99fe1c6069 Merge branch 'wm/rev-parse-path-format-wo-arg'
The "rev-parse" command did not diagnose the lack of argument to
"--path-format" option, which was introduced in v2.31 era, which
has been corrected.

* wm/rev-parse-path-format-wo-arg:
  rev-parse: fix segfault with missing --path-format argument
2021-05-22 18:29:00 +09:00
af5cd44b6f stash show: use stash.showIncludeUntracked even when diff options given
If options pertaining to how the diff is displayed is provided to
`git stash show`, the command will ignore the stash.showIncludeUntracked
configuration variable, defaulting to not showing any untracked files.
This is unintuitive behaviour since the format of the diff output and
whether or not to display untracked files are orthogonal.

Use stash.showIncludeUntracked even when diff options are given. Of
course, this is still overridable via the command-line options.

Update the documentation to explicitly say which configuration variables
will be overridden when a diff options are given.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-22 17:56:46 +09:00
a6eff43bea l10n: zh_TW.po: v2.32.0 round 1 (11 untranslated)
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-05-21 14:50:11 +08:00
6aac70a870 simple-ipc: correct ifdefs when NO_PTHREADS is defined
Simple IPC always requires threads (in addition to various
platform-specific IPC support).  Fix the ifdefs in the Makefile
to define SUPPORTS_SIMPLE_IPC when appropriate.

Previously, the Unix version of the code would only verify that
Unix domain sockets were available.

This problem was reported here:
https://lore.kernel.org/git/YKN5lXs4AoK%2FJFTO@coredump.intra.peff.net/T/#m08be8f1942ea8a2c36cfee0e51cdf06489fdeafc

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 07:55:00 +09:00
107691cb07 Merge branch 'ds/sparse-index-protections'
Fix access to uninitialized piece of memory, introduced during this
cycle.

* ds/sparse-index-protections:
  sparse-index: fix uninitialized jump
2021-05-21 05:50:38 +09:00
2b8b1aa6ad Merge branch 'tz/c-locale-output-is-no-more'
Test update.

* tz/c-locale-output-is-no-more:
  t7500: remove non-existant C_LOCALE_OUTPUT prereq
2021-05-21 05:50:32 +09:00
c69f2f8c86 Merge branch 'cs/http-use-basic-after-failed-negotiate'
Regression fix for a change made during this cycle.

* cs/http-use-basic-after-failed-negotiate:
  Revert "remote-curl: fall back to basic auth if Negotiate fails"
  t5551: test http interaction with credential helpers
2021-05-21 05:49:41 +09:00
733b9f59ba l10n: sv.po: Update Swedish translation (5204t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-05-20 13:32:48 +01:00
619418b993 l10n: fix typos in po/TEAMS
Find typos in "po/TEAMS" file using the "git-po-helper" program.  These
typos were introduced from commit v2.24.0-1-g9917eca794 (l10n: zh_TW:
add translation for v2.24.0, 2019-11-20 19:14:22 +0800).

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-05-20 12:56:10 +08:00
88dd4282d9 A handful more topics before -rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 08:55:00 +09:00
cb227d5cd6 Merge branch 'jk/test-chainlint-softer'
The "chainlint" feature in the test framework is a handy way to
catch common mistakes in writing new tests, but tends to get
expensive.  An knob to selectively disable it has been introduced
to help running tests that the developer has not modified.

* jk/test-chainlint-softer:
  t: avoid sed-based chain-linting in some expensive cases
2021-05-20 08:55:00 +09:00
02112fcb70 Merge branch 'en/prompt-under-set-u'
The bash prompt script (in contrib/) did not work under "set -u".

* en/prompt-under-set-u:
  git-prompt: work under set -u
2021-05-20 08:55:00 +09:00
36a255acd1 Merge branch 'zh/ref-filter-push-remote-fix'
The handling of "%(push)" formatting element of "for-each-ref" and
friends was broken when the same codepath started handling
"%(push:<what>)", which has been corrected.

* zh/ref-filter-push-remote-fix:
  ref-filter: fix read invalid union member bug
2021-05-20 08:55:00 +09:00
bdff0419da Merge branch 'ew/sha256-clone-remote-curl-fix'
"git clone" from SHA256 repository by Git built with SHA-1 as the
default hash algorithm over the dumb HTTP protocol did not
correctly set up the resulting repository, which has been corrected.

* ew/sha256-clone-remote-curl-fix:
  remote-curl: fix clone on sha256 repos
2021-05-20 08:54:59 +09:00
33be431c0c Merge branch 'en/dir-traversal'
"git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  dir: traverse into untracked directories if they may have ignored subfiles
  dir: avoid unnecessary traversal into ignored directory
  t3001, t7300: add testcase showcasing missed directory traversal
  t7300: add testcase showing unnecessary traversal into ignored directory
  ls-files: error out on -i unless -o or -c are specified
  dir: report number of visited directories and paths with trace2
  dir: convert trace calls to trace2 equivalents
2021-05-20 08:54:59 +09:00
2e2ed74be0 Merge branch 'ab/perl-makefile-cleanup'
Build procedure clean-up.

* ab/perl-makefile-cleanup:
  Makefile: make PERL_DEFINES recursively expanded
  perl: use mock i18n functions under NO_GETTEXT=Y
  Makefile: regenerate *.pm on NO_PERL_CPAN_FALLBACKS change
  Makefile: regenerate perl/build/* if GIT-PERL-DEFINES changes
  Makefile: don't re-define PERL_DEFINES
2021-05-20 08:54:58 +09:00
4d0a2a608d l10n: fr: v2.32.0 round 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-05-19 18:58:25 +02:00
ecf7b129fa Revert "remote-curl: fall back to basic auth if Negotiate fails"
This reverts commit 1b0d9545bb.

That commit does fix the situation it intended to (avoiding Negotiate
even when the credentials were provided in the URL), but it creates a
more serious regression: we now never hit the conditional for "we had a
username and password, tried them, but the server still gave us a 401".
That has two bad effects:

 1. we never call credential_reject(), and thus a bogus credential
    stored by a helper will live on forever

 2. we never return HTTP_NOAUTH, so the error message the user gets is
    "The requested URL returned error: 401", instead of "Authentication
    failed".

Doing this correctly seems non-trivial, as we don't know whether the
Negotiate auth was a problem. Since this is a regression in the upcoming
v2.23.0 release (for which we're in -rc0), let's revert for now and work
on a fix separately.

(Note that this isn't a pure revert; the previous commit added a test
showing the regression, so we can now flip it to expect_success).

Reported-by: Ben Humphreys <behumphreys@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 10:09:58 +09:00
b694f1e49e t5551: test http interaction with credential helpers
We test authentication with http, and we independently test that
credential helpers work, but we don't have any tests that cover the
two features working together. Let's add two:

  1. Make sure that a successful request asks the helper to save the
     credential. This works as expected.

  2. Make sure that a failed request asks the helper to forget the
     credential. This is marked as expect_failure, as it was recently
     regressed by 1b0d9545bb (remote-curl: fall back to basic auth if
     Negotiate fails, 2021-03-22). The symptom here is that the second
     request should prompt the user, but doesn't.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 10:09:57 +09:00
4279cb1c6e sparse-index: fix uninitialized jump
While testing the sparse-index, I verified a test with --valgrind and it
complained about an uninitialized value being used in a jump in the
path_matches_pattern_list() method. The line was this one:

	if (*dtype == DT_UNKNOWN)

In the call stack, the culprit was the initialization of the dtype
variable in convert_to_sparse_rec().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-18 06:29:17 +09:00
58cf6056c9 t7500: remove non-existant C_LOCALE_OUTPUT prereq
The C_LOCALE_OUTPUT prerequisite was removed in b1e079807b (tests:
remove last uses of C_LOCALE_OUTPUT, 2021-02-11), where Ævar noted:

    I'm not leaving the prerequisite itself in place for in-flight changes
    as there currently are none that introduce new tests that rely on it,
    and because C_LOCALE_OUTPUT is currently a noop on the master branch
    we likely won't have any new submissions that use it.

One more use of C_LOCALE_OUTPUT did creep in with 3d1bda6b5b (t7500: add
tests for --fixup=[amend|reword] options, 2021-03-15).  This causes a
number of the tests to be skipped by default:

    ok 35 # SKIP --fixup=reword: incompatible with --all (missing C_LOCALE_OUTPUT)
    ok 36 # SKIP --fixup=reword: incompatible with --include (missing C_LOCALE_OUTPUT)
    ok 37 # SKIP --fixup=reword: incompatible with --only (missing C_LOCALE_OUTPUT)
    ok 38 # SKIP --fixup=reword: incompatible with --interactive (missing C_LOCALE_OUTPUT)
    ok 39 # SKIP --fixup=reword: incompatible with --patch (missing C_LOCALE_OUTPUT)

Remove the C_LOCALE_OUTPUT prerequisite from these tests so they are
not skipped.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-18 04:48:30 +09:00
99234d5905 l10n: tr: v2.32.0-r1
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-05-17 20:32:32 +03:00
7434b92798 l10n: fr: fixed inconsistencies
Signed-off-by: rlespinasse <romain.lespinasse@gmail.com>
2021-05-17 15:16:25 +02:00
9bafe049d6 l10n: fr.po fixed inconsistencies
Signed-off-by: Vincent Tam <sere@live.hk>
2021-05-17 15:16:25 +02:00
99fc555188 rev-parse: fix segfault with missing --path-format argument
Calling "git rev-parse --path-format" without an argument segfaults
instead of giving an error message. Commit fac60b8925 (rev-parse: add
option for absolute or relative path formatting, 2020-12-13) added the
argument parsing code but forgot to handle NULL.

Returning an error makes sense here because there is no default value we
could use. Add a test case to verify.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 18:39:29 +09:00
e9488197ad l10n: pt_PT: add Portuguese translations part 2
* Eliminated 'Negation of emptiness' of 'nenhum' (not one/none)
* Eliminated 'Negation of emptiness' of 'nada' (nothing)
* Transformed 'Não' (No) into affirmative
* Some other translations
* Transforming 'não' (no) into affirmative
* From junção-de-3 to tri-junção

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-05-17 09:42:44 +01:00
2b95ebb4fd l10n: git.pot: v2.32.0 round 1 (126 new, 26 removed)
Generate po/git.pot from v2.32.0-rc0 for git v2.32.0 l10n round 1.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-05-17 16:06:49 +08:00
bf949ade81 Git 2.32-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-16 21:05:24 +09:00
e004fd6b69 Merge branch 'ls/typofix'
* ls/typofix:
  pretty: fix a typo in the documentation for %(trailers)
2021-05-16 21:05:24 +09:00
a8a2491e62 Merge branch 'dl/stash-show-untracked-fixup'
The code to handle options recently added to "git stash show"
around untracked part of the stash segfaulted when these options
were used on a stash entry that does not record untracked part.

* dl/stash-show-untracked-fixup:
  stash show: fix segfault with --{include,only}-untracked
  t3905: correct test title
2021-05-16 21:05:24 +09:00
16f91451fa Merge branch 'wc/packed-ref-removal-cleanup'
When "git update-ref -d" removes a ref that is packed, it left
empty directories under $GIT_DIR/refs/ for

* wc/packed-ref-removal-cleanup:
  refs: cleanup directories when deleting packed ref
2021-05-16 21:05:24 +09:00
94294e92e1 Merge branch 'lh/maintenance-leakfix'
* lh/maintenance-leakfix:
  maintenance: fix two memory leaks
2021-05-16 21:05:24 +09:00
caf6840be0 Merge branch 'ma/typofixes'
A couple of trivial typofixes.

* ma/typofixes:
  pretty-formats.txt: add missing space
  git-repack.txt: remove spurious ")"
2021-05-16 21:05:24 +09:00
c7c7c460f8 Merge branch 'ah/merge-ort-i18n'
An i18n fix.

* ah/merge-ort-i18n:
  merge-ort: split "distinct types" message into two translatable messages
2021-05-16 21:05:23 +09:00
483932a3d8 Merge branch 'dd/mailinfo-quoted-cr'
"git mailinfo" (hence "git am") learned the "--quoted-cr" option to
control how lines ending with CRLF wrapped in base64 or qp are
handled.

* dd/mailinfo-quoted-cr:
  am: learn to process quoted lines that ends with CRLF
  mailinfo: allow stripping quoted CR without warning
  mailinfo: allow squelching quoted CRLF warning
  mailinfo: warn if CRLF found in decoded base64/QP email
  mailinfo: stop parsing options manually
  mailinfo: load default metainfo_charset lazily
2021-05-16 21:05:23 +09:00
c8e34a7ac2 Merge branch 'ab/sparse-index-cleanup'
Code clean-up.

* ab/sparse-index-cleanup:
  sparse-index.c: remove set_index_sparse_config()
2021-05-16 21:05:23 +09:00
502a67891c Merge branch 'ab/streaming-simplify'
Code clean-up.

* ab/streaming-simplify:
  streaming.c: move {open,close,read} from vtable to "struct git_istream"
  streaming.c: stop passing around "object_info *" to open()
  streaming.c: remove {open,close,read}_method_decl() macros
  streaming.c: remove enum/function/vtbl indirection
  streaming.c: avoid forward declarations
2021-05-16 21:05:23 +09:00
a737e1f1d2 Merge branch 'mt/parallel-checkout-part-3'
The final part of "parallel checkout".

* mt/parallel-checkout-part-3:
  ci: run test round with parallel-checkout enabled
  parallel-checkout: add tests related to .gitattributes
  t0028: extract encoding helpers to lib-encoding.sh
  parallel-checkout: add tests related to path collisions
  parallel-checkout: add tests for basic operations
  checkout-index: add parallel checkout support
  builtin/checkout.c: complete parallel checkout support
  make_transient_cache_entry(): optionally alloc from mem_pool
2021-05-16 21:05:23 +09:00
644f4a2046 Merge branch 'jt/push-negotiation'
"git push" learns to discover common ancestor with the receiving
end over protocol v2.

* jt/push-negotiation:
  send-pack: support push negotiation
  fetch: teach independent negotiation (no packfile)
  fetch-pack: refactor command and capability write
  fetch-pack: refactor add_haves()
  fetch-pack: refactor process_acks()
2021-05-16 21:05:22 +09:00
afb82d1db0 l10n: Update Catalan translation
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
2021-05-14 21:14:52 +02:00
97eea85a0a The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-14 08:26:11 +09:00
52371bf449 Merge branch 'mt/clean-clean'
Code clean-up.

* mt/clean-clean:
  clean: remove unnecessary variable
2021-05-14 08:26:11 +09:00
47fa106617 Merge branch 'ow/no-dryrun-in-add-i'
"git add -i --dry-run" does not dry-run, which was surprising.  The
combination of options has taught to error out.

* ow/no-dryrun-in-add-i:
  add: die if both --dry-run and --interactive are given
2021-05-14 08:26:09 +09:00
e289f681ed Merge branch 'jk/p4-locate-branch-point-optim'
"git p4" learned to find branch points more efficiently.

* jk/p4-locate-branch-point-optim:
  git-p4: speed up search for branch parent
  git-p4: ensure complex branches are cloned correctly
2021-05-14 08:26:08 +09:00
eede71149e Merge branch 'ba/object-info'
Over-the-wire protocol learns a new request type to ask for object
sizes given a list of object names.

* ba/object-info:
  object-info: support for retrieving object info
2021-05-14 08:26:08 +09:00
daffa8961b Merge branch 'pw/patience-diff-clean-up'
Code clean-up.

* pw/patience-diff-clean-up:
  patience diff: remove unused variable
  patience diff: remove unnecessary string comparisons
2021-05-14 08:26:08 +09:00
65c18913de Merge branch 'pw/word-diff-zero-width-matches'
The word-diff mode has been taught to work better with a word
regexp that can match an empty string.

* pw/word-diff-zero-width-matches:
  word diff: handle zero length matches
2021-05-14 08:26:06 +09:00
e5b66bb324 l10n: ru.po: fix typo in Russian translation
Signed-off-by: Alexey Roslyakov <alexey.roslyakov@gmail.com>
2021-05-13 12:48:37 +03:00
2d86a96220 t: avoid sed-based chain-linting in some expensive cases
Commit 878f988350 (t/test-lib: teach --chain-lint to detect broken
&&-chains in subshells, 2018-07-11) introduced additional chain-lint
tests which add an extra "sed" pipeline to each test we run. This has a
measurable impact on runtime. Here are timings with and without a new
environment variable (added by this patch) that lets you disable just
the additional sed-based chain-lint tests:

  Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 make test
    Time (mean ± σ):     64.202 s ±  1.030 s    [User: 622.469 s, System: 301.402 s]
    Range (min … max):   61.571 s … 65.662 s    10 runs

  Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 make test
    Time (mean ± σ):     57.591 s ±  0.333 s    [User: 529.368 s, System: 270.618 s]
    Range (min … max):   57.143 s … 58.309 s    10 runs

  Summary
    'GIT_TEST_CHAIN_LINT_HARDER=0 make test' ran
      1.11 ± 0.02 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 make test'

Of course those extra lint checks are doing something useful, so paying
a few extra seconds (at least on Linux) isn't so bad (though note the
CPU time; we're bounded in our parallel run here by the slowest test, so
it really is ~120s of CPU improvement).

But we can observe that there are some test scripts where they produce a
much stronger effect, and provide less value. In t0027 and t3070 we run
a very large number of small tests, all driven by a series of
functions/loops which are filling in the test bodies. There we get much
less bang for our buck in terms of bug-finding versus CPU cost.

This patch introduces a mechanism for controlling when those extra
lint checks are run, at two levels:

  - a user can ask to disable or to force-enable the checks by setting
    GIT_TEST_CHAIN_LINT_HARDER

  - if the user hasn't specified a preference, individual scripts can
    disable the checks by setting GIT_TEST_CHAIN_LINT_HARDER_DEFAULT;
    scripts which don't set that get the current behavior of enabling
    them.

In addition, this patch flips the default for t0027 and t3070's
mass-generated sections to disable the extra checks. Here are the timing
results for t0027:

  Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh
    Time (mean ± σ):     17.078 s ±  0.848 s    [User: 14.878 s, System: 7.075 s]
    Range (min … max):   15.952 s … 18.421 s    10 runs

  Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh
    Time (mean ± σ):      9.063 s ±  0.759 s    [User: 7.890 s, System: 3.362 s]
    Range (min … max):    7.747 s … 10.619 s    10 runs

  Benchmark #3: ./t0027-auto-crlf.sh
    Time (mean ± σ):      9.186 s ±  0.881 s    [User: 7.957 s, System: 3.427 s]
    Range (min … max):    7.796 s … 10.498 s    10 runs

  Summary
    'GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh' ran
      1.01 ± 0.13 times faster than './t0027-auto-crlf.sh'
      1.88 ± 0.18 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh'

We can see that disabling the checks for the whole script buys us an
almost 2x speedup. But the new default behavior, disabling them only for
the mass-generated part, gets us most of that speedup (but still leaves
the checks on for further manual tests people might write).

  As a side note, I'd caution about comparing runtimes and CPU seconds
  between this timing and the earlier "make test" one. In "make test",
  we're running a lot of scripts in parallel, so the CPU is throttling
  down (and thus a CPU second saved here would count for more during a
  parallel run; the same work takes more CPU seconds there).

We get similar results for t3070:

  Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh
    Time (mean ± σ):     20.054 s ±  3.967 s    [User: 16.003 s, System: 8.286 s]
    Range (min … max):   11.891 s … 23.671 s    10 runs

  Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh
    Time (mean ± σ):     12.399 s ±  2.256 s    [User: 7.542 s, System: 5.342 s]
    Range (min … max):    9.606 s … 15.727 s    10 runs

  Benchmark #3: ./t3070-wildmatch.sh
    Time (mean ± σ):     10.726 s ±  3.476 s    [User: 6.790 s, System: 4.365 s]
    Range (min … max):    5.444 s … 15.376 s    10 runs

  Summary
    './t3070-wildmatch.sh' ran
      1.16 ± 0.43 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh'
      1.87 ± 0.71 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh'

Again, we get almost a 2x speedup disabling these. In this case, there
are no tests not covered by the script's "default to disable" behavior,
so the second two benchmarks should be the same (and while they do
differ, you can see the variance is quite high but they're within one
standard deviation).

So it seems like for these two scripts, at least, disabling the extra
checks is a reasonable tradeoff. Sadly, the overall runtime of "make
test" on my system doesn't get much faster. But that's because we're
mostly limited by the cost of the single biggest test. Here are the
top-5 tests by wall-clock time from a parallel run, before my patch:

  57.9192368984222 t9001-send-email.sh
  45.6329638957977 t0027-auto-crlf.sh
  32.5278220176697 t3070-wildmatch.sh
  22.2701289653778 t7610-mergetool.sh
  20.8635759353638 t1701-racy-split-index.sh

And after:

  57.1476998329163 t9001-send-email.sh
  33.776211977005 t0027-auto-crlf.sh
  21.3116669654846 t7610-mergetool.sh
  20.7748689651489 t1701-racy-split-index.sh
  19.6957249641418 t7112-reset-submodule.sh

We dropped 12s from t0027, and t3070 dropped off our list entirely at
around 16s. In both cases we're bound by t9001, but its slowness is
due to the actual tests, so we'll have to deal with it in a different
way. But this reduces overall CPU, and means that dealing with t9001 (by
improving the speed of send-email or splitting it apart) will let us
reduce our overall runtime even on multi-core machines.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 15:50:44 +09:00
5c0cbdb107 git-prompt: work under set -u
Commit afda36dbf3 ("git-prompt: include sparsity state as well",
2020-06-21) added the use of some variables to control how to show
sparsity state in the git prompt, but implicitly assumed that undefined
variables would be treated as the empty string.  This breaks users who
run under 'set -u'; fix the code to be more explicit.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 15:50:26 +09:00
1ff595d218 stash show: fix segfault with --{include,only}-untracked
When `git stash show --include-untracked` or
`git stash show --only-untracked` is run on a stash that doesn't include
an untracked entry, a segfault occurs. This happens because we do not
check whether the untracked entry is actually present and just attempt
to blindly dereference it.

Ensure that the untracked entry is present before actually attempting to
dereference it.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:48:59 +09:00
aa2b05d9f6 t3905: correct test title
We reference the non-existent option `git stash show --show-untracked`
when we really meant `--only-untracked`. Correct the test title
accordingly.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:48:16 +09:00
b548f0f156 dir: introduce readdir_skip_dot_and_dotdot() helper
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d->d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
4e689d8171 dir: update stale description of treat_directory()
The documentation comment for treat_directory() was originally written
in 095952 (Teach directory traversal about subprojects, 2007-04-11)
which was before the 'struct dir_struct' split its bitfield of named
options into a 'flags' enum in 7c4c97c0 (Turn the flags in struct
dir_struct into a single variable, 2009-02-16). When those flags
changed, the comment became stale, since members like
'show_other_directories' transitioned into flags like
DIR_SHOW_OTHER_DIRECTORIES.

Update the comments for treat_directory() to use these flag names rather
than the old member names.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
dd55fc0df1 dir: traverse into untracked directories if they may have ignored subfiles
A directory that is untracked does not imply that all files under it
should be categorized as untracked; in particular, if the caller is
interested in ignored files, many files or directories underneath the
untracked directory may be ignored.  We previously partially handled
this right with DIR_SHOW_IGNORED_TOO, but missed DIR_SHOW_IGNORED.  It
was not obvious, though, because the logic for untracked and excluded
files had been fused together making it harder to reason about.  The
previous commit split that logic out, making it easier to notice that
DIR_SHOW_IGNORED was missing.  Add it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
aa6e1b21e5 dir: avoid unnecessary traversal into ignored directory
The show_other_directories case in treat_directory() tried to handle
both excludes and untracked files with the same logic, and mishandled
both the excludes and the untracked files in the process, in different
ways.  Split that logic apart, and then focus on the logic for the
excludes; a subsequent commit will address the logic for untracked
files.

For show_other_directories, an excluded directory means that
every path underneath that directory will also be excluded.  Given that
the calling code requested to just show directories when everything
under a directory had the same state (that's what the
"DIR_SHOW_OTHER_DIRECTORIES" flag means), we generally do not need to
traverse into such directories and can just immediately mark them as
ignored (i.e. as path_excluded).  The only reason we cannot just
immediately return path_excluded is the DIR_HIDE_EMPTY_DIRECTORIES flag
and the possibility that the ignored directory is an empty directory.
The code previously treated DIR_SHOW_IGNORED_TOO in most cases as an
exception as well, which was wrong.  It can sometimes reduce the number
of cases where we need to recurse (namely if
DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set), but should not be able
to increase the number of cases where we need to recurse.  Fix the logic
accordingly.

Some sidenotes about possible confusion with dir.c:

* "ignored" often refers to an untracked ignore", i.e. a file which is
  not tracked which matches one of the ignore/exclusion rules.  But you
  can also have a "tracked ignore", a tracked file that happens to match
  one of the ignore/exclusion rules and which dir.c has to worry about
  since "git ls-files -c -i" is supposed to list them.

* The dir code often uses "ignored" and "excluded" interchangeably,
  which you need to keep in mind while reading the code.

* "exclude" is used multiple ways in the code:

  * As noted above, "exclude" is often a synonym for "ignored".

  * The logic for parsing .gitignore files was re-used in
    .git/info/sparse-checkout, except there it is used to mark paths that
    the user wants to *keep*.  This was mostly addressed by commit
    65edd96aec ("treewide: rename 'exclude' methods to 'pattern'",
    2019-09-03), but every once in a while you'll find a comment about
    "exclude" referring to these patterns that might in fact be in use
    by the sparse-checkout machinery for inclusion rules.

  * The word "EXCLUDE" is also used for pathspec negation, as in
      (pathspec->items[3].magic & PATHSPEC_EXCLUDE)
    Thus if a user had a .gitignore file containing
      *~
      *.log
      !settings.log
    And then ran
      git add -- 'settings.*' ':^settings.log'
    Then :^settings.log is a pathspec negation making settings.log not
    be requested to be added even though all other settings.* files are
    being added.  Also, !settings.log in the gitignore file is a negative
    exclude pattern meaning that settings.log is normally a file we
    want to track even though all other *.log files are ignored.

Sometimes it feels like dir.c needs its own glossary with its many
definitions, including the multiply-defined terms.

Reported-by: Jason Gore <Jason.Gore@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
a97c7a8bc4 t3001, t7300: add testcase showcasing missed directory traversal
In the last commit, we added a testcase showing that the directory
traversal machinery sometimes traverses into directories unnecessarily.
Here we show that there are cases where it does the opposite: it does
not traverse into directories, despite those directories having
important files that need to be flagged.

Add a testcase showing that `git ls-files -o -i --directory` can omit
some of the files it should be listing, and another showing that `git
clean -fX` can fail to clean out some of the expected files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
2e4e43a691 t7300: add testcase showing unnecessary traversal into ignored directory
The PNPM package manager is apparently creating deeply nested (but
ignored) directory structures; traversing them is costly
performance-wise, unnecessary, and in some cases is even throwing
warnings/errors because the paths are too long to handle on various
platforms.  Add a testcase that checks for such unnecessary directory
traversal.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
b338e9f668 ls-files: error out on -i unless -o or -c are specified
ls-files --ignored can be used together with either --others or
--cached.  After being perplexed for a bit and digging in to the code, I
assumed that ls-files -i was just broken and not printing anything and
I had a nice patch ready to submit when I finally realized that -i can be
used with --cached to find tracked ignores.

While that was a mistake on my part, and a careful reading of the
documentation could have made this more clear, I suspect this is an
error others are likely to make as well.  In fact, of two uses in our
testsuite, I believe one of the two did make this error.  In t1306.13,
there are NO tracked files, and all the excludes built up and used in
that test and in previous tests thus have to be about untracked files.
However, since they were looking for an empty result, the mistake went
unnoticed as their erroneous command also just happened to give an empty
answer.

-i will most the time be used with -o, which would suggest we could just
make -i imply -o in the absence of either a -o or -c, but that would be
a backward incompatible break.  Instead, let's just flag -i without
either a -o or -c as an error, and update the two relevant testcases to
specify their intent.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
7fe1ffdafa dir: report number of visited directories and paths with trace2
Provide more statistics in trace2 output that include the number of
directories and total paths visited by the directory traversal logic.
Subsequent patches will take advantage of this to ensure we do not
unnecessarily traverse into ignored directories.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:02 +09:00
7f9dd87922 dir: convert trace calls to trace2 equivalents
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:02 +09:00
e6f68f62e0 pretty: fix a typo in the documentation for %(trailers)
Signed-off-by: Louis Sautier <sautier.louis@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 07:47:51 +09:00
8c55753c68 Makefile: make PERL_DEFINES recursively expanded
Since 07d90eadb5 (Makefile: add Perl runtime prefix support,
2018-04-10) PERL_DEFINES has been a simply-expanded variable, let's
make it recursively expanded instead.

This change doesn't matter for the correctness of the logic. Whether
we used simply-expanded or recursively expanded didn't change what we
wrote out in GIT-PERL-DEFINES, but being consistent with other rules
makes this easier to understand.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 07:45:39 +09:00
00bc8390d8 remote-curl: fix clone on sha256 repos
The remote-https process needs to update it's own instance of
`the_repository' when it sees an HTTP(S) remote is using sha256.
Without this, parse_oid_hex() fails to handle sha256 OIDs when
it's eventually called by parse_fetch().

Tested with:

	git clone https://yhbt.net/sha256test.git
	GIT_SMART_HTTP=0 git clone https://yhbt.net/sha256test.git
	(plain http:// also works)

Cloning the URL via git:// required no changes

Signed-off-by: Eric Wong <e@80x24.org>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 12:14:44 +09:00
1e1c4c5eac ref-filter: fix read invalid union member bug
used_atom.u is an union, and it has different members depending on
what atom the auxiliary data the union part of the "struct
used_atom" wants to record. At most only one of the members can be
valid at any one time. Since the code checks u.remote_ref without
even making sure if the atom is "push" or "push:" (which are only
two cases that u.remote_ref.push becomes valid), but u.remote_ref
shares the same storage for other members of the union, the check
was reading from an invalid member, which was the bug.

Modify the condition here to check whether the atom name
equals to "push" or starts with "push:", to avoid reading the
value of invalid member of the union.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: further test fixes]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 08:13:14 +09:00
c5d0b12a4c maintenance: fix two memory leaks
Fixes two memory leaks when running `git maintenance start` or `git
maintenance stop` in `update_background_schedule`:

$ valgrind --leak-check=full ~/git/bin/git maintenance start
==76584== Memcheck, a memory error detector
==76584== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==76584== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==76584== Command: /home/lenaic/git/bin/git maintenance start
==76584==
==76584==
==76584== HEAP SUMMARY:
==76584==     in use at exit: 34,880 bytes in 252 blocks
==76584==   total heap usage: 820 allocs, 568 frees, 146,414 bytes allocated
==76584==
==76584== 65 bytes in 1 blocks are definitely lost in loss record 17 of 39
==76584==    at 0x483E6AF: malloc (vg_replace_malloc.c:306)
==76584==    by 0x3DC39C: xrealloc (wrapper.c:126)
==76584==    by 0x3992CC: strbuf_grow (strbuf.c:98)
==76584==    by 0x39A473: strbuf_vaddf (strbuf.c:392)
==76584==    by 0x39BC54: xstrvfmt (strbuf.c:979)
==76584==    by 0x39BD2C: xstrfmt (strbuf.c:989)
==76584==    by 0x18451B: update_background_schedule (gc.c:1977)
==76584==    by 0x1846F6: maintenance_start (gc.c:2011)
==76584==    by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584==    by 0x127A2E: run_builtin (git.c:453)
==76584==    by 0x127E81: handle_builtin (git.c:704)
==76584==    by 0x128142: run_argv (git.c:771)
==76584==
==76584== 240 bytes in 1 blocks are definitely lost in loss record 29 of 39
==76584==    at 0x4840D7B: realloc (vg_replace_malloc.c:834)
==76584==    by 0x491CE5D: getdelim (in /usr/lib/libc-2.33.so)
==76584==    by 0x39ADD7: strbuf_getwholeline (strbuf.c:635)
==76584==    by 0x39AF31: strbuf_getdelim (strbuf.c:706)
==76584==    by 0x39B064: strbuf_getline_lf (strbuf.c:727)
==76584==    by 0x184273: crontab_update_schedule (gc.c:1919)
==76584==    by 0x184678: update_background_schedule (gc.c:1997)
==76584==    by 0x1846F6: maintenance_start (gc.c:2011)
==76584==    by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584==    by 0x127A2E: run_builtin (git.c:453)
==76584==    by 0x127E81: handle_builtin (git.c:704)
==76584==    by 0x128142: run_argv (git.c:771)
==76584==
==76584== LEAK SUMMARY:
==76584==    definitely lost: 305 bytes in 2 blocks
==76584==    indirectly lost: 0 bytes in 0 blocks
==76584==      possibly lost: 0 bytes in 0 blocks
==76584==    still reachable: 34,575 bytes in 250 blocks
==76584==         suppressed: 0 bytes in 0 blocks
==76584== Reachable blocks (those to which a pointer was found) are not shown.
==76584== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==76584==
==76584== For lists of detected and suppressed errors, rerun with: -s
==76584== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 07:00:45 +09:00
df6c4f722c The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 15:27:23 +09:00
2cd6ce21f3 Merge branch 'zh/trailer-cmd'
The way the command line specified by the trailer.<token>.command
configuration variable receives the end-user supplied value was
both error prone and misleading.  An alternative to achieve the
same goal in a safer and more intuitive way has been added, as
the trailer.<token>.cmd configuration variable, to replace it.

* zh/trailer-cmd:
  trailer: add new .cmd config option
  docs: correct descript of trailer.<token>.command
2021-05-11 15:27:23 +09:00
416449eaba Merge branch 'jk/symlinked-dotgitx-cleanup'
Various test and documentation updates about .gitsomething paths
that are symlinks.

* jk/symlinked-dotgitx-cleanup:
  docs: document symlink restrictions for dot-files
  fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW
  t0060: test ntfs/hfs-obscured dotfiles
  t7450: test .gitmodules symlink matching against obscured names
  t7450: test verify_path() handling of gitmodules
  t7415: rename to expand scope
  fsck_tree(): wrap some long lines
  fsck_tree(): fix shadowed variable
  t7415: remove out-dated comment about translation
2021-05-11 15:27:23 +09:00
1af57f5d32 Merge branch 'jk/pack-objects-negative-options-fix'
Options to "git pack-objects" that take numeric values like
--window and --depth should not accept negative values; the input
validation has been tightened.

* jk/pack-objects-negative-options-fix:
  pack-objects: clamp negative depth to 0
  t5316: check behavior of pack-objects --depth=0
  pack-objects: clamp negative window size to 0
  t5300: check that we produced expected number of deltas
  t5300: modernize basic tests
2021-05-11 15:27:23 +09:00
270f8bfe00 Merge branch 'jk/doc-format-patch-skips-merges'
Document that "format-patch" skips merges.

* jk/doc-format-patch-skips-merges:
  docs/format-patch: mention handling of merges
2021-05-11 15:27:23 +09:00
0b77301bf4 Merge branch 'jc/test-allows-local'
Document that our test can use "local" keyword.

* jc/test-allows-local:
  CodingGuidelines: explicitly allow "local" for test scripts
2021-05-11 15:27:22 +09:00
74339f814c Merge branch 'nc/submodule-update-quiet'
"git submodule update --quiet" did not propagate the quiet option
down to underlying "git fetch", which has been corrected.

* nc/submodule-update-quiet:
  submodule update: silence underlying fetch with "--quiet"
2021-05-11 15:27:22 +09:00
5feebddd86 Merge branch 'js/merge-already-up-to-date-message-reword'
A few variants of informational message "Already up-to-date" has
been rephrased.

* js/merge-already-up-to-date-message-reword:
  merge: fix swapped "up to date" message components
  merge(s): apply consistent punctuation to "up to date" messages
2021-05-11 15:27:22 +09:00
8ca4771dd0 Merge branch 'rj/bisect-skip-honor-terms'
"git bisect skip" when custom words are used for new/old did not
work, which has been corrected.

* rj/bisect-skip-honor-terms:
  bisect--helper: use BISECT_TERMS in 'bisect skip' command
2021-05-11 15:27:22 +09:00
5f03e5126d refs: cleanup directories when deleting packed ref
When deleting a packed ref via 'update-ref -d', a lockfile is made in
the directory that would contain the loose copy of that ref, creating
any directories in the ref's path that do not exist. When the
transaction completes, the lockfile is deleted, but any empty parent
directories made when creating the lockfile are left in place.  These
empty directories are not removed by 'pack-refs' or other housekeeping
tasks and will accumulate over time.

When deleting a loose ref, we remove all empty parent directories at the
end of the transaction.

This commit applies the parent directory cleanup logic used when
deleting loose refs to packed refs as well.

Signed-off-by: Will Chandler <wfc@wfchandler.org>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 13:59:57 +09:00
0e59f7ad67 merge-ort: split "distinct types" message into two translatable messages
The word "renamed" has two possible translations in many European
languages depending on whether one thing was renamed or two things were
renamed. Give translators freedom to alter any part of the message to
make it sound right in their language.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:26:01 +09:00
49f38e2de4 The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 16:59:47 +09:00
a0f521b56c Merge branch 'rs/repack-without-loosening-promised-objects'
"git repack -A -d" in a partial clone unnecessarily loosened
objects in promisor pack.

* rs/repack-without-loosening-promised-objects:
  repack: avoid loosening promisor objects in partial clones
2021-05-10 16:59:47 +09:00
44ccb7629a Merge branch 'ls/subtree'
"git subtree" updates.

* ls/subtree: (30 commits)
  subtree: be stricter about validating flags
  subtree: push: allow specifying a local rev other than HEAD
  subtree: allow 'split' flags to be passed to 'push'
  subtree: allow --squash to be used with --rejoin
  subtree: give the docs a once-over
  subtree: have $indent actually affect indentation
  subtree: don't let debug and progress output clash
  subtree: add comments and sanity checks
  subtree: remove duplicate check
  subtree: parse revs in individual cmd_ functions
  subtree: use "^{commit}" instead of "^0"
  subtree: don't fuss with PATH
  subtree: use "$*" instead of "$@" as appropriate
  subtree: use more explicit variable names for cmdline args
  subtree: use git-sh-setup's `say`
  subtree: use `git merge-base --is-ancestor`
  subtree: drop support for git < 1.7
  subtree: more consistent error propagation
  subtree: don't have loose code outside of a function
  subtree: t7900: add porcelain tests for 'pull' and 'push'
  ...
2021-05-10 16:59:47 +09:00
aaa3c8065d Merge branch 'bc/hash-transition-interop-part-1'
SHA-256 transition.

* bc/hash-transition-interop-part-1:
  hex: print objects using the hash algorithm member
  hex: default to the_hash_algo on zero algorithm value
  builtin/pack-objects: avoid using struct object_id for pack hash
  commit-graph: don't store file hashes as struct object_id
  builtin/show-index: set the algorithm for object IDs
  hash: provide per-algorithm null OIDs
  hash: set, copy, and use algo field in struct object_id
  builtin/pack-redundant: avoid casting buffers to struct object_id
  Use the final_oid_fn to finalize hashing of object IDs
  hash: add a function to finalize object IDs
  http-push: set algorithm when reading object ID
  Always use oidread to read into struct object_id
  hash: add an algo member to struct object_id
2021-05-10 16:59:46 +09:00
59b519ab7e am: learn to process quoted lines that ends with CRLF
In previous changes, mailinfo has learnt to process lines that decoded
from base64 or quoted-printable, and ends with CRLF.

Let's teach "am" that new trick, too.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
133a4fda59 mailinfo: allow stripping quoted CR without warning
In previous changes, we've turned on warning for quoted CR in base64 or
quoted-printable email messages. Some projects see those quoted CR a lot,
they know that it happens most of the time, and they find it's desirable
to always strip those CR.

Those projects in question usually fall back to use other tools to handle
patches when receive such patches.

Let's help those projects handle those patches by stripping those
excessive CR.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
f1aa299443 mailinfo: allow squelching quoted CRLF warning
In previous change, Git starts to warn for quoted CRLF in decoded
base64/QP email. Despite those warnings are usually helpful,
quoted CRLF could be part of some users' workflow.

Let's give them an option to turn off the warning completely.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
0b689562ca mailinfo: warn if CRLF found in decoded base64/QP email
When SMTP servers receive 8-bit email messages, possibly with only
LF as line ending, some of them decide to change said LF to CRLF.

Some mailing list softwares, when receive 8-bit email messages,
decide to encode those messages in base64 or quoted-printable.

If an email is transfered through above mail servers, then distributed
by such mailing list softwares, the recipients will receive an email
contains a patch mungled with CRLF encoded inside another encoding.

Thus, such CR (in CRLF) couldn't be dropped by "mailsplit".
Hence, the mailed patch couldn't be applied cleanly.
Such accidents have been observed in the wild [1].

Instead of silently rejecting those messages, let's give our users
some warnings if such CR (as part of CRLF) is found.

[1]: https://nmbug.notmuchmail.org/nmweb/show/m2lf9ejegj.fsf%40guru.guru-group.fi

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
8c9ca6f095 pretty-formats.txt: add missing space
The description of "%ch" is missing a space after "human style", before
the parenthetical remark. This description was introduced in b722d4560e
("pretty: provide human date format", 2021-04-25). That commit also
added "%ah", which does have the space already.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 14:12:49 +09:00
fba8e4c3d0 git-repack.txt: remove spurious ")"
Drop the ")" at the end of this paragraph. There's a parenthetical
remark in this paragraph, but it's been closed on the line above.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 14:12:47 +09:00
2d677e5b15 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 12:47:42 +09:00
39c5392d68 Merge branch 'll/clone-reject-shallow'
Fix tests when forced to use v0 protocol.

* ll/clone-reject-shallow:
  t5601: mark protocol v2-only test
2021-05-07 12:47:42 +09:00
70a890d42f Merge branch 'si/zsh-complete-comment-fix'
Portability fix for command line completion script (in contrib/).

* si/zsh-complete-comment-fix:
  work around zsh comment in __git_complete_worktree_paths
2021-05-07 12:47:42 +09:00
18e1ba1092 Merge branch 'dl/complete-stash-updates'
Further update the command line completion (in contrib/) for "git
stash".

* dl/complete-stash-updates:
  git-completion.bash: consolidate cases in _git_stash()
  git-completion.bash: use $__git_cmd_idx in more places
  git-completion.bash: rename to $__git_cmd_idx
  git-completion.bash: separate some commands onto their own line
2021-05-07 12:47:41 +09:00
848a17c274 Merge branch 'dl/complete-stash'
The command line completion (in contrib/) for "git stash" has been
updated.

* dl/complete-stash:
  git-completion.bash: use __gitcomp_builtin() in _git_stash()
  git-completion.bash: extract from else in _git_stash()
  git-completion.bash: pass $__git_subcommand_idx from __git_main()
2021-05-07 12:47:41 +09:00
936e58851a Merge branch 'ah/plugleaks'
Plug various leans reported by LSAN.

* ah/plugleaks:
  builtin/rm: avoid leaking pathspec and seen
  builtin/rebase: release git_format_patch_opt too
  builtin/for-each-ref: free filter and UNLEAK sorting.
  mailinfo: also free strbuf lists when clearing mailinfo
  builtin/checkout: clear pending objects after diffing
  builtin/check-ignore: clear_pathspec before returning
  builtin/bugreport: don't leak prefixed filename
  branch: FREE_AND_NULL instead of NULL'ing real_ref
  bloom: clear each bloom_key after use
  ls-files: free max_prefix when done
  wt-status: fix multiple small leaks
  revision: free remainder of old commit list in limit_list
2021-05-07 12:47:41 +09:00
8585d6c04a Merge branch 'ps/rev-list-object-type-filter'
"git rev-list" learns the "--filter=object:type=<type>" option,
which can be used to exclude objects of the given kind from the
packfile generated by pack-objects.

* ps/rev-list-object-type-filter:
  rev-list: allow filtering of provided items
  pack-bitmap: implement combined filter
  pack-bitmap: implement object type filter
  list-objects: implement object type filter
  list-objects: support filtering by tag and commit
  list-objects: move tag processing into its own function
  revision: mark commit parents as NOT_USER_GIVEN
  uploadpack.txt: document implication of `uploadpackfilter.allow`
2021-05-07 12:47:41 +09:00
826ef0e5e5 Merge branch 'ab/svn-tests-set-e-fix'
Test clean-up.

* ab/svn-tests-set-e-fix:
  svn tests: refactor away a "set -e" in test body
  svn tests: remove legacy re-setup from init-clone test
2021-05-07 12:47:40 +09:00
0377ac98dc Merge branch 'ab/rebase-no-reschedule-failed-exec'
"git rebase --[no-]reschedule-failed-exec" did not work well with
its configuration variable, which has been corrected.

* ab/rebase-no-reschedule-failed-exec:
  rebase: don't override --no-reschedule-failed-exec with config
  rebase tests: camel-case rebase.rescheduleFailedExec consistently
2021-05-07 12:47:40 +09:00
5a357fa477 Merge branch 'ab/doc-lint'
Dev support.

* ab/doc-lint:
  docs: fix linting issues due to incorrect relative section order
  doc lint: lint relative section order
  doc lint: lint and fix missing "GIT" end sections
  doc lint: fix bugs in, simplify and improve lint script
  doc lint: Perl "strict" and "warnings" in lint-gitlink.perl
  Documentation/Makefile: make doc.dep dependencies a variable again
  Documentation/Makefile: make $(wildcard howto/*.txt) a var
2021-05-07 12:47:40 +09:00
fe069dce62 Merge branch 'mt/add-rm-in-sparse-checkout'
"git add" and "git rm" learned not to touch those paths that are
outside of sparse checkout.

* mt/add-rm-in-sparse-checkout:
  rm: honor sparse checkout patterns
  add: warn when asked to update SKIP_WORKTREE entries
  refresh_index(): add flag to ignore SKIP_WORKTREE entries
  pathspec: allow to ignore SKIP_WORKTREE entries on index matching
  add: make --chmod and --renormalize honor sparse checkouts
  t3705: add tests for `git add` in sparse checkouts
  add: include magic part of pathspec on --refresh error
2021-05-07 12:47:40 +09:00
e706aaf3bc Merge branch 'ps/config-global-override'
Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
system-wide configuration file with GIT_CONFIG_SYSTEM that lets
users specify from which file to read the system-wide configuration
(setting it to an empty file would essentially be the same as
setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
per-user configuration in $HOME/.gitconfig.

* ps/config-global-override:
  t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests
  config: allow overriding of global and system configuration
  config: unify code paths to get global config paths
  config: rename `git_etc_config()`
2021-05-07 12:47:39 +09:00
f16a4660de Merge branch 'zh/pretty-date-human'
"git log --format=..." placeholders learned %ah/%ch placeholders to
request the --date=human output.

* zh/pretty-date-human:
  pretty: provide human date format
2021-05-07 12:47:39 +09:00
c108c8c2f2 Merge branch 'zh/format-ref-array-optim'
"git (branch|tag) --format=..." has been micro-optimized.

* zh/format-ref-array-optim:
  ref-filter: reuse output buffer
  ref-filter: get rid of show_ref_array_item
2021-05-07 12:47:39 +09:00
bb2feec17f Merge branch 'ad/cygwin-no-backslashes-in-paths'
Cygwin pathname handling fix.

* ad/cygwin-no-backslashes-in-paths:
  cygwin: disallow backslashes in file names
2021-05-07 12:47:39 +09:00
6d99f31dda Merge branch 'jz/apply-3way-first-message-fix'
When we swapped the order of --3way fallback, we forgot to adjust
the message we give when the first method fails and the second
method is attempted (which used to be "direct application failed
hence we try 3way", now it is the other way around).

* jz/apply-3way-first-message-fix:
  apply: adjust messages to account for --3way changes
2021-05-07 12:47:38 +09:00
6e08cbdf38 Merge branch 'jk/prune-with-bitmap-fix'
When the reachability bitmap is in effect, the "do not lose
recently created objects and those that are reachable from them"
safety to protect us from races were disabled by mistake, which has
been corrected.

* jk/prune-with-bitmap-fix:
  prune: save reachable-from-recent objects with bitmaps
  pack-bitmap: clean up include_check after use
2021-05-07 12:47:38 +09:00
e60e9cc20e Merge branch 'po/diff-patch-doc'
Doc update.

* po/diff-patch-doc:
  doc: point to diff attribute in patch format docs
2021-05-07 12:47:38 +09:00
a850356d1b Merge branch 'hn/trace-reflog-expiry'
The reflog expiry machinery has been taught to emit trace events.

* hn/trace-reflog-expiry:
  refs/debug: trace into reflog expiry too
2021-05-07 12:47:38 +09:00
e5d99d378b Merge branch 'ab/pretty-date-format-tests'
Tweak a few tests for "log --format=..." that show timestamps in
various formats.

* ab/pretty-date-format-tests:
  pretty tests: give --date/format tests a better description
  pretty tests: simplify %aI/%cI date format test
2021-05-07 12:47:38 +09:00
5f586f55a0 Merge branch 'ps/config-env-option-with-separate-value'
"git --config-env var=val cmd" weren't accepted (only
--config-env=var=val was).

* ps/config-env-option-with-separate-value:
  git: support separate arg for `--config-env`'s value
  git.txt: fix synopsis of `--config-env` missing the equals sign
2021-05-07 12:47:37 +09:00
3a7f0908b6 clean: remove unnecessary variable
The variable `matches` used to hold the return of a `dir_path_match()`
call that was removed in 95c11ecc73 ("Fix error-prone fill_directory()
API; make it only return matches", 2020-04-01). Now `matches` will
always hold 0, which is the value it's initialized with; and the
condition `matches != MATCHED_EXACTLY` will always evaluate to true. So
let's remove this unnecessary variable.

Interestingly, it seems that `matches != MATCHED_EXACTLY` was already
unnecessary before 95c11ecc73. That's because `remove_directories` is
always set to 1 when we have pathspecs; So, in the condition
`!remove_directories && matches != MATCHED_EXACTLY`, we would either:

- have pathspecs (or have been given `-d`) and ignore `matches` because
  `remove_directories` is 1; or

- not have pathspecs (nor `-d`) and end up just checking that
  `0 != MATCHED_EXACTLY`, as `matches` would never get reassigned
  after its zero initialization (because there is no pathspec to match).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 07:48:11 +09:00
dd9323b7fb mailinfo: stop parsing options manually
In a later change, mailinfo will learn more options, let's switch to our
robust parse_options framework before that step.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:26 +09:00
d582992e80 mailinfo: load default metainfo_charset lazily
In a later change, we will use parse_option to parse mailinfo's options.
In mailinfo, both "-u", "-n", and "--encoding" try to set the same
field, with "-u" reset that field to some default value from
configuration variable "i18n.commitEncoding".

Let's delay the setting of that field until we finish processing all
options. By doing that, "i18n.commitEncoding" can be parsed on demand.
More importantly, it cleans the way for using parse_option.

This change introduces some inconsistent brackets "{}" in "if/else if"
construct, however, we will rewrite them in the next few changes.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:25 +09:00
a1989cf7b8 add: die if both --dry-run and --interactive are given
The interactive machinery does not obey --dry-run. Die appropriately
if both flags are passed.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:14:04 +09:00
256c2dc42c perl: use mock i18n functions under NO_GETTEXT=Y
Change the logic of the i18n functions I added in 5e9637c629 (i18n:
add infrastructure for translating Git with gettext, 2011-11-18) to
use pass-through functions when NO_GETTEXT is defined.

This speeds up the compilation time of commands that use this library
when NO_GETTEXT=Y is in effect. Loading it and POSIX.pm is around 20ms
on my machine, whereas it takes 2ms to just instantiate perl itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:33 +09:00
368a50d9ee Makefile: regenerate *.pm on NO_PERL_CPAN_FALLBACKS change
Regenerate the *.pm files in perl/build/* if the
NO_PERL_CPAN_FALLBACKS flag added to the *.pm files in
1aca69c019 (perl Git::LoadCPAN: emit better errors under
NO_PERL_CPAN_FALLBACKS, 2018-03-03) is changed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:32 +09:00
3d49f7220a Makefile: regenerate perl/build/* if GIT-PERL-DEFINES changes
Change the logic to generate perl/build/* to regenerate those files if
GIT-PERL-DEFINES changes. This ensures that e.g. changing localedir
will result in correctly re-generated files.

I don't think that ever worked. The brokenness pre-dates my
20d2a30f8f (Makefile: replace perl/Makefile.PL with simple make
rules, 2017-12-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:30 +09:00
4070c9e09f Makefile: don't re-define PERL_DEFINES
Since 07d90eadb5 (Makefile: add Perl runtime prefix support,
2018-04-10) we have been declaring PERL_DEFINES right after assigning
to it, with the effect that the first PERL_DEFINES was ignored.

That bug didn't matter in practice since the first line had all the
same variables as the second, so we'd correctly re-generate
everything. It just made for confusing reading.

Let's remove that first assignment, and while we're at it split these
across lines to make them more maintainable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:58:27 +09:00
d4e2d15a8b streaming.c: move {open,close,read} from vtable to "struct git_istream"
Move the definition of the structure around the open/close/read
functions introduced in 46bf043807 (streaming: a new API to read from
the object store, 2011-05-11) to instead populate "close" and "read"
members in the "struct git_istream".

This gets us rid of an extra pointer deference, and I think makes more
sense. The "close" and "read" functions are the primary interface to
the stream itself.

Let's also populate a "open" callback in the same struct. That's now
used by open_istream() after istream_source() decides what "open"
function should be used. This isn't needed to get rid of the
"stream_vtbl" variables, but makes sense for consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:10 +09:00
de94c0eace streaming.c: stop passing around "object_info *" to open()
Change the streaming interface to stop passing around the "struct
object_info" the open() functions.

As seen in 7ef2d9a260 (streaming: read non-delta incrementally from a
pack, 2011-05-13) which introduced the "st->u.in_pack" assignments
being changed here only the open_istream_pack_non_delta() path need
these.

So let's instead do this when preparing the selected callback in the
istream_source() function. This might also allow the compiler to
reduce the lifetime of the "oi" variable, as we've moved it from
"git_istream()" to "istream_source()".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:09 +09:00
bc062ad001 streaming.c: remove {open,close,read}_method_decl() macros
Remove the {open,close,read}_method_decl() macros added in
46bf043807 (streaming: a new API to read from the object store,
2011-05-11) in favor of inlining the definition of the arguments of
these functions.

Since we'll end up using them via the "{open,close,read}_istream_fn"
types we don't gain anything in the way of compiler checking by using
these macros, and as of preceding commits we no longer need to declare
these argument lists twice. So declaring them at a distance just
serves to make the code less readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:06 +09:00
0d9af06e36 streaming.c: remove enum/function/vtbl indirection
Remove the indirection of discovering a function pointer to use via an
enum and virtual table. This refactors code added in
46bf043807 (streaming: a new API to read from the object store,
2011-05-11).

We can instead simply return an "open_istream_fn" for use from the
"istream_source()" selector function directly. This allows us to get
rid of the "incore", "loose" and "pack_non_delta" enum
variables. We'll return the functions instead.

The "stream_error" variable in that enum can likewise go in favor of
returning NULL, which is what the open_istream() was doing when it got
that value anyway.

We can thus remove the entire enum, and the "open_istream_tbl" virtual
table that (indirectly) referenced it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:04 +09:00
b65528360f streaming.c: avoid forward declarations
Change code added in 46bf043807 (streaming: a new API to read from
the object store, 2011-05-11) to avoid forward declarations of the
functions it uses. We can instead move this code to the bottom of the
file, and thus avoid the open_method_decl() calls.

Aside from the addition of the "static helpers[...]" comment being
added here, and the removal of the forward declarations this is a
move-only change.

The style of the added "static helpers[...]"  comment isn't in line
with our usual coding style, but is consistent with several other
comments used in this file, so let's use that style consistently here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:56:02 +09:00
b79f9c075d sparse-index.c: remove set_index_sparse_config()
Remove the set_index_sparse_config() function by folding it into
set_sparse_index_config(), which was its only user.

Since 122ba1f7b5 (sparse-checkout: toggle sparse index from builtin,
2021-03-30) the flow of this code hasn't made much sense, we'd get
"enabled" in set_sparse_index_config(), proceed to call
set_index_sparse_config() with it.

There we'd call prepare_repo_settings() and set
"repo->settings.sparse_index = 1", only to needlessly call
prepare_repo_settings() again in set_sparse_index_config() (where it
would early abort), and finally setting "repo->settings.sparse_index =
enabled".

Instead we can just call prepare_repo_settings() once, and set the
variable to "enabled" in the first place.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:53:46 +09:00
6b79818bfb git-p4: speed up search for branch parent
For every new branch that git-p4 imports, it needs to find the commit
where it branched off its parent branch. While p4 doesn't record this
information explicitly, the first changelist on a branch is usually an
identical copy of the parent branch.

The method searchParent() tries to find a commit in the history of the
given "parent" branch whose tree exactly matches the initial changelist
of the new branch, "target". The code iterates through the parent
commits and compares each of them to this initial changelist using
diff-tree.

Since we already know the tree object name we are looking for, spawning
diff-tree for each commit is wasteful.

Use the "--format" option of "rev-list" to find out the tree object name
of each commit in the history, and find the tree whose name is exactly
the same as the tree of the target commit to optimize this.

This results in a considerable speed-up, at least on Windows. On one
Windows machine with a fairly large repository of about 16000 commits in
the parent branch, the current code takes over 7 minutes, while the new
code only takes just over 10 seconds for the same changelist:

Before:

    $ time git p4 sync
    Importing from/into multiple branches
    Depot paths: //depot
    Importing revision 31274 (100.0%)
    Updated branches: b1

    real    7m41.458s
    user    0m0.000s
    sys     0m0.077s

After:

    $ time git p4 sync
    Importing from/into multiple branches
    Depot paths: //depot
    Importing revision 31274 (100.0%)
    Updated branches: b1

    real    0m10.235s
    user    0m0.000s
    sys     0m0.062s

Signed-off-by: Joachim Kuebart <joachim.kuebart@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:51:33 +09:00
c3ab08844c git-p4: ensure complex branches are cloned correctly
When importing a branch from p4, git-p4 searches the history of the parent
branch for the branch point. The test for the complex branch structure
ensures all files have the expected contents, but doesn't examine the
branch structure.

Check for the correct branch structure by making sure that the initial
commit on each branch is empty. This ensures that the initial commit's
parent is indeed the correct branch-off point.

Signed-off-by: Joachim Kuebart <joachim.kuebart@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06 12:51:31 +09:00
f91371b948 patience diff: remove unused variable
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:56:48 +09:00
204aa2d24d patience diff: remove unnecessary string comparisons
xdl_prepare_env() calls xdl_classify_record() which arranges for the
hashes of non-matching lines to be different so lines can be tested
for equality by comparing just their hashes.

This reduces the time taken to calculate the diff of v2.28.0 to
v2.29.0 by ~3-4%.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:56:48 +09:00
0324e8fc6b word diff: handle zero length matches
If find_word_boundaries() encounters a zero length match (which can be
caused by matching a newline or using '*' instead of '+' in the regex)
we stop splitting the input into words which generates an inaccurate
diff. To fix this increment the start point when there is a zero
length match and try a new match. This is safe as posix regular
expressions always return the longest available match so a zero length
match means there are no longer matches available from the current
position.

Commit bf82940dbf (color-words: enable REG_NEWLINE to help user,
2009-01-17) prevented matching newlines in negated character classes
but it is still possible for the user to have an explicit newline
match in the regex which could cause a zero length match.

One could argue that having explicit newline matches or using '*'
rather than '+' are user errors but it seems to be better to work
round them than produce inaccurate diffs.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:53:42 +09:00
87094fc2da ci: run test round with parallel-checkout enabled
We already have tests for the basic parallel-checkout operations. But
this code can also run be executed by other commands, such as
git-read-tree and git-sparse-checkout, which are currently not tested
with multiple workers. To promote a wider test coverage without
duplicating tests:

1. Add the GIT_TEST_CHECKOUT_WORKERS environment variable, to optionally
   force parallel-checkout execution during the whole test suite.

2. Set this variable (with a value of 2) in the second test round of our
   linux-gcc CI job. This round runs `make test` again with some
   optional GIT_TEST_* variables enabled, so there is no additional
   overhead in exercising the parallel-checkout code here.

Note that tests checking out less than two parallel-eligible entries
will fall back to the sequential mode. Nevertheless, it's still a good
exercise for the parallel-checkout framework as the fallback codepath
also writes the queued entries using the parallel-checkout functions
(only without spawning any worker).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:27:17 +09:00
d0e5d35700 parallel-checkout: add tests for basic operations
Add tests to populate the working tree during clone and checkout using
sequential and parallel mode, to confirm that they produce identical
results. Also test basic checkout mechanics, such as checking for
symlinks in the leading directories and the abidance to --force.

Note: some helper functions are added to a common lib file which is only
included by t2080 for now. But they will also be used by other
parallel-checkout tests in the following patches.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
d5904220bc parallel-checkout: add tests related to .gitattributes
Add tests to confirm that the `struct conv_attrs` data is correctly
passed from the main process to the workers, and that they can properly
convert the blobs before writing them to the working tree.

Also check that parallel-ineligible entries, such as regular files that
require external filters, are correctly smudge and written when
parallel-checkout is enabled.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
70b052b209 checkout-index: add parallel checkout support
Allow checkout-index to use the parallel checkout framework, honoring
the checkout.workers configuration.

There are two code paths in checkout-index which call
`checkout_entry()`, and thus, can make use of parallel checkout:
`checkout_file()`, which is used to write paths explicitly given at the
command line; and `checkout_all()`, which is used to write all paths in
the index, when the `--all` option is given.

In both operation modes, checkout-index doesn't abort immediately on a
`checkout_entry()` failure. Instead, it tries to check out all remaining
paths before exiting with a non-zero exit code. To keep this behavior
when parallel checkout is being used, we must allow
`run_parallel_checkout()` to try writing the queued entries before we
exit, even if we already got an error code from a previous
`checkout_entry()` call.

However, `checkout_all()` doesn't return on errors, it calls `exit()`
with code 128. We could make it call `run_parallel_checkout()` before
exiting, but it makes the code easier to follow if we unify the exit
path for both checkout-index modes at `cmd_checkout_index()`, and let
this function take care of the interactions with the parallel checkout
API. So let's do that.

With this change, we also have to consider whether we want to keep using
128 as the error code for `git checkout-index --all`, while we use 1 for
`git checkout-index <path>` (even when the actual error is the same).
Since there is not much value in having code 128 only for `--all`, and
there is no mention about it in the docs (so it's unlikely that changing
it will break any existing script), let's make both modes exit with code
1 on `checkout_entry()` errors.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
2fa3cbadcd t0028: extract encoding helpers to lib-encoding.sh
The following patch will add tests outside t0028 which will also need to
re-encode some strings. Extract the auxiliary encoding functions from
t0028 to a common lib file so that they can be reused.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
6a7bc9d118 parallel-checkout: add tests related to path collisions
Add tests to confirm that path collisions are properly detected by
checkout workers, both to avoid race conditions and to report colliding
entries on clone.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
6053950632 builtin/checkout.c: complete parallel checkout support
Pathspec-limited checkouts (like `git checkout *.txt`) are performed by
a code path that doesn't yet support parallel checkout because it calls
checkout_entry() directly, instead of unpack_trees(). Let's add parallel
checkout support for this code path too.

The transient cache entries allocated in checkout_merged() are now
allocated in a mem_pool which is only discarded after parallel checkout
finishes. This is done because the entries need to be valid when
run_parallel_checkout() is called.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:33 +09:00
9616882780 make_transient_cache_entry(): optionally alloc from mem_pool
Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch, to store some transient entries which should persist
until parallel checkout finishes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:25:25 +09:00
b89c731228 t5601: mark protocol v2-only test
A HTTP-clone test introduced in 4fe788b1b0 ("builtin/clone.c: add
--reject-shallow option", 2021-04-01) only works in protocol v2, but is
not marked as such.

The aforementioned patch implements --reject-shallow for a variety of
situations, but usage of a protocol that requires a remote helper is not
one of them. (Such an implementation would require extending the remote
helper protocol to support the passing of a "reject shallow" option, and
then teaching it to both protocol-speaking ends.)

For now, to make it pass when GIT_TEST_PROTOCOL_VERSION=0 is passed, add
"-c protocol.version=2". A more complete solution would be either to
augment the remote helper protocol to support this feature or to return
a fatal error when using --reject-shallow with a protocol that uses a
remote helper.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 10:54:41 +09:00
477673d6f3 send-pack: support push negotiation
Teach Git the push.negotiate config variable.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 10:41:29 +09:00
9c1e657a8f fetch: teach independent negotiation (no packfile)
Currently, the packfile negotiation step within a Git fetch cannot be
done independent of sending the packfile, even though there is at least
one application wherein this is useful. Therefore, make it possible for
this negotiation step to be done independently. A subsequent commit will
use this for one such application - push negotiation.

This feature is for protocol v2 only. (An implementation for protocol v0
would require a separate implementation in the fetch, transport, and
transport helper code.)

In the protocol, the main hindrance towards independent negotiation is
that the server can unilaterally decide to send the packfile. This is
solved by a "wait-for-done" argument: the server will then wait for the
client to say "done". In practice, the client will never say it; instead
it will cease requests once it is satisfied.

In the client, the main change lies in the transport and transport
helper code. fetch_refs_via_pack() performs everything needed - protocol
version and capability checks, and the negotiation itself.

There are 2 code paths that do not go through fetch_refs_via_pack() that
needed to be individually excluded: the bundle transport (excluded
through requiring smart_options, which the bundle transport doesn't
support) and transport helpers that do not support takeover. If or when
we support independent negotiation for protocol v0, we will need to
modify these 2 code paths to support it. But for now, report failure if
independent negotiation is requested in these cases.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 10:41:29 +09:00
f2acf763e2 work around zsh comment in __git_complete_worktree_paths
[PATCH]: contrib/completion/git-completion.bash, there is a construct
where comment lines are placed between the command that is on
the upstream of a pipe and the command that is on the downstream
of a pipe in __git_complete_worktree_paths function.

Unfortunately, this script is also used by Zsh completion, but
Zsh mishandles this construct when "interactive_comments" option is not
set (by default it is off on macOS), resulting in a breakage:

$ git worktree remove [TAB]
$ git worktree remove __git_complete_worktree_paths:7: command not found: #

Move the comment, even though it explains what happens on the
downstream of the pipe and logically belongs where it is right
now, before the entire pipeline, to work around this problem.

Signed-off-by: Sardorbek Imomaliev <sardorbek.imomaliev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 12:17:23 +09:00
c364b7ef51 trailer: add new .cmd config option
The `trailer.<token>.command` configuration variable
specifies a command (run via the shell, so it does not have
to be a single name or path to the command, but can be a
shell script), and the first occurrence of substring $ARG is
replaced with the value given to the `interpret-trailer`
command for the token in a '--trailer <token>=<value>' argument.

This has three downsides:

* The use of $ARG in the mechanism misleads the users that
the value is passed in the shell variable, and tempt them
to use $ARG more than once, but that would not work, as
the second and subsequent $ARG are not replaced.

* Because $ARG is textually replaced without regard to the
shell language syntax, even '$ARG' (inside a single-quote
pair), which a user would expect to stay intact, would be
replaced, and worse, if the value had an unmatched single
quote (imagine a name like "O'Connor", substituted into
NAME='$ARG' to make it NAME='O'Connor'), it would result in
a broken command that is not syntactically correct (or
worse).

* The first occurrence of substring `$ARG` will be replaced
with the empty string, in the command when the command is
first called to add a trailer with the specified <token>.
This is a bad design, the nature of automatic execution
causes it to add a trailer that we don't expect.

Introduce a new `trailer.<token>.cmd` configuration that
takes higher precedence to deprecate and eventually remove
`trailer.<token>.command`, which passes the value as an
argument to the command.  Instead of "$ARG", users can
refer to the value as positional argument, $1, in their
scripts. At the same time, in order to allow
`git interpret-trailers` to better simulate the behavior
of `git command -s`, 'trailer.<token>.cmd' will not
automatically execute.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 12:09:43 +09:00
57dcb6575b docs: correct descript of trailer.<token>.command
In the original documentation of `trailer.<token>.command`,
some descriptions are easily misunderstood. So let's modify
it to increase its readability.

In addition, clarify that `$ARG` in command can only be
replaced once.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 12:09:43 +09:00
8ff06de10c docs: document symlink restrictions for dot-files
We stopped allowing symlinks for .gitmodules files in 10ecfa7649
(verify_path: disallow symlinks in .gitmodules, 2018-05-04), and we
stopped following symlinks for .gitattributes, .gitignore, and .mailmap
in the commits from 204333b015 (Merge branch 'jk/open-dotgitx-with-nofollow',
2021-03-22). The reasons are discussed in detail there, but we never
adjusted the documentation to let users know.

This hasn't been a big deal since the point is that such setups were
mildly broken and thought to be unusual anyway. But it certainly doesn't
hurt to be clear and explicit about it.

Suggested-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:03 +09:00
bb6832d552 fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW
In the commits merged in via 204333b015 (Merge branch
'jk/open-dotgitx-with-nofollow', 2021-03-22), we stopped following
symbolic links for .gitattributes, .gitignore, and .mailmap files.

Let's teach fsck to warn that these symlinks are not going to do
anything. Note that this is just a warning, and won't block the objects
via transfer.fsckObjects, since there are reported to be cases of this
in the wild (and even once fixed, they will continue to exist in the
commit history of those projects, but are not particularly dangerous).

Note that we won't add these to the existing gitmodules block in the
fsck code. The logic for gitmodules is a bit more complicated, as we
also check the content of non-symlink instances we find. But for these
new files, there is no content check; we're just looking at the name and
mode of the tree entry (and we can avoid even the complicated name
checks in the common case that the mode doesn't indicate a symlink).

We can reuse the test helper function we defined for .gitmodules, though
(it needs some slight adjustments for the fsck error code, and because
we don't block these symlinks via verify_path()).

Note that I didn't explicitly test the transfer.fsckObjects case here
(nor does the existing .gitmodules test that it blocks a push). The
translation of fsck severities to outcomes is covered in general in
t5504.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
801ed010bf t0060: test ntfs/hfs-obscured dotfiles
We have tests that cover various filesystem-specific spellings of
".gitmodules", because we need to reliably identify that path for some
security checks. These are from dc2d9ba318 (is_{hfs,ntfs}_dotgitmodules:
add tests, 2018-05-12), with the actual code coming from e7cb0b4455
(is_ntfs_dotgit: match other .git files, 2018-05-11) and 0fc333ba20
(is_hfs_dotgit: match other .git files, 2018-05-02).

Those latter two commits also added similar matching functions for
.gitattributes and .gitignore. These ended up not being used in the
final series, and are currently dead code. But in preparation for them
being used in some fsck checks, let's make sure they actually work by
throwing a few basic tests at them. Likewise, let's cover .mailmap
(which does need matching code added).

I didn't bother with the whole battery of tests that we cover for
.gitmodules. These functions are all based on the same generic matcher,
so it's sufficient to test most of the corner cases just once.

Note that the ntfs magic prefix names in the tests come from the
algorithm described in e7cb0b4455 (and are different for each file).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
1cb12f3339 t7450: test .gitmodules symlink matching against obscured names
In t7450 we check that both verify_path() and fsck catch malformed
.gitmodules entries in trees. However, we don't check that we catch
filesystem-equivalent forms of these (e.g., ".GITMOD~1" on Windows).
Our name-matching functions are exercised well in t0060, but there's
nothing to test that we correctly call the matching functions from the
actual fsck and verify_path() code.

So instead of testing just .gitmodules, let's repeat our tests for a few
basic cases. We don't need to be exhaustive here (t0060 handles that),
but just make sure we hit one name of each type.

Besides pushing the tests into a function that takes the path as a
parameter, we'll need to do a few things:

  - adjust the directory name to accommodate the tests running multiple
    times

  - set core.protecthfs for index checks. Fsck always protects all types
    by default, but we want to be able to exercise the HFS routines on
    every system. Note that core.protectntfs is already the default
    these days, but it doesn't hurt to explicitly label our need for it.

  - we'll also take the filename ("gitmodules") as a parameter. All
    calls use the same name for now, but a future patch will extend this
    to handle other .gitfoo files. Note that our fake-content symlink
    destination is somewhat .gitmodules specific. But it isn't necessary
    for other files (which don't do a content check). And it happens to
    be a valid attribute and ignore file anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
a1ca398ba7 t7450: test verify_path() handling of gitmodules
Commit 10ecfa7649 (verify_path: disallow symlinks in .gitmodules,
2018-05-04) made it impossible to load a symlink .gitmodules file into
the index. However, there are no tests of this behavior. Let's make sure
this case is covered. We can easily reuse the test setup created by
the matching b7b1fca175 (fsck: complain when .gitmodules is a symlink,
2018-05-04).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 11:52:02 +09:00
43a2220f19 t7415: rename to expand scope
This script has already expanded beyond its original intent of ".. in
submodule names" to include other malicious submodule bits. Let's update
the name and description to reflect that, as well as the fact that we'll
soon be adding similar tests for other dotfiles (.gitattributes, etc).
We'll also renumber it to move it out of the group of submodule-specific
tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
0282f6799f fsck_tree(): wrap some long lines
Many calls to report() in fsck_tree() are kept on a single line and are
quite long. Most were pretty big to begin with, but have gotten even
longer over the years as we've added more parameters. Let's accept the
churn of wrapping them in order to conform to our usual line limits.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
9e1947cb48 fsck_tree(): fix shadowed variable
Commit b2f2039c2b (fsck: accept an oid instead of a "struct tree" for
fsck_tree(), 2019-10-18) introduced a new "oid" parameter to
fsck_tree(), and we pass it to the report() function when we find
problems. However, that is shadowed within the tree-walking loop by the
existing "oid" variable which we use to store the oid of each tree
entry. As a result, we may report the wrong oid for some problems we
detect within the loop (the entry oid, instead of the tree oid).

Our tests didn't catch this because they checked only that we found the
expected fsck problem, not that it was attached to the correct object.

Let's rename both variables in the function to avoid confusion. This
makes the diff a little noisy (e.g., all of the report() calls outside
the loop were already correct but need to be touched), but makes sure we
catch all cases and will avoid similar confusion in the future.

And we can update the test to be a bit more specific and catch this
problem.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
963d02a24a t7415: remove out-dated comment about translation
Since GETTEXT_POISON does not exist anymore, there is no point warning
people about whether we should use test_i18ngrep. This is doubly
confusing because the comment was describing why it was OK to use grep,
but it got caught up in the mass conversion of 674ba34038 (fsck: mark
strings for translation, 2018-11-10).

Note there are other uses of test_i18ngrep in this script which are now
obsolete; I'll save those for a mass-cleanup. My goal here was just to
fix the confusing comment in code I'm about to refactor.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:41:08 +09:00
8e0601f568 docs/format-patch: mention handling of merges
Format-patch doesn't have a way to format merges in a way that can be
applied by git-am (or any other tool), and so it just omits them.
However, this may be a surprising implication for users who are not well
versed in how the tool works. Let's add a note to the documentation
making this more clear.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:32:39 +09:00
6d52b6a5df pack-objects: clamp negative depth to 0
A negative delta depth makes no sense, and the code is not prepared to
handle it. If passed "--depth=-1" on the command line, then this line
from break_delta_chains():

	cur->depth = (total_depth--) % (depth + 1);

triggers a divide-by-zero. This is undefined behavior according to the C
standard, but on POSIX systems results in SIGFPE killing the process.
This is certainly one way to inform the use that the command was
invalid, but it's a bit friendlier to just treat it as "don't allow any
deltas", which we already do for --depth=0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:30:46 +09:00
49ac1d33bb t5316: check behavior of pack-objects --depth=0
We'd expect this to cleanly produce no deltas at all (as opposed to
getting confused by an out-of-bounds value), and it does.

Note we have to adjust our max_chain test helper, which expected to find
at least one delta.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:56 +09:00
953aa54e1a pack-objects: clamp negative window size to 0
A negative window size makes no sense, and the code in find_deltas() is
not prepared to handle it. If you pass "-1", for example, we end up
generate a 0-length array of "struct unpacked", but our loop assumes it
has at least one entry in it (and we end up reading garbage memory).

We could complain to the user about this, but it's more forgiving to
just clamp it to 0, which means "do not find any deltas at all". The
0-case is already tested earlier in the script, so we'll make sure this
does the same thing.

Reported-by: Yiyuan guo <yguoaz@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:27 +09:00
95356789ee t5300: check that we produced expected number of deltas
We pack a set of objects both with and without --window=0, assuming that
the 0-length window will cause us not to produce any deltas. Let's
confirm that this is the case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:16 +09:00
5489899812 t5300: modernize basic tests
The first set of tests in t5300 goes back to 2005, and doesn't use some
of our customary style and tools these days. In preparation for touching
them, let's modernize a few things:

  - titles go on the line with test_expect_success, with a hanging
    open-quote to start the test body

  - test bodies should be indented with tabs

  - opening braces for shell blocks in &&-chains go on their own line

  - no space between redirect operators and files (">foo", not "> foo")

  - avoid doing work outside of test blocks; in this case, we can stick
    the setup of ".git2" into the appropriate blocks

  - avoid modifying and then cleaning up the environment or current
    directory by using subshells and "git -C"

  - this test does a curious thing when testing the unpacking: it sets
    GIT_OBJECT_DIRECTORY, and then does a "git init" in the _original_
    directory, creating a weird mixed situation. Instead, it's much
    simpler to just "git init --bare" a new repository to unpack into,
    and check the results there. I renamed this "git2" instead of
    ".git2" to make it more clear it's a separate repo.

  - we can observe that the bodies of the no-delta, ref_delta, and
    ofs_delta cases are all virtually identical except for the pack
    creation, and factor out shared helper functions. I collapsed "do
    the unpack" and "check the results of the unpack" into a single
    test, since that makes the expected lifetime of the "git2" temporary
    directory more clear (that also lets us use test_when_finished to
    clean it up). This does make the "-v" output slightly less useful,
    but the improvement in reading the actual test code makes it worth
    it.

  - I dropped the "pwd" calls from some tests. These don't do anything
    functional, and I suspect may have been an aid for debugging when
    the script was more cavalier about leaving the working directory
    changed between tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:16 +09:00
a84fd3bcc6 CodingGuidelines: explicitly allow "local" for test scripts
01d3a526 (t0000: check whether the shell supports the "local"
keyword, 2017-10-26) raised a test balloon to see if those who build
and test Git use a platform with a shell that lacks support for the
"local" keyword.  After two years, 7f0b5908 (t0000: reword comments
for "local" test, 2019-08-08) documented that "local" keyword, even
though is outside POSIX, is allowed in our test scripts.

Let's write it in the CodingGuidelines, too.  It might be tempting
to allow it in scripted Porcelains (we have avoided getting them
contaminiated by "local" so far), but they are on their way out and
getting rewritten in C.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:24:11 +09:00
ad9322da03 merge: fix swapped "up to date" message components
The rewrite of git-merge from shell to C in 1c7b76be7d (Build in merge,
2008-07-07) accidentally transformed the message:

    Already up-to-date. (nothing to squash)

to:

    (nothing to squash)Already up-to-date.

due to reversed printf() arguments. This problem has gone unnoticed
despite being touched over the years by 7f87aff22c (Teach/Fix pull/fetch
-q/-v options, 2008-11-15) and bacec47845 (i18n: git-merge basic
messages, 2011-02-22), and tangentially by bef4830e88 (i18n: merge: mark
messages for translation, 2016-06-17) and 7560f547e6 (treewide: correct
several "up-to-date" to "up to date", 2017-08-23).

Fix it by restoring the message to its intended order. While at it, help
translators out by avoiding "sentence Lego".

[es: rewrote commit message]

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:14:58 +09:00
80cde95eec merge(s): apply consistent punctuation to "up to date" messages
Although the various "Already up to date" messages resulting from merge
attempts share identical phrasing, they use a mix of punctuation ranging
from "." to "!" and even "Yeeah!", which leads to extra work for
translators. Ease the job of translators by settling upon "." as
punctuation for all such messages.

While at it, take advantage of printf_ln() to further ease the
translation task so translators need not worry about line termination,
and fix a case of missing line termination in the (unused)
merge_ort_nonrecursive() function.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:14:56 +09:00
62af4bdd42 submodule update: silence underlying fetch with "--quiet"
Commands such as

    $ git submodule update --quiet --init --depth=1

involving shallow clones, call the shell function fetch_in_submodule, which
in turn invokes git fetch.  Pass the --quiet option onward there.

Signed-off-by: Nicholas Clark <nick@ccl4.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 12:24:38 +09:00
7e39198978 The thirteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 13:50:27 +09:00
93e0b28dbb Merge branch 'ab/pathname-encoding-doc'
Clarify that pathnames recorded in Git trees are most often (but
not necessarily) encoded in UTF-8.

* ab/pathname-encoding-doc:
  doc: clarify the filename encoding in git diff
2021-04-30 13:50:27 +09:00
5980e0d442 Merge branch 'vs/completion-with-set-u'
Effort to make the command line completion (in contrib/) safe with
"set -u" continues.

* vs/completion-with-set-u:
  completion: avoid aliased command lookup error in nounset mode
2021-04-30 13:50:27 +09:00
bf0d4c8491 Merge branch 'hn/refs-trace-errno'
Show errno in the trace output in the error codepath that calls
read_raw_ref method.

* hn/refs-trace-errno:
  refs: print errno for read_raw_ref if GIT_TRACE_REFS is set
2021-04-30 13:50:27 +09:00
a1cac26cc6 Merge branch 'mt/parallel-checkout-part-2'
The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.

* mt/parallel-checkout-part-2:
  parallel-checkout: add design documentation
  parallel-checkout: support progress displaying
  parallel-checkout: add configuration options
  parallel-checkout: make it truly parallel
  unpack-trees: add basic support for parallel checkout
2021-04-30 13:50:26 +09:00
59bb0aa93e Merge branch 'so/log-diff-merge'
"git log" learned "--diff-merges=<style>" option, with an
associated configuration variable log.diffMerges.

* so/log-diff-merge:
  doc/diff-options: document new --diff-merges features
  diff-merges: introduce log.diffMerges config variable
  diff-merges: adapt -m to enable default diff format
  diff-merges: refactor set_diff_merges()
  diff-merges: introduce --diff-merges=on
2021-04-30 13:50:26 +09:00
8e97852919 Merge branch 'ds/sparse-index-protections'
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

* ds/sparse-index-protections: (47 commits)
  name-hash: use expand_to_path()
  sparse-index: expand_to_path()
  name-hash: don't add directories to name_hash
  revision: ensure full index
  resolve-undo: ensure full index
  read-cache: ensure full index
  pathspec: ensure full index
  merge-recursive: ensure full index
  entry: ensure full index
  dir: ensure full index
  update-index: ensure full index
  stash: ensure full index
  rm: ensure full index
  merge-index: ensure full index
  ls-files: ensure full index
  grep: ensure full index
  fsck: ensure full index
  difftool: ensure full index
  commit: ensure full index
  checkout: ensure full index
  ...
2021-04-30 13:50:26 +09:00
d250f90359 Merge branch 'ds/maintenance-prefetch-fix'
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.

* ds/maintenance-prefetch-fix:
  maintenance: respect remote.*.skipFetchAll
  maintenance: use 'git fetch --prefetch'
  fetch: add --prefetch option
  maintenance: simplify prefetch logic
2021-04-30 13:50:25 +09:00
a819e2b3ef Merge branch 'ow/push-quiet-set-upstream'
"git push --quiet --set-upstream" was not quiet when setting the
upstream branch configuration, which has been corrected.

* ow/push-quiet-set-upstream:
  transport: respect verbosity when setting upstream
2021-04-30 13:50:25 +09:00
279a2e637a Merge branch 'mt/pkt-write-errors'
When packet_write() fails, we gave an extra error message
unnecessarily, which has been corrected.

* mt/pkt-write-errors:
  pkt-line: do not report packet write errors twice
2021-04-30 13:50:24 +09:00
13158b9910 Merge branch 'jk/promisor-optim'
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).

* jk/promisor-optim:
  revision: avoid parsing with --exclude-promisor-objects
  lookup_unknown_object(): take a repository argument
  is_promisor_object(): free tree buffer after parsing
2021-04-30 13:50:24 +09:00
4cd66e7d6b bisect--helper: use BISECT_TERMS in 'bisect skip' command
Commit e4c7b33747 ("bisect--helper: reimplement `bisect_skip` shell
function in C", 2021-02-03), as part of the shell-to-C conversion,
forgot to read the 'terms' file (.git/BISECT_TERMS) during the new
'bisect skip' command implementation. As a result, the 'bisect skip'
command will use the default 'bad'/'good' terms. If the bisection
terms have been set to non-default values (for example by the
'bisect start' command), then the 'bisect skip' command will fail.

In order to correct this problem, we insert a call to the get_terms()
function, which reads the non-default terms from that file (if set),
in the '--bisect-skip' command implementation of 'bisect--helper'.

Also, add a test[1] to protect against potential future regression.

[1] https://lore.kernel.org/git/xmqqim45h585.fsf@gitster.g/T/#m207791568054b0f8cf1a3942878ea36293273c7d

Reported-by: Trygve Aaberge <trygveaa@gmail.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:56:42 +09:00
bccc37fdc7 cygwin: disallow backslashes in file names
The backslash character is not a valid part of a file name on Windows.
If, in Windows, Git attempts to write a file that has a backslash
character in the filename, it will be incorrectly interpreted as a
directory separator.

This caused CVE-2019-1354 in MinGW, as this behaviour can be manipulated
to cause the checkout to write to files it ought not write to, such as
adding code to the .git/hooks directory.  This was fixed by e1d911dd4c
(mingw: disallow backslash characters in tree objects' file names,
2019-09-12).  However, the vulnerability also exists in Cygwin: while
Cygwin mostly provides a POSIX-like path system, it will still interpret
a backslash as a directory separator.

To avoid this vulnerability, CVE-2021-29468, extend the previous fix to
also apply to Cygwin.

Similarly, extend the test case added by the previous version of the
commit.  The test suite doesn't have an easy way to say "run this test
if in MinGW or Cygwin", so add a new test prerequisite that covers both.

As well as checking behaviour in the presence of paths containing
backslashes, the existing test also checks behaviour in the presence of
paths that differ only by the presence of a trailing ".".  MinGW follows
normal Windows application behaviour and treats them as the same path,
but Cygwin more closely emulates *nix systems (at the expense of
compatibility with native Windows applications) and will create and
distinguish between such paths.  Gate the relevant bit of that test
accordingly.

Reported-by: RyotaK <security@ryotak.me>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:49:20 +09:00
c331551ccf git: support separate arg for --config-env's value
While not documented as such, many of the top-level options like
`--git-dir` and `--work-tree` support two syntaxes: they accept both an
equals sign between option and its value, and they do support option and
value as two separate arguments. The recently added `--config-env`
option only supports the syntax with an equals sign.

Mitigate this inconsistency by accepting both syntaxes and add tests to
verify both work.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:46:53 +09:00
9152904c11 git.txt: fix synopsis of --config-env missing the equals sign
When executing `git -h`, then the `--config-env` documentation rightly
lists the option as requiring an equals between the option and its
argument: this is the only currently supported format. But the git(1)
manpage incorrectly lists the option as taking a space in between.

Fix the issue by adding the missing space.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-of-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:46:46 +09:00
526705fd3d apply: adjust messages to account for --3way changes
"git apply" specifically calls out when it is falling back to 3way
merge application.  Since the order changed to preferring 3way and
falling back to direct application, continue that behavior by
printing whenever 3way fails and git has to fall back.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-29 12:27:45 +09:00
2ba582ba4c prune: save reachable-from-recent objects with bitmaps
We pass our prune expiration to mark_reachable_objects(), which will
traverse not only the reachable objects, but consider any recent ones as
tips for reachability; see d3038d22f9 (prune: keep objects reachable
from recent objects, 2014-10-15) for details.

However, this interacts badly with the bitmap code path added in
fde67d6896 (prune: use bitmaps for reachability traversal, 2019-02-13).
If we hit the bitmap-optimized path, we return immediately to avoid the
regular traversal, accidentally skipping the "also traverse recent"
code.

Instead, we should do an if-else for the bitmap versus regular
traversal, and then follow up with the "recent" traversal in either
case. This reuses the "rev_info" for a bitmap and then a regular
traversal, but that should work OK (the bitmap code clears the pending
array in the usual way, just like a regular traversal would).

Note that I dropped the comment above the regular traversal here.  It
has little explanatory value, and makes the if-else logic much harder to
read.

Here are a few variants that I rejected:

  - it seems like both the reachability and recent traversals could be
    done in a single traversal. This was rejected by d3038d22f9 (prune:
    keep objects reachable from recent objects, 2014-10-15), though the
    balance may be different when using bitmaps. However, there's a
    subtle correctness issue, too: we use revs->ignore_missing_links for
    the recent traversal, but not the reachability one.

  - we could try using bitmaps for the recent traversal, too, which
    could possibly improve performance. But it would require some fixes
    in the bitmap code, which uses ignore_missing_links for its own
    purposes. Plus it would probably not help all that much in practice.
    We use the reachable tips to generate bitmaps, so those objects are
    likely not covered by bitmaps (unless they just became unreachable).
    And in general, we expect the set of unreachable objects to be much
    smaller anyway, so there's less to gain.

The test in t5304 detects the bug and confirms the fix.

I also beefed up the tests in t6501, which covers the mtime-checking
code more thoroughly, to handle the bitmap case (in addition to just
"loose" and "packed" cases). Interestingly, this test doesn't actually
detect the bug, because it is running "git gc", and not "prune"
directly. And "gc" will call "repack" first, which does not suffer the
same bug. So the old-but-reachable-from-recent objects get scooped up
into the new pack along with the actually-recent objects, which gives
both a recent mtime. But it seemed prudent to get more coverage of the
bitmap case for related code.

Reported-by: David Emett <dave@sp4m.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-29 10:38:25 +09:00
1e951c6473 pack-bitmap: clean up include_check after use
When a bitmap walk has to traverse (to fill in non-bitmapped objects),
we use rev_info's include_check mechanism to let us stop the traversal
early. But after setting the function and its data parameter, we never
clean it up. This means that if the rev_info is used for a subsequent
traversal without bitmaps, it will unexpectedly call into our
include_check function (worse, it will do so pointing to a now-defunct
stack variable in include_check_data, likely resulting in a segfault).

There's no code which does this now, but it's an accident waiting to
happen. Let's clean up after ourselves in the bitmap code.

Reported-by: David Emett <dave@sp4m.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-29 10:03:46 +09:00
9a3e3ca2ba subtree: be stricter about validating flags
Don't silently ignore a flag that's invalid for a given subcommand.  The
user expected it to do something; we should tell the user that they are
mistaken, instead of surprising the user.

It could be argued that this change might break existing users.  I'd
argue that those existing users are already broken, and they just don't
know it.  Let them know that they're broken.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
49470cd445 subtree: push: allow specifying a local rev other than HEAD
'git subtree split' lets you specify a rev other than HEAD.  'git push'
lets you specify a mapping between a local thing and a remot ref.  So
smash those together, and have 'git subtree push' let you specify which
local thing to run split on and push the result of that split to the
remote ref.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
94389e7c81 subtree: allow 'split' flags to be passed to 'push'
'push' does a 'split' internally, but it doesn't pass flags through to the
'split'.  This is silly, if you need to pass flags to 'split', then it
means that you can't use 'push'!

So, have 'push' accept 'split' flags, and pass them through to 'split'.

Add tests for this by copying split's tests with minimal modification.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
cb6551447b subtree: allow --squash to be used with --rejoin
Besides being a genuinely useful thing to do, this also just makes sense
and harmonizes which flags may be used when.  `git subtree split
--rejoin` amounts to "automatically go ahead and do a `git subtree
merge` after doing the main `git subtree split`", so it's weird and
arbitrary that you can't pass `--squash` to `git subtree split --rejoin`
like you can `git subtree merge`.  It's weird that `git subtree split
--rejoin` inherits `git subtree merge`'s `--message` but not `--squash`.

Reconcile the situation by just having `split --rejoin` actually just
call `merge` internally (or call `add` instead, as appropriate), so it
can get access to the full `merge` behavior, including `--squash`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
6468784dd2 subtree: give the docs a once-over
Just went through the docs looking for anything inaccurate or that can
be improved.

In the '-h' text, in the man page synopsis, and in the man page
description: Normalize the ordering of the list of sub-commands: 'add',
'merge', 'split', 'pull', 'push'.  This allows us to kinda separate the
lower-level add/merge/split from the higher-level pull/push.

'-h' text:
 - correction: Indicate that split's arg is optional.
 - clarity: Emphasize that 'pull' takes the 'add'/'merge' flags.

man page:

 - correction: State that all subcommands take options (it seemed to
   indicate that only 'split' takes any options other than '-P').
 - correction: 'split' only guarantees that the results are identical if
   the flags are identical.
 - correction: The flag is named '--ignore-joins', not '--ignore-join'.
 - completeness: Clarify that 'push' always operates on HEAD, and that
   'split' operates on HEAD if no local commit is given.
 - clarity: In the description, when listing commands, repeat what their
   arguments are.  This way the reader doesn't need to flip back and
   forth between the command description and the synopsis and the full
   description to understand what's being said.
 - clarity: In the <variables> used to give command arguments, give
   slightly longer, descriptive names.  Like <local-commit> instead of
   just <commit>.
 - clarity: Emphasize that 'pull' takes the 'add'/'merge' flags.
 - style: In the synopsis, list options before the subcommand.  This
   makes things line up and be much more readable when shown
   non-monospace (such as in `make html`), and also more closely matches
   other man pages (like `git-submodule.txt`).
 - style: Use the correct syntax for indicating the options ([<options>]
   instead of [OPTIONS]).
 - style: In the synopsis, separate 'pull' and 'push' from the other
   lower-level commands.  I think this helps readability.
 - style: Code-quote things in prose that seem like they should be
   code-quoted, like '.gitmodules', flags, or full commands.
 - style: Minor wording improvements, like more consistent mood (many
   of the command descriptions start in the imperative mood and switch
   to the indicative mode by the end).  That sort of thing.
 - style: Capitalize "ID".
 - style: Remove the "This option is only valid for XXX command" remarks
   from each option, and instead rely on the section headings.
 - style: Since that line is getting edited anyway, switch "behaviour" to
   American "behavior".
 - style: Trim trailing whitespace.

`todo`:
 - style: Trim trailing whitespace.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:19 +09:00
e9525a8a02 subtree: have $indent actually affect indentation
Currently, the $indent variable is just used to track how deeply we're
nested, and the debug log is indented by things like

   debug "  foo"

That is: The indentation-level is hard-coded.  It used to be that the
code couldn't recurse, so the indentation level could be known
statically, so it made sense to just hard-code it in the
output. However, since 315a84f9aa ("subtree: use commits before rejoins
for splits", 2018-09-28), it can now recurse, and the debug log is
misleading.

So fix that.  Indent according to $indent.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
534ff90dbd subtree: don't let debug and progress output clash
Currently, debug output (triggered by passing '-d') and progress output
stomp on each other.  The debug output is just streamed as lines to
stderr, and the progress output is sent to stderr as '%s\r'.  When
writing to a file, it is awkward to read and difficult to distinguish
between the debug output and a progress line.  When writing to a
terminal the debug lines hide progress lines.

So, when '-d' has been passed, spit out progress as 'progress: %s\n',
instead of as '%s\r', so that it can be detected, and so that the debug
lines don't overwrite the progress when written to a terminal.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
5cdae0f6fd subtree: add comments and sanity checks
For each function in subtree, add a usage comment saying what the
arguments are, and add an `assert` checking the number of arguments.

In figuring out each thing's arguments in order to write those comments
and assertions, it turns out that find_existing_splits is written as if
it takes multiple 'revs', but it is in fact only ever passed a single
'rev':

	unrevs="$(find_existing_splits "$dir" "$rev")" || exit $?

So go ahead and codify that by documenting and asserting that it takes
exactly two arguments, one dir and one rev.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
cbb5de8b83 subtree: remove duplicate check
`cmd_add` starts with a check that the directory doesn't yet exist.
However, the `main` function performs the exact same check before
calling `cmd_add`.  So remove the check from `cmd_add`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
e4f8baa88a subtree: parse revs in individual cmd_ functions
The main argument parser goes ahead and tries to parse revs to make
things simpler for the sub-command implementations.  But, it includes
enough special cases for different sub-commands.  And it's difficult
having having to think about "is this info coming from an argument, or a
global variable?".  So the main argument parser's effort to make things
"simpler" ends up just making it more confusing and complicated.

Begone with the 'revs' global variable; parse 'rev=$(...)' as needed in
individual 'cmd_*' functions.

Begone with the 'default' global variable.  Its would-be value is
knowable just from which function we're in.

Begone with the 'ensure_single_rev' function.  Its functionality can be
achieved by passing '--verify' to 'git rev-parse'.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
bbffb02383 subtree: use "^{commit}" instead of "^0"
They are synonyms.  Both are used in the file.  ^{commit} is clearer, so
"standardize" on that.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
22d5507493 subtree: don't fuss with PATH
Scripts needing to fuss with with adding $(git --exec-prefix) PATH
before loading git-sh-setup is a thing of the past.  As far as I can
tell, it's been a thing of the past since since Git v1.2.0 (2006-02-12),
or more specifically, since 77cb17e940 (Exec git programs without using
PATH, 2006-01-10).  However, it stuck around in contrib scripts and in
third-party scripts for long enough that it wasn't unusual to see.

Originally `git subtree` didn't fuss with PATH, but when people
(including the original subtree author) had problems, because it was a
common thing to see, it seemed that having subtree fuss with PATH was a
reasonable solution.

Here is an abridged history of fussing with PATH in subtree:

  2987e6add3 (Add explicit path of git installation by 'git --exec-path', Gianluca Pacchiella, 2009-08-20)

    As pointed out by documentation, the correct use of 'git-sh-setup' is
    using $(git --exec-path) to avoid problems with not standard
    installations.

    -. git-sh-setup
    +. $(git --exec-path)/git-sh-setup

  33aaa697a2 (Improve patch to use git --exec-path: add to PATH instead, Avery Pennarun, 2009-08-26)

    If you (like me) are using a modified git straight out of its source
    directory (ie. without installing), then --exec-path isn't actually correct.
    Add it to the PATH instead, so if it is correct, it'll work, but if it's
    not, we fall back to the previous behaviour.

    -. $(git --exec-path)/git-sh-setup
    +PATH=$(git --exec-path):$PATH
    +. git-sh-setup

  9c632ea29c ((Hopefully) fix PATH setting for msysgit, Avery Pennarun, 2010-06-24)

    Reported by Evan Shaw.  The problem is that $(git --exec-path) includes a
    'git' binary which is incompatible with the one in /usr/bin; if you run it,
    it gives you an error about libiconv2.dll.

    +OPATH=$PATH
     PATH=$(git --exec-path):$PATH
     . git-sh-setup
    +PATH=$OPATH  # apparently needed for some versions of msysgit

  df2302d774 (Another fix for PATH and msysgit, Avery Pennarun, 2010-06-24)

    Evan Shaw tells me the previous fix didn't work.  Let's use this one
    instead, which he says does work.

    This fix is kind of wrong because it will run the "correct" git-sh-setup
    *after* the one in /usr/bin, if there is one, which could be weird if you
    have multiple versions of git installed.  But it works on my Linux and his
    msysgit, so it's obviously better than what we had before.

    -OPATH=$PATH
    -PATH=$(git --exec-path):$PATH
    +PATH=$PATH:$(git --exec-path)
     . git-sh-setup
    -PATH=$OPATH  # apparently needed for some versions of msysgit

First of all, I disagree with Gianluca's reading of the documentation:
 - I haven't gone back to read what the documentation said in 2009, but
   in my reading of the 2021 documentation is that it includes "$(git
   --exec-path)/" in the synopsis for illustrative purposes, not to say
   it's the proper way.
 - After being executed by `git`, the git exec path should be the very
   first entry in PATH, so it shouldn't matter.
 - None of the scripts that are part of git do it that way.

But secondly, the root reason for fussing with PATH seems to be that
Avery didn't know that he needs to set GIT_EXEC_PATH if he's going to
use git from the source directory without installing.

And finally, Evan's issue is clearly just a bug in msysgit.  I assume
that msysgit has since fixed the issue, and also msysgit has been
deprecated for 6 years now, so let's drop the workaround for it.

So, remove the line fussing with PATH.  However, since subtree *is* in
'contrib/' and it might get installed in funny ways by users
after-the-fact, add a sanity check to the top of the script, checking
that it is installed correctly.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
a94f911072 subtree: use "$*" instead of "$@" as appropriate
"$*" is for when you want to concatenate the args together,
whitespace-separated; and "$@" is for when you want them to be separate
strings.

There are several places in subtree that erroneously use $@ when
concatenating args together into an error message.

For instance, if the args are argv[1]="dead" and argv[2]="beef", then
the line

    die "You must provide exactly one revision.  Got: '$@'"

surely intends to call 'die' with the argument

    argv[1]="You must provide exactly one revision.  Got: 'dead beef'"

however, because the line used $@ instead of $*, it will actually call
'die' with the arguments

    argv[1]="You must provide exactly one revision.  Got: 'dead"
    argv[2]="beef'"

This isn't a big deal, because 'die' concatenates its arguments together
anyway (using "$*").  But that doesn't change the fact that it was a
mistake to use $@ instead of $*, even though in the end $@ still ended
up doing the right thing.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
e2b11e4211 subtree: use more explicit variable names for cmdline args
Make it painfully obvious when reading the code which variables are
direct parsings of command line arguments.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
6d43585a68 subtree: use git-sh-setup's say
subtree currently defines its own `say` implementation, rather than
using git-sh-setups's implementation.  Change that, don't re-invent the
wheel.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:18 +09:00
f664304836 subtree: use git merge-base --is-ancestor
Instead of writing a slow `rev_is_descendant_of_branch $a $b` function
in shell, just use the fast `git merge-base --is-ancestor $b $a`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
8dc3240f5f subtree: drop support for git < 1.7
Suport for Git versions older than 1.7.0 (older than February 2010) was
nice to have when git-subtree lived out-of-tree.  But now that it lives
in git.git, it's not necessary to keep around.  While it's technically
in contrib, with the standard 'git' packages for common systems
(including Arch Linux and macOS) including git-subtree, it seems
vanishingly likely to me that people are separately installing
git-subtree from git.git alongside an older 'git' install (although it
also seems vanishingly likely that people are still using >11 year old
git installs).

Not that there's much reason to remove it either, it's not much code,
and none of my changes depend on a newer git (to my knowledge, anyway;
I'm not actually testing against older git).  I just figure it's an easy
piece of fat to trim, in the journey to making the whole thing easier to
hack on.

"Ignore space change" is probably helpful when viewing this diff.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
d2f0f81954 subtree: more consistent error propagation
Ensure that every $(subshell) that calls a function (as opposed to an
external executable) is followed by `|| exit $?`.  Similarly, ensure that
every `cmd | while read; do ... done` loop is followed by `|| exit $?`.

Both of those constructs mean that it can miss `die` calls, and keep
running when it shouldn't.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
5a3569774f subtree: don't have loose code outside of a function
Shove all of the loose code inside of a main() function.

This comes down to personal preference more than anything else.  A
preference that I've developed over years of maintaining large Bash
scripts, but still a mere personal preference.

In this specific case, it's also moving the `set -- -h`, the `git
rev-parse --parseopt`, and the `. git-sh-setup` to be closer to all
the rest of the argument parsing, which is a readability win on its
own, IMO.

"Ignore space change" is probably helpful when viewing this diff.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
b04538d99f subtree: t7900: add porcelain tests for 'pull' and 'push'
The 'pull' and 'push' subcommands deserve their own sections in the tests.
Add some basic tests for them.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
b269976979 subtree: t7900: add a test for the -h flag
It's a dumb test, but it's surprisingly easy to break.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
db6952b2b2 subtree: t7900: rename last_commit_message to last_commit_subject
t7900-subtree.sh defines a helper function named last_commit_message.
However, it only returns the subject line of the commit message, not the
entire commit message.  So rename it, to make the name less confusing.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
f1cd2d93c2 subtree: t7900: fix 'verify one file change per commit'
As far as I can tell, this test isn't actually testing anything, because
someone forgot to tack on `--name-only` to `git log`.  This seems to
have been the case since the test was first written, back in fa16ab36ad
("test.sh: make sure no commit changes more than one file at a time.",
2009-04-26), unless `git log` used to do that by default and didn't need
the flag back then?

Convincing myself that it's not actually testing anything was tricky,
the code is a little hard to reason about.  It can be made a lot simpler
if instead of trying to parse all of the info from a single `git log`,
we're OK calling `git log` from inside of a loop.  And it's my opinion
that tests are not the place for clever optimized code.

So, fix and simplify the test, so that it's actually testing something
and is simpler to reason about.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
63ac4f1ade subtree: t7900: delete some dead code
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:17 +09:00
c4566ab429 subtree: t7900: use 'test' for string equality
t7900-subtree.sh defines its own `check_equal A B` function, instead of
just using `test A = B` like all of the other tests.  Don't be special,
get rid of `check_equal` in favor of `test`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
40b1e1ec58 subtree: t7900: comment subtree_test_create_repo
It's unclear what the purpose of t7900-subtree.sh's
`subtree_test_create_repo` helper function is.  It wraps test-lib.sh's,
`test_create_repo` but follows that up by setting log.date=relative.  Why
does it set log.date=relative?

My first guess was that at one point the tests required that, but no
longer do, and that the function is now vestigial.  I even wrote a patch
to get rid of it and was moments away from `git send-email`ing it.

However, by chance when looking for something else in the history, I
discovered the true reason, from e7aac44ed2 (contrib/subtree: ignore
log.date configuration, 2015-07-21).  It's testing that setting
log.date=relative doesn't break `git subtree`, as at one point in the past
that did break `git subtree`.

So, add a comment about this, to avoid future such confusion.

And while at it, go ahead and (1) touch up the function to avoid a
pointless subshell and (2) update the one test that didn't use it.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
f700406957 subtree: t7900: use consistent formatting
The formatting in t7900-subtree.sh isn't even consistent throughout the
file.  Fix that; make it consistent throughout the file.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
f2bb7fef7a subtree: t7900: use test-lib.sh's test_count
Use test-lib.sh's `test_count`, instead instead of having
t7900-subtree.sh do its own book-keeping with `subtree_test_count` that
has to be explicitly incremented by calling `next_test`.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
914d512551 subtree: t7900: update for having the default branch name be 'main'
Most of the tests had been converted to support
`GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`, but `contrib/subtree/t/`
hadn't.

Convert it.  Most of the mentions of 'master' can just be replaced with
'HEAD'.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:47:16 +09:00
4c996deb4a .gitignore: ignore 'git-subtree' as a build artifact
Running `make -C contrib/subtree/ test` creates a `git-subtree` executable
in the root of the repo.  Add it to the .gitignore so that anyone hacking
on subtree won't have to deal with that noise.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 16:46:30 +09:00
a643157d5a repack: avoid loosening promisor objects in partial clones
When `git repack -A -d` is run in a partial clone, `pack-objects`
is invoked twice: once to repack all promisor objects, and once to
repack all non-promisor objects. The latter `pack-objects` invocation
is with --exclude-promisor-objects and --unpack-unreachable, which
loosens all objects unused during this invocation. Unfortunately,
this includes promisor objects.

Because the -d argument to `git repack` subsequently deletes all loose
objects also in packs, these just-loosened promisor objects will be
immediately deleted. However, this extra disk churn is unnecessary in
the first place.  For example, in a newly-cloned partial repo that
filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up
unpacking all trees and commits into the filesystem because every
object, in this particular case, is a promisor object. Depending on
the repo size, this increases the disk usage considerably: In my copy
of the linux.git, the object directory peaked 26GB of more disk usage.

In order to avoid this extra disk churn, pass the names of the promisor
packfiles as --keep-pack arguments to the second invocation of
`pack-objects`. This informs `pack-objects` that the promisor objects
are already in a safe packfile and, therefore, do not need to be
loosened.

For testing, we need to validate whether any object was loosened.
However, the "evidence" (loosened objects) is deleted during the
process which prevents us from inspecting the object directory.
Instead, let's teach `pack-objects` to count loosened objects and
emit via trace2 thus allowing inspecting the debug events after the
process is finished. This new event is used on the added regression
test.

Lastly, add a new perf test to evaluate the performance impact
made by this changes (tested on git.git):

     Test          HEAD^                 HEAD
     ----------------------------------------------------------
     5600.3: gc    134.38(41.93+90.95)   7.80(6.72+1.35) -94.2%

For a bigger repository, such as linux.git, the improvement is
even bigger:

     Test          HEAD^                     HEAD
     -------------------------------------------------------------------
     5600.3: gc    6833.00(918.07+3162.74)   268.79(227.02+39.18) -96.1%

These improvements are particular big because every object in the
newly-cloned partial repository is a promisor object.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 13:36:13 +09:00
7a14acdbe6 doc: point to diff attribute in patch format docs
From the documentation for generating patch text with diff-related
commands, refer to the documentation for the diff attribute.

This attribute influences the way that patches are generated, but this
was previously not mentioned in e.g., the git-diff manpage.

Signed-off-by: Peter Oliver <git@mavit.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 13:34:44 +09:00
37be11994f builtin/rm: avoid leaking pathspec and seen
parse_pathspec() populates pathspec, hence we need to clear it once it's
no longer needed. seen is xcalloc'd within the same function and
likewise needs to be freed once its no longer needed.

cmd_rm() has multiple early returns, therefore we need to clear or free
as soon as this data is no longer needed, as opposed to doing a cleanup
at the end.

LSAN output from t0020:

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac0a4 in do_xmalloc wrapper.c:41:8
    #2 0x9ac07a in xmalloc wrapper.c:62:9
    #3 0x873277 in parse_pathspec pathspec.c:582:2
    #4 0x646ffa in cmd_rm builtin/rm.c:266:2
    #5 0x4cd91d in run_builtin git.c:467:11
    #6 0x4cb5f3 in handle_builtin git.c:719:3
    #7 0x4ccf47 in run_argv git.c:808:4
    #8 0x4caf49 in cmd_main git.c:939:19
    #9 0x69dc0e in main common-main.c:52:11
    #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac2a6 in xrealloc wrapper.c:126:8
    #2 0x93b14d in strbuf_grow strbuf.c:98:2
    #3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3
    #4 0x93f726 in xstrvfmt strbuf.c:979:2
    #5 0x93f8b3 in xstrfmt strbuf.c:989:8
    #6 0x92ad8a in prefix_path_gently setup.c:115:15
    #7 0x873a8d in init_pathspec_item pathspec.c:439:11
    #8 0x87334f in parse_pathspec pathspec.c:589:3
    #9 0x646ffa in cmd_rm builtin/rm.c:266:2
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 15 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac048 in xstrdup wrapper.c:29:14
    #2 0x873ba2 in init_pathspec_item pathspec.c:468:20
    #3 0x87334f in parse_pathspec pathspec.c:589:3
    #4 0x646ffa in cmd_rm builtin/rm.c:266:2
    #5 0x4cd91d in run_builtin git.c:467:11
    #6 0x4cb5f3 in handle_builtin git.c:719:3
    #7 0x4ccf47 in run_argv git.c:808:4
    #8 0x4caf49 in cmd_main git.c:939:19
    #9 0x69dc0e in main common-main.c:52:11
    #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 1 byte(s) in 1 object(s) allocated from:
    #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9ac392 in xcalloc wrapper.c:140:8
    #2 0x647108 in cmd_rm builtin/rm.c:294:9
    #3 0x4cd91d in run_builtin git.c:467:11
    #4 0x4cb5f3 in handle_builtin git.c:719:3
    #5 0x4ccf47 in run_argv git.c:808:4
    #6 0x4caf49 in cmd_main git.c:939:19
    #7 0x69dbfe in main common-main.c:52:11
    #8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
805b789a69 builtin/rebase: release git_format_patch_opt too
options.git_format_patch_opt can be populated during cmd_rebase's setup,
and will therefore leak on return. Although we could just UNLEAK all of
options, we choose to strbuf_release() the individual member, which matches
the existing pattern (where we're freeing invidual members of options).

Leak found when running t0021:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac296 in xrealloc wrapper.c:126:8
    #2 0x93b13d in strbuf_grow strbuf.c:98:2
    #3 0x93bd3a in strbuf_add strbuf.c:295:2
    #4 0x60ae92 in strbuf_addstr strbuf.h:304:2
    #5 0x605f17 in cmd_rebase builtin/rebase.c:1759:3
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69dbfe in main common-main.c:52:11
    #11 0x7f66dae91349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
a317a553b8 builtin/for-each-ref: free filter and UNLEAK sorting.
sorting might be a list allocated in ref_default_sorting() (in this case
it's a fixed single item list, which has nevertheless been xcalloc'd),
or it might be a list allocated in parse_opt_ref_sorting(). In either
case we could free these lists - but instead we UNLEAK as we're at the
end of cmd_for_each_ref. (There's no existing implementation of
clear_ref_sorting(), and writing a loop to free the list seems more
trouble than it's worth.)

filter.with_commit/no_commit are populated via
OPT_CONTAINS/OPT_NO_CONTAINS, both of which create new entries via
parse_opt_commits(), and also need to be free'd or UNLEAK'd. Because
free_commit_list() already exists, we choose to use that over an UNLEAK.

LSAN output from t0041:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9ac252 in xcalloc wrapper.c:140:8
    #2 0x8a4a55 in ref_default_sorting ref-filter.c:2486:32
    #3 0x56c6b1 in cmd_for_each_ref builtin/for-each-ref.c:72:13
    #4 0x4cd91d in run_builtin git.c:467:11
    #5 0x4cb5f3 in handle_builtin git.c:719:3
    #6 0x4ccf47 in run_argv git.c:808:4
    #7 0x4caf49 in cmd_main git.c:939:19
    #8 0x69dabe in main common-main.c:52:11
    #9 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9abf54 in do_xmalloc wrapper.c:41:8
    #2 0x9abf2a in xmalloc wrapper.c:62:9
    #3 0x717486 in commit_list_insert commit.c:540:33
    #4 0x8644cf in parse_opt_commits parse-options-cb.c:98:2
    #5 0x869bb5 in get_value parse-options.c:181:11
    #6 0x8677dc in parse_long_opt parse-options.c:378:10
    #7 0x8659bd in parse_options_step parse-options.c:817:11
    #8 0x867fcd in parse_options parse-options.c:870:10
    #9 0x56c62b in cmd_for_each_ref builtin/for-each-ref.c:59:2
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dabe in main common-main.c:52:11
    #15 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
f3a9680791 mailinfo: also free strbuf lists when clearing mailinfo
mailinfo.p_hdr_info/s_hdr_info are null-terminated lists of strbuf's,
with entries pointing either to NULL or an allocated strbuf. Therefore
we need to free those strbuf's (and not just the data they contain)
whenever we're done with a given entry. (See handle_header() where those
new strbufs are malloc'd.)

Once we no longer need the list (and not just its entries) we can switch
over to strbuf_list_free() instead of manually iterating over the list,
which takes care of those additional details for us. We can only do this
in clear_mailinfo() - in handle_commit_message() we are only clearing the
array contents but want to reuse the array itself, hence we can't use
strbuf_list_free() there.

However, strbuf_list_free() cannot handle a NULL input, and the lists we
are freeing might be NULL. Therefore we add a NULL check in
strbuf_list_free() to make it safe to use with a NULL input (which is a
pattern used by some of the other *_free() functions around git).

Leak output from t0023:

Direct leak of 72 byte(s) in 3 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac9f4 in do_xmalloc wrapper.c:41:8
    #2 0x9ac9ca in xmalloc wrapper.c:62:9
    #3 0x7f6cf7 in handle_header mailinfo.c:205:10
    #4 0x7f5abf in check_header mailinfo.c:583:4
    #5 0x7f5524 in mailinfo mailinfo.c:1197:3
    #6 0x4dcc95 in parse_mail builtin/am.c:1167:6
    #7 0x4d9070 in am_run builtin/am.c:1732:12
    #8 0x4d5b7a in cmd_am builtin/am.c:2398:3
    #9 0x4cd91d in run_builtin git.c:467:11
    #10 0x4cb5f3 in handle_builtin git.c:719:3
    #11 0x4ccf47 in run_argv git.c:808:4
    #12 0x4caf49 in cmd_main git.c:939:19
    #13 0x69e43e in main common-main.c:52:11
    #14 0x7fc1fadfa349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 72 byte(s) leaked in 3 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
52a9436aa7 builtin/checkout: clear pending objects after diffing
add_pending_object() populates rev.pending, we need to take care of
clearing it once we're done.

This code is run close to the end of a checkout, therefore this leak
seems like it would have very little impact. See also LSAN output
from t0020 below:

Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc46 in xrealloc wrapper.c:126:8
    #2 0x83e3a3 in add_object_array_with_path object.c:337:3
    #3 0x8f672a in add_pending_object_with_path revision.c:329:2
    #4 0x8eaeab in add_pending_object_with_mode revision.c:336:2
    #5 0x8eae9d in add_pending_object revision.c:342:2
    #6 0x5154a0 in show_local_changes builtin/checkout.c:602:2
    #7 0x513b00 in merge_working_tree builtin/checkout.c:979:3
    #8 0x512cb3 in switch_branches builtin/checkout.c:1242:9
    #9 0x50f8de in checkout_branch builtin/checkout.c:1646:9
    #10 0x50ba12 in checkout_main builtin/checkout.c:2003:9
    #11 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
    #12 0x4cd91d in run_builtin git.c:467:11
    #13 0x4cb5f3 in handle_builtin git.c:719:3
    #14 0x4ccf47 in run_argv git.c:808:4
    #15 0x4caf49 in cmd_main git.c:939:19
    #16 0x69e43e in main common-main.c:52:11
    #17 0x7f5dd1d50349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 2048 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
265644367f builtin/check-ignore: clear_pathspec before returning
parse_pathspec() allocates new memory into pathspec, therefore we need
to free it when we're done.

An UNLEAK would probably be just as good here - but clear_pathspec() is
not much more work so we might as well use it. check_ignore() is either
called once directly from cmd_check_ignore() (in which case the leak
really doesnt matter), or it can be called multiple times in a loop from
check_ignore_stdin_paths(), in which case we're potentially leaking
multiple times - but even in this scenario the leak is so small as to
have no real consequence.

Found while running t0008:

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9aca44 in do_xmalloc wrapper.c:41:8
    #2 0x9aca1a in xmalloc wrapper.c:62:9
    #3 0x873c17 in parse_pathspec pathspec.c:582:2
    #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69e43e in main common-main.c:52:11
    #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc46 in xrealloc wrapper.c:126:8
    #2 0x93baed in strbuf_grow strbuf.c:98:2
    #3 0x93d696 in strbuf_vaddf strbuf.c:392:3
    #4 0x9400c6 in xstrvfmt strbuf.c:979:2
    #5 0x940253 in xstrfmt strbuf.c:989:8
    #6 0x92b72a in prefix_path_gently setup.c:115:15
    #7 0x87442d in init_pathspec_item pathspec.c:439:11
    #8 0x873cef in parse_pathspec pathspec.c:589:3
    #9 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #10 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #11 0x4cd91d in run_builtin git.c:467:11
    #12 0x4cb5f3 in handle_builtin git.c:719:3
    #13 0x4ccf47 in run_argv git.c:808:4
    #14 0x4caf49 in cmd_main git.c:939:19
    #15 0x69e43e in main common-main.c:52:11
    #16 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac9e8 in xstrdup wrapper.c:29:14
    #2 0x874542 in init_pathspec_item pathspec.c:468:20
    #3 0x873cef in parse_pathspec pathspec.c:589:3
    #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69e43e in main common-main.c:52:11
    #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 179 byte(s) leaked in 3 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
4fa268738c builtin/bugreport: don't leak prefixed filename
prefix_filename() returns newly allocated memory, and strbuf_addstr()
doesn't take ownership of its inputs. Therefore we have to make sure to
store and free prefix_filename()'s result.

As this leak is in cmd_bugreport(), we could just as well UNLEAK the
prefix - but there's no good reason not to just free it properly. This
leak was found while running t0091, see output below:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc66 in xrealloc wrapper.c:126:8
    #2 0x93baed in strbuf_grow strbuf.c:98:2
    #3 0x93c6ea in strbuf_add strbuf.c:295:2
    #4 0x69f162 in strbuf_addstr ./strbuf.h:304:2
    #5 0x69f083 in prefix_filename abspath.c:277:2
    #6 0x4fb275 in cmd_bugreport builtin/bugreport.c:146:9
    #7 0x4cd91d in run_builtin git.c:467:11
    #8 0x4cb5f3 in handle_builtin git.c:719:3
    #9 0x4ccf47 in run_argv git.c:808:4
    #10 0x4caf49 in cmd_main git.c:939:19
    #11 0x69df9e in main common-main.c:52:11
    #12 0x7f523a987349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
d895804b5a branch: FREE_AND_NULL instead of NULL'ing real_ref
real_ref was previously populated by dwim_ref(), which allocates new
memory. We need to make sure to free real_ref when discarding it.
(real_ref is already being freed at the end of create_branch() - but
if we discard it early then it will leak.)

This fixes the following leak found while running t0002-t0099:

Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x486954 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xdd6484 in xstrdup wrapper.c:29:14
    #2 0xc0f658 in expand_ref refs.c:671:12
    #3 0xc0ecf1 in repo_dwim_ref refs.c:644:22
    #4 0x8b1184 in dwim_ref ./refs.h:162:9
    #5 0x8b0b02 in create_branch branch.c:284:10
    #6 0x550cbb in update_refs_for_switch builtin/checkout.c:1046:4
    #7 0x54e275 in switch_branches builtin/checkout.c:1274:2
    #8 0x548828 in checkout_branch builtin/checkout.c:1668:9
    #9 0x541306 in checkout_main builtin/checkout.c:2025:9
    #10 0x5395fa in cmd_checkout builtin/checkout.c:2077:8
    #11 0x4d02a8 in run_builtin git.c:467:11
    #12 0x4cbfe9 in handle_builtin git.c:719:3
    #13 0x4cf04f in run_argv git.c:808:4
    #14 0x4cb85a in cmd_main git.c:939:19
    #15 0x820cf6 in main common-main.c:52:11
    #16 0x7f30bd9dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
b180c681bb bloom: clear each bloom_key after use
fill_bloom_key() allocates memory into bloom_key, we need to clean that
up once the key is no longer needed.

This leak was found while running t0002-t0099. Although this leak is
happening in code being called from a test-helper, the same code is also
used in various locations around git, and can therefore happen during
normal usage too. Gabor's analysis shows that peak-memory usage during
'git commit-graph write' is reduced on the order of 10% for a selection
of larger repos (along with an even larger reduction if we override
modified path bloom filter limits):
https://lore.kernel.org/git/20210411072651.GF2947267@szeder.dev/

LSAN output:

Direct leak of 308 byte(s) in 11 object(s) allocated from:
    #0 0x49a5e2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x6f4032 in xcalloc wrapper.c:140:8
    #2 0x4f2905 in fill_bloom_key bloom.c:137:28
    #3 0x4f34c1 in get_or_compute_bloom_filter bloom.c:284:4
    #4 0x4cb484 in get_bloom_filter_for_commit t/helper/test-bloom.c:43:11
    #5 0x4cb072 in cmd__bloom t/helper/test-bloom.c:97:3
    #6 0x4ca7ef in cmd_main t/helper/test-tool.c:121:11
    #7 0x4caace in main common-main.c:52:11
    #8 0x7f798af95349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 308 byte(s) leaked in 11 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
4c217a4c34 ls-files: free max_prefix when done
common_prefix() returns a new string, which we store in max_prefix -
this string needs to be freed to avoid a leak. This leak is happening
in cmd_ls_files, hence is of no real consequence - an UNLEAK would be
just as good, but we might as well free the string properly.

Leak found while running t0002, see output below:

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ab1b4 in do_xmalloc wrapper.c:41:8
    #2 0x9ab248 in do_xmallocz wrapper.c:75:8
    #3 0x9ab22a in xmallocz wrapper.c:83:9
    #4 0x9ab2d7 in xmemdupz wrapper.c:99:16
    #5 0x78d6a4 in common_prefix dir.c:191:15
    #6 0x5aca48 in cmd_ls_files builtin/ls-files.c:669:16
    #7 0x4cd92d in run_builtin git.c:453:11
    #8 0x4cb5fa in handle_builtin git.c:704:3
    #9 0x4ccf57 in run_argv git.c:771:4
    #10 0x4caf49 in cmd_main git.c:902:19
    #11 0x69ce2e in main common-main.c:52:11
    #12 0x7f64d4d94349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
5493ce7af9 wt-status: fix multiple small leaks
rev.prune_data is populated (in multiple functions) via copy_pathspec,
and therefore needs to be cleared after running the diff in those
functions.

rev(_info).pending is populated indirectly via setup_revisions, and also
needs to be cleared once diffing is done.

These leaks were found while running t0008 or t0021. The rev.prune_data
leaks are small (80B) but noisy, hence I won't bother including their
logs - the rev.pending leaks are bigger, and can happen early in the
course of other commands, and therefore possibly more valuable to fix -
see example log from a rebase below:

Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac2a6 in xrealloc wrapper.c:126:8
    #2 0x83da03 in add_object_array_with_path object.c:337:3
    #3 0x8f5d8a in add_pending_object_with_path revision.c:329:2
    #4 0x8ea50b in add_pending_object_with_mode revision.c:336:2
    #5 0x8ea4fd in add_pending_object revision.c:342:2
    #6 0x8ea610 in add_head_to_pending revision.c:354:2
    #7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2
    #8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6
    #9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac048 in xstrdup wrapper.c:29:14
    #2 0x83da8d in add_object_array_with_path object.c:349:17
    #3 0x8f5d8a in add_pending_object_with_path revision.c:329:2
    #4 0x8ea50b in add_pending_object_with_mode revision.c:336:2
    #5 0x8ea4fd in add_pending_object revision.c:342:2
    #6 0x8ea610 in add_head_to_pending revision.c:354:2
    #7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2
    #8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6
    #9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 2053 byte(s) leaked in 2 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
db69bf608d revision: free remainder of old commit list in limit_list
limit_list() iterates over the original revs->commits list, and consumes
many of its entries via pop_commit. However we might stop iterating over
the list early (e.g. if we realise that the rest of the list is
uninteresting). If we do stop iterating early, list will be pointing to
the unconsumed portion of revs->commits - and we need to free this list
to avoid a leak. (revs->commits itself will be an invalid pointer: it
will have been free'd during the first pop_commit.)

However the list pointer is later reused to iterate over our new list,
but only for the limiting_can_increase_treesame() branch. We therefore
need to introduce a new variable for that branch - and while we're here
we can rename the original list to original_list as that makes its
purpose more obvious.

This leak was found while running t0090. It's not likely to be very
impactful, but it can happen quite early during some checkout
invocations, and hence seems to be worth fixing:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac084 in do_xmalloc wrapper.c:41:8
    #2 0x9ac05a in xmalloc wrapper.c:62:9
    #3 0x7175d6 in commit_list_insert commit.c:540:33
    #4 0x71800f in commit_list_insert_by_date commit.c:604:9
    #5 0x8f8d2e in process_parents revision.c:1128:5
    #6 0x8f2f2c in limit_list revision.c:1418:7
    #7 0x8f210e in prepare_revision_walk revision.c:3577:7
    #8 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
    #9 0x512f05 in switch_branches builtin/checkout.c:1250:3
    #10 0x50f8de in checkout_branch builtin/checkout.c:1646:9
    #11 0x50ba12 in checkout_main builtin/checkout.c:2003:9
    #12 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
    #13 0x4cd91d in run_builtin git.c:467:11
    #14 0x4cb5f3 in handle_builtin git.c:719:3
    #15 0x4ccf47 in run_argv git.c:808:4
    #16 0x4caf49 in cmd_main git.c:939:19
    #17 0x69dc0e in main common-main.c:52:11
    #18 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 48 byte(s) in 3 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac084 in do_xmalloc wrapper.c:41:8
    #2 0x9ac05a in xmalloc wrapper.c:62:9
    #3 0x717de6 in commit_list_append commit.c:1609:35
    #4 0x8f1f9b in prepare_revision_walk revision.c:3554:12
    #5 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
    #6 0x512f05 in switch_branches builtin/checkout.c:1250:3
    #7 0x50f8de in checkout_branch builtin/checkout.c:1646:9
    #8 0x50ba12 in checkout_main builtin/checkout.c:2003:9
    #9 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
3dd71461e2 hex: print objects using the hash algorithm member
Now that all code paths correctly set the hash algorithm member of
struct object_id, write an object's hex representation using the hash
algorithm member embedded in it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
b8505ecbf2 hex: default to the_hash_algo on zero algorithm value
There are numerous places in the codebase where we assume we can
initialize data by zeroing all its bytes.  However, when we do that with
a struct object_id, it leaves the structure with a zero value for the
algorithm, which is invalid.

We could forbid this pattern and require that all struct object_id
instances be initialized using oidclr, but this seems burdensome and
it's unnatural to most C programmers.  Instead, if the algorithm is
zero, assume we wanted to use the default hash algorithm instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
71b7672b67 builtin/pack-objects: avoid using struct object_id for pack hash
We use struct object_id for the names of objects.  It isn't intended to
be used for other hash values that don't name objects such as the pack
hash.

Because struct object_id will soon need to have its algorithm member
set, using it in this code path would mean that we didn't set that
member, only the hash member, which would result in a crash.  For both
of these reasons, switch to using an unsigned char array of size
GIT_MAX_RAWSZ.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
72871b132c commit-graph: don't store file hashes as struct object_id
The idea behind struct object_id is that it is supposed to represent the
identifier of a standard Git object or a special pseudo-object like the
all-zeros object ID.  In this case, we have file hashes, which, while
similar, are distinct from the identifiers of objects.

Switch these code paths to use an unsigned char array.  This is both
more logically consistent and it means that we need not set the
algorithm identifier for the struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
dd15f4f457 builtin/show-index: set the algorithm for object IDs
In most cases, when we load the hash of an object into a struct
object_id, we load it using one of the oid* or *_oid_hex functions.
However, for git show-index, we read it in directly using fread.  As a
consequence, set the algorithm correctly so the objects can be used
correctly both now and in the future.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
14228447c9 hash: provide per-algorithm null OIDs
Up until recently, object IDs did not have an algorithm member, only a
hash.  Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms.  Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
5a6dce70d7 hash: set, copy, and use algo field in struct object_id
Now that struct object_id has an algorithm field, we should populate it.
This will allow us to handle object IDs in any supported algorithm and
distinguish between them.  Ensure that the field is written whenever we
write an object ID by storing it explicitly every time we write an
object.  Set values for the empty blob and tree values as well.

In addition, use the algorithm field to compare object IDs.  Note that
because we zero-initialize struct object_id in many places throughout
the codebase, we default to the default algorithm in cases where the
algorithm field is zero rather than explicitly initialize all of those
locations.

This leads to a branch on every comparison, but the alternative is to
compare the entire buffer each time and padding the buffer for SHA-1.
That alternative ranges up to 3.9% worse than this approach on the perf
t0001, t1450, and t1451.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
0e5e2284f1 builtin/pack-redundant: avoid casting buffers to struct object_id
Now that we need our instances of struct object_id to be zero padded, we
can no longer cast unsigned char buffers to be pointers to struct
object_id.  This file reads data out of the pack objects and then
inserts it directly into a linked list item which is a pointer to struct
object_id.  Instead, let's have the linked list item hold its own struct
object_id and copy the data into it.

In addition, since these are not really pointers to struct object_id,
stop passing them around as such, and call them what they really are:
pointers to unsigned char.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
5951bf467e Use the final_oid_fn to finalize hashing of object IDs
When we're hashing a value which is going to be an object ID, we want to
zero-pad that value if necessary.  To do so, use the final_oid_fn
instead of the final_fn anytime we're going to create an object ID to
ensure we perform this operation.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
ab795f0d77 hash: add a function to finalize object IDs
To avoid the penalty of having to branch in hash comparison functions,
we'll want to always compare the full hash member in a struct object_id,
which will require that SHA-1 object IDs be zero-padded.  To do so, add
a function which finalizes a hash context and writes it into an object
ID that performs this padding.

Move the definition of struct object_id and the constant definitions
higher up so we they are available for us to use.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
c3b4e4ee36 http-push: set algorithm when reading object ID
In most places in the codebase, we use oidread to properly read an
object ID into a struct object_id.  However, in the HTTP code, we end up
needing to parse a loose object path with a slash in it, so we can't do
that.  Let's instead explicitly set the algorithm in this function so we
can rely on it in the future.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
92e2cab96b Always use oidread to read into struct object_id
In the future, we'll want oidread to automatically set the hash
algorithm member for an object ID we read into it, so ensure we use
oidread instead of hashcpy everywhere we're copying a hash value into a
struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
cf0983213c hash: add an algo member to struct object_id
Now that we're working with multiple hash algorithms in the same repo,
it's best if we label each object ID with its algorithm so we can
determine how to format a given object ID. Add a member called algo to
struct object_id.

Performance testing on object ID-heavy workloads doesn't reveal a clear
change in performance.  Out of performance tests t0001 and t1450, there
are slight variations in performance both up and down, but all
measurements are within the margin of error.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
b722d4560e pretty: provide human date format
Add the placeholders %ah and %ch to format author date and committer
date, like --date=human does, which provides more humanity date output.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:09:32 +09:00
3593ebd3f5 pretty tests: give --date/format tests a better description
Change the description for the --date/format equivalency tests added
in 466fb6742d (pretty: provide a strict ISO 8601 date format,
2014-08-29) and 0df621172d (pretty: provide short date format,
2019-11-19) to be more meaningful.

This allows us to reword the comment added in the former commit to
refer to both tests, and any other future test, such as the in-flight
--date=human format being proposed in [1].

1. http://lore.kernel.org/git/pull.939.v2.git.1619275340051.gitgitgadget@gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:08:54 +09:00
fbfcaec8d8 pretty tests: simplify %aI/%cI date format test
Change a needlessly complex test for the %aI/%cI date
formats (iso-strict) added in 466fb6742d (pretty: provide a strict
ISO 8601 date format, 2014-08-29) to instead use the same pattern used
to test %as/%cs since 0df621172d (pretty: provide short date format,
2019-11-19).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:05:56 +09:00
34c319970d refs/debug: trace into reflog expiry too
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:59:39 +09:00
7cdb096903 git-completion.bash: consolidate cases in _git_stash()
The $subcommand case statement in _git_stash() is quite repetitive.
Consolidate the cases together into one catch-all case to reduce the
repetition.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:41:07 +09:00
59d85a2a05 git-completion.bash: use $__git_cmd_idx in more places
With the introduction of the $__git_cmd_idx variable in e94fb44042
(git-completion.bash: pass $__git_subcommand_idx from __git_main(),
2021-03-24), completion functions were able to know the index at which
the git command is listed, allowing them to skip options that are given
to the underlying git itself, not the corresponding command (e.g.
`-C asdf` in `git -C asdf branch`).

While most of the changes here are self-explanatory, some bear further
explanation.

For the __git_find_on_cmdline() and __git_find_last_on_cmdline() pair of
functions, these functions are only ever called in the context of a git
command completion function. These functions will only care about words
after the command so we can safely ignore the words before this.

For _git_worktree(), this change is technically a no-op (once the
__git_find_last_on_cmdline change is also applied). It was in poor style
to have hard-coded on the index right after `worktree`. In case
`git worktree` were to ever learn to accept options, the current
situation would be inflexible.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:41:07 +09:00
87e629756f git-completion.bash: rename to $__git_cmd_idx
In e94fb44042 (git-completion.bash: pass $__git_subcommand_idx from
__git_main(), 2021-03-24), the $__git_subcommand_idx variable was
introduced. Naming it after the index of the subcommand is needlessly
confusing as, when this variable is used, it is in the completion
functions for commands (e.g. _git_remote()) where for `git remote add`,
the `remote` is referred to as the command and `add` is referred to as
the subcommand.

Rename this variable so that it's obvious it's about git commands. While
we're at it, shorten up its name so that it's still readable without
being a handful to type.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:41:07 +09:00
482d549906 t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests
In order to test whether the new GIT_CONFIG_SYSTEM environment variable
behaves as expected, we unset GIT_CONFIG_NOSYSTEM in one of our tests in
t1300. But because tests are not executed in a subshell, this unset
leaks into all subsequent tests and may thus cause them to fail in some
environments. These failures are easily reproducable with `make
prefix=/root test`.

Fix the issue by not using `sane_unset GIT_CONFIG_NOSYSTEM`, but instead
just manually add it to the environment of the two command invocations
which need it.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 15:15:34 +09:00
a2ba162cda object-info: support for retrieving object info
Sometimes it is useful to get information of an object without having to
download it completely.

Add the "object-info" capability that lets the client ask for
object-related information with their full hexadecimal object names.

Only sizes are returned for now.

Signed-off-by: Bruno Albuquerque <bga@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 17:41:13 -07:00
311531c9de The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 17:23:37 -07:00
4090b6973b Merge branch 'js/access-nul-emulation-on-windows'
Portability fix.

* js/access-nul-emulation-on-windows:
  msvc: avoid calling `access("NUL", flags)`
2021-04-20 17:23:37 -07:00
b9fa3ba0ca Merge branch 'sg/bugreport-fixes'
The dependencies for config-list.h and command-list.h were broken
when the former was split out of the latter, which has been
corrected.

* sg/bugreport-fixes:
  Makefile: add missing dependencies of 'config-list.h'
2021-04-20 17:23:37 -07:00
092bf77e8c Merge branch 'jc/doc-do-not-capitalize-clarification'
Doc update for developers.

* jc/doc-do-not-capitalize-clarification:
  doc: clarify "do not capitalize the first word" rule
2021-04-20 17:23:36 -07:00
fdef940afe Merge branch 'ab/usage-error-docs'
Documentation updates, with unrelated comment updates, too.

* ab/usage-error-docs:
  api docs: document that BUG() emits a trace2 error event
  api docs: document BUG() in api-error-handling.txt
  usage.c: don't copy/paste the same comment three times
2021-04-20 17:23:36 -07:00
522010b573 Merge branch 'ab/detox-gettext-tests'
Test clean-up.

* ab/detox-gettext-tests:
  tests: remove all uses of test_i18cmp
2021-04-20 17:23:36 -07:00
e02f75c9eb Merge branch 'jt/fetch-pack-request-fix'
* jt/fetch-pack-request-fix:
  fetch-pack: buffer object-format with other args
2021-04-20 17:23:36 -07:00
196cc525e2 Merge branch 'hn/reftable-tables-doc-update'
Doc updte.

* hn/reftable-tables-doc-update:
  reftable: document an alternate cleanup method on Windows
2021-04-20 17:23:35 -07:00
2eebac2c49 Merge branch 'jk/pack-objects-bitmap-progress-fix'
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.

* jk/pack-objects-bitmap-progress-fix:
  pack-objects: update "nr_seen" progress based on pack-reused count
2021-04-20 17:23:35 -07:00
ab99efc817 Merge branch 'ab/userdiff-tests'
A bit of code clean-up and a lot of test clean-up around userdiff
area.

* ab/userdiff-tests:
  blame tests: simplify userdiff driver test
  blame tests: don't rely on t/t4018/ directory
  userdiff: remove support for "broken" tests
  userdiff tests: list builtin drivers via test-tool
  userdiff tests: explicitly test "default" pattern
  userdiff: add and use for_each_userdiff_driver()
  userdiff style: normalize pascal regex declaration
  userdiff style: declare patterns with consistent style
  userdiff style: re-order drivers in alphabetical order
2021-04-20 17:23:34 -07:00
6d7a62d74d Merge branch 'ar/userdiff-scheme'
Userdiff patterns for "Scheme" has been added.

* ar/userdiff-scheme:
  userdiff: add support for Scheme
2021-04-20 17:23:34 -07:00
8c8c8c0e16 git-completion.bash: separate some commands onto their own line
In e94fb44042 (git-completion.bash: pass $__git_subcommand_idx from
__git_main(), 2021-03-24), a line was introduced which contained
multiple statements. This is difficult to read so break it into multiple
lines.

While we're at it, follow this convention for the rest of the
__git_main() and break up lines that contain multiple statements.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 13:27:35 -07:00
9364bf465d doc: clarify the filename encoding in git diff
AFAICT parsing the output of `git diff --name-only master...feature`
is the intended way of programmatically getting the list of files
modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 12:57:26 -07:00
844c3f0b0b ref-filter: reuse output buffer
When we use `git for-each-ref`, every ref will allocate
its own output strbuf and error strbuf. But we can reuse
the final strbuf for each step ref's output. The error
buffer will also be reused, despite the fact that the git
will exit when `format_ref_array_item()` return a non-zero
value and output the contents of the error buffer.

The performance for `git for-each-ref` on the Git repository
itself with performance testing tool `hyperfine` changes from
23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms. Optimization is relatively
minor.

At the same time, we apply this optimization to `git tag -l`
and `git branch -l`.

This approach is similar to the one used by 79ed0a5
(cat-file: use a single strbuf for all output, 2018-08-14)
to speed up the cat-file builtin.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 11:09:50 -07:00
22f69a85ed ref-filter: get rid of show_ref_array_item
Inlining the exported function `show_ref_array_item()`,
which is not providing the right level of abstraction,
simplifies the API and can unlock improvements at the
former call sites.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 15:08:00 -07:00
68e66f2987 parallel-checkout: add design documentation
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 15:05:25 -07:00
4179b4897f config: allow overriding of global and system configuration
In order to have git run in a fully controlled environment without any
misconfiguration, it may be desirable for users or scripts to override
global- and system-level configuration files. We already have a way of
doing this, which is to unset both HOME and XDG_CONFIG_HOME environment
variables and to set `GIT_CONFIG_NOGLOBAL=true`. This is quite kludgy,
and unsetting the first two variables likely has an impact on other
executables spawned by such a script.

The obvious way to fix this would be to introduce `GIT_CONFIG_NOGLOBAL`
as an equivalent to `GIT_CONFIG_NOSYSTEM`. But in the past, it has
turned out that this design is inflexible: we cannot test system-level
parsing of the git configuration in our test harness because there is no
way to change its location, so all tests run with `GIT_CONFIG_NOSYSTEM`
set.

Instead of doing the same mistake with `GIT_CONFIG_NOGLOBAL`, introduce
two new variables `GIT_CONFIG_GLOBAL` and `GIT_CONFIG_SYSTEM`:

    - If unset, git continues to use the usual locations.

    - If set to a specific path, we skip reading the normal
      configuration files and instead take the path. By setting the path
      to `/dev/null`, no configuration will be loaded for the respective
      level.

This implements the usecase where we want to execute code in a sanitized
environment without any potential misconfigurations via `/dev/null`, but
is more flexible and allows for more usecases than simply adding
`GIT_CONFIG_NOGLOBAL`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
1e06eb9b5d config: unify code paths to get global config paths
There's two callsites which assemble global config paths, once in the
config loading code and once in the git-config(1) builtin. We're about
to implement a way to override global config paths via an environment
variable which would require us to adjust both sites.

Unify both code paths into a single `git_global_config()` function which
returns both paths for `~/.gitconfig` and the XDG config file. This will
make the subsequent patch which introduces the new envvar easier to
implement.

No functional changes are expected from this patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
c62a999c6e config: rename git_etc_config()
The `git_etc_gitconfig()` function retrieves the system-level path of
the configuration file. We're about to introduce a way to override it
via an environment variable, at which point the name of this function
would start to become misleading.

Rename the function to `git_system_config()` as a preparatory step.
While at it, the function is also refactored to pass memory ownership to
the caller. This is done to better match semantics of
`git_global_config()`, which is going to be introduced in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
9cf68b27d5 rev-list: allow filtering of provided items
When providing an object filter, it is currently impossible to also
filter provided items. E.g. when executing `git rev-list HEAD` , the
commit this reference points to will be treated as user-provided and is
thus excluded from the filtering mechanism. This makes it harder than
necessary to properly use the new `--filter=object:type` filter given
that even if the user wants to only see blobs, he'll still see commits
of provided references.

Improve this by introducing a new `--filter-provided-objects` option
to the git-rev-parse(1) command. If given, then all user-provided
references will be subject to filtering.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
169a15ebd6 pack-bitmap: implement combined filter
When the user has multiple objects filters specified, then this is
internally represented by having a "combined" filter. These combined
filters aren't yet supported by bitmap indices and can thus not be
accelerated.

Fix this by implementing support for these combined filters. The
implementation is quite trivial: when there's a combined filter, we
simply recurse into `filter_bitmap()` for all of the sub-filters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
7ab6aafa58 pack-bitmap: implement object type filter
The preceding commit has added a new object filter for git-rev-list(1)
which allows to filter objects by type. Implement the equivalent filter
for packfile bitmaps so that we can answer these queries fast.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
b0c42a53c9 list-objects: implement object type filter
While it already is possible to filter objects by some criteria in
git-rev-list(1), it is not yet possible to filter out only a specific
type of objects. This makes some filters less useful. The `blob:limit`
filter for example filters blobs such that only those which are smaller
than the given limit are returned. But it is unfit to ask only for these
smallish blobs, given that git-rev-list(1) will continue to print tags,
commits and trees.

Now that we have the infrastructure in place to also filter tags and
commits, we can improve this situation by implementing a new filter
which selects objects based on their type. Above query can thus
trivially be implemented with the following command:

    $ git rev-list --objects --filter=object:type=blob \
        --filter=blob:limit=200

Furthermore, this filter allows to optimize for certain other cases: if
for example only tags or commits have been selected, there is no need to
walk down trees.

The new filter is not yet supported in bitmaps. This is going to be
implemented in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
1c4d6f46be parallel-checkout: support progress displaying
Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
7531e4b66e parallel-checkout: add configuration options
Make parallel checkout configurable by introducing two new settings:
checkout.workers and checkout.thresholdForParallelism. The first defines
the number of workers (where one means sequential checkout), and the
second defines the minimum number of entries to attempt parallel
checkout.

To decide the default value for checkout.workers, the parallel version
was benchmarked during three operations in the linux repo, with cold
cache: cloning v5.8, checking out v5.8 from v2.6.15 (checkout I) and
checking out v5.8 from v5.7 (checkout II). The four tables below show
the mean run times and standard deviations for 5 runs in: a local file
system on SSD, a local file system on HDD, a Linux NFS server, and
Amazon EFS (all on Linux). Each parallel checkout test was executed with
the number of workers that brings the best overall results in that
environment.

Local SSD:
             Sequential             10 workers            Speedup
Clone        8.805 s ± 0.043 s      3.564 s ± 0.041 s     2.47 ± 0.03
Checkout I   9.678 s ± 0.057 s      4.486 s ± 0.050 s     2.16 ± 0.03
Checkout II  5.034 s ± 0.072 s      3.021 s ± 0.038 s     1.67 ± 0.03

Local HDD:
             Sequential             10 workers             Speedup
Clone        32.288 s ± 0.580 s     30.724 s ± 0.522 s    1.05 ± 0.03
Checkout I   54.172 s ±  7.119 s    54.429 s ± 6.738 s    1.00 ± 0.18
Checkout II  40.465 s ± 2.402 s     38.682 s ± 1.365 s    1.05 ± 0.07

Linux NFS server (v4.1, on EBS, single availability zone):

             Sequential             32 workers            Speedup
Clone        240.368 s ± 6.347 s    57.349 s ± 0.870 s    4.19 ± 0.13
Checkout I   242.862 s ± 2.215 s    58.700 s ± 0.904 s    4.14 ± 0.07
Checkout II  65.751 s ± 1.577 s     23.820 s ± 0.407 s    2.76 ± 0.08

EFS (v4.1, replicated over multiple availability zones):

             Sequential             32 workers            Speedup
Clone        922.321 s ± 2.274 s    210.453 s ± 3.412 s   4.38 ± 0.07
Checkout I   1011.300 s ± 7.346 s   297.828 s ± 0.964 s   3.40 ± 0.03
Checkout II  294.104 s ± 1.836 s    126.017 s ± 1.190 s   2.33 ± 0.03

The above benchmarks show that parallel checkout is most effective on
repositories located on an SSD or over a distributed file system. For
local file systems on spinning disks, and/or older machines, the
parallelism does not always bring a good performance. For this reason,
the default value for checkout.workers is one, a.k.a. sequential
checkout.

To decide the default value for checkout.thresholdForParallelism,
another benchmark was executed in the "Local SSD" setup, where parallel
checkout showed to be beneficial. This time, we compared the runtime of
a `git checkout -f`, with and without parallelism, after randomly
removing an increasing number of files from the Linux working tree. The
"sequential fallback" column below corresponds to the executions where
checkout.workers was 10 but checkout.thresholdForParallelism was equal
to the number of to-be-updated files plus one (so that we end up writing
sequentially). Each test case was sampled 15 times, and each sample had
a randomly different set of files removed. Here are the results:

             sequential fallback   10 workers           speedup
10   files    772.3 ms ± 12.6 ms   769.0 ms ± 13.6 ms   1.00 ± 0.02
20   files    780.5 ms ± 15.8 ms   775.2 ms ±  9.2 ms   1.01 ± 0.02
50   files    806.2 ms ± 13.8 ms   767.4 ms ±  8.5 ms   1.05 ± 0.02
100  files    833.7 ms ± 21.4 ms   750.5 ms ± 16.8 ms   1.11 ± 0.04
200  files    897.6 ms ± 30.9 ms   730.5 ms ± 14.7 ms   1.23 ± 0.05
500  files   1035.4 ms ± 48.0 ms   677.1 ms ± 22.3 ms   1.53 ± 0.09
1000 files   1244.6 ms ± 35.6 ms   654.0 ms ± 38.3 ms   1.90 ± 0.12
2000 files   1488.8 ms ± 53.4 ms   658.8 ms ± 23.8 ms   2.26 ± 0.12

From the above numbers, 100 files seems to be a reasonable default value
for the threshold setting.

Note: Up to 1000 files, we observe a drop in the execution time of the
parallel code with an increase in the number of files. This is a rather
odd behavior, but it was observed in multiple repetitions. Above 1000
files, the execution time increases according to the number of files, as
one would expect.

About the test environments: Local SSD tests were executed on an
i7-7700HQ (4 cores with hyper-threading) running Manjaro Linux. Local
HDD tests were executed on an Intel(R) Xeon(R) E3-1230 (also 4 cores
with hyper-threading), HDD Seagate Barracuda 7200.14 SATA 3.1, running
Debian. NFS and EFS tests were executed on an Amazon EC2 c5n.xlarge
instance, with 4 vCPUs. The Linux NFS server was running on a m6g.large
instance with 2 vCPUSs and a 1 TB EBS GP2 volume. Before each timing,
the linux repository was removed (or checked out back to its previous
state), and `sync && sysctl vm.drop_caches=3` was executed.

Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
e9e8adf1a8 parallel-checkout: make it truly parallel
Use multiple worker processes to distribute the queued entries and call
write_pc_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could affect
performance due to lock contention in the kernel. Work stealing (or any
other format of re-distribution) is not implemented yet.

The protocol between the main process and the workers is quite simple.
They exchange binary messages packed in pkt-line format, and use
PKT-FLUSH to mark the end of input (from both sides). The main process
starts the communication by sending N pkt-lines, each corresponding to
an item that needs to be written. These packets contain all the
necessary information to load, smudge, and write the blob associated
with each item. Then it waits for the worker to send back N pkt-lines
containing the results for each item. The resulting packet must contain:
the identification number of the item that it refers to, the status of
the operation, and the lstat() data gathered after writing the file (iff
the operation was successful).

For now, checkout always uses a hardcoded value of 2 workers, only to
demonstrate that the parallel checkout framework correctly divides and
writes the queued entries. The next patch will add user configurations
and define a more reasonable default, based on tests with the said
settings.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
04155bdad8 unpack-trees: add basic support for parallel checkout
This new interface allows us to enqueue some of the entries being
checked out to later uncompress them, apply in-process filters, and
write out the files in parallel. For now, the parallel checkout
machinery is enabled by default and there is no user configuration, but
run_parallel_checkout() just writes the queued entries in sequence
(without spawning additional workers). The next patch will actually
implement the parallelism and, later, we will make it configurable.

Note that, to avoid potential data races, not all entries are eligible
for parallel checkout. Also, paths that collide on disk (e.g.
case-sensitive paths in case-insensitive file systems), are detected by
the parallel checkout code and skipped, so that they can be safely
sequentially handled later. The collision detection works like the
following:

- If the collision was at basename (e.g. 'a/b' and 'a/B'), the framework
  detects it by looking for EEXIST and EISDIR errors after an
  open(O_CREAT | O_EXCL) failure.

- If the collision was at dirname (e.g. 'a/b' and 'A'), it is detected
  at the has_dirs_only_path() check, which is done for the leading path
  of each item in the parallel checkout queue.

Both verifications rely on the fact that, before enqueueing an entry for
parallel checkout, checkout_entry() makes sure that there is no file at
the entry's path and that its leading components are all real
directories. So, any later change in these conditions indicates that
there was a collision (either between two parallel-eligible entries or
between an eligible and an ineligible one).

After all parallel-eligible entries have been processed, the collided
(and thus, skipped) entries are sequentially fed to checkout_entry()
again. This is similar to the way the current code deals with
collisions, overwriting the previously checked out entries with the
subsequent ones. The only difference is that, since we no longer create
the files in the same order that they appear on index, we are not able
to determine which of the colliding entries will survive on disk (for
the classic code, it is always the last entry).

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
364bc11fe5 doc/diff-options: document new --diff-merges features
Document changes in -m and --diff-merges=m semantics, as well as new
--diff-merges=on option.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
17c13e60fd diff-merges: introduce log.diffMerges config variable
New log.diffMerges configuration variable sets the format that
--diff-merges=on will be using. The default is "separate".

t4013: add the following tests for log.diffMerges config:

* Test that wrong values are denied.

* Test that the value of log.diffMerges properly affects both
--diff-merges=on and -m.

t9902: fix completion tests for log.d* to match log.diffMerges.

Added documentation for log.diffMerges.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
38fc4dbbc2 diff-merges: adapt -m to enable default diff format
Let -m option (and --diff-merges=m) enable the default format instead
of "separate", to be able to tune it with log.diffMerges option.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
26a0f58da8 diff-merges: refactor set_diff_merges()
Split set_diff_merges() into separate parsing and execution functions,
the former to be reused for parsing of configuration values later in
the patch series.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
4320815eb9 diff-merges: introduce --diff-merges=on
Introduce the notion of default diff format for merges, and the option
"on" to select it. The default format is "separate" and can't yet
be changed, so effectively "on" is just a synonym for "separate"
for now. Add corresponding test to t4013.

This is in preparation for introducing log.diffMerges configuration
option that will let --diff-merges=on to be configured to any
supported format.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
b0c09ab879 The eleventh (aka "ort") batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:53:34 -07:00
257ae76ba9 Merge branch 'ah/merge-ort-ubsan-fix'
Code clean-up for merge-ort backend.

* ah/merge-ort-ubsan-fix:
  merge-ort: only do pointer arithmetic for non-empty lists
2021-04-16 13:53:34 -07:00
7bec8e7fa6 Merge branch 'en/ort-readiness'
Plug the ort merge backend throughout the rest of the system, and
start testing it as a replacement for the recursive backend.

* en/ort-readiness:
  Add testing with merge-ort merge strategy
  t6423: mark remaining expected failure under merge-ort as such
  Revert "merge-ort: ignore the directory rename split conflict for now"
  merge-recursive: add a bunch of FIXME comments documenting known bugs
  merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
  t: mark several submodule merging tests as fixed under merge-ort
  merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
  t6428: new test for SKIP_WORKTREE handling and conflicts
  merge-ort: support subtree shifting
  merge-ort: let renormalization change modify/delete into clean delete
  merge-ort: have ll_merge() use a special attr_index for renormalization
  merge-ort: add a special minimal index just for renormalization
  merge-ort: use STABLE_QSORT instead of QSORT where required
2021-04-16 13:53:34 -07:00
e2e1a03f6b Merge branch 'en/ort-perf-batch-10'
Various rename detection optimization to help "ort" merge strategy
backend.

* en/ort-perf-batch-10:
  diffcore-rename: determine which relevant_sources are no longer relevant
  merge-ort: record the reason that we want a rename for a file
  diffcore-rename: add computation of number of unknown renames
  diffcore-rename: check if we have enough renames for directories early on
  diffcore-rename: only compute dir_rename_count for relevant directories
  merge-ort: record the reason that we want a rename for a directory
  merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type
  diffcore-rename: take advantage of "majority rules" to skip more renames
2021-04-16 13:53:33 -07:00
76655e8a28 completion: avoid aliased command lookup error in nounset mode
Aliased command lookup accesses the `list` variable before it has been
set, causing an error in "nounset" mode. Initialize to an empty string
to avoid that.

    $ git nonexistent-command <Tab>bash: list: unbound variable

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:40:52 -07:00
32f67888d8 maintenance: respect remote.*.skipFetchAll
If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
cfd781ea22 maintenance: use 'git fetch --prefetch'
The 'prefetch' maintenance task previously forced the following refspec
for each remote:

	+refs/heads/*:refs/prefetch/<remote>/*

If a user has specified a more strict refspec for the remote, then this
prefetch task downloads more objects than necessary.

The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
2e03115d0c fetch: add --prefetch option
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.

Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:

 * Negative refspecs are preserved.
 * Refspecs without a destination are removed.
 * Refspecs whose source starts with "refs/tags/" are removed.
 * Other refspecs are placed within "refs/prefetch/".

Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.

There are some interesting cases that are worth testing.

An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".

It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.

Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
9160068ac6 msvc: avoid calling access("NUL", flags)
Apparently this is not supported with Microsoft's Universal C Runtime.
So let's not actually do that.

Instead, just return success because we _know_ that we expect the `NUL`
device to be present.

Side note: it is possible to turn off the "Null device driver" and
thereby disable `NUL`. Too many things are broken if this driver is
disabled, therefore it is not worth bothering to try to detect its
presence when `access()` is called.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 12:05:32 -07:00
332ec963bc pkt-line: do not report packet write errors twice
On write() errors, packet_write() dies with the same error message that
is already printed by its callee, packet_write_gently(). This produces
an unnecessarily verbose and repetitive output:

error: packet write failed
fatal: packet write failed: <strerror() message>

In addition to that, packet_write_gently() does not always fulfill its
caller expectation that errno will be properly set before a non-zero
return. In particular, that is not the case for a "data exceeds max
packet size" error. So, in this case, packet_write() will call
die_errno() and print an strerror(errno) message that might be totally
unrelated to the actual error.

Fix both those issues by turning packet_write() and
packet_write_gently() into wrappers to a common lower level function
that doesn't print the error message, but instead returns it on a buffer
for the caller to die() or error() as appropriate.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-15 15:05:31 -07:00
d1b10fc6d8 The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-15 13:36:01 -07:00
5a7e52bed2 Merge branch 'jz/apply-3way-cached'
"git apply" now takes "--3way" and "--cached" at the same time, and
work and record results only in the index.

* jz/apply-3way-cached:
  git-apply: allow simultaneous --cached and --3way options
2021-04-15 13:36:01 -07:00
b98db1dd70 Merge branch 'ab/complete-cherry-pick-head'
The command line completion (in contrib/) has learned that
CHERRY_PICK_HEAD is a possible pseudo-ref.

* ab/complete-cherry-pick-head:
  bash completion: complete CHERRY_PICK_HEAD
2021-04-15 13:36:01 -07:00
771c758e8a Merge branch 'jz/apply-run-3way-first'
"git apply --3way" has always been "to fall back to 3-way merge
only when straight application fails". Swap the order of falling
back so that 3-way is always attempted first (only when the option
is given, of course) and then straight patch application is used as
a fallback when it fails.

* jz/apply-run-3way-first:
  git-apply: try threeway first when "--3way" is used
2021-04-15 13:36:00 -07:00
f3cce896a8 transport: respect verbosity when setting upstream
A command such as `git push -qu origin feature` will print "Branch
'feature' set up to track remote branch 'feature' from 'origin'." even
when --quiet is passed. In this case it's because install_branch_config() is
always called with BRANCH_CONFIG_VERBOSE.

struct transport keeps track of the desired verbosity. Fix the above
issue by passing BRANCH_CONFIG_VERBOSE conditionally based on that.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-15 12:52:49 -07:00
151b6c2dd7 doc: clarify "do not capitalize the first word" rule
The same "do not capitalize the first word" rule is applied to both
our patch titles and error messages, but the existing description
was fuzzy in two aspects.

 * For error messages, it was not said that this was only about the
   first word that begins the sentence.

 * For both, it was not clear when a capital letter there was not an
   error.  We avoid capitalizing the first word when the only reason
   you would capitalize it is because it happens to be the first
   word in the sentence.  If a proper noun, which is usually spelled
   in capital letters, happens to come at the beginning of the
   sentence, it should be kept in capital letters.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 23:41:00 -07:00
4589bca829 name-hash: use expand_to_path()
A sparse-index loads the name-hash data for its entries, including the
sparse-directory entries. If a caller asks for a path that is contained
within a sparse-directory entry, we need to expand to a full index and
recalculate the name hash table before returning the result. Insert
calls to expand_to_path() to protect against this case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:48:01 -07:00
71f82d032f sparse-index: expand_to_path()
Some users of the index API have a specific path they are looking for,
but choose to use index_file_exists() to rely on the name-hash hashtable
instead of doing binary search with index_name_pos(). These users only
need to know a yes/no answer, not a position within the cache array.

When the index is sparse, the name-hash hash table does not contain the
full list of paths within sparse directories. It _does_ contain the
directory names for the sparse-directory entries.

Create a helper function, expand_to_path(), for intended use with the
name-hash hashtable functions. The integration with name-hash.c will
follow in a later change.

The solution here is to use ensure_full_index() when we determine that
the requested path is within a sparse directory entry. This will
populate the name-hash hashtable as the index is recomputed from
scratch.

There may be cases where the caller is trying to find an untracked path
that is not in the index but also is not within a sparse directory
entry. We want to minimize the overhead for these requests. If we used
index_name_pos() to find the insertion order of the path, then we could
determine from that position if a sparse-directory exists. (In fact,
just calling index_name_pos() in that case would lead to expanding the
index to a full index.) However, this takes O(log N) time where N is the
number of cache entries.

To keep the performance of this call based mostly on the input string,
use index_file_exists() to look for the ancestors of the path. Using the
heuristic that a sparse directory is likely to have a small number of
parent directories, we start from the bottom and build up. Use a string
buffer to allow mutating the path name to terminate after each slash for
each hashset test.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:54 -07:00
5f11669586 name-hash: don't add directories to name_hash
Sparse directory entries represent a directory that is outside the
sparse-checkout definition. These are not paths to blobs, so should not
be added to the name_hash table. Instead, they should be added to the
directory hashtable when 'ignore_case' is true.

Add a condition to avoid placing sparse directories into the name_hash
hashtable. This avoids filling the table with extra entries that will
never be queried.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:51 -07:00
f5fed74fb2 revision: ensure full index
Before iterating over all index entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior. This case could
be integrated later by ensuring that we walk the tree in the
sparse-directory entry, but the current behavior is only expecting
blobs. Save this integration for later when it can be properly tested.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:48 -07:00
dc26b23ebc resolve-undo: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:45 -07:00
0c18c059a1 read-cache: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:42 -07:00
465a04abc6 pathspec: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:40 -07:00
f7ef64be0c merge-recursive: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:37 -07:00
3450a304aa entry: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:35 -07:00
d425f65127 dir: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:32 -07:00
2508df0272 update-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:29 -07:00
a02912019a stash: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:26 -07:00
e43e2a17d2 rm: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:24 -07:00
299e2c4561 merge-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:21 -07:00
42f44e84eb ls-files: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid missing files.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:17 -07:00
46eb6e31ef grep: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one so we do not miss blobs to scan. Later, this can
integrate more carefully with sparse indexes with proper testing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:13 -07:00
2227ea175f fsck: ensure full index
When verifying all blobs reachable from the index, ensure that a sparse
index has been expanded to a full one to avoid missing some blobs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:11 -07:00
48b3c7da6c difftool: ensure full index
Before iterating over all cache entries, ensure that a sparse index has
been expanded to a full one to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:09 -07:00
cb8388df5b commit: ensure full index
These two loops iterate over all cache entries, so ensure that a sparse
index is expanded to a full index before we do so.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:06 -07:00
0f6d3ba6bd checkout: ensure full index
Before iterating over all cache entries in the checkout builtin, ensure
that we have a full index to avoid any unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:03 -07:00
1b850d37f4 checkout-index: ensure full index
Before we iterate over all cache entries, ensure that the index is not
sparse. This loop in checkout_all() might be safe to iterate over a
sparse index, but let's put this protection here until it can be
carefully tested.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:59 -07:00
54beed24d2 add: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:48 -07:00
118a2e8bde cache: move ensure_full_index() to cache.h
Soon we will insert ensure_full_index() calls across the codebase.
Instead of also adding include statements for sparse-index.h, let's just
use the fact that anything that cares about the index already has
cache.h in its includes.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:41 -07:00
95e0321c4d read-cache: expand on query into sparse-directory entry
Callers to index_name_pos() or index_name_stage_pos() have a specific
path in mind. If that happens to be a path with an ancestor being a
sparse-directory entry, it can lead to unexpected results.

In the case that we did not find the requested path, check to see if the
position _before_ the inserted position is a sparse directory entry that
matches the initial segment of the input path (including the directory
separator at the end of the directory name). If so, then expand the
index to be a full index and search again. This expansion will only
happen once per index read.

Future enhancements could be more careful to expand only the necessary
sparse directory entry, but then we would have a special "not fully
sparse, but also not fully expanded" mode that could affect writing the
index to file. Since this only occurs if a specific file is requested
outside of the sparse checkout definition, this is unlikely to be a
common situation.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:30 -07:00
847a9e5d4f *: remove 'const' qualifier for struct index_state
Several methods specify that they take a 'struct index_state' pointer
with the 'const' qualifier because they intend to only query the data,
not change it. However, we will be introducing a step very low in the
method stack that might modify a sparse-index to become a full index in
the case that our queries venture inside a sparse-directory entry.

This change only removes the 'const' qualifiers that are necessary for
the following change which will actually modify the implementation of
index_name_stage_pos().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:00 -07:00
839a66349e sparse-index: API protection strategy
Edit and expand the sparse-index design document with the plan for
guarding index operations with ensure_full_index().

Notably, the plan has changed to not have an expand_to_path() method in
favor of checking for a sparse-directory hit inside of the
index_path_pos() API.

The changes that follow this one will incrementally add
ensure_full_index() guards to iterations over all cache entries. Some
iterations over the cache entries are not protected due to a few
categories listed in the document. Since these are not being modified,
here is a short list of the files and methods that will not receive
these guards:

Looking for non-zero stage:
* builtin/add.c:chmod_pathspec()
* builtin/merge.c:count_unmerged_entries()
* merge-ort.c:record_conflicted_index_entries()
* read-cache.c:unmerged_index()
* rerere.c:check_one_conflict(), find_conflict(), rerere_remaining()
* revision.c:prepare_show_merge()
* sequencer.c:append_conflicts_hint()
* wt-status.c:wt_status_collect_changes_initial()

Looking for submodules:
* builtin/submodule--helper.c:module_list_compute()
* submodule.c: several methods
* worktree.c:validate_no_submodules()

Part of the index API:
* name-hash.c: lazy init methods
* preload-index.c:preload_thread(), preload_index()
* read-cache.c: file format methods

Checking for correct order of cache entries:
* read-cache.c:check_ce_order()

Ignores SKIP_WORKTREE entries or already aware:
* unpack-trees.c:mark_new_skip_worktree()
* wt-status.c:wt_status_check_sparse_checkout()

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:45:34 -07:00
54a3917115 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 15:28:53 -07:00
e0d4a63c09 Merge branch 'vs/completion-with-set-u'
The command-line completion script (in contrib/) had a couple of
references that would have given a warning under the "-u" (nounset)
option.

* vs/completion-with-set-u:
  completion: audit and guard $GIT_* against unset use
2021-04-13 15:28:53 -07:00
e6545201ad Merge branch 'ab/detox-config-gettext'
The last remnant of gettext-poison has been removed.

* ab/detox-config-gettext:
  config.c: remove last remnant of GIT_TEST_GETTEXT_POISON
2021-04-13 15:28:53 -07:00
a9414b86ac Merge branch 'gk/gitweb-redacted-email'
"gitweb" learned "e-mail privacy" feature to redact strings that
look like e-mail addresses on various pages.

* gk/gitweb-redacted-email:
  gitweb: add "e-mail privacy" feature to redact e-mail addresses
2021-04-13 15:28:52 -07:00
8446b388b1 Merge branch 'cc/test-helper-bloom-usage-fix'
Usage message fix for a test helper.

* cc/test-helper-bloom-usage-fix:
  test-bloom: fix missing 'bloom' from usage string
2021-04-13 15:28:52 -07:00
2279289e95 Merge branch 'ab/send-email-validate-errors'
Clean-up codepaths that implements "git send-email --validate"
option and improves the message from it.

* ab/send-email-validate-errors:
  git-send-email: improve --validate error output
  git-send-email: refactor duplicate $? checks into a function
  git-send-email: test full --validate output
2021-04-13 15:28:51 -07:00
4c6ac2da2c Merge branch 'tb/precompose-prefix-simplify'
Streamline the codepath to fix the UTF-8 encoding issues in the
argv[] and the prefix on macOS.

* tb/precompose-prefix-simplify:
  macOS: precompose startup_info->prefix
  precompose_utf8: make precompose_string_if_needed() public
2021-04-13 15:28:51 -07:00
1d5fbd45c4 Merge branch 'fm/user-manual-use-preface'
Doc update to improve git.info

* fm/user-manual-use-preface:
  user-manual.txt: assign preface an id and a title
2021-04-13 15:28:51 -07:00
0623669fc6 Merge branch 'tb/pack-preferred-tips-to-give-bitmap'
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.

* tb/pack-preferred-tips-to-give-bitmap:
  builtin/pack-objects.c: respect 'pack.preferBitmapTips'
  t/helper/test-bitmap.c: initial commit
  pack-bitmap: add 'test_bitmap_commits()' helper
2021-04-13 15:28:50 -07:00
7b55441db1 Merge branch 'ab/perl-do-not-abuse-map'
Perl critique.

* ab/perl-do-not-abuse-map:
  git-send-email: replace "map" in void context with "for"
2021-04-13 15:28:50 -07:00
f63add4aa8 Merge branch 'jk/ref-filter-segfault-fix'
A NULL-dereference bug has been corrected in an error codepath in
"git for-each-ref", "git branch --list" etc.

* jk/ref-filter-segfault-fix:
  ref-filter: fix NULL check for parse object failure
2021-04-13 15:28:50 -07:00
f6d25d7878 api docs: document that BUG() emits a trace2 error event
Correct documentation added in e544221d97 (trace2:
Documentation/technical/api-trace2.txt, 2019-02-22) to state that
calling BUG() also emits an "error" event. See ee4512ed48 (trace2:
create new combined trace facility, 2019-02-22) for the initial
implementation.

The BUG() function did not emit an event then however, that was only
changed later in 0a9dde4a04 (usage: trace2 BUG() invocations,
2021-02-05), that commit changed the code, but didn't update any of
the docs.

Let's also add a cross-reference from api-error-handling.txt.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:57:13 -07:00
4bf0c6f38f api docs: document BUG() in api-error-handling.txt
When the BUG() function was added in d8193743e0 (usage.c: add BUG()
function, 2017-05-12) these docs added in 1f23cfe0ef (doc: document
error handling functions and conventions, 2014-12-03) were not
updated. Let's do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:56:58 -07:00
c00c7382dd usage.c: don't copy/paste the same comment three times
In ee4512ed48 (trace2: create new combined trace facility,
2019-02-22) we started with two copies of this comment,
0ee10fd129 (usage: add trace2 entry upon warning(), 2020-11-23) added
a third. Let's instead add an earlier comment that applies to all
these mostly-the-same functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:56:28 -07:00
feeb03bce6 tests: remove all uses of test_i18cmp
Finish the removal I started in 1108cea7f8 (tests: remove most uses
of test_i18ncmp, 2021-02-11). At that time the function wasn't removed
due to disruption with in-flight changes, remove the occurrences that
have landed since then.

As of writing this there are no test_i18ncmp uses between "master" and
"seen", so let's also remove the function to finally put it to rest.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:41:24 -07:00
c1fa951d7e revision: avoid parsing with --exclude-promisor-objects
When --exclude-promisor-objects is given, before traversing any objects
we iterate over all of the objects in any promisor packs, marking them
as UNINTERESTING and SEEN. We turn the oid we get from iterating the
pack into an object with parse_object(), but this has two problems:

  - it's slow; we are zlib inflating (and reconstructing from deltas)
    every byte of every object in the packfile

  - it leaves the tree buffers attached to their structs, which means
    our heap usage will grow to store every uncompressed tree
    simultaneously. This can be gigabytes.

We can obviously fix the second by freeing the tree buffers after we've
parsed them. But we can observe that the function doesn't look at the
object contents at all! The only reason we call parse_object() is that
we need a "struct object" on which to set the flags. There are two
options here:

  - we can look up just the object type via oid_object_info(), and then
    call the appropriate lookup_foo() function

  - we can call lookup_unknown_object(), which gives us an OBJ_NONE
    struct (which will get auto-converted later by object_as_type() via
    calls to lookup_commit(), etc).

The first one is closer to the current code, but we do pay the price to
look up the type for each object. The latter should be more efficient in
CPU, though it wastes a little bit of memory (the "unknown" object
structs are a union of all object types, so some of the structs are
bigger than they need to be). It also runs the risk of triggering a
latent bug in code that calls lookup_object() directly but isn't ready
to handle OBJ_NONE (such code would already be buggy, but we use
lookup_unknown_object() infrequently enough that it might be hiding).

I went with the second option here. I don't think the risk is high (and
we'd want to find and fix any such bugs anyway), and it should be more
efficient overall.

The new tests in p5600 show off the improvement (this is on git.git):

  Test                                 HEAD^               HEAD
  -------------------------------------------------------------------------------
  5600.5: count commits                0.37(0.37+0.00)     0.38(0.38+0.00) +2.7%
  5600.6: count non-promisor commits   11.74(11.37+0.37)   0.04(0.03+0.00) -99.7%

The improvement is particularly big in this script because _every_
object in the newly-cloned partial repo is a promisor object. So after
marking them all, there's nothing left to traverse.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:22:37 -07:00
45a187cc34 lookup_unknown_object(): take a repository argument
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.

We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:18:46 -07:00
fcc07e980b is_promisor_object(): free tree buffer after parsing
To get the list of all promisor objects, we not only include all objects
in promisor packs, but also parse each of those objects to see which
objects they reference. After parsing a tree object, the tree->buffer
field will remain populated until we explicitly free it. So in a partial
clone of blob:none, for example, we are essentially reading every tree
in the repository (since they're all in the initial promisor pack), and
keeping all of their uncompressed contents in memory at once.

This patch frees the tree buffers after we've finished marking all of
their reachable objects. We shouldn't need to do this for any other
object type. While we are using some extra memory to store the structs,
no other object type stores the whole contents in its parsed form (we do
sometimes hold on to commit buffers, but less so these days due to
commit graphs, plus most commands which care about promisor objects turn
off the save_commit_buffer global).

Even for a moderate-sized repository like git.git, this patch drops the
peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB.
Fsck is a good candidate for measuring here because it doesn't interact
with the promisor code except to call is_promisor_object(), so we can
isolate just this problem.

The added perf test shows only a tiny improvement on my machine for
git.git, since 1.7GB isn't enough to cause any real memory pressure:

  Test                                 HEAD^               HEAD
  --------------------------------------------------------------------------------
  5600.4: fsck                         21.26(20.90+0.35)   20.84(20.79+0.04) -2.0%

With linux.git the absolute change is a bit bigger, though still a small
percentage:

  Test                          HEAD^                 HEAD
  -----------------------------------------------------------------------------
  5600.4: fsck                  262.26(259.13+3.12)   254.92(254.62+0.29) -2.8%

I didn't have the patience to run it under massif with linux.git, but
it's probably on the order of about 14GB improvement, since that's the
sum of the sizes of all of the uncompressed trees (but still isn't
enough to create memory pressure on this particular machine, which has
64GB of RAM). Smaller machines would probably see a bigger effect on
runtime (and sadly our perf suite does not measure peak heap).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:16:39 -07:00
2a2112a429 refs: print errno for read_raw_ref if GIT_TRACE_REFS is set
The ref backend API uses errno as a sideband error channel.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:42:37 -07:00
61a7660516 reftable: document an alternate cleanup method on Windows
The new method uses the update_index counter, which isn't susceptible to clock
inaccuracies.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:29:44 -07:00
4f4d2017a3 svn tests: refactor away a "set -e" in test body
Refactor a test added in 83c9433e67 (git-svn: support for git-svn
propset, 2014-12-07) to avoid using "set -e" in the test body. Let's
move this into a setup test using "test_expect_success" instead.

While I'm at it refactor:

 * Repeated "mkdir" to "mkdir -p"
 * Uses of "touch" to creating the files with ">" instead
 * The "rm -rf" at the end to happen in a "test_when_finished"

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:10:51 -07:00
88fce1219e svn tests: remove legacy re-setup from init-clone test
Remove the immediate "rm -rf .git" from the start of this test. This
was added back in 41337e22f0 (git-svn: add tests for command-line
usage of init and clone commands, 2007-11-17) when there was a "trash"
directory shared by all the tests, but ever since abc5d372ec (Enable
parallel tests, 2008-08-08) we've had per-test trash directories.

So this setup can simply be removed. We could use
TEST_NO_CREATE_REPO=true, but I don't think it's worth the effort to
go out of our way to be different. It doesn't matter that we now have
a redundant .git at the top-level.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 14:10:50 -07:00
8e118e8490 pack-objects: update "nr_seen" progress based on pack-reused count
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.

After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).

But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.

This worked correctly once upon a time, but was broken in 5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:

  - prior to the pack-reuse fixes from a14aebeac3 (Merge branch
    'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
    never kicked in anyway

  - the output looks _kind of_ correct. The "backwards" moment is hard
    to catch, because we overwrite the old progress number with the new
    one, and the larger number is displayed only for a second. So unless
    you look at that exact second, you just see the much smaller value,
    counting up to the number of non-reused objects (though of course if
    you catch it in stderr, or look at GIT_TRACE_PACKET from a server
    with bitmaps, you can see both values).

This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per 5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.

We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 11:31:30 -07:00
c1ea48a8f7 merge-ort: only do pointer arithmetic for non-empty lists
versions could be an empty string_list. In that case, versions->items is
NULL, and we shouldn't be trying to perform pointer arithmetic with it (as
that results in undefined behaviour).

Moreover we only use the results of this calculation once when calling
QSORT. Therefore we choose to skip creating relevant_entries and call
QSORT directly with our manipulated pointers (but only if there's data
requiring sorting). This lets us avoid abusing the string_list API,
and saves us from having to explain why this abuse is OK.

Finally, an assertion is added to make sure that write_tree() is called
with a valid offset.

This issue has probably existed since:
  ee4012dcf9 (merge-ort: step 2 of tree writing -- function to create tree object, 2020-12-13)
But it only started occurring during tests since tests started using
merge-ort:
  f3b964a07e (Add testing with merge-ort merge strategy, 2021-03-20)

For reference - here's the original UBSAN commit that implemented this
check, it sounds like this behaviour isn't actually likely to cause any
issues (but we might as well fix it regardless):
https://reviews.llvm.org/D67122

UBSAN output from t3404 or t5601:

merge-ort.c:2669:43: runtime error: applying zero offset to null pointer
    #0 0x78bb53 in write_tree merge-ort.c:2669:43
    #1 0x7856c9 in process_entries merge-ort.c:3303:2
    #2 0x782317 in merge_ort_nonrecursive_internal merge-ort.c:3744:2
    #3 0x77feef in merge_incore_nonrecursive merge-ort.c:3853:2
    #4 0x6f6a5c in do_recursive_merge sequencer.c:640:3
    #5 0x6f6a5c in do_pick_commit sequencer.c:2221:9
    #6 0x6ef055 in single_pick sequencer.c:4814:9
    #7 0x6ef055 in sequencer_pick_revisions sequencer.c:4867:10
    #8 0x4fb392 in run_sequencer revert.c:225:9
    #9 0x4fa5b0 in cmd_revert revert.c:235:8
    #10 0x42abd7 in run_builtin git.c:453:11
    #11 0x429531 in handle_builtin git.c:704:3
    #12 0x4282fb in run_argv git.c:771:4
    #13 0x4282fb in cmd_main git.c:902:19
    #14 0x524b63 in main common-main.c:52:11
    #15 0x7fc2ca340349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #16 0x4072b9 in _start start.S:120

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior merge-ort.c:2669:43 in

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 10:38:10 -07:00
9a2a4f9544 list-objects: support filtering by tag and commit
Object filters currently only support filtering blobs or trees based on
some criteria. This commit lays the foundation to also allow filtering
of tags and commits.

No change in behaviour is expected from this commit given that there are
no filters yet for those object types.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 09:35:50 -07:00
414abf159f docs: fix linting issues due to incorrect relative section order
Re-order the sections of a few manual pages to be consistent with the
entirety of the rest of our documentation. This allows us to remove
the just-added whitelist of "bad" order from
lint-man-section-order.perl.

I'm doing that this way around so that code will be easy to dig up if
we'll need it in the future. I've intentionally not added some other
sections such as EXAMPLES to the list of known sections.

If we were to add that we'd find some out of order. Perhaps we'll want
to order those consistently as well in the future, at which point
whitelisting some of them might become handy again.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
ea8b9271b1 doc lint: lint relative section order
Add a linting script to check the relative order of the sections in
the documentation. We should have NAME, then SYNOPSIS, DESCRIPTION,
OPTIONS etc. in that order.

That holds true throughout our documentation, except for a few
exceptions which are hardcoded in the linting script.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
cafd9828e8 doc lint: lint and fix missing "GIT" end sections
Lint for and fix the three manual pages that were missing the standard
"Part of the linkgit:git[1] suite" end section.

We only do this for the man[157] section documents (we don't have
anything outside those sections), not files to be included,
howto *.txt files etc.

We could also add this to the existing (and then renamed)
lint-gitlink.perl, but I'm not doing that here.

Obviously all of that fits in one script, but I think for something
like this that's a one-off script with global variables it's much
harder to follow when a large part of your script is some if/else or
keeping/resetting of state simply to work around the script doing two
things instead of one.

Especially because in this case this script wants to process the file
as one big string, but lint-gitlink.perl wants to look at it one line
at a time. We could also consolidate this whole thing and
t/check-non-portable-shell.pl, but that one likes to join lines as
part of its shell parsing.

So let's just add another script, whole scaffolding is basically:

    use strict;
    use warnings;
    sub report { ... }
    my $code = 0;
    while (<>) { ... }
    exit $code;

We'd spend more lines effort trying to consolidate them than just
copying that around.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
d2c9908076 doc lint: fix bugs in, simplify and improve lint script
The lint-gitlink.perl script added in ab81411ced (ci: validate
"linkgit:" in documentation, 2016-05-04) was more complex than it
needed to be. It:

 - Was using File::Find to recursively find *.txt files in
   Documentation/, let's instead use the Makefile as a source of truth
   for *.txt files, and pass it down to the script.

 - We now don't lint linkgit:* in RelNotes/* or technical/*, which we
   shouldn't have been doing in the first place anyway.

 - When the doc-diff script was added in beb188e22a (add a script to
   diff rendered documentation, 2018-08-06) we started sometimes having
   a "git worktree" under Documentation/.

   This tree contains a full checkout of git.git, as a result the
   "lint" script would recurse into that, and lint any *.txt file
   found in that entire repository.

   In practice the only in-tree "linkgit" outside of the
   Documentation/ tree is contrib/contacts/git-contacts.txt and
   contrib/subtree/git-subtree.txt, so this wouldn't emit any errors

Now we instead simply trust the Makefile to give us *.txt files.
Since the Makefile also knows what sections each page should be in we
don't have to open the files ourselves and try to parse that out. As a
bonus this will also catch bugs with the section line in the files
themselves being incorrect.

The structure of the new script is mostly based on
t/check-non-portable-shell.pl. As an added bonus it will also use
pos() to print where the problems it finds are, e.g. given an issue
like:

    diff --git a/Documentation/git-cherry.txt b/Documentation/git-cherry.txt
    [...]
     and line numbers.  git-cherry therefore detects when commits have been
    -"copied" by means of linkgit:git-cherry-pick[1], linkgit:git-am[1] or
    -linkgit:git-rebase[1].
    +"copied" by means of linkgit:git-cherry-pick[2], linkgit:git-am[3] or
    +linkgit:git-rebase[4].

We'll now emit:

    git-cherry.txt:20: error: git-cherry-pick[2]: wrong section (should be 1), shown with 'HERE' below:
    git-cherry.txt:20:      '"copied" by means of linkgit:git-cherry-pick[2]' <-- HERE
    git-cherry.txt:20: error: git-am[3]: wrong section (should be 1), shown with 'HERE' below:
    git-cherry.txt:20:      '"copied" by means of linkgit:git-cherry-pick[2], linkgit:git-am[3]' <-- HERE
    git-cherry.txt:21: error: git-rebase[4]: wrong section (should be 1), shown with 'HERE' below:
    git-cherry.txt:21:      'linkgit:git-rebase[4]' <-- HERE

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
3951eeb6d9 doc lint: Perl "strict" and "warnings" in lint-gitlink.perl
Amend this script added in ab81411ced (ci: validate "linkgit:" in
documentation, 2016-05-04) to pass under "use strict", and add a "use
warnings" for good measure.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
19bcc73e70 Documentation/Makefile: make doc.dep dependencies a variable again
Re-introduce a variable to declare what *.txt files need to be
considered for the purposes of scouring files to generate a dependency
graph of includes.

When doc.dep was introduced in a5ae8e64cf (Fix documentation
dependency generation., 2005-11-07) we had such a variable called
TEXTFILES, but it was refactored away just a few commits after that in
fb612d54c1 (Documentation: fix dependency generation.,
2005-11-07). I'm planning to add more wildcards here, so let's bring
it back.

I'm not calling it TEXTFILES because we e.g. don't consider
Documentation/technical/*.txt when generating the graph (they don't
use includes). Let's instead call it DOC_DEP_TXT.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
824c621b76 Documentation/Makefile: make $(wildcard howto/*.txt) a var
Refactor occurrences of $(wildcard howto/*.txt) into a single
HOWTO_TXT variable for readability and consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:36:34 -07:00
e5b32bffd1 rebase: don't override --no-reschedule-failed-exec with config
Fix a bug in how --no-reschedule-failed-exec interacts with
rebase.rescheduleFailedExec=true being set in the config. Before this
change the --no-reschedule-failed-exec config option would be
overridden by the config.

This bug happened because of the particulars of how "rebase" works
v.s. most other git commands when it comes to parsing options and
config:

When we read the config and parse the CLI options we correctly prefer
the --no-reschedule-failed-exec option over
rebase.rescheduleFailedExec=true in the config. So far so good.

However the --reschedule-failed-exec option doesn't take effect when
the rebase starts (we'd just create a
".git/rebase-merge/reschedule-failed-exec" file if it was true). It
only takes effect when the exec command fails, at which point we'll
reschedule the failed "exec" command.

Since we only wrote out the positive
".git/rebase-merge/reschedule-failed-exec" under
--reschedule-failed-exec, but nothing with --no-reschedule-failed-exec
we'll forget that we asked not to reschedule failed "exec", and would
happily re-read the config and see that
rebase.rescheduleFailedExec=true is set.

So the config will effectively override the user having explicitly
disabled the option on the command-line.

Even more confusingly: Since rebase accepts different options based on
its state there wasn't even a way to get around this with "rebase
--continue --no-reschedule-failed-exec" (but you could of course set
the config with "rebase -c ...").

I think the least bad way out of this is to declare that for such
options and config whatever we decide at the beginning of the rebase
goes. So we'll now always create either a "reschedule-failed-exec" or
a "no-reschedule-failed-exec file at the start, not just the former if
we decided we wanted the feature.

With this new worldview you can no longer change the setting once a
rebase has started except by manually removing the state files
discussed above. I think making it work like that is the the least
confusing thing we can do.

In the future we might want to learn to change the setting in the
middle by combining "--edit-todo" with
"--[no-]reschedule-failed-exec", we currently don't support combining
those options, or any other way to change the state in the middle of
the rebase short of manually editing the files in
".git/rebase-merge/*".

The bug being fixed here originally came about because of a
combination of the behavior of the code added in d421afa0c6 (rebase:
introduce --reschedule-failed-exec, 2018-12-10) and the addition of
the config variable in 969de3ff0e (rebase: add a config option to
default to --reschedule-failed-exec, 2018-12-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:23:49 -07:00
cd663df710 rebase tests: camel-case rebase.rescheduleFailedExec consistently
Fix a test added in 906b63942a (rebase --am: ignore
rebase.rescheduleFailedExec, 2019-07-01) to camel-case the
configuration variable. This doesn't change the behavior of the test,
it's merely to help its human readers.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:23:48 -07:00
628d81be6c list-objects: move tag processing into its own function
Move processing of tags into its own function to make the logic easier
to extend when we're going to implement filtering for tags. No change in
behaviour is expected from this commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:03:20 -07:00
b2025da38b revision: mark commit parents as NOT_USER_GIVEN
The NOT_USER_GIVEN flag of an object marks whether a flag was explicitly
provided by the user or not. The most important use case for this is
when filtering objects: only objects that were not explicitly requested
will get filtered.

The flag is currently only set for blobs and trees, which has been fine
given that there are no filters for tags or commits currently. We're
about to extend filtering capabilities to add object type filter though,
which requires us to set up the NOT_USER_GIVEN flag correctly -- if it's
not set, the object wouldn't get filtered at all.

Mark unseen commit parents as NOT_USER_GIVEN when processing parents.
Like this, explicitly provided parents stay user-given and thus
unfiltered, while parents which get loaded as part of the graph walk
can be filtered.

This commit shouldn't have any user-visible impact yet as there is no
logic to filter commits yet.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:03:20 -07:00
a812789c26 uploadpack.txt: document implication of uploadpackfilter.allow
When `uploadpackfilter.allow` is set to `true`, it means that filters
are enabled by default except in the case where a filter is explicitly
disabled via `uploadpackilter.<filter>.allow`. This option will not only
enable the currently supported set of filters, but also any filters
which get added in the future. As such, an admin which wants to have
tight control over which filters are allowed and which aren't probably
shouldn't ever set `uploadpackfilter.allow=true`.

Amend the documentation to make the ramifications more explicit so that
admins are aware of this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-10 23:03:19 -07:00
6871d0cec6 fetch-pack: refactor command and capability write
A subsequent commit will need this functionality independent of the rest
of send_fetch_request(), so put this into its own function.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:50:22 -07:00
57c3451b2e fetch-pack: refactor add_haves()
A subsequent commit will need part, but not all, of the functionality in
add_haves(), so move some of its functionality to its sole caller
send_fetch_request().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:50:21 -07:00
8102570374 fetch-pack: refactor process_acks()
A subsequent commit will need part, but not all, of the functionality in
process_acks(), so move some of its functionality to its sole caller
do_fetch_pack_v2(). As a side effect, the resulting code is also
shorter.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:50:21 -07:00
6db01a7308 Merge branch 'jt/fetch-pack-request-fix' into jt/push-negotiation
* jt/fetch-pack-request-fix:
  fetch-pack: buffer object-format with other args
2021-04-08 21:50:10 -07:00
81ed96a9b2 fetch-pack: buffer object-format with other args
In send_fetch_request(), "object-format" is written directly to the file
descriptor, as opposed to the other arguments, which are buffered.
Buffer "object-format" as well. "object-format" must be buffered; in
particular, it must appear after "command=fetch" in the request.

This divergence was introduced in 4b831208bb ("fetch-pack: parse and
advertise the object-format capability", 2020-05-27), perhaps as an
oversight (the surrounding code at the point of this commit has already
been using a request buffer.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 21:49:47 -07:00
0996dd3d6d gitweb: add "e-mail privacy" feature to redact e-mail addresses
Gitweb extracts content from the Git log and makes it accessible
over HTTP. As a result, e-mail addresses found in commits are
exposed to web crawlers and they may not respect robots.txt.
This can result in unsolicited messages.

Introduce an 'email-privacy' feature which redacts e-mail addresses
from the generated HTML content. Specifically, obscure addresses
retrieved from the the author/committer and comment sections of the
Git log. The feature is off by default.

This feature does not prevent someone from downloading the
unredacted commit log, e.g., by cloning the repository, and
extracting information from it. It aims to hinder the low-
effort, bulk collection of e-mail addresses by web crawlers.

Signed-off-by: Georgios Kontaxis <geko1702+commits@99rst.org>
Acked-by: Eric Wong <e@80x24.org>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 15:54:26 -07:00
56550ea718 Makefile: add missing dependencies of 'config-list.h'
We auto-generate the list of supported configuration variables from
'Documentation/config/*.txt', and that list used to be created by the
'generate-cmdlist.sh' helper script and stored in the 'command-list.h'
header.  Commit 709df95b78 (help: move list_config_help to
builtin/help, 2020-04-16) extracted this into a dedicated
'generate-configlist.sh' script and 'config-list.h' header, and added
a new target in the 'Makefile' as well, but while doing so it forgot
to extract the dependencies of the latter.  Consequently, since then
'config-list.h' is not re-generated when 'Documentation/config/*.txt'
is updated, while 'command-list.h' is re-generated unnecessarily:

  $ touch Documentation/config/log.txt
  $ make -j4
      GEN command-list.h
      CC help.o
      AR libgit.a

Fix this and list all config-related documentation files as
dependencies of 'config-list.h' and remove them from the dependencies
of 'command-list.h'.

  $ touch Documentation/config/log.txt
  $ make
      GEN config-list.h
      CC builtin/help.o
      LINK git

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 15:04:58 -07:00
d5f4b8260f rm: honor sparse checkout patterns
`git add` refrains from adding or updating index entries that are
outside the current sparse checkout, but `git rm` doesn't follow the
same restriction. This is somewhat counter-intuitive and inconsistent.
So make `rm` honor the sparsity rules and advise on how to remove
SKIP_WORKTREE entries just like `add` does. Also add some tests for the
new behavior.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
a20f70478f add: warn when asked to update SKIP_WORKTREE entries
`git add` already refrains from updating SKIP_WORKTREE entries, but it
silently exits with zero code when it is asked to do so. Instead, let's
warn the user and display a hint on how to update these entries.

Note that we only warn the user whey they give a pathspec item that
matches no eligible path for updating, but it does match one or more
SKIP_WORKTREE entries. A warning was chosen over erroring out right away
to reproduce the same behavior `add` already exhibits with ignored
files. This also allow users to continue their workflow without having
to invoke `add` again with only the eligible paths (as those will have
already been added).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
b243012cb3 refresh_index(): add flag to ignore SKIP_WORKTREE entries
refresh_index() doesn't update SKIP_WORKTREE entries, but it still
matches them against the given pathspecs, marks the matches on the
seen[] array, check if unmerged, etc. In the following patch, one caller
will need refresh_index() to ignore SKIP_WORKTREE entries entirely, so
add a flag that implements this behavior.

While we are here, also realign the REFRESH_* flags and convert the hex
values to the more natural bit shift format, which makes it easier to
spot holes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
719630eb48 pathspec: allow to ignore SKIP_WORKTREE entries on index matching
Add a new enum parameter to `add_pathspec_matches_against_index()` and
`find_pathspecs_matching_against_index()`, allowing callers to specify
whether these function should attempt to match SKIP_WORKTREE entries or
not. This will be used in a future patch to make `git add` display a
warning when it is asked to update SKIP_WORKTREE entries.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
d73dbafc2c add: make --chmod and --renormalize honor sparse checkouts
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
6594afc3cc t3705: add tests for git add in sparse checkouts
We already have a couple tests for `add` with SKIP_WORKTREE entries in
t7012, but these only cover the most basic scenarios. As we will be
changing how `add` deals with sparse paths in the subsequent commits,
let's move these two tests to their own file and add more test cases
for different `add` options and situations. This also demonstrates two
options that don't currently respect SKIP_WORKTREE entries: `--chmod`
and `--renormalize`.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
4e95698349 add: include magic part of pathspec on --refresh error
When `git add --refresh <pathspec>` doesn't find any matches for the
given pathspec, it prints an error message using the `match` field of
the `struct pathspec_item`. However, this field doesn't contain the
magic part of the pathspec. Instead, let's use the `original` field.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
a437390310 userdiff: add support for Scheme
Add a diff driver for Scheme-like languages which recognizes top level
and local `define` forms, whether it is a function definition, binding,
syntax definition or a user-defined `define-xyzzy` form.

Also supports R6RS `library` forms, `module` forms along with class and
struct declarations used in Racket (PLT Scheme).

Alternate "def" syntax such as those in Gerbil Scheme are also
supported, like defstruct, defsyntax and so on.

The rationale for picking `define` forms for the hunk headers is because
it is usually the only significant form for defining the structure of
the program, and it is a common pattern for schemers to have local
function definitions to hide their visibility, so it is not only the top
level `define`'s that are of interest. Schemers also extend the language
with macros to provide their own define forms (for example, something
like a `define-test-suite`) which is also captured in the hunk header.

Since it is common practice to extend syntax with variants of a form
like `module+`, `class*` etc, those have been supported as well.

The word regex is a best-effort attempt to conform to R7RS[1] valid
identifiers, symbols and numbers.

[1] https://small.r7rs.org/attachment/r7rs.pdf (section 2.1)

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 13:56:09 -07:00
89b43f80a5 The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 13:23:26 -07:00
14cc08de23 Merge branch 'ab/make-tags-quiet'
Generate [ec]tags under $(QUIET_GEN).

* ab/make-tags-quiet:
  Makefile: add QUIET_GEN to "tags" and "TAGS" targets
2021-04-08 13:23:26 -07:00
bde35a2a93 Merge branch 'rs/daemon-sanitize-dir-sep'
"git daemon" has been tightened against systems that take backslash
as directory separator.

* rs/daemon-sanitize-dir-sep:
  daemon: sanitize all directory separators
2021-04-08 13:23:26 -07:00
1b31224e59 Merge branch 'en/ort-perf-batch-9'
The ort merge backend has been optimized by skipping irrelevant
renames.

* en/ort-perf-batch-9:
  diffcore-rename: avoid doing basename comparisons for irrelevant sources
  merge-ort: skip rename detection entirely if possible
  merge-ort: use relevant_sources to filter possible rename sources
  merge-ort: precompute whether directory rename detection is needed
  merge-ort: introduce wrappers for alternate tree traversal
  merge-ort: add data structures for an alternate tree traversal
  merge-ort: precompute subset of sources for which we need rename detection
  diffcore-rename: enable filtering possible rename sources
2021-04-08 13:23:26 -07:00
82fd285e46 Merge branch 'en/sequencer-edit-upon-conflict-fix'
"git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.

* en/sequencer-edit-upon-conflict-fix:
  sequencer: fix edit handling for cherry-pick and revert messages
2021-04-08 13:23:26 -07:00
22eee7f455 Merge branch 'll/clone-reject-shallow'
"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.

* ll/clone-reject-shallow:
  builtin/clone.c: add --reject-shallow option
2021-04-08 13:23:25 -07:00
e6b971fcf5 Merge branch 'tb/reverse-midx'
An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.

* tb/reverse-midx:
  midx.c: improve cache locality in midx_pack_order_cmp()
  pack-revindex: write multi-pack reverse indexes
  pack-write.c: extract 'write_rev_file_order'
  pack-revindex: read multi-pack reverse indexes
  Documentation/technical: describe multi-pack reverse indexes
  midx: make some functions non-static
  midx: keep track of the checksum
  midx: don't free midx_name early
  midx: allow marking a pack as preferred
  t/helper/test-read-midx.c: add '--show-objects'
  builtin/multi-pack-index.c: display usage on unrecognized command
  builtin/multi-pack-index.c: don't enter bogus cmd_mode
  builtin/multi-pack-index.c: split sub-commands
  builtin/multi-pack-index.c: define common usage with a macro
  builtin/multi-pack-index.c: don't handle 'progress' separately
  builtin/multi-pack-index.c: inline 'flags' with options
2021-04-08 13:23:25 -07:00
f08b4013c3 blame tests: simplify userdiff driver test
Simplify the test added in 9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) to use the --author support recently
added in 999cfc4f45 (test-lib functions: add --author support to
test_commit, 2021-01-12).

We also did not need the full fortran-external-function content. Let's
cut it down to just the important parts.

I'm modifying it to demonstrate that the fortran-specific userdiff
function is in effect by adding "DO NOT MATCH ..." and "AS THE ..."
lines surrounding the "RIGHT" one.

This is to check that we're using the userdiff "fortran" driver, as
opposed to the default driver which would match on those lines as part
of the general heuristic of matching a line that doesn't begin with
whitespace.

The test had also been leaving behind a .gitattributes file for later
tests to possibly trip over, let's clean it up with
"test_when_finished".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
b269441be2 blame tests: don't rely on t/t4018/ directory
Refactor a test added in 9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) so that the blame tests don't rely
on stealing the contents of "t/t4018/fortran-external-function".

I have another patch series that'll possibly (or not) refactor that
file, but having this test inter-dependency makes things simple in any
case by making this test more readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
6cb77966ec userdiff: remove support for "broken" tests
There have been no "broken" tests since 75c3b6b2e8 (userdiff: improve
Fortran xfuncname regex, 2020-08-12). Let's remove the test support
for them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
28e8f0d5e5 userdiff tests: list builtin drivers via test-tool
Change the userdiff test to list the builtin drivers via the
test-tool, using the new for_each_userdiff_driver() API function.

This gets rid of the need to modify this part of the test every time a
new pattern is added, see 2ff6c34612 (userdiff: support Bash,
2020-10-22) and 09dad9256a (userdiff: support Markdown, 2020-05-02)
for two recent examples.

I only need the "list-builtin-drivers "argument here, but let's add
"list-custom-drivers" and "list-drivers" too, just because it's easy.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
132bf25989 userdiff tests: explicitly test "default" pattern
Since 122aa6f9c0 (diff: introduce diff.<driver>.binary, 2008-10-05)
the internals of the userdiff.c code have understood a "default" name,
which is invoked as userdiff_find_by_name("default") and present in
the "builtin_drivers" struct. Let's test for this special case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
f12fa9ee6c userdiff: add and use for_each_userdiff_driver()
Refactor the userdiff_find_by_namelen() function so that a new
for_each_userdiff_driver() API function does most of the work.

This will be useful for the same reason we've got other for_each_*()
API functions as part of various APIs, and will be used in a follow-up
commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
82512e008c userdiff style: normalize pascal regex declaration
Declare the pascal pattern consistently with how we declare the
others, not having "\n" on one line by itself, but as part of the
pattern, and when there are alterations have the "|" at the start, not
end of the line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:09 -07:00
6d1c9c527e userdiff style: declare patterns with consistent style
Change those patterns which were declared with a regex on the same
line as the "PATTERNS()" line to put that regex on the next line, and
add missing "/* -- */" separator comments between the pattern and
word_regex.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:09 -07:00
ddd164d026 userdiff style: re-order drivers in alphabetical order
Address some old code smell and move around the built-in userdiff
drivers so they're both in alphabetical order, and now in the same
order they appear in the gitattributes(5) documentation.

The two started drifting in be58e70dba (diff: unify external diff and
funcname parsing code, 2008-10-05), and then even further in
80c49c3de2 (color-words: make regex configurable via attributes,
2009-01-17) when the "cpp" pattern was added.

There are no functional changes here, and as --color-moved will show
only moved existing lines.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:09 -07:00
39e12650d7 config.c: remove last remnant of GIT_TEST_GETTEXT_POISON
Remove a use of GIT_TEST_GETTEXT_POISON added in f276e2a469 (config:
improve error message for boolean config, 2021-02-11).

This was simultaneously in-flight with my d162b25f95 (tests: remove
support for GIT_TEST_GETTEXT_POISON, 2021-01-20) which removed the
rest of the GIT_TEST_GETTEXT_POISON code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 10:54:08 -07:00
c5c0548d79 completion: audit and guard $GIT_* against unset use
$GIT_COMPLETION_SHOW_ALL and $GIT_TESTING_ALL_COMMAND_LIST were used
without guarding against them being unset, causing errors in nounset
(set -u) mode.

No other nounset-unsafe $GIT_* usages were found.

While at it, remove a superfluous (duplicate) unset guard from $GIT_DIR
in __git_find_repo_path.

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 10:45:36 -07:00
c0c2a37ac2 git-apply: allow simultaneous --cached and --3way options
"git apply" does not allow "--cached" and "--3way" to be used
together, since "--3way" writes conflict markers into the working
tree.

Allow "git apply" to accept "--cached" and "--3way" at the same
time.  When a single file auto-resolves cleanly, the result is
placed in the index at stage #0 and the command exits with 0 status.

For a file that has a conflict which cannot be cleanly
auto-resolved, the original contents from common ancestor (stage
conflict at the content level, and the command exists with non-zero
status, because there is no place (like the working tree) to leave a
half-resolved merge for the user to resolve.

The user can use `git diff` to view the contents of the conflict, or
`git checkout -m -- .` to regenerate the conflict markers in the
working directory.

Don't attempt rerere in this case since it depends on conflict
markers written to file for its database storage and lookup. There
would be two main changes required to get rerere working:

1. Allow the rerere api to accept in memory object rather than
   files, which would allow us to pass in the conflict markers
   contained in the result from ll_merge().

2. Rerere can't write to the working directory, so it would have to
   apply the result to cache stage #0 directly. A flag would be
   needed to control this.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 22:20:33 -07:00
a0dda6023e The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 16:54:09 -07:00
5644419d04 Merge branch 'ab/fsck-api-cleanup'
Fsck API clean-up.

* ab/fsck-api-cleanup:
  fetch-pack: use new fsck API to printing dangling submodules
  fetch-pack: use file-scope static struct for fsck_options
  fetch-pack: don't needlessly copy fsck_options
  fsck.c: move gitmodules_{found,done} into fsck_options
  fsck.c: add an fsck_set_msg_type() API that takes enums
  fsck.c: pass along the fsck_msg_id in the fsck_error callback
  fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
  fsck.c: give "FOREACH_MSG_ID" a more specific name
  fsck.c: undefine temporary STR macro after use
  fsck.c: call parse_msg_type() early in fsck_set_msg_type()
  fsck.h: re-order and re-assign "enum fsck_msg_type"
  fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
  fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type"
  fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
  fsck.c: remove (mostly) redundant append_msg_id() function
  fsck.c: rename variables in fsck_set_msg_type() for less confusion
  fsck.h: use "enum object_type" instead of "int"
  fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT}
  fsck.c: refactor and rename common config callback
2021-04-07 16:54:09 -07:00
d637a267d8 Merge branch 'cc/downcase-opt-help'
A few option description strings started with capital letters,
which were corrected.

* cc/downcase-opt-help:
  column, range-diff: downcase option description
2021-04-07 16:54:09 -07:00
3cf14f88de Merge branch 'js/security-md'
SECURITY.md that is facing individual contributors and end users
has been introduced.  Also a procedure to follow when preparing
embargoed releases has been spelled out.

* js/security-md:
  Document how we do embargoed releases
  SECURITY: describe how to report vulnerabilities
2021-04-07 16:54:09 -07:00
58840e62a4 Merge branch 'ps/pack-bitmap-optim'
Optimize "rev-list --use-bitmap-index --objects" corner case that
uses negative tags as the stopping points.

* ps/pack-bitmap-optim:
  pack-bitmap: avoid traversal of objects referenced by uninteresting tag
2021-04-07 16:54:09 -07:00
68e15e0c23 Merge branch 'zh/commit-trailer'
"git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.

* zh/commit-trailer:
  commit: add --trailer option
2021-04-07 16:54:08 -07:00
a548f3e0ad Merge branch 'js/cmake-vsbuild'
CMake update for vsbuild.

* js/cmake-vsbuild:
  cmake(install): include vcpkg dlls
  cmake: add a preparatory work-around to accommodate `vcpkg`
  cmake(install): fix double .exe suffixes
  cmake: support SKIP_DASHED_BUILT_INS
2021-04-07 16:54:08 -07:00
573c5e50ab Merge branch 'ds/clarify-hashwrite'
The hashwrite() API uses a buffering mechanism to avoid calling
write(2) too frequently. This logic has been refactored to be
easier to understand.

* ds/clarify-hashwrite:
  csum-file: make hashwrite() more readable
2021-04-07 16:54:08 -07:00
642a40019c Merge branch 'ah/plugleaks'
Plug or annotate remaining leaks that trigger while running the
very basic set of tests.

* ah/plugleaks:
  transport: also free remote_refs in transport_disconnect()
  parse-options: don't leak alias help messages
  parse-options: convert bitfield values to use binary shift
  init-db: silence template_dir leak when converting to absolute path
  init: remove git_init_db_config() while fixing leaks
  worktree: fix leak in dwim_branch()
  clone: free or UNLEAK further pointers when finished
  reset: free instead of leaking unneeded ref
  symbolic-ref: don't leak shortened refname in check_symref()
2021-04-07 16:54:08 -07:00
3994ae510e bash completion: complete CHERRY_PICK_HEAD
When e.g. in a failed cherry pick we did not recognize
CHERRY_PICK_HEAD as we do e.g. REBASE_HEAD in a failed rebase let's
rectify that.

When REBASE_HEAD was added in fbd7a23237 (rebase: introduce and use
pseudo-ref REBASE_HEAD, 2018-02-11) a completion was added for it, but
no corresponding completion existed for CHERRY_PICK_HEAD added in
d7e5c0cbfb (Introduce CHERRY_PICK_HEAD, 2011-02-19).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 15:14:51 -07:00
923cd87ac8 git-apply: try threeway first when "--3way" is used
The apply_fragments() method of "git apply"
can silently apply patches incorrectly if
a file has repeating contents. In these
cases a three-way merge is capable of applying
it correctly in more situations, and will
show a conflict rather than applying it
incorrectly. However, because the patches
apply "successfully" using apply_fragments(),
git will never fall back to the merge, even
if the "--3way" flag is used, and the user has
no way to ensure correctness by forcing the
three-way merge method.

Change the behavior so that when "--3way" is used,
git will always try the three-way merge first and
will only fall back to apply_fragments() in cases
where blobs are not available or some other error
(but not in the case of a merge conflict).

Since user-facing results will be different,
this has backwards compatibility implications
for users depending on the old behavior. In
addition, the three-way merge will be slower
than direct patch application.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 17:11:41 -07:00
a039a1fcf9 maintenance: simplify prefetch logic
The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' data
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. This
data is accessible from the 'struct remote' data that is now accessible
in fetch_remote().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 14:23:47 -07:00
ea7811b37e git-send-email: improve --validate error output
Improve the output we emit on --validate error to:

 * Say "FILE:LINE" instead of "FILE: LINE", to match "grep -n",
   compiler error messages etc.

 * Don't say "patch contains a" after just mentioning the filename,
   just leave it at "FILE:LINE: is longer than[...]. The "contains a"
   sounded like we were talking about the file in general, when we're
   actually checking it line-by-line.

 * Don't just say "rejected by sendemail-validate hook", but combine
   that with the system_or_msg() output to say what exit code the hook
   died with.

I had an aborted attempt to make the line length checker note all
lines that were longer than the limit. I didn't think that was worth
the effort, but I've left in the testing change to check that we die
as soon as we spot the first long line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:06 -07:00
d21616c039 git-send-email: refactor duplicate $? checks into a function
Refactor the duplicate checking of $? into a function. There's an
outstanding series[1] wanting to add a third use of system() in this
file, let's not copy this boilerplate anymore when that happens.

1. http://lore.kernel.org/git/87y2esg22j.fsf@evledraar.gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:06 -07:00
e585210e1b git-send-email: test full --validate output
Change the tests that grep substrings out of the output to use a full
test_cmp, in preparation for improving the output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:05 -07:00
dba94e3a85 test-bloom: fix missing 'bloom' from usage string
Like 'get_murmur3' and 'generate_filter', 'get_filter_for_commit' is a
subcommand of `test-tool bloom` not of `test-tool` itself.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 22:54:34 -07:00
c7d0e61016 macOS: precompose startup_info->prefix
The "prefix" was precomposed for macOS in commit 5c327502 (MacOS:
precompose_argv_prefix(), 2021-02-03).

However, this commit forgot to update "startup_info->prefix" after
precomposing.

Move the (possible) precomposition towards the end of
setup_git_directory_gently(), so that precompose_string_if_needed()
can use git_config_get_bool("core.precomposeunicode") correctly.

Keep prefix, startup_info->prefix and GIT_PREFIX_ENVIRONMENT all in sync.

And as a result, the prefix no longer needs to be precomposed in git.c

Reported-by: Dmitry Torilov <d.torilov@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 17:30:36 -07:00
5020774aef precompose_utf8: make precompose_string_if_needed() public
commit 5c327502 (MacOS: precompose_argv_prefix(), 2021-02-03) uses
the function precompose_string_if_needed() internally.  It is only
used from precompose_argv_prefix() and therefore static in
compat/precompose_utf8.c

Expose this function, it will be used in the next commit.

While there, allow passing a NULL pointer, which will return NULL.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 17:30:04 -07:00
fc12b6fdde user-manual.txt: assign preface an id and a title
Two among the three warnings raised by "make git.info" are related to the fact
that the preface has not id in user-manual.txt.

    user-manual.texi:15: warning: empty menu entry name in `* : idm4.'
    user-manual.texi:141: warning: @unnumbered missing argument

This causes asciidoc creating an empty preface and an empty title tag in
user-manual.xml which turns to be an empty node in user-manual.texi and
git.info. Consequently, one can notice in user-manual.texi and git.info
a node named "idm4" in the menu and the navigation bar. In emacs, the
first entry of the menu in the git info page is even displayed as empty.

This fix will name "Introduction" the preface and assign it an id.
The result can be seen in the files: user-manual.{xml, texi, html, pdf}
and git.info.

For future reference, the diff between old and new user-manual.xml,
user-manual.texi, git.info, user-manual.html (converted through
html2markdown) and user-manual.pdf (converted through pdftotext) are
attached.

    --- before/user-manual.xml	2021-04-04 03:58:47.758008722 +0200
    +++ after/user-manual.xml	2021-04-04 03:56:40.520551163 +0200
    @@ -7,8 +7,8 @@
     <bookinfo>
         <title>Git User Manual</title>
     </bookinfo>
    -<preface>
    -<title></title>
    +<preface id="_introduction">
    +<title>Introduction</title>
     <simpara>Git is a fast distributed revision control system.</simpara>
     <simpara>This manual is designed to be readable by someone with basic UNIX
     command-line skills, but no previous knowledge of Git.</simpara>

    --- before/user-manual.texi	2021-04-04 03:58:47.490005652 +0200
    +++ after/user-manual.texi	2021-04-04 03:56:40.520551163 +0200
    @@ -7,12 +7,12 @@
     * Git: (git).           A fast distributed revision control system
     @end direntry

    -@node Top, idm4, , (dir)
    +@node Top, Introduction, , (dir)
     @documentlanguage en
     @top Git User Manual

     @menu
    -* : idm4.
    +* Introduction::
     * Repositories and Branches::
     * Exploring Git history::
     * Developing with Git::
    @@ -137,8 +137,8 @@
     @end detailmenu
     @end menu

    -@node idm4, Repositories and Branches, Top, Top
    -@unnumbered
    +@node Introduction, Repositories and Branches, Top, Top
    +@unnumbered Introduction

     Git is a fast distributed revision control system.

    @@ -178,7 +178,7 @@
     Finally, see @ref{Notes and todo list for this manual} for ways that you can help make this manual more
     complete.

    -@node Repositories and Branches, Exploring Git history, idm4, Top
    +@node Repositories and Branches, Exploring Git history, Introduction, Top
     @chapter Repositories and Branches

     @menu

    --- before/git.info	2021-04-04 03:58:46.557994966 +0200
    +++ after/git.info	2021-04-04 03:56:40.520551163 +0200
    @@ -7,14 +7,14 @@
     END-INFO-DIR-ENTRY

    -File: git.info,  Node: Top,  Next: idm4,  Up: (dir)
    +File: git.info,  Node: Top,  Next: Introduction,  Up: (dir)

     Git User Manual
     ***************

     * Menu:

    -* : idm4.
    +* Introduction::
     * Repositories and Branches::
     * Exploring Git history::
     * Developing with Git::
    @@ -137,7 +137,10 @@

    -File: git.info,  Node: idm4,  Next: Repositories and Branches,  Prev: Top,  Up: Top
    +File: git.info,  Node: Introduction,  Next: Repositories and Branches,  Prev: Top,  Up: Top
    +
    +Introduction
    +************

     Git is a fast distributed revision control system.

    @@ -174,7 +177,7 @@
     that you can help make this manual more complete.

    -File: git.info,  Node: Repositories and Branches,  Next: Exploring Git history,  Prev: idm4,  Up: Top
    +File: git.info,  Node: Repositories and Branches,  Next: Exploring Git history,  Prev: Introduction,  Up: Top

     1 Repositories and Branches
     ***************************
    @@ -5471,207 +5474,207 @@
    ...
     Tag Table:
     Node: Top212
    -Node: idm43164
    -Node: Repositories and Branches4465
    ...
    +Node: Introduction3179
    +Node: Repositories and Branches4515
    +Node: How to get a Git repository5128
    ...
    End Tag Table

    --- before/user-manual.html.md	2021-04-04 05:20:55.378695854 +0200
    +++ after/user-manual.html.md	2021-04-04 05:21:11.282850802 +0200
    @@ -4,6 +4,8 @@

      **Table of Contents**

    +Introduction
    +
     1\. Repositories and Branches

    @@ -278,7 +280,7 @@

     Todo list

    -#
    +# Introduction

     Git is a fast distributed revision control system.

    --- before/user-manual.pdf.txt	2021-04-04 05:28:20.367036836 +0200
    +++ after/user-manual.pdf.txt	2021-04-04 05:30:01.680026312 +0200
    @@ -487,6 +487,7 @@

     vii

    +Introduction
     Git is a fast distributed revision control system.
     This manual is designed to be readable by someone with basic UNIX command-line skills, but no previous knowledge of Git.
     Chapter 1 and Chapter 2 explain how to fetch and study a project using git—read these chapters to learn how to build and test a

Signed-off-by: Firmin Martin <firminmartin24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-03 23:19:04 -07:00
2e36527f23 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-02 14:43:31 -07:00
8a4394d1c1 Merge branch 'zh/format-patch-fractional-reroll-count'
"git format-patch -v<n>" learned to allow a reroll count that is
not an integer.

* zh/format-patch-fractional-reroll-count:
  format-patch: allow a non-integral version numbers
2021-04-02 14:43:14 -07:00
861794b60d Merge branch 'jh/simple-ipc'
A simple IPC interface gets introduced to build services like
fsmonitor on top.

* jh/simple-ipc:
  t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
  simple-ipc: add Unix domain socket implementation
  unix-stream-server: create unix domain socket under lock
  unix-socket: disallow chdir() when creating unix domain sockets
  unix-socket: add backlog size option to unix_stream_listen()
  unix-socket: eliminate static unix_stream_socket() helper function
  simple-ipc: add win32 implementation
  simple-ipc: design documentation for new IPC mechanism
  pkt-line: add options argument to read_packetized_to_strbuf()
  pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
  pkt-line: do not issue flush packets in write_packetized_*()
  pkt-line: eliminate the need for static buffer in packet_write_gently()
2021-04-02 14:43:14 -07:00
c47679d040 Merge branch 'mt/parallel-checkout-part-1'
Preparatory API changes for parallel checkout.

* mt/parallel-checkout-part-1:
  entry: add checkout_entry_ca() taking preloaded conv_attrs
  entry: move conv_attrs lookup up to checkout_entry()
  entry: extract update_ce_after_write() from write_entry()
  entry: make fstat_output() and read_blob_entry() public
  entry: extract a header file for entry.c functions
  convert: add classification for conv_attrs struct
  convert: add get_stream_filter_ca() variant
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: make convert_attrs() and convert structs public
2021-04-02 14:43:14 -07:00
b362acf575 git-send-email: replace "map" in void context with "for"
While using "map" instead of "for" or "map" instead of "grep" and
vice-versa makes for interesting trivia questions when interviewing
Perl programmers, it doesn't make for very readable code. Let's
refactor this loop initially added in 8fd5bb7f44 (git send-email: add
--annotate option, 2008-11-11) to be a for-loop instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-02 14:32:29 -07:00
3c80fcb591 Makefile: add QUIET_GEN to "tags" and "TAGS" targets
Don't show the very verbose $(FIND_SOURCE_FILES) command on every
"make TAGS" invocation.

Let's use "generate into temporary and rename to the final file,
after seeing the command that generated the output finished
successfully" pattern, to avoid leaving a file with an incorrect
output generated by a failed command.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 22:23:39 -07:00
3007752461 midx.c: improve cache locality in midx_pack_order_cmp()
There is a lot of pointer dereferencing in the pre-image version of
'midx_pack_order_cmp()', which this patch gets rid of.

Instead of comparing the pack preferred-ness and then the pack id, both
of these checks are done at the same time by using the high-order bit of
the pack id to represent whether it's preferred. Then the pack id and
offset are compared as usual.

This produces the same result so long as there are less than 2^31 packs,
which seems like a likely assumption to make in practice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
38ff7cabb6 pack-revindex: write multi-pack reverse indexes
Implement the writing half of multi-pack reverse indexes. This is
nothing more than the format describe a few patches ago, with a new set
of helper functions that will be used to clear out stale .rev files
corresponding to old MIDXs.

Unfortunately, a very similar comparison function as the one implemented
recently in pack-revindex.c is reimplemented here, this time accepting a
MIDX-internal type. An effort to DRY these up would create more
indirection and overhead than is necessary, so it isn't pursued here.

Currently, there are no callers which pass the MIDX_WRITE_REV_INDEX
flag, meaning that this is all dead code. But, that won't be the case
for long, since subsequent patches will introduce the multi-pack bitmap,
which will begin passing this field.

(In midx.c:write_midx_internal(), the two adjacent if statements share a
conditional, but are written separately since the first one will
eventually also handle the MIDX_WRITE_BITMAP flag, which does not yet
exist.)

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
a587b5a786 pack-write.c: extract 'write_rev_file_order'
Existing callers provide the reverse index code with an array of 'struct
pack_idx_entry *'s, which is then sorted by pack order (comparing the
offsets of each object within the pack).

Prepare for the multi-pack index to write a .rev file by providing a way
to write the reverse index without an array of pack_idx_entry (which the
MIDX code does not have).

Instead, callers can invoke 'write_rev_index_positions()', which takes
an array of uint32_t's. The ith entry in this array specifies the ith
object's (in index order) position within the pack (in pack order).

Expose this new function for use in a later patch, and rewrite the
existing write_rev_file() in terms of this new function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
f894081dea pack-revindex: read multi-pack reverse indexes
Implement reading for multi-pack reverse indexes, as described in the
previous patch.

Note that these functions don't yet have any callers, and won't until
multi-pack reachability bitmaps are introduced in a later patch series.
In the meantime, this patch implements some of the infrastructure
necessary to support multi-pack bitmaps.

There are three new functions exposed by the revindex API:

  - load_midx_revindex(): loads the reverse index corresponding to the
    given multi-pack index.

  - midx_to_pack_pos() and pack_pos_to_midx(): these convert between the
    multi-pack index and pseudo-pack order.

load_midx_revindex() and pack_pos_to_midx() are both relatively
straightforward.

load_midx_revindex() needs a few functions to be exposed from the midx
API. One to get the checksum of a midx, and another to get the .rev's
filename. Similar to recent changes in the packed_git struct, three new
fields are added to the multi_pack_index struct: one to keep track of
the size, one to keep track of the mmap'd pointer, and another to point
past the header and at the reverse index's data.

pack_pos_to_midx() simply reads the corresponding entry out of the
table.

midx_to_pack_pos() is the trickiest, since it needs to find an object's
position in the psuedo-pack order, but that order can only be recovered
in the .rev file itself. This mapping can be implemented with a binary
search, but note that the thing we're binary searching over isn't an
array of values, but rather a permuted order of those values.

So, when comparing two items, it's helpful to keep in mind the
difference. Instead of a traditional binary search, where you are
comparing two things directly, here we're comparing a (pack, offset)
tuple with an index into the multi-pack index. That index describes
another (pack, offset) tuple, and it is _those_ two tuples that are
compared.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
b25fd24c00 Documentation/technical: describe multi-pack reverse indexes
As a prerequisite to implementing multi-pack bitmaps, motivate and
describe the format and ordering of the multi-pack reverse index.

The subsequent patch will implement reading this format, and the patch
after that will implement writing it while producing a multi-pack index.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
62f2c1b509 midx: make some functions non-static
In a subsequent commit, pack-revindex.c will become responsible for
sorting a list of objects in the "MIDX pack order" (which will be
defined in the following patch). To do so, it will need to be know the
pack identifier and offset within that pack for each object in the MIDX.

The MIDX code already has functions for doing just that
(nth_midxed_offset() and nth_midxed_pack_int_id()), but they are
statically declared.

Since there is no reason that they couldn't be exposed publicly, and
because they are already doing exactly what the caller in
pack-revindex.c will want, expose them publicly so that they can be
reused there.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
9f19161172 midx: keep track of the checksum
write_midx_internal() uses a hashfile to write the multi-pack index, but
discards its checksum. This makes sense, since nothing that takes place
after writing the MIDX cares about its checksum.

That is about to change in a subsequent patch, when the optional
reverse index corresponding to the MIDX will want to include the MIDX's
checksum.

Store the checksum of the MIDX in preparation for that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
7240cc4b65 midx: don't free midx_name early
A subsequent patch will need to refer back to 'midx_name' later on in
the function. In fact, this variable is already free()'d later on, so
this makes the later free() no longer redundant.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
9218c6a40c midx: allow marking a pack as preferred
When multiple packs in the multi-pack index contain the same object, the
MIDX machinery must make a choice about which pack it associates with
that object. Prior to this patch, the lowest-ordered[1] pack was always
selected.

Pack selection for duplicate objects is relatively unimportant today,
but it will become important for multi-pack bitmaps. This is because we
can only invoke the pack-reuse mechanism when all of the bits for reused
objects come from the reuse pack (in order to ensure that all reused
deltas can find their base objects in the same pack).

To encourage the pack selection process to prefer one pack over another
(the pack to be preferred is the one a caller would like to later use as
a reuse pack), introduce the concept of a "preferred pack". When
provided, the MIDX code will always prefer an object found in a
preferred pack over any other.

No format changes are required to store the preferred pack, since it
will be able to be inferred with a corresponding MIDX bitmap, by looking
up the pack associated with the object in the first bit position (this
ordering is described in detail in a subsequent commit).

[1]: the ordering is specified by MIDX internals; for our purposes we
can consider the "lowest ordered" pack to be "the one with the
most-recent mtime.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
4fe788b1b0 builtin/clone.c: add --reject-shallow option
In some scenarios, users may want more history than the repository
offered for cloning, which happens to be a shallow repository, can
give them. But because users don't know it is a shallow repository
until they download it to local, we may want to refuse to clone
this kind of repository, without creating any unnecessary files.

The '--depth=x' option cannot be used as a solution; the source may
be deep enough to give us 'x' commits when cloned, but the user may
later need to deepen the history to arbitrary depth.

Teach '--reject-shallow' option to "git clone" to abort as soon as
we find out that we are cloning from a shallow repository.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 12:58:58 -07:00
c685450880 ref-filter: fix NULL check for parse object failure
After we run parse_object_buffer() to get an object's contents, we try
to check that the return value wasn't NULL. However, since our "struct
object" is a pointer-to-pointer, and we assign like:

  *obj = parse_object_buffer(...);

it's not correct to check:

  if (!obj)

That will always be true, since our double pointer will continue to
point to the single pointer (which is itself NULL). This is a regression
that was introduced by aa46a0da30 (ref-filter: use oid_object_info() to
get object, 2018-07-17); since that commit we'll segfault on a parse
failure, as we try to look at the NULL object pointer.

There are many ways a parse could fail, but most of them are hard to set
up in the tests (it's easy to make a bogus object, but update-ref will
refuse to point to it). The test here uses a tag which points to a wrong
object type. A parse of just the broken tag object will succeed, but
seeing both tag objects in the same process will lead to a parse error
(since we'll see the pointed-to object as both types).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 12:54:21 -07:00
3f267a1128 builtin/pack-objects.c: respect 'pack.preferBitmapTips'
When writing a new pack with a bitmap, it is sometimes convenient to
indicate some reference prefixes which should receive priority when
selecting which commits to receive bitmaps.

A truly motivated caller could accomplish this by setting
'pack.islandCore', (since all commits in the core island are similarly
marked as preferred) but this requires callers to opt into using delta
islands, which they may or may not want to do.

Introduce a new multi-valued configuration, 'pack.preferBitmapTips' to
allow callers to specify a list of reference prefixes. All references
which have a prefix contained in 'pack.preferBitmapTips' will mark their
tips as "preferred" in the same way as commits are marked as preferred
for selection by 'pack.islandCore'.

The choice of the verb "prefer" is intentional: marking the NEEDS_BITMAP
flag on an object does *not* guarantee that that object will receive a
bitmap. It merely guarantees that that commit will receive a bitmap over
any *other* commit in the same window by bitmap_writer_select_commits().

The test this patch adds reflects this quirk, too. It only tests that
a commit (which didn't receive bitmaps by default) is selected for
bitmaps after changing the value of 'pack.preferBitmapTips' to include
it. Other commits may lose their bitmaps as a byproduct of how the
selection process works (bitmap_writer_select_commits() ignores the
remainder of a window after seeing a commit with the NEEDS_BITMAP flag).

This configuration will aide in selecting important references for
multi-pack bitmaps, since they do not respect the same pack.islandCore
configuration. (They could, but doing so may be confusing, since it is
packs--not bitmaps--which are influenced by the delta-islands
configuration).

In a fork network repository (one which lists all forks of a given
repository as remotes), for example, it is useful to set
pack.preferBitmapTips to 'refs/remotes/<root>/heads' and
'refs/remotes/<root>/tags', where '<root>' is an opaque identifier
referring to the repository which is at the base of the fork chain.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 23:14:03 -07:00
483fa7f42d t/helper/test-bitmap.c: initial commit
Add a new 'bitmap' test-tool which can be used to list the commits that
have received bitmaps.

In theory, a determined tester could run 'git rev-list --test-bitmap
<commit>' to check if '<commit>' received a bitmap or not, since
'--test-bitmap' exits with a non-zero code when it can't find the
requested commit.

But this is a dubious behavior to rely on, since arguably 'git
rev-list' could continue its object walk outside of which commits are
covered by bitmaps.

This will be used to test the behavior of 'pack.preferBitmapTips', which
will be added in the following patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 23:14:03 -07:00
dff5e49e51 pack-bitmap: add 'test_bitmap_commits()' helper
The next patch will add a 'bitmap' test-tool which prints the list of
commits that have bitmaps computed.

The test helper could implement this itself, but it would need access to
the 'bitmaps' field of the 'pack_bitmap' struct. To avoid exposing this
private detail, implement the entirety of the helper behind a
test_bitmap_commits() function in pack-bitmap.c.

There is some precedence for this with test_bitmap_walk() which is used
to implement the '--test-bitmap' flag in 'git rev-list' (and is also
implemented in pack-bitmap.c).

A caller will be added in the next patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 23:14:03 -07:00
39edfd5cbc sequencer: fix edit handling for cherry-pick and revert messages
save_opts() should save any non-default values.  It was intended to do
this, but since most options in struct replay_opts default to 0, it only
saved non-zero values.  Unfortunately, this does not always work for
options.edit.  Roughly speaking, options.edit had a default value of 0
for cherry-pick but a default value of 1 for revert.  Make save_opts()
record a value whenever it differs from the default.

options.edit was also overly simplistic; we had more than two cases.
The behavior that previously existed was as follows:

                       Non-conflict commits    Right after Conflict
    revert             Edit iff isatty(0)      Edit (ignore isatty(0))
    cherry-pick        No edit                 See above
    Specify --edit     Edit (ignore isatty(0)) See above
    Specify --no-edit  (*)                     See above

    (*) Before stopping for conflicts, No edit is the behavior.  After
        stopping for conflicts, the --no-edit flag is not saved so see
        the first two rows.

However, the expected behavior is:

                       Non-conflict commits    Right after Conflict
    revert             Edit iff isatty(0)      Edit iff isatty(0)
    cherry-pick        No edit                 Edit iff isatty(0)
    Specify --edit     Edit (ignore isatty(0)) Edit (ignore isatty(0))
    Specify --no-edit  No edit                 No edit

In order to get the expected behavior, we need to change options.edit
to a tri-state: unspecified, false, or true.  When specified, we follow
what it says.  When unspecified, we need to check whether the current
commit being created is resolving a conflict as well as consulting
options.action and isatty(0).  While at it, add a should_edit() utility
function that compresses options.edit down to a boolean based on the
additional information for the non-conflict case.

continue_single_pick() is the function responsible for resuming after
conflict cases, regardless of whether there is one commit being picked
or many.  Make this function stop assuming edit behavior in all cases,
so that it can correctly handle !isatty(0) and specific requests to not
edit the commit message.

Reported-by: Renato Botelho <garga@freebsd.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 14:10:50 -07:00
a65ce7f831 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 14:35:38 -07:00
5c2f7ff018 Merge branch 'jc/doc-format-patch-clarify'
Explain pieces of the format-patch output upfront before the rest
of the documentation starts referring to them.

* jc/doc-format-patch-clarify:
  format-patch: give an overview of what a "patch" message is
2021-03-30 14:35:38 -07:00
7652ce966f Merge branch 'ab/detox-gettext-tests'
Testfix.

* ab/detox-gettext-tests:
  mktag tests: fix broken "&&" chain
2021-03-30 14:35:38 -07:00
4730c5e273 Merge branch 'hx/pack-objects-chunk-comment'
Comment update.

* hx/pack-objects-chunk-comment:
  pack-objects: fix comment of reused_chunk.difference
2021-03-30 14:35:37 -07:00
1ba947cf15 Merge branch 'rf/send-email-hookspath'
"git send-email" learned to honor the core.hooksPath configuration.

* rf/send-email-hookspath:
  git-send-email: Respect core.hooksPath setting
2021-03-30 14:35:37 -07:00
dc2a073036 Merge branch 'ab/remove-rebase-usebuiltin'
Remove the final hint that we used to have a scripted "git rebase".

* ab/remove-rebase-usebuiltin:
  rebase: remove transitory rebase.useBuiltin setting & env
2021-03-30 14:35:37 -07:00
5013802862 Merge branch 'cs/http-use-basic-after-failed-negotiate'
When accessing a server with a URL like https://user:pass@site/, we
did not to fall back to the basic authentication with the
credential material embedded in the URL after the "Negotiate"
authentication failed.  Now we do.

* cs/http-use-basic-after-failed-negotiate:
  remote-curl: fall back to basic auth if Negotiate fails
2021-03-30 14:35:37 -07:00
b2309ad822 Merge branch 'ab/diff-no-index-tests'
More test coverage over "diff --no-index".

* ab/diff-no-index-tests:
  diff --no-index tests: test mode normalization
  diff --no-index tests: add test for --exit-code
2021-03-30 14:35:37 -07:00
ad16f748f2 Merge branch 'ab/read-tree'
Code simplification by removing support for a caller that is long gone.

* ab/read-tree:
  tree.h API: simplify read_tree_recursive() signature
  tree.h API: expose read_tree_1() as read_tree_at()
  archive: stop passing "stage" through read_tree_recursive()
  ls-files: refactor away read_tree()
  ls-files: don't needlessly pass around stage variable
  tree.c API: move read_tree() into builtin/ls-files.c
  ls-files tests: add meaningful --with-tree tests
  show tests: add test for "git show <tree>"
2021-03-30 14:35:37 -07:00
aab55b1d6e Merge branch 'bs/asciidoctor-installation-hints'
Doc update.

* bs/asciidoctor-installation-hints:
  INSTALL: note on using Asciidoctor to build doc
2021-03-30 14:35:36 -07:00
9210c68d2a Merge branch 'mt/checkout-remove-nofollow'
When "git checkout" removes a path that does not exist in the
commit it is checking out, it wasn't careful enough not to follow
symbolic links, which has been corrected.

* mt/checkout-remove-nofollow:
  checkout: don't follow symlinks when removing entries
  symlinks: update comment on threaded_check_leading_path()
2021-03-30 14:35:36 -07:00
c9e40ae8ec p2000: add sparse-index repos
p2000-sparse-operations.sh compares different Git commands in
repositories with many files at HEAD but using sparse-checkout to focus
on a small portion of those files.

Add extra copies of the repository that use the sparse-index format so
we can track how that affects the performance of different commands.

At this point in time, the sparse-index is 100% overhead from the CPU
front, and this is measurable in these tests:

Test
---------------------------------------------------------------
2000.2: git status (full-index-v3)              0.59(0.51+0.12)
2000.3: git status (full-index-v4)              0.59(0.52+0.11)
2000.4: git status (sparse-index-v3)            1.40(1.32+0.12)
2000.5: git status (sparse-index-v4)            1.41(1.36+0.08)
2000.6: git add -A (full-index-v3)              2.32(1.97+0.19)
2000.7: git add -A (full-index-v4)              2.17(1.92+0.14)
2000.8: git add -A (sparse-index-v3)            2.31(2.21+0.15)
2000.9: git add -A (sparse-index-v4)            2.30(2.20+0.13)
2000.10: git add . (full-index-v3)              2.39(2.02+0.20)
2000.11: git add . (full-index-v4)              2.20(1.94+0.16)
2000.12: git add . (sparse-index-v3)            2.36(2.27+0.12)
2000.13: git add . (sparse-index-v4)            2.33(2.21+0.16)
2000.14: git commit -a -m A (full-index-v3)     2.47(2.12+0.20)
2000.15: git commit -a -m A (full-index-v4)     2.26(2.00+0.17)
2000.16: git commit -a -m A (sparse-index-v3)   3.01(2.92+0.16)
2000.17: git commit -a -m A (sparse-index-v4)   3.01(2.94+0.15)

Note that there is very little difference between the v3 and v4 index
formats when the sparse-index is enabled. This is primarily due to the
fact that the relative file sizes are the same, and the command time is
mostly taken up by parsing tree objects to expand the sparse index into
a full one.

With the current file layout, the index file sizes are given by this
table:

       |  full index | sparse index |
       +-------------+--------------+
    v3 |     108 MiB |      1.6 MiB |
    v4 |      80 MiB |      1.2 MiB |

Future updates will improve the performance of Git commands when the
index is sparse.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:49 -07:00
9ad2d5ea71 sparse-index: loose integration with cache_tree_verify()
The cache_tree_verify() method is run when GIT_TEST_CHECK_CACHE_TREE
is enabled, which it is by default in the test suite. The logic must
be adjusted for the presence of these directory entries.

For now, leave the test as a simple check for whether the directory
entry is sparse. Do not go any further until needed.

This allows us to re-enable GIT_TEST_CHECK_CACHE_TREE in
t1092-sparse-checkout-compatibility.sh. Further,
p2000-sparse-operations.sh uses the test suite and hence this is enabled
for all tests. We need to integrate with it before we run our
performance tests with a sparse-index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
2de37c536d cache-tree: integrate with sparse directory entries
The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.

When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
dcc5fd5fd2 sparse-checkout: disable sparse-index
We use 'git sparse-checkout init --cone --sparse-index' to toggle the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.

This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.

While testing this feature, I noticed that the sparse-index was not
being written on the first run, but by a second. This was caught by the
call to 'test-tool read-cache --table'. This requires adjusting some
assignments to core_apply_sparse_checkout and pl.use_cone_patterns in
the sparse_checkout_init() logic.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
122ba1f7b5 sparse-checkout: toggle sparse index from builtin
The sparse index extension is used to signal that index writes should be
in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1.

Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that
specifies if the sparse index should be used. It also updates the index
to use the correct format, either way. Add a warning in the
documentation that the use of a repository extension might reduce
compatibility with third-party tools. 'git sparse-checkout init' already
sets extension.worktreeConfig, which places most sparse-checkout users
outside of the scope of most third-party tools.

Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of
GIT_TEST_SPARSE_INDEX=1.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
58300f4743 sparse-index: add index.sparse config option
When enabled, this config option signals that index writes should
attempt to use sparse-directory entries.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
0938e6ff55 sparse-index: check index conversion happens
Add a test case that uses test_region to ensure that we are truly
expanding a sparse index to a full one, then converting back to sparse
when writing the index. As we integrate more Git commands with the
sparse index, we will convert these commands to check that we do _not_
convert the sparse index to a full index and instead stay sparse the
entire time.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
13e1331247 unpack-trees: allow sparse directories
The index_pos_by_traverse_info() currently throws a BUG() when a
directory entry exists exactly in the index. We need to consider that it
is possible to have a directory in a sparse index as long as that entry
is itself marked with the skip-worktree bit.

The 'pos' variable is assigned a negative value if an exact match is not
found. Since a directory name can be an exact match, it is no longer an
error to have a nonnegative 'pos' value.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
f442313e2e submodule: sparse-index should not collapse links
A submodule is stored as a "Git link" that actually points to a commit
within a submodule. Submodules are populated or not depending on
submodule configuration, not sparse-checkout. To ensure that the
sparse-index feature integrates correctly with submodules, we should not
collapse a directory if there is a Git link within its range.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
6e773527b6 sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.

For now, we avoid converting the index to a sparse index if:

 1. the index is split.
 2. the index is already sparse.
 3. sparse-checkout is disabled.
 4. sparse-checkout does not use cone mode.

Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.

The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.

The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.

Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."

We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.

The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:47 -07:00
cd42415fb4 sparse-index: add 'sdir' index extension
The index format does not currently allow for sparse directory entries.
This violates some expectations that older versions of Git or
third-party tools might not understand. We need an indicator inside the
index file to warn these tools to not interact with a sparse index
unless they are aware of sparse directory entries.

Add a new _required_ index extension, 'sdir', that indicates that the
index may contain sparse directory entries. This allows us to continue
to use the differences in index formats 2, 3, and 4 before we create a
new index version 5 in a later change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
836e25c51b sparse-checkout: hold pattern list in index
As we modify the sparse-checkout definition, we perform index operations
on a pattern_list that only exists in-memory. This allows easy backing
out in case the index update fails.

However, if the index write itself cares about the sparse-checkout
pattern set, we need access to that in-memory copy. Place a pointer to
a 'struct pattern_list' in the index so we can access this on-demand.
This will be used in the next change which uses the sparse-checkout
definition to filter out directories that are outside the sparse cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
6863df3550 unpack-trees: ensure full index
The next change will translate full indexes into sparse indexes at write
time. The existing logic provides a way for every sparse index to be
expanded to a full index at read time. However, there are cases where an
index is written and then continues to be used in-memory to perform
further updates.

unpack_trees() is frequently called after such a write. In particular,
commands like 'git reset' do this double-update of the index.

Ensure that we have a full index when entering unpack_trees(), but only
when command_requires_full_index is true. This is always true at the
moment, but we will later relax that after unpack_trees() is updated to
handle sparse directory entries.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
2782db3eed test-tool: don't force full index
We will use 'test-tool read-cache --table' to check that a sparse
index is written as part of init_repos. Since we will no longer always
expand a sparse index into a full index, add an '--expand' parameter
that adds a call to ensure_full_index() so we can compare a sparse index
directly against a full index, or at least what the in-memory index
looks like when expanded in this way.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
e2df6c3972 test-read-cache: print cache entries with --table
This table is helpful for discovering data in the index to ensure it is
being written correctly, especially as we build and test the
sparse-index. This table includes an output format similar to 'git
ls-tree', but should not be compared to that directly. The biggest
reasons are that 'git ls-tree' includes a tree entry for every
subdirectory, even those that would not appear as a sparse directory in
a sparse-index. Further, 'git ls-tree' does not use a trailing directory
separator for its tree rows.

This does not print the stat() information for the blobs. That will be
added in a future change with another option. The tests that are added
in the next few changes care only about the object types and IDs.
However, this future need for full index information justifies the need
for this test helper over extending a user-facing feature, such as 'git
ls-files'.

To make the option parsing slightly more robust, wrap the string
comparisons in a loop adapted from test-dir-iterator.c.

Care must be taken with the final check for the 'cnt' variable. We
continue the expectation that the numerical value is the final argument.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
ecfc47c066 t1092: compare sparse-checkout to sparse-index
Add a new 'sparse-index' repo alongside the 'full-checkout' and
'sparse-checkout' repos in t1092-sparse-checkout-compatibility.sh. Also
add run_on_sparse and test_sparse_match helpers. These helpers will be
used when the sparse index is implemented.

Add the GIT_TEST_SPARSE_INDEX environment variable to enable the
sparse-index by default. This can be enabled across all tests, but that
will only affect cases where the sparse-checkout feature is enabled.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
4300f8442a sparse-index: implement ensure_full_index()
We will mark an in-memory index_state as having sparse directory entries
with the sparse_index bit. These currently cannot exist, but we will add
a mechanism for collapsing a full index to a sparse one in a later
change. That will happen at write time, so we must first allow parsing
the format before writing it.

Commands or methods that require a full index in order to operate can
call ensure_full_index() to expand that index in-memory. This requires
parsing trees using that index's repository.

Sparse directory entries have a specific 'ce_mode' value. The macro
S_ISSPARSEDIR(ce->ce_mode) can check if a cache_entry 'ce' has this type.
This ce_mode is not possible with the existing index formats, so we don't
also verify all properties of a sparse-directory entry, which are:

 1. ce->ce_mode == 0040000
 2. ce->flags & CE_SKIP_WORKTREE is true
 3. ce->name[ce->namelen - 1] == '/' (ends in dir separator)
 4. ce->oid references a tree object.

These are all semi-enforced in ensure_full_index() to some extent. Any
deviation will cause a warning at minimum or a failure in the worst
case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
3964fc2aae sparse-index: add guard to ensure full index
Upcoming changes will introduce modifications to the index format that
allow sparse directories. It will be useful to have a mechanism for
converting those sparse index files into full indexes by walking the
tree at those sparse directories. Name this method ensure_full_index()
as it will guarantee that the index is fully expanded.

This method is not implemented yet, and instead we focus on the
scaffolding to declare it and call it at the appropriate time.

Add a 'command_requires_full_index' member to struct repo_settings. This
will be an indicator that we need the index in full mode to do certain
index operations. This starts as being true for every command, then we
will set it to false as some commands integrate with sparse indexes.

If 'command_requires_full_index' is true, then we will immediately
expand a sparse index to a full one upon reading from disk. This
suffices for now, but we will want to add more callers to
ensure_full_index() later.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
4b3f765a2f t1092: clean up script quoting
This test was introduced in 19a0acc83e (t1092: test interesting
sparse-checkout scenarios, 2021-01-23), but it contains issues with quoting
that were not noticed until starting this follow-up series. The old
mechanism would drop quoting such as in

   test_all_match git commit -m "touch README.md"

The above happened to work because README.md is a file in the
repository, so 'git commit -m touch REAMDE.md' would succeed by
accident.

Other cases included quoting for no good reason, so clean that up now.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:45 -07:00
0b5fcb08b5 t/perf: add performance test for sparse operations
Create a test script that takes the default performance test (the Git
codebase) and multiplies it by 256 using four layers of duplicated
trees of width four. This results in nearly one million blob entries in
the index. Then, we can clone this repository with sparse-checkout
patterns that demonstrate four copies of the initial repository. Each
clone will use a different index format or mode so peformance can be
tested across the different options.

Note that the initial repo is stripped of submodules before doing the
copies. This preserves the expected data shape of the sparse index,
because directories containing submodules are not collapsed to a sparse
directory entry.

Run a few Git commands on these clones, especially those that use the
index (status, add, commit).

Here are the results on my Linux machine:

Test
--------------------------------------------------------------
2000.2: git status (full-index-v3)             0.37(0.30+0.09)
2000.3: git status (full-index-v4)             0.39(0.32+0.10)
2000.4: git add -A (full-index-v3)             1.42(1.06+0.20)
2000.5: git add -A (full-index-v4)             1.26(0.98+0.16)
2000.6: git add . (full-index-v3)              1.40(1.04+0.18)
2000.7: git add . (full-index-v4)              1.26(0.98+0.17)
2000.8: git commit -a -m A (full-index-v3)     1.42(1.11+0.16)
2000.9: git commit -a -m A (full-index-v4)     1.33(1.08+0.16)

It is perhaps noteworthy that there is an improvement when using index
version 4. This is because the v3 index uses 108 MiB while the v4
index uses 80 MiB. Since the repeated portions of the directories are
very short (f3/f1/f2, for example) this ratio is less pronounced than in
similarly-sized real repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:44 -07:00
0ad6090bdd sparse-index: design doc and format update
This begins a long effort to update the index format to allow sparse
directory entries. This should result in a significant improvement to
Git commands when HEAD contains millions of files, but the user has
selected many fewer files to keep in their sparse-checkout definition.

Currently, the index format is only updated in the presence of
extensions.sparseIndex instead of increasing a file format version
number. This is temporary, and index v5 is part of the plan for future
work in this area.

The design document details many of the reasons for embarking on this
work, and also the plan for completing it safely.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:44 -07:00
86d174b724 t/helper/test-read-midx.c: add '--show-objects'
The 'read-midx' helper is used in places like t5319 to display basic
information about a multi-pack-index.

In the next patch, the MIDX writing machinery will learn a new way to
choose from which pack an object is selected when multiple copies of
that object exist.

To disambiguate which pack introduces an object so that this feature can
be tested, add a '--show-objects' option which displays additional
information about each object in the MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
cd57bc41bb builtin/multi-pack-index.c: display usage on unrecognized command
When given a sub-command that it doesn't understand, 'git
multi-pack-index' dies with the following message:

    $ git multi-pack-index bogus
    fatal: unrecognized subcommand: bogus

Instead of 'die()'-ing, we can display the usage text, which is much
more helpful:

    $ git.compile multi-pack-index bogus
    error: unrecognized subcommand: bogus
    usage: git multi-pack-index [<options>] write
       or: git multi-pack-index [<options>] verify
       or: git multi-pack-index [<options>] expire
       or: git multi-pack-index [<options>] repack [--batch-size=<size>]

        --object-dir <file>   object directory containing set of packfile and pack-index pairs
        --progress            force progress reporting

While we're at it, clean up some duplication between the "no sub-command"
and "unrecognized sub-command" conditionals.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
690eb05719 builtin/multi-pack-index.c: don't enter bogus cmd_mode
Even before the recent refactoring, 'git multi-pack-index' calls
'trace2_cmd_mode()' before verifying that the sub-command is recognized.

Push this call down into the individual sub-commands so that we don't
enter a bogus command mode.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
60ca94769c builtin/multi-pack-index.c: split sub-commands
Handle sub-commands of the 'git multi-pack-index' builtin (e.g.,
"write", "repack", etc.) separately from one another. This allows
sub-commands with unique options, without forcing cmd_multi_pack_index()
to reject invalid combinations itself.

This comes at the cost of some duplication and boilerplate. Luckily, the
duplication is reduced to a minimum, since common options are shared
among sub-commands due to a suggestion by Ævar. (Sub-commands do have to
retain the common options, too, since this builtin accepts common
options on either side of the sub-command).

Roughly speaking, cmd_multi_pack_index() parses options (including
common ones), and stops at the first non-option, which is the
sub-command. It then dispatches to the appropriate sub-command, which
parses the remaining options (also including common options).

Unknown options are kept by the sub-commands in order to detect their
presence (and complain that too many arguments were given).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
b25b727494 builtin/multi-pack-index.c: define common usage with a macro
Factor out the usage message into pieces corresponding to each mode.
This avoids options specific to one sub-command from being shared with
another in the usage.

A subsequent commit will use these #define macros to have usage
variables for each sub-command without duplicating their contents.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
cf1f5389ec builtin/multi-pack-index.c: don't handle 'progress' separately
Now that there is a shared 'flags' member in the options structure,
there is no need to keep track of whether to force progress or not,
since ultimately the decision of whether or not to show a progress meter
is controlled by a bit in the flags member.

Manipulate that bit directly, and drop the now-unnecessary 'progress'
field while we're at it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
f7c4d63e35 builtin/multi-pack-index.c: inline 'flags' with options
Subcommands of the 'git multi-pack-index' command (e.g., 'write',
'verify', etc.) will want to optionally change a set of shared flags
that are eventually passed to the MIDX libraries.

Right now, options and flags are handled separately. That's fine, since
the options structure is never passed around. But a future patch will
make it so that common options shared by all sub-commands are defined in
a common location. That means that "flags" would have to become a global
variable.

Group it with the options structure so that we reduce the number of
global variables we have overall.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
5ee90326dc column, range-diff: downcase option description
It is customary not to begin the help text for each option given to
the parse-options API with a capital letter. Various (sub)commands'
option arrays don't follow the guideline provided by the parse_options
Documentation regarding the descriptions.

Downcase the first word of some option descriptions for "column"
and "range-diff".

Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-29 14:06:08 -07:00
958a5f5dfe cmake(install): include vcpkg dlls
Our CMake configuration generates not only build definitions, but also
install definitions: After building Git using `msbuild git.sln`, the
built artifacts can be installed via `msbuild INSTALL.vcxproj`.

To specify _where_ the files should be installed, the
`-DCMAKE_INSTALL_PREFIX=<path>` option can be used when running CMake.

However, this process would really only install the files that were just
built. On Windows, we need more than that: We also need the `.dll` files
of the dependencies (such as libcurl). The `vcpkg` ecosystem, which we
use to obtain those dependencies, can be asked to install said `.dll`
files really easily, so let's do that.

This requires more than just the built `vcpkg` artifacts in the CI build
definition; We now clone the `vcpkg` repository so that the relevant
CMake scripts are available, in particular the ones related to defining
the toolchain.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-29 13:49:04 -07:00
e8772a7af5 cmake: add a preparatory work-around to accommodate vcpkg
We are about to add support for installing the `.dll` files of Git's
dependencies (such as libcurl) in the CMake configuration. The `vcpkg`
ecosystem from which we get said dependencies makes that relatively
easy: simply turn on `X_VCPKG_APPLOCAL_DEPS_INSTALL`.

However, current `vcpkg` introduces a limitation if one does that:
While it is totally cool with CMake to specify multiple targets within
one invocation of `install(TARGETS ...) (at least according to
https://cmake.org/cmake/help/latest/command/install.html#command:install),
`vcpkg`'s parser insists on a single target per `install(TARGETS ...)`
invocation.

Well, that's easily accomplished: Let's feed the targets individually to
the `install(TARGETS ...)` function in a `foreach()` look.

This also has the advantage that we do not have to manually cull off the
two entries from the `${PROGRAMS_BUILT}` array before scheduling the
remainder to be installed into `libexec/git-core`. Instead, we iterate
through the array and decide for each entry where it wants to go.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-29 13:49:04 -07:00
3745e2693d fetch-pack: use new fsck API to printing dangling submodules
Refactor the check added in 5476e1efde (fetch-pack: print and use
dangling .gitmodules, 2021-02-22) to make use of us now passing the
"msg_id" to the user defined "error_func". We can now compare against
the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated
message.

Let's also replace register_found_gitmodules() with directly
manipulating the "gitmodules_found" member. A recent commit moved it
into "fsck_options" so we could do this here.

I'm sticking this callback in fsck.c. Perhaps in the future we'd like
to accumulate such callbacks into another file (maybe fsck-cb.c,
similar to parse-options-cb.c?), but while we've got just the one
let's just put it into fsck.c.

A better alternative in this case would be some library some more
obvious library shared by fetch-pack.c ad builtin/index-pack.c, but
there isn't such a thing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
c96e184cae fetch-pack: use file-scope static struct for fsck_options
Change code added in 5476e1efde (fetch-pack: print and use dangling
.gitmodules, 2021-02-22) so that we use a file-scoped "static struct
fsck_options" instead of defining one in the "fsck_gitmodules_oids()"
function.

We use this pattern in all of
builtin/{fsck,index-pack,mktag,unpack-objects}.c. It's odd to see
fetch-pack be the odd one out. One might think that we're using other
fsck_options structs in fetch-pack, or doing on fsck twice there, but
we're not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
462f5cae0f fetch-pack: don't needlessly copy fsck_options
Change the behavior of the .gitmodules validation added in
5476e1efde (fetch-pack: print and use dangling .gitmodules,
2021-02-22) so we're using one "fsck_options".

I found that code confusing to read. One might think that not setting
up the error_func earlier means that we're relying on the "error_func"
not being set in some code in between the two hunks being modified
here.

But we're not, all we're doing in the rest of "cmd_index_pack()" is
further setup by calling fsck_set_msg_types(), and assigning to
do_fsck_object.

So there was no reason in 5476e1efde to make a shallow copy of the
fsck_options struct before setting error_func. Let's just do this
setup at the top of the function, along with the "walk" assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
c15087d17b fsck.c: move gitmodules_{found,done} into fsck_options
Move the gitmodules_{found,done} static variables added in
159e7b080b (fsck: detect gitmodules files, 2018-05-02) into the
fsck_options struct. It makes sense to keep all the context in the
same place.

This requires changing the recently added register_found_gitmodules()
function added in 5476e1efde (fetch-pack: print and use dangling
.gitmodules, 2021-02-22) to take fsck_options. That function will be
removed in a subsequent commit, but as it'll require the new
gitmodules_found attribute of "fsck_options" we need this intermediate
step first.

An earlier version of this patch removed the small amount of
duplication we now have between FSCK_OPTIONS_{DEFAULT,STRICT} with a
FSCK_OPTIONS_COMMON macro. I don't think such de-duplication is worth
it for this amount of copy/pasting.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
53692df2b8 fsck.c: add an fsck_set_msg_type() API that takes enums
Change code I added in acf9de4c94 (mktag: use fsck instead of custom
verify_tag(), 2021-01-05) to make use of a new API function that takes
the fsck_msg_{id,type} types, instead of arbitrary strings that
we'll (hopefully) parse into those types.

At the time that the fsck_set_msg_type() API was introduced in
0282f4dced (fsck: offer a function to demote fsck errors to warnings,
2015-06-22) it was only intended to be used to parse user-supplied
data.

For things that are purely internal to the C code it makes sense to
have the compiler check these arguments, and to skip the sanity
checking of the data in fsck_set_msg_type() which is redundant to
checks we get from the compiler.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
394d5d31b0 fsck.c: pass along the fsck_msg_id in the fsck_error callback
Change the fsck_error callback to also pass along the
fsck_msg_id. Before this change the only way to get the message id was
to parse it back out of the "message".

Let's pass it down explicitly for the benefit of callers that might
want to use it, as discussed in [1].

Passing the msg_type is now redundant, as you can always get it back
from the msg_id, but I'm not changing that convention. It's really
common to need the msg_type, and the report() function itself (which
calls "fsck_error") needs to call fsck_msg_type() to discover
it. Let's not needlessly re-do that work in the user callback.

1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
44e07da8bb fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps
define from fsck.c to fsck.h. This is in preparation for having
non-static functions take the fsck_msg_id as an argument.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
901f2f6742 fsck.c: give "FOREACH_MSG_ID" a more specific name
Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation
for moving it over to fsck.h. It's good convention to name macros
in *.h files in such a way as to clearly not clash with any other
names in other files.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
b5495024ec fsck.c: undefine temporary STR macro after use
In f417eed8cd (fsck: provide a function to parse fsck message IDs,
2015-06-22) the "STR" macro was introduced, but that short macro name
was not undefined after use as was done earlier in the same series for
the MSG_ID macro in c99ba492f1 (fsck: introduce identifiers for fsck
messages, 2015-06-22).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
c72da1a22b fsck.c: call parse_msg_type() early in fsck_set_msg_type()
There's no reason to defer the calling of parse_msg_type() until after
we've checked if the "id < 0". This is not a hot codepath, and
parse_msg_type() itself may die on invalid input.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
30cf618eef fsck.h: re-order and re-assign "enum fsck_msg_type"
Change the values in the "enum fsck_msg_type" from being manually
assigned to using default C enum values.

This means we end up with a FSCK_IGNORE=0, which was previously
defined as "2".

I'm confident that nothing relies on these values, we always compare
them for equality. Let's not omit "0" so it won't be assumed that
we're using these as a boolean somewhere.

This also allows us to re-structure the fields to mark which are
"private" v.s. "public". See the preceding commit for a rationale for
not simply splitting these into two enums, namely that this is used
for both the private and public fsck API.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
1b32b59f9b fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new
fsck_msg_type enum.

These defines were originally introduced in:

 - ba002f3b28 (builtin-fsck: move common object checking code to
   fsck.c, 2008-02-25)
 - f50c440730 (fsck: disallow demoting grave fsck errors to warnings,
   2015-06-22)
 - efaba7cc77 (fsck: optionally ignore specific fsck issues
   completely, 2015-06-22)
 - f27d05b170 (fsck: allow upgrading fsck warnings to errors,
   2015-06-22)

The reason these were defined in two different places is because we
use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are
used by external callbacks.

Untangling that would take some more work, since we expose the new
"enum fsck_msg_type" to both. Similar to "enum object_type" it's not
worth structuring the API in such a way that only those who need
FSCK_{ERROR,WARN} pass around a different type.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
e35d65a78a fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type"
Refactor "if options->msg_type" and other code added in
0282f4dced (fsck: offer a function to demote fsck errors to warnings,
2015-06-22) to reduce the scope of the "int msg_type" variable.

This is in preparation for changing its type in a subsequent commit,
only using it in the "!options->msg_type" scope makes that change

This also brings the code in line with the fsck_set_msg_type()
function (also added in 0282f4dced), which does a similar check for
"!options->msg_type". Another minor benefit is getting rid of the
style violation of not having braces for the body of the "if".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
35af754b06 fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
Rename the remaining variables of type fsck_msg_id from "id" to
"msg_id". This change is relatively small, and is worth the churn for
a later change where we have different id's in the "report" function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
034a7b7bcc fsck.c: remove (mostly) redundant append_msg_id() function
Remove the append_msg_id() function in favor of calling
prepare_msg_ids(). We already have code to compute the camel-cased
msg_id strings in msg_id_info, let's use it.

When the append_msg_id() function was added in 71ab8fa840 (fsck:
report the ID of the error/warning, 2015-06-22) the prepare_msg_ids()
function didn't exist. When prepare_msg_ids() was added in
a46baac61e (fsck: factor out msg_id_info[] lazy initialization code,
2018-05-26) this code wasn't moved over to lazy initialization.

This changes the behavior of the code to initialize all the messages
instead of just camel-casing the one we need on the fly. Since the
common case is that we're printing just one message this is mostly
redundant work.

But that's OK in this case, reporting this fsck issue to the user
isn't performance-sensitive. If we were somehow doing so in a tight
loop (in a hopelessly broken repository?) this would help, since we'd
save ourselves from re-doing this work for identical messages, we
could just grab the prepared string from msg_id_info after the first
invocation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
f1abc2d0e1 fsck.c: rename variables in fsck_set_msg_type() for less confusion
Rename variables in a function added in 0282f4dced (fsck: offer a
function to demote fsck errors to warnings, 2015-06-22).

It was needlessly confusing that it took a "msg_type" argument, but
then later declared another "msg_type" of a different type.

Let's rename that to "severity", and rename "id" to "msg_id" and
"msg_id" to "msg_id_str" etc. This will make a follow-up change
smaller.

While I'm at it properly indent the fsck_set_msg_type() argument list.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
a1aad71601 fsck.h: use "enum object_type" instead of "int"
Change the fsck_walk_func to use an "enum object_type" instead of an
"int" type. The types are compatible, and ever since this was added in
355885d531 (add generic, type aware object chain walker, 2008-02-25)
we've used entries from object_type (OBJ_BLOB etc.).

So this doesn't really change anything as far as the generated code is
concerned, it just gives the compiler more information and makes this
easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
d385784f89 fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT}
Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use
designated initializers. This allows us to omit those fields that
are initialized to 0 or NULL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:02:59 -07:00
569f8d188f cmake(install): fix double .exe suffixes
By mistake, the `.exe` extension is appended _twice_ when installing the
dashed executables into `libexec/git-core/` on Windows (the extension is
already appended when adding items to the `git_links` list in the
`#Creating hardlinks` section).

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 18:02:23 -07:00
7bb544a4d1 cmake: support SKIP_DASHED_BUILT_INS
Just like the Makefile-based build learned to skip hard-linking the
dashed built-ins in 179227d6e2 (Optionally skip linking/copying the
built-ins, 2020-09-21), this patch teaches the CMake-based build the
same trick.

Note: In contrast to the Makefile-based process, the built-ins would
only be linked during installation, not already when Git is built.
Therefore, the CMake-based build that we use in our CI builds _already_
does not link those built-ins (because the files are not installed
anywhere, they are used to run the test suite in-place).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 18:02:23 -07:00
09420b7648 Document how we do embargoed releases
Whenever we fix critical vulnerabilities, we follow some sort of
protocol (e.g. setting a coordinated release date, keeping the fix under
embargo until that time, coordinating with packagers and/or hosting
sites, etc).

Similar in spirit to `Documentation/howto/maintain-git.txt`, let's
formalize the details in a document.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 15:13:12 -07:00
2e99b1e383 SECURITY: describe how to report vulnerabilities
In the same document, describe that Git does not have Long Term Support
(LTS) release trains, although security fixes are always applied to a
few of the most recent release trains.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-27 15:13:02 -07:00
9a7f1ce8b7 daemon: sanitize all directory separators
When sanitizing client-supplied strings on Windows, also strip off
backslashes, not just slashes.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 22:00:12 -07:00
84d06cdc06 Sync with v2.31.1 2021-03-26 14:59:47 -07:00
26c4f98ffd The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 14:59:03 -07:00
89519f662c Merge branch 'cm/rebase-i-fixup-amend-reword'
"git commit --fixup=<commit>", which was to tweak the changes made
to the contents while keeping the original log message intact,
learned "--fixup=(amend|reword):<commit>", that can be used to
tweak both the message and the contents, and only the message,
respectively.

* cm/rebase-i-fixup-amend-reword:
  doc/git-commit: add documentation for fixup=[amend|reword] options
  t3437: use --fixup with options to create amend! commit
  t7500: add tests for --fixup=[amend|reword] options
  commit: add a reword suboption to --fixup
  commit: add amend suboption to --fixup to create amend! commit
  sequencer: export and rename subject_length()
2021-03-26 14:59:03 -07:00
fde07fc356 Merge branch 'cm/rebase-i-updates'
Follow-up fixes to "cm/rebase-i" topic.

* cm/rebase-i-updates:
  doc/rebase -i: fix typo in the documentation of 'fixup' command
  t/t3437: fixup the test 'multiple fixup -c opens editor once'
  t/t3437: use named commits in the tests
  t/t3437: simplify and document the test helpers
  t/t3437: check the author date of fixed up commit
  t/t3437: remove the dependency of 'expected-message' file from tests
  t/t3437: fixup here-docs in the 'setup' test
  t/lib-rebase: update the documentation of FAKE_LINES
  rebase -i: clarify and fix 'fixup -c' rebase-todo help
  sequencer: rename a few functions
  sequencer: fixup the datatype of the 'flag' argument
2021-03-26 14:59:03 -07:00
ce4296cf2b Merge branch 'cm/rebase-i'
"rebase -i" is getting cleaned up and also enhanced.

* cm/rebase-i:
  doc/git-rebase: add documentation for fixup [-C|-c] options
  rebase -i: teach --autosquash to work with amend!
  t3437: test script for fixup [-C|-c] options in interactive rebase
  rebase -i: add fixup [-C | -c] command
  sequencer: use const variable for commit message comments
  sequencer: pass todo_item to do_pick_commit()
  rebase -i: comment out squash!/fixup! subjects from squash message
  sequencer: factor out code to append squash message
  rebase -i: only write fixup-message when it's needed
2021-03-26 14:59:03 -07:00
8c81fce4b0 Merge branch 'js/http-pki-credential-store'
The http codepath learned to let the credential layer to cache the
password used to unlock a certificate that has successfully been
used.

* js/http-pki-credential-store:
  http: drop the check for an empty proxy password before approving
  http: store credential when PKI auth is used
2021-03-26 14:59:02 -07:00
ed953e1076 Merge branch 'ab/make-cleanup'
Reorganize Makefile to allow building git.o and other essential
objects without extra stuff needed only for testing.

* ab/make-cleanup:
  Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets
  Makefile: split OBJECTS into OBJECTS and GIT_OBJS
  Makefile: sort OBJECTS assignment for subsequent change
  Makefile: split up long OBJECTS line
  Makefile: guard against TEST_OBJS in the environment
2021-03-26 14:59:02 -07:00
48bf2fa8ba Git 2.31.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 14:49:41 -07:00
ddaf1f62e3 csum-file: make hashwrite() more readable
The hashwrite() method takes an input buffer and updates a hashfile's
hash function while writing the data to a file. To avoid overuse of
flushes, the hashfile has an internal buffer and most writes will use
memcpy() to transfer data from the input 'buf' to the hashfile's buffer
of size 8 * 1024 bytes.

Logic introduced by a8032d12 (sha1write: don't copy full sized buffers,
2008-09-02) reduces the number of memcpy() calls when the input buffer
is sufficiently longer than the hashfile's buffer, causing nr to be the
length of the full buffer. In these cases, the input buffer is used
directly in chunks equal to the hashfile's buffer size.

This method caught my attention while investigating some performance
issues, but it turns out that these performance issues were noise within
the variance of the experiment.

However, during this investigation, I inspected hashwrite() and
misunderstood it, even after looking closely and trying to make it
faster. This change simply reorganizes some parts of the loop within
hashwrite() to make it clear that each batch either uses memcpy() to the
hashfile's buffer or writes directly from the input buffer. The previous
code relied on indirection through local variables and essentially
inlined the implementation of hashflush() to reduce lines of code.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-26 14:32:45 -07:00
9198c13e34 The third patch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 14:36:27 -07:00
858119f6d7 Merge branch 'nk/diff-index-fsmonitor'
"git diff-index" codepath has been taught to trust fsmonitor status
to reduce number of lstat() calls.

* nk/diff-index-fsmonitor:
  fsmonitor: add perf test for git diff HEAD
  fsmonitor: add assertion that fsmonitor is valid to check_removed
  fsmonitor: skip lstat deletion check during git diff-index
2021-03-24 14:36:27 -07:00
e537784f64 Merge branch 'jk/fail-prereq-testfix'
GIT_TEST_FAIL_PREREQS is a mechanism to skip test pieces with
prerequisites to catch broken tests that depend on the side effects
of optional pieces, but did not work at all when negative
prerequisites were involved.

* jk/fail-prereq-testfix:
  t: annotate !PTHREADS tests with !FAIL_PREREQS
2021-03-24 14:36:27 -07:00
2744383cbd Merge branch 'tb/geometric-repack'
"git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size).  A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.

* tb/geometric-repack:
  builtin/pack-objects.c: ignore missing links with --stdin-packs
  builtin/repack.c: reword comment around pack-objects flags
  builtin/repack.c: be more conservative with unsigned overflows
  builtin/repack.c: assign pack split later
  t7703: test --geometric repack with loose objects
  builtin/repack.c: do not repack single packs with --geometric
  builtin/repack.c: add '--geometric' option
  packfile: add kept-pack cache for find_kept_pack_entry()
  builtin/pack-objects.c: rewrite honor-pack-keep logic
  p5303: measure time to repack with keep
  p5303: add missing &&-chains
  builtin/pack-objects.c: add '--stdin-packs' option
  revision: learn '--no-kept-objects'
  packfile: introduce 'find_kept_pack_entry()'
2021-03-24 14:36:27 -07:00
c6617d1e4f Merge branch 'tb/push-simple-uses-branch-merge-config'
Doc update.

* tb/push-simple-uses-branch-merge-config:
  Documentation/git-push.txt: correct configuration typo
2021-03-24 14:36:27 -07:00
bf12013f1a pack-objects: fix comment of reused_chunk.difference
As record_reused_object(offset, offset - hashfile_total(out)) said,
reused_chunk.difference should be the offset of original packfile minus
the offset of the generated packfile. But the comment presented an opposite way.

Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 13:03:22 -07:00
28e29ee38b format-patch: give an overview of what a "patch" message is
The text says something called a "patch" is prepared one for each
commit, it is suitable for e-mail submission, and "am" is the
command to use it, but does not say what the "patch" really is.

The description in the page also refers to the "three-dash" line,
but it is unclear what it is, unless the reader is given a more
detailed overview of what the "patch" is.

Add a brief paragraph to give an overview of what the output looks
like.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 12:14:23 -07:00
6131807864 git-completion.bash: use __gitcomp_builtin() in _git_stash()
The completion for 'git stash' has not changed in a major way since it
was converted from shell script to builtin. Now that it's a builtin, we
can take advantage of the groundwork laid out by parse-options and use
the generated options.

Rewrite _git_stash() to take use __gitcomp_builtin() to generate
completions for subcommands.

The main `git stash` command does not take any arguments directly. If no
subcommand is given, it automatically defaults to `git stash push`. This
means that we can simplify the logic for when no subcommands have been
given yet. We only have to offer subcommand completions when we're
completing a non-option after "stash".

One area that this patch could improve upon is that the `git stash list`
command accepts log-options. It would be nice if the completion for this
were unified with that of _git_log() and _git_show() which would allow
completions to be provided for options such as `--pretty` but that is
outside the scope of this patch.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 10:05:47 -07:00
42b30bcbb7 git-completion.bash: extract from else in _git_stash()
To save a level of indentation, perform an early return in the "if" arm
so we can move the "else" code out of the block.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 10:05:47 -07:00
e94fb44042 git-completion.bash: pass $__git_subcommand_idx from __git_main()
Many completion functions perform hardcoded comparisons with $cword.
This fails in the case where the main git command is given arguments
(e.g. `git -C . bundle<TAB>` would fail to complete its subcommands).

Even _git_worktree(), which uses __git_find_on_cmdline(), could still
fail. With something like `git -C add worktree move<TAB>`, the
subcommand would be incorrectly identified as "add" instead of "move".

Assign $__git_subcommand_idx in __git_main(), where the git subcommand
is actually found and the corresponding completion function is called.
Use this variable to replace hardcoded comparisons with $cword.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 10:05:47 -07:00
76593c09bb mktag tests: fix broken "&&" chain
Remove a stray "xb" I inadvertently introduced in 780aa0a21e (tests:
remove last uses of GIT_TEST_GETTEXT_POISON=false, 2021-02-11).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 22:14:28 -07:00
c8243933c7 git-send-email: Respect core.hooksPath setting
get-send-email currently makes the assumption that the
'sendemail-validate' hook exists inside of the repository.

Since the introduction of 'core.hooksPath' configuration option in
867ad08a26 (hooks: allow customizing where the hook directory is,
2016-05-04), this is no longer true.

Instead of assuming a hardcoded repo relative path, query
git for the actual path of the hooks directory.

Signed-off-by: Robert Foss <robert.foss@linaro.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 15:02:52 -07:00
9bcde4d531 rebase: remove transitory rebase.useBuiltin setting & env
Remove the rebase.useBuiltin setting and the now-obsolete
GIT_TEST_REBASE_USE_BUILTIN test flag.

This was left in place after my d03ebd411c (rebase: remove the
rebase.useBuiltin setting, 2019-03-18) to help anyone who'd used the
experimental flag and wanted to know that it was the default, or that
they should transition their test environment to use the builtin
rebase unconditionally.

It's been more than long enough for those users to get a headsup about
this. So remove all the scaffolding that was left inplace after
d03ebd411c. I'm also removing the documentation entry, if anyone
still has this left in their configuration they can do some source
archaeology to figure out what it used to do, which makes more sense
than exposing every git user reading the documentation to this legacy
configuration switch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 14:05:58 -07:00
db91988aa1 format-patch: allow a non-integral version numbers
The `-v<n>` option of `format-patch` can give nothing but an
integral iteration number to patches in a series.  Some people,
however, prefer to mark a new iteration with only a small fixup
with a non integral iteration number (e.g. an "oops, that was
wrong" fix-up patch for v4 iteration may be labeled as "v4.1").

Allow `format-patch` to take such a non-integral iteration
number.

`<n>` can be any string, such as '3.1' or '4rev2'. In the case
where it is a non-integral value, the "Range-diff" and "Interdiff"
headers will not include the previous version.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 12:49:47 -07:00
ae22751f9b entry: add checkout_entry_ca() taking preloaded conv_attrs
The parallel checkout machinery will call checkout_entry() for entries
that could not be written in parallel due to path collisions. At this
point, we will already be holding the conversion attributes for each
entry, and it would be wasteful to let checkout_entry() load these
again. Instead, let's add the checkout_entry_ca() variant, which
optionally takes a preloaded conv_attrs struct.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
30419e7e1d entry: move conv_attrs lookup up to checkout_entry()
In a following patch, checkout_entry() will use conv_attrs to decide
whether an entry should be enqueued for parallel checkout or not. But
the attributes lookup only happens lower in this call stack. To avoid
the unnecessary work of loading the attributes twice, let's move it up
to checkout_entry(), and pass the loaded struct down to write_entry().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
584a0d13f2 entry: extract update_ce_after_write() from write_entry()
The code that updates the in-memory index information after an entry is
written currently resides in write_entry(). Extract it to a public
function so that it can be called by the parallel checkout functions,
outside entry.c, in a later patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
49cfd9032a entry: make fstat_output() and read_blob_entry() public
These two functions will be used by the parallel checkout code, so let's
make them public. Note: fstat_output() is renamed to
fstat_checkout_output(), now that it has become public, seeking to avoid
future name collisions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
d052cc0382 entry: extract a header file for entry.c functions
The declarations of entry.c's public functions and structures currently
reside in cache.h. Although not many, they contribute to the size of
cache.h and, when changed, cause the unnecessary recompilation of
modules that don't really use these functions. So let's move them to a
new entry.h header. While at it let's also move a comment related to
checkout_entry() from entry.c to entry.h as it's more useful to describe
the function there.

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
2daae3d1d1 commit: add --trailer option
Historically, Git has supported the 'Signed-off-by' commit trailer
using the '--signoff' and the '-s' option from the command line.
But users may need to provide other trailer information from the
command line such as "Helped-by", "Reported-by", "Mentored-by",

Now implement a new `--trailer <token>[(=|:)<value>]` option to pass
other trailers to `interpret-trailers` and insert them into commit
messages.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:31:38 -07:00
1424303384 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 14:00:25 -07:00
3099d4faa3 Merge branch 'bc/clone-bare-with-conflicting-config'
"git -c core.bare=false clone --bare ..." would have segfaulted,
which has been corrected.

* bc/clone-bare-with-conflicting-config:
  builtin/init-db: handle bare clones when core.bare set to false
2021-03-22 14:00:25 -07:00
d4bda9b045 Merge branch 'jk/filter-branch-sha256'
Code clean-up.

* jk/filter-branch-sha256:
  filter-branch: drop $_x40 glob
  filter-branch: drop multiple-ancestor warning
  t7003: test ref rewriting explicitly
2021-03-22 14:00:25 -07:00
20adca9006 Merge branch 'ps/update-ref-trans-hook-doc'
Doc update.

* ps/update-ref-trans-hook-doc:
  githooks.txt: clarify documentation on reference-transaction hook
  githooks.txt: replace mentions of SHA-1 specific properties
2021-03-22 14:00:25 -07:00
960f466d1a Merge branch 'rr/mailmap-entry-self'
* rr/mailmap-entry-self:
  Add entry for Ramkumar Ramachandra
2021-03-22 14:00:25 -07:00
3d92c0a784 Merge branch 'jr/doc-ignore-typofix'
Doc cleanup.

* jr/doc-ignore-typofix:
  doc: .gitignore documentation typofix
2021-03-22 14:00:25 -07:00
44e03bfdb6 Merge branch 'sv/t9801-test-path-is-file-cleanup'
Test cleanup.

* sv/t9801-test-path-is-file-cleanup:
  t9801: replace test -f with test_path_is_file
2021-03-22 14:00:24 -07:00
c83d602ad2 Merge branch 'dl/cat-file-doc-cleanup'
Doc cleanup.

* dl/cat-file-doc-cleanup:
  git-cat-file.txt: remove references to "sha1"
  git-cat-file.txt: monospace args, placeholders and filenames
2021-03-22 14:00:24 -07:00
25f9326561 Merge branch 'rs/pretty-describe'
"git log --format='...'" learned "%(describe)" placeholder.

* rs/pretty-describe:
  archive: expand only a single %(describe) per archive
  pretty: document multiple %(describe) being inconsistent
  t4205: assert %(describe) test coverage
  pretty: add merge and exclude options to %(describe)
  pretty: add %(describe)
2021-03-22 14:00:24 -07:00
f5c73f69fd Merge branch 'dl/stash-show-untracked'
"git stash show" learned to optionally show untracked part of the
stash.

* dl/stash-show-untracked:
  stash show: learn stash.showIncludeUntracked
  stash show: teach --include-untracked and --only-untracked
2021-03-22 14:00:24 -07:00
dd4048d1c7 Merge branch 'en/ort-perf-batch-8'
Rename detection rework continues.

* en/ort-perf-batch-8:
  diffcore-rename: compute dir_rename_guess from dir_rename_counts
  diffcore-rename: limit dir_rename_counts computation to relevant dirs
  diffcore-rename: compute dir_rename_counts in stages
  diffcore-rename: extend cleanup_dir_rename_info()
  diffcore-rename: move dir_rename_counts into dir_rename_info struct
  diffcore-rename: add function for clearing dir_rename_count
  Move computation of dir_rename_count from merge-ort to diffcore-rename
  diffcore-rename: add a mapping of destination names to their indices
  diffcore-rename: provide basic implementation of idx_possible_rename()
  diffcore-rename: use directory rename guided basename comparisons
2021-03-22 14:00:24 -07:00
24119d9d7b Merge branch 'ab/grep-pcre2-allocfix'
Updates to memory allocation code around the use of pcre2 library.

* ab/grep-pcre2-allocfix:
  grep/pcre2: move definitions of pcre2_{malloc,free}
  grep/pcre2: move back to thread-only PCREv2 structures
  grep/pcre2: actually make pcre2 use custom allocator
  grep/pcre2: use pcre2_maketables_free() function
  grep/pcre2: use compile-time PCREv2 version test
  grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  grep/pcre2: prepare to add debugging to pcre2_malloc()
  grep/pcre2: correct reference to grep_init() in comment
  grep/pcre2: drop needless assignment to NULL
  grep/pcre2: drop needless assignment + assert() on opt->pcre2
2021-03-22 14:00:23 -07:00
e8d5a423ca Merge branch 'jk/perf-in-worktrees'
Perf test update to work better in secondary worktrees.

* jk/perf-in-worktrees:
  t/perf: avoid copying worktree files from test repo
  t/perf: handle worktrees as test repos
2021-03-22 14:00:23 -07:00
d20fa3cf9d Merge branch 'ds/commit-graph-generation-config'
A new configuration variable has been introduced to allow choosing
which version of the generation number gets used in the
commit-graph file.

* ds/commit-graph-generation-config:
  commit-graph: use config to specify generation type
  commit-graph: create local repository pointer
2021-03-22 14:00:23 -07:00
52182e3b1f Merge branch 'ab/remote-write-config-in-camel-case'
Update C code that sets a few configuration variables when a remote
is configured so that it spells configuration variable names in the
canonical camelCase.

* ab/remote-write-config-in-camel-case:
  remote: write camel-cased *.pushRemote on rename
  remote: add camel-cased *.tagOpt key, like clone
2021-03-22 14:00:23 -07:00
2435feaa20 Merge branch 'mt/cleanly-die-upon-missing-required-filter'
We had a code to diagnose and die cleanly when a required
clean/smudge filter is missing, but an assert before that
unnecessarily fired, hiding the end-user facing die() message.

* mt/cleanly-die-upon-missing-required-filter:
  convert: fail gracefully upon missing clean cmd on required filter
2021-03-22 14:00:22 -07:00
204333b015 Merge branch 'jk/open-dotgitx-with-nofollow'
It does not make sense to make ".gitattributes", ".gitignore" and
".mailmap" symlinks, as they are supposed to be usable from the
object store (think: bare repositories where HEAD:.mailmap etc. are
used).  When these files are symbolic links, we used to read the
contents of the files pointed by them by mistake, which has been
corrected.

* jk/open-dotgitx-with-nofollow:
  mailmap: do not respect symlinks for in-tree .mailmap
  exclude: do not respect symlinks for in-tree .gitignore
  attr: do not respect symlinks for in-tree .gitattributes
  exclude: add flags parameter to add_patterns()
  attr: convert "macro_ok" into a flags field
  add open_nofollow() helper
2021-03-22 14:00:22 -07:00
2be927f3d1 diff --no-index tests: test mode normalization
When "git diff --no-index X Y" is run the modes of the files being
differ are normalized by canon_mode() in fill_filespec().

I recently broke that behavior in a patch of mine[1] which would pass
all tests, or not, depending on the umask of the git.git checkout.

Let's test for this explicitly. Arguably this should not be the
behavior of "git diff --no-index". We aren't diffing our own objects
or the index, so it might be useful to show mode differences between
files.

On the other hand diff(1) does not do that, and it would be needlessly
distracting when e.g. diffing an extracted tar archive whose contents
is the same, but whose file modes are different.

1. https://lore.kernel.org/git/20210316155829.31242-2-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 12:22:26 -07:00
540cdc11ad pack-bitmap: avoid traversal of objects referenced by uninteresting tag
When preparing the bitmap walk, we first establish the set of of have
and want objects by iterating over the set of pending objects: if an
object is marked as uninteresting, it's declared as an object we already
have, otherwise as an object we want. These two sets are then used to
compute which transitively referenced objects we need to obtain.

One special case here are tag objects: when a tag is requested, we
resolve it to its first not-tag object and add both resolved objects as
well as the tag itself into either the have or want set. Given that the
uninteresting-property always propagates to referenced objects, it is
clear that if the tag is uninteresting, so are its children and vice
versa. But we fail to propagate the flag, which effectively means that
referenced objects will always be interesting except for the case where
they have already been marked as uninteresting explicitly.

This mislabeling does not impact correctness: we now have it in our
"wants" set, and given that we later do an `AND NOT` of the bitmaps of
"wants" and "haves" sets it is clear that the result must be the same.
But we now start to needlessly traverse the tag's referenced objects in
case it is uninteresting, even though we know that each referenced
object will be uninteresting anyway. In the worst case, this can lead to
a complete graph walk just to establish that we do not care for any
object.

Fix the issue by propagating the `UNINTERESTING` flag to pointees of tag
objects and add a benchmark with negative revisions to p5310. This shows
some nice performance benefits, tested with linux.git:

Test                                                          HEAD~                  HEAD
---------------------------------------------------------------------------------------------------------------
5310.3: repack to disk                                        193.18(181.46+16.42)   194.61(183.41+15.83) +0.7%
5310.4: simulated clone                                       25.93(24.88+1.05)      25.81(24.73+1.08) -0.5%
5310.5: simulated fetch                                       2.64(5.30+0.69)        2.59(5.16+0.65) -1.9%
5310.6: pack to file (bitmap)                                 58.75(57.56+6.30)      58.29(57.61+5.73) -0.8%
5310.7: rev-list (commits)                                    1.45(1.18+0.26)        1.46(1.22+0.24) +0.7%
5310.8: rev-list (objects)                                    15.35(14.22+1.13)      15.30(14.23+1.07) -0.3%
5310.9: rev-list with tag negated via --not --all (objects)   22.49(20.93+1.56)      0.11(0.09+0.01) -99.5%
5310.10: rev-list with negative tag (objects)                 0.61(0.44+0.16)        0.51(0.35+0.16) -16.4%
5310.11: rev-list count with blob:none                        12.15(11.19+0.96)      12.18(11.19+0.99) +0.2%
5310.12: rev-list count with blob:limit=1k                    17.77(15.71+2.06)      17.75(15.63+2.12) -0.1%
5310.13: rev-list count with tree:0                           1.69(1.31+0.38)        1.68(1.28+0.39) -0.6%
5310.14: simulated partial clone                              20.14(19.15+0.98)      19.98(18.93+1.05) -0.8%
5310.16: clone (partial bitmap)                               12.78(13.89+1.07)      12.72(13.99+1.01) -0.5%
5310.17: pack to file (partial bitmap)                        42.07(45.44+2.72)      41.44(44.66+2.80) -1.5%
5310.18: rev-list with tree filter (partial bitmap)           0.44(0.29+0.15)        0.46(0.32+0.14) +4.5%

While most benchmarks are probably in the range of noise, the newly
added 5310.9 and 5310.10 benchmarks consistenly perform better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 12:10:56 -07:00
1b0d9545bb remote-curl: fall back to basic auth if Negotiate fails
When the username and password are supplied in a url like this
https://myuser:secret@git.exampe/myrepo.git and the server supports the
negotiate authenticaten method, git does not fall back to basic auth and
libcurl hardly tries to authenticate with the negotiate method.

Stop using the Negotiate authentication method after the first failure
because if it fails on the first try it will never succeed.

Signed-off-by: Christopher Schenk <christopher@cschenk.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:55:41 -07:00
36a7eb6876 t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
Create t0052-simple-ipc.sh with unit tests for the "simple-ipc" mechanism.

Create t/helper/test-simple-ipc test tool to exercise the "simple-ipc"
functions.

When the tool is invoked with "run-daemon", it runs a server to listen
for "simple-ipc" connections on a test socket or named pipe and
responds to a set of commands to exercise/stress the communication
setup.

When the tool is invoked with "start-daemon", it spawns a "run-daemon"
command in the background and waits for the server to become ready
before exiting.  (This helps make unit tests in t0052 more predictable
and avoids the need for arbitrary sleeps in the test script.)

The tool also has a series of client "send" commands to send commands
and data to a server instance.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:52:54 -07:00
7cd5dbcaba simple-ipc: add Unix domain socket implementation
Create Unix domain socket based implementation of "simple-ipc".

A set of `ipc_client` routines implement a client library to connect
to an `ipc_server` over a Unix domain socket, send a simple request,
and receive a single response.  Clients use blocking IO on the socket.

A set of `ipc_server` routines implement a thread pool to listen for
and concurrently service client connections.

The server creates a new Unix domain socket at a known location.  If a
socket already exists with that name, the server tries to determine if
another server is already listening on the socket or if the socket is
dead.  If socket is busy, the server exits with an error rather than
stealing the socket.  If the socket is dead, the server creates a new
one and starts up.

If while running, the server detects that its socket has been stolen
by another server, it automatically exits.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:52:54 -07:00
271cb303a5 diff --no-index tests: add test for --exit-code
Add a test for --exit-code working with --no-index. There's no reason
to suppose it wouldn't, but we weren't testing for it anywhere in our
tests. Let's fix that blind spot.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:48:41 -07:00
68ffe095a2 transport: also free remote_refs in transport_disconnect()
transport_get_remote_refs() can populate the transport struct's
remote_refs. transport_disconnect() is already responsible for most of
transport's cleanup - therefore we also take care of freeing remote_refs
there.

There are 2 locations where transport_disconnect() is called before
we're done using the returned remote_refs. This patch changes those
callsites to only call transport_disconnect() after the returned refs
are no longer being used - which is necessary to safely be able to
free remote_refs during transport_disconnect().

This commit fixes the following leak which was found while running
t0000, but is expected to also fix the same pattern of leak in all
locations that use transport_get_remote_refs():

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
    #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
    #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
    #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
    #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
    #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
    #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
    #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
    #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
    #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21 14:39:10 -07:00
64cc539fd2 parse-options: don't leak alias help messages
preprocess_options() allocates new strings for help messages for
OPTION_ALIAS. Therefore we also need to clean those help messages up
when freeing the returned options.

First introduced in:
  7c280589cf (parse-options: teach "git cmd -h" to show alias as alias, 2020-03-16)

The preprocessed options themselves no longer contain any indication
that a given option is/was an alias - therefore we add a new flag to
indicate former aliases. (An alternative approach would be to look back
at the original options to determine which options are aliases - but
that seems like a fragile approach. Or we could even look at the
alias_groups list - which might be less fragile, but would be slower
as it requires nested looping.)

As far as I can tell, parse_options() is only ever used once per
command, and the help messages are small - hence this leak has very
little impact.

This leak was found while running t0001. LSAN output can be found below:

Direct leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9aae36 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x939d8d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x93b936 in strbuf_vaddf /home/ahunt/oss-fuzz/git/strbuf.c:392:3
    #4 0x93b7ff in strbuf_addf /home/ahunt/oss-fuzz/git/strbuf.c:333:2
    #5 0x86747e in preprocess_options /home/ahunt/oss-fuzz/git/parse-options.c:666:3
    #6 0x866ed2 in parse_options /home/ahunt/oss-fuzz/git/parse-options.c:847:17
    #7 0x51c4a7 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:989:9
    #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #12 0x69c9fe in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #13 0x7fdac42d4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21 14:39:10 -07:00
0171dbcb42 parse-options: convert bitfield values to use binary shift
Because it's easier to read, but also likely to be easier to maintain.
I am making this change because I need to add a new flag in a later
commit.

Also add a trailing comma to the last enum entry to simplify addition of
new flags.

This change was originally suggested by Peff in:
https://public-inbox.org/git/YEZ%2FBWWbpfVwl6nO@coredump.intra.peff.net/

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21 14:39:10 -07:00
47957485b3 tree.h API: simplify read_tree_recursive() signature
Simplify the signature of read_tree_recursive() to omit the "base",
"baselen" and "stage" arguments. No callers of it use these parameters
for anything anymore.

The last function to call read_tree_recursive() with a non-"" path was
read_tree_recursive() itself, but that was changed in
ffd31f661d (Reimplement read_tree_recursive() using
tree_entry_interesting(), 2011-03-25).

The last user of the "stage" parameter went away in the last commit,
and even that use was mere boilerplate.

So let's remove those and rename the read_tree_recursive() function to
just read_tree(). We had another read_tree() function that I've
refactored away in preceding commits, since all in-tree users read
trees recursively with a callback we can change the name to signify
that this is the norm.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
6c9fc42e9f tree.h API: expose read_tree_1() as read_tree_at()
Rename the static read_tree_1() function to read_tree_at(). This
function works just like read_tree_recursive(), except you provide
your own strbuf.

This step doesn't make much sense now, but in follow-up commits I'll
remove the base/baselen/stage arguments to read_tree_recursive(). At
that point an anticipated in-tree user[1] for the old
read_tree_recursive() couldn't provide a path to start the
traversal.

Let's give them a function to do so with an API that makes more sense
for them, by taking a strbuf we should be able to avoid more casting
and/or reallocations in the future.

1. https://lore.kernel.org/git/xmqqft106sok.fsf@gitster.g

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
7367d88261 archive: stop passing "stage" through read_tree_recursive()
The "stage" variable being passed around in the archive code has only
ever been an elaborate way to hardcode the value "0".

This code was added in its original form in e4fbbfe9ec (Add
git-zip-tree, 2006-08-26), at which point a hardcoded "0" would be
passed down through read_tree_recursive() to write_zip_entry().

It was then diligently added to the "struct directory" in
ed22b4173b (archive: support filtering paths with glob, 2014-09-21),
but we were still not doing anything except passing it around as-is.

Let's stop doing that in the code internal to archive.c, we'll still
feed "0" to read_tree_recursive() itself, but won't use it. That we're
providing it at all to read_tree_recursive() will be changed in a
follow-up commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
9614ad3ce0 ls-files: refactor away read_tree()
Refactor away the read_tree() function into its only user,
overlay_tree_on_index().

First, change read_one_entry_opt() to use the strbuf parameter
read_tree_recursive() passes down in place. This finishes up a partial
refactoring started in 6a0b0b6de9 (tree.c: update read_tree_recursive
callback to pass strbuf as base, 2014-11-30).

Moving the rest into overlay_tree_on_index() makes this index juggling
we're doing easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
fcc7c12f11 ls-files: don't needlessly pass around stage variable
Now that read_tree() has been moved to ls-files.c we can get rid of
the stage != 1 case that'll never happen.

Let's not use read_tree_recursive() as a pass-through to pass "stage =
1" either. For now we'll pass an unused "stage = 0" for consistency
with other read_tree_recursive() callers, that argument will be
removed in a follow-up commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
eefadd18e1 tree.c API: move read_tree() into builtin/ls-files.c
Since the read_tree() API was added around the same time as
read_tree_recursive() in 94537c78a8 (Move "read_tree()" to
"tree.c"[...], 2005-04-22) and b12ec373b8 ([PATCH] Teach read-tree
about commit objects, 2005-04-20) things have gradually migrated over
to the read_tree_recursive() version.

Now builtin/ls-files.c is the last user of this code, let's move all
the relevant code there. This allows for subsequent simplification of
it, and an eventual move to read_tree_recursive().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
8de78218c5 ls-files tests: add meaningful --with-tree tests
Add tests for "ls-files --with-tree". There was effectively no
coverage for any normal usage of this command, only the tests added in
54e1abce90 (Add test case for ls-files --with-tree, 2007-10-03) for
an obscure bug.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
dcc0a86f2f show tests: add test for "git show <tree>"
Add missing tests for showing a tree with "git show". Let's test for
showing a tree, two trees, and that doing so doesn't recurse.

The only tests for this code added in 5d7eeee2ac (git-show: grok
blobs, trees and tags, too, 2006-12-14) were the tests in
t7701-repack-unpack-unreachable.sh added in ccc1297226 (repack:
modify behavior of -A option to leave unreferenced objects unpacked,
2008-05-09).

Let's add this common mode of operation to the "show" tests
themselves. It's more obvious, and the tests in
t7701-repack-unpack-unreachable.sh happily pass if we start buggily
emitting trees recursively.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
f3b964a07e Add testing with merge-ort merge strategy
In preparation for switching from merge-recursive to merge-ort as the
default strategy, have the testsuite default to running with merge-ort.
Keep coverage of the recursive backend by having the linux-gcc job run
with it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
259490e572 t6423: mark remaining expected failure under merge-ort as such
When we started on merge-ort, thousands of tests failed when run with
the GIT_TEST_MERGE_ALGORITHM=ort flag; with so many, it didn't make
sense to flip all their test expectations.  The ones in t6409, t6418,
and the submodule tests are being handled by an independent in-flight
topic ("Complete merge-ort implemenation...almost").  The ones in
t6423 were left out of the other series because other ongoing series
that this commit depends upon were addressing those.  Now that we only
have one remaining test failure in t6423, let's mark it as such.

This remaining test will be fixed by a future optimization series, but
since merge-recursive doesn't pass this test either, passing it is not
necessary for declaring merge-ort ready for general use.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
41376b58e6 Revert "merge-ort: ignore the directory rename split conflict for now"
This reverts commit 5ced7c3da0, which was
put in place as a temporary measure to avoid optimizations unstably
erroring out on no destination having a majority of the necessary
renames for directories that had no new files and thus no need for
directory rename detection anyway.  Now that optimizations are in place
to prevent us from trying to compute directory rename count computations
for directories that do not need it, we can undo this temporary measure.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
816147e7ba merge-recursive: add a bunch of FIXME comments documenting known bugs
The plan is to just delete merge-recursive, but not until everyone is
comfortable with merge-ort as a replacement.  Given that I haven't
switched all callers of merge-recursive over yet (e.g. git-am still uses
merge-recursive), maybe there's some value documenting known bugs in the
algorithm in case we end up keeping it or someone wants to dig it up in
the future.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
5291828df8 merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
There are a variety of questions users might ask while resolving
conflicts:
  * What changes have been made since the previous (first) parent?
  * What changes are staged?
  * What is still unstaged? (or what is still conflicted?)
  * What changes did I make to resolve conflicts so far?
The first three of these have simple answers:
  * git diff HEAD
  * git diff --cached
  * git diff
There was no way to answer the final question previously.  Adding one
is trivial in merge-ort, since it works by creating a tree representing
what should be written to the working copy complete with conflict
markers.  Simply write that tree to .git/AUTO_MERGE, allowing users to
answer the fourth question with
  * git diff AUTO_MERGE

I avoided using a name like "MERGE_AUTO", because that would be
merge-specific (much like MERGE_HEAD, REBASE_HEAD, REVERT_HEAD,
CHERRY_PICK_HEAD) and I wanted a name that didn't change depending on
which type of operation the merge was part of.

Ensure that paths which clean out other temporary operation-specific
files (e.g. CHERRY_PICK_HEAD, MERGE_MSG, rebase-merge/ state directory)
also clean out this AUTO_MERGE file.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
aa2faac03a t: mark several submodule merging tests as fixed under merge-ort
merge-ort handles submodules (and directory/file conflicts in general)
differently than merge-recursive does; it basically puts all the special
handling for different filetypes into one place in the codebase instead
of needing special handling for different filetypes in many different
code paths.  This one code path in merge-ort could perhaps use some work
still (there are still test_expect_failure cases in the testsuite), but
it passes all the tests that merge-recursive does as well as 12
additional ones that merge-recursive fails.  Mark those 12 tests as
test_expect_success under merge-ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
66b209b86a merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
When merge conflicts occur in paths removed by a sparse-checkout, we
need to unsparsify those paths (clear the SKIP_WORKTREE bit), and write
out the conflicted file to the working copy.  In the very unlikely case
that someone manually put a file into the working copy at the location
of the SKIP_WORKTREE file, we need to avoid overwriting whatever edits
they have made and move that file to a different location first.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
8ddc20b896 t6428: new test for SKIP_WORKTREE handling and conflicts
If there is a conflict during a merge for a SKIP_WORKTREE entry, we
expect that file to be written to the working copy and have the
SKIP_WORKTREE bit cleared in the index.  If the user had manually
created a file in the working tree despite SKIP_WORKTREE being set, we
do not want to clobber their changes to that file, but want to move it
out of the way.  Add tests that check for these behaviors.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
3639dfb3a8 merge-ort: support subtree shifting
merge-recursive has some simple code to support subtree shifting; copy
it over to merge-ort.  This fixes t6409.12 under
GIT_TEST_MERGE_ALGORITHM=ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
3860220bfa merge-ort: let renormalization change modify/delete into clean delete
When we have a modify/delete conflict, but the only change to the
modification is e.g. change of line endings, then if renormalization is
requested then we should be able to recognize such a case as a
not-modified/delete and resolve the conflict automatically.

This fixes t6418.10 under GIT_TEST_MERGE_ALGORITHM=ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
1218b3ab86 merge-ort: have ll_merge() use a special attr_index for renormalization
ll_merge() needs an index when renormalization is requested.  Create one
specifically for just this purpose with just the one needed entry.  This
fixes t6418.4 and t6418.5 under GIT_TEST_MERGE_ALGORITHM=ort.

NOTE 1: Even if the user has a working copy or a real index (which is
not a given as merge-ort can be used in bare repositories), we
explicitly ignore any .gitattributes file from either of these
locations.  merge-ort can be used to merge two branches that are
unrelated to HEAD, so .gitattributes from the working copy and current
index should not be considered relevant.

NOTE 2: Since we are in the middle of merging, there is a risk that
.gitattributes itself is conflicted...leaving us with an ill-defined
situation about how to perform the rest of the merge.  It could be that
the .gitattributes file does not even exist on one of the sides of the
merge, or that it has been modified on both sides.  If it's been
modified on both sides, it's possible that it could itself be merged
cleanly, though it's also possible that it only merges cleanly if you
use the right version of the .gitattributes file to drive the merge.  It
gets kind of complicated.  The only test we ever had that attempted to
test behavior in this area was seemingly unaware of the undefined
behavior, but knew the test wouldn't work for lack of attribute handling
support, marked it as test_expect_failure from the beginning, but
managed to fail for several reasons unrelated to attribute handling.
See commit 6f6e7cfb52 ("t6038: remove problematic test", 2020-08-03) for
details.  So there are probably various ways to improve what
initialize_attr_index() picks in the case of a conflicted .gitattributes
but for now I just implemented something simple -- look for whatever
.gitattributes file we can find in any of the higher order stages and
use it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
ea305a68fd merge-ort: add a special minimal index just for renormalization
renormalize_buffer() requires an index_state, which is something that
merge-ort does not operate with.  However, all the renormalization code
needs is an index with a .gitattributes file...plus a little bit of
setup.  Create such an index, along with the deallocation and
attr_direction handling.

A subsequent commit will add a function to finish the initialization
of this index.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
72b3091040 merge-ort: use STABLE_QSORT instead of QSORT where required
rename/rename conflict handling depends on the fact that if both sides
renamed the same path, that the one on the MERGE_SIDE1 will appear first
in the combined diff_queue_struct passed to process_renames().  Since we
add all pairs from MERGE_SIDE1 to combined first, and then all pairs
from MERGE_SIDE2, and then sort based on filename, this will only be
true if the sort used is stable.  This was found due to the fact that
Mac, unlike Linux, apparently has a system-defined qsort that is not
stable.

While we are at it, review the other callers of QSORT and add comments
about why they can remain as calls to QSORT instead of being modified
to call STABLE_QSORT.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:39 -07:00
ef486a9ecf Merge branch 'tb/git-mv-icase-fix'
Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.

* tb/git-mv-icase-fix:
  git mv foo FOO ; git mv foo bar gave an assert
2021-03-19 15:25:40 -07:00
98164e9585 The first batch in 2.32 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 15:25:40 -07:00
bfcc6e2a68 Merge branch 'rs/xcalloc-takes-nelem-first'
Code cleanup.

* rs/xcalloc-takes-nelem-first:
  fix xcalloc() argument order
2021-03-19 15:25:39 -07:00
af107029b1 Merge branch 'ah/make-fuzz-all-doc-update'
Update insn in Makefile comments to run fuzz-all target.

* ah/make-fuzz-all-doc-update:
  Makefile: update 'make fuzz-all' docs to reflect modern clang
2021-03-19 15:25:39 -07:00
c691e918f4 Merge branch 'jk/slimmed-down'
Unused code removal.

* jk/slimmed-down:
  vcs-svn: remove header files as well
2021-03-19 15:25:38 -07:00
92ccd7b752 Merge branch 'rs/calloc-array'
CALLOC_ARRAY() macro replaces many uses of xcalloc().

* rs/calloc-array:
  cocci: allow xcalloc(1, size)
  use CALLOC_ARRAY
  git-compat-util.h: drop trailing semicolon from macro definition
2021-03-19 15:25:38 -07:00
a8a0ac3234 Merge branch 'rs/avoid-null-statement-after-macro-call'
Fix macros that can silently inject unintended null-statements.

* rs/avoid-null-statement-after-macro-call:
  mem-pool: drop trailing semicolon from macro definition
  block-sha1: drop trailing semicolon from macro definition
2021-03-19 15:25:38 -07:00
948e8ac534 Merge branch 'km/config-doc-typofix'
Docfix.

* km/config-doc-typofix:
  config.txt: add missing period
2021-03-19 15:25:38 -07:00
cc930b7472 Merge branch 'jt/clone-unborn-head'
Test fix.

* jt/clone-unborn-head:
  t5606: run clone branch name test with protocol v2
2021-03-19 15:25:38 -07:00
1dd4e74522 Merge branch 'js/fsmonitor-unpack-fix'
The data structure used by fsmonitor interface was not properly
duplicated during an in-core merge, leading to use-after-free etc.

* js/fsmonitor-unpack-fix:
  fsmonitor: do not forget to release the token in `discard_index()`
  fsmonitor: fix memory corruption in some corner cases
2021-03-19 15:25:37 -07:00
35381b13da Merge branch 'jk/bisect-peel-tag-fix'
"git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well.  This regression
has been corrected.

* jk/bisect-peel-tag-fix:
  bisect: peel annotated tags to commits
2021-03-19 15:25:37 -07:00
8779c141da Merge branch 'jh/fsmonitor-prework'
The fsmonitor interface read from its input without making sure
there is something to read from.  This bug is new in 2.31
timeframe.

* jh/fsmonitor-prework:
  fsmonitor: avoid global-buffer-overflow READ when checking trivial response
2021-03-19 15:25:37 -07:00
eabacfd9cb Merge branch 'jc/calloc-fix'
Code clean-up.

* jc/calloc-fix:
  xcalloc: use CALLOC_ARRAY() when applicable
2021-03-19 15:25:37 -07:00
14e7b8344f builtin/pack-objects.c: ignore missing links with --stdin-packs
When 'git pack-objects --stdin-packs' encounters a commit in a pack, it
marks it as a starting point of a best-effort reachability traversal
that is used to populate the name-hash of the objects listed in the
given packs.

The traversal expects that it should be able to walk the ancestors of
all commits in a pack without issue. Ordinarily this is the case, but it
is possible to having missing parents from an unreachable part of the
repository. In that case, we'd consider any missing objects in the
unreachable portion of the graph to be junk.

This should be handled gracefully: since the traversal is best-effort
(i.e., we don't strictly need to fill in all of the name-hash fields),
we should simply ignore any missing links.

This patch does that (by setting the 'ignore_missing_links' bit on the
rev_info struct), and ensures we don't regress in the future by adding a
test which demonstrates this case.

It is a little over-eager, since it will also ignore missing links in
reachable parts of the packs (which would indicate a corrupted
repository), but '--stdin-packs' is explicitly *not* about reachability.
So this step isn't making anything worse for a repository which contains
packs missing reachable objects (since we never drop objects with
'--stdin-packs').

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 11:19:29 -07:00
6534d436a2 INSTALL: note on using Asciidoctor to build doc
Note on using Asciidoctor to build documentation suite.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 10:49:20 -07:00
9bd342137e diffcore-rename: determine which relevant_sources are no longer relevant
As noted a few commits ago ("diffcore-rename: only compute
dir_rename_count for relevant directories"), when a source file rename
is used as part of directory rename detection, we need to increment
counts for each ancestor directory in dirs_removed with value
RELEVANT_FOR_SELF.  However, a few commits ago ("diffcore-rename: check
if we have enough renames for directories early on"), we may have
downgraded all relevant ancestor directories from RELEVANT_FOR_SELF to
RELEVANT_FOR_ANCESTOR.

For a given file, if no ancestor directory is found in dirs_removed with
a value of RELEVANT_FOR_SELF, then we can downgrade
relevant_source[PATH] from RELEVANT_LOCATION to RELEVANT_NO_MORE.  This
means we can skip detecting a rename for that particular path (and any
other paths in the same directory).

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.680 s ±  0.096 s     5.665 s ±  0.129 s
    mega-renames:     13.812 s ±  0.162 s    11.435 s ±  0.158 s
    just-one-mega:   506.0  ms ±  3.9  ms   494.2  ms ±  6.1  ms

While this improvement looks rather modest for these testcases (because
all the previous optimizations were sufficient to nearly remove all time
spent in rename detection already),  consider this alternative testcase
tweaked from the ones in commit 557ac0350d as follows

    <Same initial setup as commit 557ac0350d, then...>
    $ git switch -c add-empty-file v5.5
    $ >drivers/gpu/drm/i915/new-empty-file
    $ git add drivers/gpu/drm/i915/new-empty-file
    $ git commit -m "new file"
    $ git switch 5.4-rename
    $ git cherry-pick --strategy=ort add-empty-file

For this testcase, we see the following improvement:

                            Before                  After
    pick-empty:        1.936 s ±  0.024 s     688.1 ms ±  4.2 ms

So roughly a factor of 3 speedup.  At $DAYJOB, there was a particular
repository and cherry-pick that inspired this optimization; for that
case I saw a speedup factor of 7 with this optimization.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
ec59da6015 merge-ort: record the reason that we want a rename for a file
There are two different reasons we might want a rename for a file -- for
three-way content merging or as part of directory rename detection.
Record the reason.  diffcore-rename will potentially be able to filter
some of the ones marked as needed only for directory rename detection,
if it can determine those directory renames based solely on renames
found via exact rename detection and basename-guided rename detection.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
bf238b7137 diffcore-rename: add computation of number of unknown renames
The previous commit can only be effective if we have a computation of
the number of paths under a given directory which are still have pending
renames, and expected this number to be recorded in the dir_rename_count
map under the key UNKNOWN_DIR.  Add the code necessary to compute these
values.

Note that this change means dir_rename_count might have a directory
whose only entry (for UNKNOWN_DIR) was removed by the time merge-ort
goes to check it.  To account for this, merge-ort needs to check for the
case where the max count is 0.

With this change we are now computing the necessary value for each
directory in dirs_removed, but are not using that value anywhere.  The
next two commits will make use of the values stored in dirs_removed in
order to compute whether each relevant_source (that is needed only for
directory rename detection) has become unnecessary.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
0491d39297 diffcore-rename: check if we have enough renames for directories early on
As noted in the past few commits, if we can determine that a directory
already has enough renames to determine how directory rename detection
will be decided for that directory, then we can mark that directory as
no longer needing any more renames detected for files underneath it.
For such directories, we change the value in the dirs_removed map from
RELEVANT_TO_SELF to RELEVANT_FOR_ANCESTOR.

A subsequent patch will use this information while iterating over the
remaining potential rename sources to mark ones that were only
location_relevant as unneeded if no containing directory is still marked
as RELEVANT_TO_SELF.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:56 -07:00
e54385b97a diffcore-rename: only compute dir_rename_count for relevant directories
When one side adds files to a directory that the other side renamed,
directory rename detection is used to either move the new paths to the
newer directory or warn the user about the fact that another path
location might be better.

If a parent of the given directory had new files added to it, any
renames in the current directory are also part of determining where the
parent directory is renamed to.  Thus, naively, we need to record each
rename N times for a path at depth N.  However, we can use the
additional information added to dirs_removed in the last commit to avoid
traversing all N parent directories in many cases.  Let's use an example
to explain how this works.  If we have a path named
   src/old_dir/a/b/file.c
and src/old_dir doesn't exist on one side of history, but the other
added a file named src/old_dir/newfile.c, then if one side renamed
   src/old_dir/a/b/file.c => source/new_dir/a/b/file.c
then this file would affect potential directory rename detection counts
for
   src/old_dir/a/b => source/new_dir/a/b
   src/old_dir/a   => source/new_dir/a
   src/old_dir     => source/new_dir
   src             => source
adding a weight of 1 to each in dir_rename_counts.  However, if src/
exists on both sides of history, then we don't need to track any entries
for it in dir_rename_counts.  That was implemented previously.  What we
are adding now, is that if no new files were added to src/old_dir/a or
src/old_dir/b, then we don't need to have counts in dir_rename_count
for those directories either.

In short, we only need to track counts in dir_rename_count for
directories whose dirs_removed value is RELEVANT_FOR_SELF.  And as soon
as we reach a directory that isn't in dirs_removed (signalled by
returning the default value of NOT_RELEVANT from strintmap_get()), we
can stop looking any further up the directory hierarchy.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
fb52938eec merge-ort: record the reason that we want a rename for a directory
When one side of history renames a directory, and the other side of
history added files to the old directory, directory rename detection is
used to warn about the location of the added files so the user can
move them to the old directory or keep them with the new one.

This sets up three different types of directories:
  * directories that had new files added to them
  * directories underneath a directory that had new files added to them
  * directories where no new files were added to it or any leading path

Save this information in dirs_removed; the next several commits will
make use of this information.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
a49b55d52e merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type
As noted in the previous commit, we want to be able to take advantage of
the "majority rules" portion of directory rename detection to avoid
detecting more renames than necessary.  However, for diffcore-rename to
take advantage of that, it needs to know whether a rename source file
was needed for just directory rename detection reasons, or if it is
wanted for potential three-way content merging.  Modify relevant_sources
from a strset to a strintmap, so we can encode additional information.

We also modify dirs_removed from a strset to a strintmap at the same
time because trying to determine what files are needed for directory
rename detection will require us tracking a bit more information for
each directory.

This commit only changes the types of the two variables from strset to
strintmap; it does not actually store any special values yet and for now
only checks for presence of entries in the strintmap.  Thus, the code is
functionally identical to how it behaved before.  Future commits will
start associating values with each key for these two maps.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
ae1db7b31c diffcore-rename: take advantage of "majority rules" to skip more renames
In directory rename detection (when a directory is removed on one side
of history and the other side adds new files to that directory), we work
to find where the greatest number of files within that directory were
renamed to so that the new files can be moved with the majority of the
files.

Naively, we can just do this by detecting renames for *all* files within
the removed/renamed directory, looking at all the destination
directories where files within that directory were moved, and if there
is more than one such directory then taking the one with the greatest
number of files as the directory where the old directory was renamed to.

However, sometimes there are enough renames from exact rename detection
or basename-guided rename detection that we have enough information to
determine the majority winner already.  Add a function meant to compute
whether particular renames are still needed based on this majority rules
check.  The next several commits will then add the necessary
infrastructure to get the information we need to compute which
additional rename sources we can skip.

An important side note for future further optimization:

There is a possible improvement to this optimization that I have not yet
attempted and will not be included in this series of patches: we could
first check whether exact renames provide enough information for us to
determine directory renames, and avoid doing basename-guided rename
detection on some or all of the RELEVANT_LOCATION files within those
directories.  In effect, this variant would mean doing the
handle_early_known_dir_renames() both after exact rename detection and
again after basename-guided rename detection, though it would also mean
decrementing the number of "unknown" renames for each rename we found
from basename-guided rename detection.  Adding this additional check for
skippable renames right after exact rename detection might turn out to
be valuable, especially for partial clones where it might allow us to
download certain source files entirely.  However, this particular
optimization was actually the last one I did in original implementation
order, and by the time I implemented this idea, every testcase I had was
sufficiently fast that further optimization was unwarranted.  If future
testcases arise that tax rename detection more heavily (or perhaps
partial clones can benefit from avoiding loading more objects), it may
be worth implementing this more involved variant.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:32:55 -07:00
27d578d904 t: annotate !PTHREADS tests with !FAIL_PREREQS
Some tests in t5300 and t7810 expect us to complain about a "--threads"
argument when Git is compiled without pthread support. Running these
under GIT_TEST_FAIL_PREREQS produces a confusing failure: we pretend to
the tests that there is no pthread support, so they expect the warning,
but of course the actual build is perfectly happy to respect the
--threads argument.

We never noticed before the recent a926c4b904 (tests: remove most uses
of C_LOCALE_OUTPUT, 2021-02-11), because the tests also were marked as
requiring the C_LOCALE_OUTPUT prerequisite. Which means they'd never
have run in FAIL_PREREQS mode, since it would always pretend that the
locale prereq was not satisfied.

These tests can't possibly work in this mode; it is a mismatch between
what the tests expect and what the build was told to do. So let's just
mark them to be skipped, using the special prereq introduced by
dfe1a17df9 (tests: add a special setup where prerequisites fail,
2019-05-13).

Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:17:30 -07:00
f59d15bb42 convert: add classification for conv_attrs struct
Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
3e9e82c0d8 convert: add get_stream_filter_ca() variant
Like the previous patch, we will also need to call get_stream_filter()
with a precomputed `struct conv_attrs`, when we add support for parallel
checkout workers. So add the _ca() variant which takes the conversion
attributes struct as a parameter.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
55b4ad0ead convert: add [async_]convert_to_working_tree_ca() variants
Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', not relying, thus, on an index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
38e95844e8 convert: make convert_attrs() and convert structs public
Move convert_attrs() declaration from convert.c to convert.h, together
with the conv_attrs struct and the crlf_action enum. This function and
the data structures will be used outside convert.c in the upcoming
parallel checkout implementation. Note that crlf_action is renamed to
convert_crlf_action, which is more appropriate for the global namespace.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:56:40 -07:00
7e5aa13d2c fsmonitor: add perf test for git diff HEAD
Update the xargs call so that if your large repo contains
symlinks, test-tool chmtime failure does not end the script.

On Linux
Test                                                          this tree           upstream/master
---------------------------------------------------------------------------------------------------------
7519.4: status (fsmonitor=fsmonitor-watchman)                 0.52(0.43+0.10)     0.53(0.49+0.05) +1.9%
7519.5: status -uno (fsmonitor=fsmonitor-watchman)            0.21(0.15+0.07)     0.22(0.13+0.09) +4.8%
7519.6: status -uall (fsmonitor=fsmonitor-watchman)           1.65(0.93+0.71)     1.69(1.03+0.65) +2.4%
7519.7: status (dirty) (fsmonitor=fsmonitor-watchman)         11.99(11.34+1.58)   11.95(11.02+1.79) -0.3%
7519.8: diff (fsmonitor=fsmonitor-watchman)                   0.25(0.17+0.26)     0.25(0.18+0.26) +0.0%
7519.9: diff HEAD (fsmonitor=fsmonitor-watchman)              0.39(0.25+0.34)     0.89(0.35+0.74) +128.2%
7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman)       0.16(0.13+0.04)     0.16(0.12+0.05) +0.0%
7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman)      0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman)     0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman)    0.16(0.11+0.06)     0.16(0.12+0.05) +0.0%
7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman)   0.18(0.13+0.06)     0.17(0.10+0.08) -5.6%
7519.15: add (fsmonitor=fsmonitor-watchman)                   2.25(1.53+0.68)     2.25(1.47+0.74) +0.0%
7519.18: status (fsmonitor=disabled)                          0.88(0.73+1.03)     0.89(0.67+1.08) +1.1%
7519.19: status -uno (fsmonitor=disabled)                     0.45(0.43+0.89)     0.45(0.34+0.98) +0.0%
7519.20: status -uall (fsmonitor=disabled)                    1.88(1.16+1.58)     1.88(1.22+1.51) +0.0%
7519.21: status (dirty) (fsmonitor=disabled)                  7.53(7.05+2.11)     7.53(6.98+2.04) +0.0%
7519.22: diff (fsmonitor=disabled)                            0.42(0.37+0.92)     0.42(0.38+0.91) +0.0%
7519.23: diff HEAD (fsmonitor=disabled)                       0.44(0.41+0.90)     0.44(0.40+0.91) +0.0%
7519.24: diff -- 0_files (fsmonitor=disabled)                 0.13(0.09+0.05)     0.13(0.09+0.05) +0.0%
7519.25: diff -- 10_files (fsmonitor=disabled)                0.13(0.10+0.04)     0.13(0.10+0.04) +0.0%
7519.26: diff -- 100_files (fsmonitor=disabled)               0.13(0.09+0.05)     0.13(0.10+0.04) +0.0%
7519.27: diff -- 1000_files (fsmonitor=disabled)              0.13(0.09+0.06)     0.13(0.09+0.05) +0.0%
7519.28: diff -- 10000_files (fsmonitor=disabled)             0.14(0.11+0.05)     0.14(0.10+0.05) +0.0%
7519.29: add (fsmonitor=disabled)                             2.43(1.61+1.64)     2.43(1.69+1.57) +0.0%

On linux (2.29.2 vs w/ this patch):
nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 | grep lstat
  0.04    0.000063           3        20         6 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 | grep lstat
 94.98    5.242262          10    523783        13 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 | grep lstat
  0.38    0.000032           5         7         3 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep lstat
 99.44    0.741892           9     81634        10 lstat

On mac (2.29.2 vs w/ this patch):
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 | grep "^lstat64 "
lstat64                                         8
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 | grep "^lstat64 "
lstat64                                    120242
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 | grep "^lstat64 "
lstat64                                         4
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep "^lstat64 "
lstat64                                      4497

There are still a bunch of lstats - on directories, but not every file. Progress!

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:31:14 -07:00
0ec9949f78 fsmonitor: add assertion that fsmonitor is valid to check_removed
Validate that fsmonitor is valid to futureproof against bugs where
check_removed might be called from places that haven't refreshed.

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:31:13 -07:00
4f3d6d0261 fsmonitor: skip lstat deletion check during git diff-index
Teach git to honor fsmonitor rather than issuing an lstat
when checking for dirty local deletes. Eliminates O(files)
lstats during `git diff HEAD`

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:31:11 -07:00
fab78a0c3d checkout: don't follow symlinks when removing entries
At 1d718a5108 ("do not overwrite untracked symlinks", 2011-02-20),
symlink.c:check_leading_path() started returning different codes for
FL_ENOENT and FL_SYMLINK. But one of its callers, unlink_entry(), was
not adjusted for this change, so it started to follow symlinks on the
leading path of to-be-removed entries. Fix that and add a regression
test.

Note that since 1d718a5108 check_leading_path() no longer differentiates
the case where it found a symlink in the path's leading components from
the cases where it found a regular file or failed to lstat() the
component. So, a side effect of this current patch is that
unlink_entry() now returns early in all of these three cases. And
because we no longer try to unlink such paths, we also don't get the
warning from remove_or_warn().

For the regular file and symlink cases, it's questionable whether the
warning was useful in the first place: unlink_entry() removes tracked
paths that should no longer be present in the state we are checking out
to. If the path had its leading dir replaced by another file, it means
that the basename already doesn't exist, so there is no need for a
warning. Sure, we are leaving a regular file or symlink behind at the
path's dirname, but this file is either untracked now (so again, no
need to warn), or it will be replaced by a tracked file during the next
phase of this checkout operation.

As for failing to lstat() one of the leading components, the basename
might still exist only we cannot unlink it (e.g. due to the lack of the
required permissions). Since the user expect it to be removed
(especially with checkout's --no-overlay option), add back the warning
in this more relevant case.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 12:58:10 -07:00
462b4e8dfd symlinks: update comment on threaded_check_leading_path()
Since 1d718a5108 ("do not overwrite untracked symlinks", 2011-02-20),
the comment on top of threaded_check_leading_path() is outdated and no
longer reflects the behavior of this function. Let's updated it to avoid
confusions. While we are here, also remove some duplicated comments to
avoid similar maintenance problems.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 12:58:08 -07:00
fb79f5bff7 fsck.c: refactor and rename common config callback
Refactor code I recently changed in 1f3299fda9 (fsck: make
fsck_config() re-usable, 2021-01-05) so that I could use fsck's config
callback in mktag in 1f3299fda9 (fsck: make fsck_config() re-usable,
2021-01-05).

I don't know what I was thinking in structuring the code this way, but
it clearly makes no sense to have an fsck_config_internal() at all
just so it can get a fsck_options when git_config() already supports
passing along some void* data.

Let's just make use of that instead, which gets us rid of the two
wrapper functions, and brings fsck's common config callback in line
with other such reusable config callbacks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 14:02:43 -07:00
4abc57848d fsmonitor: do not forget to release the token in discard_index()
In 56c6910028 (fsmonitor: change last update timestamp on the
index_state to opaque token, 2020-01-07), we forgot to adjust
`discard_index()` to release the "last-update" token: it is no longer a
64-bit number, but a free-form string that has been allocated.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 12:19:28 -07:00
3dfd30598b fsmonitor: fix memory corruption in some corner cases
In 56c6910028 (fsmonitor: change last update timestamp on the
index_state to opaque token, 2020-01-07), we forgot to adjust the part
of `unpack_trees()` that copies the FSMonitor "last-update" information
that we copy from the source index to the result index since 679f2f9fdd
(unpack-trees: skip stat on fsmonitor-valid files, 2019-11-20).

Since the "last-update" information is no longer a 64-bit number, but a
free-form string that has been allocated, we need to duplicate it rather
than just copying it.

This is important because there _are_ cases when `unpack_trees()` will
perform a oneway merge that implicitly calls `refresh_fsmonitor()`
(which will allocate that "last-update" token). This happens _after_
that token was copied into the result index. However, we _then_ call
`check_updates()` on that index, which will _also_ call
`refresh_fsmonitor()`, accessing the "last-update" string, which by now
would be released already.

In the instance that lead to this patch, this caused a segmentation
fault during a lengthy, complicated rebase involving the todo command
`reset` that (crucially) had to updated many files. Unfortunately, it
seems very hard to trigger that crash, therefore this patch is not
accompanied by a regression test.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 12:19:26 -07:00
cfd409ed09 config.txt: add missing period
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:25:15 -07:00
7730f85594 bisect: peel annotated tags to commits
This patch fixes a bug where git-bisect doesn't handle receiving
annotated tags as "git bisect good <tag>", etc. It's a regression in
27257bc466 (bisect--helper: reimplement `bisect_state` & `bisect_head`
shell functions in C, 2020-10-15).

The original shell code called:

  sha=$(git rev-parse --verify "$rev^{commit}") ||
          die "$(eval_gettext "Bad rev input: \$rev")"

which will peel the input to a commit (or complain if that's not
possible). But the C code just calls get_oid(), which will yield the oid
of the tag.

The fix is to peel to a commit. The error message here is a little
non-idiomatic for Git (since it starts with a capital). I've mostly left
it, as it matches the other converted messages (like the "Bad rev input"
we print when get_oid() fails), though I did add an indication that it
was the peeling that was the problem. It might be worth taking a pass
through this converted code to modernize some of the error messages.

Note also that the test does a bare "grep" (not i18ngrep) on the
expected "X is the first bad commit" output message. This matches the
rest of the test script.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:24:08 -07:00
5f70859c15 t5606: run clone branch name test with protocol v2
4f37d45706 ("clone: respect remote unborn HEAD", 2021-02-05) introduces
a new feature (if the remote has an unborn HEAD, e.g. when the remote
repository is empty, use it as the name of the branch) that only works
in protocol v2, but did not ensure that one of its tests always uses
protocol v2, and thus that test would fail if
GIT_TEST_PROTOCOL_VERSION=0 (or 1) is used. Therefore, add "-c
protocol.version=2" to the appropriate test.

(The rest of the tests from that commit have "-c protocol.version=2"
already added.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:19:36 -07:00
116affac3f mem-pool: drop trailing semicolon from macro definition
Allow BLOCK_GROWTH_SIZE to be used like an integer literal by removing
the trailing semicolon from its definition.  Also wrap the expression in
parentheses, to allow it to be used with operators without leading to
unexpected results.  It doesn't matter for the current use site, but
make it follow standard macro rules anyway to avoid future surprises.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 10:20:16 -07:00
3d8cbbf2c3 block-sha1: drop trailing semicolon from macro definition
23119ffb4e (block-sha1: put expanded macro parameters in parentheses,
2012-07-22) added a trailing semicolon to the definition of SHA_MIX
without explanation.  It doesn't matter with the current code, but make
sure to avoid potential surprises by removing it again.

This allows the macro to be used almost like a function: Users can
combine it with operators of their choice, but still must not pass an
expression with side-effects as a parameter, as it would be evaluated
multiple times.

Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 10:20:01 -07:00
097ea2c848 fsmonitor: avoid global-buffer-overflow READ when checking trivial response
query_result can be be an empty strbuf (STRBUF_INIT) - in that case
trying to read 3 bytes triggers a buffer overflow read (as
query_result.buf = '\0').

Therefore we need to check query_result's length before trying to read 3
bytes.

This overflow was introduced in:
  940b94f35c (fsmonitor: log invocation of FSMonitor hook to trace2, 2021-02-03)
It was found when running the test-suite against ASAN, and can be most
easily reproduced with the following command:

make GIT_TEST_OPTS="-v" DEFAULT_TEST_TARGET="t7519-status-fsmonitor.sh" \
SANITIZE=address DEVELOPER=1 test

==2235==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000019e6e5e at pc 0x00000043745c bp 0x7fffd382c520 sp 0x7fffd382bcc8
READ of size 3 at 0x0000019e6e5e thread T0
    #0 0x43745b in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7
    #1 0x43786d in bcmp /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:887:10
    #2 0x80b146 in fsmonitor_is_trivial_response /home/ahunt/oss-fuzz/git/fsmonitor.c:192:10
    #3 0x80b146 in query_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:175:7
    #4 0x80a749 in refresh_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:267:21
    #5 0x80bad1 in tweak_fsmonitor /home/ahunt/oss-fuzz/git/fsmonitor.c:429:4
    #6 0x90f040 in read_index_from /home/ahunt/oss-fuzz/git/read-cache.c:2321:3
    #7 0x8e5d08 in repo_read_index_preload /home/ahunt/oss-fuzz/git/preload-index.c:164:15
    #8 0x52dd45 in prepare_index /home/ahunt/oss-fuzz/git/builtin/commit.c:363:6
    #9 0x52a188 in cmd_commit /home/ahunt/oss-fuzz/git/builtin/commit.c:1588:15
    #10 0x4ce77e in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4ccb18 in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4cb01c in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cb01c in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x6aca8d in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7fb027bf5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #16 0x4206b9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120

0x0000019e6e5e is located 2 bytes to the left of global variable 'strbuf_slopbuf' defined in 'strbuf.c:51:6' (0x19e6e60) of size 1
  'strbuf_slopbuf' is ascii string ''
0x0000019e6e5e is located 126 bytes to the right of global variable 'signals' defined in 'sigchain.c:11:31' (0x19e6be0) of size 512
SUMMARY: AddressSanitizer: global-buffer-overflow /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:842:7 in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long)
Shadow bytes around the buggy address:
  0x000080334d70: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x000080334d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080334d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080334da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080334db0: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
=>0x000080334dc0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]01 f9 f9 f9
  0x000080334dd0: f9 f9 f9 f9 03 f9 f9 f9 f9 f9 f9 f9 02 f9 f9 f9
  0x000080334de0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x000080334df0: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x000080334e00: f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 01 f9 f9 f9
  0x000080334e10: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 10:00:20 -07:00
1c57cc70ec cocci: allow xcalloc(1, size)
Allocating a pre-cleared single element is quite common and it is
misleading to use CALLOC_ARRAY(); these allocations that would be
affected without this change are not allocating an array.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 17:56:07 -07:00
486f4bd183 xcalloc: use CALLOC_ARRAY() when applicable
These are for codebase before Git 2.31

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 17:51:10 -07:00
9fd1902762 unix-stream-server: create unix domain socket under lock
Create a wrapper class for `unix_stream_listen()` that uses a ".lock"
lockfile to create the unix domain socket in a race-free manner.

Unix domain sockets have a fundamental problem on Unix systems because
they persist in the filesystem until they are deleted.  This is
independent of whether a server is actually listening for connections.
Well-behaved servers are expected to delete the socket when they
shutdown.  A new server cannot easily tell if a found socket is
attached to an active server or is leftover cruft from a dead server.
The traditional solution used by `unix_stream_listen()` is to force
delete the socket pathname and then create a new socket.  This solves
the latter (cruft) problem, but in the case of the former, it orphans
the existing server (by stealing the pathname associated with the
socket it is listening on).

We cannot directly use a .lock lockfile to create the socket because
the socket is created by `bind(2)` rather than the `open(2)` mechanism
used by `tempfile.c`.

As an alternative, we hold a plain lockfile ("<path>.lock") as a
mutual exclusion device.  Under the lock, we test if an existing
socket ("<path>") is has an active server.  If not, we create a new
socket and begin listening.  Then we use "rollback" to delete the
lockfile in all cases.

This wrapper code conceptually exists at a higher-level than the core
unix_stream_connect() and unix_stream_listen() routines that it
consumes.  It is isolated in a wrapper class for clarity.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
77e522caae unix-socket: disallow chdir() when creating unix domain sockets
Calls to `chdir()` are dangerous in a multi-threaded context.  If
`unix_stream_listen()` or `unix_stream_connect()` is given a socket
pathname that is too long to fit in a `sockaddr_un` structure, it will
`chdir()` to the parent directory of the requested socket pathname,
create the socket using a relative pathname, and then `chdir()` back.
This is not thread-safe.

Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this
flag is set.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
55144ccb0a unix-socket: add backlog size option to unix_stream_listen()
Update `unix_stream_listen()` to take an options structure to override
default behaviors.  This commit includes the size of the `listen()` backlog.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
4f98ce5865 unix-socket: eliminate static unix_stream_socket() helper function
The static helper function `unix_stream_socket()` calls `die()`.  This
is not appropriate for all callers.  Eliminate the wrapper function
and make the callers propagate the error.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
59c7b88198 simple-ipc: add win32 implementation
Create Windows implementation of "simple-ipc" using named pipes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
066d5234d0 simple-ipc: design documentation for new IPC mechanism
Brief design documentation for new IPC mechanism allowing
foreground Git client to talk with an existing daemon process
at a known location using a named pipe or unix domain socket.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
8c2efa5d76 pkt-line: add options argument to read_packetized_to_strbuf()
Update the calling sequence of `read_packetized_to_strbuf()` to take
an options argument and not assume a fixed set of options.  Update the
only existing caller accordingly to explicitly pass the
formerly-assumed flags.

The `read_packetized_to_strbuf()` function calls `packet_read()` with
a fixed set of assumed options (`PACKET_READ_GENTLE_ON_EOF`).  This
assumption has been fine for the single existing caller
`apply_multi_file_filter()` in `convert.c`.

In a later commit we would like to add other callers to
`read_packetized_to_strbuf()` that need a different set of options.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
c4ba579397 pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
Introduce PACKET_READ_GENTLE_ON_READ_ERROR option to help libify the
packet readers.

So far, the (possibly indirect) callers of `get_packet_data()` can ask
that function to return an error instead of `die()`ing upon end-of-file.
However, random read errors will still cause the process to die.

So let's introduce an explicit option to tell the packet reader
machinery to please be nice and only return an error on read errors.

This change prepares pkt-line for use by long-running daemon processes.
Such processes should be able to serve multiple concurrent clients and
and survive random IO errors.  If there is an error on one connection,
a daemon should be able to drop that connection and continue serving
existing and future connections.

This ability will be used by a Git-aware "Builtin FSMonitor" feature
in a later patch series.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
3a63c6a48c pkt-line: do not issue flush packets in write_packetized_*()
Remove the `packet_flush_gently()` call in `write_packetized_from_buf() and
`write_packetized_from_fd()` and require the caller to call it if desired.
Rename both functions to `write_packetized_from_*_no_flush()` to prevent
later merge accidents.

`write_packetized_from_buf()` currently only has one caller:
`apply_multi_file_filter()` in `convert.c`.  It always wants a flush packet
to be written after writing the payload.

However, we are about to introduce a caller that wants to write many
packets before a final flush packet, so let's make the caller responsible
for emitting the flush packet.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
7455e05e4e pkt-line: eliminate the need for static buffer in packet_write_gently()
Teach `packet_write_gently()` to write the pkt-line header and the actual
buffer in 2 separate calls to `write_in_full()` and avoid the need for a
static buffer, thread-safe scratch space, or an excessively large stack
buffer.

Change `write_packetized_from_fd()` to allocate a temporary buffer rather
than using a static buffer to avoid similar issues here.

These changes are intended to make it easier to use pkt-line routines in
a multi-threaded context with multiple concurrent writers writing to
different streams.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
00ea64ed7a doc/git-commit: add documentation for fixup=[amend|reword] options
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:36 -07:00
8bedae4599 t3437: use --fixup with options to create amend! commit
We taught `git commit --fixup` to create "amend!" commit. Let's also
update the tests and use it to setup the rebase tests.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:36 -07:00
3d1bda6b5b t7500: add tests for --fixup=[amend|reword] options
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
3270ae82ac commit: add a reword suboption to --fixup
`git commit --fixup=reword:<commit>` aliases
`--fixup=amend:<commit> --only`, where it creates an empty "amend!"
commit that will reword <commit> without changing its contents when
it is rebased with `--autosquash`.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
494d314a05 commit: add amend suboption to --fixup to create amend! commit
`git commit --fixup=amend:<commit>` will create an "amend!" commit.
The resulting commit message subject will be "amend! ..." where
"..." is the subject line of <commit> and the initial message
body will be <commit>'s message.

The "amend!" commit when rebased with --autosquash will fixup the
contents and replace the commit message of <commit> with the
"amend!" commit's message body.

In order to prevent rebase from creating commits with an empty
message we refuse to create an "amend!" commit if commit message
body is empty.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
6e0e288779 sequencer: export and rename subject_length()
This function can be used in other parts of git. Let's move the
function to commit.c and also rename it to make the name of the
function more generic.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
a5828ae6b5 Git 2.31
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 11:51:51 -07:00
8775279891 Merge branch 'jn/mergetool-hideresolved-is-optional'
Disable the recent mergetool's hideresolved feature by default for
backward compatibility and safety.

* jn/mergetool-hideresolved-is-optional:
  doc: describe mergetool configuration in git-mergetool(1)
  mergetool: do not enable hideResolved by default
2021-03-14 16:01:41 -07:00
074d162eff Merge branch 'tb/pack-revindex-on-disk'
Fix for a topic in 'master'.

* tb/pack-revindex-on-disk:
  pack-revindex.c: don't close unopened file descriptors
2021-03-14 16:01:41 -07:00
04fe4d75fa init-db: silence template_dir leak when converting to absolute path
template_dir starts off pointing to either argv or nothing. However if
the value supplied in argv is a relative path, absolute_pathdup() is
used to turn it into an absolute path. absolute_pathdup() allocates
a new string, and we then "leak" it when cmd_init_db() completes.

We don't bother to actually free the return value (instead we UNLEAK
it), because there's no significant advantage to doing so here.
Correctly freeing it would require more significant changes to code flow
which would be more noisy than beneficial.

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:58:00 -07:00
e4de4502e6 init: remove git_init_db_config() while fixing leaks
The primary goal of this change is to stop leaking init_db_template_dir.
This leak can happen because:
 1. git_init_db_config() allocates new memory into init_db_template_dir
    without first freeing the existing value.
 2. init_db_template_dir might already contain data, either because:
  2.1 git_config() can be invoked twice with this callback in a single
      process - at least 2 allocations are likely.
  2.2 A single git_config() allocation can invoke the callback multiple
      times for a given key (see further explanation in the function
      docs) - each of those calls will trigger another leak.

The simplest fix for the leak would be to free(init_db_template_dir)
before overwriting it. Instead we choose to convert to fetching
init.templatedir via git_config_get_value() as that is more explicit,
more efficient, and avoids allocations (the returned result is owned by
the config cache, so we aren't responsible for freeing it).

If we remove init_db_template_dir, git_init_db_config() ends up being
responsible only for forwarding core.* config values to
platform_core_config(). However platform_core_config() already ignores
non-core.* config values, so we can safely remove git_init_db_config()
and invoke git_config() directly with platform_core_config() as the
callback.

The platform_core_config forwarding was originally added in:
  287853392a (mingw: respect core.hidedotfiles = false in git-init again, 2019-03-11
And I suspect the potential for a leak existed since the original
implementation of git_init_db_config in:
  90b45187ba (Add `init.templatedir` configuration variable., 2010-02-17)

LSAN output from t0001:

Direct leak of 73 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9a7276 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x9362ad in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x936eaa in strbuf_add /home/ahunt/oss-fuzz/git/strbuf.c:295:2
    #4 0x868112 in strbuf_addstr /home/ahunt/oss-fuzz/git/./strbuf.h:304:2
    #5 0x86a8ad in expand_user_path /home/ahunt/oss-fuzz/git/path.c:758:2
    #6 0x720bb1 in git_config_pathname /home/ahunt/oss-fuzz/git/config.c:1287:10
    #7 0x5960e2 in git_init_db_config /home/ahunt/oss-fuzz/git/builtin/init-db.c:161:11
    #8 0x7255b8 in configset_iter /home/ahunt/oss-fuzz/git/config.c:1982:7
    #9 0x7253fc in repo_config /home/ahunt/oss-fuzz/git/config.c:2311:2
    #10 0x725ca7 in git_config /home/ahunt/oss-fuzz/git/config.c:2399:2
    #11 0x593e8d in create_default_files /home/ahunt/oss-fuzz/git/builtin/init-db.c:225:2
    #12 0x5935c6 in init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:449:11
    #13 0x59588e in cmd_init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:714:9
    #14 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #15 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #16 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #17 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #18 0x69c4de in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #19 0x7f23552d6349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
aa1b63971a worktree: fix leak in dwim_branch()
Make sure that we release the temporary strbuf during dwim_branch() for
all codepaths (and not just for the early return).

This leak appears to have been introduced in:
  f60a7b763f (worktree: teach "add" to check out existing branches, 2018-04-24)

Note that UNLEAK(branchname) is still needed: the returned result is
used in add(), and is stored in a pointer which is used to point at one
of:
  - a string literal ("HEAD")
  - member of argv (whatever the user specified in their invocation)
  - or our newly allocated string returned from dwim_branch()
Fixing the branchname leak isn't impossible, but does not seem
worthwhile given that add() is called directly from cmd_main(), and
cmd_main() returns immediately thereafter - UNLEAK is good enough.

This leak was found when running t0001 with LSAN, see also LSAN output
below:

Direct leak of 60 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ab076 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x939fcd in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x93af53 in strbuf_splice /home/ahunt/oss-fuzz/git/strbuf.c:239:3
    #4 0x83559a in strbuf_check_branch_ref /home/ahunt/oss-fuzz/git/object-name.c:1593:2
    #5 0x6988b9 in dwim_branch /home/ahunt/oss-fuzz/git/builtin/worktree.c:454:20
    #6 0x695f8f in add /home/ahunt/oss-fuzz/git/builtin/worktree.c:525:19
    #7 0x694a04 in cmd_worktree /home/ahunt/oss-fuzz/git/builtin/worktree.c:1036:10
    #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #12 0x69caee in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #13 0x7f7b7dd10349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
0c4542738e clone: free or UNLEAK further pointers when finished
Most of these pointers can safely be freed when cmd_clone() completes,
therefore we make sure to free them. The one exception is that we
have to UNLEAK(repo) because it can point either to argv[0], or a
malloc'd string returned by absolute_pathdup().

We also have to free(path) in the middle of cmd_clone(): later during
cmd_clone(), path is unconditionally overwritten with a different path,
triggering a leak. Freeing the first path immediately after use (but
only in the case where it contains data) seems like the cleanest
solution, as opposed to freeing it unconditionally before path is reused
for another path. This leak appears to have been introduced in:
  f38aa83f9a (use local cloning if insteadOf makes a local URL, 2014-07-17)

These leaks were found when running t0001 with LSAN, see also an excerpt
of the LSAN output below (the full list is omitted because it's far too
long, and mostly consists of indirect leakage of members of the refs we
are freeing).

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21
    #5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
    #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
    #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
    #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
    #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
    #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
    #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
    #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
    #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
    #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 105 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3
    #4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4
    #5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2
    #6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10
    #7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
e901de6816 reset: free instead of leaking unneeded ref
dwim_ref() allocs a new string into ref. Instead of setting to NULL to
discard it, we can FREE_AND_NULL.

This leak appears to have been introduced in:
4cf76f6bbf (builtin/reset: compute checkout metadata for reset, 2020-03-16)

This leak was found when running t0001 with LSAN, see also LSAN output below:

Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9a7108 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
    #2 0x8add6b in expand_ref /home/ahunt/oss-fuzz/git/refs.c:670:12
    #3 0x8ad777 in repo_dwim_ref /home/ahunt/oss-fuzz/git/refs.c:644:22
    #4 0x6394af in dwim_ref /home/ahunt/oss-fuzz/git/./refs.h:162:9
    #5 0x637e5c in cmd_reset /home/ahunt/oss-fuzz/git/builtin/reset.c:426:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c5ce in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f57ebb9d349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
f63b88867a symbolic-ref: don't leak shortened refname in check_symref()
shorten_unambiguous_ref() returns an allocated string. We have to
track it separately from the const refname.

This leak has existed since:
9ab55daa55 (git symbolic-ref --delete $symref, 2012-10-21)

This leak was found when running t0001 with LSAN, see also LSAN output
below:

Direct leak of 19 byte(s) in 1 object(s) allocated from:
    #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ab048 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
    #2 0x8b452f in refs_shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c
    #3 0x8b47e8 in shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c:1287:9
    #4 0x679fce in check_symref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:28:14
    #5 0x679ad8 in cmd_symbolic_ref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:70:9
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69cc6e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f98388a4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
5be1c70518 Merge tag 'l10n-2.31.0-rnd2' of git://github.com/git-l10n/git-po
l10n for Git 2.31.0 round 2

* tag 'l10n-2.31.0-rnd2' of git://github.com/git-l10n/git-po:
  l10n: zh_CN: for git v2.31.0 l10n round 1 and 2
  l10n: de.po: Update German translation for Git v2.31.0
  l10n: pt_PT: add Portuguese translations part 1
  l10n: vi.po(5104t): for git v2.31.0 l10n round 2
  l10n: es: 2.31.0 round 2
  l10n: Add translation team info
  l10n: start Indonesian translation
  l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated)
  l10n: bg.po: Updated Bulgarian translation (5104t)
  l10n: fr: v2.31 rnd 2
  l10n: tr: v2.31.0-rc1
  l10n: sv.po: Update Swedish translation (5104t0f0u)
  l10n: git.pot: v2.31.0 round 2 (9 new, 8 removed)
  l10n: tr: v2.31.0-rc0
  l10n: sv.po: Update Swedish translation (5103t0f0u)
  l10n: pl.po: Update translation
  l10n: fr: v2.31.0 rnd 1
  l10n: git.pot: v2.31.0 round 1 (155 new, 89 removed)
  l10n: Update Catalan translation
  l10n: ru.po: update Russian translation
2021-03-14 15:50:36 -07:00
8588aa8657 vcs-svn: remove header files as well
fc47391e24 (drop vcs-svn experiment, 2020-08-13) removed most vcs-svn
files.  Drop the remaining header files as well, as they are no longer
used.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:48:23 -07:00
473eb54151 l10n: zh_CN: for git v2.31.0 l10n round 1 and 2
Translate 161 new messages (5104t0f0u) for git 2.31.0.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-03-15 00:05:25 +08:00
4bc948a743 Merge branch 'master' of github.com:vnwildman/git
* 'master' of github.com:vnwildman/git:
  l10n: vi.po(5104t): for git v2.31.0 l10n round 2
2021-03-15 00:04:47 +08:00
e196890735 Merge branch 'l10n/zh_TW/210301' of github.com:l10n-tw/git-po
* 'l10n/zh_TW/210301' of github.com:l10n-tw/git-po:
  l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated)
2021-03-14 22:35:44 +08:00
84bc81478e Merge branch 'po-id' of github.com:bagasme/git-po
* 'po-id' of github.com:bagasme/git-po:
  l10n: Add translation team info
  l10n: start Indonesian translation
2021-03-14 22:35:17 +08:00
bd5fba827b Merge branch 'master' of github.com:Softcatala/git-po
* 'master' of github.com:Softcatala/git-po:
  l10n: Update Catalan translation
2021-03-14 22:34:46 +08:00
2d897529b2 Merge branch 'russian-l10n' of github.com:DJm00n/git-po-ru
* 'russian-l10n' of github.com:DJm00n/git-po-ru:
  l10n: ru.po: update Russian translation
2021-03-14 22:34:12 +08:00
799df2e406 Merge branch 'pt-PT' of github.com:git-l10n-pt-PT/git-po
* 'pt-PT' of github.com:git-l10n-pt-PT/git-po:
  l10n: pt_PT: add Portuguese translations part 1
2021-03-14 22:33:26 +08:00
ca56dadb4b use CALLOC_ARRAY
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 16:00:09 -08:00
f1121499e6 git-compat-util.h: drop trailing semicolon from macro definition
Make CALLOC_ARRAY usable like a function by requiring callers to supply
the trailing semicolon, which all of the current ones already do.  With
the extra semicolon e.g. the following code wouldn't compile because it
disconnects the "else" from the "if":

	if (condition)
		CALLOC_ARRAY(ptr, n);
	else
		whatever();

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:56:13 -08:00
4c8e3dca6e Documentation/git-push.txt: correct configuration typo
In the EXAMPLES section, git-push(1) says that 'git push origin' pushes
the current branch to the value of the 'remote.origin.merge'
configuration.

This wording (which dates back to b2ed944af7 (push: switch default from
"matching" to "simple", 2013-01-04)) is incorrect. There is no such
configuration as 'remote.<name>.merge'. This likely was originally
intended to read "branch.<name>.merge" instead.

Indeed, when 'push.default' is 'simple' (which is the default value, and
is applicable in this scenario per "without additional configuration"),
setup_push_upstream() dies if the branch's local name does not match
'branch.<name>.merge'.

Correct this long-standing typo to resolve some recent confusion on the
intended behavior of this example.

Reported-by: Adam Sharafeddine <adam.shrfdn@gmail.com>
Reported-by: Fabien Terrani <terranifabien@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:41:45 -08:00
53204061ac doc: describe mergetool configuration in git-mergetool(1)
In particular, this describes mergetool.hideResolved, which can help
users discover this setting (either because it may be useful to them
or in order to understand mergetool's behavior if they have forgotten
setting it in the past).

Tested by running

	make -C Documentation git-mergetool.1
	man Documentation/git-mergetool.1

and reading through the page.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:34:32 -08:00
b2a51c1b03 mergetool: do not enable hideResolved by default
When 98ea309b3f (mergetool: add hideResolved configuration,
2021-02-09) introduced the mergetool.hideResolved setting to reduce
the clutter in viewing non-conflicted sections of files in a
mergetool, it enabled it by default, explaining:

    No adverse effects were noted in a small survey of popular mergetools[1]
    so this behavior defaults to `true`.

In practice, alas, adverse effects do appear.  A few issues:

1. No indication is shown in the UI that the base, local, and remote
   versions shown have been modified by additional resolution.  This
   is inherent in the design: the idea of mergetool.hideResolved is to
   convince a mergetool that expects pristine local, base, and remote
   files to show partially resolved verisons of those files instead;
   there is no additional source of information accessible to the
   mergetool to see where the resolution has happened.

   (By contrast, a mergetool generating the partial resolution from
   conflict markers for itself would be able to hilight the resolved
   sections with a different color.)

   A user accustomed to seeing the files without partial resolution
   gets no indication that this behavior has changed when they upgrade
   Git.

2. If the computed merge did not line up the files correctly (for
   example due to repeated sections in the file), the partially
   resolved files can be misleading and do not have enough information
   to reconstruct what happened and compute the correct merge result.

3. Resolving a conflict can involve information beyond the textual
   conflict.  For example, if the local and remote versions added
   overlapping functionality in different ways, seeing the full
   unresolved versions of each alongside the base gives information
   about each side's intent that makes it possible to come up with a
   resolution that combines those two intents.  By contrast, when
   starting with partially resolved versions of those files, one can
   produce a subtly wrong resolution that includes redundant extra
   code added by one side that is not needed in the approach taken
   on the other.

All that said, a user wanting to focus on textual conflicts with
reduced clutter can still benefit from mergetool.hideResolved=true as
a way to deemphasize sections of the code that resolve cleanly without
requiring any changes to the invoked mergetool.  The caveats described
above are reduced when the user has explicitly turned this on, because
then the user is aware of them.

Flip the default to 'false'.

Reported-by: Dana Dahlstrom <dahlstrom@google.com>
Helped-by: Seth House <seth@eseth.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 15:30:29 -08:00
a4a4439fdf http: drop the check for an empty proxy password before approving
credential_approve() already checks for a non-empty password before
saving, so there's no need to do the extra check here.

Signed-off-by: John Szakmeister <john@szakmeister.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 22:17:10 -08:00
cd27f604e4 http: store credential when PKI auth is used
We already looked for the PKI credentials in the credential store, but
failed to approve it on success.  Meaning, the PKI certificate password
was never stored and git would request it on every connection to the
remote.  Let's complete the chain by storing the certificate password on
success.

Likewise, we also need to reject the credential when there is a failure.
Curl appears to report client-related certificate issues are reported
with the CURLE_SSL_CERTPROBLEM error.  This includes not only a bad
password, but potentially other client certificate related problems.
Since we cannot get more information from curl, we'll go ahead and
reject the credential upon receiving that error, just to be safe and
avoid caching or saving a bad password.

Signed-off-by: John Szakmeister <john@szakmeister.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 22:17:07 -08:00
96099726dd archive: expand only a single %(describe) per archive
Every %(describe) placeholder in $Format:...$ strings in files with the
attribute export-subst is expanded by calling git describe.  This can
potentially result in a lot of such calls per archive.  That's OK for
local repositories under control of the user of git archive, but could
be a problem for hosted repositories.

Expand only a single %(describe) placeholder per archive for now to
avoid denial-of-service attacks.  We can make this limit configurable
later if needed, but let's start out simple.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 13:22:44 -08:00
e4fd06e7e2 diffcore-rename: avoid doing basename comparisons for irrelevant sources
The basename comparison optimization implemented in
find_basename_matches() is very beneficial since it allows a source to
sometimes only be compared with one other file instead of N other files.
When a match is found, both a source and destination can be removed from
the matrix of inexact rename comparisons.  In contrast, the irrelevant
source optimization only allows us to remove a source from the matrix of
inexact rename comparisons...but it has the advantage of allowing a
source file to not even be loaded into memory at all and be compared to
0 other files.  Generally, not even comparing is a bigger performance
win, so when both optimizations could apply, prefer to use the
irrelevant-source optimization.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.708 s ±  0.111 s     5.680 s ±  0.096 s
    mega-renames:    102.171 s ±  0.440 s    13.812 s ±  0.162 s
    just-one-mega:     3.471 s ±  0.015 s   506.0  ms ±  3.9  ms

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
f89b4f2bee merge-ort: skip rename detection entirely if possible
diffcore_rename_extended() will do a bunch of setup, then check for
exact renames, then abort before inexact rename detection if there are
no more sources or destinations that need to be matched.  It will
sometimes be the case, however, that either
  * we start with neither any sources or destinations
  * we start with no *relevant* sources
In the first of these two cases, the setup and exact rename detection
will be very cheap since there are 0 files to operate on.  In the second
case, it is quite possible to have thousands of files with none of the
source ones being relevant.  Avoid calling diffcore_rename_extended() or
even some of the setup before diffcore_rename_extended() when we can
determine that rename detection is unnecessary.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        6.003 s ±  0.048 s     5.708 s ±  0.111 s
    mega-renames:    114.009 s ±  0.236 s   102.171 s ±  0.440 s
    just-one-mega:     3.489 s ±  0.017 s     3.471 s ±  0.015 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
174791f0fb merge-ort: use relevant_sources to filter possible rename sources
The past several commits determined conditions when rename sources might
be needed, and filled a relevant_sources strset with those paths.  Pass
these along to diffcore_rename_extended() to use to limit the sources
that we need to detect renames for.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       12.596 s ±  0.061 s     6.003 s ±  0.048 s
    mega-renames:    130.465 s ±  0.259 s   114.009 s ±  0.236 s
    just-one-mega:     3.958 s ±  0.010 s     3.489 s ±  0.017 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
2fd9eda462 merge-ort: precompute whether directory rename detection is needed
The point of directory rename detection is that if one side of history
renames a directory, and the other side adds new files under the old
directory, then the merge can move those new files into the new
directory.  This leads to the following important observation:

  * If the other side does not add any new files under the old
    directory, we do not need to detect any renames for that directory.

Similarly, directory rename detection had an important requirement:

  * If a directory still exists on one side of history, it has not been
    renamed on that side of history.  (See section 4 of t6423 or
    Documentation/technical/directory-rename-detection.txt for more
    details).

Using these two bits of information, we note that directory rename
detection is only needed in cases where (1) directories exist in the
merge base and on one side of history (i.e. dirmask == 3 or dirmask ==
5), and (2) where there is some new file added to that directory on the
side where it still exists (thus where the file has filemask == 2 or
filemask == 4, respectively).  This has to be done in two steps, because
we have the dirmask when we are first considering the directory, and
won't get the filemasks for the files within it until we recurse into
that directory.  So, we save
  dir_rename_mask = dirmask - 1
when we hit a directory that is missing on one side, and then later look
for cases of
  filemask == dir_rename_mask

One final note is that as soon as we hit a directory that needs
directory rename detection, we will need to detect renames in all
subdirectories of that directory as well due to the "majority rules"
decision when files are renamed into different directory hierarchies.
We arbitrarily use the special value of 0x07 to record when we've hit
such a directory.

The combination of all the above mean that we introduce a variable
named dir_rename_mask (couldn't think of a better name) which has one
of the following values as we traverse into a directory:
   * 0x00: directory rename detection not needed
   * 0x02 or 0x04: directory rename detection only needed if files added
   * 0x07: directory rename detection definitely needed

We then pass this value through to add_pairs() so that it can mark
location_relevant as true only when dir_rename_mask is 0x07.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
a68e6cea59 merge-ort: introduce wrappers for alternate tree traversal
Add traverse_trees_wrapper() and traverse_trees_wrapper_callback()
functions.  The former runs traverse_trees() with info->fn set to
traverse_trees_wrapper_callback, in order to simply save all the entries
without processing or recursing into any of them.  This step allows
extra computation to be done (e.g. checking some condition across all
files) that can be used later.  Then, after that is completed, it
iterates over all the saved entries and calls the original info->fn
callback with the saved data.

Currently, this does nothing more than marginally slowing down the tree
traversal since we do not take advantage of the opportunity to compute
anything special in traverse_trees_wrapper_callback(), and thus the real
callback will be called identically as it would have been without this
extra wrapper.  However, a subsequent commit will add some special
computation of some values that the real callback will be able to use.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:05 -08:00
beb06145f8 merge-ort: add data structures for an alternate tree traversal
In order to determine whether directory rename detection is needed, we
as a pre-requisite need a way to traverse through all the files in a
given tree before visiting any directories within that tree.
traverse_trees() only iterates through the entries in the order they
appear, so add some data structures that will store all the entries as
we iterate through them in traverse_trees(), which will allow us to
re-traverse them in our desired order.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:04 -08:00
32a56dfb99 merge-ort: precompute subset of sources for which we need rename detection
rename detection works by trying to pair all file deletions (or
"sources") with all file additions (or "destinations"), checking
similarity, and then marking the sufficiently similar ones as renames.
This can be expensive if there are many sources and destinations on a
given side of history as it results in an N x M comparison matrix.
However, there are many cases where we can compute in advance that
detecting renames for some of the sources provides no useful information
and thus that we can exclude those sources from the matrix.

To see why, first note that the merge machinery uses detected renames in
two ways:

   * directory rename detection: when one side of history renames a
       directory, and the other side of history adds new files to that
       directory, we want to be able to warn the user about the need to
       chose whether those new files stay in the old directory or move
       to the new one.

   * three-way content merging: in order to do three-way content merging
       of files, we need three different file versions.  If one side of
       history renamed a file, then some of the content for the file is
       found under a different path than in the merge base or on the
       other side of history.

Add a simple testcase showing the two kinds of reasons renames are
relevant; it's a testcase that will only pass if we detect both kinds of
needed renames.

Other than the testcase added above, this commit concentrates just on
the three-way content merging; it will punt and mark all sources as
needed for directory rename detection, and leave it to future commits to
narrow that down more.

The point of three-way content merging is to reconcile changes made on
*both* sides of history.  What if the file wasn't modified on both
sides?  There are two possibilities:

   * If it wasn't modified on the renamed side:
       -> then we get to do exact rename detection, which is cheap.

   * If it wasn't modified on the unrenamed side:
       -> then detection of a rename for that source file is irrelevant

That latter claim might be surprising at first, so let's walk through a
case to show why rename detection for that source file is irrelevant.
Let's use two filenames, old.c & new.c, with the following abbreviated
object ids (and where the value '000000' is used to denote that the file
is missing in that commit):

                 old.c     new.c
   MERGE_BASE:   01d01d    000000
   MERGE_SIDE1:  01d01d    000000
   MERGE_SIDE2:  000000    5e1ec7

If the rename *isn't* detected:
   then old.c looks like it was unmodified on one side and deleted on
   the other and should thus be removed.  new.c looks like a new file we
   should keep as-is.

If the rename *is* detected:
   then a three-way content merge is done.  Since the version of the
   file in MERGE_BASE and MERGE_SIDE1 are identical, the three-way merge
   will produce exactly the version of the file whose abbreviated
   object id is 5e1ec7.  It will record that file at the path new.c,
   while removing old.c from the directory.

Note that these two results are identical -- a single file named 'new.c'
with object id 5e1ec7.  In other words, it doesn't matter if the rename
is detected in the case where the file is unmodified on the unrenamed
side.

Use this information to compute whether we need rename detection for
each source created in add_pair().

It's probably worth noting that there used to be a few other edge or
corner cases besides three-way content merges and directory rename
detection where lack of rename detection could have affected the result,
but those cases actually highlighted where conflict resolution methods
were not consistent with each other.  Fixing those inconsistencies were
thus critically important to enabling this optimization.  That work
involved the following:

 * bringing consistency to add/add, rename/add, and rename/rename
    conflict types, as done back in the topic merged at commit
    ac193e0e0a ("Merge branch 'en/merge-path-collision'", 2019-01-04),
    and further extended in commits 2a7c16c980 ("t6422, t6426: be more
    flexible for add/add conflicts involving renames", 2020-08-10) and
    e8eb99d4a6 ("t642[23]: be more flexible for add/add conflicts
    involving pair renames", 2020-08-10)

  * making rename/delete more consistent with modify/delete
    as done in commits 1f3c9ba707 ("t6425: be more flexible with
    rename/delete conflict messages", 2020-08-10) and 727c75b23f
    ("t6404, t6423: expect improved rename/delete handling in ort
    backend", 2020-10-26)

Since the set of relevant_sources we compute has not yet been narrowed
down for directory rename detection, we do not pass it to
diffcore_rename_extended() yet.  That will be done after subsequent
commits narrow down the list of relevant_sources needed for directory
rename detection reasons.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:04 -08:00
9799889f2e diffcore-rename: enable filtering possible rename sources
Add the ability to diffcore_rename_extended() to allow external callers
to declare that they only need renames detected for a subset of source
files, and use that information to skip detecting renames for them.

There are two important pieces to this optimization that may not be
obvious at first glance:

  * We do not require callers to just filter the filepairs out
    to remove the non-relevant sources, because exact rename detection
    is fast and when it finds a match it can remove both a source and a
    destination whereas the relevant_sources filter can only remove a
    source.

  * We need to filter out the source pairs in a preliminary pass instead
    of adding a
       strset_contains(relevant_sources, one->path)
    check within the nested matrix loop.  The reason for that is if we
    have 30k renames, doing 30k * 30k = 900M strset_contains() calls
    becomes extraordinarily expensive and defeats the performance gains
    from this change; we only want to do 30k such calls instead.

If callers pass NULL for relevant_sources, that is special cases to
treat all sources as relevant.  Since all callers currently pass NULL,
this optimization does not yet have any effect.  Subsequent commits will
have merge-ort compute a set of relevant_sources to restrict which
sources we detect renames for, and have merge-ort pass that set of
relevant_sources to diffcore_rename_extended().

A note about filtering order:

Some may be curious why we don't filter out irrelevant sources at the
same time we filter out exact renames.  While that technically could be
done at this point, there are two reasons to defer it:

First, was to reinforce a lesson that was too easy to forget.  As I
mentioned above, in the past I filtered irrelevant sources out before
exact rename checking, and then discovered that exact renames' ability
to remove both sources and destinations was an important consideration
and thus doing the filtering after exact rename checking would speed
things up.  Then at some point I realized that basename matching could
also remove both sources and destinations, and decided to put irrelevant
source filtering after basename filtering.  That slowed things down a
lot.  But, despite learning about this important ordering, in later
restructuring I forgot and made the same mistake of putting the
filtering after basename guided rename detection again.  So, I have this
series of patches structured to do the irrelevant filtering last to
start to show how much extra it costs, and then add relevant filtering
in to find_basename_matches() to show how much it speeds things up.
Basically, it's a way to reinforce something that apparently was too
easy to forget, and make sure the commit messages record this lesson.

Second, the items in the "relevant_sources" in this patch series will
include all sources that *might be* relevant.  It has to be conservative
and catch anything that might need a rename, but in the patch series
after this one we'll find ways to weed out more of the *might be*
relevant ones.  Unfortunately, merge-ort does not have sufficient
information to weed those ones out, and there isn't enough information
at the time of filtering exact renames out to remove the extra ones
either.  It has to be deferred.  So the deferral is in part to simplify
some later additions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:04 -08:00
75555676ad builtin/init-db: handle bare clones when core.bare set to false
In 552955ed7f ("clone: use more conventional config/option layering",
2020-10-01), clone learned to read configuration options earlier in its
execution, before creating the new repository.  However, that led to a
problem: if the core.bare setting is set to false in the global config,
cloning a bare repository segfaults.  This happens because the
repository is falsely thought to be non-bare, but clone has set the work
tree to NULL, which is then dereferenced.

The code to initialize the repository already considers the fact that a
user might want to override the --bare option for git init, but it
doesn't take into account clone, which uses a different option.  Let's
just check that the work tree is not NULL, since that's how clone
indicates that the repository is bare.  This is also the case for git
init, so we won't be regressing that case.

Reported-by: Joseph Vusich <jvusich@amazon.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 15:06:48 -08:00
42efa1231a filter-branch: drop $_x40 glob
When checking whether a commit was rewritten to a single object id, we
use a glob that insists on a 40-hex result. This works for sha1, but
fails t7003 when run with GIT_TEST_DEFAULT_HASH=sha256.

Since the previous commit simplified the case statement here, we only
have two arms: an empty string or a single object id. We can just loosen
our glob to match anything, and still distinguish those cases (we lose
the ability to notice bogus input, but that's not a problem; we are the
one who wrote the map in the first place, and anyway update-ref will
complain loudly if the input isn't a valid hash).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 14:16:58 -08:00
98fe9e666f filter-branch: drop multiple-ancestor warning
When a ref maps to a commit that is neither rewritten nor kept by
filter-branch (e.g., because it was eliminated by rev-list's pathspec
selection), we rewrite it to its nearest ancestor.

Since the initial commit in 6f6826c52b (Add git-filter-branch,
2007-06-03), we have warned when there are multiple such ancestors in
the map file. However, the warning code is impossible to trigger these
days. Since a0e46390d3 (filter-branch: fix ref rewriting with
--subdirectory-filter, 2008-08-12), we find the ancestor using "rev-list
-1", so it can only ever have a single value.

This code is made doubly confusing by the fact that we append to the map
file when mapping ancestors. However, this can never yield multiple
values because:

  - we explicitly check whether the map already exists, and if so, do
    nothing (so our "append" will always be to a file that does not
    exist)

  - even if we were to try mapping twice, the process to do so is
    deterministic. I.e., we'd always end up with the same ancestor for a
    given sha1. So warning about it would be pointless; there is no
    ambiguity.

So swap out the warning code for a BUG (which we'll simplify further in
the next commit). And let's stop using the append operator to make the
ancestor-mapping code less confusing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 14:14:52 -08:00
6d875d19fd t7003: test ref rewriting explicitly
After it has rewritten all of the commits, filter-branch will then
rewrite each of the input refs based on the resulting map of old/new
commits. But we don't have any explicit test coverage of this code.
Let's make sure we are covering each of those cases:

  - deleting a ref when all of its commits were pruned

  - rewriting a ref based on the mapping (this happens throughout the
    script, but let's make sure we generate the correct messages)

  - rewriting a ref whose tip was excluded, in which case we rewrite to
    the nearest ancestor. Note in this case that we still insist that no
    "warning" line is present (even though it looks like we'd trigger
    the "... was rewritten into multiple commits" one). See the next
    commit for more details.

Note these all pass currently, but the latter two will fail when run
with GIT_TEST_DEFAULT_HASH=sha256.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 14:14:19 -08:00
13d7ab6b5d Git 2.31-rc2 2021-03-08 16:09:43 -08:00
56a57652ef Sync with Git 2.30.2 for CVE-2021-21300
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 16:09:07 -08:00
6c46f864e5 Merge branch 'jt/transfer-fsck-across-packs-fix'
The code to fsck objects received across multiple packs during a
single git fetch session has been broken when the packfile URI
feature was in use.  A workaround has been added by disabling the
codepath to avoid keeping a packfile that is too small.

* jt/transfer-fsck-across-packs-fix:
  fetch-pack: do not mix --pack_header and packfile uri
2021-03-08 16:04:47 -08:00
834845142d l10n: de.po: Update German translation for Git v2.31.0
Reviewed-by: Ralf Thielow <ralf.thielow@gmail.com>
Reviewed-by: Phillip Szelat <phillip.szelat@gmail.com>
Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>
2021-03-08 19:49:33 +01:00
68b5c3aa48 Makefile: update 'make fuzz-all' docs to reflect modern clang
Clang no longer produces a libFuzzer.a. Instead, you can include
libFuzzer by using -fsanitize=fuzzer. Therefore we should use that in
the example command for building fuzzers.

We also add -fsanitize=fuzzer-no-link to the CFLAGS to ensure that all
the required instrumentation is added when compiling git [1], and remove
 -fsanitize-coverage=trace-pc-guard as it is deprecated.

I happen to have tested with LLVM 11 - however -fsanitize=fuzzer appears
to work in a wide range of reasonably modern clangs.

(On my system: what used to be libFuzzer.a now lives under the following
 path, which is tricky albeit not impossible for a novice such as myself
 to find:
/usr/lib64/clang/11.0.0/lib/linux/libclang_rt.fuzzer-x86_64.a )

[1] https://releases.llvm.org/11.0.0/docs/LibFuzzer.html#fuzzer-usage

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 10:26:25 -08:00
e8df3b6c6c Add entry for Ramkumar Ramachandra
Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 09:56:34 -08:00
241b5d3ebe fix xcalloc() argument order
Pass the number of elements first and ther size second, as expected
by xcalloc().  Provide a semantic patch, which was actually used to
generate the rest of this patch.

The semantic patch would generate flip-flop diffs if both arguments
are sizeofs.  We don't have such a case, and it's hard to imagine
the usefulness of such an allocation.  If it ever occurs then we
could deal with it by duplicating the rule in the semantic patch to
make it cancel itself out, or we could change the code to use
CALLOC_ARRAY.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 09:45:04 -08:00
408985d301 l10n: pt_PT: add Portuguese translations part 1
* Newlines corrected.
* Add concept translation table.
* Translated some.
* Corrected some.
* Corrected some 'Negation of Emptiness'.

Signed-off-by: Daniel Santos <hello@brighterdan.com>
2021-03-08 15:21:51 +00:00
1369935987 l10n: vi.po(5104t): for git v2.31.0 l10n round 2
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2021-03-08 09:03:04 +07:00
b0adcc311b l10n: es: 2.31.0 round 2
Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>
2021-03-07 18:31:14 -05:00
c21ad4d941 l10n: Add translation team info
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-03-07 19:38:29 +07:00
8c4abfb8be l10n: start Indonesian translation
* Initialize PO file
  * Translate init-db.c
  * Translate wt-status.c
  * Translate builtin/clone.c
  * Translate builtin/checkout.c
  * Translate builtin/fetch.c
  * Complete core translations:
    * builtin/remote.c
    * builtin/index-pack.c
    * push.c
    * reset.c
  * Sync with l10n upstream

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
2021-03-07 19:38:07 +07:00
2aec3bc4b6 fetch-pack: do not mix --pack_header and packfile uri
When fetching (as opposed to cloning) from a repository with packfile
URIs enabled, an error like this may occur:

 fatal: pack has bad object at offset 12: unknown object type 5
 fatal: finish_http_pack_request gave result -1
 fatal: fetch-pack: expected keep then TAB at start of http-fetch output

This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs,
use index-pack arg", 2021-02-22), when the index-pack args used when
processing the inline packfile of a fetch response and when processing
packfile URIs were unified.

This bug happens because fetch, by default, partially reads (and
consumes) the header of the inline packfile to determine if it should
store the downloaded objects as a packfile or loose objects, and thus
passes --pack_header=<...> to index-pack to inform it that some bytes
are missing. However, when it subsequently fetches the additional
packfiles linked by URIs, it reuses the same index-pack arguments, thus
wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are
missing.

This does not happen when cloning because "git clone" always passes
do_keep, which instructs the fetch mechanism to always retain the
packfile, eliminating the need to read the header.

There are a few ways to fix this, including filtering out pack_header
arguments when downloading the additional packfiles, but I decided to
stick to always using index-pack throughout when packfile URIs are
present - thus, Git no longer needs to read the bytes, and no longer
needs --pack_header here.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 15:04:09 -08:00
0af760e261 stash show: learn stash.showIncludeUntracked
The previous commit teaches `git stash show --include-untracked`. It
may be desirable for a user to be able to always enable the
--include-untracked behavior. Teach the stash.showIncludeUntracked
config option which allows users to do this in a similar manner to
stash.showPatch.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:27 -08:00
d3c7bf73bd stash show: teach --include-untracked and --only-untracked
Stash entries can be made with untracked files via
`git stash push --include-untracked`. However, because the untracked
files are stored in the third parent of the stash entry and not the
stash entry itself, running `git stash show` does not include the
untracked files as part of the diff.

With --include-untracked, untracked paths, which are recorded in the
third-parent if it exists, are shown in addition to the paths that have
modifications between the stash base and the working tree in the stash.

It is possible to manually craft a malformed stash entry where duplicate
untracked files in the stash entry will mask tracked files. We detect
and error out in that case via a custom unpack_trees() callback:
stash_worktree_untracked_merge().

Also, teach stash the --only-untracked option which only shows the
untracked files of a stash entry. This is similar to `git show stash^3`
but it is nice to provide a convenient abstraction for it so that users
do not have to think about the underlying implementation.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:26 -08:00
ccae01cab8 builtin/repack.c: reword comment around pack-objects flags
The comment in this block is meant to indicate that passing '--all',
'--reflog', and so on aren't necessary when repacking with the
'--geometric' option.

But, it has two problems: first, it is factually incorrect ('--all' is
*not* incompatible with '--stdin-packs' as the comment suggests);
second, it is quite focused on the geometric case for a block that is
guarding against it.

Reword this comment to address both issues.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
2a15964128 builtin/repack.c: be more conservative with unsigned overflows
There are a number of places in the geometric repack code where we
multiply the number of objects in a pack by another unsigned value. We
trust that the number of objects in a pack is always representable by a
uint32_t, but we don't necessarily trust that that number can be
multiplied without overflow.

Sprinkle some unsigned_add_overflows() and unsigned_mult_overflows() in
split_pack_geometry() to check that we never overflow any unsigned types
when adding or multiplying them.

Arguably these checks are a little too conservative, and certainly they
do not help the readability of this function. But they are serving a
useful purpose, so I think they are worthwhile overall.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
13d746a303 builtin/repack.c: assign pack split later
To determine the where to place the split when repacking with the
'--geometric' option, split_pack_geometry() assigns the "split" variable
and then decrements it in a loop.

It would be equivalent (and more readable) to assign the split to the
loop position after exiting the loop, so do that instead.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
dab3247734 t7703: test --geometric repack with loose objects
We don't currently have a test that demonstrates the non-idempotent
behavior of 'git repack --geometric' with loose objects, so add one here
to make sure we don't regress in this area.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
f25e33c156 builtin/repack.c: do not repack single packs with --geometric
In 0fabafd0b9 (builtin/repack.c: add '--geometric' option, 2021-02-22),
the 'git repack --geometric' code aborts early when there is zero or one
pack.

When there are no packs, this code does the right thing by placing the
split at "0". But when there is exactly one pack, the split is placed at
"1", which means that "git repack --geometric" (with any factor)
repacks all of the objects in a single pack.

This is wasteful, and the remaining code in split_pack_geometry() does
the right thing (not repacking the objects in a single pack) even when
only one pack is present.

Loosen the guard to only stop when there aren't any packs, and let the
rest of the code do the right thing. Add a test to ensure that this is
the case.

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
8278f87022 l10n: zh_TW.po: v2.31.0 round 2 (15 untranslated)
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
2021-03-06 02:43:34 +08:00
2f176de687 l10n: bg.po: Updated Bulgarian translation (5104t)
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2021-03-05 12:12:34 +01:00
1ecef023a9 Merge branch 'fr_next' of github.com:jnavila/git
* 'fr_next' of github.com:jnavila/git:
  l10n: fr: v2.31 rnd 2
2021-03-05 13:47:07 +08:00
5b888ad949 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5104t0f0u)
2021-03-05 13:46:25 +08:00
be7935ed8b Merged the open-eintr workaround for macOS
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-04 15:42:50 -08:00
58d581c344 Documentation/RelNotes: improve release note for rename detection work
There were some early changes in the 2.31 cycle to optimize some setup
in diffcore-rename.c[1], some later changes to measure performance[2],
and finally some significant changes to improve rename detection
performance.  The final one was merged with the note

   Performance optimization work on the rename detection continues.

That works for the commit log, but feels misleading as a release note
since all the changes were within one cycle.  Simplify this to just

   Performance improvements for rename detection.

The former wording could be seen as hinting that more performance
improvements will come in 2.32, which is true, but we can just cover
those in the 2.32 release notes when the time comes.

[1] a5ac31b5b1 (Merge branch 'en/diffcore-rename', 2021-01-25)
[2] d3a035b055 (Merge branch 'en/merge-ort-perf', 2021-02-11)
[3] 12bd17521c (Merge branch 'en/diffcore-rename', 2021-03-01)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-04 15:38:11 -08:00
921846fa22 Merge branch 'jk/open-returns-eintr'
Work around platforms whose open() is reported to return EINTR (it
shouldn't, as we do our signals with SA_RESTART).

* jk/open-returns-eintr:
  config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur
  Makefile: add OPEN_RETURNS_EINTR knob
2021-03-04 15:34:45 -08:00
068cb92300 l10n: fr: v2.31 rnd 2
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-03-04 21:53:45 +01:00
85c787f1e9 Merge https://github.com/prati0100/git-gui
* https://github.com/prati0100/git-gui:
  Revert "git-gui: remove lines starting with the comment character"
2021-03-04 12:38:50 -08:00
f6a7e896b8 l10n: tr: v2.31.0-rc1
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-03-04 22:29:24 +03:00
929dc48e96 l10n: sv.po: Update Swedish translation (5104t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-03-04 19:10:43 +01:00
9b7e82b940 l10n: git.pot: v2.31.0 round 2 (9 new, 8 removed)
Generate po/git.pot from v2.31.0-rc1 for git v2.31.0 l10n round 2.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-03-04 22:41:21 +08:00
4dd8469336 Merge branch 'master' of github.com:git/git
* 'master' of github.com:git/git: (63 commits)
  Git 2.31-rc1
  Hopefully the last batch before -rc1
  Revert "commit-graph: when incompatible with graphs, indicate why"
  read-cache: make the index write buffer size 128K
  dir: fix malloc of root untracked_cache_dir
  commit-graph.c: display correct number of chunks when writing
  doc/reftable: document how to handle windows
  fetch-pack: print and use dangling .gitmodules
  fetch-pack: with packfile URIs, use index-pack arg
  http-fetch: allow custom index-pack args
  http: allow custom index-pack args
  chunk-format: add technical docs
  chunk-format: restore duplicate chunk checks
  midx: use 64-bit multiplication for chunk sizes
  midx: use chunk-format read API
  commit-graph: use chunk-format read API
  chunk-format: create read chunk API
  midx: use chunk-format API in write_midx_internal()
  midx: drop chunk progress during write
  midx: return success/failure in chunk write methods
  ...
2021-03-04 22:40:13 +08:00
df4f9e28f6 Merge branch 'py/revert-commit-comments'
This commit causes breakage on macOS, or in fact any platform using
older versions of Tcl. Revert it.

* py/revert-commit-comments:
  Revert "git-gui: remove lines starting with the comment character"
2021-03-04 13:59:45 +05:30
c0698df057 Revert "git-gui: remove lines starting with the comment character"
This reverts commit b9a43869c9.

This commit causes breakage on macOS (10.13). It causes errors on
startup and completely breaks the commit functionality. There are two
main problems. First, it uses `string cat` which is not supported on
older Tcl versions. Second, it does a half close of the bidirectional
pipe to git-stripspace which is also not supported on older Tcl
versions.

Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
2021-03-04 13:53:27 +05:30
ea7e63921c doc: .gitignore documentation typofix
Signed-off-by: Julien Richard <julien.richard@ubisoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:16:48 -08:00
12604a8d0c t9801: replace test -f with test_path_is_file
Although `test -f` has the same functionality as test_path_is_file(), in
the case where test_path_is_file() fails, we get much better debugging
information.

Replace `test -f` with test_path_is_file so that future developers
will have a better experience debugging these test cases.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:11:31 -08:00
93c3d297b5 git mv foo FOO ; git mv foo bar gave an assert
The following sequence, on a case-insensitive file system,
(strictly speeking with core.ignorecase=true)
leads to an assertion failure and leaves .git/index.lock behind.

git init
echo foo >foo
git add foo
git mv foo FOO
git mv foo bar

This regression was introduced in Commit 9b906af657,
"git-mv: improve error message for conflicted file"

The bugfix is to change the "file exist case-insensitive in the index"
into the correct "file exist (case-sensitive) in the index".

This avoids the "assert" later in the code and keeps setting up the
"ce" pointer for ce_stage(ce) done in the next else if.

This fixes
https://github.com/git-for-windows/git/issues/2920

Reported-By: Dan Moseley <Dan.Moseley@microsoft.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:07:12 -08:00
f451960708 git-cat-file.txt: remove references to "sha1"
As part of the hash-transition, git can operate on more than just SHA-1
repositories. Replace "sha1"-specific documentation with hash-agnostic
terminology.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 16:43:06 -08:00
4f0ba2d533 git-cat-file.txt: monospace args, placeholders and filenames
In modern documentation, args, placeholders and filenames are
monospaced. Apply monospace formatting to these objects.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 16:43:03 -08:00
3ed77c4792 l10n: tr: v2.31.0-rc0
Signed-off-by: Emir Sarı <bitigchi@me.com>
2021-03-02 20:14:46 +08:00
273c9901c2 pretty: document multiple %(describe) being inconsistent
Each %(describe) placeholder is expanded using a separate git describe
call.  Their outputs depend on the tags present at the time, so there's
no consistency guarantee.  Document that fact.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:50:27 -08:00
09fe8ca92e t4205: assert %(describe) test coverage
Document that the test is covering both describable and
undescribable commits.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:42:17 -08:00
bbabaad298 config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur
We've had mixed reports on whether the latest release of macOS needs
this Makefile knob set. In most reported cases, there's antivirus
software running (which one might imagine could cause an open() call to
be delayed). However, one of the (off-list) reports I've gotten
indicated that it happened on an otherwise clean install of Big Sur.

Since the symptom is so bad (checkout randomly fails to write several
fails when the progress meter kicks in), and since the workaround is so
lightweight (if we don't see EINTR, it's just an extra conditional
check), let's just turn it on by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:07:45 -08:00
23c781f173 githooks.txt: clarify documentation on reference-transaction hook
The reference-transaction hook doesn't clearly document its scope and
what values it receives as input. Document it to make it less surprising
and clearly delimit its (current) scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:02:01 -08:00
5f308a89d8 githooks.txt: replace mentions of SHA-1 specific properties
The githooks(5) documentation states in several places that the hook
will receive a SHA-1 or hashes of 40 characters length. Given that we're
transitioning to a world where both SHA-1 and SHA-256 are supported,
this is inaccurate.

Fix the issue by replacing mentions of SHA-1 with "object name" and not
explicitly mentioning the hash size.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:02:01 -08:00
75f5efcba2 Merge branch 'master' of github.com:nafmo/git-l10n-sv
* 'master' of github.com:nafmo/git-l10n-sv:
  l10n: sv.po: Update Swedish translation (5103t0f0u)
2021-03-01 10:01:02 +08:00
0b71d789a8 Merge branch 'pl' of github.com:Arusekk/git-po
* 'pl' of github.com:Arusekk/git-po:
  l10n: pl.po: Update translation
2021-03-01 09:59:07 +08:00
fe8885258b l10n: sv.po: Update Swedish translation (5103t0f0u)
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2021-02-28 22:22:46 +01:00
fa42d191c6 l10n: pl.po: Update translation
Signed-off-by: Arusekk <arek_koz@o2.pl>
2021-02-27 17:17:32 +01:00
5ff5a30652 l10n: fr: v2.31.0 rnd 1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
2021-02-27 15:47:45 +01:00
81afdf7a2e diffcore-rename: compute dir_rename_guess from dir_rename_counts
dir_rename_counts has a mapping of a mapping, in particular, it has
   old_dir => { new_dir => count }
We want a simple mapping of
   old_dir => new_dir
based on which new_dir had the highest count for a given old_dir.
Compute this and store it in dir_rename_guess.

This is the final piece of the puzzle needed to make our guesses at
which directory files have been moved to when basenames aren't unique.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       12.775 s ±  0.062 s    12.596 s ±  0.061 s
    mega-renames:    188.754 s ±  0.284 s   130.465 s ±  0.259 s
    just-one-mega:     5.599 s ±  0.019 s     3.958 s ±  0.010 s

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
333899e1e3 diffcore-rename: limit dir_rename_counts computation to relevant dirs
We are using dir_rename_counts to count the number of other directories
that files within a directory moved to.  We only need this information
for directories that disappeared, though, so we can return early from
update_dir_rename_counts() for other paths.

If dirs_removed is passed to diffcore_rename_extended(), then it
provides the relevant bits of information for us to limit this counting
to relevant dirs.  If dirs_removed is not passed, we would need to
compute some replacement in order to do this limiting.  Introduce a new
info->relevant_source_dirs variable for this purpose, even though at
this stage we will only set it to dirs_removed for simplicity.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
1ad69eb0dc diffcore-rename: compute dir_rename_counts in stages
Compute dir_rename_counts based just on exact renames to start, as that
can provide us useful information in find_basename_matches().  This is
done by moving the code from compute_dir_rename_counts() into
initialize_dir_rename_info(), resulting in it being computed earlier and
based just on exact renames.  Since that's an incomplete result, we
augment the counts via calling update_dir_rename_counts() after each
basename-guide and inexact rename detection match is found.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
b1473019e8 diffcore-rename: extend cleanup_dir_rename_info()
When diffcore_rename_extended() is passed a NULL dir_rename_count, we
will still want to create a temporary one for use by
find_basename_matches(), but have it fully deallocated before
diffcore_rename_extended() returns.  However, when
diffcore_rename_extended() is passed a dir_rename_count, we want to fill
that strmap with appropriate values and return it.  However, for our
interim purposes we may also add entries corresponding to directories
that cannot have been renamed due to still existing on both sides.

Extend cleanup_dir_rename_info() to handle these two different cases,
cleaning up the relevant bits of information for each case.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:12 -08:00
b6e3d27434 diffcore-rename: move dir_rename_counts into dir_rename_info struct
This continues the migration of the directory rename detection code into
diffcore-rename, now taking the simple step of combining it with the
dir_rename_info struct.  Future commits will then make dir_rename_counts
be computed in stages, and add computation of dir_rename_guess.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
cd52e0050f diffcore-rename: add function for clearing dir_rename_count
As we adjust the usage of dir_rename_count we want to have a function
for clearing, or partially clearing it out.  Add a
partial_clear_dir_rename_count() function for this purpose.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
0c4fd732f0 Move computation of dir_rename_count from merge-ort to diffcore-rename
Move the computation of dir_rename_count from merge-ort.c to
diffcore-rename.c, making slight adjustments to the data structures
based on the move.  While the diffstat looks large, viewing this commit
with --color-moved makes it clear that only about 20 lines changed.

With this patch, the computation of dir_rename_count is still only done
after inexact rename detection, but subsequent commits will add a
preliminary computation of dir_rename_count after exact rename
detection, followed by some updates after inexact rename detection.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
ae8cf74d3f diffcore-rename: add a mapping of destination names to their indices
Compute a mapping of full filename to the index within rename_dst where
that filename is found, and store it in idx_map.  idx_possible_rename()
needs this to quickly finding an array entry in rename_dst given the
pathname.

While at it, add placeholder initializations for dir_rename_count and
dir_rename_guess; these will be more fully populated in subsequent
commits.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
bde8b9f34c diffcore-rename: provide basic implementation of idx_possible_rename()
Add a new struct dir_rename_info with various values we need inside our
idx_possible_rename() function introduced in the previous commit.  Add a
basic implementation for this function showing how we plan to use the
variables, but which will just return early with a value of -1 (not
found) when those variables are not set up.

Future commits will do the work necessary to set up those other
variables so that idx_possible_rename() does not always return -1.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
37a2514364 diffcore-rename: use directory rename guided basename comparisons
A previous commit noted that it is very common for people to move files
across directories while keeping their filename the same.  The last few
commits took advantage of this and showed that we can accelerate rename
detection significantly using basenames; since files with the same
basename serve as likely rename candidates, we can check those first and
remove them from the rename candidate pool if they are sufficiently
similar.

Unfortunately, the previous optimization was limited by the fact that
the remaining basenames after exact rename detection are not always
unique.  Many repositories have hundreds of build files with the same
name (e.g. Makefile, .gitignore, build.gradle, etc.), and may even have
hundreds of source files with the same name.  (For example, the linux
kernel has 100 setup.c, 87 irq.c, and 112 core.c files.  A repository at
$DAYJOB has a lot of ObjectFactory.java and Plugin.java files).

For these files with non-unique basenames, we are faced with the task of
attempting to determine or guess which directory they may have been
relocated to.  Such a task is precisely the job of directory rename
detection.  However, there are two catches: (1) the directory rename
detection code has traditionally been part of the merge machinery rather
than diffcore-rename.c, and (2) directory rename detection currently
runs after regular rename detection is complete.  The 1st catch is just
an implementation issue that can be overcome by some code shuffling.
The 2nd requires us to add a further approximation: we only have access
to exact renames at this point, so we need to do directory rename
detection based on just exact renames.  In some cases we won't have
exact renames, in which case this extra optimization won't apply.  We
also choose to not apply the optimization unless we know that the
underlying directory was removed, which will require extra data to be
passed in to diffcore_rename_extended().  Also, even if we get a
prediction about which directory a file may have relocated to, we will
still need to check to see if there is a file in the predicted
directory, and then compare the two files to see if they meet the higher
min_basename_score threshold required for marking the two files as
renames.

This commit introduces an idx_possible_rename() function which will
do this directory rename detection for us and give us the index within
rename_dst of the resulting filename.  For now, this function is
hardcoded to return -1 (not found) and just hooks up how its results
would be used once we have a more complete implementation in place.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 17:53:11 -08:00
66f52fa26b pack-revindex.c: don't close unopened file descriptors
When opening a reverse index, load_revindex_from_disk() jumps to the
'cleanup' label in case something goes wrong: the reverse index had the
wrong size, an unrecognized version, or similar.

It also jumps to this label when the reverse index couldn't be opened in
the first place, which will cause an error with the unguarded close()
call in the label.

Guard this call with "if (fd >= 0)" to make sure that we have a valid
file descriptor to close before attempting to close it.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:42:27 -08:00
36e834abc1 t/perf: avoid copying worktree files from test repo
When running the perf suite, we copy files from an existing $GIT_DIR to
a scratch repository to give us a realistic setup on which to operate.
Since the perf scripts themselves may modify the scratch repository, we
want to make sure we've scrubbed any references back to the original.

One existing example is that we avoid copying the file "commondir" at
the top-level of the repository. In a worktree git-dir (e.g.,
.git/worktrees/foo), that file contains the path to the parent
repository; copying it could mean ref updates in the scratch repository
affect the original.

But there are other files we should cover, too:

  - "gitdir" in a worktree git-dir contains the path to the actual .git
    file in the working tree. We _shouldn't_ end up looking at it at
    all, since the lack of a "commondir" file means Git won't consider
    this to be a worktree git-dir. But it's best to err on the safe
    side.

  - in a parent repository that contains worktrees, the
    "$GIT_DIR/worktrees" directory will contain the git dirs for the
    individual worktrees. Which will themselves contain commondir and
    gitdir files that may reference the original repository. We should
    likewise remove them.

    Note that this does mean that the perf suite's scratch repositories
    will never have any worktrees. That's OK; we don't have any perf tests
    that are influenced by their presence. If we add any, they'd
    probably want to create the worktrees themselves anyway.

This patch adds both paths to the set of omissions in
test_perf_copy_repo_contents(). Note that we won't get confused here by
matching arbitrary names like refs/heads/commondir. This list is always
matching top-level entries in $GIT_DIR (we rely on "cp -R" to do the
actual recursion).

Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:21:04 -08:00
85b87a5396 t/perf: handle worktrees as test repos
The perf suite gets confused when test_perf_default_repo is pointed at a
worktree (which includes when it is run from within a worktree at all,
since the default is to use the current repository).

Here's an example:

  $ git worktree add ~/foo
  Preparing worktree (new branch 'foo')
  HEAD is now at 328c109303 The eighth batch
  $ cd ~/foo
  $ make
  [...build output...]
  $ cd t/perf
  $ ./p0000-perf-lib-sanity.sh -v -i
  [...]
  perf 1 - test_perf_default_repo works:
  running:
  	foo=$(git rev-parse HEAD) &&
  	test_export foo

  fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

The problem is that we didn't copy all of the necessary files from the
source repository (in this case we got HEAD, but we have no refs!). We
discover the git-dir with "rev-parse --git-dir", but this points to the
worktree's partial repository in .../.git/worktrees/foo.

That partial repository has a "commondir" file which points to the main
repository, where the actual refs are stored, but we don't copy it. This
is the correct thing to do, though! If we did copy it, then our scratch
test repo would be pointing back to the original main repo, and any ref
updates we made in the tests would impact that original repo.

Instead, we need to either:

  1. Make a scratch copy of the original main repo (in addition to the
     worktree repo), and point the scratch worktree repo's commondir at
     it. This preserves the original relationship, but it's doubtful any
     script really cares (if they are testing worktree performance,
     they'd probably make their own worktrees). And it's trickier to get
     right.

  2. Collapse the main and worktree repos into a single scratch repo.
     This can be done by copying everything from both, preferring any
     files from the worktree repo.

This patch does the second one. With this applied, the example above
results in p0000 running successfully.

Reported-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:21:04 -08:00
2b08101204 Makefile: add OPEN_RETURNS_EINTR knob
On some platforms, open() reportedly returns EINTR when opening regular
files and we receive a signal (usually SIGALRM from our progress meter).
This shouldn't happen, as open() should be a restartable syscall, and we
specify SA_RESTART when setting up the alarm handler. So it may actually
be a kernel or libc bug for this to happen. But it has been reported on
at least one version of Linux (on a network filesystem):

  https://lore.kernel.org/git/c8061cce-71e4-17bd-a56a-a5fed93804da@neanderfunk.de/

as well as on macOS starting with Big Sur even on a regular filesystem.

We can work around it by retrying open() calls that get EINTR, just as
we do for read(), etc. Since we don't ever _want_ to interrupt an open()
call, we can get away with just redefining open, rather than insisting
all callsites use xopen().

We actually do have an xopen() wrapper already (and it even does this
retry, though there's no indication of it being an observed problem back
then; it seems simply to have been lifted from xread(), etc). But it is
used hardly anywhere, and isn't suitable for general use because it will
die() on error. In theory we could combine the two, but it's awkward to
do so because of the variable-args interface of open().

This patch adds a Makefile knob for enabling the workaround. It's not
enabled by default for any platforms in config.mak.uname yet, as we
don't have enough data to decide how common this is (I have not been
able to reproduce on either Linux or Big Sur myself). It may be worth
enabling preemptively anyway, since the cost is pretty low (if we don't
see an EINTR, it's just an extra conditional).

However, note that we must not enable this on Windows. It doesn't do
anything there, and the macro overrides the existing mingw_open()
redirection. I've added a preemptive #undef here in the mingw header
(which is processed first) to just quietly disable it (we could also
make it an #error, but there is little point in being so aggressive).

Reported-by: Aleksey Kliger <alklig@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:15:51 -08:00
6fab35f748 convert: fail gracefully upon missing clean cmd on required filter
The gitattributes documentation mentions that either the clean cmd or
the smudge cmd can be left unspecified in a filter definition. However,
when the filter is marked as 'required', the absence of any one of these
two should be treated as an error. Git already fails under these
circumstances, but not always in a pleasant way: omitting a clean cmd in
a required filter triggers an assertion error which leaves the user with
a quite verbose message:

git: convert.c:1459: convert_to_git_filter_fd: Assertion "ca.drv->clean || ca.drv->process" failed.

This assertion is not really necessary, as the apply_filter() call below
it already performs the same check. And when this condition is not met,
the function returns 0, making the caller die() with a much nicer
message. (Also note that die()-ing here is the right behavior as
`would_convert_to_git_filter_fd() == true` is a precondition to use
convert_to_git_filter_fd(), and the former is only true when the filter
is required.) So remove the assertion and add two regression tests to
make sure that git fails nicely when either the smudge or clean command
is missing on a required filter.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 11:20:02 -08:00
712b0ed6ec l10n: git.pot: v2.31.0 round 1 (155 new, 89 removed)
Generate po/git.pot from v2.31.0-rc0 for git v2.31.0 l10n round 1.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-02-26 22:09:42 +08:00
702110aac6 commit-graph: use config to specify generation type
We have two established generation number versions:

 1: topological levels
 2: corrected commit dates

The corrected commit dates are enabled by default, but they also write
extra data in the GDAT and GDOV chunks. Services that host Git data
might want to have more control over when this feature rolls out than
just updating the Git binaries.

Add a new "commitGraph.generationVersion" config option that specifies
the intended generation number version. If this value is less than 2,
then the GDAT chunk is never written _or read_ from an existing file.

This can replace our use of the GIT_TEST_COMMIT_GRAPH_NO_GDAT
environment variable in the test suite. Remove it.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25 15:10:41 -08:00
c7ef8fe608 commit-graph: create local repository pointer
The write_commit_graph() method uses 'the_repository' in a few places. A
new need for a repository pointer is coming in the following change, so
group these instances into a local variable 'r' that could eventually
become part of the method signature, if so desired.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25 15:10:40 -08:00
0f1da600e6 remote: write camel-cased *.pushRemote on rename
When a remote is renamed don't change the canonical "*.pushRemote"
form to "*.pushremote". Fixes and tests for a minor bug in
923d4a5ca4 (remote rename/remove: handle branch.<name>.pushRemote
config values, 2020-01-27). See the preceding commit for why this does
& doesn't matter.

While we're at it let's also test that we handle the "*.pushDefault"
key correctly. The code to handle that was added in
b3fd6cbf29 (remote rename/remove: gently handle remote.pushDefault
config, 2020-02-01) and does the right thing, but nothing tested that
we wrote out the canonical camel-cased form.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:03:00 -08:00
bfa9148ff7 remote: add camel-cased *.tagOpt key, like clone
Change "git remote add" so that it adds a *.tagOpt key, and not the
lower-cased *.tagopt on "git remote add --no-tags", just as "git clone
--no-tags" would do.

This doesn't matter for anything that reads the config. It's just
prettier if we write config keys in their documented camelCase form to
user-readable config files.

When I added support for "clone -no-tags" in 0dab2468ee (clone: add a
--no-tags option to clone without tags, 2017-04-26) I made it use
the *.tagOpt form, but the older "git remote add" added in
111fb85865 (remote add: add a --[no-]tags option, 2010-04-20) has
been using *.tagopt all this time.

It's easy enough to add a test for this, so let's do that. We can't
use "git config -l" there, because it'll normalize the keys to their
lower-cased form. Let's add the test for "git clone" too for good
measure, not just to the "git remote" codepath we're fixing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:02:58 -08:00
11875561bf Merge branch 'ds/chunked-file-api' into tb/reverse-midx
* ds/chunked-file-api:
  commit-graph.c: display correct number of chunks when writing
  chunk-format: add technical docs
  chunk-format: restore duplicate chunk checks
  midx: use 64-bit multiplication for chunk sizes
  midx: use chunk-format read API
  commit-graph: use chunk-format read API
  chunk-format: create read chunk API
  midx: use chunk-format API in write_midx_internal()
  midx: drop chunk progress during write
  midx: return success/failure in chunk write methods
  midx: add num_large_offsets to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add entries to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: rename pack_info to write_midx_context
  commit-graph: use chunk-format write API
  chunk-format: create chunk format write API
  commit-graph: anonymize data in chunk_write_fn
2021-02-24 15:26:14 -08:00
a9926ecd54 l10n: Update Catalan translation
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2021-02-24 09:14:56 +01:00
029bac01a8 Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets
Add targets to compile the various *.o files we declared in commonly
used *_OBJS variables. This is useful for debugging purposes, to
e.g. get to the point where we can compile a git.o. See [1] for a
use-case for this target.

https://lore.kernel.org/git/YBCGtd9if0qtuQxx@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:59 -08:00
abc3c87f3d Makefile: split OBJECTS into OBJECTS and GIT_OBJS
Add a new GIT_OBJS variable, with the objects sufficient to get to a
git.o or common-main.o.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
d6da8b328e Makefile: sort OBJECTS assignment for subsequent change
Change the order of the OBJECTS assignment, this makes a follow-up
change where we split it up into two variables smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
752b3ef972 Makefile: split up long OBJECTS line
Split up the long OBJECTS line into multiple lines using the "+="
assignment we commonly use elsewhere in the Makefile when these lines
get unwieldy.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
bed3419925 Makefile: guard against TEST_OBJS in the environment
Add TEST_OBJS to the list of other *_OBJS variables we reset. We had
already established this pattern when TEST_OBJS was introduced in
daa99a9172 (Makefile: make sure test helpers are rebuilt when headers
change, 2010-01-26), but it wasn't added to the list in that commit
along with the rest.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 09:57:58 -08:00
0fabafd0b9 builtin/repack.c: add '--geometric' option
Often it is useful to both:

  - have relatively few packfiles in a repository, and

  - avoid having so few packfiles in a repository that we repack its
    entire contents regularly

This patch implements a '--geometric=<n>' option in 'git repack'. This
allows the caller to specify that they would like each pack to be at
least a factor times as large as the previous largest pack (by object
count).

Concretely, say that a repository has 'n' packfiles, labeled P1, P2,
..., up to Pn. Each packfile has an object count equal to 'objects(Pn)'.
With a geometric factor of 'r', it should be that:

  objects(Pi) > r*objects(P(i-1))

for all i in [1, n], where the packs are sorted by

  objects(P1) <= objects(P2) <= ... <= objects(Pn).

Since finding a true optimal repacking is NP-hard, we approximate it
along two directions:

  1. We assume that there is a cutoff of packs _before starting the
     repack_ where everything to the right of that cut-off already forms
     a geometric progression (or no cutoff exists and everything must be
     repacked).

  2. We assume that everything smaller than the cutoff count must be
     repacked. This forms our base assumption, but it can also cause
     even the "heavy" packs to get repacked, for e.g., if we have 6
     packs containing the following number of objects:

       1, 1, 1, 2, 4, 32

     then we would place the cutoff between '1, 1' and '1, 2, 4, 32',
     rolling up the first two packs into a pack with 2 objects. That
     breaks our progression and leaves us:

       2, 1, 2, 4, 32
         ^

     (where the '^' indicates the position of our split). To restore a
     progression, we move the split forward (towards larger packs)
     joining each pack into our new pack until a geometric progression
     is restored. Here, that looks like:

       2, 1, 2, 4, 32  ~>  3, 2, 4, 32  ~>  5, 4, 32  ~> ... ~> 9, 32
         ^                   ^                ^                   ^

This has the advantage of not repacking the heavy-side of packs too
often while also only creating one new pack at a time. Another wrinkle
is that we assume that loose, indexed, and reflog'd objects are
insignificant, and lump them into any new pack that we create. This can
lead to non-idempotent results.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
20b031fede packfile: add kept-pack cache for find_kept_pack_entry()
In a recent patch we added a function 'find_kept_pack_entry()' to look
for an object only among kept packs.

While this function avoids doing any lookup work in non-kept packs, it
is still linear in the number of packs, since we have to traverse the
linked list of packs once per object. Let's cache a reduced version of
that list to save us time.

Note that this cache will last the lifetime of the program. We could
invalidate it on reprepare_packed_git(), but there's not much point in
being rigorous here:

  - we might already fail to notice new .keep packs showing up after the
    program starts. We only reprepare_packed_git() when we fail to find
    an object. But adding a new pack won't cause that to happen.
    Somebody repacking could add a new pack and delete an old one, but
    most of the time we'd have a descriptor or mmap open to the old
    pack anyway, so we might not even notice.

  - in pack-objects we already cache the .keep state at startup, since
    56dfeb6263 (pack-objects: compute local/ignore_pack_keep early,
    2016-07-29). So this is just extending that concept further.

  - we don't have to worry about any packed_git being removed; we always
    keep the old structs around, even after reprepare_packed_git()

We do defensively invalidate the cache in case the set of kept packs
being asked for changes (e.g., only in-core kept packs were cached, but
suddenly the caller also wants on-disk kept packs, too). In theory we
could build all three caches and switch between them, but it's not
necessary, since this patch (and series) never changes the set of kept
packs that it wants to inspect from the cache.

So that "optimization" is more about being defensive in the face of
future changes than it is about asking for multiple kinds of kept packs
in this patch.

Here are p5303 results (as always, measured against the kernel):

  Test                                        HEAD^                   HEAD
  -----------------------------------------------------------------------------------------------
  5303.5: repack (1)                          57.34(54.66+10.88)      56.98(54.36+10.98) -0.6%
  5303.6: repack with kept (1)                57.38(54.83+10.49)      57.17(54.97+10.26) -0.4%
  5303.11: repack (50)                        71.70(88.99+4.74)       71.62(88.48+5.08) -0.1%
  5303.12: repack with kept (50)              72.58(89.61+4.78)       71.56(88.80+4.59) -1.4%
  5303.17: repack (1000)                      217.19(491.72+14.25)    217.31(490.82+14.53) +0.1%
  5303.18: repack with kept (1000)            246.12(520.07+14.93)    217.08(490.37+15.10) -11.8%

and the --stdin-packs case, which scales a little bit better (although
not by that much even at 1,000 packs):

  5303.7: repack with --stdin-packs (1)       0.00(0.00+0.00)         0.00(0.00+0.00) =
  5303.13: repack with --stdin-packs (50)     3.43(11.75+0.24)        3.43(11.69+0.30) +0.0%
  5303.19: repack with --stdin-packs (1000)   130.50(307.15+7.66)     125.13(301.36+8.04) -4.1%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
6325da14af builtin/pack-objects.c: rewrite honor-pack-keep logic
Now that we have find_kept_pack_entry(), we don't have to manually keep
hunting through every pack to find a possible "kept" duplicate of the
object. This should be faster, assuming only a portion of your total
packs are actually kept.

Note that we have to re-order the logic a bit here; we can deal with the
disqualifying situations first (e.g., finding the object in a non-local
pack with --local), then "kept" situation(s), and then just fall back to
other "--local" conditions.

Here are the results from p5303 (measurements again taken on the
kernel):

  Test                                        HEAD^                   HEAD
  -----------------------------------------------------------------------------------------------
  5303.5: repack (1)                          57.26(54.59+10.84)      57.34(54.66+10.88) +0.1%
  5303.6: repack with kept (1)                57.33(54.80+10.51)      57.38(54.83+10.49) +0.1%
  5303.11: repack (50)                        71.54(88.57+4.84)       71.70(88.99+4.74) +0.2%
  5303.12: repack with kept (50)              85.12(102.05+4.94)      72.58(89.61+4.78) -14.7%
  5303.17: repack (1000)                      216.87(490.79+14.57)    217.19(491.72+14.25) +0.1%
  5303.18: repack with kept (1000)            665.63(938.87+15.76)    246.12(520.07+14.93) -63.0%

and the --stdin-packs timings:

  5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)         0.00(0.00+0.00) -100.0%
  5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)        3.43(11.75+0.24) -2.8%
  5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)     130.50(307.15+7.66) -33.4%

So our repack with an empty .keep pack is roughly as fast as one without
a .keep pack up to 50 packs. But the --stdin-packs case scales a little
better, too.

Notably, it is faster than a repack of the same size and a kept pack. It
looks at fewer objects, of course, but the penalty for looking at many
packs isn't as costly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
fbf20aeeef p5303: measure time to repack with keep
Add two new tests to measure repack performance. Both tests split the
repository into synthetic "pushes", and then leave the remaining objects
in a big base pack.

The first new test marks an empty pack as "kept" and then passes
--honor-pack-keep to avoid including objects in it. That doesn't change
the resulting pack, but it does let us compare to the normal repack case
to see how much overhead we add to check whether objects are kept or
not.

The other test is of --stdin-packs, which gives us a sense of how that
number scales based on the number of packs we provide as input. In each
of those tests, the empty pack isn't considered, but the residual pack
(objects that were left over and not included in one of the synthetic
push packs) is marked as kept.

(Note that in the single-pack case of the --stdin-packs test, there is
nothing do since there are no non-excluded packs).

Here are some timings on a recent clone of the kernel:

  5303.5: repack (1)                          57.26(54.59+10.84)
  5303.6: repack with kept (1)                57.33(54.80+10.51)

in the 50-pack case, things start to slow down:

  5303.11: repack (50)                        71.54(88.57+4.84)
  5303.12: repack with kept (50)              85.12(102.05+4.94)

and by the time we hit 1,000 packs, things are substantially worse, even
though the resulting pack produced is the same:

  5303.17: repack (1000)                      216.87(490.79+14.57)
  5303.18: repack with kept (1000)            665.63(938.87+15.76)

That's because the code paths around handling .keep files are known to
scale badly; they look in every single pack file to find each object.
Our solution to that was to notice that most repos don't have keep
files, and to make that case a fast path. But as soon as you add a
single .keep, that part of pack-objects slows down again (even if we
have fewer objects total to look at).

Likewise, the scaling is pretty extreme on --stdin-packs (but each
subsequent test is also being asked to do more work):

  5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
  5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
  5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
60bb5f2f5d p5303: add missing &&-chains
These are in a helper function, so the usual chain-lint doesn't notice
them. This function is still not perfect, as it has some git invocations
on the left-hand-side of the pipe, but it's primary purpose is timing,
not finding bugs or correctness issues.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
339bce27f4 builtin/pack-objects.c: add '--stdin-packs' option
In an upcoming commit, 'git repack' will want to create a pack comprised
of all of the objects in some packs (the included packs) excluding any
objects in some other packs (the excluded packs).

This caller could iterate those packs themselves and feed the objects it
finds to 'git pack-objects' directly over stdin, but this approach has a
few downsides:

  - It requires every caller that wants to drive 'git pack-objects' in
    this way to implement pack iteration themselves. This forces the
    caller to think about details like what order objects are fed to
    pack-objects, which callers would likely rather not do.

  - If the set of objects in included packs is large, it requires
    sending a lot of data over a pipe, which is inefficient.

  - The caller is forced to keep track of the excluded objects, too, and
    make sure that it doesn't send any objects that appear in both
    included and excluded packs.

But the biggest downside is the lack of a reachability traversal.
Because the caller passes in a list of objects directly, those objects
don't get a namehash assigned to them, which can have a negative impact
on the delta selection process, causing 'git pack-objects' to fail to
find good deltas even when they exist.

The caller could formulate a reachability traversal themselves, but the
only way to drive 'git pack-objects' in this way is to do a full
traversal, and then remove objects in the excluded packs after the
traversal is complete. This can be detrimental to callers who care
about performance, especially in repositories with many objects.

Introduce 'git pack-objects --stdin-packs' which remedies these four
concerns.

'git pack-objects --stdin-packs' expects a list of pack names on stdin,
where 'pack-xyz.pack' denotes that pack as included, and
'^pack-xyz.pack' denotes it as excluded. The resulting pack includes all
objects that are present in at least one included pack, and aren't
present in any excluded pack.

To address the delta selection problem, 'git pack-objects --stdin-packs'
works as follows. First, it assembles a list of objects that it is going
to pack, as above. Then, a reachability traversal is started, whose tips
are any commits mentioned in included packs. Upon visiting an object, we
find its corresponding object_entry in the to_pack list, and set its
namehash parameter appropriately.

To avoid the traversal visiting more objects than it needs to, the
traversal is halted upon encountering an object which can be found in an
excluded pack (by marking the excluded packs as kept in-core, and
passing --no-kept-objects=in-core to the revision machinery).

This can cause the traversal to halt early, for example if an object in
an included pack is an ancestor of ones in excluded packs. But stopping
early is OK, since filling in the namehash fields of objects in the
to_pack list is only additive (i.e., having it helps the delta selection
process, but leaving it blank doesn't impact the correctness of the
resulting pack).

Even still, it is unlikely that this hurts us much in practice, since
the 'git repack --geometric' caller (which is introduced in a later
commit) marks small packs as included, and large ones as excluded.
During ordinary use, the small packs usually represent pushes after a
large repack, and so are unlikely to be ancestors of objects that
already exist in the repository.

(I found it convenient while developing this patch to have 'git
pack-objects' report the number of objects which were visited and got
their namehash fields filled in during traversal. This is also included
in the below patch via trace2 data lines).

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
c9fff00016 revision: learn '--no-kept-objects'
A future caller will want to be able to perform a reachability traversal
which terminates when visiting an object found in a kept pack. The
closest existing option is '--honor-pack-keep', but this isn't quite
what we want. Instead of halting the traversal midway through, a full
traversal is always performed, and the results are only trimmed
afterwords.

Besides needing to introduce a new flag (since culling results
post-facto can be different than halting the traversal as it's
happening), there is an additional wrinkle handling the distinction
in-core and on-disk kept packs. That is: what kinds of kept pack should
stop the traversal?

Introduce '--no-kept-objects[=<on-disk|in-core>]' to specify which kinds
of kept packs, if any, should stop a traversal. This can be useful for
callers that want to perform a reachability analysis, but want to leave
certain packs alone (for e.g., when doing a geometric repack that has
some "large" packs which are kept in-core that it wants to leave alone).

Note that this option is not guaranteed to produce exactly the set of
objects that aren't in kept packs, since it's possible the traversal
order may end up in a situation where a non-kept ancestor was "cut off"
by a kept object (at which point we would stop traversing). But, we
don't care about absolute correctness here, since this will eventually
be used as a purely additive guide in an upcoming new repack mode.

Explicitly avoid documenting this new flag, since it is only used
internally. In theory we could avoid even adding it rev-list, but being
able to spell this option out on the command-line makes some special
cases easier to test without promising to keep it behaving consistently
forever. Those tricky cases are exercised in t6114.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
f62312e028 packfile: introduce 'find_kept_pack_entry()'
Future callers will want a function to fill a 'struct pack_entry' for a
given object id but _only_ from its position in any kept pack(s).

In particular, an new 'git repack' mode which ensures the resulting
packs form a geometric progress by object count will mark packs that it
does not want to repack as "kept in-core", and it will want to halt a
reachability traversal as soon as it visits an object in any of the kept
packs. But, it does not want to halt the traversal at non-kept, or
.keep packs.

The obvious alternative is 'find_pack_entry()', but this doesn't quite
suffice since it only returns the first pack it finds, which may or may
not be kept (and the mru cache makes it unpredictable which one you'll
get if there are options).

Short of that, you could walk over all packs looking for the object in
each one, but it scales with the number of packs, which may be
prohibitive.

Introduce 'find_kept_pack_entry()', a function which is like
'find_pack_entry()', but only fills in objects in the kept packs.

Handle packs which have .keep files, as well as in-core kept packs
separately, since certain callers will want to distinguish one from the
other. (Though on-disk and in-core kept packs share the adjective
"kept", it is best to think of the two sets as independent.)

There is a gotcha when looking up objects that are duplicated in kept
and non-kept packs, particularly when the MIDX stores the non-kept
version and the caller asked for kept objects only. This could be
resolved by teaching the MIDX to resolve duplicates by always favoring
the kept pack (if one exists), but this breaks an assumption in existing
MIDXs, and so it would require a format change.

The benefit to changing the MIDX in this way is marginal, so we instead
have a more thorough check here which is explained with a comment.

Callers will be added in subsequent patches.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
c1760352e0 grep/pcre2: move definitions of pcre2_{malloc,free}
Move the definitions of the pcre2_{malloc,free} functions above the
compile_pcre2_pattern() function they're used in.

Before the preceding commit they used to be needed earlier, but now we
can move them to be adjacent to the other PCREv2 functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
cbe81e653f grep/pcre2: move back to thread-only PCREv2 structures
Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init().

This change brings it in line with how the rest of the pcre2_* members
in the grep_pat structure are set up.

As noted in the preceding commit the approach 513f2b0bbd (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.

The approach of creating a global context in grep_init() is just added
complexity for almost zero gain. On my system it's 24 bytes saved
per-thread. For comparison PCREv2 will then go on to allocate at least
a kilobyte for its own thread-local state.

As noted in 6d423dd542 (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.

So let's remove this special case and make all of them thread-local
again for simplicity. With this change we could move the
pcre2_{malloc,free} functions around to live closer to their current
use. I'm not doing that here to keep this change small, that cleanup
will be done in a follow-up commit.

See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
8d12851342 grep/pcre2: actually make pcre2 use custom allocator
Continue work started in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
functions for allocation. We'll now use it for all PCREv2 allocations.

The reason 513f2b0bbd worked as a bugfix for the USE_NED_ALLOCATOR
issue is because it targeted the allocation freed via free(), as
opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
and pcre2_maketables_free() pair.

For most of the rest we continued allocating with stock malloc()
inside PCREv2 itself, but didn't segfault because we'd use its
corresponding free().

In a preceding commit of mine I changed the free() to
pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
far as fixing the segfault goes we could revert 513f2b0bbd. But then
we wouldn't use the desired allocator, let's just use it instead.

Before this patch we'd on e.g.:

    grep --threads=1 -iP æ.*var.*xyz

Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
corresponding free() calls. Now it's 12 calls to each. This can be
observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.

Reading the history of how this bug got introduced it wasn't present
in Johannes's original patch[1] to fix the issue.

My reading of that thread is that the approach the follow-up patches
to Johannes's original pursued were based on misunderstanding of how
the PCREv2 API works. In particular this part of [2]:

    "most of the time (like when using UTF-8) the chartable (and
    therefore the global context) is not needed (even when using
    alternate allocators)"

That's simply not how PCREv2 memory allocation works. It's easy to see
how the misunderstanding came about. It's because (as noted above) the
issue was noticed because of our use of free() in our own grep.c for
freeing the memory allocated by pcre2_maketables().

Thus the misunderstanding that PCREv2's compile context is something
only needed for pcre2_maketables(), and e.g. an aborted earlier
attempt[3] to only set it up when we ourselves called
pcre2_maketables().

That's not what PCREv2's compile context is. To quote PCREv2's
documentation:

    "This context just contains pointers to (and data for) external
    memory management functions that are called from several places in
    the PCRE2 library."

Thus the failed attempts to go down the route of only creating the
general context in cases where we ourselves call pcre2_maketables(),
before finally settling on the approach 513f2b0bbd took of always
creating it, but then mostly not using it.

Instead we should always create it, and then pass the general context
to those functions that accept it, so that they'll consistently use
our preferred memory allocation functions.

1. https://lore.kernel.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
b76bf27f6a grep/pcre2: use pcre2_maketables_free() function
Make use of the pcre2_maketables_free() function to free the memory
allocated by pcre2_maketables().

At first sight it's strange that 10da030ab7 (grep: avoid leak of
chartables in PCRE2, 2019-10-16) which added the free() call here
doesn't make use of the pcre2_free() the author introduced in the
preceding commit in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16).

The reason is that at the time the function didn't exist. It was first
introduced in PCREv2 version 10.34, released on 2019-11-21.

Let's make use of it behind a macro. I don't think this matters for
anything to do with custom allocators, but it makes our use of PCREv2
more discoverable.

At some distant point in the future we'll be able to drop the version
guard, as nobody will be running a version older than 10.34.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
797c359978 grep/pcre2: use compile-time PCREv2 version test
Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added
in 95ca1f987e (grep/pcre2: better support invalid UTF-8 haystacks,
2021-01-24) with the same test done at compile-time.

It might be cuter to do this at runtime since we don't have to do the
"major >= 11 || (major >= 10 && ...)" test. But in the next commit
we'll add another version comparison that absolutely needs to be done
at compile-time, so we're better of being consistent across the board.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
a39b4003f0 grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
Add optional printing of PCREv2 allocations to stderr for a developer
who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1".

You need to manually change the definition in the source file similar
to the DEBUG_MAILMAP, there's no Makefile knob for this.

This will be referenced a subsequent commit, and is generally useful
to manually see what's going on with PCREv2 allocations while working
on that code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
588e4fb191 grep/pcre2: prepare to add debugging to pcre2_malloc()
Change pcre2_malloc() in a way that'll make it easier for a debugging
fprintf() to spew out the allocated pointer.

This doesn't introduce any functional change, it just makes a
subsequent commit's diff easier to read. Changes code added in
513f2b0bbd (grep: make PCRE2 aware of custom allocator, 2019-10-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
47eebd2fd2 grep/pcre2: correct reference to grep_init() in comment
Correct a comment added in 513f2b0bbd (grep: make PCRE2 aware of
custom allocator, 2019-10-16). This comment was never correct in
git.git, but was consistent with an older version of the patch[1].

1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:18 -08:00
1cfc5a850c grep/pcre2: drop needless assignment to NULL
Remove a redundant assignment of pcre2_compile_context dating back to
my 94da9193a6 (grep: add support for PCRE v2, 2017-06-01).

In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
need to NULL out individual members here.

I think this was probably something left over from an earlier
development version of mine.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:18 -08:00
0ddf8ceac0 grep/pcre2: drop needless assignment + assert() on opt->pcre2
Drop an assignment added in b65abcafc7 (grep: use PCRE v2 for
optimized fixed-string search, 2019-07-01) and the overly cautious
assert() I added in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01).

There was never a good reason for this, it's just a relic from when I
initially wrote the PCREv2 support. We're not going to have confusion
about compile_pcre2_pattern() being called when it shouldn't just
because we forgot to cargo-cult this opt->pcre2 option.

Furthermore the "struct grep_opt" is (mostly) used for the options the
user supplied, let's avoid the pattern of needlessly assigning to it.

With my recent removal of the PCREv1 backend in 7599730b7e (Remove
support for v1 of the PCRE library, 2021-01-24) there's even less
confusion around what we call where in these codepaths, which is one
more reason to remove this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:18 -08:00
b081547ec1 pretty: add merge and exclude options to %(describe)
Allow restricting the tags used by the placeholder %(describe) with the
options match and exclude.  E.g. the following command describes the
current commit using official version tags, without those for release
candidates:

   $ git log -1 --format='%(describe:match=v[0-9]*,exclude=*rc*)'

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 09:54:33 -08:00
15ae82d5d6 pretty: add %(describe)
Add a format placeholder for describe output.  Implement it by actually
calling git describe, which is simple and guarantees correctness.  It's
intended to be used with $Format:...$ in files with the attribute
export-subst and git archive.  It can also be used with git log etc.,
even though that's going to be slow due to the fork for each commit.

Suggested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 09:54:31 -08:00
adcd9f5472 mailmap: do not respect symlinks for in-tree .mailmap
As with .gitattributes and .gitignore, we would like to make sure that
.mailmap files are handled consistently whether read from the a blob (as
is the default behavior in a bare repo) or from the filesystem.
Likewise, we would like to avoid reading out-of-tree files pointed to by
a symlink, which could have security implications in certain setups.

We can cover both by using open_nofollow() when opening the in-tree
files. We'll continue to follow links for mailmap.file, as well as when
reading .mailmap from the current directory when outside of a repository
entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
feb9b7792f exclude: do not respect symlinks for in-tree .gitignore
As with .gitattributes, we would like to make sure that .gitignore files
are handled consistently whether read from the index or from the
filesystem. Likewise, we would like to avoid reading out-of-tree files
pointed to by the symlinks, which could have security implications in
certain setups.

We can cover both by using open_nofollow() when opening the in-tree
files. We'll continue to follow links for core.excludesFile, as well as
$GIT_DIR/info/exclude.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
2ef579e261 attr: do not respect symlinks for in-tree .gitattributes
The attributes system may sometimes read in-tree files from the
filesystem, and sometimes from the index. In the latter case, we do not
resolve symbolic links (and are not likely to ever start doing so).
Let's open filesystem links with O_NOFOLLOW so that the two cases behave
consistently.

As a bonus, this means that git will not follow such symlinks to read
and parse out-of-tree paths. In some cases this could have security
implications, as a malicious repository can cause Git to open and read
arbitrary files. It could already feed arbitrary content to the parser,
but in certain setups it might be able to exfiltrate data from those
paths (e.g., if an automated service operating on the malicious repo
reveals its stderr to an attacker).

Note that O_NOFOLLOW only prevents following links for the path itself,
not intermediate directories in the path.  At first glance, it seems
like

  ln -s /some/path in-repo

might still look at "in-repo/.gitattributes", following the symlink to
"/some/path/.gitattributes". However, if "in-repo" is a symbolic link,
then we know that it has no git paths below it, and will never look at
its .gitattributes file.

We will continue to support out-of-tree symbolic links (e.g., in
$GIT_DIR/info/attributes); this just affects in-tree links. When a
symbolic link is encountered, the contents are ignored and a warning is
printed. POSIX specifies ELOOP in this case, so the user would generally
see something like:

  warning: unable to access '.gitattributes': Too many levels of symbolic links

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
1679d60bfc exclude: add flags parameter to add_patterns()
There are a number of callers of add_patterns() and its sibling
functions. Let's give them a "flags" parameter for adding new options
without having to touch each caller. We'll use this in a future patch to
add O_NOFOLLOW support. But for now each caller just passes 0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
dbf387d550 attr: convert "macro_ok" into a flags field
The attribute code can have a rather deep callstack, through
which we have to pass the "macro_ok" flag. In anticipation
of adding other flags, let's convert this to a generic
bit-field.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:32 -08:00
00611d8440 add open_nofollow() helper
Some callers of open() would like to use O_NOFOLLOW, but it is not
available on all platforms. Let's abstract this into a helper function
so we can provide system-specific implementations.

Some light web-searching reveals that we might be able to get something
similar on Windows using FILE_FLAG_OPEN_REPARSE_POINT. I didn't dig into
this further.

For other systems without O_NOFOLLOW or any equivalent, we have two
options for fallback:

  - we can just open anyway, following symlinks; this may have security
    implications (e.g., following untrusted in-tree symlinks)

  - we can determine whether the path is a symlink with lstat().

    This is slower (two syscalls instead of one), but that may be
    acceptable for infrequent uses like looking up .gitattributes files
    (especially because we can get away with a single syscall for the
    common case of ENOENT).

    It's also racy, but should be sufficient for our needs (we are
    worried about in-tree symlinks that we ourselves would have
    previously created). We could make it non-racy at the cost of making
    it even slower, by doing an fstat() on the opened descriptor and
    comparing the dev/ino fields to the original lstat().

This patch implements the lstat() option in its slightly-faster racy
form.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:32 -08:00
94f6e3e283 Git 2.30.2
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:51:13 +01:00
e4e68081bb Sync with 2.29.3
* maint-2.29:
  Git 2.29.3
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:51:12 +01:00
0628636d0c Git 2.29.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:50:15 +01:00
d7bdabe52f Sync with 2.28.1
* maint-2.28:
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:14 +01:00
e4f4299859 Git 2.28.1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:50:10 +01:00
3f01e56686 Sync with 2.27.1
* maint-2.27:
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:09 +01:00
6ff7f46039 Git 2.27.1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:50:05 +01:00
2d1142a3e8 Sync with 2.26.3
* maint-2.26:
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:04 +01:00
a79fd20c71 Git 2.26.3
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:50:00 +01:00
8f80393c14 Sync with 2.25.5
* maint-2.25:
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:59 +01:00
42ce4c7930 Git 2.25.5
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:49:55 +01:00
97d1dcb1ef Sync with 2.24.4
* maint-2.24:
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:55 +01:00
06214d171b Git 2.24.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:49:50 +01:00
92ac04b8ee Sync with 2.23.4
* maint-2.23:
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:50 +01:00
d60b6a96f0 Git 2.23.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:49:46 +01:00
4bd06fd490 Sync with 2.22.5
* maint-2.22:
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:45 +01:00
c753e2a7a8 Git 2.22.5
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:49:41 +01:00
bcf08f33d8 Sync with 2.21.4
* maint-2.21:
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:41 +01:00
c735d7470e Git 2.21.4
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:49:36 +01:00
b1726b1a38 Sync with 2.20.5
* maint-2.20:
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:35 +01:00
8b1a5f33d3 Git 2.20.5
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:49:17 +01:00
804963848e Sync with 2.19.6
* maint-2.19:
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:17 +01:00
9fb2a1fb08 Git 2.19.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:47:48 +01:00
fb049fd85b Sync with 2.18.5
* maint-2.18:
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:47:47 +01:00
6eed462c8f Git 2.18.5
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:47:43 +01:00
9b77cec89b Sync with 2.17.6
* maint-2.17:
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:47:42 +01:00
6b82d3eea6 Git 2.17.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:47:02 +01:00
22539ec3b5 unpack_trees(): start with a fresh lstat cache
We really want to avoid relying on stale information.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:47:02 +01:00
0d58fef58a run-command: invalidate lstat cache after a command finished
In the previous commit, we intercepted calls to `rmdir()` to invalidate
the lstat cache in the successful case, so that the lstat cache could
not have the idea that a directory exists where there is none.

The same situation can arise, of course, when a separate process is
spawned (most notably, this is the case in `submodule_move_head()`).
Obviously, we cannot know whether a directory was removed in that
process, therefore we must invalidate the lstat cache afterwards.

Note: in contrast to `lstat_cache_aware_rmdir()`, we invalidate the
lstat cache even in case of an error: the process might have removed a
directory and still have failed afterwards.

Co-authored-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:47:02 +01:00
684dd4c2b4 checkout: fix bug that makes checkout follow symlinks in leading path
Before checking out a file, we have to confirm that all of its leading
components are real existing directories. And to reduce the number of
lstat() calls in this process, we cache the last leading path known to
contain only directories. However, when a path collision occurs (e.g.
when checking out case-sensitive files in case-insensitive file
systems), a cached path might have its file type changed on disk,
leaving the cache on an invalid state. Normally, this doesn't bring
any bad consequences as we usually check out files in index order, and
therefore, by the time the cached path becomes outdated, we no longer
need it anyway (because all files in that directory would have already
been written).

But, there are some users of the checkout machinery that do not always
follow the index order. In particular: checkout-index writes the paths
in the same order that they appear on the CLI (or stdin); and the
delayed checkout feature -- used when a long-running filter process
replies with "status=delayed" -- postpones the checkout of some entries,
thus modifying the checkout order.

When we have to check out an out-of-order entry and the lstat() cache is
invalid (due to a previous path collision), checkout_entry() may end up
using the invalid data and thrusting that the leading components are
real directories when, in reality, they are not. In the best case
scenario, where the directory was replaced by a regular file, the user
will get an error: "fatal: unable to create file 'foo/bar': Not a
directory". But if the directory was replaced by a symlink, checkout
could actually end up following the symlink and writing the file at a
wrong place, even outside the repository. Since delayed checkout is
affected by this bug, it could be used by an attacker to write
arbitrary files during the clone of a maliciously crafted repository.

Some candidate solutions considered were to disable the lstat() cache
during unordered checkouts or sort the entries before passing them to
the checkout machinery. But both ideas include some performance penalty
and they don't future-proof the code against new unordered use cases.

Instead, we now manually reset the lstat cache whenever we successfully
remove a directory. Note: We are not even checking whether the directory
was the same as the lstat cache points to because we might face a
scenario where the paths refer to the same location but differ due to
case folding, precomposed UTF-8 issues, or the presence of `..`
components in the path. Two regression tests, with case-collisions and
utf8-collisions, are also added for both checkout-index and delayed
checkout.

Note: to make the previously mentioned clone attack unfeasible, it would
be sufficient to reset the lstat cache only after the remove_subtree()
call inside checkout_entry(). This is the place where we would remove a
directory whose path collides with the path of another entry that we are
currently trying to check out (possibly a symlink). However, in the
interest of a thorough fix that does not leave Git open to
similar-but-not-identical attack vectors, we decided to intercept
all `rmdir()` calls in one fell swoop.

This addresses CVE-2021-21300.

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
2021-02-12 15:47:02 +01:00
fa153c1cd7 doc/rebase -i: fix typo in the documentation of 'fixup' command
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
9ff6b74bb7 t/t3437: fixup the test 'multiple fixup -c opens editor once'
In the test, FAKE_COMMIT_MESSAGE replaces the commit message each
time it is invoked so there will be only one instance of "Modified-A3"
no matter how many times we invoke the editor. Let's fix this and use
FAKE_COMMIT_AMEND instead so that it adds "Modified-A3" once for each
time the editor is invoked.

This patch also removes the check for counting the number of
"Modified-A3" lines and instead compares the whole message to check
that the commenting code works correctly for 'fixup -c' as well as
'fixup -C'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
9c7650c45c t/t3437: use named commits in the tests
Use the named commits in the tests so that they will still refer to the
same commit if the setup gets changed in the future whereas 'branch~2'
will change which commit it points to.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
d8bd08066d t/t3437: simplify and document the test helpers
Let's simplify the test_commit_message() helper function and add
comments to the function.

This patch also document the working of 'fixup -C' with "amend!" in the
test-description.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
4755fed0a6 t/t3437: check the author date of fixed up commit
Add '%at' format in the get_author() function and update the test to
check that the author date of the fixed up commit is unchanged.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
733ad2e15a t/t3437: remove the dependency of 'expected-message' file from tests
As it is currently implemented, it's too difficult to follow along and
remember the value of "expected-message" from test to test. It also
makes it difficult to extend tests or add new tests in between existing
tests without negatively impacting other tests.

Let's set up "expected-message" to the precise content needed by the
test, so that both the problems go away and also makes easier to run
tests selectively with '--run' or 'GIT_SKIP_TESTS'

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
17665167bb t/t3437: fixup here-docs in the 'setup' test
The most common way to format here-docs in Git test scripts is for the
body and EOF to be indented the same amount as the command which opened
the here-doc. Fix a few here-docs in this script to conform to that
standard and also remove the unnecessary curly braces.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
75ace8329c t/lib-rebase: update the documentation of FAKE_LINES
FAKE_LINES helper function use underscore to embed a space in a single
command. Let's document it and also update the list of commands.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
f07871d302 rebase -i: clarify and fix 'fixup -c' rebase-todo help
When `-c` says "edit the commit message" it's not clear what will be
edited. The original's commit message or the replacement's message or a
combination of the two. Word it such that it states more precisely what
exactly will be edited. While at it, also drop the jarring period and
capitalization, neither of which is otherwise present in the message.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 13:58:19 -08:00
1f9696019a sequencer: rename a few functions
Rename functions to make them more descriptive and while at it, remove
unnecessary 'inline' of the skip_fixupish() function.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-08 13:09:57 -08:00
a25314c1ec sequencer: fixup the datatype of the 'flag' argument
As 'flag' is a combination of bits, so change its datatype from
'enum todo_item_flags' to 'unsigned'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-08 13:09:57 -08:00
2c0aa2ce2e doc/git-rebase: add documentation for fixup [-C|-c] options
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Marc Branchaud <marcnarc@xiplink.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
bae5b4aea5 rebase -i: teach --autosquash to work with amend!
If the commit subject starts with "amend!" then rearrange it like a
"fixup!" commit and replace `pick` command with `fixup -C` command,
which is used to fixup up the content if any and replaces the original
commit message with amend! commit's message.

Original-patch-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
1d410cd8c2 t3437: test script for fixup [-C|-c] options in interactive rebase
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
9e3cebd97c rebase -i: add fixup [-C | -c] command
Add options to `fixup` command to fixup both the commit contents and
message. `fixup -C` command is used to replace the original commit
message and `fixup -c`, additionally allows to edit the commit message.

Original-patch-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
71ee81cd9e sequencer: use const variable for commit message comments
This makes it easier to use and reuse the comments.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
ae70e34f23 sequencer: pass todo_item to do_pick_commit()
As an additional member of the structure todo_item will be required in
future commits pass the complete structure.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
7cdb968254 rebase -i: comment out squash!/fixup! subjects from squash message
When squashing commit messages the squash!/fixup! subjects are not of
interest so comment them out to stop them becoming part of the final
message.

This change breaks a bunch of --autosquash tests which rely on the
"squash! <subject>" line appearing in the final commit message. This is
addressed by adding a second line to the commit message of the "squash!
..." commits and testing for that.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-29 15:21:56 -08:00
a093f0ba95 l10n: ru.po: update Russian translation
Kudos to Philipp Bartsch for whitespace fixes and his helper script[1].

[1]: https://git.grmr.de/phil/pocheck

Signed-off-by: Dimitriy Ryazantcev <dimitriy.ryazantcev@gmail.com>
2021-01-29 21:45:17 +02:00
498bb5b82e sequencer: factor out code to append squash message
This code is going to grow over the next two commits so move it to
its own function.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 17:50:11 -08:00
eab0df0e5b rebase -i: only write fixup-message when it's needed
The file "$GIT_DIR/rebase-merge/fixup-message" is only used for fixup
commands, there's no point in writing it for squash commands as it is
immediately deleted.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 17:50:11 -08:00
604 changed files with 125071 additions and 70284 deletions

View File

@ -186,6 +186,11 @@ jobs:
## Unzip and remove the artifact ## Unzip and remove the artifact
unzip artifacts.zip unzip artifacts.zip
rm artifacts.zip rm artifacts.zip
- name: initialize vcpkg
uses: actions/checkout@v2
with:
repository: 'microsoft/vcpkg'
path: 'compat/vcbuild/vcpkg'
- name: download vcpkg artifacts - name: download vcpkg artifacts
shell: powershell shell: powershell
run: | run: |

2
.gitignore vendored
View File

@ -33,6 +33,7 @@
/git-check-mailmap /git-check-mailmap
/git-check-ref-format /git-check-ref-format
/git-checkout /git-checkout
/git-checkout--worker
/git-checkout-index /git-checkout-index
/git-cherry /git-cherry
/git-cherry-pick /git-cherry-pick
@ -162,6 +163,7 @@
/git-stripspace /git-stripspace
/git-submodule /git-submodule
/git-submodule--helper /git-submodule--helper
/git-subtree
/git-svn /git-svn
/git-switch /git-switch
/git-symbolic-ref /git-symbolic-ref

View File

@ -220,6 +220,7 @@ Philipp A. Hartmann <pah@qo.cx> <ph@sorgh.de>
Philippe Bruhat <book@cpan.org> Philippe Bruhat <book@cpan.org>
Ralf Thielow <ralf.thielow@gmail.com> <ralf.thielow@googlemail.com> Ralf Thielow <ralf.thielow@gmail.com> <ralf.thielow@googlemail.com>
Ramsay Jones <ramsay@ramsayjones.plus.com> <ramsay@ramsay1.demon.co.uk> Ramsay Jones <ramsay@ramsayjones.plus.com> <ramsay@ramsay1.demon.co.uk>
Ramkumar Ramachandra <r@artagnon.com> <artagnon@gmail.com>
Randall S. Becker <randall.becker@nexbridge.ca> <rsbecker@nexbridge.com> Randall S. Becker <randall.becker@nexbridge.ca> <rsbecker@nexbridge.com>
René Scharfe <l.s.r@web.de> <rene.scharfe@lsrfire.ath.cx> René Scharfe <l.s.r@web.de> <rene.scharfe@lsrfire.ath.cx>
René Scharfe <l.s.r@web.de> Rene Scharfe René Scharfe <l.s.r@web.de> Rene Scharfe

View File

@ -175,6 +175,11 @@ For shell scripts specifically (not exhaustive):
does not have such a problem. does not have such a problem.
- Even though "local" is not part of POSIX, we make heavy use of it
in our test suite. We do not use it in scripted Porcelains, and
hopefully nobody starts using "local" before they are reimplemented
in C ;-)
For C programs: For C programs:
@ -498,7 +503,12 @@ Error Messages
- Do not end error messages with a full stop. - Do not end error messages with a full stop.
- Do not capitalize ("unable to open %s", not "Unable to open %s") - Do not capitalize the first word, only because it is the first word
in the message ("unable to open %s", not "Unable to open %s"). But
"SHA-3 not supported" is fine, because the reason the first word is
capitalized is not because it is at the beginning of the sentence,
but because the word would be spelled in capital letters even when
it appeared in the middle of the sentence.
- Say what the error is first ("cannot open %s", not "%s: cannot open") - Say what the error is first ("cannot open %s", not "%s: cannot open")

View File

@ -2,6 +2,8 @@
MAN1_TXT = MAN1_TXT =
MAN5_TXT = MAN5_TXT =
MAN7_TXT = MAN7_TXT =
HOWTO_TXT =
DOC_DEP_TXT =
TECH_DOCS = TECH_DOCS =
ARTICLES = ARTICLES =
SP_ARTICLES = SP_ARTICLES =
@ -42,6 +44,11 @@ MAN7_TXT += gittutorial-2.txt
MAN7_TXT += gittutorial.txt MAN7_TXT += gittutorial.txt
MAN7_TXT += gitworkflows.txt MAN7_TXT += gitworkflows.txt
HOWTO_TXT += $(wildcard howto/*.txt)
DOC_DEP_TXT += $(wildcard *.txt)
DOC_DEP_TXT += $(wildcard config/*.txt)
ifdef MAN_FILTER ifdef MAN_FILTER
MAN_TXT = $(filter $(MAN_FILTER),$(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT)) MAN_TXT = $(filter $(MAN_FILTER),$(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT))
else else
@ -76,6 +83,7 @@ SP_ARTICLES += howto/rebuild-from-update-hook
SP_ARTICLES += howto/rebase-from-internal-branch SP_ARTICLES += howto/rebase-from-internal-branch
SP_ARTICLES += howto/keep-canonical-history-correct SP_ARTICLES += howto/keep-canonical-history-correct
SP_ARTICLES += howto/maintain-git SP_ARTICLES += howto/maintain-git
SP_ARTICLES += howto/coordinate-embargoed-releases
API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt))) API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))
SP_ARTICLES += $(API_DOCS) SP_ARTICLES += $(API_DOCS)
@ -90,6 +98,7 @@ TECH_DOCS += technical/multi-pack-index
TECH_DOCS += technical/pack-format TECH_DOCS += technical/pack-format
TECH_DOCS += technical/pack-heuristics TECH_DOCS += technical/pack-heuristics
TECH_DOCS += technical/pack-protocol TECH_DOCS += technical/pack-protocol
TECH_DOCS += technical/parallel-checkout
TECH_DOCS += technical/partial-clone TECH_DOCS += technical/partial-clone
TECH_DOCS += technical/protocol-capabilities TECH_DOCS += technical/protocol-capabilities
TECH_DOCS += technical/protocol-common TECH_DOCS += technical/protocol-common
@ -284,7 +293,7 @@ docdep_prereqs = \
mergetools-list.made $(mergetools_txt) \ mergetools-list.made $(mergetools_txt) \
cmd-list.made $(cmds_txt) cmd-list.made $(cmds_txt)
doc.dep : $(docdep_prereqs) $(wildcard *.txt) $(wildcard config/*.txt) build-docdep.perl doc.dep : $(docdep_prereqs) $(DOC_DEP_TXT) build-docdep.perl
$(QUIET_GEN)$(RM) $@+ $@ && \ $(QUIET_GEN)$(RM) $@+ $@ && \
$(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \ $(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \
mv $@+ $@ mv $@+ $@
@ -427,9 +436,9 @@ $(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml
$(DOCBOOK2X_TEXI) --to-stdout $*.xml >$@+ && \ $(DOCBOOK2X_TEXI) --to-stdout $*.xml >$@+ && \
mv $@+ $@ mv $@+ $@
howto-index.txt: howto-index.sh $(wildcard howto/*.txt) howto-index.txt: howto-index.sh $(HOWTO_TXT)
$(QUIET_GEN)$(RM) $@+ $@ && \ $(QUIET_GEN)$(RM) $@+ $@ && \
'$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(wildcard howto/*.txt)) >$@+ && \ '$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(HOWTO_TXT)) >$@+ && \
mv $@+ $@ mv $@+ $@
$(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt $(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt
@ -438,7 +447,7 @@ $(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt
WEBDOC_DEST = /pub/software/scm/git/docs WEBDOC_DEST = /pub/software/scm/git/docs
howto/%.html: ASCIIDOC_EXTRA += -a git-relative-html-prefix=../ howto/%.html: ASCIIDOC_EXTRA += -a git-relative-html-prefix=../
$(patsubst %.txt,%.html,$(wildcard howto/*.txt)): %.html : %.txt GIT-ASCIIDOCFLAGS $(patsubst %.txt,%.html,$(HOWTO_TXT)): %.html : %.txt GIT-ASCIIDOCFLAGS
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \ $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
sed -e '1,/^$$/d' $< | \ sed -e '1,/^$$/d' $< | \
$(TXT_TO_HTML) - >$@+ && \ $(TXT_TO_HTML) - >$@+ && \
@ -470,7 +479,13 @@ print-man1:
@for i in $(MAN1_TXT); do echo $$i; done @for i in $(MAN1_TXT); do echo $$i; done
lint-docs:: lint-docs::
$(QUIET_LINT)$(PERL_PATH) lint-gitlink.perl $(QUIET_LINT)$(PERL_PATH) lint-gitlink.perl \
$(HOWTO_TXT) $(DOC_DEP_TXT) \
--section=1 $(MAN1_TXT) \
--section=5 $(MAN5_TXT) \
--section=7 $(MAN7_TXT); \
$(PERL_PATH) lint-man-end-blurb.perl $(MAN_TXT); \
$(PERL_PATH) lint-man-section-order.perl $(MAN_TXT);
ifeq ($(wildcard po/Makefile),po/Makefile) ifeq ($(wildcard po/Makefile),po/Makefile)
doc-l10n install-l10n:: doc-l10n install-l10n::

View File

@ -0,0 +1,16 @@
Git v2.17.6 Release Notes
=========================
This release addresses the security issues CVE-2021-21300.
Fixes since v2.17.5
-------------------
* CVE-2021-21300:
On case-insensitive file systems with support for symbolic links,
if Git is configured globally to apply delay-capable clean/smudge
filters (such as Git LFS), Git could be fooled into running
remote code during a clone.
Credit for finding and fixing this vulnerability goes to Matheus
Tavares, helped by Johannes Schindelin.

View File

@ -0,0 +1,6 @@
Git v2.18.5 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6 to address
the security issue CVE-2021-21300; see the release notes for that
version for details.

View File

@ -0,0 +1,6 @@
Git v2.19.6 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6 and
v2.18.5 to address the security issue CVE-2021-21300; see the
release notes for these versions for details.

View File

@ -0,0 +1,6 @@
Git v2.20.5 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5
and v2.19.6 to address the security issue CVE-2021-21300; see
the release notes for these versions for details.

View File

@ -0,0 +1,6 @@
Git v2.21.4 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6 and v2.20.5 to address the security issue CVE-2021-21300;
see the release notes for these versions for details.

View File

@ -0,0 +1,7 @@
Git v2.22.5 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6,
v2.18.5, v2.19.6, v2.20.5 and v2.21.4 to address the security
issue CVE-2021-21300; see the release notes for these versions
for details.

View File

@ -0,0 +1,7 @@
Git v2.23.4 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6, v2.20.5, v2.21.4 and v2.22.5 to address the security
issue CVE-2021-21300; see the release notes for these versions
for details.

View File

@ -0,0 +1,7 @@
Git v2.24.4 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6, v2.20.5, v2.21.4, v2.22.5 and v2.23.4 to address the
security issue CVE-2021-21300; see the release notes for these
versions for details.

View File

@ -0,0 +1,7 @@
Git v2.25.5 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4 and v2.24.4 to address
the security issue CVE-2021-21300; see the release notes for
these versions for details.

View File

@ -0,0 +1,7 @@
Git v2.26.3 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4 and v2.25.5
to address the security issue CVE-2021-21300; see the release
notes for these versions for details.

View File

@ -0,0 +1,7 @@
Git v2.27.1 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4, v2.25.5
and v2.26.3 to address the security issue CVE-2021-21300; see
the release notes for these versions for details.

View File

@ -0,0 +1,7 @@
Git v2.28.1 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4, v2.25.5,
v2.26.3 and v2.27.1 to address the security issue CVE-2021-21300;
see the release notes for these versions for details.

View File

@ -0,0 +1,8 @@
Git v2.29.3 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6,
v2.18.5, v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4,
v2.25.5, v2.26.3, v2.27.1 and v2.28.1 to address the security
issue CVE-2021-21300; see the release notes for these versions
for details.

View File

@ -0,0 +1,8 @@
Git v2.30.2 Release Notes
=========================
This release merges up the fixes that appear in v2.17.6, v2.18.5,
v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4, v2.25.5,
v2.26.3, v2.27.1, v2.28.1 and v2.29.3 to address the security
issue CVE-2021-21300; see the release notes for these versions
for details.

View File

@ -0,0 +1,24 @@
Git v2.30.2 Release Notes
=========================
This release addresses the security issue CVE-2022-24765.
Fixes since v2.30.2
-------------------
* Build fix on Windows.
* Fix `GIT_CEILING_DIRECTORIES` with Windows-style root directories.
* CVE-2022-24765:
On multi-user machines, Git users might find themselves
unexpectedly in a Git worktree, e.g. when another user created a
repository in `C:\.git`, in a mounted network drive or in a
scratch space. Merely having a Git-aware prompt that runs `git
status` (or `git diff`) and navigating to a directory which is
supposedly not a Git worktree, or opening such a directory in an
editor or IDE such as VS Code or Atom, will potentially run
commands defined by that other user.
Credit for finding this vulnerability goes to 俞晨东; The fix was
authored by Johannes Schindelin.

View File

@ -0,0 +1,21 @@
Git v2.30.4 Release Notes
=========================
This release contains minor fix-ups for the changes that went into
Git 2.30.3, which was made to address CVE-2022-24765.
* The code that was meant to parse the new `safe.directory`
configuration variable was not checking what configuration
variable was being fed to it, which has been corrected.
* '*' can be used as the value for the `safe.directory` variable to
signal that the user considers that any directory is safe.
Derrick Stolee (2):
t0033: add tests for safe.directory
setup: opt-out of check with safe.directory=*
Matheus Valadares (1):
setup: fix safe.directory key not being checked

View File

@ -0,0 +1,12 @@
Git v2.30.5 Release Notes
=========================
This release contains minor fix-ups for the changes that went into
Git 2.30.3 and 2.30.4, addressing CVE-2022-29187.
* The safety check that verifies a safe ownership of the Git
worktree is now extended to also cover the ownership of the Git
directory (and the `.git` file, if there is any).
Carlo Marcelo Arenas Belón (1):
setup: tighten ownership checks post CVE-2022-24765

View File

@ -0,0 +1,60 @@
Git v2.30.6 Release Notes
=========================
This release addresses the security issues CVE-2022-39253 and
CVE-2022-39260.
Fixes since v2.30.5
-------------------
* CVE-2022-39253:
When relying on the `--local` clone optimization, Git dereferences
symbolic links in the source repository before creating hardlinks
(or copies) of the dereferenced link in the destination repository.
This can lead to surprising behavior where arbitrary files are
present in a repository's `$GIT_DIR` when cloning from a malicious
repository.
Git will no longer dereference symbolic links via the `--local`
clone mechanism, and will instead refuse to clone repositories that
have symbolic links present in the `$GIT_DIR/objects` directory.
Additionally, the value of `protocol.file.allow` is changed to be
"user" by default.
* CVE-2022-39260:
An overly-long command string given to `git shell` can result in
overflow in `split_cmdline()`, leading to arbitrary heap writes and
remote code execution when `git shell` is exposed and the directory
`$HOME/git-shell-commands` exists.
`git shell` is taught to refuse interactive commands that are
longer than 4MiB in size. `split_cmdline()` is hardened to reject
inputs larger than 2GiB.
Credit for finding CVE-2022-39253 goes to Cory Snider of Mirantis. The
fix was authored by Taylor Blau, with help from Johannes Schindelin.
Credit for finding CVE-2022-39260 goes to Kevin Backhouse of GitHub.
The fix was authored by Kevin Backhouse, Jeff King, and Taylor Blau.
Jeff King (2):
shell: add basic tests
shell: limit size of interactive commands
Kevin Backhouse (1):
alias.c: reject too-long cmdline strings in split_cmdline()
Taylor Blau (11):
builtin/clone.c: disallow `--local` clones with symlinks
t/lib-submodule-update.sh: allow local submodules
t/t1NNN: allow local submodules
t/2NNNN: allow local submodules
t/t3NNN: allow local submodules
t/t4NNN: allow local submodules
t/t5NNN: allow local submodules
t/t6NNN: allow local submodules
t/t7NNN: allow local submodules
t/t9NNN: allow local submodules
transport: make `protocol.file.allow` be "user" by default

View File

@ -16,6 +16,8 @@ Backward incompatible and other important changes
* The support for deprecated PCRE1 library has been dropped. * The support for deprecated PCRE1 library has been dropped.
* Fixes for CVE-2021-21300 in Git 2.30.2 (and earlier) is included.
UI, Workflows & Features UI, Workflows & Features
@ -199,7 +201,7 @@ Performance, Internal Implementation, Development Support etc.
* Preliminary changes to fsmonitor integration. * Preliminary changes to fsmonitor integration.
* Performance optimization work on the rename detection continues. * Performance improvements for rename detection.
* The common code to deal with "chunked file format" that is shared * The common code to deal with "chunked file format" that is shared
by the multi-pack-index and commit-graph files have been factored by the multi-pack-index and commit-graph files have been factored
@ -222,6 +224,11 @@ Performance, Internal Implementation, Development Support etc.
* Raise the buffer size used when writing the index file out from * Raise the buffer size used when writing the index file out from
(obviously too small) 8kB to (clearly sufficiently large) 128kB. (obviously too small) 8kB to (clearly sufficiently large) 128kB.
* It is reported that open() on some platforms (e.g. macOS Big Sur)
can return EINTR even though our timers are set up with SA_RESTART.
A workaround has been implemented and enabled for macOS to rerun
open() transparently from the caller when this happens.
Fixes since v2.30 Fixes since v2.30
----------------- -----------------

View File

@ -0,0 +1,27 @@
Git 2.31.1 Release Notes
========================
Fixes since v2.31
-----------------
* The fsmonitor interface read from its input without making sure
there is something to read from. This bug is new in 2.31
timeframe.
* The data structure used by fsmonitor interface was not properly
duplicated during an in-core merge, leading to use-after-free etc.
* "git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well. This regression
has been corrected.
* Fix macros that can silently inject unintended null-statements.
* CALLOC_ARRAY() macro replaces many uses of xcalloc().
* Update insn in Makefile comments to run fuzz-all target.
* Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.
Also contains various documentation updates and code clean-ups.

View File

@ -0,0 +1,6 @@
Git v2.31.2 Release Notes
=========================
This release merges up the fixes that appear in v2.30.3 to address
the security issue CVE-2022-24765; see the release notes for that
version for details.

View File

@ -0,0 +1,4 @@
Git Documentation/RelNotes/2.31.3.txt Release Notes
=========================
This release merges up the fixes that appear in v2.31.3.

View File

@ -0,0 +1,6 @@
Git v2.31.4 Release Notes
=========================
This release merges up the fixes that appear in v2.30.5 to address
the security issue CVE-2022-29187; see the release notes for that
version for details.

View File

@ -0,0 +1,5 @@
Git v2.31.5 Release Notes
=========================
This release merges the security fix that appears in v2.30.6; see
the release notes for that version for details.

View File

@ -0,0 +1,416 @@
Git 2.32 Release Notes
======================
Backward compatibility notes
----------------------------
* ".gitattributes", ".gitignore", and ".mailmap" files that are
symbolic links are ignored.
* "git apply --3way" used to first attempt a straight application,
and only fell back to the 3-way merge algorithm when the stright
application failed. Starting with this version, the command will
first try the 3-way merge algorithm and only when it fails (either
resulting with conflict or the base versions of blobs are missing),
falls back to the usual patch application.
Updates since v2.31
-------------------
UI, Workflows & Features
* It does not make sense to make ".gitattributes", ".gitignore" and
".mailmap" symlinks, as they are supposed to be usable from the
object store (think: bare repositories where HEAD:.mailmap etc. are
used). When these files are symbolic links, we used to read the
contents of the files pointed by them by mistake, which has been
corrected.
* "git stash show" learned to optionally show untracked part of the
stash.
* "git log --format='...'" learned "%(describe)" placeholder.
* "git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size). A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.
* The http codepath learned to let the credential layer to cache the
password used to unlock a certificate that has successfully been
used.
* "git commit --fixup=<commit>", which was to tweak the changes made
to the contents while keeping the original log message intact,
learned "--fixup=(amend|reword):<commit>", that can be used to
tweak both the message and the contents, and only the message,
respectively.
* "git send-email" learned to honor the core.hooksPath configuration.
* "git format-patch -v<n>" learned to allow a reroll count that is
not an integer.
* "git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.
* "git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.
* A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.
* "gitweb" learned "e-mail privacy" feature to redact strings that
look like e-mail addresses on various pages.
* "git apply --3way" has always been "to fall back to 3-way merge
only when straight application fails". Swap the order of falling
back so that 3-way is always attempted first (only when the option
is given, of course) and then straight patch application is used as
a fallback when it fails.
* "git apply" now takes "--3way" and "--cached" at the same time, and
work and record results only in the index.
* The command line completion (in contrib/) has learned that
CHERRY_PICK_HEAD is a possible pseudo-ref.
* Userdiff patterns for "Scheme" has been added.
* "git log" learned "--diff-merges=<style>" option, with an
associated configuration variable log.diffMerges.
* "git log --format=..." placeholders learned %ah/%ch placeholders to
request the --date=human output.
* Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
system-wide configuration file with GIT_CONFIG_SYSTEM that lets
users specify from which file to read the system-wide configuration
(setting it to an empty file would essentially be the same as
setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
per-user configuration in $HOME/.gitconfig.
* "git add" and "git rm" learned not to touch those paths that are
outside of sparse checkout.
* "git rev-list" learns the "--filter=object:type=<type>" option,
which can be used to exclude objects of the given kind from the
packfile generated by pack-objects.
* The command line completion (in contrib/) for "git stash" has been
updated.
* "git subtree" updates.
* It is now documented that "format-patch" skips merges.
* Options to "git pack-objects" that take numeric values like
--window and --depth should not accept negative values; the input
validation has been tightened.
* The way the command line specified by the trailer.<token>.command
configuration variable receives the end-user supplied value was
both error prone and misleading. An alternative to achieve the
same goal in a safer and more intuitive way has been added, as
the trailer.<token>.cmd configuration variable, to replace it.
* "git add -i --dry-run" does not dry-run, which was surprising. The
combination of options has taught to error out.
* "git push" learns to discover common ancestor with the receiving
end over protocol v2. This will hopefully make "git push" as
efficient as "git fetch" in avoiding objects from getting
transferred unnecessarily.
* "git mailinfo" (hence "git am") learned the "--quoted-cr" option to
control how lines ending with CRLF wrapped in base64 or qp are
handled.
Performance, Internal Implementation, Development Support etc.
* Rename detection rework continues.
* GIT_TEST_FAIL_PREREQS is a mechanism to skip test pieces with
prerequisites to catch broken tests that depend on the side effects
of optional pieces, but did not work at all when negative
prerequisites were involved.
(merge 27d578d904 jk/fail-prereq-testfix later to maint).
* "git diff-index" codepath has been taught to trust fsmonitor status
to reduce number of lstat() calls.
(merge 7e5aa13d2c nk/diff-index-fsmonitor later to maint).
* Reorganize Makefile to allow building git.o and other essential
objects without extra stuff needed only for testing.
* Preparatory API changes for parallel checkout.
* A simple IPC interface gets introduced to build services like
fsmonitor on top.
* Fsck API clean-up.
* SECURITY.md that is facing individual contributors and end users
has been introduced. Also a procedure to follow when preparing
embargoed releases has been spelled out.
(merge 09420b7648 js/security-md later to maint).
* Optimize "rev-list --use-bitmap-index --objects" corner case that
uses negative tags as the stopping points.
* CMake update for vsbuild.
* An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.
* Generate [ec]tags under $(QUIET_GEN).
* Clean-up codepaths that implements "git send-email --validate"
option and improves the message from it.
* The last remnant of gettext-poison has been removed.
* The test framework has been taught to optionally turn the default
merge strategy to "ort" throughout the system where we use
three-way merges internally, like cherry-pick, rebase etc.,
primarily to enhance its test coverage (the strategy has been
available as an explicit "-s ort" choice).
* A bit of code clean-up and a lot of test clean-up around userdiff
area.
* Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).
* When packet_write() fails, we gave an extra error message
unnecessarily, which has been corrected.
* The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.
* Show errno in the trace output in the error codepath that calls
read_raw_ref method.
* Effort to make the command line completion (in contrib/) safe with
"set -u" continues.
* Tweak a few tests for "log --format=..." that show timestamps in
various formats.
* The reflog expiry machinery has been taught to emit trace events.
* Over-the-wire protocol learns a new request type to ask for object
sizes given a list of object names.
Fixes since v2.31
-----------------
* The fsmonitor interface read from its input without making sure
there is something to read from. This bug is new in 2.31
timeframe.
* The data structure used by fsmonitor interface was not properly
duplicated during an in-core merge, leading to use-after-free etc.
* "git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well. This regression
has been corrected.
* Fix macros that can silently inject unintended null-statements.
* CALLOC_ARRAY() macro replaces many uses of xcalloc().
* Update insn in Makefile comments to run fuzz-all target.
* Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.
* We had a code to diagnose and die cleanly when a required
clean/smudge filter is missing, but an assert before that
unnecessarily fired, hiding the end-user facing die() message.
(merge 6fab35f748 mt/cleanly-die-upon-missing-required-filter later to maint).
* Update C code that sets a few configuration variables when a remote
is configured so that it spells configuration variable names in the
canonical camelCase.
(merge 0f1da600e6 ab/remote-write-config-in-camel-case later to maint).
* A new configuration variable has been introduced to allow choosing
which version of the generation number gets used in the
commit-graph file.
(merge 702110aac6 ds/commit-graph-generation-config later to maint).
* Perf test update to work better in secondary worktrees.
(merge 36e834abc1 jk/perf-in-worktrees later to maint).
* Updates to memory allocation code around the use of pcre2 library.
(merge c1760352e0 ab/grep-pcre2-allocfix later to maint).
* "git -c core.bare=false clone --bare ..." would have segfaulted,
which has been corrected.
(merge 75555676ad bc/clone-bare-with-conflicting-config later to maint).
* When "git checkout" removes a path that does not exist in the
commit it is checking out, it wasn't careful enough not to follow
symbolic links, which has been corrected.
(merge fab78a0c3d mt/checkout-remove-nofollow later to maint).
* A few option description strings started with capital letters,
which were corrected.
(merge 5ee90326dc cc/downcase-opt-help later to maint).
* Plug or annotate remaining leaks that trigger while running the
very basic set of tests.
(merge 68ffe095a2 ah/plugleaks later to maint).
* The hashwrite() API uses a buffering mechanism to avoid calling
write(2) too frequently. This logic has been refactored to be
easier to understand.
(merge ddaf1f62e3 ds/clarify-hashwrite later to maint).
* "git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.
(merge 39edfd5cbc en/sequencer-edit-upon-conflict-fix later to maint).
* "git daemon" has been tightened against systems that take backslash
as directory separator.
(merge 9a7f1ce8b7 rs/daemon-sanitize-dir-sep later to maint).
* A NULL-dereference bug has been corrected in an error codepath in
"git for-each-ref", "git branch --list" etc.
(merge c685450880 jk/ref-filter-segfault-fix later to maint).
* Streamline the codepath to fix the UTF-8 encoding issues in the
argv[] and the prefix on macOS.
(merge c7d0e61016 tb/precompose-prefix-simplify later to maint).
* The command-line completion script (in contrib/) had a couple of
references that would have given a warning under the "-u" (nounset)
option.
(merge c5c0548d79 vs/completion-with-set-u later to maint).
* When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.
(merge 8e118e8490 jk/pack-objects-bitmap-progress-fix later to maint).
* The dependencies for config-list.h and command-list.h were broken
when the former was split out of the latter, which has been
corrected.
(merge 56550ea718 sg/bugreport-fixes later to maint).
* "git push --quiet --set-upstream" was not quiet when setting the
upstream branch configuration, which has been corrected.
(merge f3cce896a8 ow/push-quiet-set-upstream later to maint).
* The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.
(merge 32f67888d8 ds/maintenance-prefetch-fix later to maint).
* Clarify that pathnames recorded in Git trees are most often (but
not necessarily) encoded in UTF-8.
(merge 9364bf465d ab/pathname-encoding-doc later to maint).
* "git --config-env var=val cmd" weren't accepted (only
--config-env=var=val was).
(merge c331551ccf ps/config-env-option-with-separate-value later to maint).
* When the reachability bitmap is in effect, the "do not lose
recently created objects and those that are reachable from them"
safety to protect us from races were disabled by mistake, which has
been corrected.
(merge 2ba582ba4c jk/prune-with-bitmap-fix later to maint).
* Cygwin pathname handling fix.
(merge bccc37fdc7 ad/cygwin-no-backslashes-in-paths later to maint).
* "git rebase --[no-]reschedule-failed-exec" did not work well with
its configuration variable, which has been corrected.
(merge e5b32bffd1 ab/rebase-no-reschedule-failed-exec later to maint).
* Portability fix for command line completion script (in contrib/).
(merge f2acf763e2 si/zsh-complete-comment-fix later to maint).
* "git repack -A -d" in a partial clone unnecessarily loosened
objects in promisor pack.
* "git bisect skip" when custom words are used for new/old did not
work, which has been corrected.
* A few variants of informational message "Already up-to-date" has
been rephrased.
(merge ad9322da03 js/merge-already-up-to-date-message-reword later to maint).
* "git submodule update --quiet" did not propagate the quiet option
down to underlying "git fetch", which has been corrected.
(merge 62af4bdd42 nc/submodule-update-quiet later to maint).
* Document that our test can use "local" keyword.
(merge a84fd3bcc6 jc/test-allows-local later to maint).
* The word-diff mode has been taught to work better with a word
regexp that can match an empty string.
(merge 0324e8fc6b pw/word-diff-zero-width-matches later to maint).
* "git p4" learned to find branch points more efficiently.
(merge 6b79818bfb jk/p4-locate-branch-point-optim later to maint).
* When "git update-ref -d" removes a ref that is packed, it left
empty directories under $GIT_DIR/refs/ for
(merge 5f03e5126d wc/packed-ref-removal-cleanup later to maint).
* "git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.
(merge b548f0f156 en/dir-traversal later to maint).
* The handling of "%(push)" formatting element of "for-each-ref" and
friends was broken when the same codepath started handling
"%(push:<what>)", which has been corrected.
(merge 1e1c4c5eac zh/ref-filter-push-remote-fix later to maint).
* The bash prompt script (in contrib/) did not work under "set -u".
(merge 5c0cbdb107 en/prompt-under-set-u later to maint).
* The "chainlint" feature in the test framework is a handy way to
catch common mistakes in writing new tests, but tends to get
expensive. An knob to selectively disable it has been introduced
to help running tests that the developer has not modified.
(merge 2d86a96220 jk/test-chainlint-softer later to maint).
* The "rev-parse" command did not diagnose the lack of argument to
"--path-format" option, which was introduced in v2.31 era, which
has been corrected.
(merge 99fc555188 wm/rev-parse-path-format-wo-arg later to maint).
* Other code cleanup, docfix, build fix, etc.
(merge f451960708 dl/cat-file-doc-cleanup later to maint).
(merge 12604a8d0c sv/t9801-test-path-is-file-cleanup later to maint).
(merge ea7e63921c jr/doc-ignore-typofix later to maint).
(merge 23c781f173 ps/update-ref-trans-hook-doc later to maint).
(merge 42efa1231a jk/filter-branch-sha256 later to maint).
(merge 4c8e3dca6e tb/push-simple-uses-branch-merge-config later to maint).
(merge 6534d436a2 bs/asciidoctor-installation-hints later to maint).
(merge 47957485b3 ab/read-tree later to maint).
(merge 2be927f3d1 ab/diff-no-index-tests later to maint).
(merge 76593c09bb ab/detox-gettext-tests later to maint).
(merge 28e29ee38b jc/doc-format-patch-clarify later to maint).
(merge fc12b6fdde fm/user-manual-use-preface later to maint).
(merge dba94e3a85 cc/test-helper-bloom-usage-fix later to maint).
(merge 61a7660516 hn/reftable-tables-doc-update later to maint).
(merge 81ed96a9b2 jt/fetch-pack-request-fix later to maint).
(merge 151b6c2dd7 jc/doc-do-not-capitalize-clarification later to maint).
(merge 9160068ac6 js/access-nul-emulation-on-windows later to maint).
(merge 7a14acdbe6 po/diff-patch-doc later to maint).
(merge f91371b948 pw/patience-diff-clean-up later to maint).
(merge 3a7f0908b6 mt/clean-clean later to maint).
(merge d4e2d15a8b ab/streaming-simplify later to maint).
(merge 0e59f7ad67 ah/merge-ort-i18n later to maint).
(merge e6f68f62e0 ls/typofix later to maint).

View File

@ -0,0 +1,6 @@
Git v2.32.1 Release Notes
=========================
This release merges up the fixes that appear in v2.30.3 and
v2.31.2 to address the security issue CVE-2022-24765; see the
release notes for these versions for details.

View File

@ -0,0 +1,4 @@
Git Documentation/RelNotes/2.32.2.txt Release Notes
=========================
This release merges up the fixes that appear in v2.32.2.

View File

@ -0,0 +1,6 @@
Git v2.32.3 Release Notes
=========================
This release merges up the fixes that appear in v2.30.5 and
v2.31.4 to address the security issue CVE-2022-29187; see the
release notes for these versions for details.

View File

@ -0,0 +1,5 @@
Git v2.32.4 Release Notes
=========================
This release merges the security fix that appears in v2.30.6; see
the release notes for that version for details.

View File

@ -117,10 +117,13 @@ If in doubt which identifier to use, run `git log --no-merges` on the
files you are modifying to see the current conventions. files you are modifying to see the current conventions.
[[summary-section]] [[summary-section]]
It's customary to start the remainder of the first line after "area: " The title sentence after the "area:" prefix omits the full stop at the
with a lower-case letter. E.g. "doc: clarify...", not "doc: end, and its first word is not capitalized unless there is a reason to
Clarify...", or "githooks.txt: improve...", not "githooks.txt: capitalize it other than because it is the first word in the sentence.
Improve...". E.g. "doc: clarify...", not "doc: Clarify...", or "githooks.txt:
improve...", not "githooks.txt: Improve...". But "refs: HEAD is also
treated as a ref" is correct, as we spell `HEAD` in all caps even when
it appears in the middle of a sentence.
[[meaningful-message]] [[meaningful-message]]
The body should provide a meaningful commit message, which: The body should provide a meaningful commit message, which:

View File

@ -46,7 +46,7 @@ Subsection names are case sensitive and can contain any characters except
newline and the null byte. Doublequote `"` and backslash can be included newline and the null byte. Doublequote `"` and backslash can be included
by escaping them as `\"` and `\\`, respectively. Backslashes preceding by escaping them as `\"` and `\\`, respectively. Backslashes preceding
other characters are dropped when reading; for example, `\t` is read as other characters are dropped when reading; for example, `\t` is read as
`t` and `\0` is read as `0` Section headers cannot span multiple lines. `t` and `\0` is read as `0`. Section headers cannot span multiple lines.
Variables may belong directly to a section or to a given subsection. You Variables may belong directly to a section or to a given subsection. You
can have `[section]` if you have `[section "subsection"]`, but you don't can have `[section]` if you have `[section "subsection"]`, but you don't
need to. need to.
@ -440,6 +440,8 @@ include::config/rerere.txt[]
include::config/reset.txt[] include::config/reset.txt[]
include::config/safe.txt[]
include::config/sendemail.txt[] include::config/sendemail.txt[]
include::config/sequencer.txt[] include::config/sequencer.txt[]

View File

@ -119,4 +119,8 @@ advice.*::
addEmptyPathspec:: addEmptyPathspec::
Advice shown if a user runs the add command without providing Advice shown if a user runs the add command without providing
the pathspec parameter. the pathspec parameter.
updateSparsePath::
Advice shown when either linkgit:git-add[1] or linkgit:git-rm[1]
is asked to update index entries outside the current sparse
checkout.
-- --

View File

@ -21,3 +21,24 @@ checkout.guess::
Provides the default value for the `--guess` or `--no-guess` Provides the default value for the `--guess` or `--no-guess`
option in `git checkout` and `git switch`. See option in `git checkout` and `git switch`. See
linkgit:git-switch[1] and linkgit:git-checkout[1]. linkgit:git-switch[1] and linkgit:git-checkout[1].
checkout.workers::
The number of parallel workers to use when updating the working tree.
The default is one, i.e. sequential execution. If set to a value less
than one, Git will use as many workers as the number of logical cores
available. This setting and `checkout.thresholdForParallelism` affect
all commands that perform checkout. E.g. checkout, clone, reset,
sparse-checkout, etc.
+
Note: parallel checkout usually delivers better performance for repositories
located on SSDs or over NFS. For repositories on spinning disks and/or machines
with a small number of cores, the default sequential checkout often performs
better. The size and compression level of a repository might also influence how
well the parallel version performs.
checkout.thresholdForParallelism::
When running parallel checkout with a small number of files, the cost
of subprocess spawning and inter-process communication might outweigh
the parallelization gains. This setting allows to define the minimum
number of files for which parallel checkout should be attempted. The
default is 100.

View File

@ -2,3 +2,7 @@ clone.defaultRemoteName::
The name of the remote to create when cloning a repository. Defaults to The name of the remote to create when cloning a repository. Defaults to
`origin`, and can be overridden by passing the `--origin` command-line `origin`, and can be overridden by passing the `--origin` command-line
option to linkgit:git-clone[1]. option to linkgit:git-clone[1].
clone.rejectShallow::
Reject to clone a repository if it is a shallow one, can be overridden by
passing option `--reject-shallow` in command line. See linkgit:git-clone[1]

View File

@ -1,3 +1,9 @@
commitGraph.generationVersion::
Specifies the type of generation number version to use when writing
or reading the commit-graph file. If version 1 is specified, then
the corrected commit dates will not be written or read. Defaults to
2.
commitGraph.maxNewFilters:: commitGraph.maxNewFilters::
Specifies the default value for the `--max-new-filters` option of `git Specifies the default value for the `--max-new-filters` option of `git
commit-graph write` (c.f., linkgit:git-commit-graph[1]). commit-graph write` (c.f., linkgit:git-commit-graph[1]).

View File

@ -14,6 +14,11 @@ index.recordOffsetTable::
Defaults to 'true' if index.threads has been explicitly enabled, Defaults to 'true' if index.threads has been explicitly enabled,
'false' otherwise. 'false' otherwise.
index.sparse::
When enabled, write the index using sparse-directory entries. This
has no effect unless `core.sparseCheckout` and
`core.sparseCheckoutCone` are both enabled. Defaults to 'false'.
index.threads:: index.threads::
Specifies the number of threads to spawn when loading the index. Specifies the number of threads to spawn when loading the index.
This is meant to reduce index load time on multiprocessor machines. This is meant to reduce index load time on multiprocessor machines.

View File

@ -24,6 +24,11 @@ log.excludeDecoration::
the config option can be overridden by the `--decorate-refs` the config option can be overridden by the `--decorate-refs`
option. option.
log.diffMerges::
Set default diff format to be used for merge commits. See
`--diff-merges` in linkgit:git-log[1] for details.
Defaults to `separate`.
log.follow:: log.follow::
If `true`, `git log` will act as if the `--follow` option was used when If `true`, `git log` will act as if the `--follow` option was used when
a single <path> is given. This has the same limitations as `--follow`, a single <path> is given. This has the same limitations as `--follow`,

View File

@ -53,7 +53,7 @@ mergetool.hideResolved::
resolution. This flag causes 'LOCAL' and 'REMOTE' to be overwriten so resolution. This flag causes 'LOCAL' and 'REMOTE' to be overwriten so
that only the unresolved conflicts are presented to the merge tool. Can that only the unresolved conflicts are presented to the merge tool. Can
be configured per-tool via the `mergetool.<tool>.hideResolved` be configured per-tool via the `mergetool.<tool>.hideResolved`
configuration variable. Defaults to `true`. configuration variable. Defaults to `false`.
mergetool.keepBackup:: mergetool.keepBackup::
After performing a merge, the original file with conflict markers After performing a merge, the original file with conflict markers

View File

@ -122,6 +122,21 @@ pack.useSparse::
commits contain certain types of direct renames. Default is commits contain certain types of direct renames. Default is
`true`. `true`.
pack.preferBitmapTips::
When selecting which commits will receive bitmaps, prefer a
commit at the tip of any reference that is a suffix of any value
of this configuration over any other commits in the "selection
window".
+
Note that setting this configuration to `refs/foo` does not mean that
the commits at the tips of `refs/foo/bar` and `refs/foo/baz` will
necessarily be selected. This is because commits are selected for
bitmaps from within a series of windows of variable length.
+
If a commit at the tip of any reference which is a suffix of any value
of this configuration is seen in a window, it is immediately given
preference over any other commit in that window.
pack.writeBitmaps (deprecated):: pack.writeBitmaps (deprecated)::
This is a deprecated synonym for `repack.writeBitmaps`. This is a deprecated synonym for `repack.writeBitmaps`.

View File

@ -1,10 +1,10 @@
protocol.allow:: protocol.allow::
If set, provide a user defined default policy for all protocols which If set, provide a user defined default policy for all protocols which
don't explicitly have a policy (`protocol.<name>.allow`). By default, don't explicitly have a policy (`protocol.<name>.allow`). By default,
if unset, known-safe protocols (http, https, git, ssh, file) have a if unset, known-safe protocols (http, https, git, ssh) have a
default policy of `always`, known-dangerous protocols (ext) have a default policy of `always`, known-dangerous protocols (ext) have a
default policy of `never`, and all other protocols have a default default policy of `never`, and all other protocols (including file)
policy of `user`. Supported policies: have a default policy of `user`. Supported policies:
+ +
-- --

View File

@ -120,3 +120,10 @@ push.useForceIfIncludes::
`--force-if-includes` as an option to linkgit:git-push[1] `--force-if-includes` as an option to linkgit:git-push[1]
in the command line. Adding `--no-force-if-includes` at the in the command line. Adding `--no-force-if-includes` at the
time of push overrides this configuration setting. time of push overrides this configuration setting.
push.negotiate::
If set to "true", attempt to reduce the size of the packfile
sent by rounds of negotiation in which the client and the
server attempt to find commits in common. If "false", Git will
rely solely on the server's ref advertisement to find commits
in common.

View File

@ -1,10 +1,3 @@
rebase.useBuiltin::
Unused configuration variable. Used in Git versions 2.20 and
2.21 as an escape hatch to enable the legacy shellscript
implementation of rebase. Now the built-in rewrite of it in C
is always used. Setting this will emit a warning, to alert any
remaining users that setting this now does nothing.
rebase.backend:: rebase.backend::
Default backend to use for rebasing. Possible choices are Default backend to use for rebasing. Possible choices are
'apply' or 'merge'. In the future, if the merge backend gains 'apply' or 'merge'. In the future, if the merge backend gains

View File

@ -0,0 +1,42 @@
safe.directory::
These config entries specify Git-tracked directories that are
considered safe even if they are owned by someone other than the
current user. By default, Git will refuse to even parse a Git
config of a repository owned by someone else, let alone run its
hooks, and this config setting allows users to specify exceptions,
e.g. for intentionally shared repositories (see the `--shared`
option in linkgit:git-init[1]).
+
This is a multi-valued setting, i.e. you can add more than one directory
via `git config --add`. To reset the list of safe directories (e.g. to
override any such directories specified in the system config), add a
`safe.directory` entry with an empty value.
+
This config setting is only respected when specified in a system or global
config, not when it is specified in a repository config or via the command
line option `-c safe.directory=<path>`.
+
The value of this setting is interpolated, i.e. `~/<path>` expands to a
path relative to the home directory and `%(prefix)/<path>` expands to a
path relative to Git's (runtime) prefix.
+
To completely opt-out of this security check, set `safe.directory` to the
string `*`. This will allow all repositories to be treated as if their
directory was listed in the `safe.directory` list. If `safe.directory=*`
is set in system config and you want to re-enable this protection, then
initialize your list with an empty value before listing the repositories
that you deem safe.
+
As explained, Git only allows you to access repositories owned by
yourself, i.e. the user who is running Git, by default. When Git
is running as 'root' in a non Windows platform that provides sudo,
however, git checks the SUDO_UID environment variable that sudo creates
and will allow access to the uid recorded as its value in addition to
the id from 'root'.
This is to make it easy to perform a common sequence during installation
"make && sudo make install". A git process running under 'sudo' runs as
'root' but the 'sudo' command exports the environment variable to record
which id the original user has.
If that is not what you would prefer and want git to only trust
repositories that are owned by root instead, then you can remove
the `SUDO_UID` variable from root's environment before invoking git.

View File

@ -5,6 +5,11 @@ stash.useBuiltin::
is always used. Setting this will emit a warning, to alert any is always used. Setting this will emit a warning, to alert any
remaining users that setting this now does nothing. remaining users that setting this now does nothing.
stash.showIncludeUntracked::
If this is set to true, the `git stash show` command will show
the untracked files of a stash entry. Defaults to false. See
description of 'show' command in linkgit:git-stash[1].
stash.showPatch:: stash.showPatch::
If this is set to true, the `git stash show` command without an If this is set to true, the `git stash show` command without an
option will show the stash entry in patch form. Defaults to false. option will show the stash entry in patch form. Defaults to false.

View File

@ -59,15 +59,16 @@ uploadpack.allowFilter::
uploadpackfilter.allow:: uploadpackfilter.allow::
Provides a default value for unspecified object filters (see: the Provides a default value for unspecified object filters (see: the
below configuration variable). below configuration variable). If set to `true`, this will also
enable all filters which get added in the future.
Defaults to `true`. Defaults to `true`.
uploadpackfilter.<filter>.allow:: uploadpackfilter.<filter>.allow::
Explicitly allow or ban the object filter corresponding to Explicitly allow or ban the object filter corresponding to
`<filter>`, where `<filter>` may be one of: `blob:none`, `<filter>`, where `<filter>` may be one of: `blob:none`,
`blob:limit`, `tree`, `sparse:oid`, or `combine`. If using `blob:limit`, `object:type`, `tree`, `sparse:oid`, or `combine`.
combined filters, both `combine` and all of the nested filter If using combined filters, both `combine` and all of the nested
kinds must be allowed. Defaults to `uploadpackfilter.allow`. filter kinds must be allowed. Defaults to `uploadpackfilter.allow`.
uploadpackfilter.tree.maxDepth:: uploadpackfilter.tree.maxDepth::
Only allow `--filter=tree:<n>` when `<n>` is no more than the value of Only allow `--filter=tree:<n>` when `<n>` is no more than the value of

View File

@ -11,7 +11,7 @@ linkgit:git-diff-files[1]
with the `-p` option produces patch text. with the `-p` option produces patch text.
You can customize the creation of patch text via the You can customize the creation of patch text via the
`GIT_EXTERNAL_DIFF` and the `GIT_DIFF_OPTS` environment variables `GIT_EXTERNAL_DIFF` and the `GIT_DIFF_OPTS` environment variables
(see linkgit:git[1]). (see linkgit:git[1]), and the `diff` attribute (see linkgit:gitattributes[5]).
What the -p option produces is slightly different from the traditional What the -p option produces is slightly different from the traditional
diff format: diff format:
@ -74,6 +74,11 @@ separate lines indicate the old and the new mode.
rename from b rename from b
rename to a rename to a
5. Hunk headers mention the name of the function to which the hunk
applies. See "Defining a custom hunk-header" in
linkgit:gitattributes[5] for details of how to tailor to this to
specific languages.
Combined diff format Combined diff format
-------------------- --------------------

View File

@ -34,7 +34,7 @@ endif::git-diff[]
endif::git-format-patch[] endif::git-format-patch[]
ifdef::git-log[] ifdef::git-log[]
--diff-merges=(off|none|first-parent|1|separate|m|combined|c|dense-combined|cc):: --diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
--no-diff-merges:: --no-diff-merges::
Specify diff format to be used for merge commits. Default is Specify diff format to be used for merge commits. Default is
{diff-merges-default} unless `--first-parent` is in use, in which case {diff-merges-default} unless `--first-parent` is in use, in which case
@ -45,17 +45,24 @@ ifdef::git-log[]
Disable output of diffs for merge commits. Useful to override Disable output of diffs for merge commits. Useful to override
implied value. implied value.
+ +
--diff-merges=on:::
--diff-merges=m:::
-m:::
This option makes diff output for merge commits to be shown in
the default format. `-m` will produce the output only if `-p`
is given as well. The default format could be changed using
`log.diffMerges` configuration parameter, which default value
is `separate`.
+
--diff-merges=first-parent::: --diff-merges=first-parent:::
--diff-merges=1::: --diff-merges=1:::
This option makes merge commits show the full diff with This option makes merge commits show the full diff with
respect to the first parent only. respect to the first parent only.
+ +
--diff-merges=separate::: --diff-merges=separate:::
--diff-merges=m:::
-m:::
This makes merge commits show the full diff with respect to This makes merge commits show the full diff with respect to
each of the parents. Separate log entry and diff is generated each of the parents. Separate log entry and diff is generated
for each parent. `-m` doesn't produce any output without `-p`. for each parent.
+ +
--diff-merges=combined::: --diff-merges=combined:::
--diff-merges=c::: --diff-merges=c:::
@ -293,11 +300,14 @@ explained for the configuration variable `core.quotePath` (see
linkgit:git-config[1]). linkgit:git-config[1]).
--name-only:: --name-only::
Show only names of changed files. Show only names of changed files. The file names are often encoded in UTF-8.
For more information see the discussion about encoding in the linkgit:git-log[1]
manual page.
--name-status:: --name-status::
Show only names and status of changed files. See the description Show only names and status of changed files. See the description
of the `--diff-filter` option on what the status letters mean. of the `--diff-filter` option on what the status letters mean.
Just like `--name-only` the file names are often encoded in UTF-8.
--submodule[=<format>]:: --submodule[=<format>]::
Specify how differences in submodules are shown. When specifying Specify how differences in submodules are shown. When specifying

View File

@ -110,6 +110,11 @@ ifndef::git-pull[]
setting `fetch.writeCommitGraph`. setting `fetch.writeCommitGraph`.
endif::git-pull[] endif::git-pull[]
--prefetch::
Modify the configured refspec to place all refs into the
`refs/prefetch/` namespace. See the `prefetch` task in
linkgit:git-maintenance[1].
-p:: -p::
--prune:: --prune::
Before fetching, remove any remote-tracking references that no Before fetching, remove any remote-tracking references that no

View File

@ -15,6 +15,7 @@ SYNOPSIS
[--whitespace=<option>] [-C<n>] [-p<n>] [--directory=<dir>] [--whitespace=<option>] [-C<n>] [-p<n>] [--directory=<dir>]
[--exclude=<path>] [--include=<path>] [--reject] [-q | --quiet] [--exclude=<path>] [--include=<path>] [--reject] [-q | --quiet]
[--[no-]scissors] [-S[<keyid>]] [--patch-format=<format>] [--[no-]scissors] [-S[<keyid>]] [--patch-format=<format>]
[--quoted-cr=<action>]
[(<mbox> | <Maildir>)...] [(<mbox> | <Maildir>)...]
'git am' (--continue | --skip | --abort | --quit | --show-current-patch[=(diff|raw)]) 'git am' (--continue | --skip | --abort | --quit | --show-current-patch[=(diff|raw)])
@ -59,6 +60,9 @@ OPTIONS
--no-scissors:: --no-scissors::
Ignore scissors lines (see linkgit:git-mailinfo[1]). Ignore scissors lines (see linkgit:git-mailinfo[1]).
--quoted-cr=<action>::
This flag will be passed down to 'git mailinfo' (see linkgit:git-mailinfo[1]).
-m:: -m::
--message-id:: --message-id::
Pass the `-m` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]), Pass the `-m` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]),

View File

@ -84,12 +84,13 @@ OPTIONS
-3:: -3::
--3way:: --3way::
When the patch does not apply cleanly, fall back on 3-way merge if Attempt 3-way merge if the patch records the identity of blobs it is supposed
the patch records the identity of blobs it is supposed to apply to, to apply to and we have those blobs available locally, possibly leaving the
and we have those blobs available locally, possibly leaving the
conflict markers in the files in the working tree for the user to conflict markers in the files in the working tree for the user to
resolve. This option implies the `--index` option, and is incompatible resolve. This option implies the `--index` option unless the
with the `--reject` and the `--cached` options. `--cached` option is used, and is incompatible with the `--reject` option.
When used with the `--cached` option, any conflicts are left at higher stages
in the cache.
--build-fake-ancestor=<file>:: --build-fake-ancestor=<file>::
Newer 'git diff' output has embedded 'index information' Newer 'git diff' output has embedded 'index information'

View File

@ -35,42 +35,42 @@ OPTIONS
-t:: -t::
Instead of the content, show the object type identified by Instead of the content, show the object type identified by
<object>. `<object>`.
-s:: -s::
Instead of the content, show the object size identified by Instead of the content, show the object size identified by
<object>. `<object>`.
-e:: -e::
Exit with zero status if <object> exists and is a valid Exit with zero status if `<object>` exists and is a valid
object. If <object> is of an invalid format exit with non-zero and object. If `<object>` is of an invalid format exit with non-zero and
emits an error on stderr. emits an error on stderr.
-p:: -p::
Pretty-print the contents of <object> based on its type. Pretty-print the contents of `<object>` based on its type.
<type>:: <type>::
Typically this matches the real type of <object> but asking Typically this matches the real type of `<object>` but asking
for a type that can trivially be dereferenced from the given for a type that can trivially be dereferenced from the given
<object> is also permitted. An example is to ask for a `<object>` is also permitted. An example is to ask for a
"tree" with <object> being a commit object that contains it, "tree" with `<object>` being a commit object that contains it,
or to ask for a "blob" with <object> being a tag object that or to ask for a "blob" with `<object>` being a tag object that
points at it. points at it.
--textconv:: --textconv::
Show the content as transformed by a textconv filter. In this case, Show the content as transformed by a textconv filter. In this case,
<object> has to be of the form <tree-ish>:<path>, or :<path> in `<object>` has to be of the form `<tree-ish>:<path>`, or `:<path>` in
order to apply the filter to the content recorded in the index at order to apply the filter to the content recorded in the index at
<path>. `<path>`.
--filters:: --filters::
Show the content as converted by the filters configured in Show the content as converted by the filters configured in
the current working tree for the given <path> (i.e. smudge filters, the current working tree for the given `<path>` (i.e. smudge filters,
end-of-line conversion, etc). In this case, <object> has to be of end-of-line conversion, etc). In this case, `<object>` has to be of
the form <tree-ish>:<path>, or :<path>. the form `<tree-ish>:<path>`, or `:<path>`.
--path=<path>:: --path=<path>::
For use with --textconv or --filters, to allow specifying an object For use with `--textconv` or `--filters`, to allow specifying an object
name and a path separately, e.g. when it is difficult to figure out name and a path separately, e.g. when it is difficult to figure out
the revision from which the blob came. the revision from which the blob came.
@ -115,15 +115,15 @@ OPTIONS
repository. repository.
--allow-unknown-type:: --allow-unknown-type::
Allow -s or -t to query broken/corrupt objects of unknown type. Allow `-s` or `-t` to query broken/corrupt objects of unknown type.
--follow-symlinks:: --follow-symlinks::
With --batch or --batch-check, follow symlinks inside the With `--batch` or `--batch-check`, follow symlinks inside the
repository when requesting objects with extended SHA-1 repository when requesting objects with extended SHA-1
expressions of the form tree-ish:path-in-tree. Instead of expressions of the form tree-ish:path-in-tree. Instead of
providing output about the link itself, provide output about providing output about the link itself, provide output about
the linked-to object. If a symlink points outside the the linked-to object. If a symlink points outside the
tree-ish (e.g. a link to /foo or a root-level link to ../foo), tree-ish (e.g. a link to `/foo` or a root-level link to `../foo`),
the portion of the link which is outside the tree will be the portion of the link which is outside the tree will be
printed. printed.
+ +
@ -175,15 +175,15 @@ respectively print:
OUTPUT OUTPUT
------ ------
If `-t` is specified, one of the <type>. If `-t` is specified, one of the `<type>`.
If `-s` is specified, the size of the <object> in bytes. If `-s` is specified, the size of the `<object>` in bytes.
If `-e` is specified, no output, unless the <object> is malformed. If `-e` is specified, no output, unless the `<object>` is malformed.
If `-p` is specified, the contents of <object> are pretty-printed. If `-p` is specified, the contents of `<object>` are pretty-printed.
If <type> is specified, the raw (though uncompressed) contents of the <object> If `<type>` is specified, the raw (though uncompressed) contents of the `<object>`
will be returned. will be returned.
BATCH OUTPUT BATCH OUTPUT
@ -200,7 +200,7 @@ object, with placeholders of the form `%(atom)` expanded, followed by a
newline. The available atoms are: newline. The available atoms are:
`objectname`:: `objectname`::
The 40-hex object name of the object. The full hex representation of the object name.
`objecttype`:: `objecttype`::
The type of the object (the same as `cat-file -t` reports). The type of the object (the same as `cat-file -t` reports).
@ -215,8 +215,9 @@ newline. The available atoms are:
`deltabase`:: `deltabase`::
If the object is stored as a delta on-disk, this expands to the If the object is stored as a delta on-disk, this expands to the
40-hex sha1 of the delta base object. Otherwise, expands to the full hex representation of the delta base object name.
null sha1 (40 zeroes). See `CAVEATS` below. Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
below.
`rest`:: `rest`::
If this atom is used in the output string, input lines are split If this atom is used in the output string, input lines are split
@ -235,14 +236,14 @@ newline.
For example, `--batch` without a custom format would produce: For example, `--batch` without a custom format would produce:
------------ ------------
<sha1> SP <type> SP <size> LF <oid> SP <type> SP <size> LF
<contents> LF <contents> LF
------------ ------------
Whereas `--batch-check='%(objectname) %(objecttype)'` would produce: Whereas `--batch-check='%(objectname) %(objecttype)'` would produce:
------------ ------------
<sha1> SP <type> LF <oid> SP <type> LF
------------ ------------
If a name is specified on stdin that cannot be resolved to an object in If a name is specified on stdin that cannot be resolved to an object in
@ -258,7 +259,7 @@ If a name is specified that might refer to more than one object (an ambiguous sh
<object> SP ambiguous LF <object> SP ambiguous LF
------------ ------------
If --follow-symlinks is used, and a symlink in the repository points If `--follow-symlinks` is used, and a symlink in the repository points
outside the repository, then `cat-file` will ignore any custom format outside the repository, then `cat-file` will ignore any custom format
and print: and print:
@ -267,11 +268,11 @@ symlink SP <size> LF
<symlink> LF <symlink> LF
------------ ------------
The symlink will either be absolute (beginning with a /), or relative The symlink will either be absolute (beginning with a `/`), or relative
to the tree root. For instance, if dir/link points to ../../foo, then to the tree root. For instance, if dir/link points to `../../foo`, then
<symlink> will be ../foo. <size> is the size of the symlink in bytes. `<symlink>` will be `../foo`. `<size>` is the size of the symlink in bytes.
If --follow-symlinks is used, the following error messages will be If `--follow-symlinks` is used, the following error messages will be
displayed: displayed:
------------ ------------

View File

@ -15,7 +15,7 @@ SYNOPSIS
[--dissociate] [--separate-git-dir <git dir>] [--dissociate] [--separate-git-dir <git dir>]
[--depth <depth>] [--[no-]single-branch] [--no-tags] [--depth <depth>] [--[no-]single-branch] [--no-tags]
[--recurse-submodules[=<pathspec>]] [--[no-]shallow-submodules] [--recurse-submodules[=<pathspec>]] [--[no-]shallow-submodules]
[--[no-]remote-submodules] [--jobs <n>] [--sparse] [--[no-]remote-submodules] [--jobs <n>] [--sparse] [--[no-]reject-shallow]
[--filter=<filter>] [--] <repository> [--filter=<filter>] [--] <repository>
[<directory>] [<directory>]
@ -149,6 +149,11 @@ objects from the source repository into a pack in the cloned repository.
--no-checkout:: --no-checkout::
No checkout of HEAD is performed after the clone is complete. No checkout of HEAD is performed after the clone is complete.
--[no-]reject-shallow::
Fail if the source repository is a shallow repository.
The 'clone.rejectShallow' configuration variable can be used to
specify the default.
--bare:: --bare::
Make a 'bare' Git repository. That is, instead of Make a 'bare' Git repository. That is, instead of
creating `<directory>` and placing the administrative creating `<directory>` and placing the administrative

View File

@ -9,12 +9,13 @@ SYNOPSIS
-------- --------
[verse] [verse]
'git commit' [-a | --interactive | --patch] [-s] [-v] [-u<mode>] [--amend] 'git commit' [-a | --interactive | --patch] [-s] [-v] [-u<mode>] [--amend]
[--dry-run] [(-c | -C | --fixup | --squash) <commit>] [--dry-run] [(-c | -C | --squash) <commit> | --fixup [(amend|reword):]<commit>)]
[-F <file> | -m <msg>] [--reset-author] [--allow-empty] [-F <file> | -m <msg>] [--reset-author] [--allow-empty]
[--allow-empty-message] [--no-verify] [-e] [--author=<author>] [--allow-empty-message] [--no-verify] [-e] [--author=<author>]
[--date=<date>] [--cleanup=<mode>] [--[no-]status] [--date=<date>] [--cleanup=<mode>] [--[no-]status]
[-i | -o] [--pathspec-from-file=<file> [--pathspec-file-nul]] [-i | -o] [--pathspec-from-file=<file> [--pathspec-file-nul]]
[-S[<keyid>]] [--] [<pathspec>...] [(--trailer <token>[(=|:)<value>])...] [-S[<keyid>]]
[--] [<pathspec>...]
DESCRIPTION DESCRIPTION
----------- -----------
@ -86,11 +87,44 @@ OPTIONS
Like '-C', but with `-c` the editor is invoked, so that Like '-C', but with `-c` the editor is invoked, so that
the user can further edit the commit message. the user can further edit the commit message.
--fixup=<commit>:: --fixup=[(amend|reword):]<commit>::
Construct a commit message for use with `rebase --autosquash`. Create a new commit which "fixes up" `<commit>` when applied with
The commit message will be the subject line from the specified `git rebase --autosquash`. Plain `--fixup=<commit>` creates a
commit with a prefix of "fixup! ". See linkgit:git-rebase[1] "fixup!" commit which changes the content of `<commit>` but leaves
for details. its log message untouched. `--fixup=amend:<commit>` is similar but
creates an "amend!" commit which also replaces the log message of
`<commit>` with the log message of the "amend!" commit.
`--fixup=reword:<commit>` creates an "amend!" commit which
replaces the log message of `<commit>` with its own log message
but makes no changes to the content of `<commit>`.
+
The commit created by plain `--fixup=<commit>` has a subject
composed of "fixup!" followed by the subject line from <commit>,
and is recognized specially by `git rebase --autosquash`. The `-m`
option may be used to supplement the log message of the created
commit, but the additional commentary will be thrown away once the
"fixup!" commit is squashed into `<commit>` by
`git rebase --autosquash`.
+
The commit created by `--fixup=amend:<commit>` is similar but its
subject is instead prefixed with "amend!". The log message of
<commit> is copied into the log message of the "amend!" commit and
opened in an editor so it can be refined. When `git rebase
--autosquash` squashes the "amend!" commit into `<commit>`, the
log message of `<commit>` is replaced by the refined log message
from the "amend!" commit. It is an error for the "amend!" commit's
log message to be empty unless `--allow-empty-message` is
specified.
+
`--fixup=reword:<commit>` is shorthand for `--fixup=amend:<commit>
--only`. It creates an "amend!" commit with only a log message
(ignoring any changes staged in the index). When squashed by `git
rebase --autosquash`, it replaces the log message of `<commit>`
without making any other changes.
+
Neither "fixup!" nor "amend!" commits change authorship of
`<commit>` when applied by `git rebase --autosquash`.
See linkgit:git-rebase[1] for details.
--squash=<commit>:: --squash=<commit>::
Construct a commit message for use with `rebase --autosquash`. Construct a commit message for use with `rebase --autosquash`.
@ -166,6 +200,17 @@ The `-m` option is mutually exclusive with `-c`, `-C`, and `-F`.
include::signoff-option.txt[] include::signoff-option.txt[]
--trailer <token>[(=|:)<value>]::
Specify a (<token>, <value>) pair that should be applied as a
trailer. (e.g. `git commit --trailer "Signed-off-by:C O Mitter \
<committer@example.com>" --trailer "Helped-by:C O Mitter \
<committer@example.com>"` will add the "Signed-off-by" trailer
and the "Helped-by" trailer to the commit message.)
The `trailer.*` configuration variables
(linkgit:git-interpret-trailers[1]) can be used to define if
a duplicated trailer is omitted, where in the run of trailers
each trailer would appear, and other details.
-n:: -n::
--no-verify:: --no-verify::
This option bypasses the pre-commit and commit-msg hooks. This option bypasses the pre-commit and commit-msg hooks.

View File

@ -340,6 +340,11 @@ GIT_CONFIG::
Using the "--global" option forces this to ~/.gitconfig. Using the Using the "--global" option forces this to ~/.gitconfig. Using the
"--system" option forces this to $(prefix)/etc/gitconfig. "--system" option forces this to $(prefix)/etc/gitconfig.
GIT_CONFIG_GLOBAL::
GIT_CONFIG_SYSTEM::
Take the configuration from the given files instead from global or
system-level configuration. See linkgit:git[1] for details.
GIT_CONFIG_NOSYSTEM:: GIT_CONFIG_NOSYSTEM::
Whether to skip reading settings from the system-wide Whether to skip reading settings from the system-wide
$(prefix)/etc/gitconfig file. See linkgit:git[1] for details. $(prefix)/etc/gitconfig file. See linkgit:git[1] for details.

View File

@ -159,3 +159,7 @@ empty string.
+ +
Components which are missing from the URL (e.g., there is no Components which are missing from the URL (e.g., there is no
username in the example above) will be left unset. username in the example above) will be left unset.
GIT
---
Part of the linkgit:git[1] suite

View File

@ -24,6 +24,18 @@ Usage:
[verse] [verse]
'git-cvsserver' [<options>] [pserver|server] [<directory> ...] 'git-cvsserver' [<options>] [pserver|server] [<directory> ...]
DESCRIPTION
-----------
This application is a CVS emulation layer for Git.
It is highly functional. However, not all methods are implemented,
and for those methods that are implemented,
not all switches are implemented.
Testing has been done using both the CLI CVS client, and the Eclipse CVS
plugin. Most functionality works fine with both of these clients.
OPTIONS OPTIONS
------- -------
@ -57,18 +69,6 @@ access still needs to be enabled by the `gitcvs.enabled` config option
unless `--export-all` was given, too. unless `--export-all` was given, too.
DESCRIPTION
-----------
This application is a CVS emulation layer for Git.
It is highly functional. However, not all methods are implemented,
and for those methods that are implemented,
not all switches are implemented.
Testing has been done using both the CLI CVS client, and the Eclipse CVS
plugin. Most functionality works fine with both of these clients.
LIMITATIONS LIMITATIONS
----------- -----------

View File

@ -36,11 +36,28 @@ SYNOPSIS
DESCRIPTION DESCRIPTION
----------- -----------
Prepare each commit with its patch in Prepare each non-merge commit with its "patch" in
one file per commit, formatted to resemble UNIX mailbox format. one "message" per commit, formatted to resemble a UNIX mailbox.
The output of this command is convenient for e-mail submission or The output of this command is convenient for e-mail submission or
for use with 'git am'. for use with 'git am'.
A "message" generated by the command consists of three parts:
* A brief metadata header that begins with `From <commit>`
with a fixed `Mon Sep 17 00:00:00 2001` datestamp to help programs
like "file(1)" to recognize that the file is an output from this
command, fields that record the author identity, the author date,
and the title of the change (taken from the first paragraph of the
commit log message).
* The second and subsequent paragraphs of the commit log message.
* The "patch", which is the "diff -p --stat" output (see
linkgit:git-diff[1]) between the commit and its parent.
The log message and the patch is separated by a line with a
three-dash line.
There are two ways to specify which commits to operate on. There are two ways to specify which commits to operate on.
1. A single commit, <since>, specifies that the commits leading 1. A single commit, <since>, specifies that the commits leading
@ -221,6 +238,11 @@ populated with placeholder text.
`--subject-prefix` option) has ` v<n>` appended to it. E.g. `--subject-prefix` option) has ` v<n>` appended to it. E.g.
`--reroll-count=4` may produce `v4-0001-add-makefile.patch` `--reroll-count=4` may produce `v4-0001-add-makefile.patch`
file that has "Subject: [PATCH v4 1/20] Add makefile" in it. file that has "Subject: [PATCH v4 1/20] Add makefile" in it.
`<n>` does not have to be an integer (e.g. "--reroll-count=4.4",
or "--reroll-count=4rev2" are allowed), but the downside of
using such a reroll-count is that the range-diff/interdiff
with the previous version does not state exactly which
version the new interation is compared against.
--to=<email>:: --to=<email>::
Add a `To:` header to the email headers. This is in addition Add a `To:` header to the email headers. This is in addition
@ -718,6 +740,14 @@ use it only when you know the recipient uses Git to apply your patch.
$ git format-patch -3 $ git format-patch -3
------------ ------------
CAVEATS
-------
Note that `format-patch` will omit merge commits from the output, even
if they are part of the requested range. A simple "patch" does not
include enough information for the receiving end to reproduce the same
merge commit.
SEE ALSO SEE ALSO
-------- --------
linkgit:git-am[1], linkgit:git-send-email[1] linkgit:git-am[1], linkgit:git-send-email[1]

View File

@ -38,38 +38,6 @@ are lists of one or more search expressions separated by newline
characters. An empty string as search expression matches all lines. characters. An empty string as search expression matches all lines.
CONFIGURATION
-------------
grep.lineNumber::
If set to true, enable `-n` option by default.
grep.column::
If set to true, enable the `--column` option by default.
grep.patternType::
Set the default matching behavior. Using a value of 'basic', 'extended',
'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
option is ignored when the `grep.patternType` option is set to a value
other than 'default'.
grep.threads::
Number of grep worker threads to use. If unset (or set to 0), Git will
use as many threads as the number of logical cores available.
grep.fullName::
If set to true, enable `--full-name` option by default.
grep.fallbackToNoIndex::
If set to true, fall back to git grep --no-index if git grep
is executed outside of a git repository. Defaults to false.
OPTIONS OPTIONS
------- -------
--cached:: --cached::
@ -363,6 +331,38 @@ with multiple threads might perform slower than single threaded if `--textconv`
is given and there're too many text conversions. So if you experience low is given and there're too many text conversions. So if you experience low
performance in this case, it might be desirable to use `--threads=1`. performance in this case, it might be desirable to use `--threads=1`.
CONFIGURATION
-------------
grep.lineNumber::
If set to true, enable `-n` option by default.
grep.column::
If set to true, enable the `--column` option by default.
grep.patternType::
Set the default matching behavior. Using a value of 'basic', 'extended',
'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
option is ignored when the `grep.patternType` option is set to a value
other than 'default'.
grep.threads::
Number of grep worker threads to use. If unset (or set to 0), Git will
use as many threads as the number of logical cores available.
grep.fullName::
If set to true, enable `--full-name` option by default.
grep.fallbackToNoIndex::
If set to true, fall back to git grep --no-index if git grep
is executed outside of a git repository. Defaults to false.
GIT GIT
--- ---
Part of the linkgit:git[1] suite Part of the linkgit:git[1] suite

View File

@ -232,25 +232,38 @@ trailer.<token>.ifmissing::
that option for trailers with the specified <token>. that option for trailers with the specified <token>.
trailer.<token>.command:: trailer.<token>.command::
This option can be used to specify a shell command that will This option behaves in the same way as 'trailer.<token>.cmd', except
be called to automatically add or modify a trailer with the that it doesn't pass anything as argument to the specified command.
specified <token>. Instead the first occurrence of substring $ARG is replaced by the
value that would be passed as argument.
+ +
When this option is specified, the behavior is as if a special The 'trailer.<token>.command' option has been deprecated in favor of
'<token>=<value>' argument were added at the beginning of the command 'trailer.<token>.cmd' due to the fact that $ARG in the user's command is
line, where <value> is taken to be the standard output of the only replaced once and that the original way of replacing $ARG is not safe.
specified command with any leading and trailing whitespace trimmed
off.
+ +
If the command contains the `$ARG` string, this string will be When both 'trailer.<token>.cmd' and 'trailer.<token>.command' are given
replaced with the <value> part of an existing trailer with the same for the same <token>, 'trailer.<token>.cmd' is used and
<token>, if any, before the command is launched. 'trailer.<token>.command' is ignored.
trailer.<token>.cmd::
This option can be used to specify a shell command that will be called:
once to automatically add a trailer with the specified <token>, and then
each time a '--trailer <token>=<value>' argument to modify the <value> of
the trailer that this option would produce.
+ +
If some '<token>=<value>' arguments are also passed on the command When the specified command is first called to add a trailer
line, when a 'trailer.<token>.command' is configured, the command will with the specified <token>, the behavior is as if a special
also be executed for each of these arguments. And the <value> part of '--trailer <token>=<value>' argument was added at the beginning
these arguments, if any, will be used to replace the `$ARG` string in of the "git interpret-trailers" command, where <value>
the command. is taken to be the standard output of the command with any
leading and trailing whitespace trimmed off.
+
If some '--trailer <token>=<value>' arguments are also passed
on the command line, the command is called again once for each
of these arguments with the same <token>. And the <value> part
of these arguments, if any, will be passed to the command as its
first argument. This way the command can produce a <value> computed
from the <value> passed in the '--trailer <token>=<value>' argument.
EXAMPLES EXAMPLES
-------- --------
@ -333,6 +346,55 @@ subject
Fix #42 Fix #42
------------ ------------
* Configure a 'help' trailer with a cmd use a script `glog-find-author`
which search specified author identity from git log in git repository
and show how it works:
+
------------
$ cat ~/bin/glog-find-author
#!/bin/sh
test -n "$1" && git log --author="$1" --pretty="%an <%ae>" -1 || true
$ git config trailer.help.key "Helped-by: "
$ git config trailer.help.ifExists "addIfDifferentNeighbor"
$ git config trailer.help.cmd "~/bin/glog-find-author"
$ git interpret-trailers --trailer="help:Junio" --trailer="help:Couder" <<EOF
> subject
>
> message
>
> EOF
subject
message
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
------------
* Configure a 'ref' trailer with a cmd use a script `glog-grep`
to grep last relevant commit from git log in the git repository
and show how it works:
+
------------
$ cat ~/bin/glog-grep
#!/bin/sh
test -n "$1" && git log --grep "$1" --pretty=reference -1 || true
$ git config trailer.ref.key "Reference-to: "
$ git config trailer.ref.ifExists "replace"
$ git config trailer.ref.cmd "~/bin/glog-grep"
$ git interpret-trailers --trailer="ref:Add copyright notices." <<EOF
> subject
>
> message
>
> EOF
subject
message
Reference-to: 8bc9a0c769 (Add copyright notices., 2005-04-07)
------------
* Configure a 'see' trailer with a command to show the subject of a * Configure a 'see' trailer with a command to show the subject of a
commit that is related, and show how it works: commit that is related, and show how it works:
+ +

View File

@ -9,7 +9,9 @@ git-mailinfo - Extracts patch and authorship from a single e-mail message
SYNOPSIS SYNOPSIS
-------- --------
[verse] [verse]
'git mailinfo' [-k|-b] [-u | --encoding=<encoding> | -n] [--[no-]scissors] <msg> <patch> 'git mailinfo' [-k|-b] [-u | --encoding=<encoding> | -n]
[--[no-]scissors] [--quoted-cr=<action>]
<msg> <patch>
DESCRIPTION DESCRIPTION
@ -89,6 +91,23 @@ This can be enabled by default with the configuration option mailinfo.scissors.
--no-scissors:: --no-scissors::
Ignore scissors lines. Useful for overriding mailinfo.scissors settings. Ignore scissors lines. Useful for overriding mailinfo.scissors settings.
--quoted-cr=<action>::
Action when processes email messages sent with base64 or
quoted-printable encoding, and the decoded lines end with a CRLF
instead of a simple LF.
+
The valid actions are:
+
--
* `nowarn`: Git will do nothing when such a CRLF is found.
* `warn`: Git will issue a warning for each message if such a CRLF is
found.
* `strip`: Git will convert those CRLF to LF.
--
+
The default action could be set by configuration option `mailinfo.quotedCR`.
If no such configuration option has been set, `warn` will be used.
<msg>:: <msg>::
The commit log message extracted from e-mail, usually The commit log message extracted from e-mail, usually
except the title line which comes from e-mail Subject. except the title line which comes from e-mail Subject.

View File

@ -92,10 +92,8 @@ commit-graph::
prefetch:: prefetch::
The `prefetch` task updates the object directory with the latest The `prefetch` task updates the object directory with the latest
objects from all registered remotes. For each remote, a `git fetch` objects from all registered remotes. For each remote, a `git fetch`
command is run. The refmap is custom to avoid updating local or remote command is run. The configured refspec is modified to place all
branches (those in `refs/heads` or `refs/remotes`). Instead, the requested refs within `refs/prefetch/`. Also, tags are not updated.
remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
not updated.
+ +
This is done to avoid disrupting the remote-tracking branches. The end users This is done to avoid disrupting the remote-tracking branches. The end users
expect these refs to stay unmoved unless they initiate a fetch. With prefetch expect these refs to stay unmoved unless they initiate a fetch. With prefetch

View File

@ -99,6 +99,10 @@ success of the resolution after the custom tool has exited.
(see linkgit:git-config[1]). To cancel `diff.orderFile`, (see linkgit:git-config[1]). To cancel `diff.orderFile`,
use `-O/dev/null`. use `-O/dev/null`.
CONFIGURATION
-------------
include::config/mergetool.txt[]
TEMPORARY FILES TEMPORARY FILES
--------------- ---------------
`git mergetool` creates `*.orig` backup files while resolving merges. `git mergetool` creates `*.orig` backup files while resolving merges.

View File

@ -11,14 +11,6 @@ SYNOPSIS
[verse] [verse]
'git mktag' 'git mktag'
OPTIONS
-------
--strict::
By default mktag turns on the equivalent of
linkgit:git-fsck[1] `--strict` mode. Use `--no-strict` to
disable it.
DESCRIPTION DESCRIPTION
----------- -----------
@ -45,6 +37,14 @@ the appropriate `fsck.<msg-id>` varible:
git -c fsck.extraHeaderEntry=ignore mktag <my-tag-with-headers git -c fsck.extraHeaderEntry=ignore mktag <my-tag-with-headers
OPTIONS
-------
--strict::
By default mktag turns on the equivalent of
linkgit:git-fsck[1] `--strict` mode. Use `--no-strict` to
disable it.
Tag Format Tag Format
---------- ----------
A tag signature file, to be fed to this command's standard input, A tag signature file, to be fed to this command's standard input,

View File

@ -9,7 +9,8 @@ git-multi-pack-index - Write and verify multi-pack-indexes
SYNOPSIS SYNOPSIS
-------- --------
[verse] [verse]
'git multi-pack-index' [--object-dir=<dir>] [--[no-]progress] <subcommand> 'git multi-pack-index' [--object-dir=<dir>] [--[no-]progress]
[--preferred-pack=<pack>] <subcommand>
DESCRIPTION DESCRIPTION
----------- -----------
@ -30,7 +31,16 @@ OPTIONS
The following subcommands are available: The following subcommands are available:
write:: write::
Write a new MIDX file. Write a new MIDX file. The following options are available for
the `write` sub-command:
+
--
--preferred-pack=<pack>::
Optionally specify the tie-breaking pack used when
multiple packs contain the same object. If not given,
ties are broken in favor of the pack with the lowest
mtime.
--
verify:: verify::
Verify the contents of the MIDX file. Verify the contents of the MIDX file.

View File

@ -762,3 +762,7 @@ IMPLEMENTATION DETAILS
message indicating the p4 depot location and change number. This message indicating the p4 depot location and change number. This
line is used by later 'git p4 sync' operations to know which p4 line is used by later 'git p4 sync' operations to know which p4
changes are new. changes are new.
GIT
---
Part of the linkgit:git[1] suite

View File

@ -85,6 +85,16 @@ base-name::
reference was included in the resulting packfile. This reference was included in the resulting packfile. This
can be useful to send new tags to native Git clients. can be useful to send new tags to native Git clients.
--stdin-packs::
Read the basenames of packfiles (e.g., `pack-1234abcd.pack`)
from the standard input, instead of object names or revision
arguments. The resulting pack contains all objects listed in the
included packs (those not beginning with `^`), excluding any
objects listed in the excluded packs (beginning with `^`).
+
Incompatible with `--revs`, or options that imply `--revs` (such as
`--all`), with the exception of `--unpacked`, which is compatible.
--window=<n>:: --window=<n>::
--depth=<n>:: --depth=<n>::
These two options affect how the objects contained in These two options affect how the objects contained in

View File

@ -600,7 +600,7 @@ EXAMPLES
`git push origin`:: `git push origin`::
Without additional configuration, pushes the current branch to Without additional configuration, pushes the current branch to
the configured upstream (`remote.origin.merge` configuration the configured upstream (`branch.<name>.merge` configuration
variable) if it has the same name as the current branch, and variable) if it has the same name as the current branch, and
errors out without pushing otherwise. errors out without pushing otherwise.
+ +

View File

@ -200,12 +200,6 @@ Alternatively, you can undo the 'git rebase' with
git rebase --abort git rebase --abort
CONFIGURATION
-------------
include::config/rebase.txt[]
include::config/sequencer.txt[]
OPTIONS OPTIONS
------- -------
--onto <newbase>:: --onto <newbase>::
@ -593,16 +587,17 @@ See also INCOMPATIBLE OPTIONS below.
--autosquash:: --autosquash::
--no-autosquash:: --no-autosquash::
When the commit log message begins with "squash! ..." (or When the commit log message begins with "squash! ..." or "fixup! ..."
"fixup! ..."), and there is already a commit in the todo list that or "amend! ...", and there is already a commit in the todo list that
matches the same `...`, automatically modify the todo list of rebase matches the same `...`, automatically modify the todo list of
-i so that the commit marked for squashing comes right after the `rebase -i`, so that the commit marked for squashing comes right after
commit to be modified, and change the action of the moved commit the commit to be modified, and change the action of the moved commit
from `pick` to `squash` (or `fixup`). A commit matches the `...` if from `pick` to `squash` or `fixup` or `fixup -C` respectively. A commit
the commit subject matches, or if the `...` refers to the commit's matches the `...` if the commit subject matches, or if the `...` refers
hash. As a fall-back, partial matches of the commit subject work, to the commit's hash. As a fall-back, partial matches of the commit
too. The recommended way to create fixup/squash commits is by using subject work, too. The recommended way to create fixup/amend/squash
the `--fixup`/`--squash` options of linkgit:git-commit[1]. commits is by using the `--fixup`, `--fixup=amend:` or `--fixup=reword:`
and `--squash` options respectively of linkgit:git-commit[1].
+ +
If the `--autosquash` option is enabled by default using the If the `--autosquash` option is enabled by default using the
configuration variable `rebase.autoSquash`, this option can be configuration variable `rebase.autoSquash`, this option can be
@ -622,6 +617,14 @@ See also INCOMPATIBLE OPTIONS below.
--no-reschedule-failed-exec:: --no-reschedule-failed-exec::
Automatically reschedule `exec` commands that failed. This only makes Automatically reschedule `exec` commands that failed. This only makes
sense in interactive mode (or when an `--exec` option was provided). sense in interactive mode (or when an `--exec` option was provided).
+
Even though this option applies once a rebase is started, it's set for
the whole rebase at the start based on either the
`rebase.rescheduleFailedExec` configuration (see linkgit:git-config[1]
or "CONFIGURATION" below) or whether this option is
provided. Otherwise an explicit `--no-reschedule-failed-exec` at the
start would be overridden by the presence of
`rebase.rescheduleFailedExec=true` configuration.
INCOMPATIBLE OPTIONS INCOMPATIBLE OPTIONS
-------------------- --------------------
@ -887,9 +890,17 @@ If you want to fold two or more commits into one, replace the command
"pick" for the second and subsequent commits with "squash" or "fixup". "pick" for the second and subsequent commits with "squash" or "fixup".
If the commits had different authors, the folded commit will be If the commits had different authors, the folded commit will be
attributed to the author of the first commit. The suggested commit attributed to the author of the first commit. The suggested commit
message for the folded commit is the concatenation of the commit message for the folded commit is the concatenation of the first
messages of the first commit and of those with the "squash" command, commit's message with those identified by "squash" commands, omitting the
but omits the commit messages of commits with the "fixup" command. messages of commits identified by "fixup" commands, unless "fixup -c"
is used. In that case the suggested commit message is only the message
of the "fixup -c" commit, and an editor is opened allowing you to edit
the message. The contents (patch) of the "fixup -c" commit are still
incorporated into the folded commit. If there is more than one "fixup -c"
commit, the message from the final one is used. You can also use
"fixup -C" to get the same behavior as "fixup -c" except without opening
an editor.
'git rebase' will stop when "pick" has been replaced with "edit" or 'git rebase' will stop when "pick" has been replaced with "edit" or
when a command fails due to merge errors. When you are done editing when a command fails due to merge errors. When you are done editing
@ -1257,6 +1268,12 @@ merge tlsv1.3
merge cmake merge cmake
------------ ------------
CONFIGURATION
-------------
include::config/rebase.txt[]
include::config/sequencer.txt[]
BUGS BUGS
---- ----
The todo list presented by the deprecated `--preserve-merges --interactive` The todo list presented by the deprecated `--preserve-merges --interactive`

View File

@ -165,6 +165,29 @@ depth is 4095.
Pass the `--delta-islands` option to `git-pack-objects`, see Pass the `--delta-islands` option to `git-pack-objects`, see
linkgit:git-pack-objects[1]. linkgit:git-pack-objects[1].
-g=<factor>::
--geometric=<factor>::
Arrange resulting pack structure so that each successive pack
contains at least `<factor>` times the number of objects as the
next-largest pack.
+
`git repack` ensures this by determining a "cut" of packfiles that need
to be repacked into one in order to ensure a geometric progression. It
picks the smallest set of packfiles such that as many of the larger
packfiles (by count of objects contained in that pack) may be left
intact.
+
Unlike other repack modes, the set of objects to pack is determined
uniquely by the set of packs being "rolled-up"; in other words, the
packs determined to need to be combined in order to restore a geometric
progression.
+
When `--unpacked` is specified, loose objects are implicitly included in
this "roll-up", without respect to their reachability. This is subject
to change in the future. This option (implying a drastically different
repack mode) is not guaranteed to work with all other combinations of
option to `git repack`.
CONFIGURATION CONFIGURATION
------------- -------------

View File

@ -23,7 +23,9 @@ branch, and no updates to their contents can be staged in the index,
though that default behavior can be overridden with the `-f` option. though that default behavior can be overridden with the `-f` option.
When `--cached` is given, the staged content has to When `--cached` is given, the staged content has to
match either the tip of the branch or the file on disk, match either the tip of the branch or the file on disk,
allowing the file to be removed from just the index. allowing the file to be removed from just the index. When
sparse-checkouts are in use (see linkgit:git-sparse-checkout[1]),
`git rm` will only remove paths within the sparse-checkout patterns.
OPTIONS OPTIONS

View File

@ -45,6 +45,20 @@ To avoid interfering with other worktrees, it first enables the
When `--cone` is provided, the `core.sparseCheckoutCone` setting is When `--cone` is provided, the `core.sparseCheckoutCone` setting is
also set, allowing for better performance with a limited set of also set, allowing for better performance with a limited set of
patterns (see 'CONE PATTERN SET' below). patterns (see 'CONE PATTERN SET' below).
+
Use the `--[no-]sparse-index` option to toggle the use of the sparse
index format. This reduces the size of the index to be more closely
aligned with your sparse-checkout definition. This can have significant
performance advantages for commands such as `git status` or `git add`.
This feature is still experimental. Some commands might be slower with
a sparse index until they are properly integrated with the feature.
+
**WARNING:** Using a sparse index requires modifying the index in a way
that is not completely understood by external tools. If you have trouble
with this compatibility, then run `git sparse-checkout init --no-sparse-index`
to rewrite your index to not be sparse. Older versions of Git will not
understand the sparse directory entries index extension and may fail to
interact with your repository until it is disabled.
'set':: 'set'::
Write a set of patterns to the sparse-checkout file, as given as Write a set of patterns to the sparse-checkout file, as given as

View File

@ -9,7 +9,7 @@ SYNOPSIS
-------- --------
[verse] [verse]
'git stash' list [<log-options>] 'git stash' list [<log-options>]
'git stash' show [<diff-options>] [<stash>] 'git stash' show [-u|--include-untracked|--only-untracked] [<diff-options>] [<stash>]
'git stash' drop [-q|--quiet] [<stash>] 'git stash' drop [-q|--quiet] [<stash>]
'git stash' ( pop | apply ) [--index] [-q|--quiet] [<stash>] 'git stash' ( pop | apply ) [--index] [-q|--quiet] [<stash>]
'git stash' branch <branchname> [<stash>] 'git stash' branch <branchname> [<stash>]
@ -83,7 +83,7 @@ stash@{1}: On master: 9cc0589... Add git-stash
The command takes options applicable to the 'git log' The command takes options applicable to the 'git log'
command to control what is shown and how. See linkgit:git-log[1]. command to control what is shown and how. See linkgit:git-log[1].
show [<diff-options>] [<stash>]:: show [-u|--include-untracked|--only-untracked] [<diff-options>] [<stash>]::
Show the changes recorded in the stash entry as a diff between the Show the changes recorded in the stash entry as a diff between the
stashed contents and the commit back when the stash entry was first stashed contents and the commit back when the stash entry was first
@ -91,8 +91,10 @@ show [<diff-options>] [<stash>]::
By default, the command shows the diffstat, but it will accept any By default, the command shows the diffstat, but it will accept any
format known to 'git diff' (e.g., `git stash show -p stash@{1}` format known to 'git diff' (e.g., `git stash show -p stash@{1}`
to view the second most recent entry in patch form). to view the second most recent entry in patch form).
You can use stash.showStat and/or stash.showPatch config variables If no `<diff-option>` is provided, the default behavior will be given
to change the default behavior. by the `stash.showStat`, and `stash.showPatch` config variables. You
can also use `stash.showIncludeUntracked` to set whether
`--include-untracked` is enabled by default.
pop [--index] [-q|--quiet] [<stash>]:: pop [--index] [-q|--quiet] [<stash>]::
@ -160,10 +162,18 @@ up with `git clean`.
-u:: -u::
--include-untracked:: --include-untracked::
This option is only valid for `push` and `save` commands. --no-include-untracked::
When used with the `push` and `save` commands,
all untracked files are also stashed and then cleaned up with
`git clean`.
+ +
All untracked files are also stashed and then cleaned up with When used with the `show` command, show the untracked files in the stash
`git clean`. entry as part of the diff.
--only-untracked::
This option is only valid for the `show` command.
+
Show only the untracked files in the stash entry as part of the diff.
--index:: --index::
This option is only valid for `pop` and `apply` commands. This option is only valid for `pop` and `apply` commands.

View File

@ -1061,25 +1061,6 @@ with different name spaces. For example:
branches = stable/*:refs/remotes/svn/stable/* branches = stable/*:refs/remotes/svn/stable/*
branches = debug/*:refs/remotes/svn/debug/* branches = debug/*:refs/remotes/svn/debug/*
BUGS
----
We ignore all SVN properties except svn:executable. Any unhandled
properties are logged to $GIT_DIR/svn/<refname>/unhandled.log
Renamed and copied directories are not detected by Git and hence not
tracked when committing to SVN. I do not plan on adding support for
this as it's quite difficult and time-consuming to get working for all
the possible corner cases (Git doesn't do it, either). Committing
renamed and copied files is fully supported if they're similar enough
for Git to detect them.
In SVN, it is possible (though discouraged) to commit changes to a tag
(because a tag is just a directory copy, thus technically the same as a
branch). When cloning an SVN repository, 'git svn' cannot know if such a
commit to a tag will happen in the future. Thus it acts conservatively
and imports all SVN tags as branches, prefixing the tag name with 'tags/'.
CONFIGURATION CONFIGURATION
------------- -------------
@ -1166,6 +1147,25 @@ $GIT_DIR/svn/\**/.rev_map.*::
if it is missing or not up to date. 'git svn reset' automatically if it is missing or not up to date. 'git svn reset' automatically
rewinds it. rewinds it.
BUGS
----
We ignore all SVN properties except svn:executable. Any unhandled
properties are logged to $GIT_DIR/svn/<refname>/unhandled.log
Renamed and copied directories are not detected by Git and hence not
tracked when committing to SVN. I do not plan on adding support for
this as it's quite difficult and time-consuming to get working for all
the possible corner cases (Git doesn't do it, either). Committing
renamed and copied files is fully supported if they're similar enough
for Git to detect them.
In SVN, it is possible (though discouraged) to commit changes to a tag
(because a tag is just a directory copy, thus technically the same as a
branch). When cloning an SVN repository, 'git svn' cannot know if such a
commit to a tag will happen in the future. Thus it acts conservatively
and imports all SVN tags as branches, prefixing the tag name with 'tags/'.
SEE ALSO SEE ALSO
-------- --------
linkgit:git-rebase[1] linkgit:git-rebase[1]

View File

@ -13,7 +13,7 @@ SYNOPSIS
[--exec-path[=<path>]] [--html-path] [--man-path] [--info-path] [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
[-p|--paginate|-P|--no-pager] [--no-replace-objects] [--bare] [-p|--paginate|-P|--no-pager] [--no-replace-objects] [--bare]
[--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>] [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
[--super-prefix=<path>] [--config-env <name>=<envvar>] [--super-prefix=<path>] [--config-env=<name>=<envvar>]
<command> [<args>] <command> [<args>]
DESCRIPTION DESCRIPTION
@ -670,6 +670,16 @@ for further details.
If this environment variable is set to `0`, git will not prompt If this environment variable is set to `0`, git will not prompt
on the terminal (e.g., when asking for HTTP authentication). on the terminal (e.g., when asking for HTTP authentication).
`GIT_CONFIG_GLOBAL`::
`GIT_CONFIG_SYSTEM`::
Take the configuration from the given files instead from global or
system-level configuration files. If `GIT_CONFIG_SYSTEM` is set, the
system config file defined at build time (usually `/etc/gitconfig`)
will not be read. Likewise, if `GIT_CONFIG_GLOBAL` is set, neither
`$HOME/.gitconfig` nor `$XDG_CONFIG_HOME/git/config` will be read. Can
be set to `/dev/null` to skip reading configuration files of the
respective level.
`GIT_CONFIG_NOSYSTEM`:: `GIT_CONFIG_NOSYSTEM`::
Whether to skip reading settings from the system-wide Whether to skip reading settings from the system-wide
`$(prefix)/etc/gitconfig` file. This environment variable can `$(prefix)/etc/gitconfig` file. This environment variable can

View File

@ -845,6 +845,8 @@ patterns are available:
- `rust` suitable for source code in the Rust language. - `rust` suitable for source code in the Rust language.
- `scheme` suitable for source code in the Scheme language.
- `tex` suitable for source code for LaTeX documents. - `tex` suitable for source code for LaTeX documents.
@ -1174,7 +1176,8 @@ tag then no replacement will be done. The placeholders are the same
as those for the option `--pretty=format:` of linkgit:git-log[1], as those for the option `--pretty=format:` of linkgit:git-log[1],
except that they need to be wrapped like this: `$Format:PLACEHOLDERS$` except that they need to be wrapped like this: `$Format:PLACEHOLDERS$`
in the file. E.g. the string `$Format:%H$` will be replaced by the in the file. E.g. the string `$Format:%H$` will be replaced by the
commit hash. commit hash. However, only one `%(describe)` placeholder is expanded
per archive to avoid denial-of-service attacks.
Packing objects Packing objects
@ -1244,6 +1247,12 @@ to:
[attr]binary -diff -merge -text [attr]binary -diff -merge -text
------------ ------------
NOTES
-----
Git does not follow symbolic links when accessing a `.gitattributes`
file in the working tree. This keeps behavior consistent when the file
is accessed from the index or a tree versus from the filesystem.
EXAMPLES EXAMPLES
-------- --------

View File

@ -187,7 +187,7 @@ mark a file pair as a rename and stop considering other candidates for
better matches. At most, one comparison is done per file in this better matches. At most, one comparison is done per file in this
preliminary pass; so if there are several remaining ext.txt files preliminary pass; so if there are several remaining ext.txt files
throughout the directory hierarchy after exact rename detection, this throughout the directory hierarchy after exact rename detection, this
preliminary step will be skipped for those files. preliminary step may be skipped for those files.
Note. When the "-C" option is used with `--find-copies-harder` Note. When the "-C" option is used with `--find-copies-harder`
option, 'git diff-{asterisk}' commands feed unmodified filepairs to option, 'git diff-{asterisk}' commands feed unmodified filepairs to

View File

@ -138,7 +138,7 @@ given); `template` (if a `-t` option was given or the
configuration option `commit.template` is set); `merge` (if the configuration option `commit.template` is set); `merge` (if the
commit is a merge or a `.git/MERGE_MSG` file exists); `squash` commit is a merge or a `.git/MERGE_MSG` file exists); `squash`
(if a `.git/SQUASH_MSG` file exists); or `commit`, followed by (if a `.git/SQUASH_MSG` file exists); or `commit`, followed by
a commit SHA-1 (if a `-c`, `-C` or `--amend` option was given). a commit object name (if a `-c`, `-C` or `--amend` option was given).
If the exit status is non-zero, `git commit` will abort. If the exit status is non-zero, `git commit` will abort.
@ -231,19 +231,19 @@ named remote is not being used both values will be the same.
Information about what is to be pushed is provided on the hook's standard Information about what is to be pushed is provided on the hook's standard
input with lines of the form: input with lines of the form:
<local ref> SP <local sha1> SP <remote ref> SP <remote sha1> LF <local ref> SP <local object name> SP <remote ref> SP <remote object name> LF
For instance, if the command +git push origin master:foreign+ were run the For instance, if the command +git push origin master:foreign+ were run the
hook would receive a line like the following: hook would receive a line like the following:
refs/heads/master 67890 refs/heads/foreign 12345 refs/heads/master 67890 refs/heads/foreign 12345
although the full, 40-character SHA-1s would be supplied. If the foreign ref although the full object name would be supplied. If the foreign ref does not
does not yet exist the `<remote SHA-1>` will be 40 `0`. If a ref is to be yet exist the `<remote object name>` will be the all-zeroes object name. If a
deleted, the `<local ref>` will be supplied as `(delete)` and the `<local ref is to be deleted, the `<local ref>` will be supplied as `(delete)` and the
SHA-1>` will be 40 `0`. If the local commit was specified by something other `<local object name>` will be the all-zeroes object name. If the local commit
than a name which could be expanded (such as `HEAD~`, or a SHA-1) it will be was specified by something other than a name which could be expanded (such as
supplied as it was originally given. `HEAD~`, or an object name) it will be supplied as it was originally given.
If this hook exits with a non-zero status, `git push` will abort without If this hook exits with a non-zero status, `git push` will abort without
pushing anything. Information about why the push is rejected may be sent pushing anything. Information about why the push is rejected may be sent
@ -268,7 +268,7 @@ input a line of the format:
where `<old-value>` is the old object name stored in the ref, where `<old-value>` is the old object name stored in the ref,
`<new-value>` is the new object name to be stored in the ref and `<new-value>` is the new object name to be stored in the ref and
`<ref-name>` is the full name of the ref. `<ref-name>` is the full name of the ref.
When creating a new ref, `<old-value>` is 40 `0`. When creating a new ref, `<old-value>` is the all-zeroes object name.
If the hook exits with non-zero status, none of the refs will be If the hook exits with non-zero status, none of the refs will be
updated. If the hook exits with zero, updating of individual refs can updated. If the hook exits with zero, updating of individual refs can
@ -473,7 +473,8 @@ reference-transaction
This hook is invoked by any Git command that performs reference This hook is invoked by any Git command that performs reference
updates. It executes whenever a reference transaction is prepared, updates. It executes whenever a reference transaction is prepared,
committed or aborted and may thus get called multiple times. committed or aborted and may thus get called multiple times. The hook
does not cover symbolic references (but that may change in the future).
The hook takes exactly one argument, which is the current state the The hook takes exactly one argument, which is the current state the
given reference transaction is in: given reference transaction is in:
@ -492,6 +493,14 @@ receives on standard input a line of the format:
<old-value> SP <new-value> SP <ref-name> LF <old-value> SP <new-value> SP <ref-name> LF
where `<old-value>` is the old object name passed into the reference
transaction, `<new-value>` is the new object name to be stored in the
ref and `<ref-name>` is the full name of the ref. When force updating
the reference regardless of its current value or when the reference is
to be created anew, `<old-value>` is the all-zeroes object name. To
distinguish these cases, you can inspect the current value of
`<ref-name>` via `git rev-parse`.
The exit status of the hook is ignored for any state except for the The exit status of the hook is ignored for any state except for the
"prepared" state. In the "prepared" state, a non-zero exit status will "prepared" state. In the "prepared" state, a non-zero exit status will
cause the transaction to be aborted. The hook will not be called with cause the transaction to be aborted. The hook will not be called with
@ -550,7 +559,7 @@ command-dependent arguments may be passed in the future.
The hook receives a list of the rewritten commits on stdin, in the The hook receives a list of the rewritten commits on stdin, in the
format format
<old-sha1> SP <new-sha1> [ SP <extra-info> ] LF <old-object-name> SP <new-object-name> [ SP <extra-info> ] LF
The 'extra-info' is again command-dependent. If it is empty, the The 'extra-info' is again command-dependent. If it is empty, the
preceding SP is also omitted. Currently, no commands pass any preceding SP is also omitted. Currently, no commands pass any
@ -566,7 +575,7 @@ rebase::
For the 'squash' and 'fixup' operation, all commits that were For the 'squash' and 'fixup' operation, all commits that were
squashed are listed as being rewritten to the squashed commit. squashed are listed as being rewritten to the squashed commit.
This means that there will be several lines sharing the same This means that there will be several lines sharing the same
'new-sha1'. 'new-object-name'.
+ +
The commits are guaranteed to be listed in the order that they were The commits are guaranteed to be listed in the order that they were
processed by rebase. processed by rebase.

View File

@ -149,11 +149,15 @@ not tracked by Git remain untracked.
To stop tracking a file that is currently tracked, use To stop tracking a file that is currently tracked, use
'git rm --cached'. 'git rm --cached'.
Git does not follow symbolic links when accessing a `.gitignore` file in
the working tree. This keeps behavior consistent when the file is
accessed from the index or a tree versus from the filesystem.
EXAMPLES EXAMPLES
-------- --------
- The pattern `hello.*` matches any file or folder - The pattern `hello.*` matches any file or folder
whose name begins with `hello`. If one wants to restrict whose name begins with `hello.`. If one wants to restrict
this only to the directory and not in its subdirectories, this only to the directory and not in its subdirectories,
one can prepend the pattern with a slash, i.e. `/hello.*`; one can prepend the pattern with a slash, i.e. `/hello.*`;
the pattern now matches `hello.txt`, `hello.c` but not the pattern now matches `hello.txt`, `hello.c` but not

View File

@ -55,6 +55,13 @@ this would also match the 'Commit Name <commit&#64;email.xx>' above:
Proper Name <proper@email.xx> CoMmIt NaMe <CoMmIt@EmAiL.xX> Proper Name <proper@email.xx> CoMmIt NaMe <CoMmIt@EmAiL.xX>
-- --
NOTES
-----
Git does not follow symbolic links when accessing a `.mailmap` file in
the working tree. This keeps behavior consistent when the file is
accessed from the index or a tree versus from the filesystem.
EXAMPLES EXAMPLES
-------- --------

View File

@ -98,6 +98,14 @@ submodule.<name>.shallow::
shallow clone (with a history depth of 1) unless the user explicitly shallow clone (with a history depth of 1) unless the user explicitly
asks for a non-shallow clone. asks for a non-shallow clone.
NOTES
-----
Git does not allow the `.gitmodules` file within a working tree to be a
symbolic link, and will refuse to check out such a tree entry. This
keeps behavior consistent when the file is accessed from the index or a
tree versus from the filesystem, and helps Git reliably enforce security
checks of the file contents.
EXAMPLES EXAMPLES
-------- --------

View File

@ -62,3 +62,7 @@ git clone ext::'git --namespace=foo %s /tmp/prefixed.git'
---------- ----------
include::transfer-data-leaks.txt[] include::transfer-data-leaks.txt[]
GIT
---
Part of the linkgit:git[1] suite

View File

@ -751,6 +751,17 @@ default font sizes or lineheights are changed (e.g. via adding extra
CSS stylesheet in `@stylesheets`), it may be appropriate to change CSS stylesheet in `@stylesheets`), it may be appropriate to change
these values. these values.
email-privacy::
Redact e-mail addresses from the generated HTML, etc. content.
This obscures e-mail addresses retrieved from the author/committer
and comment sections of the Git log.
It is meant to hinder web crawlers that harvest and abuse addresses.
Such crawlers may not respect robots.txt.
Note that users and user tools also see the addresses as redacted.
If Gitweb is not the final step in a workflow then subsequent steps
may misbehave because of the redacted information they receive.
Disabled by default.
highlight:: highlight::
Server-side syntax highlight support in "blob" view. It requires Server-side syntax highlight support in "blob" view. It requires
`$highlight_bin` program to be available (see the description of `$highlight_bin` program to be available (see the description of

View File

@ -0,0 +1,131 @@
Content-type: text/asciidoc
Abstract: When a critical vulnerability is discovered and fixed, we follow this
script to coordinate a public release.
How we coordinate embargoed releases
====================================
To protect Git users from critical vulnerabilities, we do not just release
fixed versions like regular maintenance releases. Instead, we coordinate
releases with packagers, keeping the fixes under an embargo until the release
date. That way, users will have a chance to upgrade on that date, no matter
what Operating System or distribution they run.
Open a Security Advisory draft
------------------------------
The first step is to https://github.com/git/git/security/advisories/new[open an
advisory]. Technically, it is not necessary, but it is convenient and saves a
bit of hassle. This advisory can also be used to obtain the CVE number and it
will give us a private fork associated with it that can be used to collaborate
on a fix.
Release date of the embargoed version
-------------------------------------
If the vulnerability affects Windows users, we want to have our friends over at
Visual Studio on board. This means we need to target a "Patch Tuesday" (i.e. a
second Tuesday of the month), at the minimum three weeks from heads-up to
coordinated release.
If the vulnerability affects the server side, or can benefit from scans on the
server side (i.e. if `git fsck` can detect an attack), it is important to give
all involved Git repository hosting sites enough time to scan all of those
repositories.
Notifying the Linux distributions
---------------------------------
At most two weeks before release date, we need to send a notification to
distros@vs.openwall.org, preferably less than 7 days before the release date.
This will reach most (all?) Linux distributions. See an example below, and the
guidelines for this mailing list at
https://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists[here].
Once the version has been published, we send a note about that to oss-security.
As an example, see https://www.openwall.com/lists/oss-security/2019/12/13/1[the
v2.24.1 mail];
https://oss-security.openwall.org/wiki/mailing-lists/oss-security[Here] are
their guidelines.
The mail to oss-security should also describe the exploit, and give credit to
the reporter(s): security researchers still receive too little respect for the
invaluable service they provide, and public credit goes a long way to keep them
paid by their respective organizations.
Technically, describing any exploit can be delayed up to 7 days, but we usually
refrain from doing that, including it right away.
As a courtesy we typically attach a Git bundle (as `.tar.xz` because the list
will drop `.bundle` attachments) in the mail to distros@ so that the involved
parties can take care of integrating/backporting them. This bundle is typically
created using a command like this:
git bundle create cve-xxx.bundle ^origin/master vA.B.C vD.E.F
tar cJvf cve-xxx.bundle.tar.xz cve-xxx.bundle
Example mail to distros@vs.openwall.org
---------------------------------------
....
To: distros@vs.openwall.org
Cc: git-security@googlegroups.com, <other people involved in the report/fix>
Subject: [vs] Upcoming Git security fix release
Team,
The Git project will release new versions on <date> at 10am Pacific Time or
soon thereafter. I have attached a Git bundle (embedded in a `.tar.xz` to avoid
it being dropped) which you can fetch into a clone of
https://github.com/git/git via `git fetch --tags /path/to/cve-xxx.bundle`,
containing the tags for versions <versions>.
You can verify with `git tag -v <tag>` that the versions were signed by
the Git maintainer, using the same GPG key as e.g. v2.24.0.
Please use these tags to prepare `git` packages for your various
distributions, using the appropriate tagged versions. The added test cases
help verify the correctness.
The addressed issues are:
<list of CVEs with a short description, typically copy/pasted from Git's
release notes, usually demo exploit(s), too>
Credit for finding the vulnerability goes to <reporter>, credit for fixing
it goes to <developer>.
Thanks,
<name>
....
Example mail to oss-security@lists.openwall.com
-----------------------------------------------
....
To: oss-security@lists.openwall.com
Cc: git-security@googlegroups.com, <other people involved in the report/fix>
Subject: git: <copy from security advisory>
Team,
The Git project released new versions on <date>, addressing <CVE>.
All supported platforms are affected in one way or another, and all Git
versions all the way back to <version> are affected. The fixed versions are:
<versions>.
Link to the announcement: <link to lore.kernel.org/git>
We highly recommend to upgrade.
The addressed issues are:
* <list of CVEs and their explanations, along with demo exploits>
Credit for finding the vulnerability goes to <reporter>, credit for fixing
it goes to <developer>.
Thanks,
<name>
....

View File

@ -1,71 +1,67 @@
#!/usr/bin/perl #!/usr/bin/perl
use File::Find; use strict;
use Getopt::Long; use warnings;
my $basedir = "."; # Parse arguments, a simple state machine for input like:
GetOptions("basedir=s" => \$basedir) #
or die("Cannot parse command line arguments\n"); # howto/*.txt config/*.txt --section=1 git.txt git-add.txt [...] --to-lint git-add.txt a-file.txt [...]
my %TXT;
my %SECTION;
my $section;
my $lint_these = 0;
for my $arg (@ARGV) {
if (my ($sec) = $arg =~ /^--section=(\d+)$/s) {
$section = $sec;
next;
}
my $found_errors = 0; my ($name) = $arg =~ /^(.*?)\.txt$/s;
unless (defined $section) {
$TXT{$name} = $arg;
next;
}
$SECTION{$name} = $section;
}
my $exit_code = 0;
sub report { sub report {
my ($where, $what, $error) = @_; my ($pos, $line, $target, $msg) = @_;
print "$where: $error: $what\n"; substr($line, $pos) = "' <-- HERE";
$found_errors = 1; $line =~ s/^\s+//;
print "$ARGV:$.: error: $target: $msg, shown with 'HERE' below:\n";
print "$ARGV:$.:\t'$line\n";
$exit_code = 1;
} }
sub grab_section { @ARGV = sort values %TXT;
my ($page) = @_; die "BUG: Nothing to process!" unless @ARGV;
open my $fh, "<", "$basedir/$page.txt"; while (<>) {
my $firstline = <$fh>; my $line = $_;
chomp $firstline; while ($line =~ m/linkgit:((.*?)\[(\d)\])/g) {
close $fh; my $pos = pos $line;
my ($section) = ($firstline =~ /.*\((\d)\)$/); my ($target, $page, $section) = ($1, $2, $3);
return $section;
}
sub lint { # De-AsciiDoc
my ($file) = @_; $page =~ s/{litdd}/--/g;
open my $fh, "<", $file
or return;
while (<$fh>) {
my $where = "$file:$.";
while (s/linkgit:((.*?)\[(\d)\])//) {
my ($target, $page, $section) = ($1, $2, $3);
# De-AsciiDoc if (!exists $TXT{$page}) {
$page =~ s/{litdd}/--/g; report($pos, $line, $target, "link outside of our own docs");
next;
if ($page !~ /^git/) { }
report($where, $target, "nongit link"); if (!exists $SECTION{$page}) {
next; report($pos, $line, $target, "link outside of our sectioned docs");
} next;
if (! -f "$basedir/$page.txt") { }
report($where, $target, "no such source"); my $real_section = $SECTION{$page};
next; if ($section != $SECTION{$page}) {
} report($pos, $line, $target, "wrong section (should be $real_section)");
$real_section = grab_section($page); next;
if ($real_section != $section) {
report($where, $target,
"wrong section (should be $real_section)");
next;
}
} }
} }
close $fh; # this resets our $. for each file
close ARGV if eof;
} }
sub lint_it { exit $exit_code;
lint($File::Find::name) if -f && /\.txt$/;
}
if (!@ARGV) {
find({ wanted => \&lint_it, no_chdir => 1 }, $basedir);
} else {
for (@ARGV) {
lint($_);
}
}
exit $found_errors;

View File

@ -0,0 +1,24 @@
#!/usr/bin/perl
use strict;
use warnings;
my $exit_code = 0;
sub report {
my ($target, $msg) = @_;
print "error: $target: $msg\n";
$exit_code = 1;
}
local $/;
while (my $slurp = <>) {
report($ARGV, "has no 'Part of the linkgit:git[1] suite' end blurb")
unless $slurp =~ m[
^GIT\n
---\n
\QPart of the linkgit:git[1] suite\E \n
\z
]mx;
}
exit $exit_code;

View File

@ -0,0 +1,105 @@
#!/usr/bin/perl
use strict;
use warnings;
my %SECTIONS;
{
my $order = 0;
%SECTIONS = (
'NAME' => {
required => 1,
order => $order++,
},
'SYNOPSIS' => {
required => 1,
order => $order++,
},
'DESCRIPTION' => {
required => 1,
order => $order++,
},
'OPTIONS' => {
order => $order++,
required => 0,
},
'CONFIGURATION' => {
order => $order++,
},
'BUGS' => {
order => $order++,
},
'SEE ALSO' => {
order => $order++,
},
'GIT' => {
required => 1,
order => $order++,
},
);
}
my $SECTION_RX = do {
my ($names) = join "|", keys %SECTIONS;
qr/^($names)$/s;
};
my $exit_code = 0;
sub report {
my ($msg) = @_;
print "$ARGV:$.: $msg\n";
$exit_code = 1;
}
my $last_was_section;
my @actual_order;
while (my $line = <>) {
chomp $line;
if ($line =~ $SECTION_RX) {
push @actual_order => $line;
$last_was_section = 1;
# Have no "last" section yet, processing NAME
next if @actual_order == 1;
my @expected_order = sort {
$SECTIONS{$a}->{order} <=> $SECTIONS{$b}->{order}
} @actual_order;
my $expected_last = $expected_order[-2];
my $actual_last = $actual_order[-2];
if ($actual_last ne $expected_last) {
report("section '$line' incorrectly ordered, comes after '$actual_last'");
}
next;
}
if ($last_was_section) {
my $last_section = $actual_order[-1];
if (length $last_section ne length $line) {
report("dashes under '$last_section' should match its length!");
}
if ($line !~ /^-+$/) {
report("dashes under '$last_section' should be '-' dashes!");
}
$last_was_section = 0;
}
if (eof) {
# We have both a hash and an array to consider, for
# convenience
my %actual_sections;
@actual_sections{@actual_order} = ();
for my $section (sort keys %SECTIONS) {
next if !$SECTIONS{$section}->{required} or exists $actual_sections{$section};
report("has no required '$section' section!");
}
# Reset per-file state
{
@actual_order = ();
# this resets our $. for each file
close ARGV;
}
}
}
exit $exit_code;

View File

@ -190,6 +190,8 @@ The placeholders are:
'%ai':: author date, ISO 8601-like format '%ai':: author date, ISO 8601-like format
'%aI':: author date, strict ISO 8601 format '%aI':: author date, strict ISO 8601 format
'%as':: author date, short format (`YYYY-MM-DD`) '%as':: author date, short format (`YYYY-MM-DD`)
'%ah':: author date, human style (like the `--date=human` option of
linkgit:git-rev-list[1])
'%cn':: committer name '%cn':: committer name
'%cN':: committer name (respecting .mailmap, see '%cN':: committer name (respecting .mailmap, see
linkgit:git-shortlog[1] or linkgit:git-blame[1]) linkgit:git-shortlog[1] or linkgit:git-blame[1])
@ -206,8 +208,23 @@ The placeholders are:
'%ci':: committer date, ISO 8601-like format '%ci':: committer date, ISO 8601-like format
'%cI':: committer date, strict ISO 8601 format '%cI':: committer date, strict ISO 8601 format
'%cs':: committer date, short format (`YYYY-MM-DD`) '%cs':: committer date, short format (`YYYY-MM-DD`)
'%ch':: committer date, human style (like the `--date=human` option of
linkgit:git-rev-list[1])
'%d':: ref names, like the --decorate option of linkgit:git-log[1] '%d':: ref names, like the --decorate option of linkgit:git-log[1]
'%D':: ref names without the " (", ")" wrapping. '%D':: ref names without the " (", ")" wrapping.
'%(describe[:options])':: human-readable name, like
linkgit:git-describe[1]; empty string for
undescribable commits. The `describe` string
may be followed by a colon and zero or more
comma-separated options. Descriptions can be
inconsistent when tags are added or removed at
the same time.
+
** 'match=<pattern>': Only consider tags matching the given
`glob(7)` pattern, excluding the "refs/tags/" prefix.
** 'exclude=<pattern>': Do not consider tags matching the given
`glob(7)` pattern, excluding the "refs/tags/" prefix.
'%S':: ref name given on the command line by which the commit was reached '%S':: ref name given on the command line by which the commit was reached
(like `git log --source`), only works with `git log` (like `git log --source`), only works with `git log`
'%e':: encoding '%e':: encoding
@ -254,7 +271,7 @@ endif::git-rev-list[]
`trailers` string may be followed by a colon `trailers` string may be followed by a colon
and zero or more comma-separated options. and zero or more comma-separated options.
If any option is provided multiple times the If any option is provided multiple times the
last occurance wins. last occurrence wins.
+ +
The boolean options accept an optional value `[=<BOOL>]`. The values The boolean options accept an optional value `[=<BOOL>]`. The values
`true`, `false`, `on`, `off` etc. are all accepted. See the "boolean" `true`, `false`, `on`, `off` etc. are all accepted. See the "boolean"

View File

@ -892,6 +892,9 @@ or units. n may be zero. The suffixes k, m, and g can be used to name
units in KiB, MiB, or GiB. For example, 'blob:limit=1k' is the same units in KiB, MiB, or GiB. For example, 'blob:limit=1k' is the same
as 'blob:limit=1024'. as 'blob:limit=1024'.
+ +
The form '--filter=object:type=(tag|commit|tree|blob)' omits all objects
which are not of the requested type.
+
The form '--filter=sparse:oid=<blob-ish>' uses a sparse-checkout The form '--filter=sparse:oid=<blob-ish>' uses a sparse-checkout
specification contained in the blob (or blob-expression) '<blob-ish>' specification contained in the blob (or blob-expression) '<blob-ish>'
to omit blobs that would not be not required for a sparse checkout on to omit blobs that would not be not required for a sparse checkout on
@ -930,6 +933,11 @@ equivalent.
--no-filter:: --no-filter::
Turn off any previous `--filter=` argument. Turn off any previous `--filter=` argument.
--filter-provided-objects::
Filter the list of explicitly provided objects, which would otherwise
always be printed even if they did not match any of the filters. Only
useful with `--filter=`.
--filter-print-omitted:: --filter-print-omitted::
Only useful with `--filter=`; prints a list of the objects omitted Only useful with `--filter=`; prints a list of the objects omitted
by the filter. Object IDs are prefixed with a ``~'' character. by the filter. Object IDs are prefixed with a ``~'' character.

View File

@ -1,8 +1,11 @@
Error reporting in git Error reporting in git
====================== ======================
`die`, `usage`, `error`, and `warning` report errors of various `BUG`, `die`, `usage`, `error`, and `warning` report errors of
kinds. various kinds.
- `BUG` is for failed internal assertions that should never happen,
i.e. a bug in git itself.
- `die` is for fatal application errors. It prints a message to - `die` is for fatal application errors. It prints a message to
the user and exits with status 128. the user and exits with status 128.
@ -20,6 +23,9 @@ kinds.
without running into too many problems. Like `error`, it without running into too many problems. Like `error`, it
returns -1 after reporting the situation to the caller. returns -1 after reporting the situation to the caller.
These reports will be logged via the trace2 facility. See the "error"
event in link:api-trace2.txt[trace2 API].
Customizable error handlers Customizable error handlers
--------------------------- ---------------------------

View File

@ -0,0 +1,105 @@
Simple-IPC API
==============
The Simple-IPC API is a collection of `ipc_` prefixed library routines
and a basic communication protocol that allow an IPC-client process to
send an application-specific IPC-request message to an IPC-server
process and receive an application-specific IPC-response message.
Communication occurs over a named pipe on Windows and a Unix domain
socket on other platforms. IPC-clients and IPC-servers rendezvous at
a previously agreed-to application-specific pathname (which is outside
the scope of this design) that is local to the computer system.
The IPC-server routines within the server application process create a
thread pool to listen for connections and receive request messages
from multiple concurrent IPC-clients. When received, these messages
are dispatched up to the server application callbacks for handling.
IPC-server routines then incrementally relay responses back to the
IPC-client.
The IPC-client routines within a client application process connect
to the IPC-server and send a request message and wait for a response.
When received, the response is returned back the caller.
For example, the `fsmonitor--daemon` feature will be built as a server
application on top of the IPC-server library routines. It will have
threads watching for file system events and a thread pool waiting for
client connections. Clients, such as `git status` will request a list
of file system events since a point in time and the server will
respond with a list of changed files and directories. The formats of
the request and response are application-specific; the IPC-client and
IPC-server routines treat them as opaque byte streams.
Comparison with sub-process model
---------------------------------
The Simple-IPC mechanism differs from the existing `sub-process.c`
model (Documentation/technical/long-running-process-protocol.txt) and
used by applications like Git-LFS. In the LFS-style sub-process model
the helper is started by the foreground process, communication happens
via a pair of file descriptors bound to the stdin/stdout of the
sub-process, the sub-process only serves the current foreground
process, and the sub-process exits when the foreground process
terminates.
In the Simple-IPC model the server is a very long-running service. It
can service many clients at the same time and has a private socket or
named pipe connection to each active client. It might be started
(on-demand) by the current client process or it might have been
started by a previous client or by the OS at boot time. The server
process is not associated with a terminal and it persists after
clients terminate. Clients do not have access to the stdin/stdout of
the server process and therefore must communicate over sockets or
named pipes.
Server startup and shutdown
---------------------------
How an application server based upon IPC-server is started is also
outside the scope of the Simple-IPC design and is a property of the
application using it. For example, the server might be started or
restarted during routine maintenance operations, or it might be
started as a system service during the system boot-up sequence, or it
might be started on-demand by a foreground Git command when needed.
Similarly, server shutdown is a property of the application using
the simple-ipc routines. For example, the server might decide to
shutdown when idle or only upon explicit request.
Simple-IPC protocol
-------------------
The Simple-IPC protocol consists of a single request message from the
client and an optional response message from the server. Both the
client and server messages are unlimited in length and are terminated
with a flush packet.
The pkt-line routines (Documentation/technical/protocol-common.txt)
are used to simplify buffer management during message generation,
transmission, and reception. A flush packet is used to mark the end
of the message. This allows the sender to incrementally generate and
transmit the message. It allows the receiver to incrementally receive
the message in chunks and to know when they have received the entire
message.
The actual byte format of the client request and server response
messages are application specific. The IPC layer transmits and
receives them as opaque byte buffers without any concern for the
content within. It is the job of the calling application layer to
understand the contents of the request and response messages.
Summary
-------
Conceptually, the Simple-IPC protocol is similar to an HTTP REST
request. Clients connect, make an application-specific and
stateless request, receive an application-specific
response, and disconnect. It is a one round trip facility for
querying the server. The Simple-IPC routines hide the socket,
named pipe, and thread pool details and allow the application
layer to focus on the application at hand.

View File

@ -465,7 +465,7 @@ completed.)
------------ ------------
`"error"`:: `"error"`::
This event is emitted when one of the `error()`, `die()`, This event is emitted when one of the `BUG()`, `error()`, `die()`,
`warning()`, or `usage()` functions are called. `warning()`, or `usage()` functions are called.
+ +
------------ ------------

View File

@ -44,6 +44,13 @@ Git index format
localization, no special casing of directory separator '/'). Entries localization, no special casing of directory separator '/'). Entries
with the same name are sorted by their stage field. with the same name are sorted by their stage field.
An index entry typically represents a file. However, if sparse-checkout
is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the
`extensions.sparseIndex` extension is enabled, then the index may
contain entries for directories outside of the sparse-checkout definition.
These entries have mode `040000`, include the `SKIP_WORKTREE` bit, and
the path ends in a directory separator.
32-bit ctime seconds, the last time a file's metadata changed 32-bit ctime seconds, the last time a file's metadata changed
this is stat(2) data this is stat(2) data
@ -385,3 +392,15 @@ The remaining data of each directory block is grouped by type:
in this block of entries. in this block of entries.
- 32-bit count of cache entries in this block - 32-bit count of cache entries in this block
== Sparse Directory Entries
When using sparse-checkout in cone mode, some entire directories within
the index can be summarized by pointing to a tree object instead of the
entire expanded list of paths within that tree. An index containing such
entries is a "sparse index". Index format versions 4 and less were not
implemented with such entries in mind. Thus, for these versions, an
index containing sparse directory entries will include this extension
with signature { 's', 'd', 'i', 'r' }. Like the split-index extension,
tools should avoid interacting with a sparse index unless they understand
this extension.

View File

@ -43,8 +43,9 @@ Design Details
a change in format. a change in format.
- The MIDX keeps only one record per object ID. If an object appears - The MIDX keeps only one record per object ID. If an object appears
in multiple packfiles, then the MIDX selects the copy in the most- in multiple packfiles, then the MIDX selects the copy in the
recently modified packfile. preferred packfile, otherwise selecting from the most-recently
modified packfile.
- If there exist packfiles in the pack directory not registered in - If there exist packfiles in the pack directory not registered in
the MIDX, then those packfiles are loaded into the `packed_git` the MIDX, then those packfiles are loaded into the `packed_git`

View File

@ -379,3 +379,86 @@ CHUNK DATA:
TRAILER: TRAILER:
Index checksum of the above contents. Index checksum of the above contents.
== multi-pack-index reverse indexes
Similar to the pack-based reverse index, the multi-pack index can also
be used to generate a reverse index.
Instead of mapping between offset, pack-, and index position, this
reverse index maps between an object's position within the MIDX, and
that object's position within a pseudo-pack that the MIDX describes
(i.e., the ith entry of the multi-pack reverse index holds the MIDX
position of ith object in pseudo-pack order).
To clarify the difference between these orderings, consider a multi-pack
reachability bitmap (which does not yet exist, but is what we are
building towards here). Each bit needs to correspond to an object in the
MIDX, and so we need an efficient mapping from bit position to MIDX
position.
One solution is to let bits occupy the same position in the oid-sorted
index stored by the MIDX. But because oids are effectively random, their
resulting reachability bitmaps would have no locality, and thus compress
poorly. (This is the reason that single-pack bitmaps use the pack
ordering, and not the .idx ordering, for the same purpose.)
So we'd like to define an ordering for the whole MIDX based around
pack ordering, which has far better locality (and thus compresses more
efficiently). We can think of a pseudo-pack created by the concatenation
of all of the packs in the MIDX. E.g., if we had a MIDX with three packs
(a, b, c), with 10, 15, and 20 objects respectively, we can imagine an
ordering of the objects like:
|a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19|
where the ordering of the packs is defined by the MIDX's pack list,
and then the ordering of objects within each pack is the same as the
order in the actual packfile.
Given the list of packs and their counts of objects, you can
naïvely reconstruct that pseudo-pack ordering (e.g., the object at
position 27 must be (c,1) because packs "a" and "b" consumed 25 of the
slots). But there's a catch. Objects may be duplicated between packs, in
which case the MIDX only stores one pointer to the object (and thus we'd
want only one slot in the bitmap).
Callers could handle duplicates themselves by reading objects in order
of their bit-position, but that's linear in the number of objects, and
much too expensive for ordinary bitmap lookups. Building a reverse index
solves this, since it is the logical inverse of the index, and that
index has already removed duplicates. But, building a reverse index on
the fly can be expensive. Since we already have an on-disk format for
pack-based reverse indexes, let's reuse it for the MIDX's pseudo-pack,
too.
Objects from the MIDX are ordered as follows to string together the
pseudo-pack. Let `pack(o)` return the pack from which `o` was selected
by the MIDX, and define an ordering of packs based on their numeric ID
(as stored by the MIDX). Let `offset(o)` return the object offset of `o`
within `pack(o)`. Then, compare `o1` and `o2` as follows:
- If one of `pack(o1)` and `pack(o2)` is preferred and the other
is not, then the preferred one sorts first.
+
(This is a detail that allows the MIDX bitmap to determine which
pack should be used by the pack-reuse mechanism, since it can ask
the MIDX for the pack containing the object at bit position 0).
- If `pack(o1) ≠ pack(o2)`, then sort the two objects in descending
order based on the pack ID.
- Otherwise, `pack(o1) = pack(o2)`, and the objects are sorted in
pack-order (i.e., `o1` sorts ahead of `o2` exactly when `offset(o1)
< offset(o2)`).
In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
objects in packs stored by the MIDX, laid out in pack order, and the
packs arranged in MIDX order (with the preferred pack coming first).
Finally, note that the MIDX's reverse index is not stored as a chunk in
the multi-pack-index itself. This is done because the reverse index
includes the checksum of the pack or MIDX to which it belongs, which
makes it impossible to write in the MIDX. To avoid races when rewriting
the MIDX, a MIDX reverse index includes the MIDX's checksum in its
filename (e.g., `multi-pack-index-xyz.rev`).

View File

@ -0,0 +1,270 @@
Parallel Checkout Design Notes
==============================
The "Parallel Checkout" feature attempts to use multiple processes to
parallelize the work of uncompressing the blobs, applying in-core
filters, and writing the resulting contents to the working tree during a
checkout operation. It can be used by all checkout-related commands,
such as `clone`, `checkout`, `reset`, `sparse-checkout`, and others.
These commands share the following basic structure:
* Step 1: Read the current index file into memory.
* Step 2: Modify the in-memory index based upon the command, and
temporarily mark all cache entries that need to be updated.
* Step 3: Populate the working tree to match the new candidate index.
This includes iterating over all of the to-be-updated cache entries
and delete, create, or overwrite the associated files in the working
tree.
* Step 4: Write the new index to disk.
Step 3 is the focus of the "parallel checkout" effort described here.
Sequential Implementation
-------------------------
For the purposes of discussion here, the current sequential
implementation of Step 3 is divided in 3 parts, each one implemented in
its own function:
* Step 3a: `unpack-trees.c:check_updates()` contains a series of
sequential loops iterating over the `cache_entry`'s array. The main
loop in this function calls the Step 3b function for each of the
to-be-updated entries.
* Step 3b: `entry.c:checkout_entry()` examines the existing working tree
for file conflicts, collisions, and unsaved changes. It removes files
and creates leading directories as necessary. It calls the Step 3c
function for each entry to be written.
* Step 3c: `entry.c:write_entry()` loads the blob into memory, smudges
it if necessary, creates the file in the working tree, writes the
smudged contents, calls `fstat()` or `lstat()`, and updates the
associated `cache_entry` struct with the stat information gathered.
It wouldn't be safe to perform Step 3b in parallel, as there could be
race conditions between file creations and removals. Instead, the
parallel checkout framework lets the sequential code handle Step 3b,
and uses parallel workers to replace the sequential
`entry.c:write_entry()` calls from Step 3c.
Rejected Multi-Threaded Solution
--------------------------------
The most "straightforward" implementation would be to spread the set of
to-be-updated cache entries across multiple threads. But due to the
thread-unsafe functions in the ODB code, we would have to use locks to
coordinate the parallel operation. An early prototype of this solution
showed that the multi-threaded checkout would bring performance
improvements over the sequential code, but there was still too much lock
contention. A `perf` profiling indicated that around 20% of the runtime
during a local Linux clone (on an SSD) was spent in locking functions.
For this reason this approach was rejected in favor of using multiple
child processes, which led to a better performance.
Multi-Process Solution
----------------------
Parallel checkout alters the aforementioned Step 3 to use multiple
`checkout--worker` background processes to distribute the work. The
long-running worker processes are controlled by the foreground Git
command using the existing run-command API.
Overview
~~~~~~~~
Step 3b is only slightly altered; for each entry to be checked out, the
main process performs the following steps:
* M1: Check whether there is any untracked or unclean file in the
working tree which would be overwritten by this entry, and decide
whether to proceed (removing the file(s)) or not.
* M2: Create the leading directories.
* M3: Load the conversion attributes for the entry's path.
* M4: Check, based on the entry's type and conversion attributes,
whether the entry is eligible for parallel checkout (more on this
later). If it is eligible, enqueue the entry and the loaded
attributes to later write the entry in parallel. If not, write the
entry right away, using the default sequential code.
Note: we save the conversion attributes associated with each entry
because the workers don't have access to the main process' index state,
so they can't load the attributes by themselves (and the attributes are
needed to properly smudge the entry). Additionally, this has a positive
impact on performance as (1) we don't need to load the attributes twice
and (2) the attributes machinery is optimized to handle paths in
sequential order.
After all entries have passed through the above steps, the main process
checks if the number of enqueued entries is sufficient to spread among
the workers. If not, it just writes them sequentially. Otherwise, it
spawns the workers and distributes the queued entries uniformly in
continuous chunks. This aims to minimize the chances of two workers
writing to the same directory simultaneously, which could increase lock
contention in the kernel.
Then, for each assigned item, each worker:
* W1: Checks if there is any non-directory file in the leading part of
the entry's path or if there already exists a file at the entry' path.
If so, mark the entry with `PC_ITEM_COLLIDED` and skip it (more on
this later).
* W2: Creates the file (with O_CREAT and O_EXCL).
* W3: Loads the blob into memory (inflating and delta reconstructing
it).
* W4: Applies any required in-process filter, like end-of-line
conversion and re-encoding.
* W5: Writes the result to the file descriptor opened at W2.
* W6: Calls `fstat()` or lstat()` on the just-written path, and sends
the result back to the main process, together with the end status of
the operation and the item's identification number.
Note that, when possible, steps W3 to W5 are delegated to the streaming
machinery, removing the need to keep the entire blob in memory.
If the worker fails to read the blob or to write it to the working tree,
it removes the created file to avoid leaving empty files behind. This is
the *only* time a worker is allowed to remove a file.
As mentioned earlier, it is the responsibility of the main process to
remove any file that blocks the checkout operation (or abort if the
removal(s) would cause data loss and the user didn't ask to `--force`).
This is crucial to avoid race conditions and also to properly detect
path collisions at Step W1.
After the workers finish writing the items and sending back the required
information, the main process handles the results in two steps:
- First, it updates the in-memory index with the `lstat()` information
sent by the workers. (This must be done first as this information
might me required in the following step.)
- Then it writes the items which collided on disk (i.e. items marked
with `PC_ITEM_COLLIDED`). More on this below.
Path Collisions
---------------
Path collisions happen when two different paths correspond to the same
entry in the file system. E.g. the paths 'a' and 'A' would collide in a
case-insensitive file system.
The sequential checkout deals with collisions in the same way that it
deals with files that were already present in the working tree before
checkout. Basically, it checks if the path that it wants to write
already exists on disk, makes sure the existing file doesn't have
unsaved data, and then overwrites it. (To be more pedantic: it deletes
the existing file and creates the new one.) So, if there are multiple
colliding files to be checked out, the sequential code will write each
one of them but only the last will actually survive on disk.
Parallel checkout aims to reproduce the same behavior. However, we
cannot let the workers racily write to the same file on disk. Instead,
the workers detect when the entry that they want to check out would
collide with an existing file, and mark it with `PC_ITEM_COLLIDED`.
Later, the main process can sequentially feed these entries back to
`checkout_entry()` without the risk of race conditions. On clone, this
also has the effect of marking the colliding entries to later emit a
warning for the user, like the classic sequential checkout does.
The workers are able to detect both collisions among the entries being
concurrently written and collisions between a parallel-eligible entry
and an ineligible entry. The general idea for collision detection is
quite straightforward: for each parallel-eligible entry, the main
process must remove all files that prevent this entry from being written
(before enqueueing it). This includes any non-directory file in the
leading path of the entry. Later, when a worker gets assigned the entry,
it looks again for the non-directories files and for an already existing
file at the entry's path. If any of these checks finds something, the
worker knows that there was a path collision.
Because parallel checkout can distinguish path collisions from the case
where the file was already present in the working tree before checkout,
we could alternatively choose to skip the checkout of colliding entries.
However, each entry that doesn't get written would have NULL `lstat()`
fields on the index. This could cause performance penalties for
subsequent commands that need to refresh the index, as they would have
to go to the file system to see if the entry is dirty. Thus, if we have
N entries in a colliding group and we decide to write and `lstat()` only
one of them, every subsequent `git-status` will have to read, convert,
and hash the written file N - 1 times. By checking out all colliding
entries (like the sequential code does), we only pay the overhead once,
during checkout.
Eligible Entries for Parallel Checkout
--------------------------------------
As previously mentioned, not all entries passed to `checkout_entry()`
will be considered eligible for parallel checkout. More specifically, we
exclude:
- Symbolic links; to avoid race conditions that, in combination with
path collisions, could cause workers to write files at the wrong
place. For example, if we were to concurrently check out a symlink
'a' -> 'b' and a regular file 'A/f' in a case-insensitive file system,
we could potentially end up writing the file 'A/f' at 'a/f', due to a
race condition.
- Regular files that require external filters (either "one shot" filters
or long-running process filters). These filters are black-boxes to Git
and may have their own internal locking or non-concurrent assumptions.
So it might not be safe to run multiple instances in parallel.
+
Besides, long-running filters may use the delayed checkout feature to
postpone the return of some filtered blobs. The delayed checkout queue
and the parallel checkout queue are not compatible and should remain
separate.
+
Note: regular files that only require internal filters, like end-of-line
conversion and re-encoding, are eligible for parallel checkout.
Ineligible entries are checked out by the classic sequential codepath
*before* spawning workers.
Note: submodules's files are also eligible for parallel checkout (as
long as they don't fall into any of the excluding categories mentioned
above). But since each submodule is checked out in its own child
process, we don't mix the superproject's and the submodules' files in
the same parallel checkout process or queue.
The API
-------
The parallel checkout API was designed with the goal of minimizing
changes to the current users of the checkout machinery. This means that
they don't have to call a different function for sequential or parallel
checkout. As already mentioned, `checkout_entry()` will automatically
insert the given entry in the parallel checkout queue when this feature
is enabled and the entry is eligible; otherwise, it will just write the
entry right away, using the sequential code. In general, callers of the
parallel checkout API should look similar to this:
----------------------------------------------
int pc_workers, pc_threshold, err = 0;
struct checkout state;
get_parallel_checkout_configs(&pc_workers, &pc_threshold);
/*
* This check is not strictly required, but it
* should save some time in sequential mode.
*/
if (pc_workers > 1)
init_parallel_checkout();
for (each cache_entry ce to-be-updated)
err |= checkout_entry(ce, &state, NULL, NULL);
err |= run_parallel_checkout(&state, pc_workers, pc_threshold, NULL, NULL);
----------------------------------------------

Some files were not shown because too many files have changed in this diff Show More