Compare commits

..

93 Commits

Author SHA1 Message Date
668f2d5361 Git 2.30.9
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:43 +02:00
528290f8c6 Merge branch 'tb/config-copy-or-rename-in-file-injection'
Avoids issues with renaming or deleting sections with long lines, where
configuration values may be interpreted as sections, leading to
configuration injection. Addresses CVE-2023-29007.

* tb/config-copy-or-rename-in-file-injection:
  config.c: disallow overly-long lines in `copy_or_rename_section_in_file()`
  config.c: avoid integer truncation in `copy_or_rename_section_in_file()`
  config: avoid fixed-sized buffer when renaming/deleting a section
  t1300: demonstrate failure when renaming sections with long lines

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:42 +02:00
4fe5d0b10a Merge branch 'avoid-using-uninitialized-gettext'
Avoids the overhead of calling `gettext` when initialization of the
translated messages was skipped. Addresses CVE-2023-25815.

* avoid-using-uninitialized-gettext: (1 commit)
  gettext: avoid using gettext if the locale dir is not present
2023-04-17 21:15:42 +02:00
18e2b1cfc8 Merge branch 'js/apply-overwrite-rej-symlink-if-exists' into maint-2.30
Address CVE-2023-25652 by deleting any existing `.rej` symbolic links
instead of following them.

* js/apply-overwrite-rej-symlink-if-exists:
  apply --reject: overwrite existing `.rej` symlink if it exists

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2023-04-17 21:15:41 +02:00
3bb3d6bac5 config.c: disallow overly-long lines in copy_or_rename_section_in_file()
As a defense-in-depth measure to guard against any potentially-unknown
buffer overflows in `copy_or_rename_section_in_file()`, refuse to work
with overly-long lines in a gitconfig.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2023-04-17 21:15:40 +02:00
e91cfe6085 config.c: avoid integer truncation in copy_or_rename_section_in_file()
There are a couple of spots within `copy_or_rename_section_in_file()`
that incorrectly use an `int` to track an offset within a string, which
may truncate or wrap around to a negative value.

Historically it was impossible to have a line longer than 1024 bytes
anyway, since we used fgets() with a fixed-size buffer of exactly that
length. But the recent change to use a strbuf permits us to read lines
of arbitrary length, so it's possible for a malicious input to cause us
to overflow past INT_MAX and do an out-of-bounds array read.

Practically speaking, however, this should never happen, since it
requires 2GB section names or values, which are unrealistic in
non-malicious circumstances.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:40 +02:00
a5bb10fd5e config: avoid fixed-sized buffer when renaming/deleting a section
When renaming (or deleting) a section of configuration, Git uses the
function `git_config_copy_or_rename_section_in_file()` to rewrite the
configuration file after applying the rename or deletion to the given
section.

To do this, Git repeatedly calls `fgets()` to read the existing
configuration data into a fixed size buffer.

When the configuration value under `old_name` exceeds the size of the
buffer, we will call `fgets()` an additional time even if there is no
newline in the configuration file, since our read length is capped at
`sizeof(buf)`.

If the first character of the buffer (after zero or more characters
satisfying `isspace()`) is a '[', Git will incorrectly treat it as
beginning a new section when the original section is being removed. In
other words, a configuration value satisfying this criteria can
incorrectly be considered as a new secftion instead of a variable in the
original section.

Avoid this issue by using a variable-width buffer in the form of a
strbuf rather than a fixed-with region on the stack. A couple of small
points worth noting:

  - Using a strbuf will cause us to allocate arbitrary sizes to match
    the length of each line.  In practice, we don't expect any
    reasonable configuration files to have lines that long, and a
    bandaid will be introduced in a later patch to ensure that this is
    the case.

  - We are using strbuf_getwholeline() here instead of strbuf_getline()
    in order to match `fgets()`'s behavior of leaving the trailing LF
    character on the buffer (as well as a trailing NUL).

    This could be changed later, but using strbuf_getwholeline() changes
    the least about this function's implementation, so it is picked as
    the safest path.

  - It is temping to want to replace the loop to skip over characters
    matching isspace() at the beginning of the buffer with a convenience
    function like `strbuf_ltrim()`. But this is the wrong approach for a
    couple of reasons:

    First, it involves a potentially large and expensive `memmove()`
    which we would like to avoid. Second, and more importantly, we also
    *do* want to preserve those spaces to avoid changing the output of
    other sections.

In all, this patch is a minimal replacement of the fixed-width buffer in
`git_config_copy_or_rename_section_in_file()` to instead use a `struct
strbuf`.

Reported-by: André Baptista <andre@ethiack.com>
Reported-by: Vítor Pinho <vitor@ethiack.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:40 +02:00
c4137be0f5 gettext: avoid using gettext if the locale dir is not present
In cc5e1bf992 (gettext: avoid initialization if the locale dir is not
present, 2018-04-21) Git was taught to avoid a costly gettext start-up
when there are not even any localized messages to work with.

But we still called `gettext()` and `ngettext()` functions.

Which caused a problem in Git for Windows when the libgettext that is
consumed from the MSYS2 project stopped using a runtime prefix in
https://github.com/msys2/MINGW-packages/pull/10461

Due to that change, we now use an unintialized gettext machinery that
might get auto-initialized _using an unintended locale directory_:
`C:\mingw64\share\locale`.

Let's record the fact when the gettext initialization was skipped, and
skip calling the gettext functions accordingly.

This addresses CVE-2023-25815.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:39 +02:00
29198213c9 t1300: demonstrate failure when renaming sections with long lines
When renaming a configuration section which has an entry whose length
exceeds the size of our buffer in config.c's implementation of
`git_config_copy_or_rename_section_in_file()`, Git will incorrectly
form a new configuration section with part of the data in the section
being removed.

In this instance, our first configuration file looks something like:

    [b]
      c = d <spaces> [a] e = f
    [a]
      g = h

Here, we have two configuration values, "b.c", and "a.g". The value "[a]
e = f" belongs to the configuration value "b.c", and does not form its
own section.

However, when renaming the section 'a' to 'xyz', Git will write back
"[xyz]\ne = f", but "[xyz]" is still attached to the value of "b.c",
which is why "e = f" on its own line becomes a new entry called "b.e".

A slightly different example embeds the section being renamed within
another section.

Demonstrate this failure in a test in t1300, which we will fix in the
following commit.

Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2023-04-17 21:15:39 +02:00
9db05711c9 apply --reject: overwrite existing .rej symlink if it exists
The `git apply --reject` is expected to write out `.rej` files in case
one or more hunks fail to apply cleanly. Historically, the command
overwrites any existing `.rej` files. The idea being that
apply/reject/edit cycles are relatively common, and the generated `.rej`
files are not considered precious.

But the command does not overwrite existing `.rej` symbolic links, and
instead follows them. This is unsafe because the same patch could
potentially create such a symbolic link and point at arbitrary paths
outside the current worktree, and `git apply` would write the contents
of the `.rej` file into that location.

Therefore, let's make sure that any existing `.rej` file or symbolic
link is removed before writing it.

Reported-by: RyotaK <ryotak.mail@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-04-17 21:15:38 +02:00
2f3b28f272 Merge branch 'js/gettext-poison-fixes'
The `maint-2.30` branch accumulated quite a few fixes over the past two
years. Most of those fixes were originally based on newer versions, and
while the patches cherry-picked cleanly, we weren't diligent enough to
pay attention to the CI builds and the GETTEXT_POISON job regressed.
This topic branch fixes that.

* js/gettext-poison-fixes
  t0033: GETTEXT_POISON fix
  t0003: GETTEXT_POISON fix, part 1
  t0003: GETTEXT_POISON fix, conclusion
  t5619: GETTEXT_POISON fix
  t5604: GETTEXT_POISON fix, part 1
  t5604: GETTEXT_POISON fix, conclusion
2023-04-17 21:15:37 +02:00
4989c35688 Merge branch 'ds/github-actions-use-newer-ubuntu'
Update the version of Ubuntu used for GitHub Actions CI from 18.04
to 22.04.

* ds/github-actions-use-newer-ubuntu:
  ci: update 'static-analysis' to Ubuntu 22.04
2023-04-17 21:15:36 +02:00
fef08dd32e ci: update 'static-analysis' to Ubuntu 22.04
GitHub Actions scheduled a brownout of Ubuntu 18.04, which canceled all
runs of the 'static-analysis' job in our CI runs. Update to 22.04 to
avoid this as the brownout later turns into a complete deprecation.

The use of 18.04 was set in d051ed77ee (.github/workflows/main.yml: run
static-analysis on bionic, 2021-02-08) due to the lack of Coccinelle
being available on 20.04 (which continues today).

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 18:17:53 +02:00
e4cb3693a4 Merge branch 'backport/jk/range-diff-fixes'
"git range-diff" code clean-up. Needed to pacify modern GCC versions.

* jk/range-diff-fixes:
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
2023-03-22 18:00:36 +01:00
3c7896e362 Merge branch 'backport/jk/curl-avoid-deprecated-api' into maint-2.30
Deal with a few deprecation warning from cURL library.

* jk/curl-avoid-deprecated-api:
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
2023-03-22 18:00:36 +01:00
6f5ff3aa31 Merge branch 'backport/jx/ci-ubuntu-fix' into maint-2.30
Adjust the GitHub CI to newer ubuntu release.

* jx/ci-ubuntu-fix:
  github-actions: run gcc-8 on ubuntu-20.04 image
  ci: install python on ubuntu
  ci: use the same version of p4 on both Linux and macOS
  ci: remove the pipe after "p4 -V" to catch errors
2023-03-22 18:00:35 +01:00
0737200a06 Merge branch 'backport/jc/http-clear-finished-pointer' into maint-2.30
Meant to go with js/ci-gcc-12-fixes.
source: <xmqq7d68ytj8.fsf_-_@gitster.g>

* jc/http-clear-finished-pointer:
  http.c: clear the 'finished' member once we are done with it
2023-03-22 18:00:34 +01:00
0a1dc55c40 Merge branch 'backport/js/ci-gcc-12-fixes'
Fixes real problems noticed by gcc 12 and works around false
positives.

* js/ci-gcc-12-fixes:
  nedmalloc: avoid new compile error
  compat/win32/syslog: fix use-after-realloc
2023-03-22 18:00:34 +01:00
5843080c85 http.c: clear the 'finished' member once we are done with it
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.

Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request).  The loop terminating condition mistakenly
thought that the original request has yet to be completed.

Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed.  It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.

One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns.  Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 17:58:29 +01:00
321854ac46 clone.c: avoid "exceeds maximum object size" error with GCC v12.x
Technically, the pointer difference `end - start` _could_ be negative,
and when cast to an (unsigned) `size_t` that would cause problems. In
this instance, the symptom is:

dir.c: In function 'git_url_basename':
dir.c:3087:13: error: 'memchr' specified bound [9223372036854775808, 0]
       exceeds maximum object size 9223372036854775807
       [-Werror=stringop-overread]
    CC ewah/bitmap.o
 3087 |         if (memchr(start, '/', end - start) == NULL
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While it is a bit far-fetched to think that `end` (which is defined as
`repo + strlen(repo)`) and `start` (which starts at `repo` and never
steps beyond the NUL terminator) could result in such a negative
difference, GCC has no way of knowing that.

See also https://gcc.gnu.org/bugzilla//show_bug.cgi?id=85783.

Let's just add a safety check, primarily for GCC's benefit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 17:53:32 +01:00
0c8d22abaf t5604: GETTEXT_POISON fix, conclusion
In fade728df1 (apply: fix writing behind newly created symbolic links,
2023-02-02), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:56 +01:00
7c811ed5e5 t5604: GETTEXT_POISON fix, part 1
In bffc762f87 (dir-iterator: prevent top-level symlinks without
FOLLOW_SYMLINKS, 2023-01-24), we backported a patch onto v2.30.* that
was originally based on a much newer version. The v2.30.* release train
still has the GETTEXT_POISON CI job, though, and hence needs
`test_i18n*` in its tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:56 +01:00
a2b2173cfe t5619: GETTEXT_POISON fix
In cf8f6ce02a (clone: delay picking a transport until after
get_repo_path(), 2023-01-24), we backported a patch onto v2.30.* that
was originally based on a much newer version. The v2.30.* release train
still has the GETTEXT_POISON CI job, though, and hence needs
`test_i18n*` in its tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:56 +01:00
c025b4b2f1 range-diff: use ssize_t for parsed "len" in read_patches()
As we iterate through the buffer containing git-log output, parsing
lines, we use an "int" to store the size of an individual line. This
should be a size_t, as we have no guarantee that there is not a
malicious 2GB+ commit-message line in the output.

Overflowing this integer probably doesn't do anything _too_ terrible. We
are not using the value to size a buffer, so the worst case is probably
an out-of-bounds read from before the array. But it's easy enough to
fix.

Note that we have to use ssize_t here, since we also store the length
result from parse_git_diff_header(), which may return a negative value
for error. That function actually returns an int itself, which has a
similar overflow problem, but I'll leave that for another day. Much
of the apply.c code uses ints and should be converted as a whole; in the
meantime, a negative return from parse_git_diff_header() will be
interpreted as an error, and we'll bail (so we can't handle such a case,
but given that it's likely to be malicious anyway, the important thing
is we don't have any memory errors).

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:55 +01:00
d99728b2ca t0003: GETTEXT_POISON fix, conclusion
In 3c50032ff5 (attr: ignore overly large gitattributes files,
2022-12-01), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:55 +01:00
a36df79a37 range-diff: handle unterminated lines in read_patches()
When parsing our buffer of output from git-log, we have a
find_end_of_line() helper that finds the next newline, and gives us the
number of bytes to move past it, or the size of the whole remaining
buffer if there is no newline.

But trying to handle both those cases leads to some oddities:

  - we try to overwrite the newline with NUL in the caller, by writing
    over line[len-1]. This is at best redundant, since the helper will
    already have done so if it saw a newline. But if it didn't see a
    newline, it's actively wrong; we'll overwrite the byte at the end of
    the (unterminated) line.

    We could solve this just dropping the extra NUL assignment in the
    caller and just letting the helper do the right thing. But...

  - if we see a "diff --git" line, we'll restore the newline on top of
    the NUL byte, so we can pass the string to parse_git_diff_header().
    But if there was no newline in the first place, we can't do this.
    There's no place to put it (the current code writes a newline
    over whatever byte we obliterated earlier). The best we can do is
    feed the complete remainder of the buffer to the function (which is,
    in fact, a string, by virtue of being a strbuf).

To solve this, the caller needs to know whether we actually found a
newline or not. We could modify find_end_of_line() to return that
information, but we can further observe that it has only one caller.
So let's just inline it in that caller.

Nobody seems to have noticed this case, probably because git-log would
never produce input that doesn't end with a newline. Arguably we could
just return an error as soon as we see that the output does not end in a
newline. But the code to do so actually ends up _longer_, mostly because
of the cleanup we have to do in handling the error.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:55 +01:00
e4298ccd7f t0003: GETTEXT_POISON fix, part 1
In dfa6b32b5e (attr: ignore attribute lines exceeding 2048 bytes,
2022-12-01), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:55 +01:00
8516dac1e1 t0033: GETTEXT_POISON fix
In e47363e5a8 (t0033: add tests for safe.directory, 2022-04-13), we
backported a patch onto v2.30.* that was originally based on a much
newer version. The v2.30.* release train still has the GETTEXT_POISON
CI job, though, and hence needs `test_i18n*` in its tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:55 +01:00
07f91e5e79 http: support CURLOPT_PROTOCOLS_STR
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.

Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:

  - we don't have to worry that silencing curl's deprecation warnings
    might cause us to miss other more useful ones

  - we'd eventually want to move to the new variant anyway, so this gets
    us set up (albeit with some extra ugly boilerplate for the
    conditional)

There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:

  GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
  if (...http is allowed...)
	GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);

and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.

On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).

This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:54 +01:00
a69043d510 ci: install python on ubuntu
Python is missing from the default ubuntu-22.04 runner image, which
prevents git-p4 from working. To install python on ubuntu, we need
to provide the correct package names:

 * On Ubuntu 18.04 (bionic), "/usr/bin/python2" is provided by the
   "python" package, and "/usr/bin/python3" is provided by the "python3"
   package.

 * On Ubuntu 20.04 (focal) and above, "/usr/bin/python2" is provided by
   the "python2" package which has a different name from bionic, and
   "/usr/bin/python3" is provided by "python3".

Since the "ubuntu-latest" runner image has a higher version, its
safe to use "python2" or "python3" package name.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:54 +01:00
18bc8eb7b5 range-diff: drop useless "offset" variable from read_patches()
The "offset" variable was was introduced in 44b67cb62b (range-diff:
split lines manually, 2019-07-11), but it has never done anything
useful. We use it to count up the number of bytes we've consumed, but we
never look at the result. It was probably copied accidentally from an
almost-identical loop in apply.c:find_header() (and the point of that
commit was to make use of the parse_git_diff_header() function which
underlies both).

Because the variable was set but not used, most compilers didn't seem to
notice, but the upcoming clang-14 does complain about it, via its
-Wunused-but-set-variable warning.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:54 +01:00
b0e3e2d06b http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.

But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros).  But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.

Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).

Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.

Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:54 +01:00
fda237cb64 http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-03-12 20:31:54 +01:00
86f6f4fa91 nedmalloc: avoid new compile error
GCC v12.x complains thusly:

compat/nedmalloc/nedmalloc.c: In function 'DestroyCaches':
compat/nedmalloc/nedmalloc.c:326:12: error: the comparison will always
                              evaluate as 'true' for the address of 'caches'
                              will never be NULL [-Werror=address]
  326 |         if(p->caches)
      |            ^
compat/nedmalloc/nedmalloc.c:196:22: note: 'caches' declared here
  196 |         threadcache *caches[THREADCACHEMAXCACHES];
      |                      ^~~~~~

... and it is correct, of course.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
79e0626b39 ci: use the same version of p4 on both Linux and macOS
There would be a segmentation fault when running p4 v16.2 on ubuntu
22.04 which is the latest version of ubuntu runner image for github
actions.

By checking each version from [1], p4d version 21.1 and above can work
properly on ubuntu 22.04. But version 22.x will break some p4 test
cases. So p4 version 21.x is exactly the version we can use.

With this update, the versions of p4 for Linux and macOS happen to be
the same. So we can add the version number directly into the "P4WHENCE"
variable, and reuse it in p4 installation for macOS.

By removing the "LINUX_P4_VERSION" variable from "ci/lib.sh", the
comment left above has nothing to do with p4, but still applies to
git-lfs. Since we have a fixed version of git-lfs installed on Linux,
we may have a different version on macOS.

[1]: https://cdist2.perforce.com/perforce/

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
20854bc47a ci: remove the pipe after "p4 -V" to catch errors
When installing p4 as a dependency, we used to pipe output of "p4 -V"
and "p4d -V" to validate the installation and output a condensed version
information. But this would hide potential errors of p4 and would stop
with an empty output. E.g.: p4d version 16.2 running on ubuntu 22.04
causes sigfaults, even before it produces any output.

By removing the pipe after "p4 -V" and "p4d -V", we may get a
verbose output, and stop immediately on errors because we have "set
-e" in "ci/lib.sh". Since we won't look at these trace logs unless
something fails, just including the raw output seems most sensible.

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
c03ffcff4e github-actions: run gcc-8 on ubuntu-20.04 image
GitHub starts to upgrade its runner image "ubuntu-latest" from version
"ubuntu-20.04" to version "ubuntu-22.04". It will fail to find and
install "gcc-8" package on the new runner image.

Change the runner image of the `linux-gcc` job from "ubuntu-latest" to
"ubuntu-20.04" in order to install "gcc-8" as a dependency.

Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:53 +01:00
417fb91b5d compat/win32/syslog: fix use-after-realloc
Git for Windows' SDK recently upgraded to GCC v12.x which points out
that the `pos` variable might be used even after the corresponding
memory was `realloc()`ed and therefore potentially no longer valid.

Since a subset of this SDK is used in Git's CI/PR builds, we need to fix
this to continue to be able to benefit from the CI/PR runs.

Note: This bug has been with us since 2a6b149c64 (mingw: avoid using
strbuf in syslog, 2011-10-06), and while it looks tempting to replace
the hand-rolled string manipulation with a `strbuf`-based one, that
commit's message explains why we cannot do that: The `syslog()` function
is called as part of the function in `daemon.c` which is set as the
`die()` routine, and since `strbuf_grow()` can call that function if it
runs out of memory, this would cause a nasty infinite loop that we do
not want to re-introduce.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-12 20:31:52 +01:00
394a759d2b Git 2.30.8
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 09:14:45 +01:00
a3033a68ac Merge branch 'ps/apply-beyond-symlink' into maint-2.30
Fix a vulnerability (CVE-2023-23946) that allows crafted input to trick
`git apply` into writing files outside of the working tree.

* ps/apply-beyond-symlink:
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:12:16 +01:00
2c9a4c7310 Merge branch 'tb/clone-local-symlinks' into maint-2.30
Resolve a security vulnerability (CVE-2023-22490) where `clone_local()`
is used in conjunction with non-local transports, leading to arbitrary
path exfiltration.

* tb/clone-local-symlinks:
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:09:14 +01:00
fade728df1 apply: fix writing behind newly created symbolic links
When writing files git-apply(1) initially makes sure that none of the
files it is about to create are behind a symlink:

```
 $ git init repo
 Initialized empty Git repository in /tmp/repo/.git/
 $ cd repo/
 $ ln -s dir symlink
 $ git apply - <<EOF
 diff --git a/symlink/file b/symlink/file
 new file mode 100644
 index 0000000..e69de29
 EOF
 error: affected file 'symlink/file' is beyond a symbolic link
```

This safety mechanism is crucial to ensure that we don't write outside
of the repository's working directory. It can be fooled though when the
patch that is being applied creates the symbolic link in the first
place, which can lead to writing files in arbitrary locations.

Fix this by checking whether the path we're about to create is
beyond a symlink or not. Tightening these checks like this should be
fine as we already have these precautions in Git as explained
above. Ideally, we should update the check we do up-front before
starting to reflect the computed changes to the working tree so that
we catch this case as well, but as part of embargoed security work,
adding an equivalent check just before we try to write out a file
should serve us well as a reasonable first step.

Digging back into history shows that this vulnerability has existed
since at least Git v2.9.0. As Git v2.8.0 and older don't build on my
system anymore I cannot tell whether older versions are affected, as
well.

Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-03 14:41:31 -08:00
bffc762f87 dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
When using the dir_iterator API, we first stat(2) the base path, and
then use that as a starting point to enumerate the directory's contents.

If the directory contains symbolic links, we will immediately die() upon
encountering them without the `FOLLOW_SYMLINKS` flag. The same is not
true when resolving the top-level directory, though.

As explained in a previous commit, this oversight in 6f054f9fb3
(builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28)
can be used as an attack vector to include arbitrary files on a victim's
filesystem from outside of the repository.

Prevent resolving top-level symlinks unless the FOLLOW_SYMLINKS flag is
given, which will cause clones of a repository with a symlink'd
"$GIT_DIR/objects" directory to fail.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
cf8f6ce02a clone: delay picking a transport until after get_repo_path()
In the previous commit, t5619 demonstrates an issue where two calls to
`get_repo_path()` could trick Git into using its local clone mechanism
in conjunction with a non-local transport.

That sequence is:

 - the starting state is that the local path https:/example.com/foo is a
   symlink that points to ../../../.git/modules/foo. So it's dangling.

 - get_repo_path() sees that no such path exists (because it's
   dangling), and thus we do not canonicalize it into an absolute path

 - because we're using --separate-git-dir, we create .git/modules/foo.
   Now our symlink is no longer dangling!

 - we pass the url to transport_get(), which sees it as an https URL.

 - we call get_repo_path() again, on the url. This second call was
   introduced by f38aa83f9a (use local cloning if insteadOf makes a
   local URL, 2014-07-17). The idea is that we want to pull the url
   fresh from the remote.c API, because it will apply any aliases.

And of course now it sees that there is a local file, which is a
mismatch with the transport we already selected.

The issue in the above sequence is calling `transport_get()` before
deciding whether or not the repository is indeed local, and not passing
in an absolute path if it is local.

This is reminiscent of a similar bug report in [1], where it was
suggested to perform the `insteadOf` lookup earlier. Taking that
approach may not be as straightforward, since the intent is to store the
original URL in the config, but to actually fetch from the insteadOf
one, so conflating the two early on is a non-starter.

Note: we pass the path returned by `get_repo_path(remote->url[0])`,
which should be the same as `repo_name` (aside from any `insteadOf`
rewrites).

We *could* pass `absolute_pathdup()` of the same argument, which
86521acaca (Bring local clone's origin URL in line with that of a remote
clone, 2008-09-01) indicates may differ depending on the presence of
".git/" for a non-bare repo. That matters for forming relative submodule
paths, but doesn't matter for the second call, since we're just feeding
it to the transport code, which is fine either way.

[1]: https://lore.kernel.org/git/CAMoD=Bi41mB3QRn3JdZL-FGHs4w3C2jGpnJB-CqSndO7FMtfzA@mail.gmail.com/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
58325b93c5 t5619: demonstrate clone_local() with ambiguous transport
When cloning a repository, Git must determine (a) what transport
mechanism to use, and (b) whether or not the clone is local.

Since f38aa83f9a (use local cloning if insteadOf makes a local URL,
2014-07-17), the latter check happens after the remote has been
initialized, and references the remote's URL instead of the local path.
This is done to make it possible for a `url.<base>.insteadOf` rule to
convert a remote URL into a local one, in which case the `clone_local()`
mechanism should be used.

However, with a specially crafted repository, Git can be tricked into
using a non-local transport while still setting `is_local` to "1" and
using the `clone_local()` optimization. The below test case
demonstrates such an instance, and shows that it can be used to include
arbitrary (known) paths in the working copy of a cloned repository on a
victim's machine[^1], even if local file clones are forbidden by
`protocol.file.allow`.

This happens in a few parts:

 1. We first call `get_repo_path()` to see if the remote is a local
    path. If it is, we replace the repo name with its absolute path.

 2. We then call `transport_get()` on the repo name and decide how to
    access it. If it was turned into an absolute path in the previous
    step, then we should always treat it like a file.

 3. We use `get_repo_path()` again, and set `is_local` as appropriate.
    But it's already too late to rewrite the repo name as an absolute
    path, since we've already fed it to the transport code.

The attack works by including a submodule whose URL corresponds to a
path on disk. In the below example, the repository "sub" is reachable
via the dumb HTTP protocol at (something like):

    http://127.0.0.1:NNNN/dumb/sub.git

However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level
directory called "http:", then nested directories "127.0.0.1:NNNN", and
"dumb") exists within the repository, too.

To determine this, it first picks the appropriate transport, which is
dumb HTTP. It then uses the remote's URL in order to determine whether
the repository exists locally on disk. However, the malicious repository
also contains an embedded stub repository which is the target of a
symbolic link at the local path corresponding to the "sub" repository on
disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git",
pointing to the stub repository via ".git/modules/sub/../../../repo").

This stub repository fools Git into thinking that a local repository
exists at that URL and thus can be cloned locally. The affected call is
in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which
locates a valid repository at that target.

This then causes Git to set the `is_local` variable to "1", and in turn
instructs Git to clone the repository using its local clone optimization
via the `clone_local()` function.

The exploit comes into play because the stub repository's top-level
"$GIT_DIR/objects" directory is a symbolic link which can point to an
arbitrary path on the victim's machine. `clone_local()` resolves the
top-level "objects" directory through a `stat(2)` call, meaning that we
read through the symbolic link and copy or hardlink the directory
contents at the destination of the link.

In other words, we can get steps (1) and (3) to disagree by leveraging
the dangling symlink to pick a non-local transport in the first step,
and then set is_local to "1" in the third step when cloning with
`--separate-git-dir`, which makes the symlink non-dangling.

This can result in data-exfiltration on the victim's machine when
sensitive data is at a known path (e.g., "/home/$USER/.ssh").

The appropriate fix is two-fold:

 - Resolve the transport later on (to avoid using the local
   clone optimization with a non-local transport).

 - Avoid reading through the top-level "objects" directory when
   (correctly) using the clone_local() optimization.

This patch merely demonstrates the issue. The following two patches will
implement each part of the above fix, respectively.

[^1]: Provided that any target directory does not contain symbolic
  links, in which case the changes from 6f054f9fb3 (builtin/clone.c:
  disallow `--local` clones with symlinks, 2022-07-28) will abort the
  clone.

Reported-by: yvvdwf <yvvdwf@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
0227130244 attr: adjust a mismatched data type
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.

Windows and 32-bit Linux are among the affected platforms.

In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)

The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.

Let's address this by passing a pointer to a variable of the expected
data type.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19 13:38:06 -08:00
b7b37a3371 Git 2.30.7
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 20:56:43 +09:00
6662a836eb Merge branch 'ps/attr-limits' into maint-2.30 2022-12-09 16:05:52 +09:00
3305300f4c Merge branch 'ps/format-padding-fix' into maint-2.30 2022-12-09 16:02:39 +09:00
304a50adff pretty: restrict input lengths for padding and wrapping formats
Both the padding and wrapping formatting directives allow the caller to
specify an integer that ultimately leads to us adding this many chars to
the result buffer. As a consequence, it is trivial to e.g. allocate 2GB
of RAM via a single formatting directive and cause resource exhaustion
on the machine executing this logic. Furthermore, it is debatable
whether there are any sane usecases that require the user to pad data to
2GB boundaries or to indent wrapped data by 2GB.

Restrict the input sizes to 16 kilobytes at a maximum to limit the
amount of bytes that can be requested by the user. This is not meant
as a fix because there are ways to trivially amplify the amount of
data we generate via formatting directives; the real protection is
achieved by the changes in previous steps to catch and avoid integer
wraparound that causes us to under-allocate and access beyond the
end of allocated memory reagions. But having such a limit
significantly helps fuzzing the pretty format, because the fuzzer is
otherwise quite fast to run out-of-memory as it discovers these
formatters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
f930a23943 utf8: refactor strbuf_utf8_replace to not rely on preallocated buffer
In `strbuf_utf8_replace`, we preallocate the destination buffer and then
use `memcpy` to copy bytes into it at computed offsets. This feels
rather fragile and is hard to understand at times. Refactor the code to
instead use `strbuf_add` and `strbuf_addstr` so that we can be sure that
there is no possibility to perform an out-of-bounds write.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
81c2d4c3a5 utf8: fix checking for glyph width in strbuf_utf8_replace()
In `strbuf_utf8_replace()`, we call `utf8_width()` to compute the width
of the current glyph. If the glyph is a control character though it can
be that `utf8_width()` returns `-1`, but because we assign this value to
a `size_t` the conversion will cause us to underflow. This bug can
easily be triggered with the following command:

    $ git log --pretty='format:xxx%<|(1,trunc)%x10'

>From all I can see though this seems to be a benign underflow that has
no security-related consequences.

Fix the bug by using an `int` instead. When we see a control character,
we now copy it into the target buffer but don't advance the current
width of the string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
937b71cc8b utf8: fix overflow when returning string width
The return type of both `utf8_strwidth()` and `utf8_strnwidth()` is
`int`, but we operate on string lengths which are typically of type
`size_t`. This means that when the string is longer than `INT_MAX`, we
will overflow and thus return a negative result.

This can lead to an out-of-bounds write with `--pretty=format:%<1)%B`
and a commit message that is 2^31+1 bytes long:

    =================================================================
    ==26009==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001168 at pc 0x7f95c4e5f427 bp 0x7ffd8541c900 sp 0x7ffd8541c0a8
    WRITE of size 2147483649 at 0x603000001168 thread T0
        #0 0x7f95c4e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5612bbb1068c in format_and_pad_commit pretty.c:1763
        #2 0x5612bbb1087a in format_commit_item pretty.c:1801
        #3 0x5612bbc33bab in strbuf_expand strbuf.c:429
        #4 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #5 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #6 0x5612bba0a4d5 in show_log log-tree.c:781
        #7 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #8 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #9 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #10 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #11 0x5612bb56c993 in run_builtin git.c:466
        #12 0x5612bb56d397 in handle_builtin git.c:721
        #13 0x5612bb56db07 in run_argv git.c:788
        #14 0x5612bb56e8a7 in cmd_main git.c:923
        #15 0x5612bb803682 in main common-main.c:57
        #16 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #17 0x7f95c4c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #18 0x5612bb5680e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000001168 is located 0 bytes to the right of 24-byte region [0x603000001150,0x603000001168)
    allocated by thread T0 here:
        #0 0x7f95c4ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5612bbcdd556 in xrealloc wrapper.c:136
        #2 0x5612bbc310a3 in strbuf_grow strbuf.c:99
        #3 0x5612bbc32acd in strbuf_add strbuf.c:298
        #4 0x5612bbc33aec in strbuf_expand strbuf.c:418
        #5 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #6 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #7 0x5612bba0a4d5 in show_log log-tree.c:781
        #8 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #9 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #10 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #11 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #12 0x5612bb56c993 in run_builtin git.c:466
        #13 0x5612bb56d397 in handle_builtin git.c:721
        #14 0x5612bb56db07 in run_argv git.c:788
        #15 0x5612bb56e8a7 in cmd_main git.c:923
        #16 0x5612bb803682 in main common-main.c:57
        #17 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0c067fff81d0: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff81e0: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
      0x0c067fff81f0: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
      0x0c067fff8200: fd fd fd fa fa fa fd fd fd fd fa fa 00 00 00 fa
      0x0c067fff8210: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
    =>0x0c067fff8220: fd fa fa fa fd fd fd fa fa fa 00 00 00[fa]fa fa
      0x0c067fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==26009==ABORTING

Now the proper fix for this would be to convert both functions to return
an `size_t` instead of an `int`. But given that this commit may be part
of a security release, let's instead do the minimal viable fix and die
in case we see an overflow.

Add a test that would have previously caused us to crash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
17d23e8a38 utf8: fix returning negative string width
The `utf8_strnwidth()` function calls `utf8_width()` in a loop and adds
its returned width to the end result. `utf8_width()` can return `-1`
though in case it reads a control character, which means that the
computed string width is going to be wrong. In the worst case where
there are more control characters than non-control characters, we may
even return a negative string width.

Fix this bug by treating control characters as having zero width.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
522cc87fdc utf8: fix truncated string lengths in utf8_strnwidth()
The `utf8_strnwidth()` function accepts an optional string length as
input parameter. This parameter can either be set to `-1`, in which case
we call `strlen()` on the input. Or it can be set to a positive integer
that indicates a precomputed length, which callers typically compute by
calling `strlen()` at some point themselves.

The input parameter is an `int` though, whereas `strlen()` returns a
`size_t`. This can lead to implementation-defined behaviour though when
the `size_t` cannot be represented by the `int`. In the general case
though this leads to wrap-around and thus to negative string sizes,
which is sure enough to not lead to well-defined behaviour.

Fix this by accepting a `size_t` instead of an `int` as string length.
While this takes away the ability of callers to simply pass in `-1` as
string length, it really is trivial enough to convert them to instead
pass in `strlen()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
48050c42c7 pretty: fix integer overflow in wrapping format
The `%w(width,indent1,indent2)` formatting directive can be used to
rewrap text to a specific width and is designed after git-shortlog(1)'s
`-w` parameter. While the three parameters are all stored as `size_t`
internally, `strbuf_add_wrapped_text()` accepts integers as input. As a
result, the casted integers may overflow. As these now-negative integers
are later on passed to `strbuf_addchars()`, we will ultimately run into
implementation-defined behaviour due to casting a negative number back
to `size_t` again. On my platform, this results in trying to allocate
9000 petabyte of memory.

Fix this overflow by using `cast_size_t_to_int()` so that we reject
inputs that cannot be represented as an integer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
1de69c0cdd pretty: fix adding linefeed when placeholder is not expanded
When a formatting directive has a `+` or ` ` after the `%`, then we add
either a line feed or space if the placeholder expands to a non-empty
string. In specific cases though this logic doesn't work as expected,
and we try to add the character even in the case where the formatting
directive is empty.

One such pattern is `%w(1)%+d%+w(2)`. `%+d` expands to reference names
pointing to a certain commit, like in `git log --decorate`. For a tagged
commit this would for example expand to `\n (tag: v1.0.0)`, which has a
leading newline due to the `+` modifier and a space added by `%d`. Now
the second wrapping directive will cause us to rewrap the text to
`\n(tag:\nv1.0.0)`, which is one byte shorter due to the missing leading
space. The code that handles the `+` magic now notices that the length
has changed and will thus try to insert a leading line feed at the
original posititon. But as the string was shortened, the original
position is past the buffer's boundary and thus we die with an error.

Now there are two issues here:

    1. We check whether the buffer length has changed, not whether it
       has been extended. This causes us to try and add the character
       past the string boundary.

    2. The current logic does not make any sense whatsoever. When the
       string got expanded due to the rewrap, putting the separator into
       the original position is likely to put it somewhere into the
       middle of the rewrapped contents.

It is debatable whether `%+w()` makes any sense in the first place.
Strictly speaking, the placeholder never expands to a non-empty string,
and consequentially we shouldn't ever accept this combination. We thus
fix the bug by simply refusing `%+w()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
f6e0b9f389 pretty: fix out-of-bounds read when parsing invalid padding format
An out-of-bounds read can be triggered when parsing an incomplete
padding format string passed via `--pretty=format` or in Git archives
when files are marked with the `export-subst` gitattribute.

This bug exists since we have introduced support for truncating output
via the `trunc` keyword a7f01c6b4d (pretty: support truncating in %>, %<
and %><, 2013-04-19). Before this commit, we used to find the end of the
formatting string by using strchr(3P). This function returns a `NULL`
pointer in case the character in question wasn't found. The subsequent
check whether any character was found thus simply checked the returned
pointer. After the commit we switched to strcspn(3P) though, which only
returns the offset to the first found character or to the trailing NUL
byte. As the end pointer is now computed by adding the offset to the
start pointer it won't be `NULL` anymore, and as a consequence the check
doesn't do anything anymore.

The out-of-bounds data that is being read can in fact end up in the
formatted string. As a consequence, it is possible to leak memory
contents either by calling git-log(1) or via git-archive(1) when any of
the archived files is marked with the `export-subst` gitattribute.

    ==10888==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000398 at pc 0x7f0356047cb2 bp 0x7fff3ffb95d0 sp 0x7fff3ffb8d78
    READ of size 1 at 0x602000000398 thread T0
        #0 0x7f0356047cb1 in __interceptor_strchrnul /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725
        #1 0x563b7cec9a43 in strbuf_expand strbuf.c:417
        #2 0x563b7cda7060 in repo_format_commit_message pretty.c:1869
        #3 0x563b7cda8d0f in pretty_print_commit pretty.c:2161
        #4 0x563b7cca04c8 in show_log log-tree.c:781
        #5 0x563b7cca36ba in log_tree_commit log-tree.c:1117
        #6 0x563b7c927ed5 in cmd_log_walk_no_free builtin/log.c:508
        #7 0x563b7c92835b in cmd_log_walk builtin/log.c:549
        #8 0x563b7c92b1a2 in cmd_log builtin/log.c:883
        #9 0x563b7c802993 in run_builtin git.c:466
        #10 0x563b7c803397 in handle_builtin git.c:721
        #11 0x563b7c803b07 in run_argv git.c:788
        #12 0x563b7c8048a7 in cmd_main git.c:923
        #13 0x563b7ca99682 in main common-main.c:57
        #14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    0x602000000398 is located 0 bytes to the right of 8-byte region [0x602000000390,0x602000000398)
    allocated by thread T0 here:
        #0 0x7f0356072faa in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:439
        #1 0x563b7cf7317c in xstrdup wrapper.c:39
        #2 0x563b7cd9a06a in save_user_format pretty.c:40
        #3 0x563b7cd9b3e5 in get_commit_format pretty.c:173
        #4 0x563b7ce54ea0 in handle_revision_opt revision.c:2456
        #5 0x563b7ce597c9 in setup_revisions revision.c:2850
        #6 0x563b7c9269e0 in cmd_log_init_finish builtin/log.c:269
        #7 0x563b7c927362 in cmd_log_init builtin/log.c:348
        #8 0x563b7c92b193 in cmd_log builtin/log.c:882
        #9 0x563b7c802993 in run_builtin git.c:466
        #10 0x563b7c803397 in handle_builtin git.c:721
        #11 0x563b7c803b07 in run_argv git.c:788
        #12 0x563b7c8048a7 in cmd_main git.c:923
        #13 0x563b7ca99682 in main common-main.c:57
        #14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725 in __interceptor_strchrnul
    Shadow bytes around the buggy address:
      0x0c047fff8020: fa fa fd fd fa fa 00 06 fa fa 05 fa fa fa fd fd
      0x0c047fff8030: fa fa 00 02 fa fa 06 fa fa fa 05 fa fa fa fd fd
      0x0c047fff8040: fa fa 00 07 fa fa 03 fa fa fa fd fd fa fa 00 00
      0x0c047fff8050: fa fa 00 01 fa fa fd fd fa fa 00 00 fa fa 00 01
      0x0c047fff8060: fa fa 00 06 fa fa 00 06 fa fa 05 fa fa fa 05 fa
    =>0x0c047fff8070: fa fa 00[fa]fa fa fd fa fa fa fd fd fa fa fd fd
      0x0c047fff8080: fa fa fd fd fa fa 00 00 fa fa 00 fa fa fa fd fa
      0x0c047fff8090: fa fa fd fd fa fa 00 00 fa fa fa fa fa fa fa fa
      0x0c047fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==10888==ABORTING

Fix this bug by checking whether `end` points at the trailing NUL byte.
Add a test which catches this out-of-bounds read and which demonstrates
that we used to write out-of-bounds data into the formatted message.

Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Original-patch-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
b49f309aa1 pretty: fix out-of-bounds read when left-flushing with stealing
With the `%>>(<N>)` pretty formatter, you can ask git-log(1) et al to
steal spaces. To do so we need to look ahead of the next token to see
whether there are spaces there. This loop takes into account ANSI
sequences that end with an `m`, and if it finds any it will skip them
until it finds the first space. While doing so it does not take into
account the buffer's limits though and easily does an out-of-bounds
read.

Add a test that hits this behaviour. While we don't have an easy way to
verify this, the test causes the following failure when run with
`SANITIZE=address`:

    ==37941==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000000baf at pc 0x55ba6f88e0d0 bp 0x7ffc84c50d20 sp 0x7ffc84c50d10
    READ of size 1 at 0x603000000baf thread T0
        #0 0x55ba6f88e0cf in format_and_pad_commit pretty.c:1712
        #1 0x55ba6f88e7b4 in format_commit_item pretty.c:1801
        #2 0x55ba6f9b1ae4 in strbuf_expand strbuf.c:429
        #3 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        #4 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #5 0x55ba6f7884c8 in show_log log-tree.c:781
        #6 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        #7 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        #8 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        #9 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        #10 0x55ba6f2ea993 in run_builtin git.c:466
        #11 0x55ba6f2eb397 in handle_builtin git.c:721
        #12 0x55ba6f2ebb07 in run_argv git.c:788
        #13 0x55ba6f2ec8a7 in cmd_main git.c:923
        #14 0x55ba6f581682 in main common-main.c:57
        #15 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #16 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #17 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000000baf is located 1 bytes to the left of 24-byte region [0x603000000bb0,0x603000000bc8)
    allocated by thread T0 here:
        #0 0x7f2d08ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x55ba6fa5b494 in xrealloc wrapper.c:136
        #2 0x55ba6f9aefdc in strbuf_grow strbuf.c:99
        #3 0x55ba6f9b0a06 in strbuf_add strbuf.c:298
        #4 0x55ba6f9b1a25 in strbuf_expand strbuf.c:418
        #5 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        #6 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #7 0x55ba6f7884c8 in show_log log-tree.c:781
        #8 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        #9 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        #10 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        #11 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        #12 0x55ba6f2ea993 in run_builtin git.c:466
        #13 0x55ba6f2eb397 in handle_builtin git.c:721
        #14 0x55ba6f2ebb07 in run_argv git.c:788
        #15 0x55ba6f2ec8a7 in cmd_main git.c:923
        #16 0x55ba6f581682 in main common-main.c:57
        #17 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #18 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #19 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow pretty.c:1712 in format_and_pad_commit
    Shadow bytes around the buggy address:
      0x0c067fff8120: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
      0x0c067fff8130: fd fd fa fa fd fd fd fd fa fa fd fd fd fa fa fa
      0x0c067fff8140: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff8150: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd
      0x0c067fff8160: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
    =>0x0c067fff8170: fd fd fd fa fa[fa]00 00 00 fa fa fa 00 00 00 fa
      0x0c067fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb

Luckily enough, this would only cause us to copy the out-of-bounds data
into the formatted commit in case we really had an ANSI sequence
preceding our buffer. So this bug likely has no security consequences.

Fix it regardless by not traversing past the buffer's start.

Reported-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
81dc898df9 pretty: fix out-of-bounds write caused by integer overflow
When using a padding specifier in the pretty format passed to git-log(1)
we need to calculate the string length in several places. These string
lengths are stored in `int`s though, which means that these can easily
overflow when the input lengths exceeds 2GB. This can ultimately lead to
an out-of-bounds write when these are used in a call to memcpy(3P):

        ==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588
    WRITE of size 1 at 0x7f1ec62f97fe thread T0
        #0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762
        #2 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        #3 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        #5 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        #6 0x5628e95a44c8 in show_log log-tree.c:781
        #7 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        #8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        #9 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        #10 0x5628e922f1a2 in cmd_log builtin/log.c:883
        #11 0x5628e9106993 in run_builtin git.c:466
        #12 0x5628e9107397 in handle_builtin git.c:721
        #13 0x5628e9107b07 in run_argv git.c:788
        #14 0x5628e91088a7 in cmd_main git.c:923
        #15 0x5628e939d682 in main common-main.c:57
        #16 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839)
    allocated by thread T0 here:
        #0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5628e98774d4 in xrealloc wrapper.c:136
        #2 0x5628e97cb01c in strbuf_grow strbuf.c:99
        #3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327
        #4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761
        #5 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        #6 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        #8 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        #9 0x5628e95a44c8 in show_log log-tree.c:781
        #10 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        #11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        #12 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        #13 0x5628e922f1a2 in cmd_log builtin/log.c:883
        #14 0x5628e9106993 in run_builtin git.c:466
        #15 0x5628e9107397 in handle_builtin git.c:721
        #16 0x5628e9107b07 in run_argv git.c:788
        #17 0x5628e91088a7 in cmd_main git.c:923
        #18 0x5628e939d682 in main common-main.c:57
        #19 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    =>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]
      0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==8340==ABORTING

The pretty format can also be used in `git archive` operations via the
`export-subst` attribute. So this is what in our opinion makes this a
critical issue in the context of Git forges which allow to download an
archive of user supplied Git repositories.

Fix this vulnerability by using `size_t` instead of `int` to track the
string lengths. Add tests which detect this vulnerability when Git is
compiled with the address sanitizer.

Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Original-patch-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Modified-by: Taylor  Blau <me@ttalorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
a244dc5b0a test-lib: add prerequisite for 64-bit platforms
Allow tests that assume a 64-bit `size_t` to be skipped in 32-bit
platforms and regardless of the size of `long`.

This imitates the `LONG_IS_64BIT` prerequisite.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:04 +09:00
3c50032ff5 attr: ignore overly large gitattributes files
Similar as with the preceding commit, start ignoring gitattributes files
that are overly large to protect us against out-of-bounds reads and
writes caused by integer overflows. Unfortunately, we cannot just define
"overly large" in terms of any preexisting limits in the codebase.

Instead, we choose a very conservative limit of 100MB. This is plenty of
room for specifying gitattributes, and incidentally it is also the limit
for blob sizes for GitHub. While we don't want GitHub to dictate limits
here, it is still sensible to use this fact for an informed decision
given that it is hosting a huge set of repositories. Furthermore, over
at GitLab we scanned a subset of repositories for their root-level
attribute files. We found that 80% of them have a gitattributes file
smaller than 100kB, 99.99% have one smaller than 1MB, and only a single
repository had one that was almost 3MB in size. So enforcing a limit of
100MB seems to give us ample of headroom.

With this limit in place we can be reasonably sure that there is no easy
way to exploit the gitattributes file via integer overflows anymore.
Furthermore, it protects us against resource exhaustion caused by
allocating the in-memory data structures required to represent the
parsed attributes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:50:03 +09:00
dfa6b32b5e attr: ignore attribute lines exceeding 2048 bytes
There are two different code paths to read gitattributes: once via a
file, and once via the index. These two paths used to behave differently
because when reading attributes from a file, we used fgets(3P) with a
buffer size of 2kB. Consequentially, we silently truncate line lengths
when lines are longer than that and will then parse the remainder of the
line as a new pattern. It goes without saying that this is entirely
unexpected, but it's even worse that the behaviour depends on how the
gitattributes are parsed.

While this is simply wrong, the silent truncation saves us with the
recently discovered vulnerabilities that can cause out-of-bound writes
or reads with unreasonably long lines due to integer overflows. As the
common path is to read gitattributes via the worktree file instead of
via the index, we can assume that any gitattributes file that had lines
longer than that is already broken anyway. So instead of lifting the
limit here, we can double down on it to fix the vulnerabilities.

Introduce an explicit line length limit of 2kB that is shared across all
paths that read attributes and ignore any line that hits this limit
while printing a warning.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:33:07 +09:00
d74b1fd54f attr: fix silently splitting up lines longer than 2048 bytes
When reading attributes from a file we use fgets(3P) with a buffer size
of 2048 bytes. This means that as soon as a line exceeds the buffer size
we split it up into multiple parts and parse each of them as a separate
pattern line. This is of course not what the user intended, and even
worse the behaviour is inconsistent with how we read attributes from the
index.

Fix this bug by converting the code to use `strbuf_getline()` instead.
This will indeed read in the whole line, which may theoretically lead to
an out-of-memory situation when the gitattributes file is huge. We're
about to reject any gitattributes files larger than 100MB in the next
commit though, which makes this less of a concern.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:29:30 +09:00
a60a66e409 attr: harden allocation against integer overflows
When parsing an attributes line, we need to allocate an array that holds
all attributes specified for the given file pattern. The calculation to
determine the number of bytes that need to be allocated was prone to an
overflow though when there was an unreasonable amount of attributes.

Harden the allocation by instead using the `st_` helper functions that
cause us to die when we hit an integer overflow.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
e1e12e97ac attr: fix integer overflow with more than INT_MAX macros
Attributes have a field that tracks the position in the `all_attrs`
array they're stored inside. This field gets set via `hashmap_get_size`
when adding the attribute to the global map of attributes. But while the
field is of type `int`, the value returned by `hashmap_get_size` is an
`unsigned int`. It can thus happen that the value overflows, where we
would now dereference teh `all_attrs` array at an out-of-bounds value.

We do have a sanity check for this overflow via an assert that verifies
the index matches the new hashmap's size. But asserts are not a proper
mechanism to detect against any such overflows as they may not in fact
be compiled into production code.

Fix this by using an `unsigned int` to track the index and convert the
assert to a call `die()`.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
447ac906e1 attr: fix out-of-bounds read with unreasonable amount of patterns
The `struct attr_stack` tracks the stack of all patterns together with
their attributes. When parsing a gitattributes file that has more than
2^31 such patterns though we may trigger multiple out-of-bounds reads on
64 bit platforms. This is because while the `num_matches` variable is an
unsigned integer, we always use a signed integer to iterate over them.

I have not been able to reproduce this issue due to memory constraints
on my systems. But despite the out-of-bounds reads, the worst thing that
can seemingly happen is to call free(3P) with a garbage pointer when
calling `attr_stack_free()`.

Fix this bug by using unsigned integers to iterate over the array. While
this makes the iteration somewhat awkward when iterating in reverse, it
is at least better than knowingly running into an out-of-bounds read.
While at it, convert the call to `ALLOC_GROW` to use `ALLOC_GROW_BY`
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
34ace8bad0 attr: fix out-of-bounds write when parsing huge number of attributes
It is possible to trigger an integer overflow when parsing attribute
names when there are more than 2^31 of them for a single pattern. This
can either lead to us dying due to trying to request too many bytes:

     blob=$(perl -e 'print "f" . " a=" x 2147483649' | git hash-object -w --stdin)
     git update-index --add --cacheinfo 100644,$blob,.gitattributes
     git attr-check --all file

    =================================================================
    ==1022==ERROR: AddressSanitizer: requested allocation size 0xfffffff800000032 (0xfffffff800001038 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)
        #0 0x7fd3efabf411 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
        #1 0x5563a0a1e3d3 in xcalloc wrapper.c:150
        #2 0x5563a058d005 in parse_attr_line attr.c:384
        #3 0x5563a058e661 in handle_attr_line attr.c:660
        #4 0x5563a058eddb in read_attr_from_index attr.c:769
        #5 0x5563a058ef12 in read_attr attr.c:797
        #6 0x5563a058f24c in bootstrap_attr_stack attr.c:867
        #7 0x5563a058f4a3 in prepare_attr_stack attr.c:902
        #8 0x5563a05905da in collect_some_attrs attr.c:1097
        #9 0x5563a059093d in git_all_attrs attr.c:1128
        #10 0x5563a02f636e in check_attr builtin/check-attr.c:67
        #11 0x5563a02f6c12 in cmd_check_attr builtin/check-attr.c:183
        #12 0x5563a02aa993 in run_builtin git.c:466
        #13 0x5563a02ab397 in handle_builtin git.c:721
        #14 0x5563a02abb2b in run_argv git.c:788
        #15 0x5563a02ac991 in cmd_main git.c:926
        #16 0x5563a05432bd in main common-main.c:57
        #17 0x7fd3ef82228f  (/usr/lib/libc.so.6+0x2328f)

    ==1022==HINT: if you don't care about these errors you may set allocator_may_return_null=1
    SUMMARY: AddressSanitizer: allocation-size-too-big /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 in __interceptor_calloc
    ==1022==ABORTING

Or, much worse, it can lead to an out-of-bounds write because we
underallocate and then memcpy(3P) into an array:

    perl -e '
        print "A " . "\rh="x2000000000;
        print "\rh="x2000000000;
        print "\rh="x294967294 . "\n"
    ' >.gitattributes
    git add .gitattributes
    git commit -am "evil attributes"

    $ git clone --quiet /path/to/repo
    =================================================================
    ==15062==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000002550 at pc 0x5555559884d5 bp 0x7fffffffbc60 sp 0x7fffffffbc58
    WRITE of size 8 at 0x602000002550 thread T0
        #0 0x5555559884d4 in parse_attr_line attr.c:393
        #1 0x5555559884d4 in handle_attr_line attr.c:660
        #2 0x555555988902 in read_attr_from_index attr.c:784
        #3 0x555555988902 in read_attr_from_index attr.c:747
        #4 0x555555988a1d in read_attr attr.c:800
        #5 0x555555989b0c in bootstrap_attr_stack attr.c:882
        #6 0x555555989b0c in prepare_attr_stack attr.c:917
        #7 0x555555989b0c in collect_some_attrs attr.c:1112
        #8 0x55555598b141 in git_check_attr attr.c:1126
        #9 0x555555a13004 in convert_attrs convert.c:1311
        #10 0x555555a95e04 in checkout_entry_ca entry.c:553
        #11 0x555555d58bf6 in checkout_entry entry.h:42
        #12 0x555555d58bf6 in check_updates unpack-trees.c:480
        #13 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
        #14 0x555555785ab7 in checkout builtin/clone.c:724
        #15 0x555555785ab7 in cmd_clone builtin/clone.c:1384
        #16 0x55555572443c in run_builtin git.c:466
        #17 0x55555572443c in handle_builtin git.c:721
        #18 0x555555727872 in run_argv git.c:788
        #19 0x555555727872 in cmd_main git.c:926
        #20 0x555555721fa0 in main common-main.c:57
        #21 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308
        #22 0x555555723f39 in _start (git+0x1cff39)

    0x602000002552 is located 0 bytes to the right of 2-byte region [0x602000002550,0x602000002552) allocated by thread T0 here:
        #0 0x7ffff768c037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
        #1 0x555555d7fff7 in xcalloc wrapper.c:150
        #2 0x55555598815f in parse_attr_line attr.c:384
        #3 0x55555598815f in handle_attr_line attr.c:660
        #4 0x555555988902 in read_attr_from_index attr.c:784
        #5 0x555555988902 in read_attr_from_index attr.c:747
        #6 0x555555988a1d in read_attr attr.c:800
        #7 0x555555989b0c in bootstrap_attr_stack attr.c:882
        #8 0x555555989b0c in prepare_attr_stack attr.c:917
        #9 0x555555989b0c in collect_some_attrs attr.c:1112
        #10 0x55555598b141 in git_check_attr attr.c:1126
        #11 0x555555a13004 in convert_attrs convert.c:1311
        #12 0x555555a95e04 in checkout_entry_ca entry.c:553
        #13 0x555555d58bf6 in checkout_entry entry.h:42
        #14 0x555555d58bf6 in check_updates unpack-trees.c:480
        #15 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
        #16 0x555555785ab7 in checkout builtin/clone.c:724
        #17 0x555555785ab7 in cmd_clone builtin/clone.c:1384
        #18 0x55555572443c in run_builtin git.c:466
        #19 0x55555572443c in handle_builtin git.c:721
        #20 0x555555727872 in run_argv git.c:788
        #21 0x555555727872 in cmd_main git.c:926
        #22 0x555555721fa0 in main common-main.c:57
        #23 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308

    SUMMARY: AddressSanitizer: heap-buffer-overflow attr.c:393 in parse_attr_line
    Shadow bytes around the buggy address:
      0x0c047fff8450: fa fa 00 02 fa fa 00 07 fa fa fd fd fa fa 00 00
      0x0c047fff8460: fa fa 02 fa fa fa fd fd fa fa 00 06 fa fa 05 fa
      0x0c047fff8470: fa fa fd fd fa fa 00 02 fa fa 06 fa fa fa 05 fa
      0x0c047fff8480: fa fa 07 fa fa fa fd fd fa fa 00 01 fa fa 00 02
      0x0c047fff8490: fa fa 00 03 fa fa 00 fa fa fa 00 01 fa fa 00 03
    =>0x0c047fff84a0: fa fa 00 01 fa fa 00 02 fa fa[02]fa fa fa fa fa
      0x0c047fff84b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
      Shadow gap:              cc
    ==15062==ABORTING

Fix this bug by using `size_t` instead to count the number of attributes
so that this value cannot reasonably overflow without running out of
memory before already.

Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
2455720950 attr: fix integer overflow when parsing huge attribute names
It is possible to trigger an integer overflow when parsing attribute
names that are longer than 2^31 bytes because we assign the result of
strlen(3P) to an `int` instead of to a `size_t`. This can lead to an
abort in vsnprintf(3P) with the following reproducer:

    blob=$(perl -e 'print "A " . "B"x2147483648 . "\n"' | git hash-object -w --stdin)
    git update-index --add --cacheinfo 100644,$blob,.gitattributes
    git check-attr --all path

    BUG: strbuf.c:400: your vsnprintf is broken (returned -1)

But furthermore, assuming that the attribute name is even longer than
that, it can cause us to silently truncate the attribute and thus lead
to wrong results.

Fix this integer overflow by using a `size_t` instead. This fixes the
silent truncation of attribute names, but it only partially fixes the
BUG we hit: even though the initial BUG is fixed, we can still hit a BUG
when parsing invalid attribute lines via `report_invalid_attr()`.

This is due to an underlying design issue in vsnprintf(3P) which only
knows to return an `int`, and thus it may always overflow with large
inputs. This issue is benign though: the worst that can happen is that
the error message is misreported to be either truncated or too long, but
due to the buffer being NUL terminated we wouldn't ever do an
out-of-bounds read here.

Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
8d0d48cf21 attr: fix out-of-bounds read with huge attribute names
There is an out-of-bounds read possible when parsing gitattributes that
have an attribute that is 2^31+1 bytes long. This is caused due to an
integer overflow when we assign the result of strlen(3P) to an `int`,
where we use the wrapped-around value in a subsequent call to
memcpy(3P). The following code reproduces the issue:

    blob=$(perl -e 'print "a" x 2147483649 . " attr"' | git hash-object -w --stdin)
    git update-index --add --cacheinfo 100644,$blob,.gitattributes
    git check-attr --all file

    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==8451==ERROR: AddressSanitizer: SEGV on unknown address 0x7f93efa00800 (pc 0x7f94f1f8f082 bp 0x7ffddb59b3a0 sp 0x7ffddb59ab28 T0)
    ==8451==The signal is caused by a READ memory access.
        #0 0x7f94f1f8f082  (/usr/lib/libc.so.6+0x176082)
        #1 0x7f94f2047d9c in __interceptor_strspn /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:752
        #2 0x560e190f7f26 in parse_attr_line attr.c:375
        #3 0x560e190f9663 in handle_attr_line attr.c:660
        #4 0x560e190f9ddd in read_attr_from_index attr.c:769
        #5 0x560e190f9f14 in read_attr attr.c:797
        #6 0x560e190fa24e in bootstrap_attr_stack attr.c:867
        #7 0x560e190fa4a5 in prepare_attr_stack attr.c:902
        #8 0x560e190fb5dc in collect_some_attrs attr.c:1097
        #9 0x560e190fb93f in git_all_attrs attr.c:1128
        #10 0x560e18e6136e in check_attr builtin/check-attr.c:67
        #11 0x560e18e61c12 in cmd_check_attr builtin/check-attr.c:183
        #12 0x560e18e15993 in run_builtin git.c:466
        #13 0x560e18e16397 in handle_builtin git.c:721
        #14 0x560e18e16b2b in run_argv git.c:788
        #15 0x560e18e17991 in cmd_main git.c:926
        #16 0x560e190ae2bd in main common-main.c:57
        #17 0x7f94f1e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #18 0x7f94f1e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #19 0x560e18e110e4 in _start ../sysdeps/x86_64/start.S:115

    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/usr/lib/libc.so.6+0x176082)
    ==8451==ABORTING

Fix this bug by converting the variable to a `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
eb22e7dfa2 attr: fix overflow when upserting attribute with overly long name
The function `git_attr_internal()` is called to upsert attributes into
the global map. And while all callers pass a `size_t`, the function
itself accepts an `int` as the attribute name's length. This can lead to
an integer overflow in case the attribute name is longer than `INT_MAX`.

Now this overflow seems harmless as the first thing we do is to call
`attr_name_valid()`, and that function only succeeds in case all chars
in the range of `namelen` match a certain small set of chars. We thus
can't do an out-of-bounds read as NUL is not part of that set and all
strings passed to this function are NUL-terminated. And furthermore, we
wouldn't ever read past the current attribute name anyway due to the
same reason. And if validation fails we will return early.

On the other hand it feels fragile to rely on this behaviour, even more
so given that we pass `namelen` to `FLEX_ALLOC_MEM()`. So let's instead
just do the correct thing here and accept a `size_t` as line length.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:14:16 +09:00
abd4d67ab0 Git 2.30.6
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06 17:38:16 -04:00
0ca6ead81e alias.c: reject too-long cmdline strings in split_cmdline()
This function improperly uses an int to represent the number of entries
in the resulting argument array. This allows a malicious actor to
intentionally overflow the return value, leading to arbitrary heap
writes.

Because the resulting argv array is typically passed to execv(), it may
be possible to leverage this attack to gain remote code execution on a
victim machine. This was almost certainly the case for certain
configurations of git-shell until the previous commit limited the size
of input it would accept. Other calls to split_cmdline() are typically
limited by the size of argv the OS is willing to hand us, so are
similarly protected.

So this is not strictly fixing a known vulnerability, but is a hardening
of the function that is worth doing to protect against possible unknown
vulnerabilities.

One approach to fixing this would be modifying the signature of
`split_cmdline()` to look something like:

    int split_cmdline(char *cmdline, const char ***argv, size_t *argc);

Where the return value of `split_cmdline()` is negative for errors, and
zero otherwise. If non-NULL, the `*argc` pointer is modified to contain
the size of the `**argv` array.

But this implies an absurdly large `argv` array, which more than likely
larger than the system's argument limit. So even if split_cmdline()
allowed this, it would fail immediately afterwards when we called
execv(). So instead of converting all of `split_cmdline()`'s callers to
work with `size_t` types in this patch, instead pursue the minimal fix
here to prevent ever returning an array with more than INT_MAX entries
in it.

Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
71ad7fe1bc shell: limit size of interactive commands
When git-shell is run in interactive mode (which must be enabled by
creating $HOME/git-shell-commands), it reads commands from stdin, one
per line, and executes them.

We read the commands with git_read_line_interactively(), which uses a
strbuf under the hood. That means we'll accept an input of arbitrary
size (limited only by how much heap we can allocate). That creates two
problems:

  - the rest of the code is not prepared to handle large inputs. The
    most serious issue here is that split_cmdline() uses "int" for most
    of its types, which can lead to integer overflow and out-of-bounds
    array reads and writes. But even with that fixed, we assume that we
    can feed the command name to snprintf() (via xstrfmt()), which is
    stuck for historical reasons using "int", and causes it to fail (and
    even trigger a BUG() call).

  - since the point of git-shell is to take input from untrusted or
    semi-trusted clients, it's a mild denial-of-service. We'll allocate
    as many bytes as the client sends us (actually twice as many, since
    we immediately duplicate the buffer).

We can fix both by just limiting the amount of per-command input we're
willing to receive.

We should also fix split_cmdline(), of course, which is an accident
waiting to happen, but that can come on top. Most calls to
split_cmdline(), including the other one in git-shell, are OK because
they are reading from an OS-provided argv, which is limited in practice.
This patch should eliminate the immediate vulnerabilities.

I picked 4MB as an arbitrary limit. It's big enough that nobody should
ever run into it in practice (since the point is to run the commands via
exec, we're subject to OS limits which are typically much lower). But
it's small enough that allocating it isn't that big a deal.

The code is mostly just swapping out fgets() for the strbuf call, but we
have to add a few niceties like flushing and trimming line endings. We
could simplify things further by putting the buffer on the stack, but
4MB is probably a bit much there. Note that we'll _always_ allocate 4MB,
which for normal, non-malicious requests is more than we would before
this patch. But on the other hand, other git programs are happy to use
96MB for a delta cache. And since we'd never touch most of those pages,
on a lazy-allocating OS like Linux they won't even get allocated to
actual RAM.

The ideal would be a version of strbuf_getline() that accepted a maximum
value. But for a minimal vulnerability fix, let's keep things localized
and simple. We can always refactor further on top.

The included test fails in an obvious way with ASan or UBSan (which
notice the integer overflow and out-of-bounds reads). Without them, it
fails in a less obvious way: we may segfault, or we may try to xstrfmt()
a long string, leading to a BUG(). Either way, it fails reliably before
this patch, and passes with it. Note that we don't need an EXPENSIVE
prereq on it. It does take 10-15s to fail before this patch, but with
the new limit, we fail almost immediately (and the perl process
generating 2GB of data exits via SIGPIPE).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
32696a4cbe shell: add basic tests
We have no tests of even basic functionality of git-shell. Let's add a
couple of obvious ones. This will serve as a framework for adding tests
for new things we fix, as well as making sure we don't screw anything up
too badly while doing so.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
a1d4f67c12 transport: make protocol.file.allow be "user" by default
An earlier patch discussed and fixed a scenario where Git could be used
as a vector to exfiltrate sensitive data through a Docker container when
a potential victim clones a suspicious repository with local submodules
that contain symlinks.

That security hole has since been plugged, but a similar one still
exists.  Instead of convincing a would-be victim to clone an embedded
submodule via the "file" protocol, an attacker could convince an
individual to clone a repository that has a submodule pointing to a
valid path on the victim's filesystem.

For example, if an individual (with username "foo") has their home
directory ("/home/foo") stored as a Git repository, then an attacker
could exfiltrate data by convincing a victim to clone a malicious
repository containing a submodule pointing at "/home/foo/.git" with
`--recurse-submodules`. Doing so would expose any sensitive contents in
stored in "/home/foo" tracked in Git.

For systems (such as Docker) that consider everything outside of the
immediate top-level working directory containing a Dockerfile as
inaccessible to the container (with the exception of volume mounts, and
so on), this is a violation of trust by exposing unexpected contents in
the working copy.

To mitigate the likelihood of this kind of attack, adjust the "file://"
protocol's default policy to be "user" to prevent commands that execute
without user input (including recursive submodule initialization) from
taking place by default.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
f4a32a550f t/t9NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that interact with submodules a handful of times use
`test_config_global`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
0d3beb71da t/t7NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
0f21b8f468 t/t6NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
225d2d50cc t/t5NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
ac7e57fa28 t/t4NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
f8d510ed0b t/t3NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
99f4abb8da t/2NNNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead. Test scripts that rely on
submodules throughout use a `git config --global` during a setup test
towards the beginning of the script.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
8a96dbcb33 t/t1NNN: allow local submodules
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.

Tests that only need to interact with submodules in a limited capacity
have individual Git commands annotated with the appropriate
configuration via `-c`. Tests that interact with submodules a handful of
times use `test_config_global` instead.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
7de0c306f7 t/lib-submodule-update.sh: allow local submodules
To prepare for changing the default value of `protocol.file.allow` to
"user", update the `prolog()` function in lib-submodule-update to allow
submodules to be cloned over the file protocol.

This is used by a handful of submodule-related test scripts, which
themselves will have to tweak the value of `protocol.file.allow` in
certain locations. Those will be done in subsequent commits.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
6f054f9fb3 builtin/clone.c: disallow --local clones with symlinks
When cloning a repository with `--local`, Git relies on either making a
hardlink or copy to every file in the "objects" directory of the source
repository. This is done through the callpath `cmd_clone()` ->
`clone_local()` -> `copy_or_link_directory()`.

The way this optimization works is by enumerating every file and
directory recursively in the source repository's `$GIT_DIR/objects`
directory, and then either making a copy or hardlink of each file. The
only exception to this rule is when copying the "alternates" file, in
which case paths are rewritten to be absolute before writing a new
"alternates" file in the destination repo.

One quirk of this implementation is that it dereferences symlinks when
cloning. This behavior was most recently modified in 36596fd2df (clone:
better handle symlinked files at .git/objects/, 2019-07-10), which
attempted to support `--local` clones of repositories with symlinks in
their objects directory in a platform-independent way.

Unfortunately, this behavior of dereferencing symlinks (that is,
creating a hardlink or copy of the source's link target in the
destination repository) can be used as a component in attacking a
victim by inadvertently exposing the contents of file stored outside of
the repository.

Take, for example, a repository that stores a Dockerfile and is used to
build Docker images. When building an image, Docker copies the directory
contents into the VM, and then instructs the VM to execute the
Dockerfile at the root of the copied directory. This protects against
directory traversal attacks by copying symbolic links as-is without
dereferencing them.

That is, if a user has a symlink pointing at their private key material
(where the symlink is present in the same directory as the Dockerfile,
but the key itself is present outside of that directory), the key is
unreadable to a Docker image, since the link will appear broken from the
container's point of view.

This behavior enables an attack whereby a victim is convinced to clone a
repository containing an embedded submodule (with a URL like
"file:///proc/self/cwd/path/to/submodule") which has a symlink pointing
at a path containing sensitive information on the victim's machine. If a
user is tricked into doing this, the contents at the destination of
those symbolic links are exposed to the Docker image at runtime.

One approach to preventing this behavior is to recreate symlinks in the
destination repository. But this is problematic, since symlinking the
objects directory are not well-supported. (One potential problem is that
when sharing, e.g. a "pack" directory via symlinks, different writers
performing garbage collection may consider different sets of objects to
be reachable, enabling a situation whereby garbage collecting one
repository may remove reachable objects in another repository).

Instead, prohibit the local clone optimization when any symlinks are
present in the `$GIT_DIR/objects` directory of the source repository.
Users may clone the repository again by prepending the "file://" scheme
to their clone URL, or by adding the `--no-local` option to their `git
clone` invocation.

The directory iterator used by `copy_or_link_directory()` must no longer
dereference symlinks (i.e., it *must* call `lstat()` instead of `stat()`
in order to discover whether or not there are symlinks present). This has
no bearing on the overall behavior, since we will immediately `die()` on
encounter a symlink.

Note that t5604.33 suggests that we do support local clones with
symbolic links in the source repository's objects directory, but this
was likely unintentional, or at least did not take into consideration
the problem with sharing parts of the objects directory with symbolic
links at the time. Update this test to reflect which options are and
aren't supported.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01 00:23:38 -04:00
88b7be68a4 Git 2.30.5
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-06-23 12:31:05 +02:00
3b0bf27049 setup: tighten ownership checks post CVE-2022-24765
8959555cee (setup_git_directory(): add an owner check for the top-level
directory, 2022-03-02), adds a function to check for ownership of
repositories using a directory that is representative of it, and ways to
add exempt a specific repository from said check if needed, but that
check didn't account for owership of the gitdir, or (when used) the
gitfile that points to that gitdir.

An attacker could create a git repository in a directory that they can
write into but that is owned by the victim to work around the fix that
was introduced with CVE-2022-24765 to potentially run code as the
victim.

An example that could result in privilege escalation to root in *NIX would
be to set a repository in a shared tmp directory by doing (for example):

  $ git -C /tmp init

To avoid that, extend the ensure_valid_ownership function to be able to
check for all three paths.

This will have the side effect of tripling the number of stat() calls
when a repository is detected, but the effect is expected to be likely
minimal, as it is done only once during the directory walk in which Git
looks for a repository.

Additionally make sure to resolve the gitfile (if one was used) to find
the relevant gitdir for checking.

While at it change the message printed on failure so it is clear we are
referring to the repository by its worktree (or gitdir if it is bare) and
not to a specific directory.

Helped-by: Junio C Hamano <junio@pobox.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-06-23 12:31:05 +02:00
b779214eaf Merge branch 'cb/path-owner-check-with-sudo'
With a recent update to refuse access to repositories of other
people by default, "sudo make install" and "sudo git describe"
stopped working.  This series intends to loosen it while keeping
the safety.

* cb/path-owner-check-with-sudo:
  t0034: add negative tests and allow git init to mostly work under sudo
  git-compat-util: avoid failing dir ownership checks if running privileged
  t: regression git needs safe.directory when using sudo

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-06-23 12:31:04 +02:00
6b11e3d52e git-compat-util: allow root to access both SUDO_UID and root owned
Previous changes introduced a regression which will prevent root for
accessing repositories owned by thyself if using sudo because SUDO_UID
takes precedence.

Loosen that restriction by allowing root to access repositories owned
by both uid by default and without having to add a safe.directory
exception.

A previous workaround that was documented in the tests is no longer
needed so it has been removed together with its specially crafted
prerequisite.

Helped-by: Johanness Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-17 14:03:08 -07:00
b9063afda1 t0034: add negative tests and allow git init to mostly work under sudo
Add a support library that provides one function that can be used
to run a "scriplet" of commands through sudo and that helps invoking
sudo in the slightly awkward way that is required to ensure it doesn't
block the call (if shell was allowed as tested in the prerequisite)
and it doesn't run the command through a different shell than the one
we intended.

Add additional negative tests as suggested by Junio and that use a
new workspace that is owned by root.

Document a regression that was introduced by previous commits where
root won't be able anymore to access directories they own unless
SUDO_UID is removed from their environment.

The tests document additional ways that this new restriction could
be worked around and the documentation explains why it might be instead
considered a feature, but a "fix" is planned for a future change.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-12 18:12:23 -07:00
ae9abbb63e git-compat-util: avoid failing dir ownership checks if running privileged
bdc77d1d68 (Add a function to determine whether a path is owned by the
current user, 2022-03-02) checks for the effective uid of the running
process using geteuid() but didn't account for cases where that user was
root (because git was invoked through sudo or a compatible tool) and the
original uid that repository trusted for its config was no longer known,
therefore failing the following otherwise safe call:

  guy@renard ~/Software/uncrustify $ sudo git describe --always --dirty
  [sudo] password for guy:
  fatal: unsafe repository ('/home/guy/Software/uncrustify' is owned by someone else)

Attempt to detect those cases by using the environment variables that
those tools create to keep track of the original user id, and do the
ownership check using that instead.

This assumes the environment the user is running on after going
privileged can't be tampered with, and also adds code to restrict that
the new behavior only applies if running as root, therefore keeping the
most common case, which runs unprivileged, from changing, but because of
that, it will miss cases where sudo (or an equivalent) was used to change
to another unprivileged user or where the equivalent tool used to raise
privileges didn't track the original id in a sudo compatible way.

Because of compatibility with sudo, the code assumes that uid_t is an
unsigned integer type (which is not required by the standard) but is used
that way in their codebase to generate SUDO_UID.  In systems where uid_t
is signed, sudo might be also patched to NOT be unsigned and that might
be able to trigger an edge case and a bug (as described in the code), but
it is considered unlikely to happen and even if it does, the code would
just mostly fail safely, so there was no attempt either to detect it or
prevent it by the code, which is something that might change in the future,
based on expected user feedback.

Reported-by: Guy Maurel <guy.j@maurel.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Randall Becker <rsbecker@nexbridge.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-12 18:12:23 -07:00
5f1a3fec8c t: regression git needs safe.directory when using sudo
Originally reported after release of v2.35.2 (and other maint branches)
for CVE-2022-24765 and blocking otherwise harmless commands that were
done using sudo in a repository that was owned by the user.

Add a new test script with very basic support to allow running git
commands through sudo, so a reproduction could be implemented and that
uses only `git status` as a proxy of the issue reported.

Note that because of the way sudo interacts with the system, a much
more complete integration with the test framework will require a lot
more work and that was therefore intentionally punted for now.

The current implementation requires the execution of a special cleanup
function which should always be kept as the last "test" or otherwise
the standard cleanup functions will fail because they can't remove
the root owned directories that are used.  This also means that if
failures are found while running, the specifics of the failure might
not be kept for further debugging and if the test was interrupted, it
will be necessary to clean the working directory manually before
restarting by running:

  $ sudo rm -rf trash\ directory.t0034-root-safe-directory/

The test file also uses at least one initial "setup" test that creates
a parallel execution directory under the "root" sub directory, which
should be used as top level directory for all repositories that are
used in this test file.  Unlike all other tests the repository provided
by the test framework should go unused.

Special care should be taken when invoking commands through sudo, since
the environment is otherwise independent from what the test framework
setup and might have changed the values for HOME, SHELL and dropped
several relevant environment variables for your test.  Indeed `git status`
was used as a proxy because it doesn't even require commits in the
repository to work and usually doesn't require much from the environment
to run, but a future patch will add calls to `git init` and that will
fail to honor the default branch name, unless that setting is NOT
provided through an environment variable (which means even a CI run
could fail that test if enabled incorrectly).

A new SUDO prerequisite is provided that does some sanity checking
to make sure the sudo command that will be used allows for passwordless
execution as root without restrictions and doesn't mess with git's
execution path.  This matches what is provided by the macOS agents that
are used as part of GitHub actions and probably nowhere else.

Most of those characteristics make this test mostly only suitable for
CI, but it might be executed locally if special care is taken to provide
for all of them in the local configuration and maybe making use of the
sudo credential cache by first invoking sudo, entering your password if
needed, and then invoking the test with:

  $ GIT_TEST_ALLOW_SUDO=YES ./t0034-root-safe-directory.sh

If it fails to run, then it means your local setup wouldn't work for the
test because of the configuration sudo has or other system settings, and
things that might help are to comment out sudo's secure_path config, and
make sure that the account you are using has no restrictions on the
commands it can run through sudo, just like is provided for the user in
the CI.

For example (assuming a username of marta for you) something probably
similar to the following entry in your /etc/sudoers (or equivalent) file:

  marta	ALL=(ALL:ALL) NOPASSWD: ALL

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-12 18:12:23 -07:00
878 changed files with 55982 additions and 100211 deletions

1
.gitattributes vendored
View File

@ -6,7 +6,6 @@
*.pm eol=lf diff=perl
*.py eol=lf diff=python
*.bat eol=crlf
CODE_OF_CONDUCT.md -whitespace
/Documentation/**/*.txt eol=lf
/command-list.txt eol=lf
/GIT-VERSION-GEN eol=lf

View File

@ -282,14 +282,14 @@ jobs:
pool: ubuntu-latest
- jobname: linux-gcc
cc: gcc
pool: ubuntu-latest
pool: ubuntu-20.04
- jobname: osx-clang
cc: clang
pool: macos-latest
- jobname: osx-gcc
cc: gcc
pool: macos-latest
- jobname: linux-gcc-default
- jobname: GETTEXT_POISON
cc: gcc
pool: ubuntu-latest
env:
@ -340,7 +340,7 @@ jobs:
if: needs.ci-config.outputs.enabled == 'yes'
env:
jobname: StaticAnalysis
runs-on: ubuntu-18.04
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v1
- run: ci/install-dependencies.sh

View File

@ -16,7 +16,7 @@ compiler:
matrix:
include:
- env: jobname=linux-gcc-default
- env: jobname=GETTEXT_POISON
os: linux
compiler:
addons:

View File

@ -8,64 +8,73 @@ this code of conduct may be banned from the community.
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to make participation in our project and
our community a harassment-free experience for everyone, regardless of age,
body size, disability, ethnicity, sex characteristics, gender identity and
expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity and
orientation.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
Examples of behavior that contributes to creating a positive environment
include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior include:
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
## Our Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
This Code of Conduct applies within all project spaces, and it also applies
when an individual is representing the project or its community in public
spaces. Examples of representing a project or community include using an
official project e-mail address, posting via an official social media account,
or acting as an appointed representative at an online or offline event.
Representation of a project may be further defined and clarified by project
maintainers.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
reported by contacting the project team at git@sfconservancy.org. All
complaints will be reviewed and investigated and will result in a response
that is deemed necessary and appropriate to the circumstances. The project
team is obligated to maintain confidentiality with regard to the reporter of
an incident. Further details of specific enforcement policies may be posted
separately.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
The project leadership team can be contacted by email as a whole at
git@sfconservancy.org, or individually:
- Ævar Arnfjörð Bjarmason <avarab@gmail.com>
@ -73,73 +82,12 @@ git@sfconservancy.org, or individually:
- Jeff King <peff@peff.net>
- Junio C Hamano <gitster@pobox.com>
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within
the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
at [https://www.contributor-covenant.org/translations][translations].
version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
[homepage]: https://www.contributor-covenant.org
[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

View File

@ -21,7 +21,6 @@ MAN1_TXT += gitweb.txt
MAN5_TXT += gitattributes.txt
MAN5_TXT += githooks.txt
MAN5_TXT += gitignore.txt
MAN5_TXT += gitmailmap.txt
MAN5_TXT += gitmodules.txt
MAN5_TXT += gitrepository-layout.txt
MAN5_TXT += gitweb.conf.txt

View File

@ -664,7 +664,7 @@ mention the right animal somewhere:
----
test_expect_success 'runs correctly with no args and good output' '
git psuh >actual &&
grep Pony actual
test_i18ngrep Pony actual
'
----

View File

@ -0,0 +1,12 @@
Git v2.30.5 Release Notes
=========================
This release contains minor fix-ups for the changes that went into
Git 2.30.3 and 2.30.4, addressing CVE-2022-29187.
* The safety check that verifies a safe ownership of the Git
worktree is now extended to also cover the ownership of the Git
directory (and the `.git` file, if there is any).
Carlo Marcelo Arenas Belón (1):
setup: tighten ownership checks post CVE-2022-24765

View File

@ -0,0 +1,60 @@
Git v2.30.6 Release Notes
=========================
This release addresses the security issues CVE-2022-39253 and
CVE-2022-39260.
Fixes since v2.30.5
-------------------
* CVE-2022-39253:
When relying on the `--local` clone optimization, Git dereferences
symbolic links in the source repository before creating hardlinks
(or copies) of the dereferenced link in the destination repository.
This can lead to surprising behavior where arbitrary files are
present in a repository's `$GIT_DIR` when cloning from a malicious
repository.
Git will no longer dereference symbolic links via the `--local`
clone mechanism, and will instead refuse to clone repositories that
have symbolic links present in the `$GIT_DIR/objects` directory.
Additionally, the value of `protocol.file.allow` is changed to be
"user" by default.
* CVE-2022-39260:
An overly-long command string given to `git shell` can result in
overflow in `split_cmdline()`, leading to arbitrary heap writes and
remote code execution when `git shell` is exposed and the directory
`$HOME/git-shell-commands` exists.
`git shell` is taught to refuse interactive commands that are
longer than 4MiB in size. `split_cmdline()` is hardened to reject
inputs larger than 2GiB.
Credit for finding CVE-2022-39253 goes to Cory Snider of Mirantis. The
fix was authored by Taylor Blau, with help from Johannes Schindelin.
Credit for finding CVE-2022-39260 goes to Kevin Backhouse of GitHub.
The fix was authored by Kevin Backhouse, Jeff King, and Taylor Blau.
Jeff King (2):
shell: add basic tests
shell: limit size of interactive commands
Kevin Backhouse (1):
alias.c: reject too-long cmdline strings in split_cmdline()
Taylor Blau (11):
builtin/clone.c: disallow `--local` clones with symlinks
t/lib-submodule-update.sh: allow local submodules
t/t1NNN: allow local submodules
t/2NNNN: allow local submodules
t/t3NNN: allow local submodules
t/t4NNN: allow local submodules
t/t5NNN: allow local submodules
t/t6NNN: allow local submodules
t/t7NNN: allow local submodules
t/t9NNN: allow local submodules
transport: make `protocol.file.allow` be "user" by default

View File

@ -0,0 +1,86 @@
Git v2.30.7 Release Notes
=========================
This release addresses the security issues CVE-2022-41903 and
CVE-2022-23521.
Fixes since v2.30.6
-------------------
* CVE-2022-41903:
git log has the ability to display commits using an arbitrary
format with its --format specifiers. This functionality is also
exposed to git archive via the export-subst gitattribute.
When processing the padding operators (e.g., %<(, %<|(, %>(,
%>>(, or %><( ), an integer overflow can occur in
pretty.c::format_and_pad_commit() where a size_t is improperly
stored as an int, and then added as an offset to a subsequent
memcpy() call.
This overflow can be triggered directly by a user running a
command which invokes the commit formatting machinery (e.g., git
log --format=...). It may also be triggered indirectly through
git archive via the export-subst mechanism, which expands format
specifiers inside of files within the repository during a git
archive.
This integer overflow can result in arbitrary heap writes, which
may result in remote code execution.
* CVE-2022-23521:
gitattributes are a mechanism to allow defining attributes for
paths. These attributes can be defined by adding a `.gitattributes`
file to the repository, which contains a set of file patterns and
the attributes that should be set for paths matching this pattern.
When parsing gitattributes, multiple integer overflows can occur
when there is a huge number of path patterns, a huge number of
attributes for a single pattern, or when the declared attribute
names are huge.
These overflows can be triggered via a crafted `.gitattributes` file
that may be part of the commit history. Git silently splits lines
longer than 2KB when parsing gitattributes from a file, but not when
parsing them from the index. Consequentially, the failure mode
depends on whether the file exists in the working tree, the index or
both.
This integer overflow can result in arbitrary heap reads and writes,
which may result in remote code execution.
Credit for finding CVE-2022-41903 goes to Joern Schneeweisz of GitLab.
An initial fix was authored by Markus Vervier of X41 D-Sec. Credit for
finding CVE-2022-23521 goes to Markus Vervier and Eric Sesterhenn of X41
D-Sec. This work was sponsored by OSTIF.
The proposed fixes have been polished and extended to cover additional
findings by Patrick Steinhardt of GitLab, with help from others on the
Git security mailing list.
Patrick Steinhardt (21):
attr: fix overflow when upserting attribute with overly long name
attr: fix out-of-bounds read with huge attribute names
attr: fix integer overflow when parsing huge attribute names
attr: fix out-of-bounds write when parsing huge number of attributes
attr: fix out-of-bounds read with unreasonable amount of patterns
attr: fix integer overflow with more than INT_MAX macros
attr: harden allocation against integer overflows
attr: fix silently splitting up lines longer than 2048 bytes
attr: ignore attribute lines exceeding 2048 bytes
attr: ignore overly large gitattributes files
pretty: fix out-of-bounds write caused by integer overflow
pretty: fix out-of-bounds read when left-flushing with stealing
pretty: fix out-of-bounds read when parsing invalid padding format
pretty: fix adding linefeed when placeholder is not expanded
pretty: fix integer overflow in wrapping format
utf8: fix truncated string lengths in `utf8_strnwidth()`
utf8: fix returning negative string width
utf8: fix overflow when returning string width
utf8: fix checking for glyph width in `strbuf_utf8_replace()`
utf8: refactor `strbuf_utf8_replace` to not rely on preallocated buffer
pretty: restrict input lengths for padding and wrapping formats

View File

@ -0,0 +1,52 @@
Git v2.30.8 Release Notes
=========================
This release addresses the security issues CVE-2023-22490 and
CVE-2023-23946.
Fixes since v2.30.7
-------------------
* CVE-2023-22490:
Using a specially-crafted repository, Git can be tricked into using
its local clone optimization even when using a non-local transport.
Though Git will abort local clones whose source $GIT_DIR/objects
directory contains symbolic links (c.f., CVE-2022-39253), the objects
directory itself may still be a symbolic link.
These two may be combined to include arbitrary files based on known
paths on the victim's filesystem within the malicious repository's
working copy, allowing for data exfiltration in a similar manner as
CVE-2022-39253.
* CVE-2023-23946:
By feeding a crafted input to "git apply", a path outside the
working tree can be overwritten as the user who is running "git
apply".
* A mismatched type in `attr.c::read_attr_from_index()` which could
cause Git to errantly reject attributes on Windows and 32-bit Linux
has been corrected.
Credit for finding CVE-2023-22490 goes to yvvdwf, and the fix was
developed by Taylor Blau, with additional help from others on the
Git security mailing list.
Credit for finding CVE-2023-23946 goes to Joern Schneeweisz, and the
fix was developed by Patrick Steinhardt.
Johannes Schindelin (1):
attr: adjust a mismatched data type
Patrick Steinhardt (1):
apply: fix writing behind newly created symbolic links
Taylor Blau (3):
t5619: demonstrate clone_local() with ambiguous transport
clone: delay picking a transport until after get_repo_path()
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS

View File

@ -0,0 +1,43 @@
Git v2.30.9 Release Notes
=========================
This release addresses the security issues CVE-2023-25652,
CVE-2023-25815, and CVE-2023-29007.
Fixes since v2.30.8
-------------------
* CVE-2023-25652:
By feeding specially crafted input to `git apply --reject`, a
path outside the working tree can be overwritten with partially
controlled contents (corresponding to the rejected hunk(s) from
the given patch).
* CVE-2023-25815:
When Git is compiled with runtime prefix support and runs without
translated messages, it still used the gettext machinery to
display messages, which subsequently potentially looked for
translated messages in unexpected places. This allowed for
malicious placement of crafted messages.
* CVE-2023-29007:
When renaming or deleting a section from a configuration file,
certain malicious configuration values may be misinterpreted as
the beginning of a new configuration section, leading to arbitrary
configuration injection.
Credit for finding CVE-2023-25652 goes to Ry0taK, and the fix was
developed by Taylor Blau, Junio C Hamano and Johannes Schindelin,
with the help of Linus Torvalds.
Credit for finding CVE-2023-25815 goes to Maxime Escourbiac and
Yassine BENGANA of Michelin, and the fix was developed by Johannes
Schindelin.
Credit for finding CVE-2023-29007 goes to André Baptista and Vítor Pinho
of Ethiack, and the fix was developed by Taylor Blau, and Johannes
Schindelin, with help from Jeff King, and Patrick Steinhardt.

View File

@ -1,365 +0,0 @@
Git 2.31 Release Notes
======================
Updates since v2.30
-------------------
Backward incompatible and other important changes
* The "pack-redundant" command, which has been left stale with almost
unusable performance issues, now warns loudly when it gets used, as
we no longer want to recommend its use (instead just "repack -d"
instead).
* The development community has adopted Contributor Covenant v2.0 to
update from v1.4 that we have been using.
* The support for deprecated PCRE1 library has been dropped.
* Fixes for CVE-2021-21300 in Git 2.30.2 (and earlier) is included.
UI, Workflows & Features
* The "--format=%(trailers)" mechanism gets enhanced to make it
easier to design output for machine consumption.
* When a user does not tell "git pull" to use rebase or merge, the
command gives a loud message telling a user to choose between
rebase or merge but creates a merge anyway, forcing users who would
want to rebase to redo the operation. Fix an early part of this
problem by tightening the condition to give the message---there is
no reason to stop or force the user to choose between rebase or
merge if the history fast-forwards.
* The configuration variable 'core.abbrev' can be set to 'no' to
force no abbreviation regardless of the hash algorithm.
* "git rev-parse" can be explicitly told to give output as absolute
or relative path with the `--path-format=(absolute|relative)` option.
* Bash completion (in contrib/) update to make it easier for
end-users to add completion for their custom "git" subcommands.
* "git maintenance" learned to drive scheduled maintenance on
platforms whose native scheduling methods are not 'cron'.
* After expiring a reflog and making a single commit, the reflog for
the branch would record a single entry that knows both @{0} and
@{1}, but we failed to answer "what commit were we on?", i.e. @{1}
* "git bundle" learns "--stdin" option to read its refs from the
standard input. Also, it now does not lose refs whey they point
at the same object.
* "git log" learned a new "--diff-merges=<how>" option.
* "git ls-files" can and does show multiple entries when the index is
unmerged, which is a source for confusion unless -s/-u option is in
use. A new option --deduplicate has been introduced.
* `git worktree list` now annotates worktrees as prunable, shows
locked and prunable attributes in --porcelain mode, and gained
a --verbose option.
* "git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository. The protocol v2 learned how to do so.
* There are other ways than ".." for a single token to denote a
"commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
range-diff" did not understand them.
* The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.
* "git mergetool" feeds three versions (base, local and remote) of
a conflicted path unmodified. The command learned to optionally
prepare these files with unconflicted parts already resolved.
* The .mailmap is documented to be read only from the root level of a
working tree, but a stray file in a bare repository also was read
by accident, which has been corrected.
* "git maintenance" tool learned a new "pack-refs" maintenance task.
* The error message given when a configuration variable that is
expected to have a boolean value has been improved.
* Signed commits and tags now allow verification of objects, whose
two object names (one in SHA-1, the other in SHA-256) are both
signed.
* "git rev-list" command learned "--disk-usage" option.
* "git {diff,log} --{skip,rotate}-to=<path>" allows the user to
discard diff output for early paths or move them to the end of the
output.
* "git difftool" learned "--skip-to=<path>" option to restart an
interrupted session from an arbitrary path.
* "git grep" has been tweaked to be limited to the sparse checkout
paths.
* "git rebase --[no-]fork-point" gained a configuration variable
rebase.forkPoint so that users do not have to keep specifying a
non-default setting.
Performance, Internal Implementation, Development Support etc.
* A 3-year old test that was not testing anything useful has been
corrected.
* Retire more names with "sha1" in it.
* The topological walk codepath is covered by new trace2 stats.
* Update the Code-of-conduct to version 2.0 from the upstream (we've
been using version 1.4).
* "git mktag" validates its input using its own rules before writing
a tag object---it has been updated to share the logic with "git
fsck".
* Two new ways to feed configuration variable-value pairs via
environment variables have been introduced, and the way
GIT_CONFIG_PARAMETERS encodes variable/value pairs has been tweaked
to make it more robust.
* Tests have been updated so that they do not to get affected by the
name of the default branch "git init" creates.
* "git fetch" learns to treat ref updates atomically in all-or-none
fashion, just like "git push" does, with the new "--atomic" option.
* The peel_ref() API has been replaced with peel_iterated_oid().
* The .use_shell flag in struct child_process that is passed to
run_command() API has been clarified with a bit more documentation.
* Document, clean-up and optimize the code around the cache-tree
extension in the index.
* The ls-refs protocol operation has been optimized to narrow the
sub-hierarchy of refs/ it walks to produce response.
* When removing many branches and tags, the code used to do so one
ref at a time. There is another API it can use to delete multiple
refs, and it makes quite a lot of performance difference when the
refs are packed.
* The "pack-objects" command needs to iterate over all the tags when
automatic tag following is enabled, but it actually iterated over
all refs and then discarded everything outside "refs/tags/"
hierarchy, which was quite wasteful.
* A perf script was made more portable.
* Our setting of GitHub CI test jobs were a bit too eager to give up
once there is even one failure found. Tweak the knob to allow
other jobs keep running even when we see a failure, so that we can
find more failures in a single run.
* We've carried compatibility codepaths for compilers without
variadic macros for quite some time, but the world may be ready for
them to be removed. Force compilation failure on exotic platforms
where variadic macros are not available to find out who screams in
such a way that we can easily revert if it turns out that the world
is not yet ready.
* Code clean-up to ensure our use of hashtables using object names as
keys use the "struct object_id" objects, not the raw hash values.
* Lose the debugging aid that may have been useful in the past, but
no longer is, in the "grep" codepaths.
* Some pretty-format specifiers do not need the data in commit object
(e.g. "%H"), but we were over-eager to load and parse it, which has
been made even lazier.
* Get rid of "GETTEXT_POISON" support altogether, which may or may
not be controversial.
* Introduce an on-disk file to record revindex for packdata, which
traditionally was always created on the fly and only in-core.
* The commit-graph learned to use corrected commit dates instead of
the generation number to help topological revision traversal.
* Piecemeal of rewrite of "git bisect" in C continues.
* When a pager spawned by us exited, the trace log did not record its
exit status correctly, which has been corrected.
* Removal of GIT_TEST_GETTEXT_POISON continues.
* The code to implement "git merge-base --independent" was poorly
done and was kept from the very beginning of the feature.
* Preliminary changes to fsmonitor integration.
* Performance improvements for rename detection.
* The common code to deal with "chunked file format" that is shared
by the multi-pack-index and commit-graph files have been factored
out, to help codepaths for both filetypes to become more robust.
* The approach to "fsck" the incoming objects in "index-pack" is
attractive for performance reasons (we have them already in core,
inflated and ready to be inspected), but fundamentally cannot be
applied fully when we receive more than one pack stream, as a tree
object in one pack may refer to a blob object in another pack as
".gitmodules", when we want to inspect blobs that are used as
".gitmodules" file, for example. Teach "index-pack" to emit
objects that must be inspected later and check them in the calling
"fetch-pack" process.
* The logic to handle "trailer" related placeholders in the
"--format=" mechanisms in the "log" family and "for-each-ref"
family is getting unified.
* Raise the buffer size used when writing the index file out from
(obviously too small) 8kB to (clearly sufficiently large) 128kB.
* It is reported that open() on some platforms (e.g. macOS Big Sur)
can return EINTR even though our timers are set up with SA_RESTART.
A workaround has been implemented and enabled for macOS to rerun
open() transparently from the caller when this happens.
Fixes since v2.30
-----------------
* Diagnose command line error of "git rebase" early.
* Clean up option descriptions in "git cmd --help".
* "git stash" did not work well in a sparsely checked out working
tree.
* Some tests expect that "ls -l" output has either '-' or 'x' for
group executable bit, but setgid bit can be inherited from parent
directory and make these fields 'S' or 's' instead, causing test
failures.
* "git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.
* Fix 2.29 regression where "git mergetool --tool-help" fails to list
all the available tools.
* Fix for procedure to building CI test environment for mac.
* The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.
* Newline characters in the host and path part of git:// URL are
now forbidden.
* "git diff" showed a submodule working tree with untracked cruft as
"Submodule commit <objectname>-dirty", but a natural expectation is
that the "-dirty" indicator would align with "git describe --dirty",
which does not consider having untracked files in the working tree
as source of dirtiness. The inconsistency has been fixed.
* When more than one commit with the same patch ID appears on one
side, "git log --cherry-pick A...B" did not exclude them all when a
commit with the same patch ID appears on the other side. Now it
does.
* Documentation for "git fsck" lost stale bits that has become
incorrect.
* Doc fix for packfile URI feature.
* When "git rebase -i" processes "fixup" insn, there is no reason to
clean up the commit log message, but we did the usual stripspace
processing. This has been corrected.
(merge f7d42ceec5 js/rebase-i-commit-cleanup-fix later to maint).
* Fix in passing custom args from "git clone" to "upload-pack" on the
other side.
(merge ad6b5fefbd jv/upload-pack-filter-spec-quotefix later to maint).
* The command line completion (in contrib/) completed "git branch -d"
with branch names, but "git branch -D" offered tagnames in addition,
which has been corrected. "git branch -M" had the same problem.
(merge 27dc071b9a jk/complete-branch-force-delete later to maint).
* When commands are started from a subdirectory, they may have to
compare the path to the subdirectory (called prefix and found out
from $(pwd)) with the tracked paths. On macOS, $(pwd) and
readdir() yield decomposed path, while the tracked paths are
usually normalized to the precomposed form, causing mismatch. This
has been fixed by taking the same approach used to normalize the
command line arguments.
(merge 5c327502db tb/precompose-prefix-too later to maint).
* Even though invocations of "die()" were logged to the trace2
system, "BUG()"s were not, which has been corrected.
(merge 0a9dde4a04 jt/trace2-BUG later to maint).
* "git grep --untracked" is meant to be "let's ALSO find in these
files on the filesystem" when looking for matches in the working
tree files, and does not make any sense if the primary search is
done against the index, or the tree objects. The "--cached" and
"--untracked" options have been marked as mutually incompatible.
(merge 0c5d83b248 mt/grep-cached-untracked later to maint).
* Fix "git fsck --name-objects" which apparently has not been used by
anybody who is motivated enough to report breakage.
(merge e89f89361c js/fsck-name-objects-fix later to maint).
* Avoid individual tests in t5411 from getting affected by each other
by forcing them to use separate output files during the test.
(merge 822ee894f6 jx/t5411-unique-filenames later to maint).
* Test to make sure "git rev-parse one-thing one-thing" gives
the same thing twice (when one-thing is --since=X).
(merge a5cdca4520 ew/rev-parse-since-test later to maint).
* When certain features (e.g. grafts) used in the repository are
incompatible with the use of the commit-graph, we used to silently
turned commit-graph off; we now tell the user what we are doing.
(merge c85eec7fc3 js/commit-graph-warning later to maint).
* Objects that lost references can be pruned away, even when they
have notes attached to it (and these notes will become dangling,
which in turn can be pruned with "git notes prune"). This has been
clarified in the documentation.
(merge fa9ab027ba mz/doc-notes-are-not-anchors later to maint).
* The error codepath around the "--temp/--prefix" feature of "git
checkout-index" has been improved.
(merge 3f7ba60350 mt/checkout-index-corner-cases later to maint).
* The "git maintenance register" command had trouble registering bare
repositories, which had been corrected.
* A handful of multi-word configuration variable names in
documentation that are spelled in all lowercase have been corrected
to use the more canonical camelCase.
(merge 7dd0eaa39c dl/doc-config-camelcase later to maint).
* "git push $there --delete ''" should have been diagnosed as an
error, but instead turned into a matching push, which has been
corrected.
(merge 20e416409f jc/push-delete-nothing later to maint).
* Test script modernization.
(merge 488acf15df sv/t7001-modernize later to maint).
* An under-allocation for the untracked cache data has been corrected.
(merge 6347d649bc jh/untracked-cache-fix later to maint).
* Other code cleanup, docfix, build fix, etc.
(merge e3f5da7e60 sg/t7800-difftool-robustify later to maint).
(merge 9d336655ba js/doc-proto-v2-response-end later to maint).
(merge 1b5b8cf072 jc/maint-column-doc-typofix later to maint).
(merge 3a837b58e3 cw/pack-config-doc later to maint).
(merge 01168a9d89 ug/doc-commit-approxidate later to maint).
(merge b865734760 js/params-vs-args later to maint).

View File

@ -1,27 +0,0 @@
Git 2.31.1 Release Notes
========================
Fixes since v2.31
-----------------
* The fsmonitor interface read from its input without making sure
there is something to read from. This bug is new in 2.31
timeframe.
* The data structure used by fsmonitor interface was not properly
duplicated during an in-core merge, leading to use-after-free etc.
* "git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well. This regression
has been corrected.
* Fix macros that can silently inject unintended null-statements.
* CALLOC_ARRAY() macro replaces many uses of xcalloc().
* Update insn in Makefile comments to run fuzz-all target.
* Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.
Also contains various documentation updates and code clean-ups.

View File

@ -1,6 +0,0 @@
Git v2.31.2 Release Notes
=========================
This release merges up the fixes that appear in v2.30.3 to address
the security issue CVE-2022-24765; see the release notes for that
version for details.

View File

@ -1,4 +0,0 @@
Git Documentation/RelNotes/2.31.3.txt Release Notes
=========================
This release merges up the fixes that appear in v2.31.3.

View File

@ -1,6 +1,6 @@
-b::
Show blank SHA-1 for boundary commits. This can also
be controlled via the `blame.blankBoundary` config option.
be controlled via the `blame.blankboundary` config option.
--root::
Do not treat root commits as boundaries. This can also be

View File

@ -46,7 +46,7 @@ Subsection names are case sensitive and can contain any characters except
newline and the null byte. Doublequote `"` and backslash can be included
by escaping them as `\"` and `\\`, respectively. Backslashes preceding
other characters are dropped when reading; for example, `\t` is read as
`t` and `\0` is read as `0`. Section headers cannot span multiple lines.
`t` and `\0` is read as `0` Section headers cannot span multiple lines.
Variables may belong directly to a section or to a given subsection. You
can have `[section]` if you have `[section "subsection"]`, but you don't
need to.
@ -398,8 +398,6 @@ include::config/interactive.txt[]
include::config/log.txt[]
include::config/lsrefs.txt[]
include::config/mailinfo.txt[]
include::config/mailmap.txt[]

View File

@ -625,6 +625,4 @@ core.abbrev::
computed based on the approximate number of packed objects
in your repository, which hopefully is enough for
abbreviated object names to stay unique for some time.
If set to "no", no abbreviation is made and the object names
are shown in their full length.
The minimum length is 4.

View File

@ -85,8 +85,6 @@ diff.ignoreSubmodules::
and 'git status' when `status.submoduleSummary` is set unless it is
overridden by using the --ignore-submodules command-line option.
The 'git submodule' commands are not affected by this setting.
By default this is set to untracked so that any untracked
submodules are ignored.
diff.mnemonicPrefix::
If set, 'git diff' uses a prefix pair that is different from the

View File

@ -4,4 +4,4 @@ init.templateDir::
init.defaultBranch::
Allows overriding the default branch name e.g. when initializing
a new repository.
a new repository or when cloning an empty repository.

View File

@ -1,9 +0,0 @@
lsrefs.unborn::
May be "advertise" (the default), "allow", or "ignore". If "advertise",
the server will respond to the client sending "unborn" (as described in
protocol-v2.txt) and will advertise support for this feature during the
protocol v2 capability advertisement. "allow" is the same as
"advertise" except that the server will not advertise support for this
feature; this is useful for load-balanced servers that cannot be
updated atomically (for example), since the administrator could
configure "allow", then after a delay, configure "advertise".

View File

@ -15,9 +15,8 @@ maintenance.strategy::
* `none`: This default setting implies no task are run at any schedule.
* `incremental`: This setting optimizes for performing small maintenance
activities that do not delete any data. This does not schedule the `gc`
task, but runs the `prefetch` and `commit-graph` tasks hourly, the
`loose-objects` and `incremental-repack` tasks daily, and the `pack-refs`
task weekly.
task, but runs the `prefetch` and `commit-graph` tasks hourly and the
`loose-objects` and `incremental-repack` tasks daily.
maintenance.<task>.enabled::
This boolean config option controls whether the maintenance task

View File

@ -13,11 +13,6 @@ mergetool.<tool>.cmd::
merged; 'MERGED' contains the name of the file to which the merge
tool should write the results of a successful merge.
mergetool.<tool>.hideResolved::
Allows the user to override the global `mergetool.hideResolved` value
for a specific tool. See `mergetool.hideResolved` for the full
description.
mergetool.<tool>.trustExitCode::
For a custom merge command, specify whether the exit code of
the merge command can be used to determine whether the merge was
@ -45,16 +40,6 @@ mergetool.meld.useAutoMerge::
value of `false` avoids using `--auto-merge` altogether, and is the
default value.
mergetool.hideResolved::
During a merge Git will automatically resolve as many conflicts as
possible and write the 'MERGED' file containing conflict markers around
any conflicts that it cannot resolve; 'LOCAL' and 'REMOTE' normally
represent the versions of the file from before Git's conflict
resolution. This flag causes 'LOCAL' and 'REMOTE' to be overwriten so
that only the unresolved conflicts are presented to the merge tool. Can
be configured per-tool via the `mergetool.<tool>.hideResolved`
configuration variable. Defaults to `false`.
mergetool.keepBackup::
After performing a merge, the original file with conflict markers
can be saved as a file with a `.orig` extension. If this variable

View File

@ -133,10 +133,3 @@ pack.writeBitmapHashCache::
between an older, bitmapped pack and objects that have been
pushed since the last gc). The downside is that it consumes 4
bytes per object of disk space. Defaults to true.
pack.writeReverseIndex::
When true, git will write a corresponding .rev file (see:
link:../technical/pack-format.html[Documentation/technical/pack-format.txt])
for each new packfile that it writes in all places except for
linkgit:git-fast-import[1] and in the bulk checkin mechanism.
Defaults to false.

View File

@ -1,10 +1,10 @@
protocol.allow::
If set, provide a user defined default policy for all protocols which
don't explicitly have a policy (`protocol.<name>.allow`). By default,
if unset, known-safe protocols (http, https, git, ssh, file) have a
if unset, known-safe protocols (http, https, git, ssh) have a
default policy of `always`, known-dangerous protocols (ext) have a
default policy of `never`, and all other protocols have a default
policy of `user`. Supported policies:
default policy of `never`, and all other protocols (including file)
have a default policy of `user`. Supported policies:
+
--

View File

@ -68,6 +68,3 @@ rebase.rescheduleFailedExec::
Automatically reschedule `exec` commands that failed. This only makes
sense in interactive mode (or when an `--exec` option was provided).
This is the same as specifying the `--reschedule-failed-exec` option.
rebase.forkPoint::
If set to false set `--no-fork-point` option by default.

View File

@ -26,3 +26,17 @@ directory was listed in the `safe.directory` list. If `safe.directory=*`
is set in system config and you want to re-enable this protection, then
initialize your list with an empty value before listing the repositories
that you deem safe.
+
As explained, Git only allows you to access repositories owned by
yourself, i.e. the user who is running Git, by default. When Git
is running as 'root' in a non Windows platform that provides sudo,
however, git checks the SUDO_UID environment variable that sudo creates
and will allow access to the uid recorded as its value in addition to
the id from 'root'.
This is to make it easy to perform a common sequence during installation
"make && sudo make install". A git process running under 'sudo' runs as
'root' but the 'sudo' command exports the environment variable to record
which id the original user has.
If that is not what you would prefer and want git to only trust
repositories that are owned by root instead, then you can remove
the `SUDO_UID` variable from root's environment before invoking git.

View File

@ -1,7 +1,10 @@
DATE FORMATS
------------
The `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` environment variables
The `GIT_AUTHOR_DATE`, `GIT_COMMITTER_DATE` environment variables
ifdef::git-commit[]
and the `--date` option
endif::git-commit[]
support the following date formats:
Git internal format::
@ -23,9 +26,3 @@ ISO 8601::
+
NOTE: In addition, the date part is accepted in the following formats:
`YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
ifdef::git-commit[]
In addition to recognizing all date formats above, the `--date` option
will also try to make sense of other, more human-centric date formats,
such as relative dates like "yesterday" or "last Friday at noon".
endif::git-commit[]

View File

@ -81,9 +81,9 @@ Combined diff format
Any diff-generating command can take the `-c` or `--cc` option to
produce a 'combined diff' when showing a merge. This is the default
format when showing merges with linkgit:git-diff[1] or
linkgit:git-show[1]. Note also that you can give suitable
`--diff-merges` option to any of these commands to force generation of
diffs in specific format.
linkgit:git-show[1]. Note also that you can give the `-m` option to any
of these commands to force generation of diffs with individual parents
of a merge.
A "combined diff" format looks like this:

View File

@ -33,57 +33,6 @@ endif::git-diff[]
show the patch by default, or to cancel the effect of `--patch`.
endif::git-format-patch[]
ifdef::git-log[]
--diff-merges=(off|none|first-parent|1|separate|m|combined|c|dense-combined|cc)::
--no-diff-merges::
Specify diff format to be used for merge commits. Default is
{diff-merges-default} unless `--first-parent` is in use, in which case
`first-parent` is the default.
+
--diff-merges=(off|none):::
--no-diff-merges:::
Disable output of diffs for merge commits. Useful to override
implied value.
+
--diff-merges=first-parent:::
--diff-merges=1:::
This option makes merge commits show the full diff with
respect to the first parent only.
+
--diff-merges=separate:::
--diff-merges=m:::
-m:::
This makes merge commits show the full diff with respect to
each of the parents. Separate log entry and diff is generated
for each parent. `-m` doesn't produce any output without `-p`.
+
--diff-merges=combined:::
--diff-merges=c:::
-c:::
With this option, diff output for a merge commit shows the
differences from each of the parents to the merge result
simultaneously instead of showing pairwise diff between a
parent and the result one at a time. Furthermore, it lists
only files which were modified from all parents. `-c` implies
`-p`.
+
--diff-merges=dense-combined:::
--diff-merges=cc:::
--cc:::
With this option the output produced by
`--diff-merges=combined` is further compressed by omitting
uninteresting hunks whose contents in the parents have only
two variants and the merge result picks one of them without
modification. `--cc` implies `-p`.
--combined-all-paths::
This flag causes combined diffs (used for merge commits) to
list the name of the file from all parents. It thus only has
effect when `--diff-merges=[dense-]combined` is in use, and
is likely only useful if filename changes are detected (i.e.
when either rename or copy detection have been requested).
endif::git-log[]
-U<n>::
--unified=<n>::
Generate diffs with <n> lines of context instead of
@ -700,14 +649,6 @@ matches a pattern if removing any number of the final pathname
components matches the pattern. For example, the pattern "`foo*bar`"
matches "`fooasdfbar`" and "`foo/bar/baz/asdf`" but not "`foobarx`".
--skip-to=<file>::
--rotate-to=<file>::
Discard the files before the named <file> from the output
(i.e. 'skip to'), or move them to the end of the output
(i.e. 'rotate to'). These were invented primarily for use
of the `git difftool` command, and may not be very useful
otherwise.
ifndef::git-format-patch[]
-R::
Swap two inputs; that is, show differences from index or

View File

@ -7,10 +7,6 @@
existing contents of `.git/FETCH_HEAD`. Without this
option old data in `.git/FETCH_HEAD` will be overwritten.
--atomic::
Use an atomic transaction to update local refs. Either all refs are
updated, or on error, no refs are updated.
--depth=<depth>::
Limit fetching to the specified number of commits from the tip of
each remote branch history. If fetching to a 'shallow' repository

View File

@ -79,7 +79,7 @@ OPTIONS
Pass `-u` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]).
The proposed commit log message taken from the e-mail
is re-coded into UTF-8 encoding (configuration variable
`i18n.commitEncoding` can be used to specify project's
`i18n.commitencoding` can be used to specify project's
preferred encoding if it is not UTF-8).
+
This was optional in prior versions of git, but now it is the

View File

@ -226,7 +226,7 @@ commit commentary), a blame viewer will not care.
MAPPING AUTHORS
---------------
See linkgit:gitmailmap[5].
include::mailmap.txt[]
SEE ALSO

View File

@ -78,8 +78,8 @@ renaming. If <newbranch> exists, -M must be used to force the rename
to happen.
The `-c` and `-C` options have the exact same semantics as `-m` and
`-M`, except instead of the branch being renamed, it will be copied to a
new name, along with its config and reflog.
`-M`, except instead of the branch being renamed it along with its
config and reflog will be copied to a new name.
With a `-d` or `-D` option, `<branchname>` will be deleted. You may
specify more than one branch for deletion. If the branch currently
@ -153,7 +153,7 @@ OPTIONS
--column[=<options>]::
--no-column::
Display branch listing in columns. See configuration variable
`column.branch` for option syntax. `--column` and `--no-column`
column.branch for option syntax.`--column` and `--no-column`
without options are equivalent to 'always' and 'never' respectively.
+
This option is only applicable in non-verbose mode.

View File

@ -36,17 +36,10 @@ name is provided or known to the 'mailmap', ``Name $$<user@host>$$'' is
printed; otherwise only ``$$<user@host>$$'' is printed.
CONFIGURATION
-------------
See `mailmap.file` and `mailmap.blob` in linkgit:git-config[1] for how
to specify a custom `.mailmap` target file or object.
MAPPING AUTHORS
---------------
See linkgit:gitmailmap[5].
include::mailmap.txt[]
GIT

View File

@ -346,22 +346,6 @@ GIT_CONFIG_NOSYSTEM::
See also <<FILES>>.
GIT_CONFIG_COUNT::
GIT_CONFIG_KEY_<n>::
GIT_CONFIG_VALUE_<n>::
If GIT_CONFIG_COUNT is set to a positive number, all environment pairs
GIT_CONFIG_KEY_<n> and GIT_CONFIG_VALUE_<n> up to that number will be
added to the process's runtime configuration. The config pairs are
zero-indexed. Any missing key or value is treated as an error. An empty
GIT_CONFIG_COUNT is treated the same as GIT_CONFIG_COUNT=0, namely no
pairs are processed. These environment variables will override values
in configuration files, but will be overridden by any explicit options
passed via `git -c`.
+
This is useful for cases where you want to spawn multiple git commands
with a common configuration but cannot depend on a configuration file,
for example when writing scripts.
[[EXAMPLES]]
EXAMPLES

View File

@ -34,14 +34,6 @@ OPTIONS
This is the default behaviour; the option is provided to
override any configuration settings.
--rotate-to=<file>::
Start showing the diff for the given path,
the paths before it will move to end and output.
--skip-to=<file>::
Start showing the diff for the given path, skipping all
the paths before it.
-t <tool>::
--tool=<tool>::
Use the diff tool specified by <tool>. Valid values include

View File

@ -260,9 +260,11 @@ contents:lines=N::
The first `N` lines of the message.
Additionally, the trailers as interpreted by linkgit:git-interpret-trailers[1]
are obtained as `trailers[:options]` (or by using the historical alias
`contents:trailers[:options]`). For valid [:option] values see `trailers`
section of linkgit:git-log[1].
are obtained as `trailers` (or by using the historical alias
`contents:trailers`). Non-trailer lines from the trailer block can be omitted
with `trailers:only`. Whitespace-continuations can be removed from trailers so
that each trailer appears on a line by itself with its full content with
`trailers:unfold`. Both can be used together as `trailers:unfold,only`.
For sorting purposes, fields with numeric values sort in numeric order
(`objectsize`, `authordate`, `committerdate`, `creatordate`, `taggerdate`).

View File

@ -117,14 +117,12 @@ NOTES
'git gc' tries very hard not to delete objects that are referenced
anywhere in your repository. In particular, it will keep not only
objects referenced by your current set of branches and tags, but also
objects referenced by the index, remote-tracking branches, reflogs
(which may reference commits in branches that were later amended or
rewound), and anything else in the refs/* namespace. Note that a note
(of the kind created by 'git notes') attached to an object does not
contribute in keeping the object alive. If you are expecting some
objects to be deleted and they aren't, check all of those locations
and decide whether it makes sense in your case to remove those
references.
objects referenced by the index, remote-tracking branches, notes saved
by 'git notes' under refs/notes/, reflogs (which may reference commits
in branches that were later amended or rewound), and anything else in
the refs/* namespace. If you are expecting some objects to be deleted
and they aren't, check all of those locations and decide whether it
makes sense in your case to remove those references.
On the other hand, when 'git gc' runs concurrently with another process,
there is a risk of it deleting an object that the other process is using

View File

@ -41,17 +41,11 @@ commit-id::
<commit-id>['\t'<filename-as-in--w>]
--packfile=<hash>::
For internal use only. Instead of a commit id on the command
line (which is not expected in
Instead of a commit id on the command line (which is not expected in
this case), 'git http-fetch' fetches the packfile directly at the given
URL and uses index-pack to generate corresponding .idx and .keep files.
The hash is used to determine the name of the temporary file and is
arbitrary. The output of index-pack is printed to stdout. Requires
--index-pack-args.
--index-pack-args=<args>::
For internal use only. The command to run on the contents of the
downloaded pack. Arguments are URL-encoded separated by spaces.
arbitrary. The output of index-pack is printed to stdout.
--recover::
Verify that everything reachable from target is fetched. Used after

View File

@ -9,18 +9,17 @@ git-index-pack - Build pack index file for an existing packed archive
SYNOPSIS
--------
[verse]
'git index-pack' [-v] [-o <index-file>] [--[no-]rev-index] <pack-file>
'git index-pack' [-v] [-o <index-file>] <pack-file>
'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o <index-file>]
[--[no-]rev-index] [<pack-file>]
[<pack-file>]
DESCRIPTION
-----------
Reads a packed archive (.pack) from the specified file, and
builds a pack index file (.idx) for it. Optionally writes a
reverse-index (.rev) for the specified pack. The packed
archive together with the pack index can then be placed in
the objects/pack/ directory of a Git repository.
builds a pack index file (.idx) for it. The packed archive
together with the pack index can then be placed in the
objects/pack/ directory of a Git repository.
OPTIONS
@ -36,13 +35,6 @@ OPTIONS
fails if the name of packed archive does not end
with .pack).
--[no-]rev-index::
When this flag is provided, generate a reverse index
(a `.rev` file) corresponding to the given pack. If
`--verify` is given, ensure that the existing
reverse index is correct. Takes precedence over
`pack.writeReverseIndex`.
--stdin::
When this flag is provided, the pack is read from stdin
instead and a copy is then written to <pack-file>. If
@ -86,12 +78,7 @@ OPTIONS
Die if the pack contains broken links. For internal use only.
--fsck-objects::
For internal use only.
+
Die if the pack contains broken objects. If the pack contains a tree
pointing to a .gitmodules blob that does not exist, prints the hash of
that blob (for the caller to check) after the hash that goes into the
name of the pack/idx file (see "Notes").
Die if the pack contains broken objects. For internal use only.
--threads=<n>::
Specifies the number of threads to spawn when resolving

View File

@ -107,15 +107,47 @@ DIFF FORMATTING
By default, `git log` does not generate any diff output. The options
below can be used to show the changes made by each commit.
Note that unless one of `--diff-merges` variants (including short
`-m`, `-c`, and `--cc` options) is explicitly given, merge commits
will not show a diff, even if a diff format like `--patch` is
selected, nor will they match search options like `-S`. The exception
is when `--first-parent` is in use, in which case `first-parent` is
the default format.
Note that unless one of `-c`, `--cc`, or `-m` is given, merge commits
will never show a diff, even if a diff format like `--patch` is
selected, nor will they match search options like `-S`. The exception is
when `--first-parent` is in use, in which merges are treated like normal
single-parent commits (this can be overridden by providing a
combined-diff option or with `--no-diff-merges`).
-c::
With this option, diff output for a merge commit
shows the differences from each of the parents to the merge result
simultaneously instead of showing pairwise diff between a parent
and the result one at a time. Furthermore, it lists only files
which were modified from all parents.
--cc::
This flag implies the `-c` option and further compresses the
patch output by omitting uninteresting hunks whose contents in
the parents have only two variants and the merge result picks
one of them without modification.
--combined-all-paths::
This flag causes combined diffs (used for merge commits) to
list the name of the file from all parents. It thus only has
effect when -c or --cc are specified, and is likely only
useful if filename changes are detected (i.e. when either
rename or copy detection have been requested).
-m::
This flag makes the merge commits show the full diff like
regular commits; for each merge parent, a separate log entry
and diff is generated. An exception is that only diff against
the first parent is shown when `--first-parent` option is given;
in that case, the output represents the changes the merge
brought _into_ the then-current branch.
--diff-merges=off::
--no-diff-merges::
Disable output of diffs for merge commits (default). Useful to
override `-m`, `-c`, or `--cc`.
:git-log: 1
:diff-merges-default: `off`
include::diff-options.txt[]
include::diff-generate-patch.txt[]

View File

@ -13,7 +13,6 @@ SYNOPSIS
(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
(-[c|d|o|i|s|u|k|m])*
[--eol]
[--deduplicate]
[-x <pattern>|--exclude=<pattern>]
[-X <file>|--exclude-from=<file>]
[--exclude-per-directory=<file>]
@ -81,13 +80,6 @@ OPTIONS
\0 line termination on output and do not quote filenames.
See OUTPUT below for more information.
--deduplicate::
When only filenames are shown, suppress duplicates that may
come from having multiple stages during a merge, or giving
`--deleted` and `--modified` option at the same time.
When any of the `-t`, `--unmerged`, or `--stage` option is
in use, this option has no effect.
-x <pattern>::
--exclude=<pattern>::
Skip untracked files matching pattern.

View File

@ -53,7 +53,7 @@ character.
The commit log message, author name and author email are
taken from the e-mail, and after minimally decoding MIME
transfer encoding, re-coded in the charset specified by
`i18n.commitEncoding` (defaulting to UTF-8) by transliterating
i18n.commitencoding (defaulting to UTF-8) by transliterating
them. This used to be optional but now it is the default.
+
Note that the patch is always used as-is without charset
@ -61,7 +61,7 @@ conversion, even with this flag.
--encoding=<encoding>::
Similar to -u. But when re-coding, the charset specified here is
used instead of the one specified by `i18n.commitEncoding` or UTF-8.
used instead of the one specified by i18n.commitencoding or UTF-8.
-n::
Disable all charset re-coding of the metadata.

View File

@ -145,12 +145,6 @@ incremental-repack::
which is a special case that attempts to repack all pack-files
into a single pack-file.
pack-refs::
The `pack-refs` task collects the loose reference files and
collects them into a single file. This speeds up operations that
need to iterate across many references. See linkgit:git-pack-refs[1]
for more information.
OPTIONS
-------
--auto::
@ -224,122 +218,6 @@ Further, the `git gc` command should not be combined with
but does not take the lock in the same way as `git maintenance run`. If
possible, use `git maintenance run --task=gc` instead of `git gc`.
The following sections describe the mechanisms put in place to run
background maintenance by `git maintenance start` and how to customize
them.
BACKGROUND MAINTENANCE ON POSIX SYSTEMS
---------------------------------------
The standard mechanism for scheduling background tasks on POSIX systems
is cron(8). This tool executes commands based on a given schedule. The
current list of user-scheduled tasks can be found by running `crontab -l`.
The schedule written by `git maintenance start` is similar to this:
-----------------------------------------------------------------------
# BEGIN GIT MAINTENANCE SCHEDULE
# The following schedule was created by Git
# Any edits made in this region might be
# replaced in the future by a Git command.
0 1-23 * * * "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=hourly
0 0 * * 1-6 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=daily
0 0 * * 0 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=weekly
# END GIT MAINTENANCE SCHEDULE
-----------------------------------------------------------------------
The comments are used as a region to mark the schedule as written by Git.
Any modifications within this region will be completely deleted by
`git maintenance stop` or overwritten by `git maintenance start`.
The `crontab` entry specifies the full path of the `git` executable to
ensure that the executed `git` command is the same one with which
`git maintenance start` was issued independent of `PATH`. If the same user
runs `git maintenance start` with multiple Git executables, then only the
latest executable is used.
These commands use `git for-each-repo --config=maintenance.repo` to run
`git maintenance run --schedule=<frequency>` on each repository listed in
the multi-valued `maintenance.repo` config option. These are typically
loaded from the user-specific global config. The `git maintenance` process
then determines which maintenance tasks are configured to run on each
repository with each `<frequency>` using the `maintenance.<task>.schedule`
config options. These values are loaded from the global or repository
config values.
If the config values are insufficient to achieve your desired background
maintenance schedule, then you can create your own schedule. If you run
`crontab -e`, then an editor will load with your user-specific `cron`
schedule. In that editor, you can add your own schedule lines. You could
start by adapting the default schedule listed earlier, or you could read
the crontab(5) documentation for advanced scheduling techniques. Please
do use the full path and `--exec-path` techniques from the default
schedule to ensure you are executing the correct binaries in your
schedule.
BACKGROUND MAINTENANCE ON MACOS SYSTEMS
---------------------------------------
While macOS technically supports `cron`, using `crontab -e` requires
elevated privileges and the executed process does not have a full user
context. Without a full user context, Git and its credential helpers
cannot access stored credentials, so some maintenance tasks are not
functional.
Instead, `git maintenance start` interacts with the `launchctl` tool,
which is the recommended way to schedule timed jobs in macOS. Scheduling
maintenance through `git maintenance (start|stop)` requires some
`launchctl` features available only in macOS 10.11 or later.
Your user-specific scheduled tasks are stored as XML-formatted `.plist`
files in `~/Library/LaunchAgents/`. You can see the currently-registered
tasks using the following command:
-----------------------------------------------------------------------
$ ls ~/Library/LaunchAgents/org.git-scm.git*
org.git-scm.git.daily.plist
org.git-scm.git.hourly.plist
org.git-scm.git.weekly.plist
-----------------------------------------------------------------------
One task is registered for each `--schedule=<frequency>` option. To
inspect how the XML format describes each schedule, open one of these
`.plist` files in an editor and inspect the `<array>` element following
the `<key>StartCalendarInterval</key>` element.
`git maintenance start` will overwrite these files and register the
tasks again with `launchctl`, so any customizations should be done by
creating your own `.plist` files with distinct names. Similarly, the
`git maintenance stop` command will unregister the tasks with `launchctl`
and delete the `.plist` files.
To create more advanced customizations to your background tasks, see
launchctl.plist(5) for more information.
BACKGROUND MAINTENANCE ON WINDOWS SYSTEMS
-----------------------------------------
Windows does not support `cron` and instead has its own system for
scheduling background tasks. The `git maintenance start` command uses
the `schtasks` command to submit tasks to this system. You can inspect
all background tasks using the Task Scheduler application. The tasks
added by Git have names of the form `Git Maintenance (<frequency>)`.
The Task Scheduler GUI has ways to inspect these tasks, but you can also
export the tasks to XML files and view the details there.
Note that since Git is a console application, these background tasks
create a console window visible to the current user. This can be changed
manually by selecting the "Run whether user is logged in or not" option
in Task Scheduler. This change requires a password input, which is why
`git maintenance start` does not select it by default.
If you want to customize the background tasks, please rename the tasks
so future calls to `git maintenance (start|stop)` do not overwrite your
custom tasks.
GIT
---

View File

@ -38,10 +38,6 @@ get_merge_tool_cmd::
get_merge_tool_path::
returns the custom path for a merge tool.
initialize_merge_tool::
bring merge tool specific functions into scope so they can be used or
overridden.
run_merge_tool::
launches a merge tool given the tool name and a true/false
flag to indicate whether a merge base is present.

View File

@ -99,10 +99,6 @@ success of the resolution after the custom tool has exited.
(see linkgit:git-config[1]). To cancel `diff.orderFile`,
use `-O/dev/null`.
CONFIGURATION
-------------
include::config/mergetool.txt[]
TEMPORARY FILES
---------------
`git mergetool` creates `*.orig` backup files while resolving merges.

View File

@ -3,7 +3,7 @@ git-mktag(1)
NAME
----
git-mktag - Creates a tag object with extra validation
git-mktag - Creates a tag object
SYNOPSIS
@ -11,52 +11,25 @@ SYNOPSIS
[verse]
'git mktag'
OPTIONS
-------
--strict::
By default mktag turns on the equivalent of
linkgit:git-fsck[1] `--strict` mode. Use `--no-strict` to
disable it.
DESCRIPTION
-----------
Reads a tag contents on standard input and creates a tag object
that can also be used to sign other objects.
Reads a tag contents on standard input and creates a tag object. The
output is the new tag's <object> identifier.
This command is mostly equivalent to linkgit:git-hash-object[1]
invoked with `-t tag -w --stdin`. I.e. both of these will create and
write a tag found in `my-tag`:
git mktag <my-tag
git hash-object -t tag -w --stdin <my-tag
The difference is that mktag will die before writing the tag if the
tag doesn't pass a linkgit:git-fsck[1] check.
The "fsck" check done mktag is stricter than what linkgit:git-fsck[1]
would run by default in that all `fsck.<msg-id>` messages are promoted
from warnings to errors (so e.g. a missing "tagger" line is an error).
Extra headers in the object are also an error under mktag, but ignored
by linkgit:git-fsck[1]. This extra check can be turned off by setting
the appropriate `fsck.<msg-id>` varible:
git -c fsck.extraHeaderEntry=ignore mktag <my-tag-with-headers
The output is the new tag's <object> identifier.
Tag Format
----------
A tag signature file, to be fed to this command's standard input,
has a very simple fixed format: four lines of
object <hash>
object <sha1>
type <typename>
tag <tagname>
tagger <tagger>
followed by some 'optional' free-form message (some tags created
by older Git may not have `tagger` line). The message, when it
by older Git may not have `tagger` line). The message, when
exists, is separated by a blank line from the header. The
message part may contain a signature that Git itself doesn't
care about, but that can be verified with gpg.

View File

@ -400,17 +400,6 @@ Note that we pick a single island for each regex to go into, using "last
one wins" ordering (which allows repo-specific config to take precedence
over user-wide config, and so forth).
CONFIGURATION
-------------
Various configuration variables affect packing, see
linkgit:git-config[1] (search for "pack" and "delta").
Notably, delta compression is not used on objects larger than the
`core.bigFileThreshold` configuration variable and on files with the
attribute `delta` set to false.
SEE ALSO
--------
linkgit:git-rev-list[1]

View File

@ -10,7 +10,6 @@ SYNOPSIS
[verse]
'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
[--no-dual-color] [--creation-factor=<factor>]
[--left-only | --right-only]
( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
DESCRIPTION
@ -29,17 +28,6 @@ Finally, the list of matching commits is shown in the order of the
second commit range, with unmatched commits being inserted just after
all of their ancestors have been shown.
There are three ways to specify the commit ranges:
- `<range1> <range2>`: Either commit range can be of the form
`<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES`
in linkgit:gitrevisions[7] for more details.
- `<rev1>...<rev2>`. This is equivalent to
`<rev2>..<rev1> <rev1>..<rev2>`.
- `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1>
<base>..<rev2>`.
OPTIONS
-------
@ -69,14 +57,6 @@ to revert to color all lines according to the outer diff markers
See the ``Algorithm`` section below for an explanation why this is
needed.
--left-only::
Suppress commits that are missing from the first specified range
(or the "left range" when using the `<rev1>...<rev2>` format).
--right-only::
Suppress commits that are missing from the second specified range
(or the "right range" when using the `<rev1>...<rev2>` format).
--[no-]notes[=<ref>]::
This flag is passed to the `git log` program
(see linkgit:git-log[1]) that generates the patches.

View File

@ -165,12 +165,9 @@ depth is 4095.
Pass the `--delta-islands` option to `git-pack-objects`, see
linkgit:git-pack-objects[1].
CONFIGURATION
Configuration
-------------
Various configuration variables affect packing, see
linkgit:git-config[1] (search for "pack" and "delta").
By default, the command passes `--delta-base-offset` option to
'git pack-objects'; this typically results in slightly smaller packs,
but the generated packs are incompatible with versions of Git older than
@ -181,10 +178,6 @@ need to set the configuration variable `repack.UseDeltaBaseOffset` to
is unaffected by this option as the conversion is performed on the fly
as needed in that case.
Delta compression is not used on objects larger than the
`core.bigFileThreshold` configuration variable and on files with the
attribute `delta` set to false.
SEE ALSO
--------
linkgit:git-pack-objects[1]

View File

@ -31,99 +31,6 @@ include::rev-list-options.txt[]
include::pretty-formats.txt[]
EXAMPLES
--------
* Print the list of commits reachable from the current branch.
+
----------
git rev-list HEAD
----------
* Print the list of commits on this branch, but not present in the
upstream branch.
+
----------
git rev-list @{upstream}..HEAD
----------
* Format commits with their author and commit message (see also the
porcelain linkgit:git-log[1]).
+
----------
git rev-list --format=medium HEAD
----------
* Format commits along with their diffs (see also the porcelain
linkgit:git-log[1], which can do this in a single process).
+
----------
git rev-list HEAD |
git diff-tree --stdin --format=medium -p
----------
* Print the list of commits on the current branch that touched any
file in the `Documentation` directory.
+
----------
git rev-list HEAD -- Documentation/
----------
* Print the list of commits authored by you in the past year, on
any branch, tag, or other ref.
+
----------
git rev-list --author=you@example.com --since=1.year.ago --all
----------
* Print the list of objects reachable from the current branch (i.e., all
commits and the blobs and trees they contain).
+
----------
git rev-list --objects HEAD
----------
* Compare the disk size of all reachable objects, versus those
reachable from reflogs, versus the total packed size. This can tell
you whether running `git repack -ad` might reduce the repository size
(by dropping unreachable objects), and whether expiring reflogs might
help.
+
----------
# reachable objects
git rev-list --disk-usage --objects --all
# plus reflogs
git rev-list --disk-usage --objects --all --reflog
# total disk size used
du -c .git/objects/pack/*.pack .git/objects/??/*
# alternative to du: add up "size" and "size-pack" fields
git count-objects -v
----------
* Report the disk size of each branch, not including objects used by the
current branch. This can find outliers that are contributing to a
bloated repository size (e.g., because somebody accidentally committed
large build artifacts).
+
----------
git for-each-ref --format='%(refname)' |
while read branch
do
size=$(git rev-list --disk-usage --objects HEAD..$branch)
echo "$size $branch"
done |
sort -n
----------
* Compare the on-disk size of branches in one group of refs, excluding
another. If you co-mingle objects from multiple remotes in a single
repository, this can show which remotes are contributing to the
repository size (taking the size of `origin` as a baseline).
+
----------
git rev-list --disk-usage --objects --remotes=$suspect --not --remotes=origin
----------
GIT
---
Part of the linkgit:git[1] suite

View File

@ -212,18 +212,6 @@ Options for Files
Only the names of the variables are listed, not their value,
even if they are set.
--path-format=(absolute|relative)::
Controls the behavior of certain other options. If specified as absolute, the
paths printed by those options will be absolute and canonical. If specified as
relative, the paths will be relative to the current working directory if that
is possible. The default is option specific.
+
This option may be specified multiple times and affects only the arguments that
follow it on the command line, either to the end of the command line or the next
instance of this option.
The following options are modified by `--path-format`:
--git-dir::
Show `$GIT_DIR` if defined. Otherwise show the path to
the .git directory. The path shown, when relative, is
@ -233,42 +221,13 @@ If `$GIT_DIR` is not defined and the current directory
is not detected to lie in a Git repository or work tree
print a message to stderr and exit with nonzero status.
--git-common-dir::
Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
--resolve-git-dir <path>::
Check if <path> is a valid repository or a gitfile that
points at a valid repository, and print the location of the
repository. If <path> is a gitfile then the resolved path
to the real repository is printed.
--git-path <path>::
Resolve "$GIT_DIR/<path>" and takes other path relocation
variables such as $GIT_OBJECT_DIRECTORY,
$GIT_INDEX_FILE... into account. For example, if
$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
--git-path objects/abc" returns /foo/bar/abc.
--show-toplevel::
Show the (by default, absolute) path of the top-level directory
of the working tree. If there is no working tree, report an error.
--show-superproject-working-tree::
Show the absolute path of the root of the superproject's
working tree (if exists) that uses the current repository as
its submodule. Outputs nothing if the current repository is
not used as a submodule by any project.
--shared-index-path::
Show the path to the shared index file in split index mode, or
empty if not in split-index mode.
The following options are unaffected by `--path-format`:
--absolute-git-dir::
Like `--git-dir`, but its output is always the canonicalized
absolute path.
--git-common-dir::
Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
--is-inside-git-dir::
When the current working directory is below the repository
directory print "true", otherwise "false".
@ -283,6 +242,19 @@ The following options are unaffected by `--path-format`:
--is-shallow-repository::
When the repository is shallow print "true", otherwise "false".
--resolve-git-dir <path>::
Check if <path> is a valid repository or a gitfile that
points at a valid repository, and print the location of the
repository. If <path> is a gitfile then the resolved path
to the real repository is printed.
--git-path <path>::
Resolve "$GIT_DIR/<path>" and takes other path relocation
variables such as $GIT_OBJECT_DIRECTORY,
$GIT_INDEX_FILE... into account. For example, if
$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
--git-path objects/abc" returns /foo/bar/abc.
--show-cdup::
When the command is invoked from a subdirectory, show the
path of the top-level directory relative to the current
@ -293,6 +265,20 @@ The following options are unaffected by `--path-format`:
path of the current directory relative to the top-level
directory.
--show-toplevel::
Show the absolute path of the top-level directory of the working
tree. If there is no working tree, report an error.
--show-superproject-working-tree::
Show the absolute path of the root of the superproject's
working tree (if exists) that uses the current repository as
its submodule. Outputs nothing if the current repository is
not used as a submodule by any project.
--shared-index-path::
Show the path to the shared index file in split index mode, or
empty if not in split-index mode.
--show-object-format[=(storage|input|output)]::
Show the object format (hash algorithm) used for the repository
for storage inside the `.git` directory, input, or output. For

View File

@ -111,11 +111,11 @@ include::rev-list-options.txt[]
MAPPING AUTHORS
---------------
See linkgit:gitmailmap[5].
The `.mailmap` feature is used to coalesce together commits by the same
person in the shortlog, where their name and/or email address was
spelled differently.
Note that if `git shortlog` is run outside of a repository (to process
log contents on standard input), it will look for a `.mailmap` file in
the current directory.
include::mailmap.txt[]
GIT
---

View File

@ -45,13 +45,10 @@ include::pretty-options.txt[]
include::pretty-formats.txt[]
DIFF FORMATTING
---------------
The options below can be used to change the way `git show` generates
diff output.
COMMON DIFF OPTIONS
-------------------
:git-log: 1
:diff-merges-default: `dense-combined`
include::diff-options.txt[]
include::diff-generate-patch.txt[]

View File

@ -8,8 +8,8 @@ git-stash - Stash the changes in a dirty working directory away
SYNOPSIS
--------
[verse]
'git stash' list [<log-options>]
'git stash' show [<diff-options>] [<stash>]
'git stash' list [<options>]
'git stash' show [<options>] [<stash>]
'git stash' drop [-q|--quiet] [<stash>]
'git stash' ( pop | apply ) [--index] [-q|--quiet] [<stash>]
'git stash' branch <branchname> [<stash>]
@ -67,7 +67,7 @@ save [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] [-q
Instead, all non-option arguments are concatenated to form the stash
message.
list [<log-options>]::
list [<options>]::
List the stash entries that you currently have. Each 'stash entry' is
listed with its name (e.g. `stash@{0}` is the latest entry, `stash@{1}` is
@ -83,7 +83,7 @@ stash@{1}: On master: 9cc0589... Add git-stash
The command takes options applicable to the 'git log'
command to control what is shown and how. See linkgit:git-log[1].
show [<diff-options>] [<stash>]::
show [<options>] [<stash>]::
Show the changes recorded in the stash entry as a diff between the
stashed contents and the commit back when the stash entry was first

View File

@ -130,7 +130,7 @@ ignored, then the directory is not shown, but all contents are shown.
--column[=<options>]::
--no-column::
Display untracked files in columns. See configuration variable
`column.status` for option syntax. `--column` and `--no-column`
column.status for option syntax.`--column` and `--no-column`
without options are equivalent to 'always' and 'never'
respectively.

View File

@ -134,7 +134,7 @@ options for details.
--column[=<options>]::
--no-column::
Display tag listing in columns. See configuration variable
`column.tag` for option syntax. `--column` and `--no-column`
column.tag for option syntax.`--column` and `--no-column`
without options are equivalent to 'always' and 'never' respectively.
+
This option is only applicable when listing tags without annotation lines.

View File

@ -97,9 +97,8 @@ list::
List details of each working tree. The main working tree is listed first,
followed by each of the linked working trees. The output details include
whether the working tree is bare, the revision currently checked out, the
branch currently checked out (or "detached HEAD" if none), "locked" if
the worktree is locked, "prunable" if the worktree can be pruned by `prune`
command.
branch currently checked out (or "detached HEAD" if none), and "locked" if
the worktree is locked.
lock::
@ -144,11 +143,6 @@ locate it. Running `repair` within the recently-moved working tree will
reestablish the connection. If multiple linked working trees are moved,
running `repair` from any working tree with each tree's new `<path>` as
an argument, will reestablish the connection to all the specified paths.
+
If both the main working tree and linked working trees have been moved
manually, then running `repair` in the main working tree and specifying the
new `<path>` of each linked working tree will reestablish all connections
in both directions.
unlock::
@ -232,14 +226,9 @@ This can also be set up as the default behaviour by using the
-v::
--verbose::
With `prune`, report all removals.
+
With `list`, output additional information about worktrees (see below).
--expire <time>::
With `prune`, only expire unused working trees older than `<time>`.
+
With `list`, annotate missing working trees as prunable if they are
older than `<time>`.
--reason <string>::
With `lock`, an explanation why the working tree is locked.
@ -378,46 +367,13 @@ $ git worktree list
/path/to/other-linked-worktree 1234abc (detached HEAD)
------------
The command also shows annotations for each working tree, according to its state.
These annotations are:
* `locked`, if the working tree is locked.
* `prunable`, if the working tree can be pruned via `git worktree prune`.
------------
$ git worktree list
/path/to/linked-worktree abcd1234 [master]
/path/to/locked-worktreee acbd5678 (brancha) locked
/path/to/prunable-worktree 5678abc (detached HEAD) prunable
------------
For these annotations, a reason might also be available and this can be
seen using the verbose mode. The annotation is then moved to the next line
indented followed by the additional information.
------------
$ git worktree list --verbose
/path/to/linked-worktree abcd1234 [master]
/path/to/locked-worktree-no-reason abcd5678 (detached HEAD) locked
/path/to/locked-worktree-with-reason 1234abcd (brancha)
locked: working tree path is mounted on a portable device
/path/to/prunable-worktree 5678abc1 (detached HEAD)
prunable: gitdir file points to non-existent location
------------
Note that the annotation is moved to the next line if the additional
information is available, otherwise it stays on the same line as the
working tree itself.
Porcelain Format
~~~~~~~~~~~~~~~~
The porcelain format has a line per attribute. Attributes are listed with a
label and value separated by a single space. Boolean attributes (like `bare`
and `detached`) are listed as a label only, and are present only
if the value is true. Some attributes (like `locked`) can be listed as a label
only or with a value depending upon whether a reason is available. The first
attribute of a working tree is always `worktree`, an empty line indicates the
end of the record. For example:
if the value is true. The first attribute of a working tree is always
`worktree`, an empty line indicates the end of the record. For example:
------------
$ git worktree list --porcelain
@ -432,33 +388,6 @@ worktree /path/to/other-linked-worktree
HEAD 1234abc1234abc1234abc1234abc1234abc1234a
detached
worktree /path/to/linked-worktree-locked-no-reason
HEAD 5678abc5678abc5678abc5678abc5678abc5678c
branch refs/heads/locked-no-reason
locked
worktree /path/to/linked-worktree-locked-with-reason
HEAD 3456def3456def3456def3456def3456def3456b
branch refs/heads/locked-with-reason
locked reason why is locked
worktree /path/to/linked-worktree-prunable
HEAD 1233def1234def1234def1234def1234def1234b
detached
prunable gitdir file points to non-existent location
------------
If the lock reason contains "unusual" characters such as newline, they
are escaped and the entire reason is quoted as explained for the
configuration variable `core.quotePath` (see linkgit:git-config[1]).
For Example:
------------
$ git worktree list --porcelain
...
locked "reason\nwhy is locked"
...
------------
EXAMPLES

View File

@ -13,7 +13,7 @@ SYNOPSIS
[--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
[-p|--paginate|-P|--no-pager] [--no-replace-objects] [--bare]
[--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
[--super-prefix=<path>] [--config-env <name>=<envvar>]
[--super-prefix=<path>]
<command> [<args>]
DESCRIPTION
@ -80,28 +80,6 @@ config file). Including the equals but with an empty value (like `git -c
foo.bar= ...`) sets `foo.bar` to the empty string which `git config
--type=bool` will convert to `false`.
--config-env=<name>=<envvar>::
Like `-c <name>=<value>`, give configuration variable
'<name>' a value, where <envvar> is the name of an
environment variable from which to retrieve the value. Unlike
`-c` there is no shortcut for directly setting the value to an
empty string, instead the environment variable itself must be
set to the empty string. It is an error if the `<envvar>` does not exist
in the environment. `<envvar>` may not contain an equals sign
to avoid ambiguity with `<name>` containing one.
+
This is useful for cases where you want to pass transitory
configuration options to git, but are doing so on OS's where
other processes might be able to read your cmdline
(e.g. `/proc/self/cmdline`), but not your environ
(e.g. `/proc/self/environ`). That behavior is the default on
Linux, but may not be on your system.
+
Note that this might add security for variables such as
`http.extraHeader` where the sensitive information is part of
the value, but not e.g. `url.<base>.insteadOf` where the
sensitive information can be part of the key.
--exec-path[=<path>]::
Path to wherever your core Git programs are installed.
This can also be controlled by setting the GIT_EXEC_PATH

View File

@ -74,7 +74,6 @@ into another list. There are currently 5 such transformations:
- diffcore-merge-broken
- diffcore-pickaxe
- diffcore-order
- diffcore-rotate
These are applied in sequence. The set of filepairs 'git diff-{asterisk}'
commands find are used as the input to diffcore-break, and
@ -169,26 +168,6 @@ a similarity score different from the default of 50% by giving a
number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
8/10 = 80%).
Note that when rename detection is on but both copy and break
detection are off, rename detection adds a preliminary step that first
checks if files are moved across directories while keeping their
filename the same. If there is a file added to a directory whose
contents is sufficiently similar to a file with the same name that got
deleted from a different directory, it will mark them as renames and
exclude them from the later quadratic step (the one that pairwise
compares all unmatched files to find the "best" matches, determined by
the highest content similarity). So, for example, if a deleted
docs/ext.txt and an added docs/config/ext.txt are similar enough, they
will be marked as a rename and prevent an added docs/ext.md that may
be even more similar to the deleted docs/ext.txt from being considered
as the rename destination in the later step. For this reason, the
preliminary "match same filename" step uses a bit higher threshold to
mark a file pair as a rename and stop considering other candidates for
better matches. At most, one comparison is done per file in this
preliminary pass; so if there are several remaining ext.txt files
throughout the directory hierarchy after exact rename detection, this
preliminary step will be skipped for those files.
Note. When the "-C" option is used with `--find-copies-harder`
option, 'git diff-{asterisk}' commands feed unmodified filepairs to
diffcore mechanism as well as modified ones. This lets the copy
@ -297,26 +276,6 @@ Documentation
t
------------------------------------------------
diffcore-rotate: For Changing At Which Path Output Starts
---------------------------------------------------------
This transformation takes one pathname, and rotates the set of
filepairs so that the filepair for the given pathname comes first,
optionally discarding the paths that come before it. This is used
to implement the `--skip-to` and the `--rotate-to` options. It is
an error when the specified pathname is not in the set of filepairs,
but it is not useful to error out when used with "git log" family of
commands, because it is unreasonable to expect that a given path
would be modified by each and every commit shown by the "git log"
command. For this reason, when used with "git log", the filepair
that sorts the same as, or the first one that sorts after, the given
pathname is where the output starts.
Use of this transformation combined with diffcore-order will produce
unexpected results, as the input to this transformation is likely
not sorted when diffcore-order is in effect.
SEE ALSO
--------
linkgit:git-diff[1],

View File

@ -1,123 +0,0 @@
gitmailmap(5)
=============
NAME
----
gitmailmap - Map author/committer names and/or E-Mail addresses
SYNOPSIS
--------
$GIT_WORK_TREE/.mailmap
DESCRIPTION
-----------
If the file `.mailmap` exists at the toplevel of the repository, or at
the location pointed to by the `mailmap.file` or `mailmap.blob`
configuration options (see linkgit:git-config[1]), it
is used to map author and committer names and email addresses to
canonical real names and email addresses.
SYNTAX
------
The '#' character begins a comment to the end of line, blank lines
are ignored.
In the simple form, each line in the file consists of the canonical
real name of an author, whitespace, and an email address used in the
commit (enclosed by '<' and '>') to map to the name. For example:
--
Proper Name <commit@email.xx>
--
The more complex forms are:
--
<proper@email.xx> <commit@email.xx>
--
which allows mailmap to replace only the email part of a commit, and:
--
Proper Name <proper@email.xx> <commit@email.xx>
--
which allows mailmap to replace both the name and the email of a
commit matching the specified commit email address, and:
--
Proper Name <proper@email.xx> Commit Name <commit@email.xx>
--
which allows mailmap to replace both the name and the email of a
commit matching both the specified commit name and email address.
Both E-Mails and names are matched case-insensitively. For example
this would also match the 'Commit Name <commit&#64;email.xx>' above:
--
Proper Name <proper@email.xx> CoMmIt NaMe <CoMmIt@EmAiL.xX>
--
EXAMPLES
--------
Your history contains commits by two authors, Jane
and Joe, whose names appear in the repository under several forms:
------------
Joe Developer <joe@example.com>
Joe R. Developer <joe@example.com>
Jane Doe <jane@example.com>
Jane Doe <jane@laptop.(none)>
Jane D. <jane@desktop.(none)>
------------
Now suppose that Joe wants his middle name initial used, and Jane
prefers her family name fully spelled out. A `.mailmap` file to
correct the names would look like:
------------
Joe R. Developer <joe@example.com>
Jane Doe <jane@example.com>
Jane Doe <jane@desktop.(none)>
------------
Note that there's no need to map the name for '<jane&#64;laptop.(none)>' to
only correct the names. However, leaving the obviously broken
'<jane&#64;laptop.(none)>' and '<jane&#64;desktop.(none)>' E-Mails as-is is
usually not what you want. A `.mailmap` file which also corrects those
is:
------------
Joe R. Developer <joe@example.com>
Jane Doe <jane@example.com> <jane@laptop.(none)>
Jane Doe <jane@example.com> <jane@desktop.(none)>
------------
Finally, let's say that Joe and Jane shared an E-Mail address, but not
a name, e.g. by having these two commits in the history generated by a
bug reporting system. I.e. names appearing in history as:
------------
Joe <bugs@example.com>
Jane <bugs@example.com>
------------
A full `.mailmap` file which also handles those cases (an addition of
two lines to the above example) would be:
------------
Joe R. Developer <joe@example.com>
Jane Doe <jane@example.com> <jane@laptop.(none)>
Jane Doe <jane@example.com> <jane@desktop.(none)>
Joe R. Developer <joe@example.com> Joe <bugs@example.com>
Jane Doe <jane@example.com> Jane <bugs@example.com>
------------
SEE ALSO
--------
linkgit:git-check-mailmap[1]
GIT
---
Part of the linkgit:git[1] suite

View File

@ -38,7 +38,7 @@ mind.
a warning if the commit log message given to it does not look
like a valid UTF-8 string, unless you explicitly say your
project uses a legacy encoding. The way to say this is to
have `i18n.commitEncoding` in `.git/config` file, like this:
have i18n.commitencoding in `.git/config` file, like this:
+
------------
[i18n]

75
Documentation/mailmap.txt Normal file
View File

@ -0,0 +1,75 @@
If the file `.mailmap` exists at the toplevel of the repository, or at
the location pointed to by the mailmap.file or mailmap.blob
configuration options, it
is used to map author and committer names and email addresses to
canonical real names and email addresses.
In the simple form, each line in the file consists of the canonical
real name of an author, whitespace, and an email address used in the
commit (enclosed by '<' and '>') to map to the name. For example:
--
Proper Name <commit@email.xx>
--
The more complex forms are:
--
<proper@email.xx> <commit@email.xx>
--
which allows mailmap to replace only the email part of a commit, and:
--
Proper Name <proper@email.xx> <commit@email.xx>
--
which allows mailmap to replace both the name and the email of a
commit matching the specified commit email address, and:
--
Proper Name <proper@email.xx> Commit Name <commit@email.xx>
--
which allows mailmap to replace both the name and the email of a
commit matching both the specified commit name and email address.
Example 1: Your history contains commits by two authors, Jane
and Joe, whose names appear in the repository under several forms:
------------
Joe Developer <joe@example.com>
Joe R. Developer <joe@example.com>
Jane Doe <jane@example.com>
Jane Doe <jane@laptop.(none)>
Jane D. <jane@desktop.(none)>
------------
Now suppose that Joe wants his middle name initial used, and Jane
prefers her family name fully spelled out. A proper `.mailmap` file
would look like:
------------
Jane Doe <jane@desktop.(none)>
Joe R. Developer <joe@example.com>
------------
Note how there is no need for an entry for `<jane@laptop.(none)>`, because the
real name of that author is already correct.
Example 2: Your repository contains commits from the following
authors:
------------
nick1 <bugs@company.xx>
nick2 <bugs@company.xx>
nick2 <nick2@company.xx>
santa <me@company.xx>
claus <me@company.xx>
CTO <cto@coompany.xx>
------------
Then you might want a `.mailmap` file that looks like:
------------
<cto@company.xx> <cto@coompany.xx>
Some Dude <some@dude.xx> nick1 <bugs@company.xx>
Other Author <other@author.xx> nick2 <bugs@company.xx>
Other Author <other@author.xx> <nick2@company.xx>
Santa Claus <santa.claus@northpole.xx> <me@company.xx>
------------
Use hash '#' for comments that are either on their own line, or after
the email address.

View File

@ -252,15 +252,7 @@ endif::git-rev-list[]
interpreted by
linkgit:git-interpret-trailers[1]. The
`trailers` string may be followed by a colon
and zero or more comma-separated options.
If any option is provided multiple times the
last occurance wins.
+
The boolean options accept an optional value `[=<BOOL>]`. The values
`true`, `false`, `on`, `off` etc. are all accepted. See the "boolean"
sub-section in "EXAMPLES" in linkgit:git-config[1]. If a boolean
option is given with no value, it's enabled.
+
and zero or more comma-separated options:
** 'key=<K>': only show trailers with specified key. Matching is done
case-insensitively and trailing colon is optional. If option is
given multiple times trailer lines matching any of the keys are
@ -269,25 +261,27 @@ option is given with no value, it's enabled.
desired it can be disabled with `only=false`. E.g.,
`%(trailers:key=Reviewed-by)` shows trailer lines with key
`Reviewed-by`.
** 'only[=<BOOL>]': select whether non-trailer lines from the trailer
block should be included.
** 'only[=val]': select whether non-trailer lines from the trailer
block should be included. The `only` keyword may optionally be
followed by an equal sign and one of `true`, `on`, `yes` to omit or
`false`, `off`, `no` to show the non-trailer lines. If option is
given without value it is enabled. If given multiple times the last
value is used.
** 'separator=<SEP>': specify a separator inserted between trailer
lines. When this option is not given each trailer line is
terminated with a line feed character. The string SEP may contain
the literal formatting codes described above. To use comma as
separator one must use `%x2C` as it would otherwise be parsed as
next option. E.g., `%(trailers:key=Ticket,separator=%x2C )`
next option. If separator option is given multiple times only the
last one is used. E.g., `%(trailers:key=Ticket,separator=%x2C )`
shows all trailer lines whose key is "Ticket" separated by a comma
and a space.
** 'unfold[=<BOOL>]': make it behave as if interpret-trailer's `--unfold`
option was given. E.g.,
** 'unfold[=val]': make it behave as if interpret-trailer's `--unfold`
option was given. In same way as to for `only` it can be followed
by an equal sign and explicit value. E.g.,
`%(trailers:only,unfold=true)` unfolds and shows all trailer lines.
** 'keyonly[=<BOOL>]': only show the key part of the trailer.
** 'valueonly[=<BOOL>]': only show the value part of the trailer.
** 'key_value_separator=<SEP>': specify a separator inserted between
trailer lines. When this option is not given each trailer key-value
pair is separated by ": ". Otherwise it shares the same semantics
as 'separator=<SEP>' above.
** 'valueonly[=val]': skip over the key part of the trailer line and only
show the value part. Also this optionally allows explicit value.
NOTE: Some placeholders may depend on other options given to the
revision traversal engine. For example, the `%g*` reflog options will

View File

@ -129,11 +129,6 @@ parents) and `--max-parents=-1` (negative numbers denote no upper limit).
adjusting to updated upstream from time to time, and
this option allows you to ignore the individual commits
brought in to your history by such a merge.
ifdef::git-log[]
+
This option also changes default diff format for merge commits
to `first-parent`, see `--diff-merges=first-parent` for details.
endif::git-log[]
--not::
Reverses the meaning of the '{caret}' prefix (or lack thereof)
@ -227,15 +222,6 @@ ifdef::git-rev-list[]
test the exit status to see if a range of objects is fully
connected (or not). It is faster than redirecting stdout
to `/dev/null` as the output does not have to be formatted.
--disk-usage::
Suppress normal output; instead, print the sum of the bytes used
for on-disk storage by the selected commits or objects. This is
equivalent to piping the output into `git cat-file
--batch-check='%(objectsize:disk)'`, except that it runs much
faster (especially with `--use-bitmap-index`). See the `CAVEATS`
section in linkgit:git-cat-file[1] for the limitations of what
"on-disk storage" means.
endif::git-rev-list[]
--cherry-mark::

View File

@ -1,116 +0,0 @@
Chunk-based file formats
========================
Some file formats in Git use a common concept of "chunks" to describe
sections of the file. This allows structured access to a large file by
scanning a small "table of contents" for the remaining data. This common
format is used by the `commit-graph` and `multi-pack-index` files. See
link:technical/pack-format.html[the `multi-pack-index` format] and
link:technical/commit-graph-format.html[the `commit-graph` format] for
how they use the chunks to describe structured data.
A chunk-based file format begins with some header information custom to
that format. That header should include enough information to identify
the file type, format version, and number of chunks in the file. From this
information, that file can determine the start of the chunk-based region.
The chunk-based region starts with a table of contents describing where
each chunk starts and ends. This consists of (C+1) rows of 12 bytes each,
where C is the number of chunks. Consider the following table:
| Chunk ID (4 bytes) | Chunk Offset (8 bytes) |
|--------------------|------------------------|
| ID[0] | OFFSET[0] |
| ... | ... |
| ID[C] | OFFSET[C] |
| 0x0000 | OFFSET[C+1] |
Each row consists of a 4-byte chunk identifier (ID) and an 8-byte offset.
Each integer is stored in network-byte order.
The chunk identifier `ID[i]` is a label for the data stored within this
fill from `OFFSET[i]` (inclusive) to `OFFSET[i+1]` (exclusive). Thus, the
size of the `i`th chunk is equal to the difference between `OFFSET[i+1]`
and `OFFSET[i]`. This requires that the chunk data appears contiguously
in the same order as the table of contents.
The final entry in the table of contents must be four zero bytes. This
confirms that the table of contents is ending and provides the offset for
the end of the chunk-based data.
Note: The chunk-based format expects that the file contains _at least_ a
trailing hash after `OFFSET[C+1]`.
Functions for working with chunk-based file formats are declared in
`chunk-format.h`. Using these methods provide extra checks that assist
developers when creating new file formats.
Writing chunk-based file formats
--------------------------------
To write a chunk-based file format, create a `struct chunkfile` by
calling `init_chunkfile()` and pass a `struct hashfile` pointer. The
caller is responsible for opening the `hashfile` and writing header
information so the file format is identifiable before the chunk-based
format begins.
Then, call `add_chunk()` for each chunk that is intended for write. This
populates the `chunkfile` with information about the order and size of
each chunk to write. Provide a `chunk_write_fn` function pointer to
perform the write of the chunk data upon request.
Call `write_chunkfile()` to write the table of contents to the `hashfile`
followed by each of the chunks. This will verify that each chunk wrote
the expected amount of data so the table of contents is correct.
Finally, call `free_chunkfile()` to clear the `struct chunkfile` data. The
caller is responsible for finalizing the `hashfile` by writing the trailing
hash and closing the file.
Reading chunk-based file formats
--------------------------------
To read a chunk-based file format, the file must be opened as a
memory-mapped region. The chunk-format API expects that the entire file
is mapped as a contiguous memory region.
Initialize a `struct chunkfile` pointer with `init_chunkfile(NULL)`.
After reading the header information from the beginning of the file,
including the chunk count, call `read_table_of_contents()` to populate
the `struct chunkfile` with the list of chunks, their offsets, and their
sizes.
Extract the data information for each chunk using `pair_chunk()` or
`read_chunk()`:
* `pair_chunk()` assigns a given pointer with the location inside the
memory-mapped file corresponding to that chunk's offset. If the chunk
does not exist, then the pointer is not modified.
* `read_chunk()` takes a `chunk_read_fn` function pointer and calls it
with the appropriate initial pointer and size information. The function
is not called if the chunk does not exist. Use this method to read chunks
if you need to perform immediate parsing or if you need to execute logic
based on the size of the chunk.
After calling these methods, call `free_chunkfile()` to clear the
`struct chunkfile` data. This will not close the memory-mapped region.
Callers are expected to own that data for the timeframe the pointers into
the region are needed.
Examples
--------
These file formats use the chunk-format API, and can be used as examples
for future formats:
* *commit-graph:* see `write_commit_graph_file()` and `parse_commit_graph()`
in `commit-graph.c` for how the chunk-format API is used to write and
parse the commit-graph file format documented in
link:technical/commit-graph-format.html[the commit-graph file format].
* *multi-pack-index:* see `write_midx_internal()` and `load_multi_pack_index()`
in `midx.c` for how the chunk-format API is used to write and
parse the multi-pack-index file format documented in
link:technical/pack-format.html[the multi-pack-index file format].

View File

@ -4,7 +4,11 @@ Git commit graph format
The Git commit graph stores a list of commit OIDs and some associated
metadata, including:
- The generation number of the commit.
- The generation number of the commit. Commits with no parents have
generation number 1; commits with parents have generation number
one more than the maximum generation number of its parents. We
reserve zero as special, and can be used to mark a generation
number invalid or as "not computed".
- The root tree OID.
@ -61,9 +65,6 @@ CHUNK LOOKUP:
the length using the next chunk position if necessary.) Each chunk
ID appears at most once.
The CHUNK LOOKUP matches the table of contents from
link:technical/chunk-format.html[the chunk-based file format].
The remaining data in the body is described one chunk at a time, and
these chunks may be given in any order. Chunks are required unless
otherwise specified.
@ -85,33 +86,13 @@ CHUNK DATA:
position. If there are more than two parents, the second value
has its most-significant bit on and the other bits store an array
position into the Extra Edge List chunk.
* The next 8 bytes store the topological level (generation number v1)
of the commit and
* The next 8 bytes store the generation number of the commit and
the commit time in seconds since EPOCH. The generation number
uses the higher 30 bits of the first 4 bytes, while the commit
time uses the 32 bits of the second 4 bytes, along with the lowest
2 bits of the lowest byte, storing the 33rd and 34th bit of the
commit time.
Generation Data (ID: {'G', 'D', 'A', 'T' }) (N * 4 bytes) [Optional]
* This list of 4-byte values store corrected commit date offsets for the
commits, arranged in the same order as commit data chunk.
* If the corrected commit date offset cannot be stored within 31 bits,
the value has its most-significant bit on and the other bits store
the position of corrected commit date into the Generation Data Overflow
chunk.
* Generation Data chunk is present only when commit-graph file is written
by compatible versions of Git and in case of split commit-graph chains,
the topmost layer also has Generation Data chunk.
Generation Data Overflow (ID: {'G', 'D', 'O', 'V' }) [Optional]
* This list of 8-byte values stores the corrected commit date offsets
for commits with corrected commit date offsets that cannot be
stored within 31 bits.
* Generation Data Overflow chunk is present only when Generation Data
chunk is present and atleast one corrected commit date offset cannot
be stored within 31 bits.
Extra Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
This list of 4-byte values store the second through nth parents for
all octopus merges. The second parent value in the commit data stores

View File

@ -38,31 +38,14 @@ A consumer may load the following info for a commit from the graph:
Values 1-4 satisfy the requirements of parse_commit_gently().
There are two definitions of generation number:
1. Corrected committer dates (generation number v2)
2. Topological levels (generation nummber v1)
Define the "generation number" of a commit recursively as follows:
Define "corrected committer date" of a commit recursively as follows:
* A commit with no parents (a root commit) has generation number one.
* A commit with no parents (a root commit) has corrected committer date
equal to its committer date.
* A commit with at least one parent has generation number one more than
the largest generation number among its parents.
* A commit with at least one parent has corrected committer date equal to
the maximum of its commiter date and one more than the largest corrected
committer date among its parents.
* As a special case, a root commit with timestamp zero has corrected commit
date of 1, to be able to distinguish it from GENERATION_NUMBER_ZERO
(that is, an uncomputed corrected commit date).
Define the "topological level" of a commit recursively as follows:
* A commit with no parents (a root commit) has topological level of one.
* A commit with at least one parent has topological level one more than
the largest topological level among its parents.
Equivalently, the topological level of a commit A is one more than the
Equivalently, the generation number of a commit A is one more than the
length of a longest path from A to a root commit. The recursive definition
is easier to use for computation and observing the following property:
@ -77,9 +60,6 @@ is easier to use for computation and observing the following property:
generation numbers, then we always expand the boundary commit with highest
generation number and can easily detect the stopping condition.
The property applies to both versions of generation number, that is both
corrected committer dates and topological levels.
This property can be used to significantly reduce the time it takes to
walk commits and determine topological relationships. Without generation
numbers, the general heuristic is the following:
@ -87,9 +67,7 @@ numbers, the general heuristic is the following:
If A and B are commits with commit time X and Y, respectively, and
X < Y, then A _probably_ cannot reach B.
In absence of corrected commit dates (for example, old versions of Git or
mixed generation graph chains),
this heuristic is currently used whenever the computation is allowed to
This heuristic is currently used whenever the computation is allowed to
violate topological relationships due to clock skew (such as "git log"
with default order), but is not used when the topological order is
required (such as merge base calculations, "git log --graph").
@ -99,7 +77,7 @@ in the commit graph. We can treat these commits as having "infinite"
generation number and walk until reaching commits with known generation
number.
We use the macro GENERATION_NUMBER_INFINITY to mark commits not
We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not
in the commit-graph file. If a commit-graph file was written by a version
of Git that did not compute generation numbers, then those commits will
have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
@ -115,12 +93,12 @@ fully-computed generation numbers. Using strict inequality may result in
walking a few extra commits, but the simplicity in dealing with commits
with generation number *_INFINITY or *_ZERO is valuable.
We use the macro GENERATION_NUMBER_V1_MAX = 0x3FFFFFFF for commits whose
topological levels (generation number v1) are computed to be at least
this value. We limit at this value since it is the largest value that
can be stored in the commit-graph file using the 30 bits available
to topological levels. This presents another case where a commit can
have generation number equal to that of a parent.
We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose
generation numbers are computed to be at least this value. We limit at
this value since it is the largest value that can be stored in the
commit-graph file using the 30 bits available to generation numbers. This
presents another case where a commit can have generation number equal to
that of a parent.
Design Details
--------------
@ -289,35 +267,6 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum
number of commits) could be extracted into config settings for full
flexibility.
## Handling Mixed Generation Number Chains
With the introduction of generation number v2 and generation data chunk, the
following scenario is possible:
1. "New" Git writes a commit-graph with the corrected commit dates.
2. "Old" Git writes a split commit-graph on top without corrected commit dates.
A naive approach of using the newest available generation number from
each layer would lead to violated expectations: the lower layer would
use corrected commit dates which are much larger than the topological
levels of the higher layer. For this reason, Git inspects the topmost
layer to see if the layer is missing corrected commit dates. In such a case
Git only uses topological level for generation numbers.
When writing a new layer in split commit-graph, we write corrected commit
dates if the topmost layer has corrected commit dates written. This
guarantees that if a layer has corrected commit dates, all lower layers
must have corrected commit dates as well.
When merging layers, we do not consider whether the merged layers had corrected
commit dates. Instead, the new layer will have corrected commit dates if the
layer below the new layer has corrected commit dates.
While writing or merging layers, if the new layer is the only layer, it will
have corrected commit dates when written by compatible versions of Git. Thus,
rewriting split commit-graph as a single file (`--split=replace`) creates a
single layer with corrected commit dates.
## Deleting graph-{hash} files
After a new tip file is written, some `graph-{hash}` files may no longer

View File

@ -33,9 +33,16 @@ researchers. On 23 February 2017 the SHAttered attack
Git v2.13.0 and later subsequently moved to a hardened SHA-1
implementation by default, which isn't vulnerable to the SHAttered
attack, but SHA-1 is still weak.
attack.
Thus it's considered prudent to move past any variant of SHA-1
Thus Git has in effect already migrated to a new hash that isn't SHA-1
and doesn't share its vulnerabilities, its new hash function just
happens to produce exactly the same output for all known inputs,
except two PDFs published by the SHAttered researchers, and the new
implementation (written by those researchers) claims to detect future
cryptanalytic collision attacks.
Regardless, it's considered prudent to move past any variant of SHA-1
to a new hash. There's no guarantee that future attacks on SHA-1 won't
be published in the future, and those attacks may not have viable
mitigations.
@ -50,38 +57,6 @@ SHA-1 still possesses the other properties such as fast object lookup
and safe error checking, but other hash functions are equally suitable
that are believed to be cryptographically secure.
Choice of Hash
--------------
The hash to replace the hardened SHA-1 should be stronger than SHA-1
was: we would like it to be trustworthy and useful in practice for at
least 10 years.
Some other relevant properties:
1. A 256-bit hash (long enough to match common security practice; not
excessively long to hurt performance and disk usage).
2. High quality implementations should be widely available (e.g., in
OpenSSL and Apple CommonCrypto).
3. The hash function's properties should match Git's needs (e.g. Git
requires collision and 2nd preimage resistance and does not require
length extension resistance).
4. As a tiebreaker, the hash should be fast to compute (fortunately
many contenders are faster than SHA-1).
There were several contenders for a successor hash to SHA-1, including
SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
In late 2018 the project picked SHA-256 as its successor hash.
See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
NewHash, 2018-08-04) and numerous mailing list threads at the time,
particularly the one starting at
https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
for more information.
Goals
-----
1. The transition to SHA-256 can be done one local repository at a time.
@ -119,7 +94,7 @@ Overview
--------
We introduce a new repository format extension. Repositories with this
extension enabled use SHA-256 instead of SHA-1 to name their objects.
This affects both object names and object content -- both the names
This affects both object names and object content --- both the names
of objects and all references to other objects within an object are
switched to the new hash function.
@ -132,7 +107,7 @@ mapping to allow naming objects using either their SHA-1 and SHA-256 names
interchangeably.
"git cat-file" and "git hash-object" gain options to display an object
in its SHA-1 form and write an object given its SHA-1 form. This
in its sha1 form and write an object given its sha1 form. This
requires all objects referenced by that object to be present in the
object database so that they can be named using the appropriate name
(using the bidirectional hash mapping).
@ -140,7 +115,7 @@ object database so that they can be named using the appropriate name
Fetches from a SHA-1 based server convert the fetched objects into
SHA-256 form and record the mapping in the bidirectional mapping table
(see below for details). Pushes to a SHA-1 based server convert the
objects being pushed into SHA-1 form so the server does not have to be
objects being pushed into sha1 form so the server does not have to be
aware of the hash function the client is using.
Detailed Design
@ -176,38 +151,38 @@ repository extensions.
Object names
~~~~~~~~~~~~
Objects can be named by their 40 hexadecimal digit SHA-1 name or 64
hexadecimal digit SHA-256 name, plus names derived from those (see
Objects can be named by their 40 hexadecimal digit sha1-name or 64
hexadecimal digit sha256-name, plus names derived from those (see
gitrevisions(7)).
The SHA-1 name of an object is the SHA-1 of the concatenation of its
type, length, a nul byte, and the object's SHA-1 content. This is the
The sha1-name of an object is the SHA-1 of the concatenation of its
type, length, a nul byte, and the object's sha1-content. This is the
traditional <sha1> used in Git to name objects.
The SHA-256 name of an object is the SHA-256 of the concatenation of its
type, length, a nul byte, and the object's SHA-256 content.
The sha256-name of an object is the SHA-256 of the concatenation of its
type, length, a nul byte, and the object's sha256-content.
Object format
~~~~~~~~~~~~~
The content as a byte sequence of a tag, commit, or tree object named
by SHA-1 and SHA-256 differ because an object named by SHA-256 name refers to
other objects by their SHA-256 names and an object named by SHA-1 name
refers to other objects by their SHA-1 names.
by sha1 and sha256 differ because an object named by sha256-name refers to
other objects by their sha256-names and an object named by sha1-name
refers to other objects by their sha1-names.
The SHA-256 content of an object is the same as its SHA-1 content, except
that objects referenced by the object are named using their SHA-256 names
instead of SHA-1 names. Because a blob object does not refer to any
other object, its SHA-1 content and SHA-256 content are the same.
The sha256-content of an object is the same as its sha1-content, except
that objects referenced by the object are named using their sha256-names
instead of sha1-names. Because a blob object does not refer to any
other object, its sha1-content and sha256-content are the same.
The format allows round-trip conversion between SHA-256 content and
SHA-1 content.
The format allows round-trip conversion between sha256-content and
sha1-content.
Object storage
~~~~~~~~~~~~~~
Loose objects use zlib compression and packed objects use the packed
format described in Documentation/technical/pack-format.txt, just like
today. The content that is compressed and stored uses SHA-256 content
instead of SHA-1 content.
today. The content that is compressed and stored uses sha256-content
instead of sha1-content.
Pack index
~~~~~~~~~~
@ -216,21 +191,21 @@ hash functions. They have the following format (all integers are in
network byte order):
- A header appears at the beginning and consists of the following:
* The 4-byte pack index signature: '\377t0c'
* 4-byte version number: 3
* 4-byte length of the header section, including the signature and
- The 4-byte pack index signature: '\377t0c'
- 4-byte version number: 3
- 4-byte length of the header section, including the signature and
version number
* 4-byte number of objects contained in the pack
* 4-byte number of object formats in this pack index: 2
* For each object format:
** 4-byte format identifier (e.g., 'sha1' for SHA-1)
** 4-byte length in bytes of shortened object names. This is the
- 4-byte number of objects contained in the pack
- 4-byte number of object formats in this pack index: 2
- For each object format:
- 4-byte format identifier (e.g., 'sha1' for SHA-1)
- 4-byte length in bytes of shortened object names. This is the
shortest possible length needed to make names in the shortened
object name table unambiguous.
** 4-byte integer, recording where tables relating to this format
- 4-byte integer, recording where tables relating to this format
are stored in this index file, as an offset from the beginning.
* 4-byte offset to the trailer from the beginning of this file.
* Zero or more additional key/value pairs (4-byte key, 4-byte
- 4-byte offset to the trailer from the beginning of this file.
- Zero or more additional key/value pairs (4-byte key, 4-byte
value). Only one key is supported: 'PSRC'. See the "Loose objects
and unreachable objects" section for supported values and how this
is used. All other keys are reserved. Readers must ignore
@ -238,36 +213,37 @@ network byte order):
- Zero or more NUL bytes. This can optionally be used to improve the
alignment of the full object name table below.
- Tables for the first object format:
* A sorted table of shortened object names. These are prefixes of
- A sorted table of shortened object names. These are prefixes of
the names of all objects in this pack file, packed together
without offset values to reduce the cache footprint of the binary
search for a specific object name.
* A table of full object names in pack order. This allows resolving
- A table of full object names in pack order. This allows resolving
a reference to "the nth object in the pack file" (from a
reachability bitmap or from the next table of another object
format) to its object name.
* A table of 4-byte values mapping object name order to pack order.
- A table of 4-byte values mapping object name order to pack order.
For an object in the table of sorted shortened object names, the
value at the corresponding index in this table is the index in the
previous table for that same object.
This can be used to look up the object in reachability bitmaps or
to look up its name in another object format.
* A table of 4-byte CRC32 values of the packed object data, in the
- A table of 4-byte CRC32 values of the packed object data, in the
order that the objects appear in the pack file. This is to allow
compressed data to be copied directly from pack to pack during
repacking without undetected data corruption.
* A table of 4-byte offset values. For an object in the table of
- A table of 4-byte offset values. For an object in the table of
sorted shortened object names, the value at the corresponding
index in this table indicates where that object can be found in
the pack file. These are usually 31-bit pack file offsets, but
large offsets are encoded as an index into the next table with the
most significant bit set.
* A table of 8-byte offset entries (empty for pack files less than
- A table of 8-byte offset entries (empty for pack files less than
2 GiB). Pack files are organized with heavily used objects toward
the front, so most object references should not need to refer to
this table.
@ -276,10 +252,10 @@ network byte order):
up to and not including the table of CRC32 values.
- Zero or more NUL bytes.
- The trailer consists of the following:
* A copy of the 20-byte SHA-256 checksum at the end of the
- A copy of the 20-byte SHA-256 checksum at the end of the
corresponding packfile.
* 20-byte SHA-256 checksum of all of the above.
- 20-byte SHA-256 checksum of all of the above.
Loose object index
~~~~~~~~~~~~~~~~~~
@ -312,18 +288,18 @@ To remove entries (e.g. in "git pack-refs" or "git-prune"):
Translation table
~~~~~~~~~~~~~~~~~
The index files support a bidirectional mapping between SHA-1 names
and SHA-256 names. The lookup proceeds similarly to ordinary object
lookups. For example, to convert a SHA-1 name to a SHA-256 name:
The index files support a bidirectional mapping between sha1-names
and sha256-names. The lookup proceeds similarly to ordinary object
lookups. For example, to convert a sha1-name to a sha256-name:
1. Look for the object in idx files. If a match is present in the
idx's sorted list of truncated SHA-1 names, then:
a. Read the corresponding entry in the SHA-1 name order to pack
idx's sorted list of truncated sha1-names, then:
a. Read the corresponding entry in the sha1-name order to pack
name order mapping.
b. Read the corresponding entry in the full SHA-1 name table to
b. Read the corresponding entry in the full sha1-name table to
verify we found the right object. If it is, then
c. Read the corresponding entry in the full SHA-256 name table.
That is the object's SHA-256 name.
c. Read the corresponding entry in the full sha256-name table.
That is the object's sha256-name.
2. Check for a loose object. Read lines from loose-object-idx until
we find a match.
@ -337,10 +313,10 @@ Since all operations that make new objects (e.g., "git commit") add
the new objects to the corresponding index, this mapping is possible
for all objects in the object store.
Reading an object's SHA-1 content
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The SHA-1 content of an object can be read by converting all SHA-256 names
of its SHA-256 content references to SHA-1 names using the translation table.
Reading an object's sha1-content
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The sha1-content of an object can be read by converting all sha256-names
its sha256-content references to sha1-names using the translation table.
Fetch
~~~~~
@ -363,7 +339,7 @@ the following steps:
1. index-pack: inflate each object in the packfile and compute its
SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
objects the client has locally. These objects can be looked up
using the translation table and their SHA-1 content read as
using the translation table and their sha1-content read as
described above to resolve the deltas.
2. topological sort: starting at the "want"s from the negotiation
phase, walk through objects in the pack and emit a list of them,
@ -372,12 +348,12 @@ the following steps:
(This list only contains objects reachable from the "wants". If the
pack from the server contained additional extraneous objects, then
they will be discarded.)
3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically
3. convert to sha256: open a new (sha256) packfile. Read the topologically
sorted list just generated. For each object, inflate its
SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
sha1-content, convert to sha256-content, and write it to the sha256
pack. Record the new sha1<->sha256 mapping entry for use in the idx.
4. sort: reorder entries in the new pack to match the order of objects
in the pack the server generated and include blobs. Write a SHA-256 idx
in the pack the server generated and include blobs. Write a sha256 idx
file
5. clean up: remove the SHA-1 based pack file, index, and
topologically sorted list obtained from the server in steps 1
@ -402,20 +378,19 @@ experimenting to get this to perform well.
Push
~~~~
Push is simpler than fetch because the objects referenced by the
pushed objects are already in the translation table. The SHA-1 content
pushed objects are already in the translation table. The sha1-content
of each object being pushed can be read as described in the "Reading
an object's SHA-1 content" section to generate the pack written by git
an object's sha1-content" section to generate the pack written by git
send-pack.
Signed Commits
~~~~~~~~~~~~~~
We add a new field "gpgsig-sha256" to the commit object format to allow
signing commits without relying on SHA-1. It is similar to the
existing "gpgsig" field. Its signed payload is the SHA-256 content of the
existing "gpgsig" field. Its signed payload is the sha256-content of the
commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
This means commits can be signed
1. using SHA-1 only, as in existing signed commit objects
2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
fields.
@ -429,11 +404,10 @@ Signed Tags
~~~~~~~~~~~
We add a new field "gpgsig-sha256" to the tag object format to allow
signing tags without relying on SHA-1. Its signed payload is the
SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
SIGNATURE-----" delimited in-body signature removed.
This means tags can be signed
1. using SHA-1 only, as in existing signed tag objects
2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
signature.
@ -441,11 +415,11 @@ This means tags can be signed
Mergetag embedding
~~~~~~~~~~~~~~~~~~
The mergetag field in the SHA-1 content of a commit contains the
SHA-1 content of a tag that was merged by that commit.
The mergetag field in the sha1-content of a commit contains the
sha1-content of a tag that was merged by that commit.
The mergetag field in the SHA-256 content of the same commit contains the
SHA-256 content of the same tag.
The mergetag field in the sha256-content of the same commit contains the
sha256-content of the same tag.
Submodules
~~~~~~~~~~
@ -520,7 +494,7 @@ Caveats
-------
Invalid objects
~~~~~~~~~~~~~~~
The conversion from SHA-1 content to SHA-256 content retains any
The conversion from sha1-content to sha256-content retains any
brokenness in the original object (e.g., tree entry modes encoded with
leading 0, tree objects whose paths are not sorted correctly, and
commit objects without an author or committer). This is a deliberate
@ -539,15 +513,15 @@ allow lifting this restriction.
Alternates
~~~~~~~~~~
For the same reason, a SHA-256 repository cannot borrow objects from a
SHA-1 repository using objects/info/alternates or
For the same reason, a sha256 repository cannot borrow objects from a
sha1 repository using objects/info/alternates or
$GIT_ALTERNATE_OBJECT_REPOSITORIES.
git notes
~~~~~~~~~
The "git notes" tool annotates objects using their SHA-1 name as key.
The "git notes" tool annotates objects using their sha1-name as key.
This design does not describe a way to migrate notes trees to use
SHA-256 names. That migration is expected to happen separately (for
sha256-names. That migration is expected to happen separately (for
example using a file at the root of the notes tree to describe which
hash it uses).
@ -581,7 +555,7 @@ unclear:
Git 2.12
Does this mean Git v2.12.0 is the commit with SHA-1 name
Does this mean Git v2.12.0 is the commit with sha1-name
e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?
@ -624,12 +598,44 @@ The user can also explicitly specify which format to use for a
particular revision specifier and for output, overriding the mode. For
example:
git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
Choice of Hash
--------------
In early 2005, around the time that Git was written, Xiaoyun Wang,
Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
collisions in 2^69 operations. In August they published details.
Luckily, no practical demonstrations of a collision in full SHA-1 were
published until 10 years later, in 2017.
Git v2.13.0 and later subsequently moved to a hardened SHA-1
implementation by default that mitigates the SHAttered attack, but
SHA-1 is still believed to be weak.
The hash to replace this hardened SHA-1 should be stronger than SHA-1
was: we would like it to be trustworthy and useful in practice for at
least 10 years.
Some other relevant properties:
1. A 256-bit hash (long enough to match common security practice; not
excessively long to hurt performance and disk usage).
2. High quality implementations should be widely available (e.g., in
OpenSSL and Apple CommonCrypto).
3. The hash function's properties should match Git's needs (e.g. Git
requires collision and 2nd preimage resistance and does not require
length extension resistance).
4. As a tiebreaker, the hash should be fast to compute (fortunately
many contenders are faster than SHA-1).
We choose SHA-256.
Transition plan
---------------
Some initial steps can be implemented independently of one another:
- adding a hash function API (vtable)
- teaching fsck to tolerate the gpgsig-sha256 field
- excluding gpgsig-* from the fields copied by "git commit --amend"
@ -641,9 +647,9 @@ Some initial steps can be implemented independently of one another:
- introducing index v3
- adding support for the PSRC field and safer object pruning
The first user-visible change is the introduction of the objectFormat
extension (without compatObjectFormat). This requires:
- teaching fsck about this mode of operation
- using the hash function API (vtable) when computing object names
- signing objects and verifying signatures
@ -651,7 +657,6 @@ extension (without compatObjectFormat). This requires:
repository
Next comes introduction of compatObjectFormat:
- implementing the loose-object-idx
- translating object names between object formats
- translating object content between object formats
@ -664,11 +669,10 @@ Next comes introduction of compatObjectFormat:
"Object names on the command line" above)
The next step is supporting fetches and pushes to SHA-1 repositories:
- allow pushes to a repository using the compat format
- generate a topologically sorted list of the SHA-1 names of fetched
objects
- convert the fetched packfile to SHA-256 format and generate an idx
- convert the fetched packfile to sha256 format and generate an idx
file
- re-sort to match the order of objects in the fetched packfile
@ -730,7 +734,6 @@ Using hash functions in parallel
Objects newly created would be addressed by the new hash, but inside
such an object (e.g. commit) it is still possible to address objects
using the old hash function.
* You cannot trust its history (needed for bisectability) in the
future without further work
* Maintenance burden as the number of supported hash functions grows
@ -740,38 +743,36 @@ using the old hash function.
Signed objects with multiple hashes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Instead of introducing the gpgsig-sha256 field in commit and tag objects
for SHA-256 content based signatures, an earlier version of this design
added "hash sha256 <SHA-256 name>" fields to strengthen the existing
SHA-1 content based signatures.
for sha256-content based signatures, an earlier version of this design
added "hash sha256 <sha256-name>" fields to strengthen the existing
sha1-content based signatures.
In other words, a single signature was used to attest to the object
content using both hash functions. This had some advantages:
* Using one signature instead of two speeds up the signing process.
* Having one signed payload with both hashes allows the signer to
attest to the SHA-1 name and SHA-256 name referring to the same object.
attest to the sha1-name and sha256-name referring to the same object.
* All users consume the same signature. Broken signatures are likely
to be detected quickly using current versions of git.
However, it also came with disadvantages:
* Verifying a signed object requires access to the SHA-1 names of all
* Verifying a signed object requires access to the sha1-names of all
objects it references, even after the transition is complete and
translation table is no longer needed for anything else. To support
this, the design added fields such as "hash sha1 tree <SHA-1 name>"
and "hash sha1 parent <SHA-1 name>" to the SHA-256 content of a signed
this, the design added fields such as "hash sha1 tree <sha1-name>"
and "hash sha1 parent <sha1-name>" to the sha256-content of a signed
commit, complicating the conversion process.
* Allowing signed objects without a SHA-1 (for after the transition is
* Allowing signed objects without a sha1 (for after the transition is
complete) complicated the design further, requiring a "nohash sha1"
field to suppress including "hash sha1" fields in the SHA-256 content
field to suppress including "hash sha1" fields in the sha256-content
and signed payload.
Lazily populated translation table
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some of the work of building the translation table could be deferred to
push time, but that would significantly complicate and slow down pushes.
Calculating the SHA-1 name at object creation time at the same time it is
being streamed to disk and having its SHA-256 name calculated should be
Calculating the sha1-name at object creation time at the same time it is
being streamed to disk and having its sha256-name calculated should be
an acceptable cost.
Document History
@ -781,19 +782,18 @@ Document History
bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
sbeller@google.com
* Initial version sent to https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
Initial version sent to
http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
2017-03-03 jrnieder@gmail.com
Incorporated suggestions from jonathantanmy and sbeller:
* Describe purpose of signed objects with each hash type
* Redefine signed object verification using object content under the
* describe purpose of signed objects with each hash type
* redefine signed object verification using object content under the
first hash function
2017-03-06 jrnieder@gmail.com
* Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
* Make SHA3-based signatures a separate field, avoiding the need for
* Make sha3-based signatures a separate field, avoiding the need for
"hash" and "nohash" fields (thanks to peff[3]).
* Add a sorting phase to fetch (thanks to Junio for noticing the need
for this).
@ -805,26 +805,23 @@ Incorporated suggestions from jonathantanmy and sbeller:
especially Junio).
2017-09-27 jrnieder@gmail.com, sbeller@google.com
* Use placeholder NewHash instead of SHA3-256
* Describe criteria for picking a hash function.
* Include a transition plan (thanks especially to Brandon Williams
* use placeholder NewHash instead of SHA3-256
* describe criteria for picking a hash function.
* include a transition plan (thanks especially to Brandon Williams
for fleshing these ideas out)
* Define the translation table (thanks, Shawn Pearce[5], Jonathan
* define the translation table (thanks, Shawn Pearce[5], Jonathan
Tan, and Masaya Suzuki)
* Avoid loose object overhead by packing more aggressively in
* avoid loose object overhead by packing more aggressively in
"git gc --auto"
Later history:
* See the history of this file in git.git for the history of subsequent
edits. This document history is no longer being maintained as it
would now be superfluous to the commit log
See the history of this file in git.git for the history of subsequent
edits. This document history is no longer being maintained as it
would now be superfluous to the commit log
References:
[1] https://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
[2] https://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
[3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
[4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
[1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
[2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
[3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
[4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/

View File

@ -26,7 +26,7 @@ Git index format
Extensions are identified by signature. Optional extensions can
be ignored if Git does not understand them.
Git currently supports cache tree and resolve undo extensions.
Git currently supports cached tree and resolve undo extensions.
4-byte extension signature. If the first byte is 'A'..'Z' the
extension is optional and can be ignored.
@ -136,35 +136,14 @@ Git index format
== Extensions
=== Cache tree
=== Cached tree
Since the index does not record entries for directories, the cache
entries cannot describe tree objects that already exist in the object
database for regions of the index that are unchanged from an existing
commit. The cache tree extension stores a recursive tree structure that
describes the trees that already exist and completely match sections of
the cache entries. This speeds up tree object generation from the index
for a new commit by only computing the trees that are "new" to that
commit. It also assists when comparing the index to another tree, such
as `HEAD^{tree}`, since sections of the index can be skipped when a tree
comparison demonstrates equality.
Cached tree extension contains pre-computed hashes for trees that can
be derived from the index. It helps speed up tree object generation
from index for a new commit.
The recursive tree structure uses nodes that store a number of cache
entries, a list of subnodes, and an object ID (OID). The OID references
the existing tree for that node, if it is known to exist. The subnodes
correspond to subdirectories that themselves have cache tree nodes. The
number of cache entries corresponds to the number of cache entries in
the index that describe paths within that tree's directory.
The extension tracks the full directory structure in the cache tree
extension, but this is generally smaller than the full cache entry list.
When a path is updated in index, Git invalidates all nodes of the
recursive cache tree corresponding to the parent directories of that
path. We store these tree nodes as being "invalid" by using "-1" as the
number of cache entries. Invalid nodes still store a span of index
entries, allowing Git to focus its efforts when reconstructing a full
cache tree.
When a path is updated in index, the path must be invalidated and
removed from tree cache.
The signature for this extension is { 'T', 'R', 'E', 'E' }.
@ -195,8 +174,7 @@ Git index format
first entry represents the root level of the repository, followed by the
first subtree--let's call this A--of the root level (with its name
relative to the root level), followed by the first subtree of A (with
its name relative to A), and so on. The specified number of subtrees
indicates when the current level of the recursive stack is complete.
its name relative to A), ...
=== Resolve undo
@ -273,14 +251,14 @@ Git index format
- Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
ctime field until "file size".
- Stat data of core.excludesFile
- Stat data of core.excludesfile
- 32-bit dir_flags (see struct dir_struct)
- Hash of $GIT_DIR/info/exclude. A null hash means the file
does not exist.
- Hash of core.excludesFile. A null hash means the file does
- Hash of core.excludesfile. A null hash means the file does
not exist.
- NUL-terminated string of per-dir exclude file name. This usually

View File

@ -274,26 +274,6 @@ Pack file entry: <+
Index checksum of all of the above.
== pack-*.rev files have the format:
- A 4-byte magic number '0x52494458' ('RIDX').
- A 4-byte version identifier (= 1).
- A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
- A table of index positions (one per packed object, num_objects in
total, each a 4-byte unsigned integer in network order), sorted by
their corresponding offsets in the packfile.
- A trailer, containing a:
checksum of the corresponding packfile, and
a checksum of all of the above.
All 4-byte numbers are in network order.
== multi-pack-index (MIDX) files have the following format:
The multi-pack-index files refer to multiple pack-files and loose objects.
@ -336,9 +316,6 @@ CHUNK LOOKUP:
(Chunks are provided in file-order, so you can infer the length
using the next chunk position if necessary.)
The CHUNK LOOKUP matches the table of contents from
link:technical/chunk-format.html[the chunk-based file format].
The remaining data in the body is described one chunk at a time, and
these chunks may be given in any order. Chunks are required unless
otherwise specified.

View File

@ -33,8 +33,8 @@ In protocol v2 these special packets will have the following semantics:
* '0000' Flush Packet (flush-pkt) - indicates the end of a message
* '0001' Delimiter Packet (delim-pkt) - separates sections of a message
* '0002' Response End Packet (response-end-pkt) - indicates the end of a
response for stateless connections
* '0002' Message Packet (response-end-pkt) - indicates the end of a response
for stateless connections
Initial Client Request
----------------------
@ -192,20 +192,11 @@ ls-refs takes in the following arguments:
When specified, only references having a prefix matching one of
the provided prefixes are displayed.
If the 'unborn' feature is advertised the following argument can be
included in the client's request.
unborn
The server will send information about HEAD even if it is a symref
pointing to an unborn branch in the form "unborn HEAD
symref-target:<target>".
The output of ls-refs is as follows:
output = *ref
flush-pkt
obj-id-or-unborn = (obj-id | "unborn")
ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
ref-attribute = (symref | peeled)
symref = "symref-target:" symref-target
peeled = "peeled:" obj-id

View File

@ -872,11 +872,17 @@ A repository must set its `$GIT_DIR/config` to configure reftable:
Layout
^^^^^^
A collection of reftable files are stored in the `$GIT_DIR/reftable/` directory.
Their names should have a random element, such that each filename is globally
unique; this helps avoid spurious failures on Windows, where open files cannot
be removed or overwritten. It suggested to use
`${min_update_index}-${max_update_index}-${random}.ref` as a naming convention.
A collection of reftable files are stored in the `$GIT_DIR/reftable/`
directory:
....
00000001-00000001.log
00000002-00000002.ref
00000003-00000003.ref
....
where reftable files are named by a unique name such as produced by the
function `${min_update_index}-${max_update_index}.ref`.
Log-only files use the `.log` extension, while ref-only and mixed ref
and log files use `.ref`. extension.
@ -887,9 +893,9 @@ current files, one per line, in order, from oldest (base) to newest
....
$ cat .git/reftable/tables.list
00000001-00000001-RANDOM1.log
00000002-00000002-RANDOM2.ref
00000003-00000003-RANDOM3.ref
00000001-00000001.log
00000002-00000002.ref
00000003-00000003.ref
....
Readers must read `$GIT_DIR/reftable/tables.list` to determine which
@ -934,7 +940,7 @@ new reftable and atomically appending it to the stack:
3. Select `update_index` to be most recent file's
`max_update_index + 1`.
4. Prepare temp reftable `tmp_XXXXXX`, including log entries.
5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`.
5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`.
6. Copy `tables.list` to `tables.list.lock`, appending file from (5).
7. Rename `tables.list.lock` to `tables.list`.
@ -987,7 +993,7 @@ prevents other processes from trying to compact these files.
should always be the case, assuming that other processes are adhering to
the locking protocol.
7. Rename `${min_update_index}-${max_update_index}_XXXXXX` to
`${min_update_index}-${max_update_index}-${random}.ref`.
`${min_update_index}-${max_update_index}.ref`.
8. Write the new stack to `tables.list.lock`, replacing `B` and `C`
with the file from (4).
9. Rename `tables.list.lock` to `tables.list`.
@ -999,22 +1005,6 @@ This strategy permits compactions to proceed independently of updates.
Each reftable (compacted or not) is uniquely identified by its name, so
open reftables can be cached by their name.
Windows
^^^^^^^
On windows, and other systems that do not allow deleting or renaming to open
files, compaction may succeed, but other readers may prevent obsolete tables
from being deleted.
On these platforms, the following strategy can be followed: on closing a
reftable stack, reload `tables.list`, and delete any tables no longer mentioned
in `tables.list`.
Irregular program exit may still leave about unused files. In this case, a
cleanup operation can read `tables.list`, note its modification timestamp, and
delete any unreferenced `*.ref` files that are older.
Alternatives considered
~~~~~~~~~~~~~~~~~~~~~~~

View File

@ -1,7 +1,7 @@
#!/bin/sh
GVF=GIT-VERSION-FILE
DEF_VER=v2.31.3
DEF_VER=v2.30.9
LF='
'

View File

@ -145,6 +145,10 @@ Issues of note:
patches into an IMAP mailbox, you do not have to have them
(use NO_CURL).
Git requires version "7.19.5" or later of "libcurl" to build
without NO_CURL. This version requirement may be bumped in
the future.
- "expat" library; git-http-push uses it for remote lock
management over DAV. Similar to "curl" above, this is optional
(with NO_EXPAT).

View File

@ -22,9 +22,6 @@ all::
# when attempting to read from an fopen'ed directory (or even to fopen
# it at all).
#
# Define OPEN_RETURNS_EINTR if your open() system call may return EINTR
# when a signal is received (as opposed to restarting).
#
# Define NO_OPENSSL environment variable if you do not have OpenSSL.
#
# Define USE_LIBPCRE if you have and want to use libpcre. Various
@ -32,11 +29,18 @@ all::
# Perl-compatible regular expressions instead of standard or extended
# POSIX regular expressions.
#
# Only libpcre version 2 is supported. USE_LIBPCRE2 is a synonym for
# USE_LIBPCRE, support for the old USE_LIBPCRE1 has been removed.
# USE_LIBPCRE is a synonym for USE_LIBPCRE2, define USE_LIBPCRE1
# instead if you'd like to use the legacy version 1 of the PCRE
# library. Support for version 1 will likely be removed in some future
# release of Git, as upstream has all but abandoned it.
#
# When using USE_LIBPCRE1, define NO_LIBPCRE1_JIT if you want to
# disable JIT even if supported by your library.
#
# Define LIBPCREDIR=/foo/bar if your PCRE header and library files are
# in /foo/bar/include and /foo/bar/lib directories.
# in /foo/bar/include and /foo/bar/lib directories. Which version of
# PCRE this points to determined by the USE_LIBPCRE1 and USE_LIBPCRE2
# variables.
#
# Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
#
@ -718,7 +722,6 @@ TEST_BUILTINS_OBJS += test-online-cpus.o
TEST_BUILTINS_OBJS += test-parse-options.o
TEST_BUILTINS_OBJS += test-parse-pathspec-file.o
TEST_BUILTINS_OBJS += test-path-utils.o
TEST_BUILTINS_OBJS += test-pcre2-config.o
TEST_BUILTINS_OBJS += test-pkt-line.o
TEST_BUILTINS_OBJS += test-prio-queue.o
TEST_BUILTINS_OBJS += test-proc-receive.o
@ -837,7 +840,6 @@ LIB_OBJS += bundle.o
LIB_OBJS += cache-tree.o
LIB_OBJS += chdir-notify.o
LIB_OBJS += checkout.o
LIB_OBJS += chunk-format.o
LIB_OBJS += color.o
LIB_OBJS += column.o
LIB_OBJS += combine-diff.o
@ -858,7 +860,6 @@ LIB_OBJS += date.o
LIB_OBJS += decorate.o
LIB_OBJS += delta-islands.o
LIB_OBJS += diff-delta.o
LIB_OBJS += diff-merges.o
LIB_OBJS += diff-lib.o
LIB_OBJS += diff-no-index.o
LIB_OBJS += diff.o
@ -867,7 +868,6 @@ LIB_OBJS += diffcore-delta.o
LIB_OBJS += diffcore-order.o
LIB_OBJS += diffcore-pickaxe.o
LIB_OBJS += diffcore-rename.o
LIB_OBJS += diffcore-rotate.o
LIB_OBJS += dir-iterator.o
LIB_OBJS += dir.o
LIB_OBJS += editor.o
@ -887,7 +887,6 @@ LIB_OBJS += gettext.o
LIB_OBJS += gpg-interface.o
LIB_OBJS += graph.o
LIB_OBJS += grep.o
LIB_OBJS += hash-lookup.o
LIB_OBJS += hashmap.o
LIB_OBJS += help.o
LIB_OBJS += hex.o
@ -924,8 +923,6 @@ LIB_OBJS += notes-cache.o
LIB_OBJS += notes-merge.o
LIB_OBJS += notes-utils.o
LIB_OBJS += notes.o
LIB_OBJS += object-file.o
LIB_OBJS += object-name.o
LIB_OBJS += object.o
LIB_OBJS += oid-array.o
LIB_OBJS += oidmap.o
@ -982,6 +979,9 @@ LIB_OBJS += sequencer.o
LIB_OBJS += serve.o
LIB_OBJS += server-info.o
LIB_OBJS += setup.o
LIB_OBJS += sha1-file.o
LIB_OBJS += sha1-lookup.o
LIB_OBJS += sha1-name.o
LIB_OBJS += shallow.o
LIB_OBJS += sideband.o
LIB_OBJS += sigchain.o
@ -1359,17 +1359,26 @@ ifdef NO_LIBGEN_H
COMPAT_OBJS += compat/basename.o
endif
ifdef USE_LIBPCRE1
$(error The USE_LIBPCRE1 build option has been removed, use version 2 with USE_LIBPCRE)
endif
USE_LIBPCRE2 ?= $(USE_LIBPCRE)
ifneq (,$(USE_LIBPCRE2))
ifdef USE_LIBPCRE1
$(error Only set USE_LIBPCRE2 (or its alias USE_LIBPCRE) or USE_LIBPCRE1, not both!)
endif
BASIC_CFLAGS += -DUSE_LIBPCRE2
EXTLIBS += -lpcre2-8
endif
ifdef USE_LIBPCRE1
BASIC_CFLAGS += -DUSE_LIBPCRE1
EXTLIBS += -lpcre
ifdef NO_LIBPCRE1_JIT
BASIC_CFLAGS += -DNO_LIBPCRE1_JIT
endif
endif
ifdef LIBPCREDIR
BASIC_CFLAGS += -I$(LIBPCREDIR)/include
EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
@ -1542,10 +1551,6 @@ ifdef FREAD_READS_DIRECTORIES
COMPAT_CFLAGS += -DFREAD_READS_DIRECTORIES
COMPAT_OBJS += compat/fopen.o
endif
ifdef OPEN_RETURNS_EINTR
COMPAT_CFLAGS += -DOPEN_RETURNS_EINTR
COMPAT_OBJS += compat/open.o
endif
ifdef NO_SYMLINK_HEAD
BASIC_CFLAGS += -DNO_SYMLINK_HEAD
endif
@ -2721,7 +2726,9 @@ GIT-BUILD-OPTIONS: FORCE
@echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+
@echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+
@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
@echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE1)))'\' >>$@+
@echo USE_LIBPCRE2=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE2)))'\' >>$@+
@echo NO_LIBPCRE1_JIT=\''$(subst ','\'',$(subst ','\'',$(NO_LIBPCRE1_JIT)))'\' >>$@+
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
@echo NO_PTHREADS=\''$(subst ','\'',$(subst ','\'',$(NO_PTHREADS)))'\' >>$@+
@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
@ -3299,11 +3306,11 @@ cover_db_html: cover_db
# are not necessarily appropriate for general builds, and that vary greatly
# depending on the compiler version used.
#
# An example command to build against libFuzzer from LLVM 11.0.0:
# An example command to build against libFuzzer from LLVM 4.0.0:
#
# make CC=clang CXX=clang++ \
# CFLAGS="-fsanitize=fuzzer-no-link,address" \
# LIB_FUZZING_ENGINE="-fsanitize=fuzzer" \
# CFLAGS="-fsanitize-coverage=trace-pc-guard -fsanitize=address" \
# LIB_FUZZING_ENGINE=/usr/lib/llvm-4.0/lib/libFuzzer.a \
# fuzz-all
#
FUZZ_CXXFLAGS ?= $(CFLAGS)

View File

@ -1 +1 @@
Documentation/RelNotes/2.31.3.txt
Documentation/RelNotes/2.30.9.txt

View File

@ -67,15 +67,19 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
#endif
/*
* If set, any number of trailing components may be missing; otherwise, only one
* may be.
* Return the real path (i.e., absolute path, with symlinks resolved
* and extra slashes removed) equivalent to the specified path. (If
* you want an absolute path but don't mind links, use
* absolute_path().) Places the resolved realpath in the provided strbuf.
*
* The directory part of path (i.e., everything up to the last
* dir_sep) must denote a valid, existing directory, but the last
* component need not exist. If die_on_error is set, then die with an
* informative error message if there is a problem. Otherwise, return
* NULL on errors (without generating any output).
*/
#define REALPATH_MANY_MISSING (1 << 0)
/* Should we die if there's an error? */
#define REALPATH_DIE_ON_ERROR (1 << 1)
static char *strbuf_realpath_1(struct strbuf *resolved, const char *path,
int flags)
char *strbuf_realpath(struct strbuf *resolved, const char *path,
int die_on_error)
{
struct strbuf remaining = STRBUF_INIT;
struct strbuf next = STRBUF_INIT;
@ -85,7 +89,7 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path,
struct stat st;
if (!*path) {
if (flags & REALPATH_DIE_ON_ERROR)
if (die_on_error)
die("The empty string is not a valid path");
else
goto error_out;
@ -97,7 +101,7 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path,
if (!resolved->len) {
/* relative path; can use CWD as the initial resolved path */
if (strbuf_getcwd(resolved)) {
if (flags & REALPATH_DIE_ON_ERROR)
if (die_on_error)
die_errno("unable to get current working directory");
else
goto error_out;
@ -125,9 +129,8 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path,
if (lstat(resolved->buf, &st)) {
/* error out unless this was the last component */
if (errno != ENOENT ||
(!(flags & REALPATH_MANY_MISSING) && remaining.len)) {
if (flags & REALPATH_DIE_ON_ERROR)
if (errno != ENOENT || remaining.len) {
if (die_on_error)
die_errno("Invalid path '%s'",
resolved->buf);
else
@ -140,7 +143,7 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path,
if (num_symlinks++ > MAXSYMLINKS) {
errno = ELOOP;
if (flags & REALPATH_DIE_ON_ERROR)
if (die_on_error)
die("More than %d nested symlinks "
"on path '%s'", MAXSYMLINKS, path);
else
@ -150,7 +153,7 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path,
len = strbuf_readlink(&symlink, resolved->buf,
st.st_size);
if (len < 0) {
if (flags & REALPATH_DIE_ON_ERROR)
if (die_on_error)
die_errno("Invalid symlink '%s'",
resolved->buf);
else
@ -199,37 +202,6 @@ error_out:
return retval;
}
/*
* Return the real path (i.e., absolute path, with symlinks resolved
* and extra slashes removed) equivalent to the specified path. (If
* you want an absolute path but don't mind links, use
* absolute_path().) Places the resolved realpath in the provided strbuf.
*
* The directory part of path (i.e., everything up to the last
* dir_sep) must denote a valid, existing directory, but the last
* component need not exist. If die_on_error is set, then die with an
* informative error message if there is a problem. Otherwise, return
* NULL on errors (without generating any output).
*/
char *strbuf_realpath(struct strbuf *resolved, const char *path,
int die_on_error)
{
return strbuf_realpath_1(resolved, path,
die_on_error ? REALPATH_DIE_ON_ERROR : 0);
}
/*
* Just like strbuf_realpath, but allows an arbitrary number of path
* components to be missing.
*/
char *strbuf_realpath_forgiving(struct strbuf *resolved, const char *path,
int die_on_error)
{
return strbuf_realpath_1(resolved, path,
((die_on_error ? REALPATH_DIE_ON_ERROR : 0) |
REALPATH_MANY_MISSING));
}
char *real_pathdup(const char *path, int die_on_error)
{
struct strbuf realpath = STRBUF_INIT;

View File

@ -413,7 +413,7 @@ struct file_item {
static void add_file_item(struct string_list *files, const char *name)
{
struct file_item *item = xcalloc(1, sizeof(*item));
struct file_item *item = xcalloc(sizeof(*item), 1);
string_list_append(files, name)->util = item;
}
@ -476,7 +476,7 @@ static void collect_changes_cb(struct diff_queue_struct *q,
add_file_item(s->files, name);
CALLOC_ARRAY(entry, 1);
entry = xcalloc(sizeof(*entry), 1);
hashmap_entry_init(&entry->ent, hash);
entry->name = s->files->items[s->files->nr - 1].string;
entry->item = s->files->items[s->files->nr - 1].util;
@ -1120,7 +1120,7 @@ int run_add_i(struct repository *r, const struct pathspec *ps)
int res = 0;
for (i = 0; i < ARRAY_SIZE(command_list); i++) {
struct command_item *util = xcalloc(1, sizeof(*util));
struct command_item *util = xcalloc(sizeof(*util), 1);
util->command = command_list[i].command;
string_list_append(&commands.items, command_list[i].string)
->util = util;

11
alias.c
View File

@ -46,14 +46,16 @@ void list_aliases(struct string_list *list)
#define SPLIT_CMDLINE_BAD_ENDING 1
#define SPLIT_CMDLINE_UNCLOSED_QUOTE 2
#define SPLIT_CMDLINE_ARGC_OVERFLOW 3
static const char *split_cmdline_errors[] = {
N_("cmdline ends with \\"),
N_("unclosed quote")
N_("unclosed quote"),
N_("too many arguments"),
};
int split_cmdline(char *cmdline, const char ***argv)
{
int src, dst, count = 0, size = 16;
size_t src, dst, count = 0, size = 16;
char quoted = 0;
ALLOC_ARRAY(*argv, size);
@ -96,6 +98,11 @@ int split_cmdline(char *cmdline, const char ***argv)
return -SPLIT_CMDLINE_UNCLOSED_QUOTE;
}
if (count >= INT_MAX) {
FREE_AND_NULL(*argv);
return -SPLIT_CMDLINE_ARGC_OVERFLOW;
}
ALLOC_GROW(*argv, count + 1, size);
(*argv)[count] = NULL;

47
apply.c
View File

@ -1781,7 +1781,7 @@ static int parse_single_patch(struct apply_state *state,
struct fragment *fragment;
int len;
CALLOC_ARRAY(fragment, 1);
fragment = xcalloc(1, sizeof(*fragment));
fragment->linenr = state->linenr;
len = parse_fragment(state, line, size, patch, fragment);
if (len <= 0) {
@ -1959,7 +1959,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
size -= llen;
}
CALLOC_ARRAY(frag, 1);
frag = xcalloc(1, sizeof(*frag));
frag->patch = inflate_it(data, hunk_size, origlen);
frag->free_patch = 1;
if (!frag->patch)
@ -4400,6 +4400,33 @@ static int create_one_file(struct apply_state *state,
if (state->cached)
return 0;
/*
* We already try to detect whether files are beyond a symlink in our
* up-front checks. But in the case where symlinks are created by any
* of the intermediate hunks it can happen that our up-front checks
* didn't yet see the symlink, but at the point of arriving here there
* in fact is one. We thus repeat the check for symlinks here.
*
* Note that this does not make the up-front check obsolete as the
* failure mode is different:
*
* - The up-front checks cause us to abort before we have written
* anything into the working directory. So when we exit this way the
* working directory remains clean.
*
* - The checks here happen in the middle of the action where we have
* already started to apply the patch. The end result will be a dirty
* working directory.
*
* Ideally, we should update the up-front checks to catch what would
* happen when we apply the patch before we damage the working tree.
* We have all the information necessary to do so. But for now, as a
* part of embargoed security work, having this check would serve as a
* reasonable first step.
*/
if (path_is_beyond_symlink(state, path))
return error(_("affected file '%s' is beyond a symbolic link"), path);
res = try_create_file(state, path, mode, buf, size);
if (res < 0)
return -1;
@ -4531,7 +4558,7 @@ static int write_out_one_reject(struct apply_state *state, struct patch *patch)
FILE *rej;
char namebuf[PATH_MAX];
struct fragment *frag;
int cnt = 0;
int fd, cnt = 0;
struct strbuf sb = STRBUF_INIT;
for (cnt = 0, frag = patch->fragments; frag; frag = frag->next) {
@ -4571,7 +4598,17 @@ static int write_out_one_reject(struct apply_state *state, struct patch *patch)
memcpy(namebuf, patch->new_name, cnt);
memcpy(namebuf + cnt, ".rej", 5);
rej = fopen(namebuf, "w");
fd = open(namebuf, O_CREAT | O_EXCL | O_WRONLY, 0666);
if (fd < 0) {
if (errno != EEXIST)
return error_errno(_("cannot open %s"), namebuf);
if (unlink(namebuf))
return error_errno(_("cannot unlink '%s'"), namebuf);
fd = open(namebuf, O_CREAT | O_EXCL | O_WRONLY, 0666);
if (fd < 0)
return error_errno(_("cannot open %s"), namebuf);
}
rej = fdopen(fd, "w");
if (!rej)
return error_errno(_("cannot open %s"), namebuf);
@ -4681,7 +4718,7 @@ static int apply_patch(struct apply_state *state,
struct patch *patch;
int nr;
CALLOC_ARRAY(patch, 1);
patch = xcalloc(1, sizeof(*patch));
patch->inaccurate_eof = !!(options & APPLY_OPT_INACCURATE_EOF);
patch->recount = !!(options & APPLY_OPT_RECOUNT);
nr = parse_chunk(state, buf.buf + offset, buf.len - offset, patch);

View File

@ -371,7 +371,7 @@ static int tar_filter_config(const char *var, const char *value, void *data)
ar = find_tar_filter(name, namelen);
if (!ar) {
CALLOC_ARRAY(ar, 1);
ar = xcalloc(1, sizeof(*ar));
ar->name = xmemdupz(name, namelen);
ar->write_archive = write_tar_filter_archive;
ar->flags = ARCHIVER_WANT_COMPRESSION_LEVELS |

107
attr.c
View File

@ -28,7 +28,7 @@ static const char git_attr__unknown[] = "(builtin)unknown";
#endif
struct git_attr {
int attr_nr; /* unique attribute number */
unsigned int attr_nr; /* unique attribute number */
char name[FLEX_ARRAY]; /* attribute name */
};
@ -210,7 +210,7 @@ static void report_invalid_attr(const char *name, size_t len,
* dictionary. If no entry is found, create a new attribute and store it in
* the dictionary.
*/
static const struct git_attr *git_attr_internal(const char *name, int namelen)
static const struct git_attr *git_attr_internal(const char *name, size_t namelen)
{
struct git_attr *a;
@ -226,8 +226,8 @@ static const struct git_attr *git_attr_internal(const char *name, int namelen)
a->attr_nr = hashmap_get_size(&g_attr_hashmap.map);
attr_hashmap_add(&g_attr_hashmap, a->name, namelen, a);
assert(a->attr_nr ==
(hashmap_get_size(&g_attr_hashmap.map) - 1));
if (a->attr_nr != hashmap_get_size(&g_attr_hashmap.map) - 1)
die(_("unable to add additional attribute"));
}
hashmap_unlock(&g_attr_hashmap);
@ -272,7 +272,7 @@ struct match_attr {
const struct git_attr *attr;
} u;
char is_macro;
unsigned num_attr;
size_t num_attr;
struct attr_state state[FLEX_ARRAY];
};
@ -289,7 +289,7 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
struct attr_state *e)
{
const char *ep, *equals;
int len;
size_t len;
ep = cp + strcspn(cp, blank);
equals = strchr(cp, '=');
@ -333,8 +333,7 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
static struct match_attr *parse_attr_line(const char *line, const char *src,
int lineno, int macro_ok)
{
int namelen;
int num_attr, i;
size_t namelen, num_attr, i;
const char *cp, *name, *states;
struct match_attr *res = NULL;
int is_macro;
@ -345,6 +344,11 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
return NULL;
name = cp;
if (strlen(line) >= ATTR_MAX_LINE_LENGTH) {
warning(_("ignoring overly long attributes line %d"), lineno);
return NULL;
}
if (*cp == '"' && !unquote_c_style(&pattern, name, &states)) {
name = pattern.buf;
namelen = pattern.len;
@ -381,10 +385,9 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
goto fail_return;
}
res = xcalloc(1,
sizeof(*res) +
sizeof(struct attr_state) * num_attr +
(is_macro ? 0 : namelen + 1));
res = xcalloc(1, st_add3(sizeof(*res),
st_mult(sizeof(struct attr_state), num_attr),
is_macro ? 0 : namelen + 1));
if (is_macro) {
res->u.attr = git_attr_internal(name, namelen);
} else {
@ -447,11 +450,12 @@ struct attr_stack {
static void attr_stack_free(struct attr_stack *e)
{
int i;
unsigned i;
free(e->origin);
for (i = 0; i < e->num_matches; i++) {
struct match_attr *a = e->attrs[i];
int j;
size_t j;
for (j = 0; j < a->num_attr; j++) {
const char *setto = a->state[j].setto;
if (setto == ATTR__TRUE ||
@ -569,7 +573,7 @@ struct attr_check *attr_check_initl(const char *one, ...)
check = attr_check_alloc();
check->nr = cnt;
check->alloc = cnt;
CALLOC_ARRAY(check->items, cnt);
check->items = xcalloc(cnt, sizeof(struct attr_check_item));
check->items[0].attr = git_attr(one);
va_start(params, one);
@ -660,8 +664,8 @@ static void handle_attr_line(struct attr_stack *res,
a = parse_attr_line(line, src, lineno, macro_ok);
if (!a)
return;
ALLOC_GROW(res->attrs, res->num_matches + 1, res->alloc);
res->attrs[res->num_matches++] = a;
ALLOC_GROW_BY(res->attrs, res->num_matches, 1, res->alloc);
res->attrs[res->num_matches - 1] = a;
}
static struct attr_stack *read_attr_from_array(const char **list)
@ -670,7 +674,7 @@ static struct attr_stack *read_attr_from_array(const char **list)
const char *line;
int lineno = 0;
CALLOC_ARRAY(res, 1);
res = xcalloc(1, sizeof(*res));
while ((line = *(list++)) != NULL)
handle_attr_line(res, line, "[builtin]", ++lineno, 1);
return res;
@ -700,21 +704,37 @@ void git_attr_set_direction(enum git_attr_direction new_direction)
static struct attr_stack *read_attr_from_file(const char *path, int macro_ok)
{
struct strbuf buf = STRBUF_INIT;
FILE *fp = fopen_or_warn(path, "r");
struct attr_stack *res;
char buf[2048];
int lineno = 0;
int fd;
struct stat st;
if (!fp)
return NULL;
CALLOC_ARRAY(res, 1);
while (fgets(buf, sizeof(buf), fp)) {
char *bufp = buf;
if (!lineno)
skip_utf8_bom(&bufp, strlen(bufp));
handle_attr_line(res, bufp, path, ++lineno, macro_ok);
fd = fileno(fp);
if (fstat(fd, &st)) {
warning_errno(_("cannot fstat gitattributes file '%s'"), path);
fclose(fp);
return NULL;
}
if (st.st_size >= ATTR_MAX_FILE_SIZE) {
warning(_("ignoring overly large gitattributes file '%s'"), path);
fclose(fp);
return NULL;
}
CALLOC_ARRAY(res, 1);
while (strbuf_getline(&buf, fp) != EOF) {
if (!lineno && starts_with(buf.buf, utf8_bom))
strbuf_remove(&buf, 0, strlen(utf8_bom));
handle_attr_line(res, buf.buf, path, ++lineno, macro_ok);
}
fclose(fp);
strbuf_release(&buf);
return res;
}
@ -725,15 +745,20 @@ static struct attr_stack *read_attr_from_index(const struct index_state *istate,
struct attr_stack *res;
char *buf, *sp;
int lineno = 0;
unsigned long size;
if (!istate)
return NULL;
buf = read_blob_data_from_index(istate, path, NULL);
buf = read_blob_data_from_index(istate, path, &size);
if (!buf)
return NULL;
if (size >= ATTR_MAX_FILE_SIZE) {
warning(_("ignoring overly large gitattributes blob '%s'"), path);
return NULL;
}
CALLOC_ARRAY(res, 1);
res = xcalloc(1, sizeof(*res));
for (sp = buf; *sp; ) {
char *ep;
int more;
@ -774,7 +799,7 @@ static struct attr_stack *read_attr(const struct index_state *istate,
}
if (!res)
CALLOC_ARRAY(res, 1);
res = xcalloc(1, sizeof(*res));
return res;
}
@ -874,7 +899,7 @@ static void bootstrap_attr_stack(const struct index_state *istate,
else
e = NULL;
if (!e)
CALLOC_ARRAY(e, 1);
e = xcalloc(1, sizeof(struct attr_stack));
push_stack(stack, e, NULL, 0);
}
@ -1001,12 +1026,12 @@ static int macroexpand_one(struct all_attrs_item *all_attrs, int nr, int rem);
static int fill_one(const char *what, struct all_attrs_item *all_attrs,
const struct match_attr *a, int rem)
{
int i;
size_t i;
for (i = a->num_attr - 1; rem > 0 && i >= 0; i--) {
const struct git_attr *attr = a->state[i].attr;
for (i = a->num_attr; rem > 0 && i > 0; i--) {
const struct git_attr *attr = a->state[i - 1].attr;
const char **n = &(all_attrs[attr->attr_nr].value);
const char *v = a->state[i].setto;
const char *v = a->state[i - 1].setto;
if (*n == ATTR__UNKNOWN) {
debug_set(what,
@ -1025,11 +1050,11 @@ static int fill(const char *path, int pathlen, int basename_offset,
struct all_attrs_item *all_attrs, int rem)
{
for (; rem > 0 && stack; stack = stack->prev) {
int i;
unsigned i;
const char *base = stack->origin ? stack->origin : "";
for (i = stack->num_matches - 1; 0 < rem && 0 <= i; i--) {
const struct match_attr *a = stack->attrs[i];
for (i = stack->num_matches; 0 < rem && 0 < i; i--) {
const struct match_attr *a = stack->attrs[i - 1];
if (a->is_macro)
continue;
if (path_matches(path, pathlen, basename_offset,
@ -1060,11 +1085,11 @@ static void determine_macros(struct all_attrs_item *all_attrs,
const struct attr_stack *stack)
{
for (; stack; stack = stack->prev) {
int i;
for (i = stack->num_matches - 1; i >= 0; i--) {
const struct match_attr *ma = stack->attrs[i];
unsigned i;
for (i = stack->num_matches; i > 0; i--) {
const struct match_attr *ma = stack->attrs[i - 1];
if (ma->is_macro) {
int n = ma->u.attr->attr_nr;
unsigned int n = ma->u.attr->attr_nr;
if (!all_attrs[n].macro) {
all_attrs[n].macro = ma;
}
@ -1116,7 +1141,7 @@ void git_check_attr(const struct index_state *istate,
collect_some_attrs(istate, path, check);
for (i = 0; i < check->nr; i++) {
size_t n = check->items[i].attr->attr_nr;
unsigned int n = check->items[i].attr->attr_nr;
const char *value = check->all_attrs[n].value;
if (value == ATTR__UNKNOWN)
value = ATTR__UNSET;

12
attr.h
View File

@ -107,6 +107,18 @@
* - Free the `attr_check` struct by calling `attr_check_free()`.
*/
/**
* The maximum line length for a gitattributes file. If the line exceeds this
* length we will ignore it.
*/
#define ATTR_MAX_LINE_LENGTH 2048
/**
* The maximum size of the giattributes file. If the file exceeds this size we
* will ignore it.
*/
#define ATTR_MAX_FILE_SIZE (100 * 1024 * 1024)
struct index_state;
/**

View File

@ -6,7 +6,7 @@
#include "refs.h"
#include "list-objects.h"
#include "quote.h"
#include "hash-lookup.h"
#include "sha1-lookup.h"
#include "run-command.h"
#include "log-tree.h"
#include "bisect.h"
@ -423,7 +423,7 @@ void find_bisection(struct commit_list **commit_list, int *reaches,
show_list("bisection 2 sorted", 0, nr, list);
*all = nr;
CALLOC_ARRAY(weights, on_list);
weights = xcalloc(on_list, sizeof(*weights));
/* Do the real work of finding bisection commit. */
best = do_find_bisection(list, nr, weights, bisect_flags);
@ -1064,7 +1064,7 @@ enum bisect_error bisect_next_all(struct repository *r, const char *prefix)
if (!all) {
fprintf(stderr, _("No testable commit found.\n"
"Maybe you started with bad path arguments?\n"));
"Maybe you started with bad path parameters?\n"));
return BISECT_NO_TESTABLE_COMMIT;
}

17
blame.c
View File

@ -951,13 +951,13 @@ static int *fuzzy_find_matching_lines(struct blame_origin *parent,
max_search_distance_b = ((2 * max_search_distance_a + 1) * length_b
- 1) / length_a;
CALLOC_ARRAY(result, length_b);
CALLOC_ARRAY(second_best_result, length_b);
CALLOC_ARRAY(certainties, length_b);
result = xcalloc(sizeof(int), length_b);
second_best_result = xcalloc(sizeof(int), length_b);
certainties = xcalloc(sizeof(int), length_b);
/* See get_similarity() for details of similarities. */
similarity_count = length_b * (max_search_distance_a * 2 + 1);
CALLOC_ARRAY(similarities, similarity_count);
similarities = xcalloc(sizeof(int), similarity_count);
for (i = 0; i < length_b; ++i) {
result[i] = -1;
@ -995,7 +995,7 @@ static void fill_origin_fingerprints(struct blame_origin *o)
return;
o->num_lines = find_line_starts(&line_starts, o->file.ptr,
o->file.size);
CALLOC_ARRAY(o->fingerprints, o->num_lines);
o->fingerprints = xcalloc(sizeof(struct fingerprint), o->num_lines);
get_line_fingerprints(o->fingerprints, o->file.ptr, line_starts,
0, o->num_lines);
free(line_starts);
@ -1853,7 +1853,8 @@ static void blame_chunk(struct blame_entry ***dstq, struct blame_entry ***srcq,
diffp = NULL;
if (ignore_diffs && same - tlno > 0) {
CALLOC_ARRAY(line_blames, same - tlno);
line_blames = xcalloc(sizeof(struct blame_line_tracker),
same - tlno);
guess_line_blames(parent, target, tlno, offset, same,
parent_len, line_blames);
}
@ -2215,7 +2216,7 @@ static struct blame_list *setup_blame_list(struct blame_entry *unblamed,
for (e = unblamed, num_ents = 0; e; e = e->next)
num_ents++;
if (num_ents) {
CALLOC_ARRAY(blame_list, num_ents);
blame_list = xcalloc(num_ents, sizeof(struct blame_list));
for (e = unblamed, i = 0; e; e = e->next)
blame_list[i++].ent = e;
}
@ -2427,7 +2428,7 @@ static void pass_blame(struct blame_scoreboard *sb, struct blame_origin *origin,
else if (num_sg < ARRAY_SIZE(sg_buf))
memset(sg_buf, 0, sizeof(sg_buf));
else
CALLOC_ARRAY(sg_origin, num_sg);
sg_origin = xcalloc(num_sg, sizeof(*sg_origin));
/*
* The first pass looks for unrenamed path to optimize for

View File

@ -70,7 +70,7 @@
* the input data, the next mix it from the 512-bit array.
*/
#define SHA_SRC(t) get_be32((unsigned char *) block + (t)*4)
#define SHA_MIX(t) SHA_ROL(W((t)+13) ^ W((t)+8) ^ W((t)+2) ^ W(t), 1)
#define SHA_MIX(t) SHA_ROL(W((t)+13) ^ W((t)+8) ^ W((t)+2) ^ W(t), 1);
#define SHA_ROUND(t, input, fn, constant, A, B, C, D, E) do { \
unsigned int TEMP = input(t); setW(t, TEMP); \

View File

@ -277,7 +277,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
*computed |= BLOOM_TRUNC_EMPTY;
filter->len = 1;
}
CALLOC_ARRAY(filter->data, filter->len);
filter->data = xcalloc(filter->len, sizeof(unsigned char));
hashmap_for_each_entry(&pathmap, &iter, e, entry) {
struct bloom_key key;

View File

@ -38,27 +38,19 @@ struct update_callback_data {
int add_errors;
};
static int chmod_pathspec(struct pathspec *pathspec, char flip, int show_only)
static void chmod_pathspec(struct pathspec *pathspec, char flip)
{
int i, ret = 0;
int i;
for (i = 0; i < active_nr; i++) {
struct cache_entry *ce = active_cache[i];
int err;
if (pathspec && !ce_path_match(&the_index, ce, pathspec, NULL))
continue;
if (!show_only)
err = chmod_cache_entry(ce, flip);
else
err = S_ISREG(ce->ce_mode) ? 0 : -1;
if (err < 0)
ret = error(_("cannot chmod %cx '%s'"), flip, ce->name);
if (chmod_cache_entry(ce, flip) < 0)
fprintf(stderr, "cannot chmod %cx '%s'\n", flip, ce->name);
}
return ret;
}
static int fix_unmerged_status(struct diff_filepair *p,
@ -617,7 +609,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
exit_status |= add_files(&dir, flags);
if (chmod_arg && pathspec.nr)
exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only);
chmod_pathspec(&pathspec, chmod_arg[0]);
unplug_bulk_checkin();
finish:

View File

@ -21,15 +21,16 @@ static GIT_PATH_FUNC(git_path_bisect_first_parent, "BISECT_FIRST_PARENT")
static const char * const git_bisect_helper_usage[] = {
N_("git bisect--helper --bisect-reset [<commit>]"),
N_("git bisect--helper --bisect-write [--no-log] <state> <revision> <good_term> <bad_term>"),
N_("git bisect--helper --bisect-check-and-set-terms <command> <good_term> <bad_term>"),
N_("git bisect--helper --bisect-next-check <good_term> <bad_term> [<term>]"),
N_("git bisect--helper --bisect-terms [--term-good | --term-old | --term-bad | --term-new]"),
N_("git bisect--helper --bisect-start [--term-{new,bad}=<term> --term-{old,good}=<term>]"
" [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<paths>...]"),
N_("git bisect--helper --bisect-next"),
N_("git bisect--helper --bisect-auto-next"),
N_("git bisect--helper --bisect-state (bad|new) [<rev>]"),
N_("git bisect--helper --bisect-state (good|old) [<rev>...]"),
N_("git bisect--helper --bisect-replay <filename>"),
N_("git bisect--helper --bisect-skip [(<rev>|<range>)...]"),
NULL
};
@ -874,19 +875,12 @@ static enum bisect_error bisect_state(struct bisect_terms *terms, const char **a
*/
for (; argc; argc--, argv++) {
struct commit *commit;
if (get_oid(*argv, &oid)){
error(_("Bad rev input: %s"), *argv);
oid_array_clear(&revs);
return BISECT_FAILED;
}
commit = lookup_commit_reference(the_repository, &oid);
if (!commit)
die(_("Bad rev input (not a commit): %s"), *argv);
oid_array_append(&revs, &commit->object.oid);
oid_array_append(&revs, &oid);
}
if (strbuf_read_file(&buf, git_path_bisect_expected_rev(), 0) < the_hash_algo->hexsz ||
@ -910,148 +904,28 @@ static enum bisect_error bisect_state(struct bisect_terms *terms, const char **a
return bisect_auto_next(terms, NULL);
}
static enum bisect_error bisect_log(void)
{
int fd, status;
const char* filename = git_path_bisect_log();
if (is_empty_or_missing_file(filename))
return error(_("We are not bisecting."));
fd = open(filename, O_RDONLY);
if (fd < 0)
return BISECT_FAILED;
status = copy_fd(fd, STDOUT_FILENO);
close(fd);
return status ? BISECT_FAILED : BISECT_OK;
}
static int process_replay_line(struct bisect_terms *terms, struct strbuf *line)
{
const char *p = line->buf + strspn(line->buf, " \t");
char *word_end, *rev;
if ((!skip_prefix(p, "git bisect", &p) &&
!skip_prefix(p, "git-bisect", &p)) || !isspace(*p))
return 0;
p += strspn(p, " \t");
word_end = (char *)p + strcspn(p, " \t");
rev = word_end + strspn(word_end, " \t");
*word_end = '\0'; /* NUL-terminate the word */
get_terms(terms);
if (check_and_set_terms(terms, p))
return -1;
if (!strcmp(p, "start")) {
struct strvec argv = STRVEC_INIT;
int res;
sq_dequote_to_strvec(rev, &argv);
res = bisect_start(terms, argv.v, argv.nr);
strvec_clear(&argv);
return res;
}
if (one_of(p, terms->term_good,
terms->term_bad, "skip", NULL))
return bisect_write(p, rev, terms, 0);
if (!strcmp(p, "terms")) {
struct strvec argv = STRVEC_INIT;
int res;
sq_dequote_to_strvec(rev, &argv);
res = bisect_terms(terms, argv.nr == 1 ? argv.v[0] : NULL);
strvec_clear(&argv);
return res;
}
error(_("'%s'?? what are you talking about?"), p);
return -1;
}
static enum bisect_error bisect_replay(struct bisect_terms *terms, const char *filename)
{
FILE *fp = NULL;
enum bisect_error res = BISECT_OK;
struct strbuf line = STRBUF_INIT;
if (is_empty_or_missing_file(filename))
return error(_("cannot read file '%s' for replaying"), filename);
if (bisect_reset(NULL))
return BISECT_FAILED;
fp = fopen(filename, "r");
if (!fp)
return BISECT_FAILED;
while ((strbuf_getline(&line, fp) != EOF) && !res)
res = process_replay_line(terms, &line);
strbuf_release(&line);
fclose(fp);
if (res)
return BISECT_FAILED;
return bisect_auto_next(terms, NULL);
}
static enum bisect_error bisect_skip(struct bisect_terms *terms, const char **argv, int argc)
{
int i;
enum bisect_error res;
struct strvec argv_state = STRVEC_INIT;
strvec_push(&argv_state, "skip");
for (i = 0; i < argc; i++) {
const char *dotdot = strstr(argv[i], "..");
if (dotdot) {
struct rev_info revs;
struct commit *commit;
init_revisions(&revs, NULL);
setup_revisions(2, argv + i - 1, &revs, NULL);
if (prepare_revision_walk(&revs))
die(_("revision walk setup failed\n"));
while ((commit = get_revision(&revs)) != NULL)
strvec_push(&argv_state,
oid_to_hex(&commit->object.oid));
reset_revision_walk();
} else {
strvec_push(&argv_state, argv[i]);
}
}
res = bisect_state(terms, argv_state.v, argv_state.nr);
strvec_clear(&argv_state);
return res;
}
int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
{
enum {
BISECT_RESET = 1,
BISECT_WRITE,
CHECK_AND_SET_TERMS,
BISECT_NEXT_CHECK,
BISECT_TERMS,
BISECT_START,
BISECT_AUTOSTART,
BISECT_NEXT,
BISECT_STATE,
BISECT_LOG,
BISECT_REPLAY,
BISECT_SKIP
BISECT_AUTO_NEXT,
BISECT_STATE
} cmdmode = 0;
int res = 0, nolog = 0;
struct option options[] = {
OPT_CMDMODE(0, "bisect-reset", &cmdmode,
N_("reset the bisection state"), BISECT_RESET),
OPT_CMDMODE(0, "bisect-write", &cmdmode,
N_("write out the bisection state in BISECT_LOG"), BISECT_WRITE),
OPT_CMDMODE(0, "check-and-set-terms", &cmdmode,
N_("check and set terms in a bisection state"), CHECK_AND_SET_TERMS),
OPT_CMDMODE(0, "bisect-next-check", &cmdmode,
N_("check whether bad or good terms exist"), BISECT_NEXT_CHECK),
OPT_CMDMODE(0, "bisect-terms", &cmdmode,
@ -1060,14 +934,10 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
N_("start the bisect session"), BISECT_START),
OPT_CMDMODE(0, "bisect-next", &cmdmode,
N_("find the next bisection commit"), BISECT_NEXT),
OPT_CMDMODE(0, "bisect-auto-next", &cmdmode,
N_("verify the next bisection state then checkout the next bisection commit"), BISECT_AUTO_NEXT),
OPT_CMDMODE(0, "bisect-state", &cmdmode,
N_("mark the state of ref (or refs)"), BISECT_STATE),
OPT_CMDMODE(0, "bisect-log", &cmdmode,
N_("list the bisection steps so far"), BISECT_LOG),
OPT_CMDMODE(0, "bisect-replay", &cmdmode,
N_("replay the bisection process from the given file"), BISECT_REPLAY),
OPT_CMDMODE(0, "bisect-skip", &cmdmode,
N_("skip some commits for checkout"), BISECT_SKIP),
OPT_BOOL(0, "no-log", &nolog,
N_("no log for BISECT_WRITE")),
OPT_END()
@ -1085,7 +955,18 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
case BISECT_RESET:
if (argc > 1)
return error(_("--bisect-reset requires either no argument or a commit"));
res = bisect_reset(argc ? argv[0] : NULL);
return !!bisect_reset(argc ? argv[0] : NULL);
case BISECT_WRITE:
if (argc != 4 && argc != 5)
return error(_("--bisect-write requires either 4 or 5 arguments"));
set_terms(&terms, argv[3], argv[2]);
res = bisect_write(argv[0], argv[1], &terms, nolog);
break;
case CHECK_AND_SET_TERMS:
if (argc != 3)
return error(_("--check-and-set-terms requires 3 arguments"));
set_terms(&terms, argv[2], argv[1]);
res = check_and_set_terms(&terms, argv[0]);
break;
case BISECT_NEXT_CHECK:
if (argc != 2 && argc != 3)
@ -1108,26 +989,17 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
get_terms(&terms);
res = bisect_next(&terms, prefix);
break;
case BISECT_AUTO_NEXT:
if (argc)
return error(_("--bisect-auto-next requires 0 arguments"));
get_terms(&terms);
res = bisect_auto_next(&terms, prefix);
break;
case BISECT_STATE:
set_terms(&terms, "bad", "good");
get_terms(&terms);
res = bisect_state(&terms, argv, argc);
break;
case BISECT_LOG:
if (argc)
return error(_("--bisect-log requires 0 arguments"));
res = bisect_log();
break;
case BISECT_REPLAY:
if (argc != 1)
return error(_("no logfile given"));
set_terms(&terms, "bad", "good");
res = bisect_replay(&terms, argv[0]);
break;
case BISECT_SKIP:
set_terms(&terms, "bad", "good");
res = bisect_skip(&terms, argv, argc);
break;
default:
BUG("unknown subcommand %d", cmdmode);
}

View File

@ -425,11 +425,13 @@ static void setup_default_color_by_age(void)
parse_color_fields("blue,12 month ago,white,1 month ago,red");
}
static void determine_line_heat(struct commit_info *ci, const char **dest_color)
static void determine_line_heat(struct blame_entry *ent, const char **dest_color)
{
int i = 0;
struct commit_info ci;
get_commit_info(ent->suspect->commit, &ci, 1);
while (i < colorfield_nr && ci->author_time > colorfield[i].hop)
while (i < colorfield_nr && ci.author_time > colorfield[i].hop)
i++;
*dest_color = colorfield[i].col;
@ -451,7 +453,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
cp = blame_nth_line(sb, ent->lno);
if (opt & OUTPUT_SHOW_AGE_WITH_COLOR) {
determine_line_heat(&ci, &default_color);
determine_line_heat(ent, &default_color);
color = default_color;
reset = GIT_COLOR_RESET;
}
@ -1149,7 +1151,7 @@ parse_done:
sb.xdl_opts = xdl_opts;
sb.no_whole_file_rename = no_whole_file_rename;
read_mailmap(&mailmap);
read_mailmap(&mailmap, NULL);
sb.found_guilty_entry = &found_guilty_entry;
sb.found_guilty_entry_data = &pi;

View File

@ -202,9 +202,6 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
int remote_branch = 0;
struct strbuf bname = STRBUF_INIT;
unsigned allowed_interpret;
struct string_list refs_to_delete = STRING_LIST_INIT_DUP;
struct string_list_item *item;
int branch_name_pos;
switch (kinds) {
case FILTER_REFS_REMOTES:
@ -222,7 +219,6 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
default:
die(_("cannot use -a with -d"));
}
branch_name_pos = strcspn(fmt, "%");
if (!force) {
head_rev = lookup_commit_reference(the_repository, &head_oid);
@ -269,35 +265,30 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
goto next;
}
item = string_list_append(&refs_to_delete, name);
item->util = xstrdup((flags & REF_ISBROKEN) ? "broken"
: (flags & REF_ISSYMREF) ? target
: find_unique_abbrev(&oid, DEFAULT_ABBREV));
if (delete_ref(NULL, name, is_null_oid(&oid) ? NULL : &oid,
REF_NO_DEREF)) {
error(remote_branch
? _("Error deleting remote-tracking branch '%s'")
: _("Error deleting branch '%s'"),
bname.buf);
ret = 1;
goto next;
}
if (!quiet) {
printf(remote_branch
? _("Deleted remote-tracking branch %s (was %s).\n")
: _("Deleted branch %s (was %s).\n"),
bname.buf,
(flags & REF_ISBROKEN) ? "broken"
: (flags & REF_ISSYMREF) ? target
: find_unique_abbrev(&oid, DEFAULT_ABBREV));
}
delete_branch_config(bname.buf);
next:
free(target);
}
if (delete_refs(NULL, &refs_to_delete, REF_NO_DEREF))
ret = 1;
for_each_string_list_item(item, &refs_to_delete) {
char *describe_ref = item->util;
char *name = item->string;
if (!ref_exists(name)) {
char *refname = name + branch_name_pos;
if (!quiet)
printf(remote_branch
? _("Deleted remote-tracking branch %s (was %s).\n")
: _("Deleted branch %s (was %s).\n"),
name + branch_name_pos, describe_ref);
delete_branch_config(refname);
}
free(describe_ref);
}
string_list_clear(&refs_to_delete, 0);
free(name);
strbuf_release(&bname);

View File

@ -47,7 +47,7 @@ int cmd_check_mailmap(int argc, const char **argv, const char *prefix)
if (argc == 0 && !use_stdin)
die(_("no contacts specified"));
read_mailmap(&mailmap);
read_mailmap(&mailmap, NULL);
for (i = 0; i < argc; ++i)
check_mailmap(&mailmap, argv[i]);

View File

@ -23,35 +23,22 @@ static struct checkout state = CHECKOUT_INIT;
static void write_tempfile_record(const char *name, const char *prefix)
{
int i;
int have_tempname = 0;
if (CHECKOUT_ALL == checkout_stage) {
for (i = 1; i < 4; i++)
if (topath[i][0]) {
have_tempname = 1;
break;
}
if (have_tempname) {
for (i = 1; i < 4; i++) {
if (i > 1)
putchar(' ');
if (topath[i][0])
fputs(topath[i], stdout);
else
putchar('.');
}
for (i = 1; i < 4; i++) {
if (i > 1)
putchar(' ');
if (topath[i][0])
fputs(topath[i], stdout);
else
putchar('.');
}
} else if (topath[checkout_stage][0]) {
have_tempname = 1;
} else
fputs(topath[checkout_stage], stdout);
}
if (have_tempname) {
putchar('\t');
write_name_quoted_relative(name, prefix, stdout,
nul_term_line ? '\0' : '\n');
}
putchar('\t');
write_name_quoted_relative(name, prefix, stdout,
nul_term_line ? '\0' : '\n');
for (i = 0; i < 4; i++) {
topath[i][0] = 0;

View File

@ -821,6 +821,9 @@ static int merge_working_tree(const struct checkout_opts *opts,
}
}
if (!active_cache_tree)
active_cache_tree = cache_tree();
if (!cache_tree_fully_valid(active_cache_tree))
cache_tree_update(&the_index, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);

View File

@ -623,7 +623,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
nr += chosen[i];
}
CALLOC_ARRAY(result, st_add(nr, 1));
result = xcalloc(st_add(nr, 1), sizeof(int));
for (i = 0; i < stuff->nr && j < nr; i++) {
if (chosen[i])
result[j++] = i;

View File

@ -250,6 +250,15 @@ static char *guess_dir_name(const char *repo, int is_bundle, int is_bare)
end--;
}
/*
* It should not be possible to overflow `ptrdiff_t` by passing in an
* insanely long URL, but GCC does not know that and will complain
* without this check.
*/
if (end - start < 0)
die(_("No directory name could be guessed.\n"
"Please specify a directory on the command line"));
/*
* Strip trailing port number if we've got only a
* hostname (that is, there is no dir separator but a
@ -420,13 +429,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
int src_len, dest_len;
struct dir_iterator *iter;
int iter_status;
unsigned int flags;
struct strbuf realpath = STRBUF_INIT;
mkdir_if_missing(dest->buf, 0777);
flags = DIR_ITERATOR_PEDANTIC | DIR_ITERATOR_FOLLOW_SYMLINKS;
iter = dir_iterator_begin(src->buf, flags);
iter = dir_iterator_begin(src->buf, DIR_ITERATOR_PEDANTIC);
if (!iter)
die_errno(_("failed to start iterator over '%s'"), src->buf);
@ -442,6 +449,10 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
strbuf_setlen(dest, dest_len);
strbuf_addstr(dest, iter->relative_path);
if (S_ISLNK(iter->st.st_mode))
die(_("symlink '%s' exists, refusing to clone with --local"),
iter->relative_path);
if (S_ISDIR(iter->st.st_mode)) {
mkdir_if_missing(dest->buf, 0777);
continue;
@ -979,8 +990,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
int err = 0, complete_refs_before_fetch = 1;
int submodule_progress;
struct transport_ls_refs_options transport_ls_refs_options =
TRANSPORT_LS_REFS_OPTIONS_INIT;
struct strvec ref_prefixes = STRVEC_INIT;
packet_trace_identity("clone");
@ -1200,10 +1210,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
refspec_appendf(&remote->fetch, "+%s*:%s*", src_ref_prefix,
branch_top.buf);
transport = transport_get(remote, remote->url[0]);
transport_set_verbosity(transport, option_verbosity, option_progress);
transport->family = family;
path = get_repo_path(remote->url[0], &is_bundle);
is_local = option_local != 0 && path && !is_bundle;
if (is_local) {
@ -1223,6 +1229,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
}
if (option_local > 0 && !is_local)
warning(_("--local is ignored"));
transport = transport_get(remote, path ? path : remote->url[0]);
transport_set_verbosity(transport, option_verbosity, option_progress);
transport->family = family;
transport->cloning = 1;
transport_set_option(transport, TRANS_OPT_KEEP, "yes");
@ -1258,17 +1268,14 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
transport->smart_options->check_self_contained_and_connected = 1;
strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
refspec_ref_prefixes(&remote->fetch,
&transport_ls_refs_options.ref_prefixes);
strvec_push(&ref_prefixes, "HEAD");
refspec_ref_prefixes(&remote->fetch, &ref_prefixes);
if (option_branch)
expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
option_branch);
expand_ref_prefix(&ref_prefixes, option_branch);
if (!option_no_tags)
strvec_push(&transport_ls_refs_options.ref_prefixes,
"refs/tags/");
strvec_push(&ref_prefixes, "refs/tags/");
refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
refs = transport_get_remote_refs(transport, &ref_prefixes);
if (refs) {
int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@ -1330,19 +1337,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
remote_head = NULL;
option_no_checkout = 1;
if (!option_bare) {
const char *branch;
char *ref;
if (transport_ls_refs_options.unborn_head_target &&
skip_prefix(transport_ls_refs_options.unborn_head_target,
"refs/heads/", &branch)) {
ref = transport_ls_refs_options.unborn_head_target;
transport_ls_refs_options.unborn_head_target = NULL;
create_symref("HEAD", ref, reflog_msg.buf);
} else {
branch = git_default_branch_name(0);
ref = xstrfmt("refs/heads/%s", branch);
}
const char *branch = git_default_branch_name(0);
char *ref = xstrfmt("refs/heads/%s", branch);
install_branch_config(0, branch, remote_name, ref);
free(ref);
@ -1395,7 +1391,6 @@ cleanup:
strbuf_release(&key);
junk_mode = JUNK_LEAVE_ALL;
strvec_clear(&transport_ls_refs_options.ref_prefixes);
free(transport_ls_refs_options.unborn_head_target);
strvec_clear(&ref_prefixes);
return err;
}

View File

@ -1039,7 +1039,7 @@ static const char *find_author_by_nickname(const char *name)
av[++ac] = NULL;
setup_revisions(ac, av, &revs, NULL);
revs.mailmap = &mailmap;
read_mailmap(revs.mailmap);
read_mailmap(revs.mailmap, NULL);
if (prepare_revision_walk(&revs))
die(_("revision walk setup failed"));

View File

@ -194,7 +194,7 @@ static int get_name(const char *path, const struct object_id *oid, int flag, voi
}
/* Is it annotated? */
if (!peel_iterated_oid(oid, &peeled)) {
if (!peel_ref(path, &peeled)) {
is_annotated = !oideq(oid, &peeled);
} else {
oidcpy(&peeled, oid);

Some files were not shown because too many files have changed in this diff Show More