The code to fsck objects received across multiple packs during a
single git fetch session has been broken when the packfile URI
feature was in use. A workaround has been added by disabling the
codepath to avoid keeping a packfile that is too small.
* jt/transfer-fsck-across-packs-fix:
fetch-pack: do not mix --pack_header and packfile uri
When fetching (as opposed to cloning) from a repository with packfile
URIs enabled, an error like this may occur:
fatal: pack has bad object at offset 12: unknown object type 5
fatal: finish_http_pack_request gave result -1
fatal: fetch-pack: expected keep then TAB at start of http-fetch output
This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs,
use index-pack arg", 2021-02-22), when the index-pack args used when
processing the inline packfile of a fetch response and when processing
packfile URIs were unified.
This bug happens because fetch, by default, partially reads (and
consumes) the header of the inline packfile to determine if it should
store the downloaded objects as a packfile or loose objects, and thus
passes --pack_header=<...> to index-pack to inform it that some bytes
are missing. However, when it subsequently fetches the additional
packfiles linked by URIs, it reuses the same index-pack arguments, thus
wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are
missing.
This does not happen when cloning because "git clone" always passes
do_keep, which instructs the fetch mechanism to always retain the
packfile, eliminating the need to read the header.
There are a few ways to fix this, including filtering out pack_header
arguments when downloading the additional packfiles, but I decided to
stick to always using index-pack throughout when packfile URIs are
present - thus, Git no longer needs to read the bytes, and no longer
needs --pack_header here.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were some early changes in the 2.31 cycle to optimize some setup
in diffcore-rename.c[1], some later changes to measure performance[2],
and finally some significant changes to improve rename detection
performance. The final one was merged with the note
Performance optimization work on the rename detection continues.
That works for the commit log, but feels misleading as a release note
since all the changes were within one cycle. Simplify this to just
Performance improvements for rename detection.
The former wording could be seen as hinting that more performance
improvements will come in 2.32, which is true, but we can just cover
those in the 2.32 release notes when the time comes.
[1] a5ac31b5b1 (Merge branch 'en/diffcore-rename', 2021-01-25)
[2] d3a035b055 (Merge branch 'en/merge-ort-perf', 2021-02-11)
[3] 12bd17521c (Merge branch 'en/diffcore-rename', 2021-03-01)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Work around platforms whose open() is reported to return EINTR (it
shouldn't, as we do our signals with SA_RESTART).
* jk/open-returns-eintr:
config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur
Makefile: add OPEN_RETURNS_EINTR knob
This commit causes breakage on macOS, or in fact any platform using
older versions of Tcl. Revert it.
* py/revert-commit-comments:
Revert "git-gui: remove lines starting with the comment character"
This reverts commit b9a43869c9.
This commit causes breakage on macOS (10.13). It causes errors on
startup and completely breaks the commit functionality. There are two
main problems. First, it uses `string cat` which is not supported on
older Tcl versions. Second, it does a half close of the bidirectional
pipe to git-stripspace which is also not supported on older Tcl
versions.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
We've had mixed reports on whether the latest release of macOS needs
this Makefile knob set. In most reported cases, there's antivirus
software running (which one might imagine could cause an open() call to
be delayed). However, one of the (off-list) reports I've gotten
indicated that it happened on an otherwise clean install of Big Sur.
Since the symptom is so bad (checkout randomly fails to write several
fails when the progress meter kicks in), and since the workaround is so
lightweight (if we don't see EINTR, it's just an extra conditional
check), let's just turn it on by default.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On some platforms, open() reportedly returns EINTR when opening regular
files and we receive a signal (usually SIGALRM from our progress meter).
This shouldn't happen, as open() should be a restartable syscall, and we
specify SA_RESTART when setting up the alarm handler. So it may actually
be a kernel or libc bug for this to happen. But it has been reported on
at least one version of Linux (on a network filesystem):
https://lore.kernel.org/git/c8061cce-71e4-17bd-a56a-a5fed93804da@neanderfunk.de/
as well as on macOS starting with Big Sur even on a regular filesystem.
We can work around it by retrying open() calls that get EINTR, just as
we do for read(), etc. Since we don't ever _want_ to interrupt an open()
call, we can get away with just redefining open, rather than insisting
all callsites use xopen().
We actually do have an xopen() wrapper already (and it even does this
retry, though there's no indication of it being an observed problem back
then; it seems simply to have been lifted from xread(), etc). But it is
used hardly anywhere, and isn't suitable for general use because it will
die() on error. In theory we could combine the two, but it's awkward to
do so because of the variable-args interface of open().
This patch adds a Makefile knob for enabling the workaround. It's not
enabled by default for any platforms in config.mak.uname yet, as we
don't have enough data to decide how common this is (I have not been
able to reproduce on either Linux or Big Sur myself). It may be worth
enabling preemptively anyway, since the cost is pretty low (if we don't
see an EINTR, it's just an extra conditional).
However, note that we must not enable this on Windows. It doesn't do
anything there, and the macro overrides the existing mingw_open()
redirection. I've added a preemptive #undef here in the mingw header
(which is processed first) to just quietly disable it (we could also
make it an #error, but there is little point in being so aggressive).
Reported-by: Aleksey Kliger <alklig@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.22:
Git 2.22.5
Git 2.21.4
Git 2.20.5
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.21:
Git 2.21.4
Git 2.20.5
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.20:
Git 2.20.5
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.19:
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.18:
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.17:
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
In the previous commit, we intercepted calls to `rmdir()` to invalidate
the lstat cache in the successful case, so that the lstat cache could
not have the idea that a directory exists where there is none.
The same situation can arise, of course, when a separate process is
spawned (most notably, this is the case in `submodule_move_head()`).
Obviously, we cannot know whether a directory was removed in that
process, therefore we must invalidate the lstat cache afterwards.
Note: in contrast to `lstat_cache_aware_rmdir()`, we invalidate the
lstat cache even in case of an error: the process might have removed a
directory and still have failed afterwards.
Co-authored-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Before checking out a file, we have to confirm that all of its leading
components are real existing directories. And to reduce the number of
lstat() calls in this process, we cache the last leading path known to
contain only directories. However, when a path collision occurs (e.g.
when checking out case-sensitive files in case-insensitive file
systems), a cached path might have its file type changed on disk,
leaving the cache on an invalid state. Normally, this doesn't bring
any bad consequences as we usually check out files in index order, and
therefore, by the time the cached path becomes outdated, we no longer
need it anyway (because all files in that directory would have already
been written).
But, there are some users of the checkout machinery that do not always
follow the index order. In particular: checkout-index writes the paths
in the same order that they appear on the CLI (or stdin); and the
delayed checkout feature -- used when a long-running filter process
replies with "status=delayed" -- postpones the checkout of some entries,
thus modifying the checkout order.
When we have to check out an out-of-order entry and the lstat() cache is
invalid (due to a previous path collision), checkout_entry() may end up
using the invalid data and thrusting that the leading components are
real directories when, in reality, they are not. In the best case
scenario, where the directory was replaced by a regular file, the user
will get an error: "fatal: unable to create file 'foo/bar': Not a
directory". But if the directory was replaced by a symlink, checkout
could actually end up following the symlink and writing the file at a
wrong place, even outside the repository. Since delayed checkout is
affected by this bug, it could be used by an attacker to write
arbitrary files during the clone of a maliciously crafted repository.
Some candidate solutions considered were to disable the lstat() cache
during unordered checkouts or sort the entries before passing them to
the checkout machinery. But both ideas include some performance penalty
and they don't future-proof the code against new unordered use cases.
Instead, we now manually reset the lstat cache whenever we successfully
remove a directory. Note: We are not even checking whether the directory
was the same as the lstat cache points to because we might face a
scenario where the paths refer to the same location but differ due to
case folding, precomposed UTF-8 issues, or the presence of `..`
components in the path. Two regression tests, with case-collisions and
utf8-collisions, are also added for both checkout-index and delayed
checkout.
Note: to make the previously mentioned clone attack unfeasible, it would
be sufficient to reset the lstat cache only after the remove_subtree()
call inside checkout_entry(). This is the place where we would remove a
directory whose path collides with the path of another entry that we are
currently trying to check out (possibly a symlink). However, in the
interest of a thorough fix that does not leave Git open to
similar-but-not-identical attack vectors, we decided to intercept
all `rmdir()` calls in one fell swoop.
This addresses CVE-2021-21300.
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
2021-02-12 15:47:02 +01:00
33 changed files with 368 additions and 29 deletions
@ -417,6 +417,9 @@ static int check_updates(struct unpack_trees_options *o,
progress=get_progress(o,index);
/* Start with clean cache to avoid using any possibly outdated info. */
invalidate_lstat_cache();
git_attr_set_direction(GIT_ATTR_CHECKOUT);
if(should_update_submodules())
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.