cache-tree: skip some blob checks in partial clone

In a partial clone, whenever a sparse checkout occurs, the existence of
all blobs in the index is verified, whether they are included or
excluded by the .git/info/sparse-checkout specification. This
significantly degrades performance because a lazy fetch occurs whenever
the existence of a missing blob is checked.

This is because cache_tree_update() checks the existence of all objects
in the index, whether or not CE_SKIP_WORKTREE is set on them. Teach
cache_tree_update() to skip checking CE_SKIP_WORKTREE objects when the
repository is a partial clone. This improves performance for sparse
checkout and for other operations that use cache_tree_update().
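
The rule described above can be modeled in isolation. The following is
only a sketch, not code from this commit: struct entry_state,
entry_missing_ok() and count_existence_checks() are hypothetical names
invented for illustration. They stand in for the cache entry flags and
the has_object_file() call sites in update_one().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-entry state update_one() consults. */
struct entry_state {
	bool is_gitlink;    /* models mode == S_IFGITLINK */
	bool skip_worktree; /* models CE_SKIP_WORKTREE being set */
};

/*
 * Return true when the existence check may be skipped for this entry.
 * Mirrors the condition described in the message: gitlinks and
 * missing_ok callers were already exempt; skip-worktree entries are
 * exempt only when the repository is a partial clone.
 */
static bool entry_missing_ok(const struct entry_state *e,
			     bool missing_ok, bool partial_clone)
{
	return e->is_gitlink || missing_ok ||
		(partial_clone && e->skip_worktree);
}

/*
 * Count how many entries would still need an object lookup. In a
 * partial clone, each such lookup on a missing blob can trigger a
 * lazy fetch, so a lower count means fewer potential fetches.
 */
static int count_existence_checks(const struct entry_state *entries, int n,
				  bool missing_ok, bool partial_clone)
{
	int checks = 0;
	int i;

	for (i = 0; i < n; i++)
		if (!entry_missing_ok(&entries[i], missing_ok, partial_clone))
			checks++;
	return checks;
}
```

In this model, an index holding one ordinary entry, one skip-worktree
entry and one gitlink needs two lookups in a regular repository and
one in a partial clone.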

Instead of completely removing the check, an argument could be made that
the check should instead be replaced by a check that the blob is
promised, but for performance reasons, I decided not to do this.
A user who needs to verify the repository can run fsck, which reports
any tree that points to a missing and non-promised blob, whether the
blob is included or excluded by the sparse-checkout specification.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan
2018-10-09 11:40:37 -07:00
committed by Junio C Hamano
parent 2efbb7f521
commit 2f215ff10b
2 changed files with 38 additions and 1 deletions

@@ -326,6 +326,7 @@ static int update_one(struct cache_tree *it,
 	unsigned mode;
 	int expected_missing = 0;
 	int contains_ita = 0;
+	int ce_missing_ok;
 
 	path = ce->name;
 	pathlen = ce_namelen(ce);
@@ -355,8 +356,11 @@ static int update_one(struct cache_tree *it,
 			i++;
 		}
 
+		ce_missing_ok = mode == S_IFGITLINK || missing_ok ||
+			(repository_format_partial_clone &&
+			 ce_skip_worktree(ce));
 		if (is_null_oid(oid) ||
-		    (mode != S_IFGITLINK && !missing_ok && !has_object_file(oid))) {
+		    (!ce_missing_ok && !has_object_file(oid))) {
 			strbuf_release(&buffer);
 			if (expected_missing)
 				return -1;