diff: restrict when prefetching occurs

Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08)
optimized "diff" by prefetching blobs in a partial clone, but there are
some cases wherein blobs do not need to be prefetched. In these cases,
any command that uses the diff machinery will unnecessarily fetch blobs.

diffcore_std() may read blobs when it calls the following functions:
 (1) diffcore_skip_stat_unmatch() (controlled by the config variable
     diff.autorefreshindex)
 (2) diffcore_break() and diffcore_merge_broken() (for break-rewrite
     detection)
 (3) diffcore_rename() (for rename detection)
 (4) diffcore_pickaxe() (for detecting addition/deletion of a specified
     string)

Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(),
diffcore_break(), and diffcore_rename() to prefetch blobs upon the first
read of a missing object. This covers (1), (2), and (3). To cover the
rest, teach diffcore_std() to prefetch if the output type is one that
includes blob data (and hence blob data will be required later anyway),
or if it knows that (4) will be run.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan
2020-04-07 15:11:43 -07:00
committed by Junio C Hamano
parent c14b6f83ec
commit 95acf11a3d
5 changed files with 181 additions and 28 deletions

@@ -131,4 +131,52 @@ test_expect_success 'diff with rename detection batches blobs' '
	test_line_count = 1 done_lines
'

test_expect_success 'diff does not fetch anything if inexact rename detection is not needed' '
	test_when_finished "rm -rf server client trace" &&

	test_create_repo server &&
	echo a >server/a &&
	printf "b\nb\nb\nb\nb\n" >server/b &&
	git -C server add a b &&
	git -C server commit -m x &&
	mv server/b server/c &&
	git -C server add c &&
	git -C server commit -a -m x &&

	test_config -C server uploadpack.allowfilter 1 &&
	test_config -C server uploadpack.allowanysha1inwant 1 &&
	git clone --bare --filter=blob:limit=0 "file://$(pwd)/server" client &&

	# Ensure no fetches.
	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --raw -M HEAD^ HEAD &&
	! test_path_exists trace
'

test_expect_success 'diff --break-rewrites fetches only if necessary, and batches blobs if it does' '
	test_when_finished "rm -rf server client trace" &&

	test_create_repo server &&
	echo a >server/a &&
	printf "b\nb\nb\nb\nb\n" >server/b &&
	git -C server add a b &&
	git -C server commit -m x &&
	printf "c\nc\nc\nc\nc\n" >server/b &&
	git -C server commit -a -m x &&

	test_config -C server uploadpack.allowfilter 1 &&
	test_config -C server uploadpack.allowanysha1inwant 1 &&
	git clone --bare --filter=blob:limit=0 "file://$(pwd)/server" client &&

	# Ensure no fetches.
	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --raw -M HEAD^ HEAD &&
	! test_path_exists trace &&

	# But with --break-rewrites, ensure that there is exactly 1 negotiation
	# by checking that there is only 1 "done" line sent. ("done" marks the
	# end of negotiation.)
	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --break-rewrites --raw -M HEAD^ HEAD &&
	grep "git> done" trace >done_lines &&
	test_line_count = 1 done_lines
'

test_done
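
The tests above use two checks on the packet trace: no trace file at all means nothing was fetched, and exactly one "git> done" line means the missing blobs were fetched in a single batch. As a rough sketch only, both checks could be folded into one helper. The function name count_negotiations is hypothetical and is not part of Git's test library; it assumes its argument is the path given to GIT_TRACE_PACKET.

```shell
# Hypothetical helper (not part of Git's test-lib): print how many
# "done" lines the client sent, i.e. how many negotiations took place.
# A missing trace file means no packets were exchanged, so print 0.
count_negotiations () {
	trace_file=$1
	if ! test -e "$trace_file"
	then
		echo 0
		return 0
	fi
	# grep -c prints the count but exits non-zero when the count is 0;
	# "|| :" keeps that case from looking like a failure.
	grep -c "git> done" "$trace_file" || :
}
```

Under these assumptions, a test body could compare the result directly, for example `test "$(count_negotiations trace)" = 1` for the batched case and `= 0` for the no-fetch case.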