sparse-checkout: respect core.ignoreCase in cone mode

When a user uses the sparse-checkout feature in cone mode, they
add patterns using "git sparse-checkout set <dir1> <dir2> ..."
or by using "--stdin" to provide the directories line-by-line over
stdin. This behaviour naturally looks a lot like the way a user
would type "git add <dir1> <dir2> ..."

If core.ignoreCase is enabled, then "git add" will match the input
using a case-insensitive match. Do the same for the sparse-checkout
feature.

Perform case-insensitive checks while updating the skip-worktree
bits during unpack_trees(). This is done by changing the hash
algorithm and hashmap comparison methods to optionally use case-
insensitive methods.

When this is enabled, there is a small performance cost in the
hashing algorithm. To tease out the worst possible case, the
following was run on a repo with a deep directory structure:

	git ls-tree -d -r --name-only HEAD |
		git sparse-checkout set --stdin

The 'set' command was timed with core.ignoreCase disabled or
enabled. For the repo with a deep history, the numbers were

	core.ignoreCase=false: 62s
	core.ignoreCase=true:  74s (+19.3%)

For reproducibility, the equivalent test on the Linux kernel
repository had these numbers:

	core.ignoreCase=false: 3.1s
	core.ignoreCase=true:  3.6s (+16%)

Now, this is not an entirely fair comparison, as most users
will define their sparse cone using more shallow directories,
and the performance improvement from eb42feca97 ("unpack-trees:
hash less in cone mode" 2019-11-21) can remove most of the
hash cost. For a more realistic test, drop the "-r" from the
ls-tree command to store only the first-level directories.
In that case, the Linux kernel repository takes 0.2-0.25s in
each case, and the deep repository takes one second, plus or
minus 0.05s, in each case.

Thus, we _can_ demonstrate a cost to this change, but it is
unlikely to matter to any reasonable sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Derrick Stolee
2019-12-13 18:09:53 +00:00
committed by Junio C Hamano
parent cff4e9138d
commit 190a65f9db
4 changed files with 42 additions and 5 deletions

View File

@ -313,7 +313,10 @@ static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *pat
struct pattern_entry *e = xmalloc(sizeof(*e));
e->patternlen = path->len;
e->pattern = strbuf_detach(path, NULL);
hashmap_entry_init(&e->ent, memhash(e->pattern, e->patternlen));
hashmap_entry_init(&e->ent,
ignore_case ?
strihash(e->pattern) :
strhash(e->pattern));
hashmap_add(&pl->recursive_hashmap, &e->ent);
@ -329,7 +332,10 @@ static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *pat
e = xmalloc(sizeof(struct pattern_entry));
e->patternlen = newlen;
e->pattern = xstrndup(oldpattern, newlen);
hashmap_entry_init(&e->ent, memhash(e->pattern, e->patternlen));
hashmap_entry_init(&e->ent,
ignore_case ?
strihash(e->pattern) :
strhash(e->pattern));
if (!hashmap_get_entry(&pl->parent_hashmap, e, ent, NULL))
hashmap_add(&pl->parent_hashmap, &e->ent);