From 524c0183c999c59940ce1a8712b78e4dbd87ae60 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt
Date: Wed, 12 Jun 2024 10:03:23 +0200
Subject: [PATCH 1/3] config: fix segfault when parsing "core.abbrev" without repo

The "core.abbrev" config allows the user to specify the minimum length
when abbreviating object hashes. Next to the values "auto" and "no",
this config also accepts a concrete length that needs to be bigger or
equal to the minimum length and smaller or equal to the hash algorithm's
hex length. While the former condition is trivial, the latter depends on
the object format used by the current repository. It is thus a variable
upper boundary that may either be 40 (SHA-1) or 64 (SHA-256).

This has two major downsides. First, users who specify this config must
be aware of the object hashes that their repositories use. If they want
to configure the value globally, then they cannot pick any value in the
range `[41, 64]` if they have any repository that uses SHA-1. If they
did, Git would error out when parsing the config.

Second, and more importantly, parsing "core.abbrev" crashes when outside
of a Git repository because we dereference `the_hash_algo` to figure out
its hex length. Starting with c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) though, we stopped initializing
`the_hash_algo` outside of Git repositories.

Fix both of these issues by not making it an error anymore when the
given length exceeds the hash length. Instead, leave the abbreviated
length intact. `repo_find_unique_abbrev_r()` handles this just fine
except for a performance penalty which we will fix in a subsequent
commit.

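As a rough illustration of the parsing rule described above, the
following standalone sketch accepts "auto", a negative boolean, or a
number, and rejects only numbers below the minimum. It is not Git's
code: `sketch_parse_abbrev()` and the two `SKETCH_*` constants are
hypothetical names, the minimum of 4 and the maximum of 64 are assumed
values, and the boolean handling is simplified.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <strings.h>

/* Hypothetical stand-ins for Git's constants; the values are assumptions. */
#define SKETCH_MINIMUM_ABBREV 4
#define SKETCH_MAX_HEXSZ 64	/* hex length of the largest hash (SHA-256) */

/*
 * Parse a "core.abbrev"-like value without consulting any repository.
 * Returns 0 and stores the length in *out, or -1 if the value is not a
 * number or is below the minimum. Lengths above the hash's hex size are
 * accepted unchanged.
 */
int sketch_parse_abbrev(const char *value, int *out)
{
	char *end;
	long n;

	assert(value && out);

	if (!strcasecmp(value, "auto")) {
		*out = -1;	/* caller picks a length automatically */
		return 0;
	}
	if (!strcasecmp(value, "no") || !strcasecmp(value, "false") ||
	    !strcasecmp(value, "off")) {
		*out = SKETCH_MAX_HEXSZ;	/* never abbreviate */
		return 0;
	}

	errno = 0;
	n = strtol(value, &end, 10);
	if (errno || end == value || *end != '\0' || n > INT_MAX)
		return -1;	/* not a usable number */
	if (n < SKETCH_MINIMUM_ABBREV)
		return -1;	/* only the lower bound is an error */

	*out = (int)n;	/* may exceed the hex size; resolved later */
	return 0;
}
```

Because no repository state is read, a call such as
`sketch_parse_abbrev("9000", &len)` succeeds and stores 9000 whether or
not a repository has been discovered.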
Reported-by: Kyle Lippincott
Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
---
 config.c         |  4 ++--
 t/t4202-log.sh   | 12 ++++++++++++
 t/t5601-clone.sh |  7 +++++++
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/config.c b/config.c
index 14461312b3..d0e9396bda 100644
--- a/config.c
+++ b/config.c
@@ -1456,10 +1456,10 @@ static int git_default_core_config(const char *var, const char *value,
 		if (!strcasecmp(value, "auto"))
 			default_abbrev = -1;
 		else if (!git_parse_maybe_bool_text(value))
-			default_abbrev = the_hash_algo->hexsz;
+			default_abbrev = GIT_MAX_HEXSZ;
 		else {
 			int abbrev = git_config_int(var, value, ctx->kvi);
-			if (abbrev < minimum_abbrev || abbrev > the_hash_algo->hexsz)
+			if (abbrev < minimum_abbrev)
 				return error(_("abbrev length out of range: %d"), abbrev);
 			default_abbrev = abbrev;
 		}
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 86c695eb0a..e97826458c 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1237,6 +1237,18 @@ test_expect_success 'log.abbrevCommit configuration' '
 	test_cmp expect.whatchanged.full actual
 '
 
+test_expect_success '--abbrev-commit with core.abbrev=false' '
+	git log --no-abbrev >expect &&
+	git -c core.abbrev=false log --abbrev-commit >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--abbrev-commit with core.abbrev=9000' '
+	git log --no-abbrev >expect &&
+	git -c core.abbrev=9000 log --abbrev-commit >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'show added path under "--follow -M"' '
 	# This tests for a regression introduced in v1.7.2-rc0~103^2~2
 	test_create_repo regression &&
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index cc0b953f14..5d7ea147f1 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -46,6 +46,13 @@ test_expect_success 'output from clone' '
 	test $(grep Clon output | wc -l) = 1
 '
 
+test_expect_success 'output from clone with core.abbrev does not crash' '
+	rm -fr dst &&
+	echo "Cloning into ${SQ}dst${SQ}..." >expect &&
+	git -c core.abbrev=12 clone -n "file://$(pwd)/src" dst >actual 2>&1 &&
+	test_cmp expect actual
+'
+
 test_expect_success 'clone does not keep pack' '
 
 	rm -fr dst &&

From 59ff92c516be0a2b0292acea869c6ac79f8bae5c Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt
Date: Wed, 12 Jun 2024 10:03:31 +0200
Subject: [PATCH 2/3] parse-options-cb: stop clamping "--abbrev=" to hash length

The `OPT__ABBREV()` option allows the user to specify the length that
object hashes shall be abbreviated to. This length needs to be in the
range `[MINIMUM_ABBREV, the_hash_algo->hexsz]`, which is why we clamp
the value as required. While this makes sense in the case of
`MINIMUM_ABBREV`, it is unnecessary for the upper boundary as the value
is eventually passed down to `repo_find_unique_abbrev_r()`, which
handles values larger than the current hash length just fine.

In the preceding commit, we have changed parsing of the "core.abbrev"
config to stop enforcing the upper boundary. Let's do the same here so
that the code becomes simpler, we are consistent with how we treat the
"core.abbrev" config and so that we stop depending on `the_repository`.

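The clamping that remains after this change can be pictured with the
small sketch below. It is only an approximation of the callback's
arithmetic: `sketch_clamp_abbrev()` is a hypothetical helper, and the
minimum of 4 is an assumed value.

```c
/* Hypothetical stand-in for MINIMUM_ABBREV; the value is an assumption. */
#define SKETCH_MINIMUM_ABBREV 4

/*
 * Raise too-small non-zero lengths to the minimum. Zero (meaning "do
 * not abbreviate") and any larger value, including one beyond the hash
 * length, pass through unchanged.
 */
int sketch_clamp_abbrev(int v)
{
	if (v && v < SKETCH_MINIMUM_ABBREV)
		return SKETCH_MINIMUM_ABBREV;
	return v;
}
```

With this shape, `sketch_clamp_abbrev(2)` yields the minimum while
`sketch_clamp_abbrev(9000)` is returned as-is, leaving the upper bound
to whichever code eventually computes the abbreviation.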
Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
---
 parse-options-cb.c |  2 --
 t/t4202-log.sh     | 12 ++++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/parse-options-cb.c b/parse-options-cb.c
index d99d688d3c..b2aa62a9dc 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -30,8 +30,6 @@ int parse_opt_abbrev_cb(const struct option *opt, const char *arg, int unset)
 				     opt->long_name);
 		if (v && v < MINIMUM_ABBREV)
 			v = MINIMUM_ABBREV;
-		else if (startup_info->have_repository && v > the_hash_algo->hexsz)
-			v = the_hash_algo->hexsz;
 	}
 	*(int *)(opt->value) = v;
 	return 0;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index e97826458c..51f7beb59f 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1243,12 +1243,24 @@ test_expect_success '--abbrev-commit with core.abbrev=false' '
 	test_cmp expect actual
 '
 
+test_expect_success '--abbrev-commit with --no-abbrev' '
+	git log --no-abbrev >expect &&
+	git log --abbrev-commit --no-abbrev >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success '--abbrev-commit with core.abbrev=9000' '
 	git log --no-abbrev >expect &&
 	git -c core.abbrev=9000 log --abbrev-commit >actual &&
 	test_cmp expect actual
 '
 
+test_expect_success '--abbrev-commit with --abbrev=9000' '
+	git log --no-abbrev >expect &&
+	git log --abbrev-commit --abbrev=9000 >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'show added path under "--follow -M"' '
 	# This tests for a regression introduced in v1.7.2-rc0~103^2~2
 	test_create_repo regression &&

From 037df60013bb3f1534d4db6bf850d7547f1c1d13 Mon Sep 17 00:00:00 2001
From: Patrick Steinhardt
Date: Wed, 12 Jun 2024 10:03:36 +0200
Subject: [PATCH 3/3] object-name: don't try to abbreviate to lengths greater than hexsz
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When given a length that equals the current hash algorithm's hex size,
then `repo_find_unique_abbrev_r()` exits early without trying to find an
abbreviation. This is only sensible because there is nothing to
abbreviate in the first place, so searching through objects to find a
unique prefix would be a waste of compute.

What we don't handle though is the case where the user passes a length
greater than the hash length. This is fine in practice as we still
compute the correct result. But at the very least, this is a waste of
resources as we try to abbreviate a value that cannot be abbreviated,
which causes us to hit the object database.

Start to explicitly handle values larger than hexsz to avoid this
performance penalty, which leads to a measurable speedup. The following
benchmark has been executed in linux.git:

    Benchmark 1: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD~)
      Time (mean ± σ):     12.812 s ±  0.040 s    [User: 12.225 s, System: 0.554 s]
      Range (min … max):   12.723 s … 12.857 s    10 runs

    Benchmark 2: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD)
      Time (mean ± σ):     11.095 s ±  0.029 s    [User: 10.546 s, System: 0.521 s]
      Range (min … max):   11.037 s … 11.122 s    10 runs

    Summary
      git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD) ran
        1.15 ± 0.00 times faster than git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD~)

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
---
 object-name.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/object-name.c b/object-name.c
index 523af6f64f..1be2ad1a16 100644
--- a/object-name.c
+++ b/object-name.c
@@ -837,7 +837,7 @@ int repo_find_unique_abbrev_r(struct repository *r, char *hex,
 	}
 
 	oid_to_hex_r(hex, oid);
-	if (len == hexsz || !len)
+	if (len >= hexsz || !len)
 		return hexsz;
 
 	mad.repo = r;
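
To make the effect of the `len >= hexsz` check in this last hunk
concrete, here is a toy model of the early exit. It is a sketch under
stated assumptions, not the real function: `sketch_abbrev_len()` and
`struct sketch_stats` are hypothetical, the object database search is
replaced by a counter, `len` is assumed to be non-negative, and the
model ignores that the real search may return a longer prefix to keep it
unique.

```c
#include <assert.h>

/* Counts how often the (expensive) unique-prefix search would run. */
struct sketch_stats {
	int lookup_calls;
};

/*
 * Return the length the hash would be printed with. Requests for the
 * full length or more, and the special value 0, skip the search.
 */
int sketch_abbrev_len(int len, int hexsz, struct sketch_stats *st)
{
	assert(st && len >= 0 && hexsz > 0);

	if (len >= hexsz || !len)
		return hexsz;	/* nothing to abbreviate */

	st->lookup_calls++;	/* stand-in for the object database search */
	return len;
}
```

With `hexsz` set to 40, a request for 9000 characters returns 40 and
leaves `lookup_calls` untouched, whereas the old `len == hexsz`
comparison would have fallen through to the search.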