From 866b43e6442f16d0073ae9ce8d79b6cb1161b1a9 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Thu, 29 Jun 2023 13:23:08 +0000
Subject: [PATCH 1/3] do_read_index(): always mark index as initialized unless erroring out

In 913e0e99b6a (unpack_trees(): protect the handcrafted in-core index
from read_cache(), 2008-08-23) a flag was introduced into the
`index_state` structure to indicate whether it had been initialized (or
more correctly: read and parsed).

There was one code path that was not handled, though: when the index
file does not yet exist (but the `must_exist` parameter is set to 0 to
indicate that that's okay). In this instance, Git wants to go forward
with a new, pristine Git index, almost as if the file had existed and
contained no index entries or extensions.

Since Git wants to handle this situation the same as if an "empty" Git
index file existed, let's set the `initialized` flag also in that case.

This is necessary to prepare for fixing the bug where the condition
`cache_nr == 0` is incorrectly used as an indicator that the index has
not yet been read, and the `initialized` flag needs to be checked
instead.

Signed-off-by: Johannes Schindelin
Signed-off-by: Junio C Hamano
---
 read-cache.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/read-cache.c b/read-cache.c
index 35e5657877..1ac4defff3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2330,6 +2330,7 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	if (fd < 0) {
 		if (!must_exist && errno == ENOENT) {
 			set_new_index_sparsity(istate);
+			istate->initialized = 1;
 			return 0;
 		}
 		die_errno(_("%s: index file open failed"), path);

From 7667f4f0a3c2002940c0b03930597fddc8599277 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Thu, 29 Jun 2023 13:23:09 +0000
Subject: [PATCH 2/3] split-index: accept that a base index can be empty

We are about to fix an ancient bug where `do_read_index()` pretended
that the index was not initialized when there are no index entries.

Before the `index_state` structure gained the `initialized` flag in
913e0e99b6a (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), that was the best we could do (even if it was
incorrect: it is totally possible to read a Git index file that contains
no index entries).

This pattern was repeated also in 998330ac2e7 (read-cache: look for
shared index files next to the index, too, 2021-08-26), which we fix
here by _not_ mistaking an empty base index for a missing
`sharedindex.*` file.

Signed-off-by: Johannes Schindelin
Signed-off-by: Junio C Hamano
---
 read-cache.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 1ac4defff3..d0a3c9082b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2500,12 +2500,14 @@ int read_index_from(struct index_state *istate, const char *path,
 
 	base_oid_hex = oid_to_hex(&split_index->base_oid);
 	base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_oid_hex);
-	trace2_region_enter_printf("index", "shared/do_read_index",
-				   the_repository, "%s", base_path);
-	ret = do_read_index(split_index->base, base_path, 0);
-	trace2_region_leave_printf("index", "shared/do_read_index",
-				   the_repository, "%s", base_path);
-	if (!ret) {
+	if (file_exists(base_path)) {
+		trace2_region_enter_printf("index", "shared/do_read_index",
+					the_repository, "%s", base_path);
+
+		ret = do_read_index(split_index->base, base_path, 0);
+		trace2_region_leave_printf("index", "shared/do_read_index",
+					the_repository, "%s", base_path);
+	} else {
 		char *path_copy = xstrdup(path);
 		char *base_path2 = xstrfmt("%s/sharedindex.%s",
 					   dirname(path_copy), base_oid_hex);

From 2ee045eea103e8818ffe0c4085fad3f6b535c8d6 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Thu, 29 Jun 2023 13:23:10 +0000
Subject: [PATCH 3/3] commit -a -m: allow the top-level tree to become empty again

In 03267e8656c (commit: discard partial cache before (re-)reading it,
2022-11-08), a memory leak was plugged by
discarding any partial index before re-reading it.

The problem with this memory leak fix is that it was based on an
incomplete understanding of the logic introduced in 7168624c353 (Do not
generate full commit log message if it is not going to be used,
2007-11-28).

That logic was introduced to add a shortcut when committing without
editing the commit message interactively. A part of that logic was to
ensure that the index was read into memory:

	if (!active_nr && read_cache() < 0)
		die(...)

Translation to English: If the index has not yet been read, read it, and
if that fails, error out.

That logic was incorrect, though: It used `!active_nr` as an indicator
that the index was not yet read. Usually this is not a problem because
in the vast majority of instances, the index contains at least one
entry.

And it was natural to do it this way because at the time that condition
was introduced, the `index_state` structure had no explicit flag to
indicate that it was initialized: This flag was only introduced in
913e0e99b6a (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), but that commit did not adjust the code path
where no index file was found and a new, pristine index was initialized.

Now, when the index does not contain any entry (which is quite common
in Git's test suite because it starts quite a few repositories from
scratch), subsequent calls to `do_read_index()` will mistakenly treat
the index as uninitialized, and read it again unnecessarily.

This is a problem because after initializing the empty index, e.g. the
`cache_tree` in that index could have been initialized before a
subsequent call to `do_read_index()` wants to ensure an initialized
index. And if that subsequent call mistakenly treats the index as
uninitialized, it would lead to leaked memory.

The correct fix for that memory leak is to adjust the condition so that
it does not mistake `active_nr == 0` to mean that the index has not yet
been read.
Using the `initialized` flag instead, we avoid that mistake, and as a
bonus we can fix a bug at the same time that was introduced by the
memory leak fix: When deleting all tracked files and then asking `git
commit -a -m ...` to commit the result, Git would internally update the
index, then discard and re-read the index undoing the update, and fail
to commit anything.

This fixes https://github.com/git-for-windows/git/issues/4462

Signed-off-by: Johannes Schindelin
Signed-off-by: Junio C Hamano
---
 builtin/commit.c      |  7 ++-----
 t/t2200-add-update.sh | 11 +++++++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 985a0445b7..d7ccfa0bfa 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -991,11 +991,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	struct object_id oid;
 	const char *parent = "HEAD";
 
-	if (!the_index.cache_nr) {
-		discard_index(&the_index);
-		if (repo_read_index(the_repository) < 0)
-			die(_("Cannot read index"));
-	}
+	if (!the_index.initialized && repo_read_index(the_repository) < 0)
+		die(_("Cannot read index"));
 
 	if (amend)
 		parent = "HEAD^1";
diff --git a/t/t2200-add-update.sh b/t/t2200-add-update.sh
index be394f1131..c01492f33f 100755
--- a/t/t2200-add-update.sh
+++ b/t/t2200-add-update.sh
@@ -197,4 +197,15 @@ test_expect_success '"add -u non-existent" should fail' '
 	! grep "non-existent" actual
 '
 
+test_expect_success '"commit -a" implies "add -u" if index becomes empty' '
+	git rm -rf \* &&
+	git commit -m clean-slate &&
+	test_commit file1 &&
+	rm file1.t &&
+	test_tick &&
+	git commit -a -m remove &&
+	git ls-tree HEAD: >out &&
+	test_must_be_empty out
+'
+
 test_done
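
The difference between the two checks in the last patch can be illustrated
outside of Git with a toy model. The sketch below is not Git code:
`struct toy_index`, `toy_read()`, `ensure_read_by_count()` and
`ensure_read_by_flag()` are hypothetical names made up for this
illustration, and the discard-and-re-read step is simplified to a plain
reload from a pretend on-disk entry count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for Git's `struct index_state`. */
struct toy_index {
	size_t cache_nr;      /* number of entries currently held in memory */
	unsigned initialized; /* set once the index has been read */
	int reads;            /* how many times toy_read() reloaded the index */
};

/* Hypothetical loader: pretend the on-disk index holds `on_disk_nr` entries. */
static int toy_read(struct toy_index *idx, size_t on_disk_nr)
{
	assert(idx != NULL);
	idx->cache_nr = on_disk_nr; /* overwrites any in-memory update */
	idx->initialized = 1;
	idx->reads++;
	return 0;
}

/* Count-based check: treats "no entries" as "not read yet". */
static int ensure_read_by_count(struct toy_index *idx, size_t on_disk_nr)
{
	if (!idx->cache_nr)
		return toy_read(idx, on_disk_nr);
	return 0;
}

/* Flag-based check: an empty but already-read index is left alone. */
static int ensure_read_by_flag(struct toy_index *idx, size_t on_disk_nr)
{
	if (!idx->initialized)
		return toy_read(idx, on_disk_nr);
	return 0;
}
```

In this model, reading an index with one entry, emptying it in memory
(`cache_nr = 0`, as when every tracked file has been deleted) and then
calling `ensure_read_by_count()` reloads the stale on-disk state, whereas
`ensure_read_by_flag()` keeps the empty in-memory index.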