From 6fd80375640f565fe58c7c780e11a0e6fa115737 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:50:28 +0200 Subject: [PATCH 01/10] Documentation/glossary: redefine pseudorefs as special refs Nowadays, Git knows about three different kinds of refs. As defined in gitglossary(7): - Regular refs that start with "refs/", like "refs/heads/main". - Pseudorefs, which live in the root directory. These must have all-caps names and must be a file that start with an object hash. Consequently, symbolic refs are not pseudorefs because they do not start with an object hash. - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD". This state is extremely confusing, and I would claim that most folks don't fully understand what is what here. The current definitions also have several problems: - Where does "HEAD" fit in? It's not a pseudoref because it can be a symbolic ref. It's not a regular ref because it does not start with "refs/". And it's not a special ref, either. - There is a strong overlap between pseudorefs and special refs. The pseudoref section for example mentions "MERGE_HEAD", even though it is a special ref. Is it thus both a pseudoref and a special ref? - Why do we even need to distinguish refs that live in the root from other refs when they behave just like a regular ref anyway? In other words, the current state is quite a mess and leads to wild inconsistencies without much of a good reason. The original reason why pseudorefs were introduced is that there are some refs that sometimes behave like a ref, even though they aren't a ref. And we really only have two of these nowadays, namely "MERGE_HEAD" and "FETCH_HEAD". Those files are never written via the ref backends, but are instead written by git-fetch(1), git-pull(1) and git-merge(1). They contain additional metadata that highlights where a ref has been fetched from or the list of commits that have been merged. This original intent in fact matches the definition of special refs that we have recently introduced in 8df4c5d205 (Documentation: add "special refs" to the glossary, 2024-01-19). Due to the introduction of the new reftable backend we were forced to distinguish those refs more clearly such that we don't ever try to read or write them via the reftable backend. In the same series, we also addressed all the other cases where we used to write those special refs via the filesystem directly, thus circumventing the ref backend, to instead write them via the backends. Consequently, there are no other refs left anymore which are special. Let's address this mess and return the pseudoref terminology back to its original intent: a ref that sometimes behave like a ref, but which isn't really a ref because it gets written to the filesystem directly. Or in other words, let's redefine pseudorefs to match the current definition of special refs. As special refs and pseudorefs are now the same per definition, we can drop the "special refs" term again. It's not exposed to our users and thus they wouldn't ever encounter that term anyway. Refs that live in the root of the ref hierarchy but which are not pseudorefs will be further defined in a subsequent commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- Documentation/glossary-content.txt | 40 +++++++++--------------------- 1 file changed, 12 insertions(+), 28 deletions(-) diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt index d71b199955..e686c83026 100644 --- a/Documentation/glossary-content.txt +++ b/Documentation/glossary-content.txt @@ -497,20 +497,18 @@ exclude;; unusual refs. [[def_pseudoref]]pseudoref:: - Pseudorefs are a class of files under `$GIT_DIR` which behave - like refs for the purposes of rev-parse, but which are treated - specially by git. Pseudorefs both have names that are all-caps, - and always start with a line consisting of a - <> followed by whitespace. So, HEAD is not a - pseudoref, because it is sometimes a symbolic ref. They might - optionally contain some additional data. `MERGE_HEAD` and - `CHERRY_PICK_HEAD` are examples. Unlike - <>, these files cannot - be symbolic refs, and never have reflogs. They also cannot be - updated through the normal ref update machinery. Instead, - they are updated by directly writing to the files. However, - they can be read as if they were refs, so `git rev-parse - MERGE_HEAD` will work. + A ref that has different semantics than normal refs. These refs can be + accessed via normal Git commands but may not behave the same as a + normal ref in some cases. ++ +The following pseudorefs are known to Git: + + - `FETCH_HEAD` is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It + may refer to multiple object IDs. Each object ID is annotated with metadata + indicating where it was fetched from and its fetch status. + + - `MERGE_HEAD` is written by linkgit:git-merge[1] when resolving merge + conflicts. It contains all commit IDs which are being merged. [[def_pull]]pull:: Pulling a <> means to <> it and @@ -638,20 +636,6 @@ The most notable example is `HEAD`. An <> used to temporarily store the contents of a <> working directory and the index for future reuse. -[[def_special_ref]]special ref:: - A ref that has different semantics than normal refs. These refs can be - accessed via normal Git commands but may not behave the same as a - normal ref in some cases. -+ -The following special refs are known to Git: - - - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It - may refer to multiple object IDs. Each object ID is annotated with metadata - indicating where it was fetched from and its fetch status. - - - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge - conflicts. It contains all commit IDs which are being merged. - [[def_submodule]]submodule:: A <> that holds the history of a separate project inside another repository (the latter of From 29be36a2ea4a7d681f8318090a22b937756b89d0 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:50:33 +0200 Subject: [PATCH 02/10] Documentation/glossary: clarify limitations of pseudorefs Clarify limitations that pseudorefs have: - They can be read via git-rev-parse(1) and similar tools. - They are not surfaced when iterating through refs, like when using git-for-each-ref(1). They are not refs, so iterating through refs should not surface them. - They cannot be written via git-update-ref(1) and related commands. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- Documentation/glossary-content.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt index e686c83026..d8c04b37be 100644 --- a/Documentation/glossary-content.txt +++ b/Documentation/glossary-content.txt @@ -498,8 +498,8 @@ exclude;; [[def_pseudoref]]pseudoref:: A ref that has different semantics than normal refs. These refs can be - accessed via normal Git commands but may not behave the same as a - normal ref in some cases. + read via normal Git commands, but cannot be written to by commands like + linkgit:git-update-ref[1]. + The following pseudorefs are known to Git: From 74b50a5881d3a43bc2bd466c01fa6e08849dc1d2 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:50:37 +0200 Subject: [PATCH 03/10] Documentation/glossary: define root refs as refs Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live in the root of the ref hierarchy behave the exact same as normal refs. They can be symbolic refs or direct refs and can be read, iterated over and written via normal tooling. All of these refs are stored in the ref backends, which further demonstrates that they are just normal refs. Extend the definition of "ref" to also cover such root refs. The only additional restriction for root refs is that they must conform to a specific naming schema. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- Documentation/glossary-content.txt | 32 +++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt index d8c04b37be..c434387186 100644 --- a/Documentation/glossary-content.txt +++ b/Documentation/glossary-content.txt @@ -550,20 +550,38 @@ The following pseudorefs are known to Git: to the result. [[def_ref]]ref:: - A name that begins with `refs/` (e.g. `refs/heads/master`) - that points to an <> or another - ref (the latter is called a <>). + A name that that points to an <> or + another ref (the latter is called a <>). For convenience, a ref can sometimes be abbreviated when used as an argument to a Git command; see linkgit:gitrevisions[7] for details. Refs are stored in the <>. + The ref namespace is hierarchical. -Different subhierarchies are used for different purposes (e.g. the -`refs/heads/` hierarchy is used to represent local branches). +Ref names must either start with `refs/` or be located in the root of +the hierarchy. For the latter, their name must follow these rules: + -There are a few special-purpose refs that do not begin with `refs/`. -The most notable example is `HEAD`. + - The name consists of only upper-case characters or underscores. + + - The name ends with "`_HEAD`" or is equal to "`HEAD`". ++ +There are some irregular refs in the root of the hierarchy that do not +match these rules. The following list is exhaustive and shall not be +extended in the future: ++ + - `AUTO_MERGE` + + - `BISECT_EXPECTED_REV` + + - `NOTES_MERGE_PARTIAL` + + - `NOTES_MERGE_REF` + + - `MERGE_AUTOSTASH` ++ +Different subhierarchies are used for different purposes. For example, +the `refs/heads/` hierarchy is used to represent local branches whereas +the `refs/tags/` hierarchy is used to represent local tags.. [[def_reflog]]reflog:: A reflog shows the local "history" of a ref. In other words, From f6936e62a50e8bc9e268f3396618d33e88f4df7e Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:50:42 +0200 Subject: [PATCH 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined terminology in our gitglossary(7). Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- ref-filter.c | 2 +- refs.c | 18 +++++++----------- refs.h | 28 +++++++++++++++++++++++++++- refs/files-backend.c | 2 +- refs/reftable-backend.c | 2 +- 5 files changed, 37 insertions(+), 15 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index 59ad6f54dd..361beb6619 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2756,7 +2756,7 @@ static int ref_kind_from_refname(const char *refname) return ref_kind[i].kind; } - if (is_pseudoref(get_main_ref_store(the_repository), refname)) + if (is_root_ref(get_main_ref_store(the_repository), refname)) return FILTER_REFS_PSEUDOREFS; return FILTER_REFS_OTHERS; diff --git a/refs.c b/refs.c index 55d2e0b2cb..434c4da7ce 100644 --- a/refs.c +++ b/refs.c @@ -844,7 +844,7 @@ int is_per_worktree_ref(const char *refname) starts_with(refname, "refs/rewritten/"); } -static int is_pseudoref_syntax(const char *refname) +static int is_root_ref_syntax(const char *refname) { const char *c; @@ -853,16 +853,12 @@ static int is_pseudoref_syntax(const char *refname) return 0; } - /* - * HEAD is not a pseudoref, but it certainly uses the - * pseudoref syntax. - */ return 1; } -int is_pseudoref(struct ref_store *refs, const char *refname) +int is_root_ref(struct ref_store *refs, const char *refname) { - static const char *const irregular_pseudorefs[] = { + static const char *const irregular_root_refs[] = { "AUTO_MERGE", "BISECT_EXPECTED_REV", "NOTES_MERGE_PARTIAL", @@ -872,7 +868,7 @@ int is_pseudoref(struct ref_store *refs, const char *refname) struct object_id oid; size_t i; - if (!is_pseudoref_syntax(refname)) + if (!is_root_ref_syntax(refname)) return 0; if (ends_with(refname, "_HEAD")) { @@ -882,8 +878,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname) return !is_null_oid(&oid); } - for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++) - if (!strcmp(refname, irregular_pseudorefs[i])) { + for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) + if (!strcmp(refname, irregular_root_refs[i])) { refs_resolve_ref_unsafe(refs, refname, RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, &oid, NULL); @@ -902,7 +898,7 @@ int is_headref(struct ref_store *refs, const char *refname) } static int is_current_worktree_ref(const char *ref) { - return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref); + return is_root_ref_syntax(ref) || is_per_worktree_ref(ref); } enum ref_worktree_type parse_worktree_ref(const char *maybe_worktree_ref, diff --git a/refs.h b/refs.h index d278775e08..d0374c3275 100644 --- a/refs.h +++ b/refs.h @@ -1051,7 +1051,33 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT]; */ void update_ref_namespace(enum ref_namespace namespace, char *ref); -int is_pseudoref(struct ref_store *refs, const char *refname); +/* + * Check whether the reference is an existing root reference. + * + * A root ref is a reference that lives in the root of the reference hierarchy. + * These references must conform to special syntax: + * + * - Their name must be all-uppercase or underscores ("_"). + * + * - Their name must end with "_HEAD". + * + * - Their name may not contain a slash. + * + * There is a special set of irregular root refs that exist due to historic + * reasons, only. This list shall not be expanded in the future: + * + * - AUTO_MERGE + * + * - BISECT_EXPECTED_REV + * + * - NOTES_MERGE_PARTIAL + * + * - NOTES_MERGE_REF + * + * - MERGE_AUTOSTASH + */ +int is_root_ref(struct ref_store *refs, const char *refname); + int is_headref(struct ref_store *refs, const char *refname); #endif /* REFS_H */ diff --git a/refs/files-backend.c b/refs/files-backend.c index a098d14ea0..0fcb601444 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -351,7 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store, strbuf_addstr(&refname, de->d_name); dtype = get_dtype(de, &path, 1); - if (dtype == DT_REG && (is_pseudoref(ref_store, de->d_name) || + if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) || is_headref(ref_store, de->d_name))) loose_fill_ref_dir_regular_file(refs, refname.buf, dir); diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 1cda48c504..5a5e64fe69 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -356,7 +356,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator) */ if (!starts_with(iter->ref.refname, "refs/") && !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS && - (is_pseudoref(&iter->refs->base, iter->ref.refname) || + (is_root_ref(&iter->refs->base, iter->ref.refname) || is_headref(&iter->refs->base, iter->ref.refname)))) { continue; } From 32019a7a76ba0e43714c2d161cd65992f161a7ab Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:50:47 +0200 Subject: [PATCH 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()` Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly defined terminology in our gitglossary(7). Note that in the preceding commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where there may be confusion for in-flight patch series that add new calls to `is_pseudoref()`. In order to intentionally break such patch series we have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the new name. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- refs.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/refs.c b/refs.c index 434c4da7ce..c1c406fc5f 100644 --- a/refs.c +++ b/refs.c @@ -1872,13 +1872,13 @@ done: return result; } -static int is_special_ref(const char *refname) +static int is_pseudo_ref(const char *refname) { /* - * Special references are refs that have different semantics compared - * to "normal" refs. These refs can thus not be stored in the ref - * backend, but must always be accessed via the filesystem. The - * following refs are special: + * Pseudorefs are refs that have different semantics compared to + * "normal" refs. These refs can thus not be stored in the ref backend, + * but must always be accessed via the filesystem. The following refs + * are pseudorefs: * * - FETCH_HEAD may contain multiple object IDs, and each one of them * carries additional metadata like where it came from. @@ -1887,17 +1887,17 @@ static int is_special_ref(const char *refname) * heads. * * Reading, writing or deleting references must consistently go either - * through the filesystem (special refs) or through the reference + * through the filesystem (pseudorefs) or through the reference * backend (normal ones). */ - static const char * const special_refs[] = { + static const char * const pseudo_refs[] = { "FETCH_HEAD", "MERGE_HEAD", }; size_t i; - for (i = 0; i < ARRAY_SIZE(special_refs); i++) - if (!strcmp(refname, special_refs[i])) + for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++) + if (!strcmp(refname, pseudo_refs[i])) return 1; return 0; @@ -1908,7 +1908,7 @@ int refs_read_raw_ref(struct ref_store *ref_store, const char *refname, unsigned int *type, int *failure_errno) { assert(failure_errno); - if (is_special_ref(refname)) + if (is_pseudo_ref(refname)) return refs_read_special_head(ref_store, refname, oid, referent, type, failure_errno); From afcd067dad27fa7eddde01ebde90daa6cc541cc5 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:50:51 +0200 Subject: [PATCH 06/10] refs: do not check ref existence in `is_root_ref()` Before this patch series, root refs except for "HEAD" and our special refs were classified as pseudorefs. Furthermore, our terminology clarified that pseudorefs must not be symbolic refs. This restriction is enforced in `is_root_ref()`, which explicitly checks that a supposed root ref resolves to an object ID without recursing. This has been extremely confusing right from the start because (in old terminology) a ref name may sometimes be a pseudoref and sometimes not depending on whether it is a symbolic or regular ref. This behaviour does not seem reasonable at all and I very much doubt that it results in anything sane. Last but not least, the current behaviour can actually lead to a segfault when calling `is_root_ref()` with a reference that either does not exist or that is a symbolic ref because we never initialized `oid`, but then read it via `is_null_oid()`. We have now changed terminology to clarify that pseudorefs are really only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live in the root of the ref hierarchy are just plain refs. Thus, we do not need to check whether the ref is symbolic or not. In fact, we can now avoid looking up the ref completely as the name is sufficient for us to figure out whether something would be a root ref or not. This change of course changes semantics for our callers. As there are only three of them we can assess each of them individually: - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs. It's clear that the intent is to classify based on the ref name, only. - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to filter root refs. Again, using existence checks is pointless here as the iterator has just surfaced the ref, so we know it does exist. - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to determine whether it should add a ref to the root directory of its iterator. This had the effect that we skipped over any files that are either a symbolic ref, or which are not a ref at all. The new behaviour is to include symbolic refs know, which aligns us with the adapted terminology. Furthermore, files which look like root refs but aren't are now mark those as "broken". As broken refs are not surfaced by our tooling, this should not lead to a change in user-visible behaviour, but may cause us to emit warnings. This feels like the right thing to do as we would otherwise just silently ignore corrupted root refs completely. So in all cases the existence check was either superfluous, not in line with the adapted terminology or masked potential issues. This commit thus changes the behaviour as proposed and drops the existence check altogether. Add a test that verifies that this does not change user-visible behaviour. Namely, we still don't want to show broken refs to the user by default in git-for-each-ref(1). What this does allow though is for internal callers to surface dangling root refs when they pass in the `DO_FOR_EACH_INCLUDE_BROKEN` flag. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- ref-filter.c | 2 +- refs.c | 19 +++++-------------- refs.h | 5 +++-- refs/files-backend.c | 4 ++-- refs/reftable-backend.c | 2 +- t/t6302-for-each-ref-filter.sh | 17 +++++++++++++++++ 6 files changed, 29 insertions(+), 20 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index 361beb6619..23e81e3e04 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2756,7 +2756,7 @@ static int ref_kind_from_refname(const char *refname) return ref_kind[i].kind; } - if (is_root_ref(get_main_ref_store(the_repository), refname)) + if (is_root_ref(refname)) return FILTER_REFS_PSEUDOREFS; return FILTER_REFS_OTHERS; diff --git a/refs.c b/refs.c index c1c406fc5f..4fec29e660 100644 --- a/refs.c +++ b/refs.c @@ -856,7 +856,7 @@ static int is_root_ref_syntax(const char *refname) return 1; } -int is_root_ref(struct ref_store *refs, const char *refname) +int is_root_ref(const char *refname) { static const char *const irregular_root_refs[] = { "AUTO_MERGE", @@ -865,26 +865,17 @@ int is_root_ref(struct ref_store *refs, const char *refname) "NOTES_MERGE_REF", "MERGE_AUTOSTASH", }; - struct object_id oid; size_t i; if (!is_root_ref_syntax(refname)) return 0; - if (ends_with(refname, "_HEAD")) { - refs_resolve_ref_unsafe(refs, refname, - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, - &oid, NULL); - return !is_null_oid(&oid); - } + if (ends_with(refname, "_HEAD")) + return 1; for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) - if (!strcmp(refname, irregular_root_refs[i])) { - refs_resolve_ref_unsafe(refs, refname, - RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE, - &oid, NULL); - return !is_null_oid(&oid); - } + if (!strcmp(refname, irregular_root_refs[i])) + return 1; return 0; } diff --git a/refs.h b/refs.h index d0374c3275..8a574a22c7 100644 --- a/refs.h +++ b/refs.h @@ -1052,7 +1052,8 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT]; void update_ref_namespace(enum ref_namespace namespace, char *ref); /* - * Check whether the reference is an existing root reference. + * Check whether the provided name names a root reference. This function only + * performs a syntactic check. * * A root ref is a reference that lives in the root of the reference hierarchy. * These references must conform to special syntax: @@ -1076,7 +1077,7 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref); * * - MERGE_AUTOSTASH */ -int is_root_ref(struct ref_store *refs, const char *refname); +int is_root_ref(const char *refname); int is_headref(struct ref_store *refs, const char *refname); diff --git a/refs/files-backend.c b/refs/files-backend.c index 0fcb601444..06240ce327 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -351,8 +351,8 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store, strbuf_addstr(&refname, de->d_name); dtype = get_dtype(de, &path, 1); - if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) || - is_headref(ref_store, de->d_name))) + if (dtype == DT_REG && (is_root_ref(de->d_name) || + is_headref(ref_store, de->d_name))) loose_fill_ref_dir_regular_file(refs, refname.buf, dir); strbuf_setlen(&refname, dirnamelen); diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 5a5e64fe69..c11ba95736 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -356,7 +356,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator) */ if (!starts_with(iter->ref.refname, "refs/") && !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS && - (is_root_ref(&iter->refs->base, iter->ref.refname) || + (is_root_ref(iter->ref.refname) || is_headref(&iter->refs->base, iter->ref.refname)))) { continue; } diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh index 948f1bb5f4..92ed8957c8 100755 --- a/t/t6302-for-each-ref-filter.sh +++ b/t/t6302-for-each-ref-filter.sh @@ -62,6 +62,23 @@ test_expect_success '--include-root-refs with other patterns' ' test_cmp expect actual ' +test_expect_success '--include-root-refs omits dangling symrefs' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + git symbolic-ref DANGLING_HEAD refs/heads/missing && + cat >expect <<-EOF && + HEAD + $(git symbolic-ref HEAD) + refs/tags/initial + EOF + git for-each-ref --format="%(refname)" --include-root-refs >actual && + test_cmp expect actual + ) +' + test_expect_success 'filtering with --points-at' ' cat >expect <<-\EOF && refs/heads/main From 31951c22489dc4238ef881478a896a460531b722 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:50:56 +0200 Subject: [PATCH 07/10] refs: classify HEAD as a root ref Root refs are those refs that live in the root of the ref hierarchy. Our old and venerable "HEAD" reference falls into this category, but we don't yet classify it as such in `is_root_ref()`. Adapt the function to also treat "HEAD" as a root ref. This change is safe to do for all current callers: - `ref_kind_from_refname()` already handles "HEAD" explicitly before calling `is_root_ref()`. - The "files" and "reftable" backends explicitly call both `is_root_ref()` and `is_headref()` together. This also aligns behaviour or `is_root_ref()` and `is_headref()` such that we stop checking for ref existence. This changes semantics for our backends: - In the reftable backend we already know that the ref must exist because `is_headref()` is called as part of the ref iterator. The existence check is thus redundant, and the change is safe to do. - In the files backend we use it when populating root refs, where we would skip adding the "HEAD" file if it was not possible to resolve it. The new behaviour is to instead mark "HEAD" as broken, which will cause us to emit warnings in various places. As there are no callers of `is_headref()` left afer the refactoring, we can absorb it completely into `is_root_ref()`. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- refs.c | 9 +-------- refs.h | 5 ++--- refs/files-backend.c | 3 +-- refs/reftable-backend.c | 3 +-- 4 files changed, 5 insertions(+), 15 deletions(-) diff --git a/refs.c b/refs.c index 4fec29e660..9fb1061d52 100644 --- a/refs.c +++ b/refs.c @@ -859,6 +859,7 @@ static int is_root_ref_syntax(const char *refname) int is_root_ref(const char *refname) { static const char *const irregular_root_refs[] = { + "HEAD", "AUTO_MERGE", "BISECT_EXPECTED_REV", "NOTES_MERGE_PARTIAL", @@ -880,14 +881,6 @@ int is_root_ref(const char *refname) return 0; } -int is_headref(struct ref_store *refs, const char *refname) -{ - if (!strcmp(refname, "HEAD")) - return refs_ref_exists(refs, refname); - - return 0; -} - static int is_current_worktree_ref(const char *ref) { return is_root_ref_syntax(ref) || is_per_worktree_ref(ref); } diff --git a/refs.h b/refs.h index 8a574a22c7..8489b45265 100644 --- a/refs.h +++ b/refs.h @@ -1060,7 +1060,8 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref); * * - Their name must be all-uppercase or underscores ("_"). * - * - Their name must end with "_HEAD". + * - Their name must end with "_HEAD". As a special rule, "HEAD" is a root + * ref, as well. * * - Their name may not contain a slash. * @@ -1079,6 +1080,4 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref); */ int is_root_ref(const char *refname); -int is_headref(struct ref_store *refs, const char *refname); - #endif /* REFS_H */ diff --git a/refs/files-backend.c b/refs/files-backend.c index 06240ce327..6f9a631592 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -351,8 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store, strbuf_addstr(&refname, de->d_name); dtype = get_dtype(de, &path, 1); - if (dtype == DT_REG && (is_root_ref(de->d_name) || - is_headref(ref_store, de->d_name))) + if (dtype == DT_REG && is_root_ref(de->d_name)) loose_fill_ref_dir_regular_file(refs, refname.buf, dir); strbuf_setlen(&refname, dirnamelen); diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index c11ba95736..b6668e7c7e 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -356,8 +356,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator) */ if (!starts_with(iter->ref.refname, "refs/") && !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS && - (is_root_ref(iter->ref.refname) || - is_headref(&iter->refs->base, iter->ref.refname)))) { + is_root_ref(iter->ref.refname))) { continue; } From 993d57ededa2238e7526ccdb05b39b0265053f37 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:51:01 +0200 Subject: [PATCH 08/10] refs: pseudorefs are no refs The `is_root_ref()` function will happily clarify a pseudoref as a root ref, even though pseudorefs are no refs. Next to being wrong, it also leads to inconsistent behaviour across ref backends: while the "files" backend accidentally knows to parse those pseudorefs and thus yields them to the caller, the "reftable" backend won't ever see the pseudoref at all because they are never stored in the "reftable" backend. Fix this issue by filtering out pseudorefs in `is_root_ref()`. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- refs.c | 65 +++++++++++++++++----------------- t/t6302-for-each-ref-filter.sh | 17 +++++++++ 2 files changed, 50 insertions(+), 32 deletions(-) diff --git a/refs.c b/refs.c index 9fb1061d52..2074281a0e 100644 --- a/refs.c +++ b/refs.c @@ -844,6 +844,37 @@ int is_per_worktree_ref(const char *refname) starts_with(refname, "refs/rewritten/"); } +static int is_pseudo_ref(const char *refname) +{ + /* + * Pseudorefs are refs that have different semantics compared to + * "normal" refs. These refs can thus not be stored in the ref backend, + * but must always be accessed via the filesystem. The following refs + * are pseudorefs: + * + * - FETCH_HEAD may contain multiple object IDs, and each one of them + * carries additional metadata like where it came from. + * + * - MERGE_HEAD may contain multiple object IDs when merging multiple + * heads. + * + * Reading, writing or deleting references must consistently go either + * through the filesystem (pseudorefs) or through the reference + * backend (normal ones). + */ + static const char * const pseudo_refs[] = { + "FETCH_HEAD", + "MERGE_HEAD", + }; + size_t i; + + for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++) + if (!strcmp(refname, pseudo_refs[i])) + return 1; + + return 0; +} + static int is_root_ref_syntax(const char *refname) { const char *c; @@ -868,7 +899,8 @@ int is_root_ref(const char *refname) }; size_t i; - if (!is_root_ref_syntax(refname)) + if (!is_root_ref_syntax(refname) || + is_pseudo_ref(refname)) return 0; if (ends_with(refname, "_HEAD")) @@ -1856,37 +1888,6 @@ done: return result; } -static int is_pseudo_ref(const char *refname) -{ - /* - * Pseudorefs are refs that have different semantics compared to - * "normal" refs. These refs can thus not be stored in the ref backend, - * but must always be accessed via the filesystem. The following refs - * are pseudorefs: - * - * - FETCH_HEAD may contain multiple object IDs, and each one of them - * carries additional metadata like where it came from. - * - * - MERGE_HEAD may contain multiple object IDs when merging multiple - * heads. - * - * Reading, writing or deleting references must consistently go either - * through the filesystem (pseudorefs) or through the reference - * backend (normal ones). - */ - static const char * const pseudo_refs[] = { - "FETCH_HEAD", - "MERGE_HEAD", - }; - size_t i; - - for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++) - if (!strcmp(refname, pseudo_refs[i])) - return 1; - - return 0; -} - int refs_read_raw_ref(struct ref_store *ref_store, const char *refname, struct object_id *oid, struct strbuf *referent, unsigned int *type, int *failure_errno) diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh index 92ed8957c8..163c378cfd 100755 --- a/t/t6302-for-each-ref-filter.sh +++ b/t/t6302-for-each-ref-filter.sh @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' ' test_cmp expect actual ' +test_expect_success '--include-root-refs pattern does not print special refs' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + git rev-parse HEAD >.git/MERGE_HEAD && + git for-each-ref --format="%(refname)" --include-root-refs >actual && + cat >expect <<-EOF && + HEAD + $(git symbolic-ref HEAD) + refs/tags/initial + EOF + test_cmp expect actual + ) +' + test_expect_success '--include-root-refs with other patterns' ' cat >expect <<-\EOF && HEAD From f1701f279a3678e95daa5c093b8d30e815c1701b Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:51:05 +0200 Subject: [PATCH 09/10] ref-filter: properly distinuish pseudo and root refs The ref-filter interfaces currently define root refs as either a detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so let's properly distinguish those ref types. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/for-each-ref.c | 2 +- ref-filter.c | 16 +++++++++------- ref-filter.h | 4 ++-- refs.c | 18 +----------------- refs.h | 18 ++++++++++++++++++ 5 files changed, 31 insertions(+), 27 deletions(-) diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c index 919282e12a..5517a4a1c0 100644 --- a/builtin/for-each-ref.c +++ b/builtin/for-each-ref.c @@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) } if (include_root_refs) - flags |= FILTER_REFS_ROOT_REFS; + flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD; filter.match_as_path = 1; filter_and_format_refs(&filter, flags, sorting, &format); diff --git a/ref-filter.c b/ref-filter.c index 23e81e3e04..41f639bc2f 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2628,7 +2628,7 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter, each_ref_fn cb, void *cb_data) { - if (filter->kind == FILTER_REFS_KIND_MASK) { + if (filter->kind & FILTER_REFS_ROOT_REFS) { /* In this case, we want to print all refs including root refs. */ return refs_for_each_include_root_refs(get_main_ref_store(the_repository), cb, cb_data); @@ -2756,8 +2756,10 @@ static int ref_kind_from_refname(const char *refname) return ref_kind[i].kind; } - if (is_root_ref(refname)) + if (is_pseudo_ref(refname)) return FILTER_REFS_PSEUDOREFS; + if (is_root_ref(refname)) + return FILTER_REFS_ROOT_REFS; return FILTER_REFS_OTHERS; } @@ -2794,11 +2796,11 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct /* * Generally HEAD refs are printed with special description denoting a rebase, * detached state and so forth. This is useful when only printing the HEAD ref - * But when it is being printed along with other pseudorefs, it makes sense to - * keep the formatting consistent. So we mask the type to act like a pseudoref. + * But when it is being printed along with other root refs, it makes sense to + * keep the formatting consistent. So we mask the type to act like a root ref. */ - if (filter->kind == FILTER_REFS_KIND_MASK && kind == FILTER_REFS_DETACHED_HEAD) - kind = FILTER_REFS_PSEUDOREFS; + if (filter->kind & FILTER_REFS_ROOT_REFS && kind == FILTER_REFS_DETACHED_HEAD) + kind = FILTER_REFS_ROOT_REFS; else if (!(kind & filter->kind)) return NULL; @@ -3072,7 +3074,7 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref * When printing all ref types, HEAD is already included, * so we don't want to print HEAD again. */ - if (!ret && (filter->kind != FILTER_REFS_KIND_MASK) && + if (!ret && !(filter->kind & FILTER_REFS_ROOT_REFS) && (filter->kind & FILTER_REFS_DETACHED_HEAD)) head_ref(fn, cb_data); } diff --git a/ref-filter.h b/ref-filter.h index 0ca28d2bba..27ae1aa0d1 100644 --- a/ref-filter.h +++ b/ref-filter.h @@ -23,9 +23,9 @@ FILTER_REFS_REMOTES | FILTER_REFS_OTHERS) #define FILTER_REFS_DETACHED_HEAD 0x0020 #define FILTER_REFS_PSEUDOREFS 0x0040 -#define FILTER_REFS_ROOT_REFS (FILTER_REFS_DETACHED_HEAD | FILTER_REFS_PSEUDOREFS) +#define FILTER_REFS_ROOT_REFS 0x0080 #define FILTER_REFS_KIND_MASK (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD | \ - FILTER_REFS_PSEUDOREFS) + FILTER_REFS_PSEUDOREFS | FILTER_REFS_ROOT_REFS) struct atom_value; struct ref_sorting; diff --git a/refs.c b/refs.c index 2074281a0e..c13b8ff6d8 100644 --- a/refs.c +++ b/refs.c @@ -844,24 +844,8 @@ int is_per_worktree_ref(const char *refname) starts_with(refname, "refs/rewritten/"); } -static int is_pseudo_ref(const char *refname) +int is_pseudo_ref(const char *refname) { - /* - * Pseudorefs are refs that have different semantics compared to - * "normal" refs. These refs can thus not be stored in the ref backend, - * but must always be accessed via the filesystem. The following refs - * are pseudorefs: - * - * - FETCH_HEAD may contain multiple object IDs, and each one of them - * carries additional metadata like where it came from. - * - * - MERGE_HEAD may contain multiple object IDs when merging multiple - * heads. - * - * Reading, writing or deleting references must consistently go either - * through the filesystem (pseudorefs) or through the reference - * backend (normal ones). - */ static const char * const pseudo_refs[] = { "FETCH_HEAD", "MERGE_HEAD", diff --git a/refs.h b/refs.h index 8489b45265..dc4358727f 100644 --- a/refs.h +++ b/refs.h @@ -1080,4 +1080,22 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref); */ int is_root_ref(const char *refname); +/* + * Pseudorefs are refs that have different semantics compared to + * "normal" refs. These refs can thus not be stored in the ref backend, + * but must always be accessed via the filesystem. The following refs + * are pseudorefs: + * + * - FETCH_HEAD may contain multiple object IDs, and each one of them + * carries additional metadata like where it came from. + * + * - MERGE_HEAD may contain multiple object IDs when merging multiple + * heads. + * + * Reading, writing or deleting references must consistently go either + * through the filesystem (pseudorefs) or through the reference + * backend (normal ones). + */ +int is_pseudo_ref(const char *refname); + #endif /* REFS_H */ From 8e4f5c2dc26e8e88c8c4784f133d2b35a771d2ac Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 15 May 2024 08:51:10 +0200 Subject: [PATCH 10/10] refs: refuse to write pseudorefs Pseudorefs are not stored in the ref database as by definition, they carry additional metadata that essentially makes them not a ref. As such, writing pseudorefs via the ref backend does not make any sense whatsoever as the ref backend wouldn't know how exactly to store the data. Restrict writing pseudorefs via the ref backend. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- refs.c | 7 +++++++ t/t5510-fetch.sh | 6 +++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/refs.c b/refs.c index c13b8ff6d8..9abd9e5b86 100644 --- a/refs.c +++ b/refs.c @@ -1263,6 +1263,13 @@ int ref_transaction_update(struct ref_transaction *transaction, return -1; } + if (!(flags & REF_SKIP_REFNAME_VERIFICATION) && + is_pseudo_ref(refname)) { + strbuf_addf(err, _("refusing to update pseudoref '%s'"), + refname); + return -1; + } + if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS) BUG("illegal flags 0x%x passed to ref_transaction_update()", flags); diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index 33d34d5ae9..4eb569f4df 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -518,7 +518,7 @@ test_expect_success 'fetch with a non-applying branch..merge' ' test_expect_success 'fetch from GIT URL with a non-applying branch..merge [1]' ' one_head=$(cd one && git rev-parse HEAD) && this_head=$(git rev-parse HEAD) && - git update-ref -d FETCH_HEAD && + rm .git/FETCH_HEAD && git fetch one && test $one_head = "$(git rev-parse --verify FETCH_HEAD)" && test $this_head = "$(git rev-parse --verify HEAD)" @@ -530,7 +530,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch..merge one_ref=$(cd one && git symbolic-ref HEAD) && git config branch.main.remote blub && git config branch.main.merge "$one_ref" && - git update-ref -d FETCH_HEAD && + rm .git/FETCH_HEAD && git fetch one && test $one_head = "$(git rev-parse --verify FETCH_HEAD)" && test $this_head = "$(git rev-parse --verify HEAD)" @@ -540,7 +540,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch..merge # the merge spec does not match the branch the remote HEAD points to test_expect_success 'fetch from GIT URL with a non-applying branch..merge [3]' ' git config branch.main.merge "${one_ref}_not" && - git update-ref -d FETCH_HEAD && + rm .git/FETCH_HEAD && git fetch one && test $one_head = "$(git rev-parse --verify FETCH_HEAD)" && test $this_head = "$(git rev-parse --verify HEAD)"