pseudo-merge: implement support for finding existing merges

This patch implements support for reusing existing pseudo-merge commits
when writing bitmaps when there is an existing pseudo-merge bitmap which
has exactly the same set of parents as one that we are about to write.

Note that unstable pseudo-merges are likely to change between
consecutive repacks, and so are generally poor candidates for reuse.
However, stable pseudo-merges (see the configuration option
'bitmapPseudoMerge.<name>.stableThreshold') are by definition unlikely
to change between runs (as they represent long-running branches).

Because there is no index from a *set* of pseudo-merge parents to a
matching pseudo-merge bitmap, we have to construct the bitmap
corresponding to the set of parents for each pending pseudo-merge commit
and see if a matching bitmap exists.

This is technically quadratic in the number of pseudo-merges, but is OK
in practice for a couple of reasons:

  - non-matching pseudo-merge bitmaps are rejected quickly as soon as
    they differ in a single bit

  - already-matched pseudo-merge bitmaps are discarded from subsequent
    rounds of search

  - the number of pseudo-merges is generally small, even for large
    repositories

In order to do this, implement (a) a function that finds a matching
pseudo-merge given some uncompressed bitset describing its parents, (b)
a function that computes the bitset of parents for a given pseudo-merge
commit, and (c) call that function before computing the set of reachable
objects for some pending pseudo-merge.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Taylor Blau
2024-05-23 17:27:21 -04:00
committed by Junio C Hamano
parent 94c1addf86
commit 7252d9a036
6 changed files with 174 additions and 2 deletions

View File

@ -699,3 +699,58 @@ int cascade_pseudo_merges(const struct pseudo_merge_map *pm,
return ret;
}
struct pseudo_merge *pseudo_merge_for_parents(const struct pseudo_merge_map *pm,
struct bitmap *parents)
{
struct pseudo_merge *match = NULL;
size_t i;
if (!pm->nr)
return NULL;
/*
* NOTE: this loop is quadratic in the worst-case (where no
* matching pseudo-merge bitmaps are found), but in practice
* this is OK for a few reasons:
*
* - Rejecting pseudo-merge bitmaps that do not match the
* given commit is done quickly (i.e. `bitmap_equals_ewah()`
* returns early when we know the two bitmaps aren't equal.
*
* - Already matched pseudo-merge bitmaps (which we track with
* the `->satisfied` bit here) are skipped as potential
* candidates.
*
* - The number of pseudo-merges should be small (in the
* hundreds for most repositories).
*
* If in the future this semi-quadratic behavior does become a
* problem, another approach would be to keep track of which
* pseudo-merges are still "viable" after enumerating the
* pseudo-merge commit's parents:
*
* - A pseudo-merge bitmap becomes non-viable when the bit(s)
* corresponding to one or more parent(s) of the given
* commit are not set in a candidate pseudo-merge's commits
* bitmap.
*
* - After processing all bits, enumerate the remaining set of
* viable pseudo-merge bitmaps, and check that their
* popcount() matches the number of parents in the given
* commit.
*/
for (i = 0; i < pm->nr; i++) {
struct pseudo_merge *candidate = use_pseudo_merge(pm, &pm->v[i]);
if (!candidate || candidate->satisfied)
continue;
if (!bitmap_equals_ewah(parents, candidate->commits))
continue;
match = candidate;
match->satisfied = 1;
break;
}
return match;
}