Introduce a new `--ref-format` value flag for git-clone(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new `--ref-format` value flag for git-init(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new `--show-ref-format` to git-rev-parse(1) that causes it
to print the ref format used by a repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new "extensions.refStorage" extension that allows us to
specify the ref storage format used by a repository. For now, the only
supported format is the "files" format, but this list will likely soon
be extended to also support the upcoming "reftable" format.
There have been discussions on the Git mailing list in the past around
how exactly this extension should look like. One alternative [1] that
was discussed was whether it would make sense to model the extension in
such a way that backends are arbitrarily stackable. This would allow for
a combined value of e.g. "loose,packed-refs" or "loose,reftable", which
indicates that new refs would be written via "loose" files backend and
compressed into "packed-refs" or "reftable" backends, respectively.
It is arguable though whether this flexibility and the complexity that
it brings with it is really required for now. It is not foreseeable that
there will be a proliferation of backends in the near-term future, and
the current set of existing formats and formats which are on the horizon
can easily be configured with the much simpler proposal where we have a
single value, only.
Furthermore, if we ever see that we indeed want to gain the ability to
arbitrarily stack the ref formats, then we can adapt the current
extension rather easily. Given that Git clients will refuse any unknown
value for the "extensions.refStorage" extension they would also know to
ignore a stacked "loose,packed-refs" in the future.
So let's stick with the easy proposal for the time being and wire up the
extension.
[1]: <pull.1408.git.1667846164.gitgitgadget@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to discern which ref storage format a repository is supposed to
use we need to start setting up and/or discovering the format. This
needs to happen in two separate code paths.
- The first path is when we create a repository via `init_db()`. When
we are re-initializing a preexisting repository we need to retain
the previously used ref storage format -- if the user asked for a
different format then this indicates an error and we error out.
Otherwise we either initialize the repository with the format asked
for by the user or the default format, which currently is the
"files" backend.
- The second path is when discovering repositories, where we need to
read the config of that repository. There is not yet any way to
configure something other than the "files" backend, so we can just
blindly set the ref storage format to this backend.
Wire up this logic so that we have the ref storage format always readily
available when needed. As there is only a single backend and because it
is not configurable we cannot yet verify that this tracking works as
expected via tests, but tests will be added in subsequent commits. To
countermand this ommission now though, raise a BUG() in case the ref
storage format is not set up properly in `ref_store_init()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone" has been prepared to allow cloning a repository with
non-default hash function into a repository that uses the reftable
backend.
* ps/clone-into-reftable-repository:
builtin/clone: create the refdb with the correct object format
builtin/clone: skip reading HEAD when retrieving remote
builtin/clone: set up sparse checkout later
builtin/clone: fix bundle URIs with mismatching object formats
remote-curl: rediscover repository when fetching refs
setup: allow skipping creation of the refdb
setup: extract function to create the refdb
"git fetch --atomic" issued an unnecessary empty error message,
which has been corrected.
* jx/fetch-atomic-error-message-fix:
fetch: no redundant error message for atomic fetch
t5574: test porcelain output of atomic fetch
Code clean-up for sanity checking of command line options for "git
show-ref".
* rs/show-ref-incompatible-options:
show-ref: use die_for_incompatible_opt3()
"git checkout -B <branch> [<start-point>]" allowed a branch that is
in use in another worktree to be updated and checked out, which
might be a bit unexpected. The rule has been tightened, which is a
breaking change. "--ignore-other-worktrees" option is required to
unbreak you, if you are used to the current behaviour that "-B"
overrides the safety.
* jc/checkout-B-branch-in-use:
checkout: forbid "-B <branch>" from touching a branch used elsewhere
checkout: refactor die_if_checked_out() caller
"git sparse-checkout set ---no-cone" uses default patterns when none
is given from the command line, but it should do so ONLY when
--stdin is not being used. Right now, add_patterns_from_input()
called when reading from the standard input is sloppy and does not
check if there are extra command line parameters that the command
will silently ignore, but that will change soon and not setting this
unnecessary and unused default patterns start to matter when it gets
fixed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
93851746 (parse-options: decouple "--end-of-options" and "--",
2023-12-06) updated the world order to make callers of parse-options
that set PARSE_OPT_KEEP_UNKNOWN_OPT responsible for deciding what to
do with "--end-of-options" they may see after parse_options() returns.
This made a previous bug in sparse-checkout more visible; namely,
that
git sparse-checkout [add|set] --[no-]cone --end-of-options ...
would simply treat "--end-of-options" as one of the paths to include in
the sparse-checkout. But this was already problematic before; namely,
git sparse-checkout [add|set| --[no-]cone --sikp-checks ...
would not give an error on the mis-typed "--skip-checks" but instead
simply treat "--sikp-checks" as a path or pattern to include in the
sparse-checkout, which is highly unfriendly.
This behavior began when the command was converted to parse-options in
7bffca95ea (sparse-checkout: add '--stdin' option to set subcommand,
2019-11-21). Back then it was just called KEEP_UNKNOWN. Later it was
renamed to KEEP_UNKNOWN_OPT in 99d86d60e5 (parse-options:
PARSE_OPT_KEEP_UNKNOWN only applies to --options, 2022-08-19) to clarify
that it was only about dashed options; we always keep non-option
arguments. Looking at that original patch, both Peff and I think that
the author was simply confused about the mis-named option, and really
just wanted to keep the non-option arguments. We never should have used
the flag all along (and the other cases were cargo-culted within the
file).
Remove the erroneous PARSE_OPT_KEEP_UNKNOWN_OPT flag now to fix this
bug. Note that this does mean that anyone who might have been using
git sparse-checkout [add|set] [--[no-]cone] --foo --bar
to request paths or patterns '--foo' and '--bar' will now have to use
git sparse-checkout [add|set] [--[no-]cone] -- --foo --bar
That makes sparse-checkout more consistent with other git commands,
provides users much friendlier error messages and behavior, and is
consistent with the all-caps warning in git-sparse-checkout.txt that
this command "is experimental...its behavior...will likely change". :-)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next commit will remove a bunch of unnecessary includes, but to do
so, we need some of the lower level direct includes that files rely on
to be explicitly specified.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though. Have those
source files explicitly include the headers they need.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The CodingGuidelines documents stipulates that multi-word placeholders
are to be separated by dashes, not underscores nor spaces.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use mem_pool_calloc() to get a zeroed buffer instead of zeroing it
ourselves. This makes the code clearer and less repetitive.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/clone-into-reftable-repository:
builtin/clone: create the refdb with the correct object format
builtin/clone: skip reading HEAD when retrieving remote
builtin/clone: set up sparse checkout later
builtin/clone: fix bundle URIs with mismatching object formats
remote-curl: rediscover repository when fetching refs
setup: allow skipping creation of the refdb
setup: extract function to create the refdb
Code clean-up around use of configuration variables.
* jk/config-cleanup:
sequencer: simplify away extra git_config_string() call
gpg-interface: drop pointless config_error_nonbool() checks
push: drop confusing configset/callback redundancy
config: use git_config_string() for core.checkRoundTripEncoding
diff: give more detailed messages for bogus diff.* config
config: use config_error_nonbool() instead of custom messages
imap-send: don't use git_die_config() inside callback
git_xmerge_config(): prefer error() to die()
config: reject bogus values for core.checkstat
Some codepaths did not correctly parse configuration variables
specified with valueless "true", which has been corrected.
* jk/implicit-true:
fsck: handle NULL value when parsing message config
trailer: handle NULL value when parsing trailer-specific config
submodule: handle NULL value when parsing submodule.*.branch
help: handle NULL value for alias.* config
trace2: handle NULL values in tr2_sysenv config callback
setup: handle NULL value when parsing extensions
config: handle NULL value when parsing non-bools
"git bisect reset" has been taught to clean up state files and refs
even when BISECT_START file is gone.
* jk/bisect-reset-fix:
bisect: always clean on reset
Clean-up code that handles combinations of incompatible options.
* rs/incompatible-options-messages:
worktree: simplify incompatibility message for --orphan and commit-ish
worktree: standardize incompatibility messages
clean: factorize incompatibility message
revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
repack: use die_for_incompatible_opt3() for -A/-k/--cruft
push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror
In run_am(), a strbuf is used to create a revision argument that is then
added to the argument list for git format-patch using strvec_push().
Use strvec_pushf() to add it directly instead, simplifying the code
and plugging a small leak on the error code path.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "--force-with-lease" option was introduced in 28f5d176
(remote.c: add command line option parser for "--force-with-lease",
2013-07-08), the design discussion revolved around the concept of
"compare-and-swap", and it can still be seen in the name used for
variables and helper functions. The end-user facing option name
ended up to be a bit different, so during the development iteration
of the feature, we used this C preprocessor macro to make it easier
to rename it later.
All of that happened more than 10 years ago, and the flexibility
afforded by the CAS_OPT_NAME macro outlived its usefulness. Inline
the constant string for the option name, like all other option names
in the code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge-file" learned to take the "--diff-algorithm" option to
use algorithm different from the default "myers" diff.
* ad/merge-file-diff-algo:
merge-file: add --diff-algorithm option
Clean-up code that handles combinations of incompatible options.
* rs/i18n-cannot-be-used-together:
i18n: factorize even more 'incompatible options' messages
Introduce "git replay", a tool meant on the server side without
working tree to recreate a history.
* cc/git-replay:
replay: stop assuming replayed branches do not diverge
replay: add --contained to rebase contained branches
replay: add --advance or 'cherry-pick' mode
replay: use standard revision ranges
replay: make it a minimal server side command
replay: remove HEAD related sanity check
replay: remove progress and info output
replay: add an important FIXME comment about gpg signing
replay: change rev walking options
replay: introduce pick_regular_commit()
replay: die() instead of failing assert()
replay: start using parse_options API
replay: introduce new builtin
t6429: remove switching aspects of fast-rebase
If an error occurs during an atomic fetch, a redundant error message
will appear at the end of do_fetch(). It was introduced in b3a804663c
(fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17).
Because a failure message is displayed before setting retcode in the
function do_fetch(), calling error() on the err message at the end of
this function may result in redundant or empty error message to be
displayed.
We can remove the redundant error() function, because we know that
the function ref_transaction_abort() never fails. While we can find a
common pattern for calling ref_transaction_abort() by running command
"git grep -A1 ref_transaction_abort", e.g.:
if (ref_transaction_abort(transaction, &error))
error("abort: %s", error.buf);
Following this pattern, we can tolerate the return value of the function
ref_transaction_abort() being changed in the future. We also delay the
output of the err message to the end of do_fetch() to reduce redundant
code. With these changes, the test case "fetch porcelain output
(atomic)" in t5574 will also be fixed.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that both the pack-bitmap and pack-objects code are prepared to
handle marking and using objects from multiple bitmapped packs for
verbatim reuse, allow marking objects from all bitmapped packs as
eligible for reuse.
Within the `reuse_partial_packfile_from_bitmap()` function, we no longer
only mark the pack whose first object is at bit position zero for reuse,
and instead mark any pack contained in the MIDX as a reuse candidate.
Provide a handful of test cases in a new script (t5332) exercising
interesting behavior for multi-pack reuse to ensure that we performed
all of the previous steps correctly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In e704fc7978 (pack-objects: introduce pack.allowPackReuse, 2019-12-18),
the `pack.allowPackReuse` configuration option was introduced, allowing
users to disable the pack reuse mechanism.
To prepare for debugging multi-pack reuse, allow setting configuration
to "single" in addition to the usual bool-or-int values.
"single" implies the same behavior as "true", "1", "yes", and so on. But
it will complement a new "multi" value (to be introduced in a future
commit). When set to "single", we will only perform pack reuse on a
single pack, regardless of whether or not there are multiple MIDX'd
packs.
This requires no code changes (yet), since we only support single pack
reuse.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of the multi-pack reuse effort, we will want to add some tests
that assert that we reused a certain number of objects from a certain
number of packs.
We could do this by grepping through the stderr output of
`pack-objects`, but doing so would be brittle in case the output format
changed.
Instead, let's use the trace2 mechanism to log various pieces of
information about the generated packfile, which we can then use to
compare against desired values.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to including the number of objects reused verbatim from a
reuse-pack, include the number of packs from which objects were reused.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `write_reused_pack_verbatim()` within
`builtin/pack-objects.c` is responsible for writing out a continuous
set of objects beginning at the start of the reuse packfile.
In the existing implementation, we did something like:
while (pos < reuse_packfile_bitmap->word_alloc &&
reuse_packfile_bitmap->words[pos] == (eword_t)~0)
pos++;
if (pos)
/* write first `pos * BITS_IN_WORD` objects from pack */
as an optimization to record a single chunk for the longest continuous
prefix of objects wanted out of the reuse pack, instead of having a
chunk for each individual object. For more details, see bb514de356
(pack-objects: improve partial packfile reuse, 2019-12-18).
In order to retain this optimization in a multi-pack reuse world, we can
no longer assume that the first object in a pack is on a word boundary
in the bitmap storing the set of reusable objects.
Assuming that all objects from the beginning of the reuse packfile up to
the object corresponding to the first bit on a word boundary are part of
the result, consume whole words at a time until the last whole word
belonging to the reuse packfile. Copy those objects to the resulting
packfile, and track that we reused them by recording a single chunk.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `write_reused_pack()` within `builtin/pack-objects.c` is
responsible for performing pack-reuse on a single pack, and has two main
functions:
- it dispatches a call to `write_reused_pack_verbatim()` to see if we
can reuse portions of the packfile in whole-word chunks
- for any remaining objects (that is, any objects that appear after
the first "gap" in the bitmap), call write_reused_pack_one() on that
object to record it for reuse.
Prepare this function for multi-pack reuse by removing the assumption
that the bit position corresponding to the first object being reused
from a given pack must be at bit position zero.
The changes in this function are mostly straightforward. Initialize `i`
to the position of the first word to contain bits corresponding to that
reuse pack. In most situations, we throw the initialized value away,
since we end up replacing it with the return value from
write_reused_pack_verbatim(), moving us past the section of whole words
that we reused.
Likewise, modify the per-object loop to ignore any bits at the beginning
of the first word that do not belong to the pack currently being reused,
as well as skip to the "done" section once we have processed the last
bit corresponding to this pack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further prepare pack-objects to perform verbatim pack-reuse over
multiple packfiles by converting functions that take in a pointer to a
`struct packed_git` to instead take in a pointer to a `struct
bitmapped_pack`.
The additional information found in the bitmapped_pack struct (such as
the bit position corresponding to the beginning of the pack) will be
necessary in order to perform verbatim pack-reuse.
Note that we don't use any of the extra pieces of information contained
in the bitmapped_pack struct, so this step is merely preparatory and
does not introduce any functional changes.
Note further that we do not change the argument type to
write_reused_pack_one(). That function is responsible for copying
sections of the packfile directly and optionally patching any OFS_DELTAs
to account for not reusing sections of the packfile in between a delta
and its base.
As such, that function is (and should remain) oblivious to multi-pack
reuse, and does not require any of the extra pieces of information
stored in the bitmapped_pack struct.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reusing objects from a pack, we keep track of a set of one or more
`reused_chunk`s, corresponding to sections of one or more object(s) from
a source pack that we are reusing. Each chunk contains two pieces of
information:
- the offset of the first object in the source pack (relative to the
beginning of the source pack)
- the difference between that offset, and the corresponding offset in
the pack we're generating
The purpose of keeping track of these is so that we can patch an
OFS_DELTAs that cross over a section of the reuse pack that we didn't
take.
For instance, consider a hypothetical pack as shown below:
(chunk #2)
__________...
/
/
+--------+---------+-------------------+---------+
... | <base> | <other> | (unused) | <delta> | ...
+--------+---------+-------------------+---------+
\ /
\______________/
(chunk #1)
Suppose that we are sending objects "base", "other", and "delta", and
that the "delta" object is stored as an OFS_DELTA, and that its base is
"base". If we don't send any objects in the "(unused)" range, we can't
copy the delta'd object directly, since its delta offset includes a
range of the pack that we didn't copy, so we have to account for that
difference when patching and reassembling the delta.
In order to compute this value correctly, we need to know not only where
we are in the packfile we're assembling (with `hashfile_total(f)`) but
also the position of the first byte of the packfile that we are
currently reusing. Currently, this works just fine, since when reusing
only a single pack those two values are always identical (because
verbatim reuse is the first thing pack-objects does when enabled after
writing the pack header).
But when reusing multiple packs which have one or more gaps, we'll need
to account for these two values diverging.
Together, these two allow us to compute the reused chunk's offset
difference relative to the start of the reused pack, as desired.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The routines pack-objects uses to perform verbatim pack-reuse are:
- write_reused_pack_one()
- write_reused_pack_verbatim()
- write_reused_pack()
, all of which assume that there is exactly one packfile being reused:
the global constant `reuse_packfile`.
Prepare for reusing objects from multiple packs by making reuse packfile
a parameter of each of the above functions in preparation for calling
these functions in a loop with multiple packfiles.
Note that we still have the global "reuse_packfile", but pass it through
each of the above function's parameter lists, eliminating all but one
direct access (the top-level caller in `write_pack_file()`). Even after
this series, we will still have a global, but it will hold the array of
reusable packfiles, and we'll pass them one at a time to these functions
in a loop.
Note also that we will eventually need to pass a `bitmapped_pack`
instead of a `packed_git` in order to hold onto additional information
required for reuse (such as the bit position of the first object
belonging to that pack). But that change will be made in a future commit
so as to minimize the noise below as much as possible.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further prepare for enabling verbatim pack-reuse over multiple packfiles
by changing the signature of reuse_partial_packfile_from_bitmap() to
populate an array of `struct bitmapped_pack *`'s instead of a pointer to
a single packfile.
Since the array we're filling out is sized dynamically[^1], add an
additional `size_t *` parameter which will hold the number of reusable
packs (equal to the number of elements in the array).
Note that since we still have not implemented true multi-pack reuse,
these changes aren't propagated out to the rest of the caller in
builtin/pack-objects.c.
In the interim state, we expect that the array has a single element, and
we use that element to fill out the static `reuse_packfile` variable
(which is a bog-standard `struct packed_git *`). Future commits will
continue to push this change further out through the pack-objects code.
[^1]: That is, even though we know the number of packs which are
candidates for pack-reuse, we do not know how many of those
candidates we can actually reuse.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The signature of `reuse_partial_packfile_from_bitmap()` currently takes
in a bitmap, as well as three output parameters (filled through
pointers, and passed as arguments), and also returns an integer result.
The output parameters are filled out with: (a) the packfile used for
pack-reuse, (b) the number of objects from that pack that we can reuse,
and (c) a bitmap indicating which objects we can reuse. The return value
is either -1 (when there are no objects to reuse), or 0 (when there is
at least one object to reuse).
Some of these parameters are redundant. Notably, we can infer from the
bitmap how many objects are reused by calling bitmap_popcount(). And we
can similar compute the return value based on that number as well.
As such, clean up the signature of this function to drop the "*entries"
parameter, as well as the int return value, since the single caller of
this function can infer these values themself.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pack-objects internals use a packing_data struct to track what
objects are part of the pack(s) being formed.
Since these structures contain allocated fields, failing to
appropriately free() them results in a leak. Plug that leak by
introducing a clear_packing_data() function, and call it in the
appropriate spots.
This is a fairly straightforward leak to plug, since none of the callers
expect to read any values or have any references to parts of the address
space being freed.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're inconsistently writing BISECT_EXPECTED_REV both via the filesystem
and via the refdb, which violates the newly established rules for how
special refs must be treated. This works alright in practice with the
reffiles reference backend, but will cause bugs once we gain additional
backends.
Fix this issue and consistently write BISECT_EXPECTED_REV via the refdb
so that it is no longer a special ref.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git checkout -B <branch> [<start-point>]", being a "forced" version
of "-b", switches to the <branch>, after optionally resetting its
tip to the <start-point>, even if the <branch> is in use in another
worktree, which is somewhat unexpected.
Protect the <branch> using the same logic that forbids "git checkout
<branch>" from touching a branch that is in use elsewhere.
This is a breaking change that may deserve backward compatibliity
warning in the Release Notes. The "--ignore-other-worktrees" option
can be used as an escape hatch if the finger memory of existing
users depend on the current behaviour of "-B".
Reported-by: Willem Verstraeten <willem.verstraeten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're currently creating the reference database with a potentially
incorrect object format when the remote repository's object format is
different from the local default object format. This works just fine for
now because the files backend never records the object format anywhere.
But this logic will fail with any new reference backend that encodes
this information in some form either on-disk or in-memory.
The preceding commits have reshuffled code in git-clone(1) so that there
is no code path that will access the reference database before we have
detected the remote's object format. With these refactorings we can now
defer initialization of the reference database until after we have
learned the remote's object format and thus initialize it with the
correct format from the get-go.
These refactorings are required to make git-clone(1) work with the
upcoming reftable backend when cloning repositories with the SHA256
object format.
This change breaks a test in "t5550-http-fetch-dumb.sh" when cloning an
empty repository with `GIT_TEST_DEFAULT_HASH=sha256`. The test expects
the resulting hash format of the empty cloned repository to match the
default hash, but now we always end up with a sha1 repository. The
problem is that for dumb HTTP fetches, we have no easy way to figure out
the remote's hash function except for deriving it based on the hash
length of refs in `info/refs`. But as the remote repository is empty we
cannot rely on this detection mechanism.
Before the change in this commit we already initialized the repository
with the default hash function and then left it as-is. With this patch
we always use the hash function detected via the remote, where we fall
back to "sha1" in case we cannot detect it.
Neither the old nor the new behaviour are correct as we second-guess the
remote hash function in both cases. But given that this is a rather
unlikely edge case (we use the dumb HTTP protocol, the remote repository
uses SHA256 and the remote repository is empty), let's simply adapt the
test to assert the new behaviour. If we want to properly address this
edge case in the future we will have to extend the dumb HTTP protocol so
that we can properly detect the hash function for empty repositories.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After we have set up the remote configuration in git-clone(1) we'll call
`remote_get()` to read the remote from the on-disk configuration. But
next to reading the on-disk configuration, `remote_get()` will also
cause us to try and read the repository's HEAD reference so that we can
figure out the current branch. Besides being pointless in git-clone(1)
because we're operating in an empty repository anyway, this will also
break once we move creation of the reference database to a later point
in time.
Refactor the code to introduce a new `remote_get_early()` function that
will skip reading the HEAD reference to address this issue.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When asked to do a sparse checkout, then git-clone(1) will spawn
`git sparse-checkout set` to set up the configuration accordingly. This
requires a proper Git repository or otherwise the command will fail. But
as we are about to move creation of the reference database to a later
point, this prerequisite will not hold anymore.
Move the logic to a later point in time where we know to have created
the reference database already.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We create the reference database in git-clone(1) quite early before
connecting to the remote repository. Given that we do not yet know about
the object format that the remote repository uses at that point in time
the consequence is that the refdb may be initialized with the wrong
object format.
This is not a problem in the context of the files backend as we do not
encode the object format anywhere, and furthermore the only reference
that we write between initializing the refdb and learning about the
object format is the "HEAD" symref. It will become a problem though once
we land the reftable backend, which indeed does require to know about
the proper object format at the time of creation. We thus need to
rearrange the logic in git-clone(1) so that we only initialize the refdb
once we have learned about the actual object format.
As a first step, move listing of remote references to happen earlier,
which also allow us to set up the hash algorithm of the repository
earlier now. While we aim to execute this logic as late as possible
until after most of the setup has happened already, detection of the
object format and thus later the setup of the reference database must
happen before any other logic that may spawn Git commands or otherwise
these Git commands may not recognize the repository as such.
The first Git step where we expect the repository to be fully initalized
is when we fetch bundles via bundle URIs. Funny enough, the comments
there also state that "the_repository must match the cloned repo", which
is indeed not necessarily the case for the hash algorithm right now. So
in practice it is the right thing to detect the remote's object format
before downloading bundle URIs anyway, and not doing so causes clones
with bundle URIs to fail when the local default object format does not
match the remote repository's format.
Unfortunately though, this creates a new issue: downloading bundles may
take a long time, so if we list refs beforehand they might've grown
stale meanwhile. It is not clear how to solve this issue except for a
second reference listing though after we have downloaded the bundles,
which may be an expensive thing to do.
Arguably though, it's preferable to have a staleness issue compared to
being unable to clone a repository altogether.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the standard message for reporting the use of multiple mutually
exclusive options by calling die_for_incompatible_opt3() instead of
rolling our own. This has the benefits of showing only the actually
given options, reducing the number of strings to translate and making
the UI slightly more consistent.
Adjust the test to no longer insist on a specific order of the
reported options, as this implementation detail does not affect the
usefulness of the error message.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase --autosquash" is now enabled for non-interactive rebase,
but it is still incompatible with the apply backend.
* ak/rebase-autosquash:
rebase: rewrite --(no-)autosquash documentation
rebase: support --autosquash without -i
rebase: fully ignore rebase.autoSquash without -i