
Currently,

 - Running "index-pack --promisor" outside a repo segfaults.

 - It may be confusing to a user that running "index-pack --promisor"
   within a repo may make changes to the repo's object DB, especially
   since the packs indexed by the index-pack invocation may not even be
   related to the repo.

As discussed in [1] and [2], teaching --promisor to forbid a packfile
name solves both these problems. This combination of arguments requires
a repo (since we are writing the resulting .pack and .idx to it) and it
is clear that the files are related to the repo.

Currently, Git uses "index-pack --promisor" only when fetching into
a repo, so it could be argued that we should teach "index-pack" a
new argument (say, "--fetching-mode") instead of tying --promisor to
a generic argument like the packfile name. However, this --promisor
feature could conceivably be used whenever we have a packfile that is
known to come from the promisor remote (whether obtained through Git's
fetch protocol or through other means) so not using a new argument seems
reasonable - one could envision a user-made script obtaining a packfile
and then running "index-pack --promisor --stdin", for example. In fact,
it might be possible to relax the restriction further (say, by also
allowing --promisor when indexing a packfile that is in the object DB),
but relaxing the restriction is backwards-compatible so we can revisit
that later.

One thing to watch out for is the possibility of a future Git feature
that indexes a pack in the context of a repo, but does not necessarily
write the resulting pack to it (and does not necessarily desire to
make any changes to the object DB). One such feature would be fetch
quarantine, which might need the repo context in order to detect
hash collisions, but would also need to ensure that the object DB
is undisturbed in case the fetch fails for whatever reason, even if
the reason occurs only after the indexing is complete. It may not be
obvious to the implementer of such a feature that "index-pack" could
sometimes write packs other than the indexed pack to the object DB,
but there are already other ways that "fetch" could write to the object
DB (in particular, packfile URIs and bundle URIs), so hopefully the
implementation of this future feature would already include a test that
the object DB be undisturbed.

This change requires the change to t5300 by 1f52cdfacb (index-pack:
document and test the --promisor option, 2022-03-09) to be undone.
(--promisor is already tested indirectly, so we don't need the explicit
test here any more.)

[1] https://lore.kernel.org/git/20241114005652.GC1140565@coredump.intra.peff.net/
[2] https://lore.kernel.org/git/20241119185345.GB15723@coredump.intra.peff.net/

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
163 lines
6.1 KiB
Plaintext
git-index-pack(1)
=================

NAME
----
git-index-pack - Build pack index file for an existing packed archive


SYNOPSIS
--------
[verse]
'git index-pack' [-v] [-o <index-file>] [--[no-]rev-index] <pack-file>
'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o <index-file>]
[--[no-]rev-index] [<pack-file>]


DESCRIPTION
-----------
Reads a packed archive (.pack) from the specified file,
builds a pack index file (.idx) for it, and optionally writes a
reverse-index (.rev) for the specified pack. The packed
archive, together with the pack index, can then be placed in
the objects/pack/ directory of a Git repository.
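
As a rough illustration of the workflow described above, the following sketch builds a throwaway repository, streams a pack into a file, and indexes it. The temporary paths, the name `demo.pack`, and the committer identity are hypothetical; it assumes a POSIX shell with `git` on the PATH.

```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Build a tiny source repository with a single commit.
git init -q "$tmp/src"
echo hello >"$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m initial

# Stream a pack to a file; with --stdout no index is written alongside it.
echo HEAD | git -C "$tmp/src" pack-objects -q --revs --stdout >"$tmp/demo.pack"

# Index the pack: demo.idx is created next to demo.pack.
hash=$(git index-pack "$tmp/demo.pack")
echo "indexed pack $hash"
ls "$tmp"
```

The value captured in `hash` is the hash that goes into the name of the pack/idx file, as described under NOTES.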


OPTIONS
-------
-v::
Be verbose about what is going on, including progress status.

-o <index-file>::
Write the generated pack index into the specified
file. Without this option, the name of the pack index
file is constructed from the name of the packed archive
file by replacing .pack with .idx (and the program
fails if the name of the packed archive does not end
with .pack).
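+
A minimal sketch of `-o`, assuming a POSIX shell with `git` available. The paths and the name `custom.idx` are hypothetical and chosen only for illustration:
+
```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Throwaway repository with one commit, used only to obtain a pack.
git init -q "$tmp/src"
echo hello >"$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m initial
echo HEAD | git -C "$tmp/src" pack-objects -q --revs --stdout >"$tmp/demo.pack"

# Write the index to an explicit location instead of the derived demo.idx.
git index-pack -o "$tmp/custom.idx" "$tmp/demo.pack" >/dev/null
ls "$tmp"
```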

--[no-]rev-index::
When this flag is provided, generate a reverse index
(a `.rev` file) corresponding to the given pack. If
`--verify` is given, ensure that the existing
reverse index is correct. Takes precedence over
`pack.writeReverseIndex`.

--stdin::
When this flag is provided, the pack is read from stdin
instead and a copy is then written to <pack-file>. If
<pack-file> is not specified, the pack is written to the
objects/pack/ directory of the current Git repository with
a default name determined from the pack content. If
<pack-file> is not specified, consider using --keep to
prevent a race condition between this process and
'git repack'.
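+
As a sketch of the `--stdin` form without <pack-file>, the script below streams a pack into an empty destination repository. The repositories `src` and `dst` and all paths are hypothetical; it assumes a POSIX shell with `git` available:
+
```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Source repository that provides the objects.
git init -q "$tmp/src"
echo hello >"$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m initial
echo HEAD | git -C "$tmp/src" pack-objects -q --revs --stdout >"$tmp/demo.pack"

# Empty destination repository; the pack is copied into its objects/pack/.
git init -q "$tmp/dst"
out=$(git -C "$tmp/dst" index-pack --stdin <"$tmp/demo.pack")

# The output is "pack<TAB><hash>" (see NOTES); keep only the hash.
hash=$(printf '%s\n' "$out" | cut -f2)
ls "$tmp/dst/.git/objects/pack"
```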

--fix-thin::
Fix a "thin" pack produced by `git pack-objects --thin` (see
linkgit:git-pack-objects[1] for details) by adding the
excluded objects the deltified objects are based on to the
pack. This option only makes sense in conjunction with --stdin.

--keep::
Before moving the index into its final destination,
create an empty .keep file for the associated pack file.
This option is usually necessary with --stdin to prevent a
simultaneous 'git repack' process from deleting
the newly constructed pack and index before refs can be
updated to use objects contained in the pack.

--keep=<msg>::
Like --keep, create a .keep file before moving the index into
its final destination. However, instead of creating an empty file,
place '<msg>' followed by an LF into the .keep file. The '<msg>'
message can later be searched for within all .keep files to
locate any which have outlived their usefulness.
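+
A minimal sketch of `--keep=<msg>` combined with `--stdin`. The message text `demo lock`, the repositories, and the paths are hypothetical; it assumes a POSIX shell with `git` available:
+
```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Source repository that provides the objects.
git init -q "$tmp/src"
echo hello >"$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m initial
echo HEAD | git -C "$tmp/src" pack-objects -q --revs --stdout >"$tmp/demo.pack"

# Index into an empty repository and leave a .keep file carrying a message.
git init -q "$tmp/dst"
out=$(git -C "$tmp/dst" index-pack --stdin --keep='demo lock' <"$tmp/demo.pack")
hash=$(printf '%s\n' "$out" | cut -f2)
keep="$tmp/dst/.git/objects/pack/pack-$hash.keep"

# Show the message stored in the .keep file.
cat "$keep"
```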

--index-version=<version>[,<offset>]::
This is intended to be used by the test suite only. It allows
forcing the version of the generated pack index, and forcing
64-bit index entries on objects located above the given offset.

--strict[=<msg-id>=<severity>...]::
Die if the pack contains broken objects or links. An optional
comma-separated list of `<msg-id>=<severity>` can be passed to change
the severity of some possible issues, e.g.,
`--strict="missingEmail=ignore,badTagName=error"`. See the entry for the
`fsck.<msg-id>` configuration options in linkgit:git-fsck[1] for more
information on the possible values of `<msg-id>` and `<severity>`.

--progress-title::
For internal use only.
+
Set the title of the progress bar. The title is "Indexing objects" by
default and "Receiving objects" when `--stdin` is specified.

--check-self-contained-and-connected::
Die if the pack contains broken links. For internal use only.

--fsck-objects[=<msg-id>=<severity>...]::
Die if the pack contains broken objects, but unlike `--strict`, don't
choke on broken links. If the pack contains a tree pointing to a
.gitmodules blob that does not exist, prints the hash of that blob
(for the caller to check) after the hash that goes into the name of the
pack/idx file (see "Notes").
+
An optional comma-separated list of `<msg-id>=<severity>` can be passed to
change the severity of some possible issues, e.g.,
`--fsck-objects="missingEmail=ignore,badTagName=ignore"`. See the entry for the
`fsck.<msg-id>` configuration options in linkgit:git-fsck[1] for more
information on the possible values of `<msg-id>` and `<severity>`.

--threads=<n>::
Specifies the number of threads to spawn when resolving
deltas. This requires that index-pack be compiled with
pthreads; otherwise this option is ignored with a warning.
This is meant to reduce packing time on multiprocessor
machines. The required amount of memory for the delta search
window is, however, multiplied by the number of threads.
Specifying 0 will cause Git to auto-detect the number of CPUs
and use a maximum of 3 threads.

--max-input-size=<size>::
Die if the pack is larger than <size>.
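+
The sketch below shows the limit rejecting a pack. The deliberately tiny limit of 1 byte, the repositories, and the paths are hypothetical; it assumes a POSIX shell with `git` available:
+
```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Source repository that provides the objects.
git init -q "$tmp/src"
echo hello >"$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m initial
echo HEAD | git -C "$tmp/src" pack-objects -q --revs --stdout >"$tmp/demo.pack"

# Any real pack is larger than 1 byte, so index-pack is expected to die.
git init -q "$tmp/dst"
status=0
git -C "$tmp/dst" index-pack --stdin --max-input-size=1 \
    <"$tmp/demo.pack" 2>"$tmp/err.txt" || status=$?
echo "exit status: $status"
cat "$tmp/err.txt"
```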

--object-format=<hash-algorithm>::
Specify the given object format (hash algorithm) for the pack. The valid
values are 'sha1' and (if enabled) 'sha256'. The default is the algorithm for
the current repository (set by `extensions.objectFormat`), or 'sha1' if no
value is set or outside a repository.
+
This option cannot be used with --stdin.
+
include::object-format-disclaimer.txt[]

--promisor[=<message>]::
Before committing the pack-index, create a .promisor file for this
pack. Particularly helpful when writing a promisor pack with --fix-thin
since the name of the pack is not final until the pack has been fully
written. If a `<message>` is provided, then that content will be
written to the .promisor file for future reference. See
link:technical/partial-clone.html[partial clone] for more information.
+
Also, if there are objects in the given pack that reference non-promisor
objects (in the repo), repacks those non-promisor objects into a promisor
pack. This avoids a situation in which a repo has non-promisor objects that are
accessible through promisor objects.
+
Requires <pack-file> to not be specified.
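+
A rough sketch of `--promisor` used the way this entry requires, that is, with `--stdin` and no <pack-file>, inside a repository. The message `from demo remote`, the repositories, and the paths are hypothetical, and the destination is an empty repository so that no existing objects are affected; it assumes a POSIX shell with `git` available:
+
```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Source repository standing in for a promisor remote.
git init -q "$tmp/src"
echo hello >"$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m initial
echo HEAD | git -C "$tmp/src" pack-objects -q --revs --stdout >"$tmp/demo.pack"

# Index into an empty repository and mark the pack as a promisor pack.
git init -q "$tmp/dst"
out=$(git -C "$tmp/dst" index-pack --promisor='from demo remote' --stdin \
    <"$tmp/demo.pack")
hash=$(printf '%s\n' "$out" | cut -f2)
promisor="$tmp/dst/.git/objects/pack/pack-$hash.promisor"

# Show the message recorded in the .promisor file.
cat "$promisor"
```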

NOTES
-----

Once the index has been created, the hash that goes into the name of
the pack/idx file is printed to stdout. If --stdin was
also used then this is prefixed by either "pack\t", or "keep\t" if a
new .keep file was successfully created. This is useful to remove a
.keep file used as a lock to prevent the race with 'git repack'
mentioned above.
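
The following sketch shows one way a caller might consume that output: it splits the prefix from the hash and removes the .keep file once it is no longer needed. The repositories, the paths, and the point at which the lock is released are hypothetical; a real caller would update refs before removing the file. It assumes a POSIX shell with `git` available.

```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Source repository that provides the objects.
git init -q "$tmp/src"
echo hello >"$tmp/src/file.txt"
git -C "$tmp/src" add file.txt
git -C "$tmp/src" -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m initial
echo HEAD | git -C "$tmp/src" pack-objects -q --revs --stdout >"$tmp/demo.pack"

# Index with --keep so the new pack is protected from 'git repack'.
git init -q "$tmp/dst"
out=$(git -C "$tmp/dst" index-pack --stdin --keep <"$tmp/demo.pack")

# Split "<prefix><TAB><hash>" into its two fields.
prefix=$(printf '%s\n' "$out" | cut -f1)
hash=$(printf '%s\n' "$out" | cut -f2)
packdir="$tmp/dst/.git/objects/pack"

# (Refs would be updated here.) Release the lock only if we created it.
if [ "$prefix" = keep ]; then
    rm -f "$packdir/pack-$hash.keep"
fi
echo "prefix=$prefix hash=$hash"
```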

GIT
---
Part of the linkgit:git[1] suite