When multiple concurrent processes try to update references in a
repository they may try to lock the same lockfiles. This can happen even
when the updates are non-conflicting and can both be applied, so it
doesn't always make sense to abort the transaction immediately. Both the
"loose" and "packed" backends thus have a grace period that they wait
for the lock to be released that can be controlled via the config values
"core.filesRefLockTimeout" and "core.packedRefsTimeout", respectively.
The reftable backend doesn't have such a setting yet and instead fails
immediately when it sees such a lock. But the exact same concepts apply
here as they do apply to the other backends.
Introduce a new "reftable.lockTimeout" config that controls how long we
may wait for a "tables.list" lock to be released. The default value of
this config is 100ms, which is the same default as we have it for the
"loose" backend.
Note that even though we also lock individual tables, this config really
only applies to the "tables.list" file. This is because individual
tables are only ever locked when we already hold the "tables.list" lock
during compaction. When we observe such a lock we in fact do not want to
compact the table at all because it is already in the process of being
compacted by a concurrent process. So applying the same timeout here
would not make any sense and only delay progress.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark unused parameters as UNUSED to squelch -Wunused warnings.
* jk/mark-unused-parameters:
t-hashmap: stop calling setup() for t_intern() test
scalar: mark unused parameters in dummy function
daemon: mark unused parameters in non-posix fallbacks
setup: mark unused parameter in config callback
test-mergesort: mark unused parameters in trivial callback
t-hashmap: mark unused parameters in callback function
reftable: mark unused parameters in virtual functions
reftable: drop obsolete test function declarations
reftable: ignore unused argc/argv in test functions
unit-tests: ignore unused argc/argv
t/helper: mark more unused argv/argc arguments
oss-fuzz: mark unused argv/argc argument
refs: mark unused parameters in do_for_each_reflog_helper()
refs: mark unused parameters in ref_store fsck callbacks
update-ref: mark more unused parameters in parser callbacks
imap-send: mark unused parameter in ssl_socket_connect() fallback
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.
* ps/config-wo-the-repository:
config: hide functions using `the_repository` by default
global: prepare for hiding away repo-less config functions
config: don't depend on `the_repository` with branch conditions
config: don't have setters depend on `the_repository`
config: pass repo to functions that rename or copy sections
config: pass repo to `git_die_config()`
config: pass repo to `git_config_get_expiry_in_days()`
config: pass repo to `git_config_get_expiry()`
config: pass repo to `git_config_get_max_percent_split_change()`
config: pass repo to `git_config_get_split_index()`
config: pass repo to `git_config_get_index_threads()`
config: expose `repo_config_clear()`
config: introduce missing setters that take repo as parameter
path: hide functions using `the_repository` by default
path: stop relying on `the_repository` in `worktree_git_path()`
path: stop relying on `the_repository` when reporting garbage
hooks: remove implicit dependency on `the_repository`
editor: do not rely on `the_repository` for interactive edits
path: expose `do_git_common_path()` as `repo_common_pathv()`
path: expose `do_git_path()` as `repo_git_pathv()`
The reftable code uses a lot of virtual function pointers, but many of
the concrete implementations do not need all of the parameters.
For the most part these are obviously fine to just mark as UNUSED (e.g.,
the empty_iterator functions unsurprisingly do not do anything). Here
are a few cases where I dug a little deeper (but still ended up just
marking them UNUSED):
- the iterator exclude_patterns is best-effort and optional (though it
would be nice to support in the long run as an optimization)
- ignoring the ref_store in many transaction functions is unexpected,
but works because the ref_transaction itself carries enough
information to do what we need.
- ignoring "err" for in some cases (e.g., transaction abort) is OK
because we do not return any errors. It is a little odd for
reftable_be_create_reflog(), though, since we do return errors
there. We should perhaps be creating string error messages at this
layer, but I've punted on that for now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit ab6f79d8df (refs: set up ref consistency check infrastructure,
2024-08-08) added virtual functions to the ref store for doing fsck
checks. But the packed and reftable backends do not yet do anything.
Let's annotate them to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fsck" infrastructure has been taught to also check the sanity
of the ref database, in addition to the object database.
* sj/ref-fsck:
fsck: add ref name check for files backend
files-backend: add unified interface for refs scanning
builtin/refs: add verify subcommand
refs: set up ref consistency check infrastructure
fsck: add refs report function
fsck: add a unified interface for reporting fsck messages
fsck: make "fsck_error" callback generic
fsck: rename objects-related fsck error functions
fsck: rename "skiplist" to "skip_oids"
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.
* jc/refs-symref-referent:
ref-filter: populate symref from iterator
refs: add referent to each_ref_fn
refs: keep track of unresolved reference value in iterators
We're about to hide config functions that implicitly depend on
`the_repository` behind the `USE_THE_REPOSITORY_VARIABLE` macro. This
will uncover a bunch of dependents that transitively relied on the
global variable, but didn't define the macro yet.
Adapt them such that we define the macro to prepare for this change.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since ref iterators do not hold onto the direct value of a reference
without resolving it, the only way to get ahold of a direct value of a
symbolic ref is to make a separate call to refs_read_symbolic_ref.
To make accessing the direct value of a symbolic ref more efficient,
let's save the direct value of the ref in the iterators for both the
files backend and the reftable backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct ref_store" is the base class which contains the "be" pointer
which provides backend-specific functions whose interfaces are defined
in the "ref_storage_be". We could reuse this polymorphism to define only
one interface. For every backend, we need to provide its own function
pointer.
The interfaces defined in the `ref_storage_be` are carefully structured
in semantic. It's organized as the five parts:
1. The name and the initialization interfaces.
2. The ref transaction interfaces.
3. The ref internal interfaces (pack, rename and copy).
4. The ref filesystem interfaces.
5. The reflog related interfaces.
To keep consistent with the git-fsck(1), add a new interface named
"fsck_refs_fn" to the end of "ref_storage_be". This semantic cannot be
grouped into any above five categories. Explicitly add blank line to
make it different from others.
Last, implement placeholder functions for each ref backends.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the reftable ref backend to stop using `the_repository` in favor
of the repo that gets passed in via `struct ref_store`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.
* ps/use-the-repository:
hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
t/helper: remove dependency on `the_repository` in "proc-receive"
t/helper: fix segfault in "oid-array" command without repository
t/helper: use correct object hash in partial-clone helper
compat/fsmonitor: fix socket path in networked SHA256 repos
replace-object: use hash algorithm from passed-in repository
protocol-caps: use hash algorithm from passed-in repository
oidset: pass hash algorithm when parsing file
http-fetch: don't crash when parsing packfile without a repo
hash-ll: merge with "hash.h"
refs: avoid include cycle with "repository.h"
global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
hash: require hash algorithm in `empty_tree_oid_hex()`
hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
hash: make `is_null_oid()` independent of `the_repository`
hash: convert `oidcmp()` and `oideq()` to compare whole hash
global: ensure that object IDs are always padded
hash: require hash algorithm in `oidread()` and `oidclr()`
hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
"git update-ref --stdin" learned to handle transactional updates of
symbolic-refs.
* kn/update-ref-symref:
update-ref: add support for 'symref-update' command
reftable: pick either 'oid' or 'target' for new updates
update-ref: add support for 'symref-create' command
update-ref: add support for 'symref-delete' command
update-ref: add support for 'symref-verify' command
refs: specify error for regular refs with `old_target`
refs: create and use `ref_update_expects_existing_old_ref()`
Building with "-Werror -Wwrite-strings" is now supported.
* ps/no-writable-strings: (27 commits)
config.mak.dev: enable `-Wwrite-strings` warning
builtin/merge: always store allocated strings in `pull_twohead`
builtin/rebase: always store allocated string in `options.strategy`
builtin/rebase: do not assign default backend to non-constant field
imap-send: fix leaking memory in `imap_server_conf`
imap-send: drop global `imap_server_conf` variable
mailmap: always store allocated strings in mailmap blob
revision: always store allocated strings in output encoding
remote-curl: avoid assigning string constant to non-const variable
send-pack: always allocate receive status
parse-options: cast long name for OPTION_ALIAS
http: do not assign string constant to non-const field
compat/win32: fix const-correctness with string constants
pretty: add casts for decoration option pointers
object-file: make `buf` parameter of `index_mem()` a constant
object-file: mark cached object buffers as const
ident: add casts for fallback name and GECOS
entry: refactor how we remove items for delayed checkouts
line-log: always allocate the output prefix
line-log: stop assigning string constant to file parent buffer
...
A new command has been added to migrate a repository that uses the
files backend for its ref storage to use the reftable backend, with
limitations.
* ps/ref-storage-migration:
builtin/refs: new command to migrate ref storage formats
refs: implement logic to migrate between ref storage formats
refs: implement removal of ref storages
worktree: don't store main worktree twice
reftable: inline `merged_table_release()`
refs/files: fix NULL pointer deref when releasing ref store
refs/files: extract function to iterate through root refs
refs/files: refactor `add_pseudoref_and_head_entries()`
refs: allow to skip creation of reflog entries
refs: pass storage format to `ref_store_init()` explicitly
refs: convert ref storage format to an enum
setup: unset ref storage when reinitializing repository version
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.
It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.
Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit
For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).
Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/ref-storage-migration:
builtin/refs: new command to migrate ref storage formats
refs: implement logic to migrate between ref storage formats
refs: implement removal of ref storages
worktree: don't store main worktree twice
reftable: inline `merged_table_release()`
refs/files: fix NULL pointer deref when releasing ref store
refs/files: extract function to iterate through root refs
refs/files: refactor `add_pseudoref_and_head_entries()`
refs: allow to skip creation of reflog entries
refs: pass storage format to `ref_store_init()` explicitly
refs: convert ref storage format to an enum
setup: unset ref storage when reinitializing repository version
When copying refs, we execute `write_copy_table()` to write the new
table. As the names are given to us via `arg->newname` and
`arg->oldname`, respectively, we optimize away some allocations by
assigning those fields to the reftable records we are about to write
directly, without duplicating them. This requires us to cast the input
to `char *` pointers as they are in fact constant strings. Later on, we
then unset the refname for all of the records before calling
`reftable_log_record_release()` on them.
We also do this when assigning the "HEAD" constant, but here we do not
cast because its type is `char[]` by default. It's about to be turned
into `const char *` though once we enable `-Wwrite-strings` and will
thus cause another warning.
It's quite dubious whether this micro-optimization really helps. We're
about to write to disk anyway, which is going to be way slower than a
small handful of allocations. Let's drop the optimization altogther and
instead copy arguments to simplify the code and avoid the future warning
with `-Wwrite-strings`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a reference transaction update, we can provide the old/new
oid/target for the update. We have checks in place to ensure that for
each old/new, either oid or target is set and not both.
In the reftable backend, when dealing with updates without the
`REF_NO_DEREF` flag, we don't selectively propagate data as needed.
Since there are no active users of the path, this is not caught. As we
want to introduce the 'symref-update' command in the upcoming commit,
which would use this flow, correct it.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a reference update tries to update a symref, but the ref in
question is actually a regular ref, we raise an error. However the error
raised in this situation is:
verifying symref target: '<ref>': reference is missing but expected <old-target>
which is very generic and doesn't indicate the mismatch of types. Let's
make this error more specific:
cannot lock ref '<ref>': expected symref with target '<old-target>': but is a regular ref
so that users have a clearer understanding.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The files and reftable backend, need to check if a ref must exist, so
that the required validation can be done. A ref must exist only when the
`old_oid` value of the update has been explicitly set and it is not the
`null_oid` value.
Since we also support symrefs now, we need to ensure that even when
`old_target` is set a ref must exist. While this was missed when we
added symref support in transactions, there are no active users of this
path. As we introduce the 'symref-verify' command in the upcoming
commits, it is important to fix this.
So let's export this to a function called
`ref_update_expects_existing_old_ref()` and expose it internally via
'refs-internal.h'.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.
Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further clean-up the refs subsystem to stop relying on
the_repository, and instead use the repository associated to the
ref_store object.
* ps/refs-without-the-repository-updates:
refs/packed: remove references to `the_hash_algo`
refs/files: remove references to `the_hash_algo`
refs/files: use correct repository
refs: remove `dwim_log()`
refs: drop `git_default_branch_name()`
refs: pass repo when peeling objects
refs: move object peeling into "object.c"
refs: pass ref store when detecting dangling symrefs
refs: convert iteration over replace refs to accept ref store
refs: retrieve worktree ref stores via associated repository
refs: refactor `resolve_gitlink_ref()` to accept a repository
refs: pass repo when retrieving submodule ref store
refs: track ref stores via strmap
refs: implement releasing ref storages
refs: rename `init_db` callback to avoid confusion
refs: adjust names for `init` and `init_db` callbacks
Code clean-up to make the reftable iterator closer to be reusable.
* ps/reftable-reusable-iterator:
reftable/merged: adapt interface to allow reuse of iterators
reftable/stack: provide convenience functions to create iterators
reftable/reader: adapt interface to allow reuse of iterators
reftable/generic: adapt interface to allow reuse of iterators
reftable/generic: move seeking of records into the iterator
reftable/merged: simplify indices for subiterators
reftable/merged: split up initialization and seeking of records
reftable/reader: set up the reader when initializing table iterator
reftable/reader: inline `reader_seek_internal()`
reftable/reader: separate concerns of table iter and reftable reader
reftable/reader: unify indexed and linear seeking
reftable/reader: avoid copying index iterator
reftable/block: use `size_t` to track restart point index
The knobs to tweak how reftable files are written have been made
available as configuration variables.
* ps/reftable-write-options:
refs/reftable: allow configuring geometric factor
reftable: make the compaction factor configurable
refs/reftable: allow disabling writing the object index
refs/reftable: allow configuring restart interval
reftable: use `uint16_t` to track restart interval
refs/reftable: allow configuring block size
reftable/dump: support dumping a table's block structure
reftable/writer: improve error when passed an invalid block size
reftable/writer: drop static variable used to initialize strbuf
reftable: pass opts as constant pointer
reftable: consistently refer to `reftable_write_options` as `opts`
Terminology to call various ref-like things are getting
straightened out.
* ps/pseudo-ref-terminology:
refs: refuse to write pseudorefs
ref-filter: properly distinuish pseudo and root refs
refs: pseudorefs are no refs
refs: classify HEAD as a root ref
refs: do not check ref existence in `is_root_ref()`
refs: rename `is_special_ref()` to `is_pseudo_ref()`
refs: rename `is_pseudoref()` to `is_root_ref()`
Documentation/glossary: define root refs as refs
Documentation/glossary: clarify limitations of pseudorefs
Documentation/glossary: redefine pseudorefs as special refs
* kn/ref-transaction-symref:
refs: remove `create_symref` and associated dead code
refs: rename `refs_create_symref()` to `refs_update_symref()`
refs: use transaction in `refs_create_symref()`
refs: add support for transactional symref updates
refs: move `original_update_refname` to 'refs.c'
refs: support symrefs in 'reference-transaction' hook
files-backend: extract out `create_symref_lock()`
refs: accept symref values in `ref_transaction_update()`
* ps/pseudo-ref-terminology:
refs: refuse to write pseudorefs
ref-filter: properly distinuish pseudo and root refs
refs: pseudorefs are no refs
refs: classify HEAD as a root ref
refs: do not check ref existence in `is_root_ref()`
refs: rename `is_special_ref()` to `is_pseudo_ref()`
refs: rename `is_pseudoref()` to `is_root_ref()`
Documentation/glossary: define root refs as refs
Documentation/glossary: clarify limitations of pseudorefs
Documentation/glossary: redefine pseudorefs as special refs
Updates to symbolic refs can now be made as a part of ref
transaction.
* kn/ref-transaction-symref:
refs: remove `create_symref` and associated dead code
refs: rename `refs_create_symref()` to `refs_update_symref()`
refs: use transaction in `refs_create_symref()`
refs: add support for transactional symref updates
refs: move `original_update_refname` to 'refs.c'
refs: support symrefs in 'reference-transaction' hook
files-backend: extract out `create_symref_lock()`
refs: accept symref values in `ref_transaction_update()`
Both `peel_object()` and `peel_iterated_oid()` implicitly rely on
`the_repository` to look up objects. Despite the fact that we want to
get rid of `the_repository`, it also leads to some restrictions in our
ref iterators when trying to retrieve the peeled value for a repository
other than `the_repository`.
Refactor these functions such that both take a repository as argument
and remove the now-unnecessary restrictions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ref storages are typically only initialized once for `the_repository`
and then never released. Until now we got away with that without causing
memory leaks because `the_repository` stays reachable, and because the
ref backend is reachable via `the_repository` its memory basically never
leaks.
This is about to change though because of the upcoming migration logic,
which will create a secondary ref storage. In that case, we will either
have to release the old or new ref storage to avoid leaks.
Implement a new `release` callback and expose it via a new
`ref_storage_release()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reference backends have two callbacks `init` and `init_db`. The
similarity of these two callbacks has repeatedly confused me whenever I
was looking at them, where I always had to look up which of them does
what.
Rename the `init_db` callback to `create_on_disk`, which should
hopefully be clearer.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.
Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:
- `ref_kind_from_refname()` already handles "HEAD" explicitly before
calling `is_root_ref()`.
- The "files" and "reftable" backends explicitly call both
`is_root_ref()` and `is_headref()` together.
This also aligns behaviour or `is_root_ref()` and `is_headref()` such
that we stop checking for ref existence. This changes semantics for our
backends:
- In the reftable backend we already know that the ref must exist
because `is_headref()` is called as part of the ref iterator. The
existence check is thus redundant, and the change is safe to do.
- In the files backend we use it when populating root refs, where we
would skip adding the "HEAD" file if it was not possible to resolve
it. The new behaviour is to instead mark "HEAD" as broken, which
will cause us to emit warnings in various places.
As there are no callers of `is_headref()` left afer the refactoring, we
can absorb it completely into `is_root_ref()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.
This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.
Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`,
but then read it via `is_null_oid()`.
We have now changed terminology to clarify that pseudorefs are really
only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
in the root of the ref hierarchy are just plain refs. Thus, we do not
need to check whether the ref is symbolic or not. In fact, we can now
avoid looking up the ref completely as the name is sufficient for us to
figure out whether something would be a root ref or not.
This change of course changes semantics for our callers. As there are
only three of them we can assess each of them individually:
- "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
It's clear that the intent is to classify based on the ref name,
only.
- "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
filter root refs. Again, using existence checks is pointless here as
the iterator has just surfaced the ref, so we know it does exist.
- "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
determine whether it should add a ref to the root directory of its
iterator. This had the effect that we skipped over any files that
are either a symbolic ref, or which are not a ref at all.
The new behaviour is to include symbolic refs know, which aligns us
with the adapted terminology. Furthermore, files which look like
root refs but aren't are now mark those as "broken". As broken refs
are not surfaced by our tooling, this should not lead to a change in
user-visible behaviour, but may cause us to emit warnings. This
feels like the right thing to do as we would otherwise just silently
ignore corrupted root refs completely.
So in all cases the existence check was either superfluous, not in line
with the adapted terminology or masked potential issues. This commit
thus changes the behaviour as proposed and drops the existence check
altogether.
Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
terminology in our gitglossary(7).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There exist a bunch of call sites in the reftable backend that want to
create iterators for a reftable stack. This is rather convoluted right
now, where you always have to go via the merged table. And it is about
to become even more convoluted when we split up iterator initialization
and seeking in the next commit.
Introduce convenience functions that allow the caller to create an
iterator from a reftable stack directly without going through the merged
table. Adapt callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow configuring the geometric factor used by the auto-compaction
algorithm whenever a new table is appended to the stack of tables.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Besides the expected "ref" and "log" records, the reftable library also
writes "obj" records. These are basically a reverse mapping of object
IDs to their respective ref records so that it becomes efficient to
figure out which references point to a specific object. The motivation
for this data structure is the "uploadpack.allowTipSHA1InWant" config,
which allows a client to fetch any object by its hash that has a ref
pointing to it.
This reverse index is not used by Git at all though, and the expectation
is that most hosters nowadays use "uploadpack.allowAnySHA1InWant". It
may thus be preferable for many users to disable writing these optional
object indices altogether to safe some precious disk space.
Add a new config "reftable.indexObjects" that allows the user to disable
the object index altogether.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new option `reftable.restartInterval` that allows the user to
control the restart interval when writing reftable records used by the
reftable library.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new option `reftable.blockSize` that allows the user to control
the block size used by the reftable library.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We sometimes pass the refatble write options as value and sometimes as a
pointer. This is quite confusing and makes the reader wonder whether the
options get modified sometimes.
In fact, `reftable_new_writer()` does cause the caller-provided options
to get updated when some values aren't set up. This is quite unexpected,
but didn't cause any harm until now.
Adapt the code so that we do not modify the caller-provided values
anymore. While at it, refactor the code to code to consistently pass the
options as a constant pointer to clarify that the caller-provided opts
will not ever get modified.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code to write out reftable has seen some optimization and
simplification.
* ps/reftable-write-optim:
reftable/block: reuse compressed array
reftable/block: reuse zstream when writing log blocks
reftable/writer: reset `last_key` instead of releasing it
reftable/writer: unify releasing memory
reftable/writer: refactorings for `writer_flush_nonempty_block()`
reftable/writer: refactorings for `writer_add_record()`
refs/reftable: don't recompute committer ident
reftable: remove name checks
refs/reftable: skip duplicate name checks
refs/reftable: perform explicit D/F check when writing symrefs
refs/reftable: fix D/F conflict error message on ref copy
In the previous commits, we converted `refs_create_symref()` to utilize
transactions to perform symref updates. Earlier `refs_create_symref()`
used `create_symref()` to do the same.
We can now remove `create_symref()` and any code associated with it
which is no longer used. We remove `create_symref()` code from all the
reference backends and also remove it entirely from the `ref_storage_be`
struct.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reference backends currently support transactional reference
updates. While this is exposed to users via 'git-update-ref' and its
'--stdin' mode, it is also used internally within various commands.
However, we do not support transactional updates of symrefs. This commit
adds support for symrefs in both the 'files' and the 'reftable' backend.
Here, we add and use `ref_update_has_null_new_value()`, a helper
function which is used to check if there is a new_value in a reference
update. The new value could either be a symref target `new_target` or a
OID `new_oid`.
We also add another common function `ref_update_check_old_target` which
will be used to check if the update's old_target corresponds to a
reference's current target.
Now transactional updates (verify, create, delete, update) can be used
for:
- regular refs
- symbolic refs
- conversion of regular to symbolic refs and vice versa
This also allows us to expose this to users via new commands in
'git-update-ref' in the future.
Note that a dangling symref update does not record a new reflog entry,
which is unchanged before and after this commit.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The files backend and the reftable backend implement
`original_update_refname` to obtain the original refname of the update.
Move it out to 'refs.c' and only expose it internally to the refs
library. This will be used in an upcoming commit to also introduce
another common functionality for the two backends.
We also rename the function to `ref_update_original_update_refname` to
keep it consistent with the upcoming other 'ref_update_*' functions
that'll be introduced.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `ref_transaction_update()` obtains ref information and
flags to create a `ref_update` and add them to the transaction at hand.
To extend symref support in transactions, we need to also accept the
old and new ref targets and process it. This commit adds the required
parameters to the function and modifies all call sites.
The two parameters added are `new_target` and `old_target`. The
`new_target` is used to denote what the reference should point to when
the transaction is applied. Some functions allow this parameter to be
NULL, meaning that the reference is not changed.
The `old_target` denotes the value the reference must have before the
update. Some functions allow this parameter to be NULL, meaning that the
old value of the reference is not checked.
We also update the internal function `ref_transaction_add_update()`
similarly to take the two new parameters.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pack-refs" learned the "--auto" option, which is a useful
addition to be triggered from "git gc --auto".
Acked-by: Karthik Nayak <karthik.188@gmail.com>
cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com>
* ps/pack-refs-auto:
builtin/gc: pack refs when using `git maintenance run --auto`
builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs
t6500: extract objects with "17" prefix
builtin/gc: move `struct maintenance_run_opts`
builtin/pack-refs: introduce new "--auto" flag
builtin/pack-refs: release allocated memory
refs/reftable: expose auto compaction via new flag
refs: remove `PACK_REFS_ALL` flag
refs: move `struct pack_refs_opts` to where it's used
t/helper: drop pack-refs wrapper
refs/reftable: print errors on compaction failure
reftable/stack: gracefully handle failed auto-compaction due to locks
reftable/stack: use error codes when locking fails during compaction
reftable/error: discern locked/outdated errors
reftable/stack: fix error handling in `reftable_stack_init_addition()`