This brings back some of the performance lost in optimizing recency
order inside pack objects. We were doing extreme amounts of object
re-traversal: for the 2.14 million objects in the Linux kernel
repository, we were calling add_to_write_order() over 1.03 billion times
(a 0.2% hit rate, making 99.8% of of these calls extraneous).
Two optimizations take place here- we can start our objects array
iteration from a known point where we left off before we started trying
to find our tags, and we don't need to do the deep dives required by
add_family_to_write_order() if the object has already been marked as
filled.
These two optimizations bring some pretty spectacular results via `perf
stat`:
task-clock: 83373 ms --> 43800 ms (50% faster)
cycles: 221,633,461,676 --> 116,307,209,986 (47% fewer)
instructions: 149,299,179,939 --> 122,998,800,184 (18% fewer)
Helped-by: Ramsay Jones (format string fix in "die" message)
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/no-cherry-pick-head-after-punted:
cherry-pick: do not give irrelevant advice when cherry-pick punted
revert.c: defer writing CHERRY_PICK_HEAD till it is safe to do so
This removes the need to call this function recursively, shinking the
code size slightly and netting a small performance increase.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is done in some of the new pack layout code introduced in commit
1b4bb16b9e. This more closely matches the nr_objects global that is
unsigned that these variables are based off of and bounded by.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is a whole 26 bytes when compiled on x86_64, but is
currently invoked over 1.037 billion times when running pack-objects on
the Linux kernel git repository. This is hitting the point where
micro-optimizations do make a difference, and inlining it only increases
the object file size by 38 bytes.
As reported by perf, this dropped task-clock from 84183 to 83373 ms, and
total cycles from 223.5 billion to 221.6 billion. Not astronomical, but
worth getting for adding one word.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If --tags is specified, add that refspec to the list given to
prune_refs so it knows to treat it as a filter on what refs to
should consider for prunning. This way
git fetch --prune --tags origin
only prunes tags and doesn't delete the branch refs.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user gave us refspecs on the command line, we should use those
when deciding whether to prune a ref instead of relying on the
refspecs in the config.
Previously, running
git fetch --prune origin refs/heads/master:refs/remotes/origin/master
would delete every other ref under the origin namespace because we
were using the refspec to filter the available refs but using the
configured refspec to figure out if a ref had been deleted on the
remote. This is clearly the wrong thing to do.
Change prune_refs and get_stale_heads to simply accept a list of
references and a list of refspecs. The caller of either function needs
to decide what refspecs should be used to decide whether a ref is
stale.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These can happen if another process simultaneously prunes a
pack. But that is not usually an error condition, because a
properly-running prune should have repacked the object into
a new pack. So we will notice that the pack has disappeared
unexpectedly, print a message, try other packs (possibly
after re-scanning the list of packs), and find it in the new
pack.
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible that while pack-objects is running, a
simultaneously running prune process might delete a pack
that we are interested in. Because we load the pack indices
early on, we know that the pack contains our item, but by
the time we try to open and map it, it is gone.
Since c715f78, we already protect against this in the normal
object access code path, but pack-objects accesses the packs
at a lower level. In the normal access path, we call
find_pack_entry, which will call find_pack_entry_one on each
pack index, which does the actual lookup. If it gets a hit,
we will actually open and verify the validity of the
matching packfile (using c715f78's is_pack_valid). If we
can't open it, we'll issue a warning and pretend that we
didn't find it, causing us to go on to the next pack (or on
to loose objects).
Furthermore, we will cache the descriptor to the opened
packfile. Which means that later, when we actually try to
access the object, we are likely to still have that packfile
opened, and won't care if it has been unlinked from the
filesystem.
Notice the "likely" above. If there is another pack access
in the interim, and we run out of descriptors, we could
close the pack. And then a later attempt to access the
closed pack could fail (we'll try to re-open it, of course,
but it may have been deleted). In practice, this doesn't
happen because we tend to look up items and then access them
immediately.
Pack-objects does not follow this code path. Instead, it
accesses the packs at a much lower level, using
find_pack_entry_one directly. This means we skip the
is_pack_valid check, and may end up with the name of a
packfile, but no open descriptor.
We can add the same is_pack_valid check here. Unfortunately,
the access patterns of pack-objects are not quite as nice
for keeping lookup and object access together. We look up
each object as we find out about it, and the only later when
writing the packfile do we necessarily access it. Which
means that the opened packfile may be closed in the interim.
In practice, however, adding this check still has value, for
three reasons.
1. If you have a reasonable number of packs and/or a
reasonable file descriptor limit, you can keep all of
your packs open simultaneously. If this is the case,
then the race is impossible to trigger.
2. Even if you can't keep all packs open at once, you
may end up keeping the deleted one open (i.e., you may
get lucky).
3. The race window is shortened. You may notice early that
the pack is gone, and not try to access it. Triggering
the problem without this check means deleting the pack
any time after we read the list of index files, but
before we access the looked-up objects. Triggering it
with this check means deleting the pack means deleting
the pack after we do a lookup (and successfully access
the packfile), but before we access the object. Which
is a smaller window.
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rs/pending:
commit: factor out clear_commit_marks_for_object_array
checkout: use leak_pending flag
bundle: use leak_pending flag
bisect: use leak_pending flag
revision: add leak_pending flag
checkout: use add_pending_{object,sha1} in orphan check
revision: factor out add_pending_sha1
checkout: check for "Previous HEAD" notice in t2020
Conflicts:
builtin/checkout.c
revision.c
* nd/maint-autofix-tag-in-head:
Accept tags in HEAD or MERGE_HEAD
merge: remove global variable head[]
merge: use return value of resolve_ref() to determine if HEAD is invalid
merge: keep stash[] a local variable
Conflicts:
builtin/merge.c
Implemented internally instead of as "git merge --no-commit && git commit"
so that "merge --edit" is otherwise consistent (hooks, etc) with "merge".
Note: the edit message does not include the status information that one
gets with "commit --status" and it is cleaned up after editing like one
gets with "commit --cleanup=default". A later patch could add the status
information if desired.
Note: previously we were not calling stripspace() after running the
prepare-commit-msg hook. Now we are, stripping comments and
leading/trailing whitespace lines if --edit is given, otherwise only
stripping leading/trailing whitespace lines if not given --edit.
Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I noticed this when "git am CORRUPTED" unexpectedly failed with an
odd diagnostic, and even removed one of the files it was supposed
to have patched.
Reproduce with any valid old/new patch from which you have removed
the "+++ b/FILE" line. You'll see a diagnostic like this
fatal: unable to write file '(null)' mode 100644: Bad address
and you'll find that FILE has been removed.
The above is on glibc-based systems. On other systems, rather than
getting "null", you may provoke a segfault as git tries to
dereference the NULL file name.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mh/check-ref-format-3: (23 commits)
add_ref(): verify that the refname is formatted correctly
resolve_ref(): expand documentation
resolve_ref(): also treat a too-long SHA1 as invalid
resolve_ref(): emit warnings for improperly-formatted references
resolve_ref(): verify that the input refname has the right format
remote: avoid passing NULL to read_ref()
remote: use xstrdup() instead of strdup()
resolve_ref(): do not follow incorrectly-formatted symbolic refs
resolve_ref(): extract a function get_packed_ref()
resolve_ref(): turn buffer into a proper string as soon as possible
resolve_ref(): only follow a symlink that contains a valid, normalized refname
resolve_ref(): use prefixcmp()
resolve_ref(): explicitly fail if a symlink is not readable
Change check_refname_format() to reject unnormalized refnames
Inline function refname_format_print()
Make collapse_slashes() allocate memory for its result
Do not allow ".lock" at the end of any refname component
Refactor check_refname_format()
Change check_ref_format() to take a flags argument
Change bad_ref_char() to return a boolean value
...
* mz/remote-rename:
remote: only update remote-tracking branch if updating refspec
remote rename: warn when refspec was not updated
remote: "rename o foo" should not rename ref "origin/bar"
remote: write correct fetch spec when renaming remote 'remote'
* cb/common-prefix-unification:
rename pathspec_prefix() to common_prefix() and move to dir.[ch]
consolidate pathspec_prefix and common_prefix
remove prefix argument from pathspec_prefix
The previous logic in show_config was to print the delimiter when the
value was set, but Boolean variables have an implicit value "true" when
they appear with no value in the config file. As a result, we got:
git_Config --get-regexp '.*\.Boolean' #1. Ok: example.boolean
git_Config --bool --get-regexp '.*\.Boolean' #2. NO: example.booleantrue
Fix this by defering the display of the separator until after the value
to display has been computed.
Reported-by: Brian Foster <brian.foster@maxim-ic.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In particular, gcc complains as follows:
CC tree-walk.o
tree-walk.c: In function `traverse_trees':
tree-walk.c:347: warning: 'e' might be used uninitialized in this \
function
CC builtin/revert.o
builtin/revert.c: In function `verify_opt_mutually_compatible':
builtin/revert.c:113: warning: 'opt2' might be used uninitialized in \
this function
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches "merge --log" and fmt-merge-msg to use branch description
information when merging a local topic branch into the mainline. The
description goes between the branch name label and the list of commit
titles.
The refactoring to share the common configuration parsing between
merge and fmt-merge-msg needs to be made into a separate patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/maint-no-cherry-pick-head-after-punted:
cherry-pick: do not give irrelevant advice when cherry-pick punted
revert.c: defer writing CHERRY_PICK_HEAD till it is safe to do so
Conflicts:
builtin/revert.c
If a cherry-pick did not even start because the working tree had local
changes that would overlap with the operation, we shouldn't be advising
the users to resolve conflicts nor to conclude it with "git commit".
Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
do_pick_commit() writes out CHERRY_PICK_HEAD before invoking merge (either
via do_recursive_merge() or try_merge_command()) on the assumption that if
the merge fails it is due to conflict. However, if the tree is dirty, the
merge may not even start, aborting before do_pick_commit() can remove
CHERRY_PICK_HEAD.
Instead, defer writing CHERRY_PICK_HEAD till after merge has returned.
At this point we know the merge has either succeeded or failed due
to conflict. In either case, we want CHERRY_PICK_HEAD to be written
so that it may be picked up by the subsequent invocation of commit.
Note that do_recursive_merge() aborts if the merge cannot start, while
try_merge_command() returns a non-zero value other than 1.
Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code calls git_config from a helper function to parse the config entry
it is interested in. Calling git_config in this way may cause a problem if
the helper function can be called after a previous call to git_config by
another function since the second call to git_config may reset some
variable to the value in the config file which was previously overridden.
The above is not a problem in this case since the function passed to
git_config only parses one config entry and the variable it sets is not
assigned outside of the parsing function. But a programmer who desires
all of the standard config options to be parsed may be tempted to modify
git_attr_config() so that it falls back to git_default_config() and then it
_would_ be vulnerable to the above described behavior.
So, move the call to git_config up into the top-level cmd_* function and
move the responsibility for parsing core.attributesfile into the main
config file parser.
Which is only the logical thing to do ;-)
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>