When a repo was fully repacked, and is repacked again, we may run
into the situation that "new" packfiles have the same name as
already existing ones (traditionally packfiles have been named after
the list of names of objects in them, so repacking all the objects
in a single pack would have produced a packfile with the same name).
The logic is to rename the existing ones into filename like
"old-XXX", create the new ones and then remove the "old-" ones.
When something went wrong in the middle, this sequence is rolled
back by renaming the "old-" files back.
The renaming into "old-" did not work as intended, because
file_exists() was done on "XXX", not "pack-XXX". Also when rolling
back the change, the code tried to rename "old-pack-XXX" but the
saved ones are named "old-XXX", so this couldn't have worked.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --prefix, --default, and --resolve-git-dir options to
git-rev-parse require an argument, but when given no argument,
the code uses the NULL read from argv[argc] without checking,
leading to a segfault.
Instead, check first and die() with an error message.
Signed-off-by: David Sharp <dhsharp@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git repack --max-pack-size=8g" stopped being parsed correctly when
the command was reimplemented in C.
* sb/repack-in-c:
repack: propagate pack-objects options as strings
repack: make parsed string options const-correct
repack: fix typo in max-pack-size option
Code clean-up and protection against concurrent write access to the
ref namespace.
* mh/safe-create-leading-directories:
rename_tmp_log(): on SCLD_VANISHED, retry
rename_tmp_log(): limit the number of remote_empty_directories() attempts
rename_tmp_log(): handle a possible mkdir/rmdir race
rename_ref(): extract function rename_tmp_log()
remove_dir_recurse(): handle disappearing files and directories
remove_dir_recurse(): tighten condition for removing unreadable dir
lock_ref_sha1_basic(): if locking fails with ENOENT, retry
lock_ref_sha1_basic(): on SCLD_VANISHED, retry
safe_create_leading_directories(): add new error value SCLD_VANISHED
cmd_init_db(): when creating directories, handle errors conservatively
safe_create_leading_directories(): introduce enum for return values
safe_create_leading_directories(): always restore slash at end of loop
safe_create_leading_directories(): split on first of multiple slashes
safe_create_leading_directories(): rename local variable
safe_create_leading_directories(): add explicit "slash" pointer
safe_create_leading_directories(): reduce scope of local variable
safe_create_leading_directories(): fix format of "if" chaining
In the original shell version of git-repack, any options
destined for pack-objects were left as strings, and passed
as a whole. Since the C rewrite in commit a1bbc6c (repack:
rewrite the shell script in C, 2013-09-15), we now parse
these values to integers internally, then reformat the
integers when passing the option to pack-objects.
This has the advantage that we catch format errors earlier
(i.e., when repack is invoked, rather than when pack-objects
is invoked).
It has three disadvantages, though:
1. Our internal data types may not be the right size. In
the case of "--window-memory" and "--max-pack-size",
these are "unsigned long" in pack-objects, but we can
only represent a regular "int".
2. Our parsing routines might not be the same as those of
pack-objects. For the two options above, pack-objects
understands "100m" to mean "100 megabytes", but repack
does not.
3. We have to keep a sentinel value to know whether it is
worth passing the option along. In the case of
"--window-memory", we currently do not pass it if the
value is "0". But that is a meaningful value to
pack-objects, where it overrides any configured value.
We can fix all of these by simply passing the strings from
the user along to pack-objects verbatim. This does not
actually fix anything for "--depth" or "--window", but these
are converted, too, for consistency.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we use OPT_STRING to parse an option, we get back a
pointer into the argv array, which should be "const char *".
The compiler doesn't notice because it gets passed through a
"void *" in the option struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see "--max-pack-size", we accidentally propagated
this to pack-objects as "--max_pack_size", which does not
work at all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).
* nd/shallow-clone: (31 commits)
t5537: fix incorrect expectation in test case 10
shallow: remove unused code
send-pack.c: mark a file-local function static
git-clone.txt: remove shallow clone limitations
prune: clean .git/shallow after pruning objects
clone: use git protocol for cloning shallow repo locally
send-pack: support pushing from a shallow clone via http
receive-pack: support pushing to a shallow clone via http
smart-http: support shallow fetch/clone
remote-curl: pass ref SHA-1 to fetch-pack as well
send-pack: support pushing to a shallow clone
receive-pack: allow pushes that update .git/shallow
connected.c: add new variant that runs with --shallow-file
add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
receive/send-pack: support pushing from a shallow clone
receive-pack: reorder some code in unpack()
fetch: add --update-shallow to accept refs that update .git/shallow
upload-pack: make sure deepening preserves shallow roots
fetch: support fetching from a shallow repository
clone: support remote shallow repository
...
Our xwrite wrapper already deals with a few potential hazards, and
are as such more robust. Prefer it instead of write to get the
robustness benefits everywhere.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-and-improved-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A "gc" process running as a different user should be able to stop a
new "gc" process from starting.
* km/gc-eperm:
gc: notice gc processes run by other users
"git mv A B/", when B does not exist as a directory, should error
out, but it didn't.
* mm/mv-file-to-no-such-dir-with-slash:
mv: let 'git mv file no-such-dir/' error out on Windows, too
mv: let 'git mv file no-such-dir/' error out
"git rev-parse <revs> -- <paths>" did not implement the usual
disambiguation rules the commands in the "git log" family used in
the same way.
* jk/rev-parse-double-dashes:
rev-parse: be more careful with munging arguments
rev-parse: correctly diagnose revision errors before "--"
"git cat-file --batch=", an admittedly useless command, did not
behave very well.
* jk/cat-file-regression-fix:
cat-file: handle --batch format with missing type/size
cat-file: pass expand_data to print_object_or_die
The previous commit c57f628 (mv: let 'git mv file no-such-dir/' error out)
relies on that rename("file", "no-such-dir/") fails if the directory does not
exist (note the trailing slash). This does not work as expected on Windows:
This rename() call does not fail, but renames "file" to "no-such-dir" (not to
"no-such-dir/file"). Insert an explicit check for this case to force an error.
This changes the error message from
$ git mv file no-such-dir/
fatal: renaming 'file' failed: Not a directory
to
$ git mv file no-such-dir/
fatal: destination directory does not exist, source=file, destination=no-such-dir/
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git help $cmd" unnecessarily enumerated potential command names
from the filesystem, even when $cmd is known to be a built-in.
Ideas for further optimization, primarily by killing the use of
is_in_cmdlist(), were suggested in the discussion, but they can
come as follow-ups on top of this series.
* ss/builtin-cleanup:
builtin/help.c: speed up is_git_command() by checking for builtin commands first
builtin/help.c: call load_command_list() only when it is needed
git.c: consistently use the term "builtin" instead of "internal command"
There is no reason to have a hardcoded upper limit of the number of
parents for an octopus merge, created via the graft mechanism.
* js/lift-parent-count-limit:
Remove the line length limit for graft files
"git merge-base --octopus" used to leave cleaning up suboptimal
result to the caller, but now it does the clean-up itself.
* bm/merge-base-octopus-dedup:
merge-base --octopus: reduce the result from get_octopus_merge_bases()
merge-base: separate "--independent" codepath into its own helper
A "gc" process running as a different user should be able to stop a
new "gc" process from starting.
* km/gc-eperm:
gc: notice gc processes run by other users
Teach "cat-file --batch" to show delta-base object name for a
packed object that is represented as a delta.
* jk/oi-delta-base:
cat-file: provide %(deltabase) batch format
sha1_object_info_extended: provide delta base sha1s
"git add -A" (no other arguments) in a totally empty working tree
used to emit an error.
* nd/add-empty-fix:
add: don't complain when adding empty project root
Fetching 'frotz' branch with "git fetch", while having
'frotz/nitfol' remote-tracking branch from an earlier fetch, would
error out, primarily because the command has not been told to
remove anything on our side. In such a case, "git fetch --prune"
can be used to remove 'frotz/nitfol' to make room to fetch and
store 'frotz' remote-tracking branch.
* tm/fetch-prune:
fetch --prune: Run prune before fetching
fetch --prune: always print header url
A few places where we relied on a fixed length buffer to hold
pathnames in these two programs have been converted to use strbuf.
* mh/path-max:
builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX
prune-packed: use strbuf to avoid having to worry about PATH_MAX
read_sha1_file() that is the workhorse to read the contents given
an object name honoured object replacements, but there is no
corresponding mechanism to sha1_object_info() that is used to
obtain the metainfo (e.g. type & size) about the object, leading
callers to weird inconsistencies.
* cc/replace-object-info:
replace info: rename 'full' to 'long' and clarify in-code symbols
Documentation/git-replace: describe --format option
builtin/replace: unset read_replace_refs
t6050: add tests for listing with --format
builtin/replace: teach listing using short, medium or full formats
sha1_file: perform object replacement in sha1_object_info_extended()
t6050: show that git cat-file --batch fails with replace objects
sha1_object_info_extended(): add an "unsigned flags" parameter
sha1_file.c: add lookup_replace_object_extended() to pass flags
replace_object: don't check read_replace_refs twice
rename READ_SHA1_FILE_REPLACE flag to LOOKUP_REPLACE_OBJECT
Introduce "negative pathspec" magic, to allow "git log -- . ':!dir'" to
tell us "I am interested in everything but 'dir' directory".
* nd/negative-pathspec:
pathspec.c: support adding prefix magic to a pathspec with mnemonic magic
Support pathspec magic :(exclude) and its short form :!
glossary-content.txt: rephrase magic signature part
We should always have been freeing our strbuf, but doing so
consistently was annoying until the refactoring in the
previous patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This just pulls the return value for the function out of the
inner loop, so we can break out of the loop rather than do
an early return. This will make it easier to put any cleanup
for the function in one place.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 2dce956 is_git_command() is a bit slow as it does file I/O in
the call to list_commands_in_dir(). Avoid the file I/O by adding an
early check for the builtin commands.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids list_commands_in_dir() being called when not needed which is
quite slow due to file I/O in order to list matching files in a directory.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
safe_create_leading_directories_const() returns a non-zero value on
error. The old code at this calling site recognized a couple of
particular error values, and treated all other return values as
success. Instead, be more conservative: recognize the errors we are
interested in, but treat any other nonzero values as failures. This
is more robust in case somebody adds another possible return value
without telling us.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of returning magic integer values (which a couple of callers
go to the trouble of distinguishing), return values from an enum. Add
a docstring.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 58babfff ("shallow.c: the 8 steps to select new commits for
.git/shallow", 05-12-2013) added a function to implement step 5 of
the quoted eight steps, namely 'remove_nonexistent_ours_in_pack()'.
This function implements an optional optimization step in the new
shallow commit selection algorithm. However, this function has no
callers. (The commented out call sites would need to change, in
order to provide information required by the function.)
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we have a remote-tracking branch named "frotz/nitfol" from a
previous fetch, and the upstream now has a branch named "frotz",
fetch would fail to remove "frotz/nitfol" with a "git fetch --prune"
from the upstream. git would inform the user to use "git remote
prune" to fix the problem.
Change the way "fetch --prune" works by moving the pruning operation
before the fetching operation. This way, instead of warning the user
of a conflict, it autmatically fixes it.
Signed-off-by: Tom Miller <jackerran@gmail.com>
Tested-by: Thomas Rast <tr@thomasrast.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "fetch --prune" is run with no new refs to fetch, but it has refs
to prune. Then, the header url is not printed as it would if there were
new refs to fetch.
Output before this patch:
$ git fetch --prune remote-with-no-new-refs
x [deleted] (none) -> origin/world
Output after this patch:
$ git fetch --prune remote-with-no-new-refs
From https://github.com/git/git
x [deleted] (none) -> origin/test
Signed-off-by: Tom Miller <jackerran@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 64a99eb4 git gc refuses to run without the --force option if
another gc process on the same repository is already running.
However, if the repository is shared and user A runs git gc on the
repository and while that gc is still running user B runs git gc on
the same repository the gc process run by user A will not be noticed
and the gc run by user B will go ahead and run.
The problem is that the kill(pid, 0) test fails with an EPERM error
since user B is not allowed to signal processes owned by user A
(unless user B is root).
Update the test to recognize an EPERM error as meaning the process
exists and another gc should not be run (unless --force is given).
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enum names SHORT/MEDIUM/FULL were too broad to be descriptive. And
they clashed with built-in symbols on platforms like Windows.
Clarify by giving them REPLACE_FORMAT_ prefix.
Rename 'full' format in "git replace --format=<name>" to 'long', to
match others (i.e. 'short' and 'medium').
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No code ever used this symbol since the command was introduced at
9f613ddd (Add git-for-each-ref: helper for language bindings,
2006-09-15).
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.
In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.
As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.
The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.
But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, but also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.
This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.
Here are perf results for p5310:
Test origin/master HEAD^ HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk 36.81(37.82+1.43) 47.70(48.74+1.41) +29.6% 47.75(48.70+1.51) +29.7%
5310.3: simulated clone 30.78(29.70+2.14) 1.08(0.97+0.10) -96.5% 1.07(0.94+0.12) -96.5%
5310.4: simulated fetch 3.16(6.10+0.08) 3.54(10.65+0.06) +12.0% 1.70(3.07+0.06) -46.2%
5310.6: partial bitmap 36.76(43.19+1.81) 6.71(11.25+0.76) -81.7% 4.08(6.26+0.46) -88.9%
You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since `pack-objects` will write a `.bitmap` file next to the `.pack` and
`.idx` files, this commit teaches `git-repack` to consider the new
bitmap indexes (if they exist) when performing repack operations.
This implies moving old bitmap indexes out of the way if we are
repacking a repository that already has them, and moving the newly
generated bitmap indexes into the `objects/pack` directory, next to
their corresponding packfiles.
Since `git repack` is now capable of handling these `.bitmap` files,
a normal `git gc` run on a repository that has `pack.writebitmaps` set
to true in its config file will generate bitmap indexes as part of the
garbage collection process.
Alternatively, `git repack` can be called with the `-b` switch to
explicitly generate bitmap indexes if you are experimenting
and don't want them on all the time.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>