3024 Commits

Author SHA1 Message Date
b0504a9519 Merge branch 'cc/replace-object-info'
read_sha1_file(), the workhorse that reads the contents given an
object name, honoured object replacements, but there was no
corresponding mechanism in sha1_object_info(), which is used to
obtain the metainfo (e.g. type & size) about the object, leading
callers to weird inconsistencies.

* cc/replace-object-info:
  replace info: rename 'full' to 'long' and clarify in-code symbols
  Documentation/git-replace: describe --format option
  builtin/replace: unset read_replace_refs
  t6050: add tests for listing with --format
  builtin/replace: teach listing using short, medium or full formats
  sha1_file: perform object replacement in sha1_object_info_extended()
  t6050: show that git cat-file --batch fails with replace objects
  sha1_object_info_extended(): add an "unsigned flags" parameter
  sha1_file.c: add lookup_replace_object_extended() to pass flags
  replace_object: don't check read_replace_refs twice
  rename READ_SHA1_FILE_REPLACE flag to LOOKUP_REPLACE_OBJECT
2014-01-10 10:32:10 -08:00
010d81ae35 Merge branch 'nd/negative-pathspec'
Introduce "negative pathspec" magic, to allow "git log -- . ':!dir'" to
tell us "I am interested in everything but 'dir' directory".
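
The effect of the exclude magic can be pictured with a small sketch.
This is only an illustration, not git's matcher: `matches` is a
hypothetical helper, pathspecs are treated as plain path prefixes, and
only the `:!` and `:(exclude)` spellings are recognised.

```python
def matches(path, pathspecs):
    """Return True if path is selected by the given pathspecs.

    Simplifying assumptions (not git's real matcher): each pathspec is a
    file or directory prefix, "." means everything, and a ":!" or
    ":(exclude)" prefix marks a negative pathspec.
    """
    positive, negative = [], []
    for spec in pathspecs:
        if spec.startswith(":!"):
            negative.append(spec[2:])
        elif spec.startswith(":(exclude)"):
            negative.append(spec[len(":(exclude)"):])
        else:
            positive.append(spec)

    def under(prefix):
        prefix = prefix.rstrip("/")
        return prefix == "." or path == prefix or path.startswith(prefix + "/")

    # Selected by at least one positive pathspec and by no negative one.
    return any(under(p) for p in positive) and not any(under(n) for n in negative)
```

With this model, `matches("dir/file.c", [".", ":!dir"])` is false while
`matches("src/main.c", [".", ":!dir"])` is true.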

* nd/negative-pathspec:
  pathspec.c: support adding prefix magic to a pathspec with mnemonic magic
  Support pathspec magic :(exclude) and its short form :!
  glossary-content.txt: rephrase magic signature part
2014-01-10 10:31:48 -08:00
648027c4c8 cat-file: fix a minor memory leak in batch_objects
We should always have been freeing our strbuf, but doing so
consistently was annoying until the refactoring in the
previous patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-07 14:31:52 -08:00
07e2383945 cat-file: refactor error handling of batch_objects
This just pulls the return value for the function out of the
inner loop, so we can break out of the loop rather than do
an early return. This will make it easier to put any cleanup
for the function in one place.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-07 14:31:10 -08:00
c6127fa3e2 builtin/help.c: speed up is_git_command() by checking for builtin commands first
Since 2dce956 is_git_command() is a bit slow as it does file I/O in
the call to list_commands_in_dir(). Avoid the file I/O by adding an
early check for the builtin commands.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 11:26:31 -08:00
a3c5263438 builtin/help.c: call load_command_list() only when it is needed
This avoids calling list_commands_in_dir() when it is not needed; the call
is quite slow because it does file I/O to list matching files in a directory.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 11:26:10 -08:00
f3565c0ca5 cmd_init_db(): when creating directories, handle errors conservatively
safe_create_leading_directories_const() returns a non-zero value on
error.  The old code at this calling site recognized a couple of
particular error values, and treated all other return values as
success.  Instead, be more conservative: recognize the errors we are
interested in, but treat any other nonzero values as failures.  This
is more robust in case somebody adds another possible return value
without telling us.
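
A minimal sketch of the "conservative" pattern, assuming a made-up set
of return codes: `CreateDirsResult` and `describe` are hypothetical
names used only for illustration and are not taken from the git source.

```python
from enum import IntEnum


class CreateDirsResult(IntEnum):
    """Hypothetical return codes; the real values may differ."""
    OK = 0
    FAILED = -1
    PERMS = -2
    EXISTS = -3


def describe(result):
    """Map a return code to a message, treating unknown codes as failures."""
    if result == CreateDirsResult.OK:
        return "ok"
    if result == CreateDirsResult.EXISTS:
        return "error: a path component exists and is not a directory"
    if result == CreateDirsResult.PERMS:
        return "error: insufficient permissions"
    # Conservative default: any other nonzero value is a failure as well,
    # including values that did not exist when this code was written.
    return "error: cannot create leading directories"
```

The point of the last branch is that a newly added return value falls
into the failure path instead of being silently treated as success.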

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 09:34:22 -08:00
0be0521b23 safe_create_leading_directories(): introduce enum for return values
Instead of returning magic integer values (which a couple of callers
go to the trouble of distinguishing), return values from an enum.  Add
a docstring.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 09:34:21 -08:00
feefdf62c1 shallow: remove unused code
Commit 58babfff ("shallow.c: the 8 steps to select new commits for
.git/shallow", 05-12-2013) added a function to implement step 5 of
the quoted eight steps, namely 'remove_nonexistent_ours_in_pack()'.
This function implements an optional optimization step in the new
shallow commit selection algorithm. However, this function has no
callers. (The commented out call sites would need to change, in
order to provide information required by the function.)

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 09:05:40 -08:00
10a6cc8890 fetch --prune: Run prune before fetching
When we have a remote-tracking branch named "frotz/nitfol" from a
previous fetch, and the upstream now has a branch named "frotz",
fetch would fail to remove "frotz/nitfol" with a "git fetch --prune"
from the upstream. git would inform the user to use "git remote
prune" to fix the problem.

Change the way "fetch --prune" works by moving the pruning operation
before the fetching operation. This way, instead of warning the user
of a conflict, it automatically fixes it.
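
The ordering issue can be simulated with plain sets of ref names. This
is a toy model under stated assumptions: refs are treated as file
paths, so "frotz" and "frotz/nitfol" cannot coexist, and `conflicts`
and `fetch_prune` are hypothetical helpers, not git functions.

```python
def conflicts(existing, new_ref):
    """True if new_ref cannot coexist with some ref in `existing`.

    Refs are modelled as file paths, so one name may not be a
    directory prefix of another.
    """
    return any(
        ref.startswith(new_ref + "/") or new_ref.startswith(ref + "/")
        for ref in existing
    )


def fetch_prune(local, upstream, prune_first):
    """Simulate fetch --prune on sets of ref names; return (refs, errors)."""
    refs, errors = set(local), []

    def prune():
        # Drop remote-tracking refs that no longer exist upstream.
        refs.intersection_update(upstream)

    def fetch():
        for ref in sorted(upstream):
            if conflicts(refs, ref):
                errors.append(ref)
            else:
                refs.add(ref)

    if prune_first:
        prune()
        fetch()
    else:
        fetch()
        prune()
    return refs, errors
```

Starting from `{"frotz/nitfol"}` with upstream `{"frotz"}`, fetching
first reports a conflict for "frotz", while pruning first ends with
"frotz" stored and no errors.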

Signed-off-by: Tom Miller <jackerran@gmail.com>
Tested-by: Thomas Rast <tr@thomasrast.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-03 10:18:40 -08:00
4b3b33a747 fetch --prune: always print header url
If "fetch --prune" is run with no new refs to fetch but with refs to
prune, the header url is not printed as it would be if there were
new refs to fetch.

Output before this patch:

	$ git fetch --prune remote-with-no-new-refs
	 x [deleted]         (none)     -> origin/world

Output after this patch:

	$ git fetch --prune remote-with-no-new-refs
	From https://github.com/git/git
	 x [deleted]         (none)     -> origin/test

Signed-off-by: Tom Miller <jackerran@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-03 10:13:39 -08:00
ed7eda8b38 gc: notice gc processes run by other users
Since 64a99eb4 git gc refuses to run without the --force option if
another gc process on the same repository is already running.

However, if the repository is shared, and user B runs git gc on it
while a gc started by user A is still running, the gc process run by
user A will not be noticed and the gc run by user B will go ahead
and run.

The problem is that the kill(pid, 0) test fails with an EPERM error
since user B is not allowed to signal processes owned by user A
(unless user B is root).

Update the test to recognize an EPERM error as meaning the process
exists and another gc should not be run (unless --force is given).
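
The same liveness check can be sketched in Python, whose `os.kill`
wraps kill(2) on POSIX systems. `process_exists` is a hypothetical
helper name; the sketch assumes a POSIX platform and must not be used
on Windows, where `os.kill` behaves differently.

```python
import os


def process_exists(pid):
    """Return True if a process with this pid appears to exist (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        # ESRCH: no such process.
        return False
    except PermissionError:
        # EPERM: the process exists but belongs to another user.
        return True
    return True
```

Treating EPERM as "exists" is the whole fix: lack of permission to
signal a process is still evidence that the process is there.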

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-02 16:15:29 -08:00
663a8566be replace info: rename 'full' to 'long' and clarify in-code symbols
Enum names SHORT/MEDIUM/FULL were too broad to be descriptive.  And
they clashed with built-in symbols on platforms like Windows.
Clarify by giving them REPLACE_FORMAT_ prefix.

Rename 'full' format in "git replace --format=<name>" to 'long', to
match others (i.e. 'short' and 'medium').

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:33:11 -08:00
44484662d8 Merge branch 'maint'
* maint:
  for-each-ref: remove unused variable
2013-12-30 12:27:01 -08:00
b9cf14d43b for-each-ref: remove unused variable
No code ever used this symbol since the command was introduced at
9f613ddd (Add git-for-each-ref: helper for language bindings,
2006-09-15).

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:23:51 -08:00
ae4f07fbcc pack-bitmap: implement optional name_hash cache
When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.

In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.

As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.

The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.
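
A rough sketch of such a mixed walk, assuming a toy model in which
only commits are tracked and a bitmap is a Python integer. The names
`reachable_bits`, `parents`, `position` and `stored` are illustrative
and do not correspond to git's API.

```python
def reachable_bits(tip, parents, position, stored):
    """Compute the commits reachable from `tip` as an integer bitmap.

    parents:  dict mapping a commit to the list of its parents
    position: dict mapping a commit to its bit index
    stored:   dict mapping some commits to a precomputed bitmap
    """
    result, seen, stack = 0, set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        if commit in stored:
            # Everything behind this commit is already summarised,
            # so OR the bitmap in and do not walk any further here.
            result |= stored[commit]
            continue
        # No bitmap for this commit: mark it by hand and keep walking.
        result |= 1 << position[commit]
        stack.extend(parents.get(commit, []))
    return result
```

In a linear history A-B-C-D with a stored bitmap only for B, a walk
from D visits D and C by hand and then takes B's bitmap for the rest.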

But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, and also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.

This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.

Here are perf results for p5310:

Test                      origin/master       HEAD^                      HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk    36.81(37.82+1.43)   47.70(48.74+1.41) +29.6%   47.75(48.70+1.51) +29.7%
5310.3: simulated clone   30.78(29.70+2.14)   1.08(0.97+0.10) -96.5%     1.07(0.94+0.12) -96.5%
5310.4: simulated fetch   3.16(6.10+0.08)     3.54(10.65+0.06) +12.0%    1.70(3.07+0.06) -46.2%
5310.6: partial bitmap    36.76(43.19+1.81)   6.71(11.25+0.76) -81.7%    4.08(6.26+0.46) -88.9%

You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
5cf2741c5a repack: consider bitmaps when performing repacks
Since `pack-objects` will write a `.bitmap` file next to the `.pack` and
`.idx` files, this commit teaches `git-repack` to consider the new
bitmap indexes (if they exist) when performing repack operations.

This implies moving old bitmap indexes out of the way if we are
repacking a repository that already has them, and moving the newly
generated bitmap indexes into the `objects/pack` directory, next to
their corresponding packfiles.

Since `git repack` is now capable of handling these `.bitmap` files,
a normal `git gc` run on a repository that has `pack.writebitmaps` set
to true in its config file will generate bitmap indexes as part of the
garbage collection process.

Alternatively, `git repack` can be called with the `-b` switch to
explicitly generate bitmap indexes if you are experimenting
and don't want them on all the time.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
b77fcd1edc repack: handle optional files created by pack-objects
We ask pack-objects to pack to a set of temporary files, and
then rename them into place. Some files that pack-objects
creates may be optional (like a .bitmap file), in which case
we would not want to call rename(). We already call stat()
and make the chmod optional if the file cannot be accessed.
We could simply skip the rename step in this case, but that
would be a minor regression in noticing problems with
non-optional files (like the .pack and .idx files).

Instead, we can now annotate extensions as optional, and
skip them if they don't exist (and otherwise rely on
rename() to barf).
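
The idea can be sketched as follows. The table layout and the
`install_pack` helper are assumptions made for illustration; only the
extension names come from the message above.

```python
import os
import tempfile

# (extension, optional) pairs; the layout of this table is assumed.
EXTS = [(".pack", False), (".idx", False), (".bitmap", True)]


def install_pack(tmp_base, final_base):
    """Rename tmp_base+ext to final_base+ext, skipping missing optional files."""
    installed = []
    for ext, optional in EXTS:
        src = tmp_base + ext
        if optional and not os.path.exists(src):
            continue  # an optional file that was never written
        # For required files a missing source makes rename() raise.
        os.rename(src, final_base + ext)
        installed.append(ext)
    return installed


# Demo: only .pack and .idx were written, so .bitmap is skipped quietly.
with tempfile.TemporaryDirectory() as d:
    tmp, final = os.path.join(d, "tmp-pack"), os.path.join(d, "pack-1234")
    for ext in (".pack", ".idx"):
        open(tmp + ext, "w").close()
    installed = install_pack(tmp, final)
```

A missing `.pack` or `.idx` still surfaces as an error from the rename,
which is the regression the message wants to avoid.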

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
42a02d8529 repack: turn exts array into array-of-struct
This is slightly more verbose, but will let us annotate the
extensions with further options in future commits.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
b328c2166e repack: stop using magic number for ARRAY_SIZE(exts)
We have a static array of extensions, but hardcode the size
of the array in our loops. Let's pull out this magic number,
which will make it easier to change.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
7cc8f97108 pack-objects: implement bitmap writing
This commit further extends the functionality of `pack-objects` by allowing
it to write out a `.bitmap` index next to any written packs, together
with the `.idx` index that currently gets written.

If bitmap writing is enabled for a given repository (either by calling
`pack-objects` with the `--write-bitmap-index` flag or by having
`pack.writebitmaps` set to `true` in the config) and pack-objects is
writing a packfile that would normally be indexed (i.e. not piping to
stdout), we will attempt to write the corresponding bitmap index for the
packfile.

Bitmap index writing happens after the packfile and its index have been
successfully written to disk (`finish_tmp_packfile`). The process is
performed in several steps:

    1. `bitmap_writer_set_checksum`: this call stores the partial
       checksum for the packfile being written; the checksum will be
       written in the resulting bitmap index to verify its integrity

    2. `bitmap_writer_build_type_index`: this call uses the array of
       `struct object_entry` that has just been sorted when writing out
       the actual packfile index to disk to generate 4 type-index bitmaps
       (one for each object type).

       These bitmaps have their nth bit set if the given object is of
       the bitmap's type. E.g. the nth bit of the Commits bitmap will be
       1 if the nth object in the packfile index is a commit.

       This is a very cheap operation because the bitmap writing code has
       access to the metadata stored in the `struct object_entry` array,
       and hence the real type for each object in the packfile.

    3. `bitmap_writer_reuse_bitmaps`: if there is an existing bitmap
       index for one of the packfiles we're trying to repack, this call
       will efficiently rebuild the existing bitmaps so they can be
       reused on the new index. All the existing bitmaps will be stored
       in a `reuse` hash table, and the commit selection phase will
       prioritize these when selecting, as they can be written directly
       to the new index without having to perform a revision walk to
       fill the bitmap. This can greatly speed up the repack of a
       repository that already has bitmaps.

    4. `bitmap_writer_select_commits`: if bitmap writing is enabled for
       a given `pack-objects` run, the sequence of commits generated
       during the Counting Objects phase will be stored in an array.

       We then use that array to build up the list of selected commits.
       Writing a bitmap in the index for each object in the repository
       would be cost-prohibitive, so we use a simple heuristic to pick
       the commits that will be indexed with bitmaps.

       The current heuristics are a simplified version of JGit's
       original implementation. We select a higher density of commits
       depending on their age: the 100 most recent commits are always
       selected, after that we pick 1 commit of each 100, and the gap
       increases as the commits grow older. On top of that, we make sure
       that every single branch that has not been merged (all the tips
       that would be required from a clone) gets their own bitmap, and
       when selecting commits between a gap, we tend to prioritize the
       commit with the most parents.

       Do note that there is no right/wrong way to perform commit
       selection; different selection algorithms will result in
       different commits being selected, but there's no such thing as
       "missing a commit". The bitmap walker algorithm implemented in
       `prepare_bitmap_walk` is able to adapt to missing bitmaps by
       performing manual walks that complete the bitmap: the ideal
       selection algorithm, however, would select the commits that are
       more likely to be used as roots for a walk in the future (e.g.
       the tips of each branch, and so on) to ensure a bitmap for them
       is always available.

    5. `bitmap_writer_build`: this is the computationally expensive part
       of bitmap generation. Based on the list of commits that were
       selected in the previous step, we perform several incremental
       walks to generate the bitmap for each commit.

       The walks begin from the oldest commit, and are built up
       incrementally for each branch. E.g. consider this dag where A, B,
       C, D, E, F are the selected commits, and a, b, c, e are a chunk
       of simplified history that will not receive bitmaps.

            A---a---B--b--C--c--D
                     \
                      E--e--F

       We start by building the bitmap for A, using A as the root for a
       revision walk and marking all the objects that are reachable
       until the walk is over. Once this bitmap is stored, we reuse the
       bitmap walker to perform the walk for B, assuming that once we
       reach A again, the walk will be terminated because A has already
       been SEEN on the previous walk.

       This process is repeated for C, and D, but when we try to
       generate the bitmaps for E, we can reuse neither the current walk
       nor the bitmap we have generated so far.

       What we do now is reset the walk and clear the bitmap, and
       then perform the walk from scratch using E as the origin.
       This new walk, however, does not need to be completed.
       Once we hit B, we can look up the bitmap we have already stored
       for that commit and OR it with the existing bitmap we've composed
       so far, allowing us to limit the walk early.

       After all the bitmaps have been generated, another iteration
       through the list of commits is performed to find the best XOR
       offsets for compression before writing them to disk. Because of
       the incremental nature of these bitmaps, XORing one of them with
       its predecessor results in a minimal "bitmap delta" most of the
       time. We can write this delta to the on-disk bitmap index, and
       then re-compose the original bitmaps by XORing them again when
       loaded.

       This is a phase very similar to pack-objects' `find_delta` (using
       bitmaps instead of objects, of course), except the heuristics
       have been greatly simplified: we only check the 10 bitmaps before
       any given one to find the best-compressing one. This gives good
       results in practice, because there is locality in the ordering of
       the objects (and therefore bitmaps) in the packfile.

    6. `bitmap_writer_finish`: the last step in the process is
       serializing to disk all the bitmap data that has been generated
       in the two previous steps.

       The bitmap is written to a tmp file and then moved atomically to
       its final destination, using the same process as
       `pack-write.c:write_idx_file`.
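
The XOR-offset search from step 5 can be sketched as below. This is a
simplified model, not the writer's code: bitmaps are plain Python
integers rather than compressed on-disk bitmaps, "best" is assumed to
mean "fewest bits set", and `compress` / `decompress` are hypothetical
names.

```python
WINDOW = 10  # how many preceding bitmaps are considered, as described above


def popcount(x):
    """Number of bits set in a non-negative integer."""
    return bin(x).count("1")


def compress(bitmaps):
    """Return a list of (xor_offset, stored_bits) pairs.

    xor_offset is 0 for a bitmap stored verbatim, or k when stored_bits
    is the XOR against the bitmap k entries earlier.
    """
    out = []
    for i, bits in enumerate(bitmaps):
        best_offset, best_bits = 0, bits
        for k in range(1, min(WINDOW, i) + 1):
            delta = bits ^ bitmaps[i - k]
            if popcount(delta) < popcount(best_bits):
                best_offset, best_bits = k, delta
        out.append((best_offset, best_bits))
    return out


def decompress(stored):
    """Rebuild the original bitmaps by XORing against earlier results."""
    bitmaps = []
    for i, (offset, bits) in enumerate(stored):
        bitmaps.append(bits if offset == 0 else bits ^ bitmaps[i - offset])
    return bitmaps
```

For incrementally growing bitmaps such as 0001, 0011, 0111, 1111, each
delta against the predecessor has a single bit set, and XORing again on
load restores the originals.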

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
aa32939fea rev-list: add bitmap mode to speed up object lists
The bitmap reachability index used to speed up the counting objects
phase during `pack-objects` can also be used to optimize a normal
rev-list if the only things required are the SHA1s of the objects during
the list (i.e., not the path names at which trees and blobs were found).

Calling `git rev-list --objects --use-bitmap-index [committish]` will
perform an object iteration based on a bitmap result instead of actually
walking the object graph.

These are some example timings for `torvalds/linux` (warm cache,
best-of-five):

    $ time git rev-list --objects master > /dev/null

    real    0m34.191s
    user    0m33.904s
    sys     0m0.268s

    $ time git rev-list --objects --use-bitmap-index master > /dev/null

    real    0m1.041s
    user    0m0.976s
    sys     0m0.064s

Likewise, using `git rev-list --count --use-bitmap-index` will speed up
the counting operation by building the resulting bitmap and performing a
fast popcount (number of bits set on the bitmap) on the result.

Here are some sample timings of different ways to count commits in
`torvalds/linux`:

    $ time git rev-list master | wc -l
        399882

        real    0m6.524s
        user    0m6.060s
        sys     0m3.284s

    $ time git rev-list --count master
        399882

        real    0m4.318s
        user    0m4.236s
        sys     0m0.076s

    $ time git rev-list --use-bitmap-index --count master
        399882

        real    0m0.217s
        user    0m0.176s
        sys     0m0.040s

This also respects negative refs, so you can use it to count
a slice of history:

        $ time git rev-list --count v3.0..master
        144843

        real    0m1.971s
        user    0m1.932s
        sys     0m0.036s

        $ time git rev-list --use-bitmap-index --count v3.0..master
        real    0m0.280s
        user    0m0.220s
        sys     0m0.056s
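
As a rough illustration of why counting is cheap once the bitmaps
exist, here is a sketch that uses Python integers as bitmaps, with one
bit per commit. `count_commits` is a hypothetical helper; the real
code works on compressed bitmaps.

```python
def count_commits(want_bitmap, have_bitmap=0):
    """Count commits set in want_bitmap but not in have_bitmap.

    The "negative" side of a range such as v3.0..master is modelled
    as have_bitmap and masked out before the popcount.
    """
    return bin(want_bitmap & ~have_bitmap).count("1")
```

With five commits reachable from the tip and two of them also reachable
from the excluded ref, `count_commits(0b11111, 0b00011)` gives 3.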

Though note that the closer the endpoints, the less it helps. In the
traversal case, we have fewer commits to cross, so we take less time.
But the bitmap time is dominated by generating the pack revindex, which
is constant with respect to the refs given.

Note that you cannot yet get a fast --left-right count of a symmetric
difference (e.g., "--count --left-right master...topic"). The slow part
of that walk actually happens during the merge-base determination when
we parse "master...topic". Even though a count does not actually need to
know the real merge base (it only needs to take the symmetric difference
of the bitmaps), the revision code would require some refactoring to
handle this case.

Additionally, a `--test-bitmap` flag has been added that will perform
the same rev-list manually (i.e. using a normal revwalk) and using
bitmaps, and verify that the results are the same. This can be used to
exercise the bitmap code, and also to verify that the contents of the
.bitmap file are sane.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
6b8fda2db1 pack-objects: use bitmaps when packing objects
In this patch, we use the bitmap API to perform the `Counting Objects`
phase in pack-objects, rather than a traditional walk through the object
graph. For a reasonably-packed large repo, the time to fetch and clone
is often dominated by the full-object revision walk during the Counting
Objects phase. Using bitmaps can reduce the CPU time required on the
server (and therefore start sending the actual pack data with less
delay).

For bitmaps to be used, the following must be true:

  1. We must be packing to stdout (as a normal `pack-objects` from
     `upload-pack` would do).

  2. There must be a .bitmap index containing at least one of the
     "have" objects that the client is asking for.

  3. Bitmaps must be enabled (they are enabled by default, but can be
     disabled by setting `pack.usebitmaps` to false, or by using
     `--no-use-bitmap-index` on the command-line).

If any of these is not true, we fall back to doing a normal walk of the
object graph.

Here are some sample timings from a full pack of `torvalds/linux` (i.e.
something very similar to what would be generated for a clone of the
repository) that show the speedup produced by various
methods:

    [existing graph traversal]
    $ time git pack-objects --all --stdout --no-use-bitmap-index \
			    </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m44.111s
    user    0m42.396s
    sys     0m3.544s

    [bitmaps only, without partial pack reuse; note that
     pack reuse is automatic, so timing this required a
     patch to disable it]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m5.413s
    user    0m5.604s
    sys     0m1.804s

    [bitmaps with pack reuse (what you get with this patch)]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Reusing existing pack: 3237103, done.
    Total 3237103 (delta 0), reused 0 (delta 0)

    real    0m1.636s
    user    0m1.460s
    sys     0m0.172s

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
ce2bc42456 pack-objects: split add_object_entry
This function actually does three things:

  1. Check whether we've already added the object to our
     packing list.

  2. Check whether the object meets our criteria for adding.

  3. Actually add the object to our packing list.

It's a little hard to see these three phases, because they
happen linearly in the rather long function. Instead, this
patch breaks them up into three separate helper functions.

The result is a little easier to follow, though it
unfortunately suffers from some optimization
interdependencies between the stages (e.g., during step 3 we
use the packing list index from step 1 and the packfile
information from step 2).

More importantly, though, the various parts can be
composed differently, as they will be in the next patch.
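
The shape of the split can be sketched like this. All names and the
exclusion rule are hypothetical stand-ins; the sketch only shows how
three small helpers compose into the original behaviour and how a
caller could compose them differently.

```python
def have_duplicate_entry(packing_list, oid):
    """Step 1: is the object already on the packing list?"""
    return oid in packing_list


def want_object(oid, excluded):
    """Step 2: does the object meet the criteria (here: not excluded)?"""
    return oid not in excluded


def create_entry(packing_list, oid, name):
    """Step 3: actually add the object to the packing list."""
    packing_list[oid] = {"name": name}


def add_object_entry(packing_list, oid, name, excluded=frozenset()):
    """Compose all three steps, as the single long function used to."""
    if have_duplicate_entry(packing_list, oid):
        return False
    if not want_object(oid, excluded):
        return False
    create_entry(packing_list, oid, name)
    return True


def add_known_wanted(packing_list, oid, name):
    """A different composition: a caller that already knows the object
    is wanted can skip step 2 (hypothetical example)."""
    if have_duplicate_entry(packing_list, oid):
        return False
    create_entry(packing_list, oid, name)
    return True
```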

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
8f29299136 merge-base --octopus: reduce the result from get_octopus_merge_bases()
Scripts that use "merge-base --octopus" could do the reducing
themselves, but most of them are expected to want to get the reduced
results without having to do any work themselves.
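
What "reducing" means here can be sketched on a toy commit graph,
assuming it means dropping duplicates and any candidate that is an
ancestor of another candidate. `is_ancestor` and `reduce_heads` below
are illustrative Python helpers, not the C functions.

```python
def is_ancestor(parents, a, b):
    """True if commit a is reachable from commit b (a == b counts)."""
    stack, seen = [b], set()
    while stack:
        c = stack.pop()
        if c == a:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return False


def reduce_heads(parents, commits):
    """Drop duplicates and any commit that is an ancestor of another one."""
    unique = list(dict.fromkeys(commits))  # de-duplicate, keeping order
    return [
        c for c in unique
        if not any(o != c and is_ancestor(parents, c, o) for o in unique)
    ]
```

Given B and D both built on A, the candidate list B, A, B, D reduces to
B, D: the duplicate B goes away and A is dropped as an ancestor of B.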

Tests are taken from a message by Василий Макаров
<einmalfel@gmail.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>

---

 We might want to vet the existing callers of the underlying
 get_octopus_merge_bases() and find out if _all_ of them are doing
 anything extra (like deduping) because the machinery can return
 duplicate results. And if that is the case, then we may want to
 move the deduping down the callchain instead of having it here.
2013-12-30 11:58:54 -08:00
e2f5df4244 merge-base: separate "--independent" codepath into its own helper
It piggybacks on an unrelated handle_octopus() function only because
there are some similarities between the way they need to preprocess
their input and output their result.  There is nothing similar in
the true logic between these two operations.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 11:37:49 -08:00
e228c1736f Remove the line length limit for graft files
Support for grafts predates Git's strbuf, and hence it is understandable
that there was a hard-coded line length limit of 1023 characters (which
was chosen a bit awkwardly, given that it is *exactly* one byte short of
aligning with the 41 bytes occupied by a commit name and the following
space or new-line character).

While regular commit histories hardly win comprehensibility in general
if they merge more than twenty-two branches in one go, it is not Git's
business to limit grafts in such a way.

In this particular developer's case, the use case that requires
substantially longer graft lines to be supported is the visualization of
the commits' order implied by their changes: commits are considered to
have an implicit relationship iff exchanging them in an interactive
rebase would result in merge conflicts.

Thusly implied branches tend to be very shallow in general, and the
resulting thicket of implied branches is usually very wide; it is
actually quite common that *most* of the commits in a topic branch have
not even one implied parent, so that a final merge commit has about as
many implied parents as there are commits in said branch.

[jc: squashed in tests by Jonathan]

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-27 16:46:25 -08:00
73b063130b Merge branch 'tg/diff-no-index-refactor'
"git diff ../else/where/A ../else/where/B" when ../else/where is
clearly outside the repository, and "git diff --no-index A B", do
not have to look at the index at all, but we used to read the index
unconditionally.

* tg/diff-no-index-refactor:
  diff: avoid some nesting
  diff: add test for --no-index executed outside repo
  diff: don't read index when --no-index is given
  diff: move no-index detection to builtin/diff.c
2013-12-27 14:58:17 -08:00
604ada435b Merge branch 'jk/cat-file-regression-fix'
"git cat-file --batch=", an admittedly useless command, did not
behave very well.

* jk/cat-file-regression-fix:
  cat-file: handle --batch format with missing type/size
  cat-file: pass expand_data to print_object_or_die
2013-12-27 14:58:11 -08:00
e9ecee0423 Merge branch 'jk/rev-parse-double-dashes'
"git rev-parse <revs> -- <paths>" did not implement the usual
disambiguation rules in the same way as the commands in the
"git log" family.

* jk/rev-parse-double-dashes:
  rev-parse: be more careful with munging arguments
  rev-parse: correctly diagnose revision errors before "--"
2013-12-27 14:58:01 -08:00
7cdebd8a20 Merge branch 'jc/push-refmap'
Make "git push origin master" update the same ref that would be
updated by our 'master' when "git push origin" (no refspecs) is run
while the 'master' branch is checked out, which makes "git push"
more symmetric to "git fetch" and more usable for the triangular
workflow.

* jc/push-refmap:
  push: also use "upstream" mapping when pushing a single ref
  push: use remote.$name.push as a refmap
  builtin/push.c: use strbuf instead of manual allocation
2013-12-27 14:57:50 -08:00
65ea9c3c3d cat-file: provide %(deltabase) batch format
It can be useful for debugging or analysis to see which
objects are stored as delta bases on top of others. This
information is available by running `git verify-pack`, but
that is extremely expensive (and is harder than necessary to
parse).

Instead, let's make it available as a cat-file query format,
which makes it fast and simple to get the bases for a subset
of the objects.
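
A possible way to consume the new format from a script is sketched
below. It assumes that objects not stored as deltas report an
all-zero name as their base; `parse_deltabases` and
`query_deltabases` are hypothetical helper names, and running the
query needs git and a repository.

```python
import subprocess

NULL_SHA1 = "0" * 40  # assumed output for objects that are not deltas


def parse_deltabases(output):
    """Map object name -> delta base (or None) from batch-check output."""
    bases = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 2 or fields[1] == "missing":
            continue  # skip blank lines and objects git could not find
        name, base = fields
        bases[name] = None if base == NULL_SHA1 else base
    return bases


def query_deltabases(object_names, repo="."):
    """Ask git for the delta base of each named object."""
    result = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objectname) %(deltabase)"],
        input="\n".join(object_names) + "\n",
        capture_output=True, text=True, check=True,
    )
    return parse_deltabases(result.stdout)
```

Keeping the parser separate from the subprocess call makes it easy to
test without a repository at hand.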

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26 11:54:26 -08:00
9af270e8c2 do not pretend sha1write returns errors
The sha1write function returns an int, but it will always be
"0". The failure-prone parts of the function happen in the
"flush" callback, which cannot pass an error back to us. So
we just end up calling die() during the flush.

Let's just drop the return value altogether, as it only
confuses callers into thinking that it might be useful.

Only one call site actually checked the return value. We can
drop that check, since it just led to a die() anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26 11:50:20 -08:00
64ed07cee0 add: don't complain when adding empty project root
This behavior was added in 07d7bed (add: don't complain when adding
empty project root - 2009-04-28) then broken by 84b8b5d (remove
match_pathspec() in favor of match_pathspec_depth() -
2013-07-14). Reinstate it.

Noticed-by: Thomas Ferris Nicolaisen <tfnico@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26 10:46:26 -08:00
4454e9cb59 builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX
While at it, rename prune_tmp_object(), which used to be a helper to
remove temporary files that were created to become loose object
files, to prune_tmp_file(), as the function is also used to remove
any random cruft whose name begins with tmp_ directly in .git/objects
or .git/objects/pack directories these days.

Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-18 15:53:56 -08:00
7794a680e6 Sync with 1.8.5.2
* maint:
  Git 1.8.5.2
  cmd_repack(): remove redundant local variable "nr_packs"
2013-12-17 14:12:17 -08:00
1945e8ac85 Merge branch 'tb/clone-ssh-with-colon-for-port'
Be more careful when parsing remote repository URL given in the
scp-style host:path notation.

* tb/clone-ssh-with-colon-for-port:
  git_connect(): use common return point
  connect.c: refactor url parsing
  git_connect(): refactor the port handling for ssh
  git fetch: support host:/~repo
  t5500: add test cases for diag-url
  git fetch-pack: add --diag-url
  git_connect: factor out discovery of the protocol and its parts
  git_connect: remove artificial limit of a remote command
  t5601: add tests for ssh
  t5601: remove clear_ssh, refactor setup_ssh_wrapper
2013-12-17 12:03:32 -08:00
88cb2f96ac Merge branch 'nd/transport-positive-depth-only'
"git fetch --depth=0" was a no-op, and was silently
ignored. Diagnose it as an error.

* nd/transport-positive-depth-only:
  clone,fetch: catch non positive --depth option value
2013-12-17 12:03:29 -08:00
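The check described in 'nd/transport-positive-depth-only' amounts to rejecting any depth that is not a positive integer. A minimal sketch follows, assuming a hypothetical helper named parse_positive_depth(); the function name and its return convention are made up for illustration and are not taken from the series.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a --depth argument.
 * Returns 0 and stores the value in *depth on success,
 * -1 if the argument is not a positive integer.
 */
static int parse_positive_depth(const char *arg, int *depth)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(arg, &end, 10);
	if (end == arg || *end != '\0')
		return -1;	/* empty, non-numeric, or trailing garbage */
	if (errno == ERANGE || value > INT_MAX)
		return -1;	/* out of range for an int */
	if (value <= 0)
		return -1;	/* zero and negative depths are errors */
	*depth = (int)value;
	return 0;
}
```

A caller would report an error when the helper returns -1, rather than silently ignoring "--depth=0" as before.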
ad70448576 Merge branch 'cc/starts-n-ends-with'
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.

* cc/starts-n-ends-with:
  replace {pre,suf}fixcmp() with {starts,ends}_with()
  strbuf: introduce starts_with() and ends_with()
  builtin/remote: remove postfixcmp() and use suffixcmp() instead
  environment: normalize use of prefixcmp() by removing " != 0"
2013-12-17 12:02:44 -08:00
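For readers unfamiliar with the rename in 'cc/starts-n-ends-with': the old prefixcmp()/suffixcmp() followed the strcmp() convention and returned 0 on a match, so call sites read "!prefixcmp(...)". The new names return true on a match. The following is a sketch of what such helpers might look like, written for illustration rather than copied from the series.

```c
#include <string.h>

/* Return 1 if str begins with prefix, 0 otherwise. */
static int starts_with(const char *str, const char *prefix)
{
	for (; ; str++, prefix++) {
		if (!*prefix)
			return 1;	/* consumed the whole prefix */
		else if (*str != *prefix)
			return 0;	/* mismatch, or str ended first */
	}
}

/* Return 1 if str ends with suffix, 0 otherwise. */
static int ends_with(const char *str, const char *suffix)
{
	size_t len = strlen(str), suflen = strlen(suffix);

	if (len < suflen)
		return 0;
	return !strcmp(str + len - suflen, suffix);
}
```

With these, a test such as "if (!prefixcmp(ref, "refs/heads/"))" becomes "if (starts_with(ref, "refs/heads/"))", which reads as the condition it actually checks.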
14a9c5f261 Merge branch 'jl/commit-v-strip-marker'
"git commit -v" appends the patch to the log message before
editing, and then removes the patch when the editor returned
control. However, the patch was not stripped correctly when the
first modified path was a submodule.

* jl/commit-v-strip-marker:
  commit -v: strip diffs and submodule shortlogs from the commit message
2013-12-17 11:47:18 -08:00
fb230b3523 Merge branch 'mm/mv-file-to-no-such-dir-with-slash'
* mm/mv-file-to-no-such-dir-with-slash:
  mv: let 'git mv file no-such-dir/' error out
2013-12-17 11:47:08 -08:00
4d1826d1d9 Merge branch 'fc/trivial'
* fc/trivial:
  remote: fix status with branch...rebase=preserve
  fetch: add missing documentation
  t: trivial whitespace cleanups
  abspath: trivial style fix
2013-12-17 11:46:32 -08:00
c8b928d770 Merge branch 'nd/magic-pathspec' into maint
"git diff -- ':(icase)makefile'" was unnecessarily rejected at the
command line parser.

* nd/magic-pathspec:
  diff: restrict pathspec limitations to diff b/f case only
2013-12-17 11:21:34 -08:00
3e7b066e22 cmd_repack(): remove redundant local variable "nr_packs"
Its value is the same as the number of entries in the "names"
string_list, so just use "names.nr" in its place.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-17 10:54:41 -08:00
c235d960cb prune-packed: use strbuf to avoid having to worry about PATH_MAX
A/very/long/path/to/.git that becomes exactly PATH_MAX bytes long
after being suffixed with /objects/??/??38-hex?? would have overflowed
the on-stack pathname[] buffer.

Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-17 10:43:30 -08:00
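The overflow described in c235d960cb comes from formatting a path into a fixed char[PATH_MAX] array. The commit fixes it with strbuf; the sketch below shows the same general idea (size the buffer from the data instead of assuming a limit) using only standard C. The helper name loose_object_path() is hypothetical and is not part of git.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build "<git_dir>/objects/<2-hex>/<remaining-hex>"
 * in a heap buffer sized to fit, however long git_dir is.
 * Returns NULL on failure; the caller frees the result.
 */
static char *loose_object_path(const char *git_dir, const char *hex)
{
	int len;
	char *path;

	if (strlen(hex) < 3)
		return NULL;	/* need a 2-char fan-out plus a file name */

	/* First pass: ask snprintf how many bytes the path needs. */
	len = snprintf(NULL, 0, "%s/objects/%.2s/%s", git_dir, hex, hex + 2);
	if (len < 0)
		return NULL;

	path = malloc((size_t)len + 1);
	if (!path)
		return NULL;

	/* Second pass: format into a buffer that is exactly large enough. */
	snprintf(path, (size_t)len + 1, "%s/objects/%.2s/%s", git_dir, hex, hex + 2);
	return path;
}
```

Because the length is computed from the inputs, a git_dir that is exactly PATH_MAX bytes long after the suffix is appended is handled like any other.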
aad90e85f8 diff: avoid some nesting
Avoid some nesting in builtin/diff.c, to make the code easier to read.
There are no functional changes.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-16 13:13:05 -08:00
577aed296a Merge branch 'jk/remove-deprecated'
* jk/remove-deprecated:
  stop installing git-tar-tree link
  peek-remote: remove deprecated alias of ls-remote
  lost-found: remove deprecated command
  tar-tree: remove deprecated command
  repo-config: remove deprecated alias for "git config"
2013-12-12 14:18:34 -08:00
e66ef7ae6f Merge branch 'mh/fetch-tags-in-addition-to-normal-refs'
The "--tags" option to "git fetch" used to be literally a synonym for
a "refs/tags/*:refs/tags/*" refspec, which meant that (1) as an
explicit refspec given from the command line, it silenced the lazy
"git fetch" default that is configured, and (2) also as an explicit
refspec given from the command line, it interacted with "--prune"
to remove any tag that the remote we are fetching from does not
have.

This demotes it to an option; with it, we fetch all tags in
addition to what would be fetched without the option, and it does
not affect how "--prune" decides which local remote-tracking refs
are missing their remote counterpart.

* mh/fetch-tags-in-addition-to-normal-refs: (23 commits)
  fetch: improve the error messages emitted for conflicting refspecs
  handle_duplicate(): mark error message for translation
  ref_remote_duplicates(): extract a function handle_duplicate()
  ref_remove_duplicates(): simplify loop logic
  t5536: new test of refspec conflicts when fetching
  ref_remove_duplicates(): avoid redundant bisection
  git-fetch.txt: improve description of tag auto-following
  fetch-options.txt: simplify ifdef/ifndef/endif usage
  fetch, remote: properly convey --no-prune options to subprocesses
  builtin/remote.c:update(): use struct argv_array
  builtin/remote.c: reorder function definitions
  query_refspecs(): move some constants out of the loop
  fetch --prune: prune only based on explicit refspecs
  fetch --tags: fetch tags *in addition to* other stuff
  fetch: only opportunistically update references based on command line
  get_expanded_map(): avoid memory leak
  get_expanded_map(): add docstring
  builtin/fetch.c: reorder function definitions
  get_ref_map(): rename local variables
  api-remote.txt: correct section "struct refspec"
  ...
2013-12-12 14:14:10 -08:00
6df5762db3 diff: don't read index when --no-index is given
git diff --no-index ... currently reads the index, during setup, when
calling gitmodules_config().  This results in worse performance when the
index is not actually needed.  This patch avoids calling
gitmodules_config() when the --no-index option is given.  The times for
executing "git diff --no-index" in the WebKit repository are improved as
follows:

Test                      HEAD~3            HEAD
------------------------------------------------------------------
4001.1: diff --no-index   0.24(0.15+0.09)   0.01(0.00+0.00) -95.8%

An additional improvement of this patch is that "git diff --no-index" no
longer breaks when the index file is corrupt, which makes it possible to
use it for investigating the broken repository.

To improve its usefulness as an investigation tool for broken
repositories, setup_git_directory_gently() is also not called when the
--no-index option is given.

Also add a test to guard against future breakages, and a performance
test to show the improvements.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 12:23:02 -08:00
470faf9654 diff: move no-index detection to builtin/diff.c
Currently the --no-index option is parsed in diff_no_index().  Move the
detection of whether a no-index diff should be executed to
builtin/diff.c, where we can use it to execute diff_no_index()
conditionally.  This will also allow us to execute other operations
conditionally, which will be done in the next patch.

There are no functional changes.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 12:23:02 -08:00