Commit Graph

1416 Commits

Author SHA1 Message Date
89641472aa revert: Separate cmdline parsing from functional code
Currently, revert_or_cherry_pick sets up a default git config, parses
command-line arguments, before preparing to pick commits.  This makes
for a bad API as the central worry of callers is to assert whether or
not a conflict occured while cherry picking.  The current API is like:

  if (revert_or_cherry_pick(argc, argv, opts) < 0)
     print "Something failed, we're not sure what"

Simplify and rename revert_or_cherry_pick to pick_commits so that it
only has the responsibility of setting up the revision walker and
picking commits in a loop.  Transfer the remaining work to its
callers.  Now, the API is simplified as:

  if (parse_args(argc, argv, opts) < 0)
     print "Can't parse arguments"
  if (pick_commits(opts) < 0)
     print "Error encountered in picking machinery"

Later in the series, pick_commits will also serve as the starting
point for continuing a cherry-pick or revert.

Inspired-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:40:43 -07:00
80e1f79188 revert: Introduce struct to keep command-line options
The current code uses a set of file-scope static variables to tell the
cherry-pick/ revert machinery how to replay the changes, and
initializes them by parsing the command-line arguments.  In later
steps in this series, we would like to introduce an API function that
calls into this machinery directly and have a way to tell it what to
do.  Hence, introduce a structure to group these variables, so that
the API can take them as a single replay_options parameter.  The only
exception is the variable "me" -- remove it since it not an
independent option, and can be inferred from the action.

Unfortunately, this patch introduces a minor regression.  Parsing
strategy-option violates a C89 rule: Initializers cannot refer to
variables whose address is not known at compile time.  Currently, this
rule is violated by some other parts of Git as well, and it is
possible to get GCC to report these instances using the "-std=c89
-pedantic" option.

Inspired-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Jonathan Nieder <jrnieder@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:40:43 -07:00
708f9d96d9 revert: Eliminate global "commit" variable
Functions which act on commits currently rely on a file-scope static
variable to be set before they're called.  Consequently, the API and
corresponding callsites are ugly and unclear.  Remove this variable
and change their API to accept the commit to act on as additional
argument so that the callsites change from looking like

  commit = prepare_a_commit();
  act_on_commit();

to looking like

  commit = prepare_a_commit();
  act_on_commit(commit);

This change is also in line with our long-term goal of exposing some
of these functions through a public API.

Inspired-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:40:42 -07:00
54decbd4d8 revert: Rename no_replay to record_origin
The "-x" command-line option is used to record the name of the
original commits being picked in the commit message.  The variable
corresponding to this option is named "no_replay" for historical
reasons; the name is especially confusing because the term "replay" is
used to describe what cherry-pick does (for example, in the
documentation of the "--mainline" option).  So, give the variable
corresponding to the "-x" command-line option a better name:
"record_origin".

Mentored-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:40:42 -07:00
a2ec3ad28f revert: Don't check lone argument in get_encoding
The only place get_encoding uses the global "commit" variable is when
writing an error message explaining that its lone argument was NULL.
Since the function's only caller ensures that a NULL argument isn't
passed, we can remove this check with two beneficial consequences:

1. Since the function doesn't use the global "commit" variable any
   more, it won't need to change when we eliminate the global variable
   later in the series.
2. Translators no longer need to localize an error message that will
   never be shown.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:40:42 -07:00
be33c46cda revert: Simplify and inline add_message_to_msg
The add_message_to_msg function has some dead code, an unclear API,
only one callsite.  While it originally intended fill up an empty
commit message with the commit object name while picking, it really
doesn't do this -- a bug introduced in v1.5.1-rc1~65^2~2 (Make
git-revert & git-cherry-pick a builtin, 2007-03-01).  Today, tests in
t3505-cherry-pick-empty.sh indicate that not filling up an empty
commit message is the desired behavior.  Re-implement and inline the
function accordingly, with a beneficial side-effect: don't dereference
a NULL pointer when the commit doesn't have a delimeter after the
header.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:40:41 -07:00
38ef61cfde advice: Introduce error_resolve_conflict
Enable future callers to report a conflict and not die immediately by
introducing a new function called error_resolve_conflict.
Re-implement die_resolve_conflict as a call to error_resolve_conflict
followed by a call to die.  Consequently, the message printed by
die_resolve_conflict changes from

  fatal: 'commit' is not possible because you have unmerged files.
         Please, fix them up in the work tree ...
         ...

to

  error: 'commit' is not possible because you have unmerged files.
  hint: Fix them up in the work tree ...
  hint: ...
  fatal: Exiting because of an unresolved conflict.

Hints are printed using the same advise function introduced in
v1.7.3-rc0~26^2~3 (Introduce advise() to print hints, 2010-08-11).

Inspired-by: Christian Couder <chistian.couder@gmail.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:40:41 -07:00
fee92fc1dd bisect: introduce support for --no-checkout option.
If --no-checkout is specified, then the bisection process uses:

	git update-ref --no-deref HEAD <trial>

at each trial instead of:

	git checkout <trial>

Improved-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:34:32 -07:00
8894d53580 commit: allow partial commits with relative paths
In order to do partial commits, git-commit overlays a tree on the
cache and checks pathspecs against the result. Currently, the
overlaying is done using "prefix" which prevents relative pathspecs
with ".." and absolute pathspec from matching when they refer to
files not under "prefix" and absent from the index, but still in
the tree (i.e.  files staged for removal).

The point of providing a prefix at all is performance optimization.
If we say there is no common prefix for the files of interest, then
we have to read the entire tree into the index.

But even if we cannot use the working directory as a prefix, we can
still figure out if there is a common prefix for all given paths,
and use that instead. The pathspec_prefix() routine from ls-files.c
does exactly that.

Any use of global variables is removed from pathspec_prefix() so
that it can be called from commit.c.

Reported-by: Reuben Thomas <rrt@sc3d.org>
Analyzed-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-02 14:20:35 -07:00
317f63c21c grep: long context options
Take long option names for -A (--after-context), -B (--before-context)
and -C (--context) from GNU grep and add a similar long option name
for -W (--function-context).

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-01 16:11:50 -07:00
ba8ea7496f grep: add option to show whole function as context
Add a new option, -W, to show the whole surrounding function of a match.

It uses the same regular expressions as -p and diff to find the beginning
of sections.

Currently it will not display comments in front of a function, but those
that are following one.  Despite this shortcoming it is already useful,
e.g. to simply see a more complete applicable context or to extract whole
functions.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-01 16:09:15 -07:00
8ab19bc5de Merge branch 'jk/clone-detached'
* jk/clone-detached:
  clone: always fetch remote HEAD
  make copy_ref globally available
  consider only branches in guess_remote_head
  t: add tests for cloning remotes with detached HEAD
2011-08-01 15:00:35 -07:00
59d9ba869e Merge branch 'sr/transport-helper-fix'
* sr/transport-helper-fix: (21 commits)
  transport-helper: die early on encountering deleted refs
  transport-helper: implement marks location as capability
  transport-helper: Use capname for refspec capability too
  transport-helper: change import semantics
  transport-helper: update ref status after push with export
  transport-helper: use the new done feature where possible
  transport-helper: check status code of finish_command
  transport-helper: factor out push_update_refs_status
  fast-export: support done feature
  fast-import: introduce 'done' command
  git-remote-testgit: fix error handling
  git-remote-testgit: only push for non-local repositories
  remote-curl: accept empty line as terminator
  remote-helpers: export GIT_DIR variable to helpers
  git_remote_helpers: push all refs during a non-local export
  transport-helper: don't feed bogus refs to export push
  git-remote-testgit: import non-HEAD refs
  t5800: document some non-functional parts of remote helpers
  t5800: use skip_all instead of prereq
  t5800: factor out some ref tests
  ...
2011-08-01 15:00:14 -07:00
1df561fb48 Merge branch 'jc/maint-reset-unmerged-path'
* jc/maint-reset-unmerged-path:
  reset [<commit>] paths...: do not mishandle unmerged paths
2011-08-01 15:00:08 -07:00
055f2c5d34 Merge branch 'jc/maint-1.7.3-checkout-describe' into maint
* jc/maint-1.7.3-checkout-describe:
  checkout -b <name>: correctly detect existing branch
2011-08-01 14:43:18 -07:00
bf01d4a334 reflog: actually default to subcommand 'show'
The reflog manpage says:

	git reflog [show] [log-options] [<ref>]

the subcommand 'show' is the default "in the absence of any
subcommands". Currently this is only true if the user provided either
at least one option or no additional argument at all. For example:

	git reflog master

won't work. Change this by actually calling cmd_log_reflog in
absence of any subcommand.

Signed-off-by: Michael Schubert <mschub@elegosoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-01 10:52:34 -07:00
90a6c7d443 propagate --quiet to send-pack/receive-pack
Currently, git push --quiet produces some non-error output, e.g.:

 $ git push --quiet
 Unpacking objects: 100% (3/3), done.

Add the --quiet option to send-pack/receive-pack and pass it to
unpack-objects in the receive-pack codepath and to receive-pack in
the push codepath.

This fixes a bug reported for the fedora git package:

 https://bugzilla.redhat.com/show_bug.cgi?id=725593

Reported-by: Jesse Keating <jkeating@redhat.com>
Cc: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-31 18:45:41 -07:00
04f89259a6 Ensure git ls-tree exits with a non-zero exit code if read_tree_recursive fails.
In the case of a corrupt repository, git ls-tree may report an error but
presently it exits with a code of 0.

This change uses the return code of read_tree_recursive instead.

Improved-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-25 10:50:11 -07:00
9c81e6421b Merge branch 'jk/tag-contains-ab'
* jk/tag-contains-ab:
  Revert clock-skew based attempt to optimize tag --contains traversal
  git skew: a tool to find how big a clock skew exists in the history
  default core.clockskew variable to one day
  limit "contains" traversals based on commit timestamp
  tag: speed up --contains calculation
2011-07-22 14:45:19 -07:00
4f9aba9c26 Merge branch 'jc/checkout-reflog-fix'
* jc/checkout-reflog-fix:
  checkout: do not write bogus reflog entry out
2011-07-22 14:43:03 -07:00
d04520e344 reset: give better reflog messages
The reset command creates its reflog entry from argv.
However, it does so after having run parse_options, which
means the only thing left in argv is any non-option
arguments. Thus you would end up with confusing reflog
entries like:

  $ git reset --hard HEAD^
  $ git reset --soft HEAD@{1}
  $ git log -2 -g --oneline
  8e46cad HEAD@{0}: HEAD@{1}: updating HEAD
  1eb9486 HEAD@{1}: HEAD^: updating HEAD

However, we must also consider that some scripts may set
GIT_REFLOG_ACTION before calling reset, and we need to show
their reflog action (with our text appended). For example:

  rebase -i (squash): updating HEAD

On top of that, we also set the ORIG_HEAD reflog action
(even though it doesn't generally exist). In that case, the
reset argument is somewhat meaningless, as it has nothing to
do with what's in ORIG_HEAD.

This patch changes the reset reflog code to show:

  $GIT_REFLOG_ACTION: updating {HEAD,ORIG_HEAD}

as before, but only if GIT_REFLOG_ACTION is set. Otherwise,
show:

   reset: moving to $rev

for HEAD, and:

   reset: updating ORIG_HEAD

for ORIG_HEAD (this is still somewhat superfluous, since we
are in the ORIG_HEAD reflog, obviously, but at least we now
mention which command was used to update it).

While we're at it, we can clean up the code a bit:

 - Use strbufs to make the message.

 - Use the "rev" parameter instead of showing all options.
   This makes more sense, since it is the only thing
   impacting the writing of the ref.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-22 13:54:57 -07:00
82670a5cb5 fast-export: support done feature
If fast-export is being used to generate a fast-import stream that
will be used afterwards it is desirable to indicate the end of the
stream with the new 'done' command.

Add a flag that causes fast-export to end with 'done'.

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-19 11:17:47 -07:00
d907bf8ef3 Merge branch 'jc/index-pack'
* jc/index-pack:
  verify-pack: use index-pack --verify
  index-pack: show histogram when emulating "verify-pack -v"
  index-pack: start learning to emulate "verify-pack -v"
  index-pack: a miniscule refactor
  index-pack --verify: read anomalous offsets from v2 idx file
  write_idx_file: need_large_offset() helper function
  index-pack: --verify
  write_idx_file: introduce a struct to hold idx customization options
  index-pack: group the delta-base array entries also by type

Conflicts:
	builtin/verify-pack.c
	cache.h
	sha1_file.c
2011-07-19 09:54:51 -07:00
765c7e4f31 Merge branch 'jk/archive-tar-filter'
* jk/archive-tar-filter:
  upload-archive: allow user to turn off filters
  archive: provide builtin .tar.gz filter
  archive: implement configurable tar filters
  archive: refactor file extension format-guessing
  archive: move file extension format-guessing lower
  archive: pass archiver struct to write_archive callback
  archive: refactor list of archive formats
  archive-tar: don't reload default config options
  archive: reorder option parsing and config reading
2011-07-19 09:45:32 -07:00
ff94409da9 Merge branch 'jk/clone-cmdline-config'
* jk/clone-cmdline-config:
  clone: accept config options on the command line
  config: make git_config_parse_parameter a public function
  remote: use new OPT_STRING_LIST
  parse-options: add OPT_STRING_LIST helper
2011-07-19 09:45:24 -07:00
20a80d04a4 Merge branch 'jk/tag-list-multiple-patterns'
* jk/tag-list-multiple-patterns:
  tag: accept multiple patterns for --list
2011-07-19 09:45:15 -07:00
eb4f4076aa Merge branch 'jc/zlib-wrap'
* jc/zlib-wrap:
  zlib: allow feeding more than 4GB in one go
  zlib: zlib can only process 4GB at a time
  zlib: wrap deflateBound() too
  zlib: wrap deflate side of the API
  zlib: wrap inflateInit2 used to accept only for gzip format
  zlib: wrap remaining calls to direct inflate/inflateEnd
  zlib wrapper: refactor error message formatter

Conflicts:
	sha1_file.c
2011-07-19 09:33:04 -07:00
c6d72c4972 Revert clock-skew based attempt to optimize tag --contains traversal
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-14 11:02:06 -07:00
ff00b682f2 reset [<commit>] paths...: do not mishandle unmerged paths
Because "diff --cached HEAD" showed an incorrect blob object name on the
LHS of the diff, we ended up updating the index entry with bogus value,
not what we read from the tree.

Noticed by John Nowak.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-13 21:39:18 -07:00
6b01ecfe22 ref namespaces: Support remote repositories via upload-pack and receive-pack
Change upload-pack and receive-pack to use the namespace-prefixed refs
when working with the repository, and use the unprefixed refs when
talking to the client, maintaining the masquerade.  This allows
clone, pull, fetch, and push to work with a suitably configured
GIT_NAMESPACE.

receive-pack advertises refs outside the current namespace as .have refs
(as it currently does for refs in alternates), so that the client can
use them to minimize data transfer but will otherwise ignore them.

With appropriate configuration, this also allows http-backend to expose
namespaces as multiple repositories with different paths.  This only
requires setting GIT_NAMESPACE, which http-backend passes through to
upload-pack and receive-pack.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jamey Sharp <jamey@minilop.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-11 09:35:38 -07:00
1b4bb16b9e pack-objects: optimize "recency order"
This optimizes the "recency order" (see pack-heuristics.txt in
Documentation/technical/ directory) used to order objects within a
packfile in three ways:

 - Commits at the tip of tags are written together, in the hope that
   revision traversal done in incremental fetch (which starts by
   putting them in a revision queue marked as UNINTERESTING) will see a
   better locality of these objects;

 - In the original recency order, trees and blobs are intermixed. Write
   trees together before blobs, in the hope that this will improve
   locality when running pathspec-limited revision traversal, i.e.
   "git log paths...";

 - When writing blob objects out, write the whole family of blobs that use
   the same delta base object together, by starting from the root of the
   delta chain, and writing its immediate children in a width-first
   manner, in the hope that this will again improve locality when reading
   blobs that belong to the same path, which are likely to be deltified
   against each other.

I tried various workloads in the Linux kernel repositories (HEAD at
v3.0-rc6-71-g4dd1b49) packed with v1.7.6 and with this patch, counting how
large seeks are needed between adjacent accesses to objects in the pack,
and the result looks promising.  The history has 2072052 objects, weighing
some 490MiB.

 * Simple commit-only log.

   $ git log >/dev/null

   There are 254656 commits in total.

                                  v1.7.6  with patch
   Total number of access :      258,031     258,032
          0.0% percentile :           12          12
         10.0% percentile :          259         259
         20.0% percentile :          294         294
         30.0% percentile :          326         326
         40.0% percentile :          363         363
         50.0% percentile :          415         415
         60.0% percentile :          513         513
         70.0% percentile :          857         858
         80.0% percentile :       10,434      10,441
         90.0% percentile :       91,985      91,996
         95.0% percentile :      260,852     260,885
         99.0% percentile :    1,150,680   1,152,811
         99.9% percentile :    3,148,435   3,148,435
       Less than 2MiB seek:       99.70%      99.69%

   95% of the pack accesses look at data that is no further than 260kB
   from the previous location we accessed. The patch does not change the
   order of commit objects very much, and the result is very similar.

 * Pathspec-limited log.

   $ git log drivers/net >/dev/null

   The path is touched by 26551 commits and merges (among 254656 total).

                                  v1.7.6  with patch
   Total number of access :      559,511     558,663
          0.0% percentile :            0           0
         10.0% percentile :          182         167
         20.0% percentile :          259         233
         30.0% percentile :          357         304
         40.0% percentile :          714         485
         50.0% percentile :        5,046       3,976
         60.0% percentile :      688,671     443,578
         70.0% percentile :  319,574,732 110,370,100
         80.0% percentile :  361,647,599 123,707,229
         90.0% percentile :  393,195,669 128,947,636
         95.0% percentile :  405,496,875 131,609,321
         99.0% percentile :  412,942,470 133,078,115
         99.5% percentile :  413,172,266 133,163,349
         99.9% percentile :  413,354,356 133,240,445
       Less than 2MiB seek:       61.71%      62.87%

   With the current pack heuristics, more than 30% of accesses have to
   seek further than 300MB; the updated pack heuristics ensures that less
   than 0.1% of accesses have to seek further than 135MB. This is largely
   due to the fact that the updated heuristics does not mix blobs and
   trees together.

 * Blame.

   $ git blame drivers/net/ne.c >/dev/null

   The path is touched by 34 commits and merges.

                                  v1.7.6  with patch
   Total number of access :      178,147     178,166
          0.0% percentile :            0           0
         10.0% percentile :          142         139
         20.0% percentile :          222         194
         30.0% percentile :          373         300
         40.0% percentile :        1,168         837
         50.0% percentile :       11,248       7,334
         60.0% percentile :  305,121,284 106,850,130
         70.0% percentile :  361,427,854 123,709,715
         80.0% percentile :  388,127,343 128,171,047
         90.0% percentile :  399,987,762 130,200,707
         95.0% percentile :  408,230,673 132,174,308
         99.0% percentile :  412,947,017 133,181,160
         99.5% percentile :  413,312,798 133,220,425
         99.9% percentile :  413,352,366 133,269,051
       Less than 2MiB seek:       56.47%      56.83%

   The result is very similar to the pathspec-limited log above, which
   only looks at the tree objects.

 * Packing recent history.

   $ (git for-each-ref --format='^%(refname)' refs/tags; echo HEAD) |
     git pack-objects --revs --stdout >/dev/null

   This should pack data worth 71 commits.

                                  v1.7.6  with patch
   Total number of access :       11,511      11,514
          0.0% percentile :            0           0
         10.0% percentile :           48          47
         20.0% percentile :          134          98
         30.0% percentile :          332         178
         40.0% percentile :        1,386         293
         50.0% percentile :        8,030         478
         60.0% percentile :       33,676       1,195
         70.0% percentile :      147,268      26,216
         80.0% percentile :    9,178,662     464,598
         90.0% percentile :   67,922,665     965,782
         95.0% percentile :   87,773,251   1,226,102
         99.0% percentile :   98,011,763   1,932,377
         99.5% percentile :  100,074,427  33,642,128
         99.9% percentile :  105,336,398 275,772,650
       Less than 2MiB seek:       77.09%      99.04%

    The long-tail part of the result looks worse with the patch, but
    the change helps majority of the access. 99.04% of the accesses
    need less than 2MiB of seeking, compared to 77.09% with the current
    packing heuristics.

 * Index pack.

   $ git index-pack -v .git/objects/pack/pack*.pack

                                  v1.7.6  with patch
   Total number of access :    2,791,228   2,788,802
          0.0% percentile :            9           9
         10.0% percentile :          140          89
         20.0% percentile :          233         167
         30.0% percentile :          322         235
         40.0% percentile :          464         310
         50.0% percentile :          862         423
         60.0% percentile :        2,566         686
         70.0% percentile :       25,827       1,498
         80.0% percentile :    1,317,862       4,971
         90.0% percentile :   11,926,385     119,398
         95.0% percentile :   41,304,149     952,519
         99.0% percentile :  227,613,070   6,709,650
         99.5% percentile :  321,265,121  11,734,871
         99.9% percentile :  382,919,785  33,155,191
       Less than 2MiB seek:       81.73%      96.92%

   As the index-pack command already walks objects in the delta chain
   order, writing the blobs out in the delta chain order seems to
   drastically improve the locality of access.

Note that a half-a-gigabyte packfile comfortably fits in the buffer cache,
and you would unlikely to see much performance difference on a modern and
reasonably beefy machine with enough memory and local disks. Benchmarking
with cold cache (or over NFS) would be interesting.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-08 10:03:24 -07:00
25d33546d4 Merge commit 'v1.7.6' into jc/checkout-reflog-fix
* commit 'v1.7.6': (3211 commits)
  Git 1.7.6
  completion: replace core.abbrevguard to core.abbrev
  Git 1.7.6-rc3
  Documentation: git diff --check respects core.whitespace
  gitweb: 'pickaxe' and 'grep' features requires 'search' to be enabled
  t7810: avoid unportable use of "echo"
  plug a few coverity-spotted leaks
  builtin/gc.c: add missing newline in message
  tests: link shell libraries into valgrind directory
  t/Makefile: pass test opts to valgrind target properly
  sh-i18n--envsubst.c: do not #include getopt.h
  Fix typo: existant->existent
  Git 1.7.6-rc2
  gitweb: do not misparse nonnumeric content tag files that contain a digit
  Git 1.7.6-rc1
  fetch: do not leak a refspec
  t3703: skip more tests using colons in file names on Windows
  gitweb: Fix usability of $prevent_xss
  gitweb: Move "Requirements" up in gitweb/INSTALL
  gitweb: Describe CSSMIN and JSMIN in gitweb/INSTALL
  ...
2011-07-06 15:38:28 -07:00
b792c06787 branch -v: honor core.abbrev
Use the value from 'core.abbrev' configuration variable unless user
specifies the length on command line when showing commit object name
in "branch -v" output.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-01 11:22:09 -07:00
188c35e36d git skew: a tool to find how big a clock skew exists in the history
> As you probably guessed from the specificity of the number, I wrote a
> short program to actually traverse and find the worst skew. It takes
> about 5 seconds to run (unsurprisingly, since it is doing the same full
> traversal that we end up doing in the above numbers). So we could
> "autoskew" by setting up the configuration on clone, and then
> periodically updating it as part of "git gc".

This patch doesn't implement auto-detection of skew, but is the program
I used to calculate, and would provide the basis for such
auto-detection. It would be interesting to see average skew numbers for
popular repositories. You can run it as "git skew --all".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-30 12:21:13 -07:00
55ac692661 Merge branch 'jc/streaming' into next
* jc/streaming:
  sha1_file: use the correct type (ssize_t, not size_t) for read-style function
  streaming: read loose objects incrementally
  sha1_file.c: expose helpers to read loose objects
  streaming: read non-delta incrementally from a pack
  streaming_write_entry(): support files with holes
  convert: CRLF_INPUT is a no-op in the output codepath
  streaming_write_entry(): use streaming API in write_entry()
  streaming: a new API to read from the object store
  write_entry(): separate two helper functions out
  unpack_object_header(): make it public
  sha1_object_info_extended(): hint about objects in delta-base cache
  sha1_object_info_extended(): expose a bit more info
  packed_object_info_detail(): do not return a string
2011-06-29 17:09:27 -07:00
1692d0c64a Merge branch 'rs/grep-color'
* rs/grep-color:
  grep: add --heading
  grep: add --break
  grep: fix coloring of hunk marks between files
2011-06-29 17:03:13 -07:00
b985f2aeca Merge branch 'jc/maint-1.7.3-checkout-describe'
* jc/maint-1.7.3-checkout-describe:
  checkout -b <name>: correctly detect existing branch
2011-06-29 17:03:12 -07:00
0faf247485 Merge branch 'jc/advice-about-to-lose-commit'
* jc/advice-about-to-lose-commit:
  checkout: make advice when reattaching the HEAD less loud

Conflicts:
	builtin/checkout.c
2011-06-29 17:03:10 -07:00
f5cfd52f7b Merge branch 'maint-1.7.5' into maint
* maint-1.7.5:
  test: skip clean-up when running under --immediate mode
  "branch -d" can remove more than one branches
2011-06-29 16:41:55 -07:00
534cea3fce "branch -d" can remove more than one branches
Since 03feddd (git-check-ref-format: reject funny ref names, 2005-10-13),
"git branch -d" can take more than one branch names to remove.

The documentation was correct, but the usage string was not.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-29 16:35:29 -07:00
84054f79de clone: accept config options on the command line
Clone does all of init, "remote add", fetch, and checkout
without giving the user a chance to intervene and set any
configuration. This patch allows you to set config options
in the newly created repository after the clone, but before
we do any other operations.

In many cases, this is a minor convenience over something
like:

  git clone git://...
  git config core.whatever true

But in some cases, it can bring extra efficiency by changing
how the fetch or checkout work. For example, setting
line-ending config before the checkout avoids having to
re-checkout all of the contents with the correct line
endings.

It also provides a mechanism for passing information to remote
helpers during a clone; the helpers may read the git config
to influence how they operate.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:25:21 -07:00
615ff912c5 remote: use new OPT_STRING_LIST
This saves us having our own callback function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:25:20 -07:00
7b97730b76 upload-archive: allow user to turn off filters
Some tar filters may be very expensive to run, so sites do
not want to expose them via upload-archive. This patch lets
users configure tar.<filter>.remote to turn them off.

By default, gzip filters are left on, as they are about as
expensive as creating zip archives.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
56baa61d01 archive: move file extension format-guessing lower
The process for guessing an archive output format based on
the filename is something like this:

  a. parse --output in cmd_archive; check the filename
     against a static set of mapping heuristics (right now
     it just matches ".zip" for zip files).

  b. if found, stick a fake "--format=zip" at the beginning
     of the arguments list (if the user did specify a
     --format manually, the later option will override our
     fake one)

  c. if it's a remote call, ship the arguments to the remote
     (including the fake), which will call write_archive on
     their end

  d. if it's local, ship the arguments to write_archive
     locally

There are two problems:

  1. The set of mappings is static and at too high a level.
     The write_archive level is going to check config for
     user-defined formats, some of which will specify
     extensions. We need to delay lookup until those are
     parsed, so we can match against them.

  2. For a remote archive call, our set of mappings (or
     formats) may not match the remote side's. This is OK in
     practice right now, because all versions of git
     understand "zip" and "tar". But as new formats are
     added, there is going to be a mismatch between what the
     client can do and what the remote server can do.

To fix (1), this patch refactors the location guessing to
happen at the write_archive level, instead of the
cmd_archive level. So instead of sticking a fake --format
field in the argv list, we actually pass a "name hint" down
the callchain; this hint is used at the appropriate time to
guess the format (if one hasn't been given already).

This patch leaves (2) unfixed. The name_hint is converted to
a "--format" option as before, and passed to the remote.
This means the local side's idea of how extensions map to
formats will take precedence.

Another option would be to pass the name hint to the remote
side and let the remote choose. This isn't a good idea for
two reasons:

  1. There's no room in the protocol for passing that
     information. We can pass a new argument, but older
     versions of git on the server will choke on it.

  2. Letting the remote side decide creates a silent
     inconsistency in user experience. Consider the case
     that the locally installed git knows about the "tar.gz"
     format, but a remote server doesn't.

     Running "git archive -o foo.tar.gz" will use the tar.gz
     format. If we use --remote, and the local side chooses
     the format, then we send "--format=tar.gz" to the
     remote, which will complain about the unknown format.
     But if we let the remote side choose the format, then
     it will realize that it doesn't know about "tar.gz" and
     output uncompressed tar without even issuing a warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
dc4cd76710 plug a few coverity-spotted leaks
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-20 14:27:36 -07:00
588d0e834b tag: accept multiple patterns for --list
Until now, "git tag -l foo* bar*" would silently ignore the
second argument, showing only refs starting with "foo". It's
not just unfriendly not to take a second pattern; we
actually generated subtly wrong results (from the user's
perspective) because some of the requested tags were
omitted.

This patch allows an arbitrary number of patterns on the
command line; if any of them matches, the ref is shown.

While we're tweaking the documentation, let's also make it
clear that the pattern is fnmatch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-20 13:00:54 -07:00
b93d2ff3aa Merge branch 'maint'
* maint:
  builtin/gc.c: add missing newline in message
2011-06-19 16:01:51 -07:00
daab4eeafa builtin/gc.c: add missing newline in message
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-19 14:46:39 -07:00
de9f14e26a default core.clockskew variable to one day
This is the slop value used by name-rev, so presumably is a
reasonable default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-11 22:32:30 -07:00
0c811a7a6f limit "contains" traversals based on commit timestamp
When looking for commits that contain other commits (e.g.,
via "git tag --contains"), we can end up traversing useless
portions of the graph. For example, if I am looking for a
tag that contains a commit made last week, there is not much
point in traversing portions of the history graph made five
years ago.

This optimization can provide massive speedups. For example,
doing "git tag --contains HEAD~200" in the linux-2.6
repository goes from:

  real    0m5.302s
  user    0m5.116s
  sys     0m0.184s

to:

  real    0m0.030s
  user    0m0.020s
  sys     0m0.008s

The downside is that we will no longer find some answers in
the face of extreme clock skew, as we will stop the
traversal early when seeing commits skewed too far into the
past.

Name-rev already implements a similar optimization, using a
"slop" of one day to allow for a certain amount of clock
skew in commit timestamps. This patch introduces a
"core.clockskew" variable, which allows specifying the
allowable amount of clock skew in seconds.  For safety, it
defaults to "none", causing a full traversal (i.e., no
change in behavior from previous versions).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-11 22:32:30 -07:00