As of 95c6f271 "dir.c: unify is_excluded and is_path_excluded APIs", the
is_excluded API no longer recurses into directories that match an ignore
pattern, and returns the directory's ignored state for all contained paths.
This is OK for normal ignore patterns, i.e. ignoring a directory affects
the entire contents recursively.
Unfortunately, this also "works" for negated ignore patterns ('!dir'), i.e.
the entire contents is "not-ignored" recursively, regardless of ignore
patterns that match the contents directly.
In prep_exclude, skip recursing into a directory only if it is really
ignored (i.e. the ignore pattern is not negated).
Signed-off-by: Karsten Blees <blees@dcon.de>
Tested-by: Øystein Walle <oystwa@gmail.com>
Reviewed-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.
fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).
The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.
We don't want to break the existing API, so lets introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.
In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.
In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.
builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.
Update API documentation to reflect the changes.
Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' recursively scans directories up to three times:
1. To collect untracked files.
2. To collect ignored files.
3. When collecting ignored files, to check that an untracked directory
that potentially contains ignored files doesn't also contain untracked
files (i.e. isn't already listed as untracked).
Let's get rid of case 3 first.
Currently, read_directory_recursive returns a boolean whether a directory
contains the requested files or not (actually, it returns the number of
files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies
what we're looking for.
To be able to test for both untracked and ignored files in a single scan,
we need to return a bit more info, and the result must be independent of
the DIR_SHOW_IGNORED flag.
Reuse the path_treatment enum as return value of read_directory_recursive.
Split path_handled in two separate values path_excluded and path_untracked
that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't
need an extra value path_untracked_and_excluded, as directories with both
untracked and ignored files should be listed as untracked.
Rename path_ignored to path_none for clarity (i.e. "don't treat that path"
in contrast to "the path is ignored and should be treated according to
DIR_SHOW_IGNORED").
Replace enum directory_treatment with path_treatment. That's just another
enum with the same meaning, no need to translate back and forth.
In treat_directory, get rid of the extra read_directory_recursive call and
all the DIR_SHOW_IGNORED-specific code.
In read_directory_recursive, decide whether to dir_add_name path_excluded
or path_untracked paths based on the DIR_SHOW_IGNORED flag.
The return value of read_directory_recursive is the maximum path_treatment
of all files and sub-directories. In the check_only case, abort when we've
reached the most significant value (path_untracked).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checking if a file is in the index is much faster (hashtable lookup) than
checking if the file is excluded (linear search over exclude patterns).
Skip is_excluded checks for files: move the cache_name_exists check from
treat_file to treat_one_path and return early if the file is tracked.
This can safely be done as all other code paths also return path_ignored
for tracked files, and dir_add_ignored skips tracked files as well.
There's just one line left in treat_file, so move this to treat_one_path
as well.
Here's some performance data for git-status from the linux and WebKit
repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true):
| status | status --ignored
| linux | WebKit | linux | WebKit
-------+-------+--------+-------+---------
before | 0.218 | 1.583 | 0.321 | 2.579
after | 0.156 | 0.988 | 0.202 | 1.279
gain | 1.397 | 1.602 | 1.589 | 2.016
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The is_excluded and is_path_excluded APIs are very similar, except for a
few noteworthy differences:
is_excluded doesn't handle ignored directories, results for paths within
ignored directories are incorrect. This is probably based on the premise
that recursive directory scans should stop at ignored directories, which
is no longer true (in certain cases, read_directory_recursive currently
calls is_excluded *and* is_path_excluded to get correct ignored state).
is_excluded caches parsed .gitignore files of the last directory in struct
dir_struct. If the directory changes, it finds a common parent directory
and is very careful to drop only as much state as necessary. On the other
hand, is_excluded will also read and parse .gitignore files in already
ignored directories, which are completely irrelevant.
is_path_excluded correctly handles ignored directories by checking if any
component in the path is excluded. As it uses is_excluded internally, this
unfortunately forces is_excluded to drop and re-read all .gitignore files,
as there is no common parent directory for the root dir.
is_path_excluded tracks state in a separate struct path_exclude_check,
which is essentially a wrapper of dir_struct with two more fields. However,
as is_path_excluded also modifies dir_struct, it is not possible to e.g.
use multiple path_exclude_check structures with the same dir_struct in
parallel. The additional structure just unnecessarily complicates the API.
Teach is_excluded / prep_exclude about ignored directories: whenever
entering a new directory, first check if the entire directory is excluded.
Remember the excluded state in dir_struct. Don't traverse into already
ignored directories (i.e. don't read irrelevant .gitignore files).
Directories could also be excluded by exclude patterns specified on the
command line or .git/info/exclude, so we cannot simply skip prep_exclude
entirely if there's no .gitignore file name (dir_struct.exclude_per_dir).
Move this check to just before actually reading the file.
is_path_excluded is now equivalent to is_excluded, so we can simply
redirect to it (the public API is cleaned up in the next patch).
The performance impact of the additional ignored check per directory is
hardly noticeable when reading directories recursively (e.g. 'git status').
However, performance of git commands using the is_path_excluded API (e.g.
'git ls-files --cached --ignored --exclude-standard') is greatly improved
as this no longer re-reads .gitignore files on each call.
Here's some performance data from the linux and WebKit repos (best of 10
runs on a Debian Linux on SSD, core.preloadIndex=true):
| ls-files -ci | status | status --ignored
| linux | WebKit | linux | WebKit | linux | WebKit
-------+-------+--------+-------+--------+-------+---------
before | 0.506 | 6.539 | 0.212 | 1.555 | 0.323 | 2.541
after | 0.080 | 1.191 | 0.218 | 1.583 | 0.321 | 2.579
gain | 6.325 | 5.490 | 0.972 | 0.982 | 1.006 | 0.985
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The notion of "ignored tracked" directories introduced in 721ac4ed "dir.c:
Make git-status --ignored more consistent" has a few unwanted side effects:
- git-clean -d -X: deletes ignored tracked directories. git-clean should
never delete tracked content.
- git-ls-files --ignored --other --directory: lists ignored tracked
directories instead of "other" directories.
- git-status --ignored: lists ignored tracked directories while contained
files may be listed as modified. Paths listed by git-status should be
disjoint (except in long format where a path may be listed in both the
staged and unstaged section).
Additionally, the current behaviour violates documentation in gitignore(5)
("Specifies intentionally *untracked* files to ignore") and Documentation/
technical/api-directory-listing.txt ("DIR_SHOW_OTHER_DIRECTORIES: Include
a directory that is *not tracked*.").
In dir.c::treat_directory, remove the special handling of ignored tracked
directories, so that the DIR_SHOW_OTHER_DIRECTORIES flag only affects
"other" (i.e. untracked) directories. In dir.c::dir_add_name, check that
added paths are untracked even if DIR_SHOW_IGNORED is set.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored path/' doesn't list ignored files and directories
within 'path' if some component of 'path' is classified as untracked.
Disable the DIR_SHOW_OTHER_DIRECTORIES flag while traversing leading
directories. This prevents treat_leading_path() with DIR_SHOW_IGNORED flag
from aborting at the top level untracked directory.
As a side effect, this also eliminates a recursive directory scan per
leading directory level, as treat_directory() can no longer call
read_directory_recursive() when called from treat_leading_path().
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' lists empty untracked directories as ignored, even
though they don't have any ignored files.
When checking if a directory is already listed as untracked (i.e. shouldn't
be listed as ignored as well), don't assume that the directory has only
ignored files if it doesn't have untracked files, as the directory may be
empty.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-ls-files --ignored --directories' hides empty directories even though
--no-empty-directory was not specified.
Treat the DIR_HIDE_EMPTY_DIRECTORIES flag independently from
DIR_SHOW_IGNORED to make all git-ls-files options work as expected.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' lists ignored tracked directories without any
ignored files if a tracked file happens to match an exclude pattern.
Always exclude tracked files.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' lists both the ignored directory and the ignored
files if the files are in a tracked sub directory.
When recursing into sub directories in read_directory_recursive, pass on
the check_only parameter so that we don't accidentally add the files.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' drops ignored directories if they contain untracked
files in an untracked sub directory.
Fix it by getting exact (recursive) excluded status in treat_directory.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the shell script more portable:
- Split export X=Y into 2 lines
- Use printf instead of echo -e
Use UTF-8 code points which are not decomposed by the filesystem:
Code points like "á" will be decomposed by Mac OS X.
bzr is unable to find the file "á" on disk.
Use code points from unicode which can not be decomposed.
In other words, the precompsed form use the same bytes as decomposed.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Acked-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Correct common spelling mistakes in comments and tests
kwset: fix spelling in comments
precompose-utf8: fix spelling of "want" in error message
compat/nedmalloc: fix spelling in comments
compat/regex: fix spelling and grammar in comments
obstack: fix spelling of similar
contrib/subtree: fix spelling of accidentally
git-remote-mediawiki: spelling fixes
doc: various spelling fixes
fast-export: fix argument name in error messages
Documentation: distinguish between ref and offset deltas in pack-format
i18n: make the translation of -u advice in one go
* rr/send-email-perl-critique:
send-email: use the three-arg form of open in recipients_cmd
send-email: drop misleading function prototype
send-email: use "return;" not "return undef;" on error codepaths
The --signed-tags argument is plural, while error messages referred
to --signed-tag (singular). Tweak error messages to correspond to the
argument.
Signed-off-by: Paul Price <price@astro.princeton.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
eb32d236 introduced the OBJ_OFS_DELTA object that uses a relative offset to
identify the base object instead of the 20-byte SHA1 reference. The pack file
documentation only mentions the SHA1 based reference in its description of the
deltified object entry.
Update the pack format documentation to clarify that the deltified object
representation refers to its base using either a relative negative offset or
the absolute SHA1 identifier.
Signed-off-by: Stefan Saasen <ssaasen@atlassian.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advice (consider use of -u when read_directory takes too long) is
separated into 3 different status_printf_ln() calls, and which brings
trouble for translators.
Since status_vprintf() called by status_printf_ln() can handle eol in
buffer, we could simply join these lines into one paragraph.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Typo fix: replacing it's -> its
t: make PIPE a standard test prerequisite
archive: clarify explanation of --worktree-attributes
t/README: --immediate skips cleanup commands for failed tests
Attempts to minimize "diff -c/--cc" output by coalescing the same
lines removed from the parents better, but with an O(n^2)
complexity.
* ap/combine-diff-coalesce-lost:
combine-diff: coalesce lost lines optimally
"git log -S/-G" started paying attention to textconv filter, but
there was no way to disable this. Make it honor --no-textconv
option.
* sr/log-SG-no-textconv:
diffcore-pickaxe: unify code for log -S/-G
diffcore-pickaxe: fix leaks in "log -S<block>" and "log -G<pattern>"
diffcore-pickaxe: port optimization from has_changes() to diff_grep()
diffcore-pickaxe: respect --no-textconv
diffcore-pickaxe: remove fill_one()
diffcore-pickaxe: remove unnecessary call to get_textconv()
A few bugfixes to "git rerere" working on corner case merge
conflicts.
* js/rerere-forget-protect-against-NUL:
rerere forget: do not segfault if not all stages are present
rerere forget: grok files containing NUL
"git help" learned "-g" option to show the list of guides just like
list of commands are given with "-a".
* po/help-guides:
doc: include --guide option description for "git help"
help: mention -a and -g option, and 'git help <concept>' usage.
builtin/help.c: add list_common_guides_help() function
builtin/help.c: add --guide option
builtin/help.c: split "-a" processing into two
The 'PIPE' test prerequisite was already defined identically by t9010
and t9300, therefore it makes sense to make it a predefined
prerequisite.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it a bit clearer that --worktree-attributes is about files in the
working tree (checked out files, possibly changed) and not the current
working directory ($PWD). Link to the ATTRIBUTES section, which has
more details.
Reported-by: Amit Bakshi <ambakshi@gmail.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>