copy vs rename detection: avoid unnecessary O(n*m) loops

The core rename detection had some rather stupid code to check if a
pathname was used by a later modification or rename, which basically
walked the whole pathname space for all renames for each rename, in
order to tell whether it was a pure rename (no remaining users) or
should be considered a copy (other users of the source file remaining).

That's really silly, since we can just keep a count of users around, and
replace all those complex and expensive loops with just testing that
simple counter (but this all depends on the previous commit that shared
the diff_filespec data structure by using a separate reference count).

Note that the reference count is not the same as the rename count: they
behave otherwise rather similarly, but the reference count is tied to
the allocation (and decremented at de-allocation, so that when it turns
zero we can get rid of the memory), while the rename count is tied to
the renames and is decremented when we find a rename (so that when it
turns zero we know that it was a rename, not a copy).
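
As a rough illustration of the counter-based check described above, here is a
minimal sketch. The `spec` and `pair` types and both helper functions are
hypothetical, simplified stand-ins, not git's actual `diff_filespec` and
`diff_filepair` definitions; only the three-way test mirrors the new code in
diff_resolve_rename_copy().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for git's real structures. */
struct spec {
	const char *path;
	int count;       /* reference count: tied to allocation lifetime */
	int rename_used; /* rename count: pending uses as a rename source */
};

struct pair {
	struct spec *one; /* source side */
	struct spec *two; /* destination side */
	char status;      /* 'M', 'C' or 'R' once resolved */
};

/* Assumed helper: record that p->one serves as a rename/copy source. */
static void note_rename_source(struct pair *p)
{
	p->one->rename_used++;
}

/*
 * Classify one rename pair in constant time instead of scanning
 * every later pair for another user of the same source path.
 */
static char resolve_pair(struct pair *p)
{
	assert(p->one && p->two);
	if (!strcmp(p->one->path, p->two->path))
		p->status = 'M'; /* same path again: just a modification */
	else if (--p->one->rename_used > 0)
		p->status = 'C'; /* other users of the source remain: copy */
	else
		p->status = 'R'; /* last user of the source: pure rename */
	return p->status;
}
```

With one source used for two destinations, the first pair resolved comes out
as a copy and the last one as a rename, while the reference count is never
touched by the classification.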

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Author: Linus Torvalds
Date: 2007-10-25 11:20:56 -07:00
Committer: Junio C Hamano
Parent: 9fb88419ba
Commit: 644797119d
3 changed files with 35 additions and 75 deletions

diff.c (40 lines changed)

@@ -2597,9 +2597,9 @@ void diff_debug_filepair(const struct diff_filepair *p, int i)
 {
 	diff_debug_filespec(p->one, i, "one");
 	diff_debug_filespec(p->two, i, "two");
-	fprintf(stderr, "score %d, status %c stays %d broken %d\n",
+	fprintf(stderr, "score %d, status %c rename_used %d broken %d\n",
 		p->score, p->status ? p->status : '?',
-		p->source_stays, p->broken_pair);
+		p->one->rename_used, p->broken_pair);
 }
 
 void diff_debug_queue(const char *msg, struct diff_queue_struct *q)
@@ -2617,8 +2617,8 @@ void diff_debug_queue(const char *msg, struct diff_queue_struct *q)
 
 static void diff_resolve_rename_copy(void)
 {
-	int i, j;
-	struct diff_filepair *p, *pp;
+	int i;
+	struct diff_filepair *p;
 	struct diff_queue_struct *q = &diff_queued_diff;
 
 	diff_debug_queue("resolve-rename-copy", q);
@@ -2640,27 +2640,21 @@ static void diff_resolve_rename_copy(void)
 		 * either in-place edit or rename/copy edit.
 		 */
 		else if (DIFF_PAIR_RENAME(p)) {
-			if (p->source_stays) {
-				p->status = DIFF_STATUS_COPIED;
-				continue;
-			}
-			/* See if there is some other filepair that
-			 * copies from the same source as us. If so
-			 * we are a copy. Otherwise we are either a
-			 * copy if the path stays, or a rename if it
-			 * does not, but we already handled "stays" case.
+			/*
+			 * A rename might have re-connected a broken
+			 * pair up, causing the pathnames to be the
+			 * same again. If so, that's not a rename at
+			 * all, just a modification..
+			 *
+			 * Otherwise, see if this source was used for
+			 * multiple renames, in which case we decrement
+			 * the count, and call it a copy.
 			 */
-			for (j = i + 1; j < q->nr; j++) {
-				pp = q->queue[j];
-				if (strcmp(pp->one->path, p->one->path))
-					continue; /* not us */
-				if (!DIFF_PAIR_RENAME(pp))
-					continue; /* not a rename/copy */
-				/* pp is a rename/copy from the same source */
+			if (!strcmp(p->one->path, p->two->path))
+				p->status = DIFF_STATUS_MODIFIED;
+			else if (--p->one->rename_used > 0)
 				p->status = DIFF_STATUS_COPIED;
-				break;
-			}
-			if (!p->status)
+			else
 				p->status = DIFF_STATUS_RENAMED;
 		}
 		else if (hashcmp(p->one->sha1, p->two->sha1) ||