speed up alt_odb_usable() with many alternates

With many alternates, the duplicate check in alt_odb_usable()
wastes many cycles doing repeated fspathcmp() on every existing
alternate.  Use a khash to speed up lookups by odb->path.
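
As a rough illustration of why a hash lookup helps here, the
sketch below swaps a linear duplicate scan for an open-addressing
table keyed by path.  It is not git's khash: path_set,
path_hash() and path_set_add() are hypothetical names, the table
has a fixed size, and it compares with plain strcmp() rather than
fspathcmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PATH_SET_SLOTS 1024 /* power of two; fixed size keeps the sketch short */

struct path_set {
	const char *slots[PATH_SET_SLOTS]; /* borrowed keys; NULL means empty */
	size_t used;
};

/* FNV-1a over the bytes of the path. */
static unsigned int path_hash(const char *path)
{
	unsigned int h = 2166136261u;

	for (; *path; path++) {
		h ^= (unsigned char)*path;
		h *= 16777619u;
	}
	return h;
}

/*
 * Insert `path` unless an equal path is already present.
 * Returns 1 if it was inserted, 0 if it was a duplicate.
 */
static int path_set_add(struct path_set *set, const char *path)
{
	size_t i = path_hash(path) & (PATH_SET_SLOTS - 1);

	assert(set->used < PATH_SET_SLOTS - 1); /* no resizing in this sketch */
	while (set->slots[i]) {
		if (!strcmp(set->slots[i], path))
			return 0;
		i = (i + 1) & (PATH_SET_SLOTS - 1); /* linear probing */
	}
	set->slots[i] = path; /* the key is stored, not copied */
	set->used++;
	return 1;
}
```

Each insertion probes a handful of slots on average instead of
comparing against every path already seen, which is where the
quadratic cost of the linear scan goes away.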

Since the kh_put_* API uses the supplied key without
duplicating it, we also take advantage of it to replace both
xstrdup() and strbuf_release() in link_alt_odb_entry() with
strbuf_detach() to avoid the allocation and copy.
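
The ownership hand-off can be sketched with a tiny stand-in for
strbuf.  The struct buf type and the buf_set() and buf_detach()
helpers below are hypothetical and far simpler than git's strbuf
API; they only show that detaching returns the existing
allocation, so a container that keeps the key pointer needs no
extra copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Minimal growable string; a stand-in for strbuf, not the real thing. */
struct buf {
	char *data;
	size_t len;
};

/* Replace the contents of `b` with a copy of `s`. */
static void buf_set(struct buf *b, const char *s)
{
	size_t len = strlen(s);
	char *p = realloc(b->data, len + 1);

	assert(p); /* allocation failure is not handled in this sketch */
	memcpy(p, s, len + 1);
	b->data = p;
	b->len = len;
}

/*
 * Hand the heap buffer to the caller and leave `b` empty.
 * The caller now owns the returned pointer and must free() it.
 */
static char *buf_detach(struct buf *b)
{
	char *p = b->data;

	b->data = NULL;
	b->len = 0;
	return p;
}
```

Compared with duplicating the string and then releasing the
buffer, detaching saves one allocation, one copy and one free per
path.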

In a test repository with 50K alternates, each of which has one
alternate of its own (100K alternates in total), this speeds up
lookup of a non-existent blob from over 16 minutes to roughly
2.7 seconds on my busy workstation.

Note: all underlying git object directories were small and held
only loose objects, with no packs.  Having to load packs
increases times significantly.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eric Wong
2021-07-07 23:10:15 +00:00
committed by Junio C Hamano
parent 670b81a890
commit cf2dc1c238
5 changed files with 43 additions and 13 deletions

@@ -7,6 +7,8 @@
#include "oid-array.h"
#include "strbuf.h"
#include "thread-utils.h"
#include "khash.h"
#include "dir.h"
struct object_directory {
struct object_directory *next;
@@ -30,6 +32,9 @@ struct object_directory {
char *path;
};
KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
struct object_directory *, 1, fspathhash, fspatheq);
void prepare_alt_odb(struct repository *r);
char *compute_alternate_path(const char *path, struct strbuf *err);
typedef int alt_odb_fn(struct object_directory *, void *);
@@ -116,6 +121,8 @@ struct raw_object_store {
*/
struct object_directory *odb;
struct object_directory **odb_tail;
kh_odb_path_map_t *odb_by_path;
int loaded_alternates;
/*