hashmap: add string interning API

Interning short strings with high probability of duplicates can reduce the
memory footprint and speed up comparisons.

Add strintern() and memintern() APIs that use a hashmap to manage the pool
of unique, interned strings.

Note: strintern(getenv()) could be used to sanitize git's use of getenv(),
in case we ever encounter a platform where a call to getenv() invalidates
previous getenv() results (which is allowed by POSIX).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Karsten Blees
2014-07-03 00:22:54 +02:00
committed by Junio C Hamano
parent ab73a9d119
commit 7b64d42d22
5 changed files with 88 additions and 0 deletions

View File

@ -234,6 +234,20 @@ int main(int argc, char *argv[])
/* print table sizes */
printf("%u %u\n", map.tablesize, map.size);
} else if (!strcmp("intern", cmd) && l1) {
/* test that strintern works */
const char *i1 = strintern(p1);
const char *i2 = strintern(p1);
if (strcmp(i1, p1))
printf("strintern(%s) returns %s\n", p1, i1);
else if (i1 == p1)
printf("strintern(%s) returns input pointer\n", p1);
else if (i1 != i2)
printf("strintern(%s) != strintern(%s)", i1, i2);
else
printf("%s\n", i1);
} else if (!strcmp("perfhashmap", cmd) && l1 && l2) {
perf_hashmap(atoi(p1), atoi(p2));