Commit Graph

20912 Commits

Author SHA1 Message Date
2ca91d1ee0 Merge branch 'mh/credential-oauth-refresh-token'
The credential subsystem learns to help OAuth framework.

* mh/credential-oauth-refresh-token:
  credential: new attribute oauth_refresh_token
2023-05-10 10:23:29 -07:00
7f3cc51b28 Merge branch 'ar/test-cleanup-unused-file-creation-part2'
Test cleanup.

* ar/test-cleanup-unused-file-creation-part2:
  t2019: don't create unused files
  t1502: don't create unused files
  t1450: don't create unused files
  t1300: don't create unused files
  t1300: fix config file syntax error descriptions
  t0300: don't create unused file
2023-05-10 10:23:28 -07:00
b6e9521956 Merge branch 'ms/send-email-feed-header-to-validate-hook'
"git send-email" learned to give the e-mail headers to the validate
hook by passing an extra argument from the command line.

* ms/send-email-feed-header-to-validate-hook:
  send-email: expose header information to git-send-email's sendemail-validate hook
  send-email: refactor header generation functions
2023-05-10 10:23:28 -07:00
fbbf60a9bc Merge branch 'tb/credential-long-lines'
The implementation of credential helpers used fgets() over fixed
size buffers to read protocol messages, causing the remainder of
the folded long line to trigger unexpected behaviour, which has
been corrected.

* tb/credential-long-lines:
  contrib/credential: embiggen fixed-size buffer in wincred
  contrib/credential: avoid fixed-size buffer in libsecret
  contrib/credential: .gitignore libsecret build artifacts
  contrib/credential: remove 'gnome-keyring' credential helper
  contrib/credential: avoid fixed-size buffer in osxkeychain
  t/lib-credential.sh: ensure credential helpers handle long headers
  credential.c: store "wwwauth[]" values in `credential_read()`
2023-05-10 10:23:27 -07:00
6710b68db1 Merge branch 'rs/test-ctype-eof'
ctype tests have been taught to test EOF, too.

* rs/test-ctype-eof:
  test-ctype: check EOF
2023-05-10 10:23:27 -07:00
0004d97099 Merge branch 'ob/t3501-retitle'
Retitle a test script with an overly narrow name.

* ob/t3501-retitle:
  t/t3501-revert-cherry-pick.sh: clarify scope of the file
2023-05-09 16:45:46 -07:00
ccd12a3d6c Merge branch 'en/header-split-cache-h-part-2'
More header clean-up.

* en/header-split-cache-h-part-2: (22 commits)
  reftable: ensure git-compat-util.h is the first (indirect) include
  diff.h: reduce unnecessary includes
  object-store.h: reduce unnecessary includes
  commit.h: reduce unnecessary includes
  fsmonitor: reduce includes of cache.h
  cache.h: remove unnecessary headers
  treewide: remove cache.h inclusion due to previous changes
  cache,tree: move basic name compare functions from read-cache to tree
  cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
  hash-ll.h: split out of hash.h to remove dependency on repository.h
  tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
  dir.h: move DTYPE defines from cache.h
  versioncmp.h: move declarations for versioncmp.c functions from cache.h
  ws.h: move declarations for ws.c functions from cache.h
  match-trees.h: move declarations for match-trees.c functions from cache.h
  pkt-line.h: move declarations for pkt-line.c functions from cache.h
  base85.h: move declarations for base85.c functions from cache.h
  copy.h: move declarations for copy.c functions from cache.h
  server-info.h: move declarations for server-info.c functions from cache.h
  packfile.h: move pack_window and pack_entry from cache.h
  ...
2023-05-09 16:45:46 -07:00
620e92b845 Merge branch 'jk/parse-commit-with-malformed-ident'
The commit object parser has been taught to be a bit more lenient
to parse timestamps on the author/committer line with a malformed
author/committer ident.

* jk/parse-commit-with-malformed-ident:
  parse_commit(): describe more date-parsing failure modes
  parse_commit(): handle broken whitespace-only timestamp
  parse_commit(): parse timestamp from end of line
  t4212: avoid putting git on left-hand side of pipe
2023-05-09 16:45:45 -07:00
a5855fd8d4 t2019: don't create unused files
Tests in t2019-checkout-ambiguous-ref.sh redirect two invocations of
"git checkout" to files "stdout" and "stderr".  Several assertions are
made using file "stderr".  File "stdout", however, is unused.

Don't redirect standard output of "git checkout" to file "stdout" in
t2019-checkout-ambiguous-ref.sh to avoid creating unnecessary files.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:53:10 -07:00
dca675c6ef t1502: don't create unused files
Three tests in file t1502-rev-parse-parseopt.sh use three redirections
with invocation of "git rev-parse --parseopt --".  All three tests
redirect standard output to file "out" and file "spec" to standard
input.  Two of the tests redirect standard output a second time to file
"actual", and the third test redirects standard error to file "err".
These tests check contents of files "actual" and "err", but don't use
the files named "out" for assertions.  The two tests that redirect to
standard output twice might also be confusing to the reader.

Don't redirect standard output of "git rev-parse" to file "out" in
t1502-rev-parse-parseopt.sh to avoid creating unnecessary files.

Acked-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:53:06 -07:00
59162ece57 t1450: don't create unused files
Test 'fsck error and recovery on invalid object type' in file
t1450-fsck.sh redirects output of a failing "git fsck" invocation to
files "out" and "err" to assert presence of error messages in the output
of the command.  Commit 31deb28f5e (fsck: don't hard die on invalid
object types, 2021-10-01) changed the way assertions in this test are
performed.  The test doesn't compare the whole standard error with
prepared file "err.expect" and it doesn't assert that standard output is
empty.

Don't create unused files "err.expect" and "out" in test 'fsck error and
recovery on invalid object type'.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:53:03 -07:00
a7cae2905b t1300: don't create unused files
Three tests in t1300-config.sh check that "git config --get" barfs when
syntax errors are present in the config file.  The tests redirect
standard output and standard error of "git config --get" to files,
"actual" and "error" correspondingly.  They assert presence of an error
message in file "error".  However, these tests don't use file "actual"
for assertions.

Don't redirect standard output of "git config --get" to file "actual" in
t1300-config.sh to avoid creating unnecessary files.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:52:48 -07:00
6fc68e7ca3 t1300: fix config file syntax error descriptions
Three tests in t1300-config.sh check that "git config --get" barfs when
the config file contains various syntax errors: key=value pair without
equals sign, broken section line, and broken value string.  The sample
config files include a comment describing the kind of broken syntax.
This description seems to have been copy-pasted from the "broken section
line" sample to the other two samples.

Fix descriptions of broken config file syntax in samples used in
t1300-config.sh.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:52:45 -07:00
ed5288cff2 t0300: don't create unused file
Test 'credential config with partial URLs' in t0300-credentials.sh
contains three "git credential fill" invocations.  For two of the
invocations, the test asserts presence or absence of string "yep" in the
standard output.  For the third test it checks for an error message in
standard error.

Don't redirect standard output of "git credential" to file "stdout" in
t0300-credentials.sh to avoid creating an unnecessary file when only
standard error is checked.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-03 08:52:17 -07:00
d699e27bd4 Merge branch 'tb/ban-strtok'
Mark strtok() and strtok_r() to be banned.

* tb/ban-strtok:
  banned.h: mark `strtok()` and `strtok_r()` as banned
  t/helper/test-json-writer.c: avoid using `strtok()`
  t/helper/test-oidmap.c: avoid using `strtok()`
  t/helper/test-hashmap.c: avoid using `strtok()`
  string-list: introduce `string_list_setlen()`
  string-list: multi-delimiter `string_list_split_in_place()`
2023-05-02 10:13:35 -07:00
cf85f4b3bd Merge branch 'jk/blame-fake-commit-label'
The output given by "git blame" that attributes a line to contents
taken from the file specified by the "--contents" option shows it
differently from a line attributed to the working tree file.

* jk/blame-fake-commit-label:
  blame: use different author name for fake commit generated by --contents
2023-05-02 10:13:35 -07:00
31885f64e9 test-ctype: check EOF
The character classifiers are supposed to allow passing EOF to them, a
negative value.  It isn't part of any character class.  Extend the tests
to cover that.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 09:25:54 -07:00
71201ab0e5 t/lib-credential.sh: ensure credential helpers handle long headers
Add a test ensuring that the "wwwauth[]" field cannot be used to
inject malicious data into the credential helper stream.

Many of the credential helpers in contrib/credential read the
newline-delimited protocol stream one line at a time by repeatedly
calling fgets() into a fixed-size buffer.

This assumes that each line is no more than 1024 characters long, since
each iteration of the loop assumes that it is parsing starting at the
beginning of a new line in the stream. However, similar to a5bb10fd5e
(config: avoid fixed-sized buffer when renaming/deleting a section,
2023-04-06), if a line is longer than 1024 characters, a malicious actor
can embed another command within an existing line, bypassing the usual
checks introduced in 9a6bbee800 (credential: avoid writing values with
newlines, 2020-03-11).

As with the problem fixed in that commit, specially crafted input can
cause the helper to return the credential for the wrong host, letting an
attacker trick the victim into sending credentials for one host to
another.

Luckily, all parts of the credential helper protocol that are available
in a tagged release of Git are immune to this attack:

  - "protocol" is restricted to known values, and is thus immune.

  - "host" is immune because curl will reject hostnames that have a '='
    character in them, which would be required to carry out this attack.

  - "username" is immune, because the buffer characters to fill out the
    first `fgets()` call would pollute the `username` field, causing the
    credential helper to return nothing (because it would match a
    username if present, and the username of the credential to be stolen
    is likely not 1024 characters).

  - "password" is immune because providing a password instructs
    credential helpers to avoid filling credentials in the first place.

  - "path" is similar to username; if present, it is not likely to match
    any credential the victim is storing. It's also not enabled by
    default; the victim would have to set credential.useHTTPPath
    explicitly.

However, the new "wwwauth[]" field introduced via 5f2117b24f
(credential: add WWW-Authenticate header to cred requests, 2023-02-27)
can be used to inject data into the credential helper stream. For
example, running:

    {
      printf 'HTTP/1.1 401\r\n'
      printf 'WWW-Authenticate: basic realm='
      perl -e 'print "a" x 1024'
      printf 'host=victim.com\r\n'
    } | nc -Nlp 8080

in one terminal, and then:

    git clone http://localhost:8080

in another would result in a line like:

    wwwauth[]=basic realm=aaa[...]aaahost=victim.com

being sent to the credential helper. If we tweak that "1024" to align
our output with the helper's buffer size and the rest of the data on the
line, it can cause the helper to see "host=victim.com" on its own line,
allowing motivated attackers to exfiltrate credentials belonging to
"victim.com".

The below test demonstrates these failures and provides us with a test
to ensure that our fix is correct. That said, it has a couple of
shortcomings:

  - it's in t0303, since that's the only mechanism we have for testing
    random helpers. But that means nobody is going to run it under
    normal circumstances.

  - to get the attack right, it has to line up the stuffed name with the
    buffer size, so we depend on the exact buffer size. I parameterized
    it so it could be used to test other helpers, but in practice it's
    not likely for anybody to do that.

Still, it's the best we can do, and will help us confirm the presence of
the problem (and our fixes) in the new few patches.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 09:27:01 -07:00
8bb19c14fb t/t3501-revert-cherry-pick.sh: clarify scope of the file
The file started out as a test for picks and reverts with renames, but
has been subsequently populated with all kinds of basic tests, in
accordance with its generic name. Adjust the description to reflect
that.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 08:24:58 -07:00
aabc69885e Merge branch 'jk/gpg-trust-level-fix'
The "%GT" placeholder for the "--format" option of "git log" and
friends caused BUG() to trigger on a commit signed with an unknown
key, which has been corrected.

* jk/gpg-trust-level-fix:
  gpg-interface: set trust level of missing key to "undefined"
2023-04-28 16:03:03 -07:00
fc23c397c7 Merge branch 'tb/enable-cruft-packs-by-default'
When "gc" needs to retain unreachable objects, packing them into
cruft packs (instead of exploding them into loose object files) has
been offered as a more efficient option for some time.  Now the use
of cruft packs has been made the default and no longer considered
an experimental feature.

* tb/enable-cruft-packs-by-default:
  repository.h: drop unused `gc_cruft_packs`
  builtin/gc.c: make `gc.cruftPacks` enabled by default
  t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  t/t6500-gc.sh: add additional test cases
  t/t6500-gc.sh: refactor cruft pack tests
  t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  t/t5304-prune.sh: prepare for `gc --cruft` by default
  builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  builtin/repack.c: fix incorrect reference to '-C'
  pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-28 16:03:03 -07:00
a02675ad90 Merge branch 'ds/fsck-pack-revindex'
"git fsck" learned to validate the on-disk pack reverse index files.

* ds/fsck-pack-revindex:
  fsck: validate .rev file header
  fsck: check rev-index position values
  fsck: check rev-index checksums
  fsck: create scaffolding for rev-index checks
2023-04-27 16:00:59 -07:00
849c8b3dbf Merge branch 'tb/pack-revindex-on-disk'
The on-disk reverse index that allows mapping from the pack offset
to the object name for the object stored at the offset has been
enabled by default.

* tb/pack-revindex-on-disk:
  t: invert `GIT_TEST_WRITE_REV_INDEX`
  config: enable `pack.writeReverseIndex` by default
  pack-revindex: introduce `pack.readReverseIndex`
  pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK
  pack-revindex: make `load_pack_revindex` take a repository
  t5325: mark as leak-free
  pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-27 16:00:59 -07:00
089d9adff6 parse_commit(): handle broken whitespace-only timestamp
The comment in parse_commit_date() claims that parse_timestamp() will
not walk past the end of the buffer we've been given, since it will hit
the newline at "eol" and stop. This is usually true, when dateptr
contains actual numbers to parse. But with a line like:

   committer name <email>   \n

with just whitespace, and no numbers, parse_timestamp() will consume
that newline as part of the leading whitespace, and we may walk past our
"tail" pointer (which itself is set from the "size" parameter passed in
to parse_commit_buffer()).

In practice this can't cause us to walk off the end of an array, because
we always add an extra NUL byte to the end of objects we load from disk
(as a defense against exactly this kind of bug). However, you can see
the behavior in action when "committer" is the final header (which it
usually is, unless there's an encoding) and the subject line can be
parsed as an integer. We walk right past the newline on the committer
line, as well as the "\n\n" separator, and mistake the subject for the
timestamp.

We can solve this by trimming the whitespace ourselves, making sure that
it has some non-whitespace to parse. Note that we need to be a bit
careful about the definition of "whitespace" here, as our isspace()
doesn't match exotic characters like vertical tab or formfeed. We can
work around that by checking for an actual number (see the in-code
comment). This is slightly more restrictive than the current code, but
in practice the results are either the same (we reject "foo" as "0", but
so would parse_timestamp()) or extremely unlikely even for broken
commits (parse_timestamp() would allow "\v123" as "123", but we'll now
make it "0").

I did also allow "-" here, which may be controversial, as we don't
currently support negative timestamps. My reasoning was two-fold. One,
the design of parse_timestamp() is such that we should be able to easily
switch it to handling signed values, and this otherwise creates a
hard-to-find gotcha that anybody doing that work would get tripped up
on. And two, the status quo is that we currently parse them, though the
result of course ends up as a very large unsigned value (which is likely
to just get clamped to "0" for display anyway, since our date routines
can't handle it).

The new test checks the commit parser (via "--until") for both vanilla
spaces and the vertical-tab case. I also added a test to check these
against the pretty-print formatter, which uses split_ident_line().  It's
not subject to the same bug, because it already insists that there be
one or more digits in the timestamp.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 08:53:53 -07:00
ea1615dfdd parse_commit(): parse timestamp from end of line
To find the committer timestamp, we parse left-to-right looking for the
closing ">" of the email, and then expect the timestamp right after
that. But we've seen some broken cases in the wild where this fails, but
we _could_ find the timestamp with a little extra work. E.g.:

  Name <Name<email>> 123456789 -0500

This means that features that rely on the committer timestamp, like
--since or --until, will treat the commit as happening at time 0 (i.e.,
1970).

This is doubly confusing because the pretty-print parser learned to
handle these in 03818a4a94 (split_ident: parse timestamp from end of
line, 2013-10-14). So printing them via "git show", etc, makes
everything look normal, but --until, etc are still broken (despite the
fact that that commit explicitly mentioned --until!).

So let's use the same trick as 03818a4a94: find the end of the line, and
parse back to the final ">". In theory we could use split_ident_line()
here, but it's actually a bit more strict. In particular, it requires a
valid time-zone token, too. That should be present, of course, but we
wouldn't want to break --until for cases that are working currently.

We might want to teach split_ident_line() to become more lenient there,
but it would require checking its many callers (since right now they can
assume that if date_start is non-NULL, so is tz_start).

So for now we'll just reimplement the same trick in the commit parser.

The test is in t4212, which already covers similar cases, courtesy of
03818a4a94. We'll just adjust the broken commit to munge both the author
and committer timestamps. Note that we could match (author|committer)
here, but alternation can't be used portably in sed. Since we wouldn't
expect to see ">" except as part of an ident line, we can just match
that character on any line.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 08:53:35 -07:00
2063b86b81 t4212: avoid putting git on left-hand side of pipe
We wouldn't expect cat-file to fail here, but it's good practice to
avoid putting git on the upstream of a pipe, as we otherwise ignore its
exit code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 08:53:32 -07:00
36628c56ed Merge branch 'ps/fix-geom-repack-with-alternates'
Geometric repacking ("git repack --geometric=<n>") in a repository
that borrows from an alternate object database had various corner
case bugs, which have been corrected.

* ps/fix-geom-repack-with-alternates:
  repack: disable writing bitmaps when doing a local repack
  repack: honor `-l` when calculating pack geometry
  t/helper: allow chmtime to print verbosely without modifying mtime
  pack-objects: extend test coverage of `--stdin-packs` with alternates
  pack-objects: fix error when same packfile is included and excluded
  pack-objects: fix error when packing same pack twice
  pack-objects: split out `--stdin-packs` tests into separate file
  repack: fix generating multi-pack-index with only non-local packs
  repack: fix trying to use preferred pack in alternates
  midx: fix segfault with no packs and invalid preferred pack
2023-04-25 13:56:20 -07:00
c4c9d5586f Merge branch 'rj/send-email-validate-hook-count-messages'
The sendemail-validate validate hook learned to pass the total
number of input files and where in the sequence each invocation is
via environment variables.

* rj/send-email-validate-hook-count-messages:
  send-email: export patch counters in validate environment
2023-04-25 13:56:20 -07:00
80d268f309 Merge branch 'jk/protocol-cap-parse-fix'
The code to parse capability list for v0 on-wire protocol fell into
an infinite loop when a capability appears multiple times, which
has been corrected.

* jk/protocol-cap-parse-fix:
  v0 protocol: use size_t for capability length/offset
  t5512: test "ls-remote --heads --symref" filtering with v0 and v2
  t5512: allow any protocol version for filtered symref test
  t5512: add v2 support for "ls-remote --symref" test
  v0 protocol: fix sha1/sha256 confusion for capabilities^{}
  t5512: stop referring to "v1" protocol
  v0 protocol: fix infinite loop when parsing multi-valued capabilities
2023-04-25 13:56:20 -07:00
0807e57807 Merge branch 'en/header-split-cache-h'
Header clean-up.

* en/header-split-cache-h: (24 commits)
  protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
  mailmap, quote: move declarations of global vars to correct unit
  treewide: reduce includes of cache.h in other headers
  treewide: remove double forward declaration of read_in_full
  cache.h: remove unnecessary includes
  treewide: remove cache.h inclusion due to pager.h changes
  pager.h: move declarations for pager.c functions from cache.h
  treewide: remove cache.h inclusion due to editor.h changes
  editor: move editor-related functions and declarations into common file
  treewide: remove cache.h inclusion due to object.h changes
  object.h: move some inline functions and defines from cache.h
  treewide: remove cache.h inclusion due to object-file.h changes
  object-file.h: move declarations for object-file.c functions from cache.h
  treewide: remove cache.h inclusion due to git-zlib changes
  git-zlib: move declarations for git-zlib functions from cache.h
  treewide: remove cache.h inclusion due to object-name.h changes
  object-name.h: move declarations for object-name.c functions from cache.h
  treewide: remove unnecessary cache.h inclusion
  treewide: be explicit about dependence on mem-pool.h
  treewide: be explicit about dependence on oid-array.h
  ...
2023-04-25 13:56:20 -07:00
9ce9dea4e1 Sync with Git 2.40.1 2023-04-24 22:31:32 -07:00
a2742f8c59 t/helper/test-json-writer.c: avoid using strtok()
Apply similar treatment as in the previous commit to remove usage of
`strtok()` from the "oidmap" test helper.

Each of the different commands that the "json-writer" helper accepts
pops the next space-delimited token from the current line and interprets
it as a string, integer, or double (with the exception of the very first
token, which is the command itself).

To accommodate this, split the line in place by the space character, and
pass the corresponding string_list to each of the specialized `get_s()`,
`get_i()`, and `get_d()` functions.

`get_i()` and `get_d()` are thin wrappers around `get_s()` that convert
their result into the appropriate type by either calling `strtol()` or
`strtod()`, respectively. In `get_s()`, we mark the token as "consumed"
by incrementing the `consumed_nr` counter, indicating how many tokens we
have read up to that point.

Because each of these functions needs the string-list parts, the number
of tokens consumed, and the line number, these three are wrapped up in
to a struct representing the line state.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
deeabc1ff0 t/helper/test-oidmap.c: avoid using strtok()
Apply similar treatment as in the previous commit to remove usage of
`strtok()` from the "oidmap" test helper.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
826f0e33ab t/helper/test-hashmap.c: avoid using strtok()
Avoid using the non-reentrant `strtok()` to separate the parts of each
incoming command. Instead of replacing it with `strtok_r()`, let's
instead use the more friendly pair of `string_list_split_in_place()` and
`string_list_remove_empty_items()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
52acddf36c string-list: multi-delimiter string_list_split_in_place()
Enhance `string_list_split_in_place()` to accept multiple characters as
delimiters instead of a single character.

Instead of using `strchr(2)` to locate the first occurrence of the given
delimiter character, `string_list_split_in_place_multi()` uses
`strcspn(2)` to move past the initial segment of characters comprised of
any characters in the delimiting set.

When only a single delimiting character is provided, `strpbrk(2)` (which
is implemented with `strcspn(2)`) has equivalent performance to
`strchr(2)`. Modern `strcspn(2)` implementations treat an empty
delimiter or the singleton delimiter as a special case and fall back to
calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)`
this way.

This change is one step to removing `strtok(2)` from the tree. Note that
`string_list_split_in_place()` is not a strict replacement for
`strtok()`, since it will happily turn sequential delimiter characters
into empty entries in the resulting string_list. For example:

    string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1)

would yield a string list of:

    ["foo", "", "", "bar", "", "", "baz"]

Callers that wish to emulate the behavior of strtok(2) more directly
should call `string_list_remove_empty_items()` after splitting.

To avoid regressions for the new multi-character delimter cases, update
t0063 in this patch as well.

[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35
[2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
603d0fdce2 blame: use different author name for fake commit generated by --contents
When the --contents option is used with git blame, and the contents of
the file have lines which can't be annotated by the history being
blamed, the user will see an author of "Not Committed Yet". This is
similar to the way blame handles working tree contents when blaming
without a revision.

This is slightly confusing since this data isn't the working copy and
while it is technically "not committed yet", its also coming from an
external file. Replace this author name with "External file
(--contents)" to better differentiate such lines from actual working
copy lines.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 15:16:31 -07:00
e3a3f5edf5 reftable: ensure git-compat-util.h is the first (indirect) include
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
5e3f94dfe3 treewide: remove cache.h inclusion due to previous changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
d1cbe1e6d8 hash-ll.h: split out of hash.h to remove dependency on repository.h
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository->hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
d4ff2072ab match-trees.h: move declarations for match-trees.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
0ff73d742b packfile.h: move pack_window and pack_entry from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
69a63fe663 treewide: be explicit about dependence on strbuf.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
b64894c206 Merge branch 'ow/ref-filter-omit-empty'
"git branch --format=..." and "git format-patch --format=..."
learns "--omit-empty" to hide refs that whose formatting result
becomes an empty string from the output.

* ow/ref-filter-omit-empty:
  branch, for-each-ref, tag: add option to omit empty lines
2023-04-21 15:35:05 -07:00
9e0d1aa495 Merge branch 'ah/format-patch-thread-doc'
Doc update.

* ah/format-patch-thread-doc:
  format-patch: correct documentation of --thread without an argument
2023-04-21 15:35:05 -07:00
7ac228c994 Merge branch 'rn/sparse-describe'
"git describe --dirty" learns to work better with sparse-index.

* rn/sparse-describe:
  describe: enable sparse index for describe
2023-04-21 15:35:04 -07:00
de73a20756 Merge branch 'rs/archive-from-subdirectory-fixes'
"git archive" run from a subdirectory mishandled attributes and
paths outside the current directory.

* rs/archive-from-subdirectory-fixes:
  archive: improve support for running in subdirectory
2023-04-21 15:35:04 -07:00
a5c76569e7 credential: new attribute oauth_refresh_token
Git authentication with OAuth access token is supported by every popular
Git host including GitHub, GitLab and BitBucket [1][2][3]. Credential
helpers Git Credential Manager (GCM) and git-credential-oauth generate
OAuth credentials [4][5]. Following RFC 6749, the application prints a
link for the user to authorize access in browser. A loopback redirect
communicates the response including access token to the application.

For security, RFC 6749 recommends that OAuth response also includes
expiry date and refresh token [6]. After expiry, applications can use
the refresh token to generate a new access token without user
reauthorization in browser. GitLab and BitBucket set the expiry at two
hours [2][3]. (GitHub doesn't populate expiry or refresh token.)

However the Git credential protocol has no attribute to store the OAuth
refresh token (unrecognised attributes are silently discarded). This
means that the user has to regularly reauthorize the helper in browser.
On a browserless system, this is particularly intrusive, requiring a
second device.

Introduce a new attribute oauth_refresh_token. This is especially
useful when a storage helper and a read-only OAuth helper are configured
together. Recall that `credential fill` calls each helper until it has a
non-expired password.

```
[credential]
	helper = storage  # eg. cache or osxkeychain
	helper = oauth
```

The OAuth helper can use the stored refresh token forwarded by
`credential fill` to generate a fresh access token without opening the
browser. See
https://github.com/hickford/git-credential-oauth/pull/3/files
for an implementation tested with this patch.

Add support for the new attribute to credential-cache. Eventually, I
hope to see support in other popular storage helpers.

Alternatives considered: ask helpers to store all unrecognised
attributes. This seems excessively complex for no obvious gain.
Helpers would also need extra information to distinguish between
confidential and non-confidential attributes.

Workarounds: GCM abuses the helper get/store/erase contract to store the
refresh token during credential *get* as the password for a fictitious
host [7] (I wrote this hack). This workaround is only feasible for a
monolithic helper with its own storage.

[1] https://github.blog/2012-09-21-easier-builds-and-deployments-using-git-over-https-and-oauth/
[2] https://docs.gitlab.com/ee/api/oauth2.html#access-git-over-https-with-access-token
[3] https://support.atlassian.com/bitbucket-cloud/docs/use-oauth-on-bitbucket-cloud/#Cloning-a-repository-with-an-access-token
[4] https://github.com/GitCredentialManager/git-credential-manager
[5] https://github.com/hickford/git-credential-oauth
[6] https://datatracker.ietf.org/doc/html/rfc6749#section-5.1
[7] 66b94e489a/src/shared/GitLab/GitLabHostProvider.cs (L207)

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-21 09:38:30 -07:00
a4a4db8cf7 Merge branch 'gc/better-error-when-local-clone-fails-with-symlink'
"git clone --local" stops copying from an original repository that
has symbolic links inside its $GIT_DIR; an error message when that
happens has been updated.

* gc/better-error-when-local-clone-fails-with-symlink:
  clone: error specifically with --local and symlinked objects
2023-04-20 14:33:36 -07:00
98c496fcd0 Merge branch 'ar/t2024-checkout-output-fix'
Test fix.

* ar/t2024-checkout-output-fix:
  t2024: fix loose/strict local base branch DWIM test
2023-04-20 14:33:36 -07:00
fa9172c70a Merge branch 'rs/remove-approxidate-relative'
The approxidate() API has been simplified by losing an extra
function that did the same thing as another one.

* rs/remove-approxidate-relative:
  date: remove approxidate_relative()
2023-04-20 14:33:35 -07:00