"Why would one want to run it in parallel?" I hear you ask. I am glad
you are curious, because a curious story is what it is, indeed.
The `GIT-VERSION-GEN` script is quite a pillar of Git's source code,
with most lines being unchanged for the past 15 years. Until the v2.48.0
release candidate cycle.
Its original purpose was to generate the version string and store it in
the `GIT-VERSION-FILE`.
This paradigm changed quite dramatically when support for building with
Meson was introduced. Most crucially, a38edab7c8 (Makefile: generate
doc versions via GIT-VERSION-GEN, 2024-12-06) changed the way the
documentation is built by using the `GIT-VERSION-GEN` script to write out
the `asciidoctor-extensions.rb` and `asciidoc.conf` files with now
hard-coded version strings.
Crucially, the Makefile rule to generate those files needs to be run in
every build because `GIT_VERSION` could have been specified in the
`make` command-line, which would require these files to be modified.
This introduced a surprising race condition!
And this is how that race surfaces: When calling `make -j2 html man`
from the top-level directory (a variant of which is invoked in Git for
Windows' release process), two sub-processes are spawned, a `make -C
Documentation html` one and a `make -C Documentation man` one. Both run
the rule to (re-)generate `asciidoctor-extensions.rb` or
`asciidoc.conf`, invoking `GIT-VERSION-GEN` to do so. That script first
generates a temporary file (appending the `+` character to the
filename), then checks whether its contents differ from the already
existing file (if it exists, that is), and either replaces it if
needed, or removes the temporary file. If one of the two parallel
invocations removes that temporary file before the other can compare it,
or even worse: if one tries to replace the target file just after the
other _started_ writing the temporary file (but did not finish writing
it yet), that race condition now causes bad builds.
This may sound highly theoretical, but due to the design of Git's build
process, Git for Windows is forced to use a (slow) POSIX emulation layer
to run that script and in the blink of an eye it becomes very much not
theoretical at all. See Exhibit A: These GitHub workflow runs failed
because one of the two competing `make` processes tried to remove the
temporary file when the other process had already done so:
https://github.com/git-for-windows/git-sdk-32/actions/runs/12663456654
https://github.com/git-for-windows/git-sdk-32/actions/runs/12683174970
https://github.com/git-for-windows/git-sdk-64/actions/runs/12649348496
While it is undesirable to run this script over and over again,
certainly when this involves the above-mentioned slow POSIX emulation layer,
the stage of the release cycle in which we are presently finding
ourselves does not lend itself to a re-design where this script could be
run once, and once only, but instead dictates that a quick and reliable
work-around be implemented that prevents the race condition without
changing the overall architecture of the build process.
This patch does that: by using a filename suffix for the temporary file
which is based on the currently-executing script's process ID, we
guarantee that the two competing invocations cannot overwrite or remove
each other's temporary files.
The filename suffix still ends in `+` to ensure that the temporary
artifacts are matched by the `*+` pattern in `.gitignore` that was added
in f9bbaa384e (Add intermediate build products to .gitignore,
2009-11-08).
Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When passing the `-b` flag to git-blame(1), any blamed boundary
commits which were marked as uninteresting will not get their actual
commit ID printed, but will instead be replaced by a couple of spaces.
The flag can lead to an out-of-bounds write when combined with
`--abbrev=` and the abbreviation length is longer than `GIT_MAX_HEXSZ`,
as we simply use memset(3p) on that array with the user-provided length
directly. The result is most likely that we segfault.
An obvious fix would be to cull `length` to at most `GIT_MAX_HEXSZ`
bytes. But when the underlying object ID is SHA1 and the abbreviated
length exceeds the SHA1 length, this would cause us to print more bytes
than desired, and the result would be misaligned.
Instead, fix the bug by computing the length via strlen(3p). This makes
us write as many bytes as the formatted object ID requires and thus
effectively limits the length of what we may end up printing to the
length of its hash. If `--abbrev=` asks us to abbreviate to something
shorter than the full length of the underlying hash function, that is
handled correctly by the call to printf(3p).
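To illustrate the resulting shape, here is a hedged sketch (not the
exact builtin/blame.c code; `is_boundary` stands in for the actual
UNINTERESTING check, and the surrounding output handling is elided):

    char hex[GIT_MAX_HEXSZ + 1];
    size_t length;

    repo_find_unique_abbrev_r(the_repository, hex,
                              &suspect->commit->object.oid, abbrev);
    length = strlen(hex);   /* bounded by the formatted OID, not by --abbrev= */
    if (is_boundary && blank_boundary)
            memset(hex, ' ', length);   /* blank only the bytes that exist */
    printf("%.*s", (int)length, hex);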
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.
It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, whereas printf(3p) does.
Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, credential-cache populated authtype regardless of whether
the "get" request had the authtype capability. As documented in
git-credential.txt, authtype "should not be sent unless the appropriate
capability ... is provided".
Add a test. Without this change, the test failed because "credential fill"
printed an incomplete credential with only protocol and host attributes
(the unexpected authtype attribute was discarded by credential.c).
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ll_diff_tree_paths() function and its helpers all take a pointer to
a list tail, possibly add to it, and then return the new tail. This
works but has two downsides:
- The top-level caller (diff_tree_paths() in this case) has to make a
fake combine_diff_path struct to act as the list head. This is
especially weird here, as it's a flexible-sized struct which will
have an empty FLEX_ARRAY field. That used to be a portability
problem, though these days it is legal because our FLEX_ARRAY macro
over-allocates if necessary. It's still kind of ugly, though.
- Besides the name "tail", it's not immediately obvious that the entry
we pass around will not be examined by each function. Using a
pointer-to-pointer or similar makes it more obvious we only care
about the pointer itself, not its contents.
We can solve both by passing around a pointer to the tail instead. That
gets rid of the return value entirely, though note that because of the
recursion we actually need a three-star pointer for this to work.
The result is fairly readable, as we only need to dereference the tail
in one spot. If we wanted to make it simpler we could wrap the tail in a
struct, which we pass around.
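To make the convention concrete, a minimal sketch (not the actual
tree-diff.c code; `append_one()` and `some_entry` are illustrative
names):

    static void append_one(struct combine_diff_path ***tail,
                           struct combine_diff_path *p)
    {
            p->next = NULL;
            **tail = p;          /* link after the current last entry */
            *tail = &p->next;    /* advance the tail for the next append */
    }

    /* caller side: no fake head struct is needed */
    struct combine_diff_path *head = NULL;
    struct combine_diff_path **tail = &head;
    append_one(&tail, some_entry);
    /* head now points at the first real entry, or stays NULL */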
Another option is to convert combine_diff to use our generic list_head
API. I tried that and found the result became much harder to read
overall. It means that _all_ code that looks at combine_diff_path
structs needs to be modified, since the "next" pointer is now inside a
list_head which has to be dereferenced with list_entry(). And we lose
some type safety, since we're just passing around a list_head struct
everywhere, and everybody who looks at it has to specify the type to
list_entry themselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In emit_path() we may append a new combine_diff_path entry to our list,
decide that we don't want it (because opt->pathchange() told us so) and
then roll it back.
Between the addition and the rollback, it doesn't matter if it's in the
list or not (no functions can even tell, since it's a singly-linked list
and we pass around just the tail entry).
So it's much simpler to just wait until opt->pathchange() tells us
whether to keep it, and attach it (or free it) then. We do still
have to allocate it up front since it's that struct itself which is
passed to the pathchange callback.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ll_diff_tree_paths() function and its helpers all append to a
running list by taking in a pointer to the old tail and returning the
new tail. But they just call this argument "p", which is not very
descriptive.
It gets particularly confusing in emit_path(), where we actually add to
the list, because "p" does double-duty: it is the tail of the list, but
it is also the entry which we add. Except that in some cases we _don't_
add a new entry (or we might even add it and roll it back) if the path
isn't interesting. At first glance, this makes it look like a bug that
we pass "p" on to ll_diff_tree_paths() to recurse; sometimes it is
getting the new entry we made and sometimes not!
But it's not a bug, because ll_diff_tree_paths() does not care about the
entry itself at all. It is only using its "next" pointer as the tail of
the list.
Let's swap out "p" for "tail" to make this obvious. And then in
emit_path() we'll continue to use "p" for our newly allocated entry.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internals of the path diffing code, including ll_diff_tree_paths(),
all take an extra combine_diff_path parameter which they use as the tail
of a list of results, appending any new entries to it.
The public-facing diff_tree_paths() takes the same argument, but it just
makes the callers more awkward. They always start with a clean list, and
have to set up a fake head struct to pass in.
Let's keep the public API clean by always returning a new list. That
keeps the fake struct as an implementation detail of tree-diff.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want callers to use combine_diff_path_new() to allocate structs,
rather than using combine_diff_path_size() and xmalloc(). That gives us
more consistency over the initialization of the fields.
Now that the final external user of combine_diff_path_size() is gone, we
can stop declaring it publicly. And since our constructor is the only
caller, we can just inline it there.
Breaking the size computation into two parts also lets us reuse the
intermediate multiplication result of the parent length, since we need
to know it to perform our memset(). The result is a little easier to
read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our path_appendnew() has been simplified to the point that it is mostly
just implementing combine_diff_path_new(), plus setting the "next"
pointer. Since there's only one caller, let's replace it completely with
a call to that helper function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffing trees, we'll have a strbuf "base" containing the
slash-separated names of our parent trees, and a "path" string
representing an entry name from the current tree. We pass these
separately to path_appendnew(), which combines them to form a single
path string in the combine_diff_path struct.
Instead, let's append the path string to our base strbuf ourselves, pass
in the result, and then roll it back with strbuf_setlen(). This lets us
simplify path_appendnew() a bit, enabling further refactoring.
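Roughly, the pattern looks like this (a sketch assuming the
combine_diff_path_new() constructor from earlier in this series;
variable names are illustrative):

    size_t old_baselen = base->len;
    struct combine_diff_path *p;

    strbuf_add(base, path, pathlen);   /* temporarily extend the shared base */
    p = combine_diff_path_new(base->buf, base->len, mode, oid, nparent);
    strbuf_setlen(base, old_baselen);  /* roll the strbuf back for the next entry */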
And while it might seem like this causes extra wasted allocations, it
does not in practice. We reuse the same strbuf for each tree entry, so
we only have to allocate it to match the largest name. Plus, in a
recursive diff we'll end up doing this same operation to extend the base
for the next level of recursion. So we're really just incurring a small
memcpy().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're diffing trees, we create a list of combine_diff_path structs
that represent changed paths. We allocate each struct and add it to the
list with path_appendnew(), which we then feed to opt->pathchange().
That function tells us whether the path is of interest or not; if not,
then we can throw away the struct we allocated.
So there's an optimization to avoid extra allocations: instead of
throwing away the new entry, we try to reuse it. If it was large enough
to store the next path we care about, we can do so. And if not, we fall
back to freeing and re-allocating a new struct.
This comes from 72441af7c4 (tree-diff: rework diff_tree() to generate
diffs for multiparent cases as well, 2014-04-07), where the goal was to
have even the 2-parent diff code use the combine-diff infrastructure,
but without taking a performance hit.
The implementation causes some complexities in the interface (as we
store the allocation length inside the "next" pointer), and prevents us
from using the regular combine_diff_path_new() constructor. The
complexity is mostly contained inside two functions, but it's worth
re-evaluating how much it's helping.
That commit claims it helps ~1% on generating two-parent diffs in
linux.git. Here are the timings I get on the same command today ("old"
is the current tip of master, and "new" has this patch applied):
Benchmark 1: ./git.old log --raw --no-abbrev --no-renames v3.10..v3.11
Time (mean ± σ): 532.9 ms ± 5.8 ms [User: 472.7 ms, System: 59.6 ms]
Range (min … max): 525.9 ms … 543.3 ms 10 runs
Benchmark 2: ./git.new log --raw --no-abbrev --no-renames v3.10..v3.11
Time (mean ± σ): 538.3 ms ± 5.7 ms [User: 478.0 ms, System: 59.7 ms]
Range (min … max): 528.5 ms … 545.3 ms 10 runs
Summary
./git.old log --raw --no-abbrev --no-renames v3.10..v3.11 ran
1.01 ± 0.02 times faster than ./git.new log --raw --no-abbrev --no-renames v3.10..v3.11
So the optimization does buy us about 1% on average, but with 2% of noise. I tried to
focus more on diff performance by running the commit traversal
separately, like:
git rev-list v3.10..v3.11 >in
and then timing just the diffs:
Benchmark 1: ./git.old diff-tree --stdin -r <in
Time (mean ± σ): 415.7 ms ± 5.8 ms [User: 357.7 ms, System: 58.0 ms]
Range (min … max): 410.9 ms … 430.3 ms 10 runs
Benchmark 2: ./git.new diff-tree --stdin -r <in
Time (mean ± σ): 418.5 ms ± 2.1 ms [User: 361.7 ms, System: 56.6 ms]
Range (min … max): 414.9 ms … 421.3 ms 10 runs
Summary
./git.old diff-tree --stdin -r <in ran
1.01 ± 0.02 times faster than ./git.new diff-tree --stdin -r <in
That gets roughly the same result.
Adding in "-c" to do multi-parent diffs doesn't change much:
Benchmark 1: ./git.old diff-tree --stdin -r -c <in
Time (mean ± σ): 525.3 ms ± 6.6 ms [User: 470.0 ms, System: 55.1 ms]
Range (min … max): 508.4 ms … 531.0 ms 10 runs
Benchmark 2: ./git.new diff-tree --stdin -r -c <in
Time (mean ± σ): 532.3 ms ± 6.2 ms [User: 469.0 ms, System: 63.1 ms]
Range (min … max): 520.3 ms … 539.4 ms 10 runs
Summary
./git.old diff-tree --stdin -r -c <in ran
1.01 ± 0.02 times faster than ./git.new diff-tree --stdin -r -c <in
And of course if you add in a lot more work by doing actual
content-level diffs, any difference is lost entirely (here the newer
version is actually faster, but that's really just noise):
Benchmark 1: ./git.old diff-tree --stdin -r --cc <in
Time (mean ± σ): 11.571 s ± 0.064 s [User: 11.287 s, System: 0.283 s]
Range (min … max): 11.497 s … 11.615 s 3 runs
Benchmark 2: ./git.new diff-tree --stdin -r --cc <in
Time (mean ± σ): 11.466 s ± 0.109 s [User: 11.108 s, System: 0.357 s]
Range (min … max): 11.346 s … 11.560 s 3 runs
Summary
./git.new diff-tree --stdin -r --cc <in ran
1.01 ± 0.01 times faster than ./git.old diff-tree --stdin -r --cc <in
So my conclusion is that it probably does help a little, but it's mostly
lost in the noise. I could see an argument for keeping it, as the
complexity is hidden away in functions that do not often need to be
touched. But it does make them more confusing than necessary (despite
some detailed explanations from the author of that commit; it just took
me a while to wrap my head around what was going on) and prevents
further refactoring of the combine_diff_path struct. So let's drop it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allocate a combine_diff_path struct with space for 5 parents. Why 5?
The history is not particularly enlightening. The allocation comes from
b4b1550315 (Don't instantiate structures with FAMs., 2006-06-18), which
just switched to xmalloc from a stack struct with 5 elements. That
struct changed to 5 from 4 in 2454c962fb (combine-diff: show mode
changes as well., 2006-02-06), when we also moved from storing raw sha1
bytes to the combine_diff_parent struct. But no explanation is given.
That 4 comes from the earliest code in ea726d02e9 (diff-files: -c and
--cc options., 2006-01-28).
One might guess it is for the 4 stages we can store in the index. But
this code path only ever diffs the current state against stages 2 and 3.
So we only need two slots.
And it's easy to see this is still the case. We fill the parent slots by
subtracting 2 from the ce_stage() values, ignoring values below 2. And
since ce_stage() is only 2 bits, there are 4 values, and thus we need 2
slots.
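In code, that mapping looks roughly like this (a simplified sketch of
the index-diff logic, with most fields elided):

    int stage = ce_stage(ce);   /* two bits wide: 0, 1, 2 or 3 */
    if (stage >= 2) {
            /* stages 2 and 3 land in parent slots 0 and 1 */
            dpath->parent[stage - 2].status = DIFF_STATUS_MODIFIED;
            dpath->parent[stage - 2].mode = ce->ce_mode;
            oidcpy(&dpath->parent[stage - 2].oid, &ce->oid);
    }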
Let's use the correct value (saving a tiny bit of memory) and add a
comment explaining what's going on (saving a tiny bit of programmer
brain power).
Arguably we could use:
1 + (STAGEMASK >> STAGESHIFT) - 2
which lets the compiler enforce that we will not go out-of-bounds if we
see an unexpected value from ce_stage(). But that is more confusing to
explain, and the constant "2" is baked into other parts of the function.
It is a fundamental constant, not something where somebody might bump a
macro and forget to update this code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We only fill in the per-parent "path" field when it differs from what's
in combine_diff_path.path (and even then only when the option is
appropriate). Let's document that.
Suggested-by: Wink Saville <wink@saville.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d76ce4f734 (log,diff-tree: add --combined-all-paths option,
2019-02-07) added a "path" field to each combine_diff_parent struct.
It's defined as a strbuf, but this is overkill. We never manipulate the
buffer beyond inserting a single string into it.
And in fact there's a small bug: we zero the parent structs, including
the path strbufs. For the 0th parent, we strbuf_init() the strbuf before
adding to it. But for subsequent parents, we never do the init. This is
technically violating the strbuf API, though the code there is resilient
enough to handle this zero'd state.
This patch switches us to just store an allocated string pointer.
Zeroing it is enough to properly initialize it there (modulo the usual
assumption we make that a NULL pointer is all-zeroes).
And as a bonus, we can just check for a non-NULL value to see if it is
present, rather than repeating the combined_all_paths logic at each
site.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the other functions which allocate a combine_diff_path struct
zero out the parent array, but this code path does not. There's no bug,
since our caller will fill in most of the fields. But leaving the unused
fields (like combine_diff_parent.path) uninitialized makes working with
the struct more error-prone than it needs to be.
Let's just zero the parent field to be consistent with the
combine_diff_path_new() allocator.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The combine_diff_path struct has variable size, since it embeds both the
memory allocation for the path field as well as a variable-sized parent
array. This makes allocating one a bit tricky.
We have a helper to compute the required size, but it's up to individual
sites to actually initialize all of the fields. Let's provide a
constructor function to make that a little nicer. Besides being shorter,
it also hides away tricky bits like the computation of the "path"
pointer (which is right after the "parent" flex array).
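The constructor then takes roughly this shape (a sketch, not the
verbatim implementation; the overflow-checked size computation is the
existing combine_diff_path_size() helper):

    struct combine_diff_path *combine_diff_path_new(const char *path,
                                                    size_t path_len,
                                                    unsigned int mode,
                                                    const struct object_id *oid,
                                                    size_t num_parents)
    {
            struct combine_diff_path *p;

            p = xmalloc(combine_diff_path_size(num_parents, path_len));
            p->path = (char *)&(p->parent[num_parents]); /* right after the flex array */
            memcpy(p->path, path, path_len);
            p->path[path_len] = 0;
            p->next = NULL;
            p->mode = mode;
            oidcpy(&p->oid, oid);
            memset(p->parent, 0, sizeof(p->parent[0]) * num_parents);
            return p;
    }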
As a bonus, using the same constructor everywhere means that we'll
consistently initialize all parts of the struct. A few code paths left
the parent array uninitialized. This didn't cause any bugs, but we'll be
able to simplify some code in the next few patches knowing that the
parent fields have all been zero'd.
This also gets rid of some questionable uses of "int" to store buffer
lengths. Though we do use them to allocate, I don't think there are any
integer overflow vulnerabilities here (the allocation helper promotes
them to size_t and checks arithmetic for overflow, and the actual memcpy
of the bytes is done using the possibly-truncated "int" value).
Sadly we can't use the FLEX_* macros to simplify the allocation here,
because there are two variable-sized parts to the struct (and those
macros only handle one).
Nor can we yet stop publicly declaring combine_diff_path_size(). This
patch does not touch the code in path_appendnew() at all, which is not
ready to be moved to our new constructor for a few reasons:
- path_appendnew() has a memory-reuse optimization where it tries to
reuse combine_diff_path structs rather than freeing and
reallocating.
- path_appendnew() does not create the struct from a single path
string, but rather allocates and copies into the buffer from
multiple sources.
These can be addressed by some refactoring, but let's leave it as-is for
now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While looping over the index entries, when we see a higher level stage
the first thing we do is allocate a combine_diff_path struct for it. But
this can leak; if check_removed() returns an error, we'll continue to
the next iteration of the loop without cleaning up.
We can fix this by just delaying the allocation by a few lines.
I don't think this leak is triggered in the test suite, but it's pretty
easy to see by inspection. My ulterior motive here is that the delayed
allocation means we have all of the data needed to initialize "dpath" at
the time of malloc, making it easier to factor out a constructor
function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2a9dfdf260 (difftool docs: de-duplicate configuration sections, 2022-09-07)
moved the difftool documentation, but missed moving this "include" line that
includes the generated list of diff tools, as referenced in the moved text.
Restore the correct position of the included list.
Signed-off-by: Adam Johnson <me@adamj.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adapt the hash test functions to the clar framework by using clar
assertions where necessary. This follows the consensus, driven by
Patrick Steinhardt, to convert the unit-test scripts found in the
t/unit-tests folder to clar. Test functions are structured as
standalone units, each testing an individual hash string or literal
case.
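For illustration only, a clar test takes roughly this shape (the suite
and test names as well as the compute_hash_hex() helper are
hypothetical, not the actual converted tests):

    #include "unit-test.h"

    void test_hash__sha1_empty_string(void)
    {
            /* hypothetical helper standing in for the real hash plumbing */
            const char *actual = compute_hash_hex("");

            cl_assert_equal_s("da39a3ee5e6b4b0d3255bfef95601890afd80709", actual);
    }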
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The build procedure in "meson" for the "perl/" hierarchy lacked
necessary dependencies, which has been corrected.
* sj/meson-perl-build-fix:
meson: fix perl dependencies
As indicated by the `#undef malloc` line in `reftable/basics.h`, it is
quite common to use allocators other than the default one by defining
`malloc` macros and friends.
This pattern is used e.g. in Git for Windows, which uses the powerful
and performant `mimalloc` allocator.
Furthermore, in `reftable/basics.c` this `#undef malloc` is
_specifically_ disabled by virtue of defining the
`REFTABLE_ALLOW_BANNED_ALLOCATORS` constant before including
`reftable/basics.h`, to ensure that such a custom allocator is also used
in the reftable code.
However, in 8db127d43f (reftable: avoid leaks on realloc error,
2024-12-28) and in 2cca185e85 (reftable: fix allocation count on
realloc error, 2024-12-28), `reftable_set_alloc()` function calls were
introduced that pass `malloc`, `realloc` and `free` function pointers as
parameters _after_ `reftable/basics.h` ensured that they were no longer
`#define`d. This would override the custom allocator and re-set it to
the default allocator provided by, say, libc or MSVCRT.
This causes problems because those calls happen after the initial
allocator has already been used to initialize an array, which is
subsequently resized using the overridden default `realloc()` allocator.
You cannot mix and match allocators like that, which leads to a
`STATUS_HEAP_CORRUPTION` (C0000374) on Windows, and when running this
unit test through shell and/or `prove` (which only support 7-bit status
codes), it surfaces as exit code 127.
It is actually unnecessary to use those function pointers to
`malloc`/`realloc`/`free`, though: The `reftable` code goes out of its
way to fall back to the initial allocator when passing `NULL` parameters
instead. So let's do that instead of causing heap corruptions.
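Concretely, the fix amounts to replacing calls of the first shape below
with the second:

    /* problematic: re-installs the stdlib allocator, overriding e.g. mimalloc */
    reftable_set_alloc(malloc, realloc, free);

    /* fix: NULL falls back to the allocator that is already configured */
    reftable_set_alloc(NULL, NULL, NULL);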
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`generate_perl_command` needs `depends: [git_version_file]` and the uses
in top-level meson.build were fine, but the ones in perl/ weren't, causing
parallel build failures in some cases as GIT-BUILD-OPTIONS wasn't yet
available.
Signed-off-by: Sam James <sam@gentoo.org>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following the `CodingGuidlines`, since these placeholders are literal
they should be typeset verbatim, so fix some that aren't.
Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct verb tense, add missing words, avoid double blank lines,
and rephrase things that don’t read well to me like “Turn this linkage
to relative paths”.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Git, fsck operations can ignore known broken objects via the
`fsck.skipList` configuration. This option expects a path to a file with
the list of object names. When the configuration is specified without a
path, an error message is printed, but the command continues as if the
configuration was not set. Configuring `fsck.skipList` without a value
is a misconfiguration so config parsing should be more strict and reject
it.
Update `git_fsck_config()` to no longer ignore misconfiguration of
`fsck.skipList`. The same behavior is also present for
`fetch.fsck.skipList` and `receive.fsck.skipList` so the configuration
parsers for these are updated to ensure the related operations remain
consistent.
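Sketched against git_fsck_config() (simplified, with the actual loading
of the list elided), the stricter behaviour amounts to something like:

    if (!strcmp(var, "fsck.skiplist")) {
            if (!value)
                    return config_error_nonbool(var); /* reject a valueless key */
            /* ... expand the path and load the skip list as before ... */
            return 0;
    }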
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable library uses randomness in two call paths:
- When reading a stack in case some of the referenced tables
disappears. The randomness is used to delay the next read by a
couple of milliseconds.
- When writing a new table, where the randomness gets appended to the
table name (e.g. "0x000000000001-0x000000000002-0b1d8ddf.ref").
In neither of these cases do we need strong randomness.
Unfortunately though, we have observed test failures caused by the
former case. In t0610 we have a test that spawns 100 processes at
once, all of which try to write a new table to the stack. And given that
all of the processes will require randomness, it can happen that these
processes make the entropy pool run dry, which will then cause us to
die:
+ test_seq 100
+ printf %s commit\trefs/heads/branch-%s\n
68d032e9edd3481ac96382786ececc37ec28709e 1
+ printf %s commit\trefs/heads/branch-%s\n
68d032e9edd3481ac96382786ececc37ec28709e 2
...
+ git update-ref refs/heads/branch-98 HEAD
+ git update-ref refs/heads/branch-97 HEAD
+ git update-ref refs/heads/branch-99 HEAD
+ git update-ref refs/heads/branch-100 HEAD
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
fatal: unable to get random bytes
The report was for NonStop, which uses OpenSSL as the backend for
randomness. In the preceding commit we have adapted that backend to also
return randomness in case the entropy pool is empty and the caller
passes the `CSPRNG_BYTES_INSECURE` flag. Do so to fix the issue.
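On the reftable side this boils down to requesting the weaker
randomness explicitly; a hedged sketch, assuming csprng_bytes() grew a
flags parameter as described in the accompanying commit:

    /* weak randomness is fine for a retry delay or a table-name suffix */
    uint32_t rnd;
    csprng_bytes(&rnd, sizeof(rnd), CSPRNG_BYTES_INSECURE);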
Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `csprng_bytes()` function generates randomness and writes it into a
caller-provided buffer. It abstracts over a couple of implementations,
where the exact one that is used depends on the platform.
These implementations have different guarantees: while some guarantee to
never fail (arc4random(3)), others may fail. There are two significant
failures to distinguish from one another:
- Systemic failure, where e.g. opening "/dev/urandom" fails or when
OpenSSL doesn't have a provider configured.
- Entropy failure, where the entropy pool is exhausted, and thus the
function cannot guarantee strong cryptographic randomness.
While we cannot do anything about the former, the latter failure can be
acceptable in some situations where we don't care whether or not the
randomness can be predicted.
Introduce a new `CSPRNG_BYTES_INSECURE` flag that allows callers to opt
into weak cryptographic randomness. The exact behaviour of the flag
depends on the underlying implementation:
- `arc4random_buf()` never returns an error, so it doesn't change.
- `getrandom()` pulls from "/dev/urandom" by default, which never
blocks on modern systems even when the entropy pool is empty.
- `getentropy()` seems to block when there is not enough randomness
available, and there is no way of changing that behaviour.
- `RtlGenRandom()` doesn't mention anything about its specific
failure mode.
- The fallback reads from "/dev/urandom", which also returns bytes in
case the entropy pool is drained on modern Linux systems.
That only leaves OpenSSL with `RAND_bytes()`, which returns an error in
case the returned data wouldn't be cryptographically safe. This function
is replaced with a call to `RAND_pseudo_bytes()`, which can indicate
whether or not the returned data is cryptographically secure via its
return value. If it is insecure, and if the `CSPRNG_BYTES_INSECURE` flag
is set, then we ignore the insecurity and return the data regardless.
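The OpenSSL branch of csprng_bytes() then ends up looking roughly like
this (a sketch with simplified error handling):

    int ret = RAND_pseudo_bytes(buf, len);
    if (ret < 0)
            return -1;   /* systemic failure */
    if (ret == 0 && !(flags & CSPRNG_BYTES_INSECURE))
            return -1;   /* weak randomness and the caller did not opt in */
    return 0;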
It is somewhat questionable whether we really need the flag in the
first place, or whether we couldn't just always use the
potentially-insecure data. But the risk of doing that is that we might
have or grow callsites that aren't aware of the potential insecurity of
the data in places where it really matters. So using a flag to opt in
to that behaviour feels like the more secure choice.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few grep calls here that can benefit from test_grep, which
produces more user-friendly output when it fails.
One of these calls also passes "-sq", which is curious. The "-q" option
suppresses the matched output. But test output is already
redirected to /dev/null in non-verbose mode, and in verbose mode it's
better to see the output. The "-s" option suppresses errors opening
files, but we are just grepping in the "expected" file we just
generated, so it should not be needed. Neither of these was really
hurting anything, but they are not a style we'd like to see emulated. So
get rid of them.
(It is also curious to grep in the expected file in the first place, but
that is because we are auto-generating the expectation from a Git
command. So this is double-checking it did what we wanted).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
code, 2025-01-01) added code to suppress a false positive in the leak
checker. But if you're just reading the code, the obscure grep call is a
bit of a head-scratcher. Let's add a brief comment explaining what's
going on (and anybody digging further can find this commit or that one
for all the details).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:
- it's one fewer process to run
- we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
checked at the beginning of the function, and is the same glob used
to show the logs in check_test_results_san_file_
- this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
space in it. For example doing:
mkdir "/tmp/foo bar"
TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test
would yield a lot of:
grep: /tmp/foo: No such file or directory
grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory
when there are leaks. We could do the same thing with "xargs
--null", but that isn't portable.
We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.
We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a function to check whether LSan logged any leaks. It returns
success for no leaks, and non-zero otherwise. This is the simplest thing
for its callers, who want to say "if no leaks then return early". But
because it's implemented as a shell pipeline, you end up with the
awkward:
! find ... |
xargs grep leaks |
grep -v false-positives
where the "!" is actually negating the final grep. Switch the return
value (and name) to return success when there are leaks. This should
make the code a little easier to read, and the negation in the callers
still reads pretty naturally.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 1e0ee4087e (completion: add and use
__git_compute_first_level_config_vars_for_section, 2024-02-10) uses an
indirect variable syntax that is only valid for Bash, but the Zsh
completion code relies on the Bash completion code to function. Zsh
supports a different indirect variable expansion using ${(P)var}, but in
`emulate ksh` mode does not support Bash's ${!var}.
This manifests as completing strange config options like
"__git_first_level_config_vars_for_section_remote" as a choice for the
command line
git config set remote.
Using Zsh's C-x ? _complete_debug widget with the cursor at the end of
that command line captures a trace, in which we see (some details
elided):
+__git_complete_config_variable_name:7> __git_compute_first_level_config_vars_for_section remote
+__git_compute_first_level_config_vars_for_section:7> local section=remote
+__git_compute_first_level_config_vars_for_section:7> __git_compute_config_vars
+__git_compute_config_vars:7> test -n $'add.ignoreErrors\nadvice.addEmbeddedRepo\nadvice.addEmptyPathspec\nadvice.addIgnoredFile[…]'
+__git_compute_first_level_config_vars_for_section:7> local this_section=__git_first_level_config_vars_for_section_remote
+__git_compute_first_level_config_vars_for_section:7> test -n __git_first_level_config_vars_for_section_remote
+__git_complete_config_variable_name:7> local this_section=__git_first_level_config_vars_for_section_remote
+__git_complete_config_variable_name:7> __gitcomp_nl_append __git_first_level_config_vars_for_section_remote remote. '' ' '
+__gitcomp_nl_append:7> __gitcomp_nl __git_first_level_config_vars_for_section_remote remote. '' ' '
+__gitcomp_nl:7> emulate -L zsh
+__gitcomp_nl:7> compset -P '*[=:]'
+__gitcomp_nl:7> compadd -Q -S ' ' -p remote. -- __git_first_level_config_vars_for_section_remote
We perform the test for __git_compute_config_vars correctly, but the
${!this_section} references are not expanded as expected.
Instead, portably expand indirect references through the new
__git_indirect. Contrary to some versions you might find online [1],
this version avoids echo non-portabilities [2] [3] and correctly quotes
the indirect expansion after eval (so that the result is not split or
globbed before being handed to printf).
[1]: https://unix.stackexchange.com/a/41409/301073
[2]: https://askubuntu.com/questions/715765/mysterious-behavior-of-echo-command#comment1056038_715769
[3]: https://mywiki.wooledge.org/CatEchoLs
The following demo program demonstrates how this works:
b=1
indirect() {
    eval printf '%s' "\"\$$1\""
}
f() {
    # Comment this out to see that it works for globals, too. Or, use
    # a value with spaces like '2 3 4' to see how it handles those.
    local b=2
    local a=b
    test -n "$(indirect $a)" && echo nice
}
f
When placed in a file "demo", then both
bash -x demo
and
zsh -xc 'emulate ksh -c ". ./demo"' |& tail
provide traces showing that "$(indirect $a)" produces 2 (or 1, with the
global, or "2 3 4" as a single string, etc.).
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Acked-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30), callers could expect that a successful return from
`finalize_object_file()` means that either the file was moved into
place, or the identical bytes were already present. If neither of those
happens, we'd return an error.
Since that commit, if the destination file disappears between our
link(3p) call and the collision check, we'd return success without
actually checking the contents, and without retrying the link. This
solves the common case that the files were indeed the same, but it means
that we may corrupt the repository if they weren't (this implies a hash
collision, but the whole point of this function is protecting against
hash collisions).
We can't be pessimistic and assume they're different; that hurts the
common case that the mentioned commit was trying to fix. But after
seeing that the destination file went away, we can retry the link.
Adapt the code to do so when we see that the destination file has racily
vanished. This should generally succeed as we have just observed that
the destination file does not exist anymore, except in the very unlikely
event that it gets recreated by another concurrent process again.
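A sketch of the retry flow (simplified; the real code in object-file.c
covers more cases, and COLLISION_DEST_VANISHED is an illustrative name
for however the collision check reports the vanished destination):

    retry:
            if (!link(tmpfile, filename))
                    return 0;            /* moved into place */
            if (errno != EEXIST)
                    return error_errno("unable to write %s", filename);

            /* the name exists: check whether it has the same contents */
            ret = check_collision(tmpfile, filename);
            if (ret == COLLISION_DEST_VANISHED)
                    goto retry;          /* link again rather than assume success */
            return ret;                  /* 0 if identical, an error otherwise */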
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>