Patrick Steinhardt a96e9a20f3 reftable/merged: allocation-less dropping of shadowed records
The purpose of the merged reftable iterator is to iterate through all
entries of a set of tables in the correct order. This is implemented by
using a sub-iterator for each table, where the next entry of each of
these iterators gets put into a priority queue. For each iteration, we
do roughly the following steps:

  1. Retrieve the top record of the priority queue. This is the entry we
     want to return to the caller.

  2. Retrieve the next record of the sub-iterator that this record came
     from. If any, add it to the priority queue at the correct position.
     The position is determined by comparing the record keys, which for
     ref records are the refnames.

  3. Keep removing the top record of the priority queue until we hit the
     first entry whose key is larger than the returned record's key.
     This is required to drop "shadowed" records.
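
The three steps above can be sketched in a few lines of C. This is a
simplified illustration, not the reftable code: `struct rec`,
`struct table`, `merged_init()` and `merged_next()` are hypothetical
names, the priority queue is a small array scanned linearly instead of
a heap, and tables with a higher index are assumed to be newer and thus
to shadow older ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record; real reftable records carry more data. */
struct rec {
	const char *key;   /* e.g. the refname for ref records */
	const char *value;
};

/* One sorted table plus the read position of its sub-iterator. */
struct table {
	const struct rec *recs;
	size_t len, pos;
};

struct pq_entry {
	const struct rec *rec;
	size_t table; /* index of the table the record came from */
};

#define MAX_TABLES 8

struct merged_iter {
	struct table *tables;
	size_t ntables;
	struct pq_entry pq[MAX_TABLES]; /* at most one entry per table */
	size_t pq_len;
};

/* Order by key; on equal keys the newer (higher-index) table sorts first. */
static int entry_less(const struct pq_entry *a, const struct pq_entry *b)
{
	int cmp = strcmp(a->rec->key, b->rec->key);
	if (cmp)
		return cmp < 0;
	return a->table > b->table;
}

/* Queue the next record of table `t`'s sub-iterator, if it has one. */
static void advance(struct merged_iter *it, size_t t)
{
	struct table *tab = &it->tables[t];
	if (tab->pos < tab->len) {
		assert(it->pq_len < MAX_TABLES);
		it->pq[it->pq_len].rec = &tab->recs[tab->pos++];
		it->pq[it->pq_len].table = t;
		it->pq_len++;
	}
}

/* Index of the smallest queued entry; the queue must not be empty. */
static size_t top_index(const struct merged_iter *it)
{
	size_t i, min = 0;
	for (i = 1; i < it->pq_len; i++)
		if (entry_less(&it->pq[i], &it->pq[min]))
			min = i;
	return min;
}

/* Remove and return the smallest queued entry. */
static struct pq_entry pop(struct merged_iter *it)
{
	size_t min = top_index(it);
	struct pq_entry top = it->pq[min];
	it->pq[min] = it->pq[--it->pq_len];
	return top;
}

void merged_init(struct merged_iter *it, struct table *tables, size_t ntables)
{
	size_t t;
	assert(ntables <= MAX_TABLES);
	it->tables = tables;
	it->ntables = ntables;
	it->pq_len = 0;
	for (t = 0; t < ntables; t++) {
		tables[t].pos = 0;
		advance(it, t);
	}
}

/* Return the next visible record, or NULL once all tables are exhausted. */
const struct rec *merged_next(struct merged_iter *it)
{
	struct pq_entry top;
	if (!it->pq_len)
		return NULL;
	top = pop(it);          /* step 1: take the top record */
	advance(it, top.table); /* step 2: refill from its sub-iterator */
	while (it->pq_len) {    /* step 3: drop shadowed records */
		struct pq_entry shadowed;
		if (strcmp(it->pq[top_index(it)].rec->key, top.rec->key) > 0)
			break;
		shadowed = pop(it);
		advance(it, shadowed.table);
	}
	return top.rec;
}
```

Note that the loop in step 3 performs one key comparison per queued
candidate, which is why the cost of a single comparison matters so much
for overall iteration speed.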

The last step will lead to at least one comparison to the next entry,
but may lead to many comparisons in case the reftable stack consists of
many tables with shadowed records. It is thus part of the hot code path
when iterating through records.

The code to compare the entries with each other is quite inefficient
though. Instead of comparing record keys with each other directly, we
first format them into `struct strbuf`s and only then compare them with
each other. While we already optimized this code path to reuse buffers
in 829231dc20 (reftable/merged: reuse buffer to compute record keys,
2023-12-11), the cost to format the keys into the buffers still adds up
quite significantly.

Refactor the code to use `reftable_record_cmp()` instead, which has been
introduced in the preceding commit. This function compares records with
each other directly without requiring any memory allocations or copying
and is thus way more efficient.
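
To illustrate the difference between the two approaches, here is a
minimal sketch. It does not reproduce the reftable API: `struct
ref_rec`, `cmp_via_buffers()` and `cmp_direct()` are hypothetical
names, and plain `malloc()` stands in for `struct strbuf`. Unlike this
sketch, the old code already reused its buffers, so what it paid per
comparison was mostly the formatting and copying.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ref record; only the key matters here. */
struct ref_rec {
	const char *refname;
};

/* Format the record's key into a freshly allocated buffer. */
static char *format_key(const struct ref_rec *r)
{
	size_t len = strlen(r->refname) + 1;
	char *buf = malloc(len);
	if (!buf)
		abort();
	memcpy(buf, r->refname, len);
	return buf;
}

/* Indirect comparison: materialize both keys, then compare the copies. */
int cmp_via_buffers(const struct ref_rec *a, const struct ref_rec *b)
{
	char *ka = format_key(a);
	char *kb = format_key(b);
	int cmp = strcmp(ka, kb);
	free(ka);
	free(kb);
	return cmp;
}

/* Direct comparison: no allocation and no copying. */
int cmp_direct(const struct ref_rec *a, const struct ref_rec *b)
{
	return strcmp(a->refname, b->refname);
}
```

Both functions order records identically; the direct variant simply
skips the intermediate copies.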

The following benchmark uses git-show-ref(1) to print a single ref
matching a pattern out of 1 million refs. This is the most direct way to
exercise ref iteration speed as we remove all overhead of having to show
the refs, too.

    Benchmark 1: show-ref: single matching ref (revision = HEAD~)
      Time (mean ± σ):     180.7 ms ±   4.7 ms    [User: 177.1 ms, System: 3.4 ms]
      Range (min … max):   174.9 ms … 211.7 ms    1000 runs

    Benchmark 2: show-ref: single matching ref (revision = HEAD)
      Time (mean ± σ):     162.1 ms ±   4.4 ms    [User: 158.5 ms, System: 3.4 ms]
      Range (min … max):   155.4 ms … 189.3 ms    1000 runs

    Summary
      show-ref: single matching ref (revision = HEAD) ran
        1.11 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12 09:18:04 -08:00


Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization, l10n) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email to git+subscribe@vger.kernel.org (see https://subspace.kernel.org/subscribing.html for details). The mailing list archives are available at https://lore.kernel.org/git/, https://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks