From 5b6422a6160dc04faa6742434a4d162b4899cfe3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Martin=20=C3=85gren?= Date: Sat, 15 Aug 2020 18:05:59 +0200 Subject: [PATCH 1/4] http-protocol.txt: document SHA-256 "want"/"have" format MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Document that rather than always naming objects using SHA-1, we should use whatever has been negotiated using the object-format capability. Signed-off-by: Martin Ågren Signed-off-by: Junio C Hamano --- Documentation/technical/http-protocol.txt | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt index 51a79e63de..96d89ea9b2 100644 --- a/Documentation/technical/http-protocol.txt +++ b/Documentation/technical/http-protocol.txt @@ -401,8 +401,9 @@ at all in the request stream: The stream is terminated by a pkt-line flush (`0000`). A single "want" or "have" command MUST have one hex formatted -SHA-1 as its value. Multiple SHA-1s MUST be sent by sending -multiple commands. +object name as its value. Multiple object names MUST be sent by sending +multiple commands. Object names MUST be given using the object format +negotiated through the `object-format` capability (default SHA-1). The `have` list is created by popping the first 32 commits from `c_pending`. Less can be supplied if `c_pending` empties. From 123712ba41164146cd711dab6fe107b62d443f12 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Martin=20=C3=85gren?= Date: Sat, 15 Aug 2020 18:06:00 +0200 Subject: [PATCH 2/4] index-format.txt: document SHA-256 index format MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Document that in SHA-1 repositories, we use SHA-1 and in SHA-256 repositories, we use SHA-256, then replace all other uses of "SHA-1" with something more neutral. Avoid referring to "160-bit" hash values. Signed-off-by: Martin Ågren Signed-off-by: Junio C Hamano --- Documentation/technical/index-format.txt | 34 +++++++++++++----------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index faa25c5c52..f9a3644711 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -3,8 +3,11 @@ Git index format == The Git index file has the following format - All binary numbers are in network byte order. Version 2 is described - here unless stated otherwise. + All binary numbers are in network byte order. + In a repository using the traditional SHA-1, checksums and object IDs + (object names) mentioned below are all computed using SHA-1. Similarly, + in SHA-256 repositories, these values are computed using SHA-256. + Version 2 is described here unless stated otherwise. - A 12-byte header consisting of @@ -32,8 +35,7 @@ Git index format Extension data - - 160-bit SHA-1 over the content of the index file before this - checksum. + - Hash checksum over the content of the index file before this checksum. == Index entry @@ -80,7 +82,7 @@ Git index format 32-bit file size This is the on-disk size from stat(2), truncated to 32-bit. - 160-bit SHA-1 for the represented object + Object name for the represented object A 16-bit 'flags' field split into (high to low bits) @@ -160,8 +162,8 @@ Git index format - A newline (ASCII 10); and - - 160-bit object name for the object that would result from writing - this span of index as a tree. + - Object name for the object that would result from writing this span + of index as a tree. An entry can be in an invalidated state and is represented by having a negative number in the entry_count field. In this case, there is no @@ -198,7 +200,7 @@ Git index format stage 1 to 3 (a missing stage is represented by "0" in this field); and - - At most three 160-bit object names of the entry in stages from 1 to 3 + - At most three object names of the entry in stages from 1 to 3 (nothing is written for a missing stage). === Split index @@ -211,8 +213,8 @@ Git index format The extension consists of: - - 160-bit SHA-1 of the shared index file. The shared index file path - is $GIT_DIR/sharedindex.. If all 160 bits are zero, the + - Hash of the shared index file. The shared index file path + is $GIT_DIR/sharedindex.. If all bits are zero, the index does not require a shared index file. - An ewah-encoded delete bitmap, each bit represents an entry in the @@ -253,10 +255,10 @@ Git index format - 32-bit dir_flags (see struct dir_struct) - - 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file + - Hash of $GIT_DIR/info/exclude. A null hash means the file does not exist. - - 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does + - Hash of core.excludesfile. A null hash means the file does not exist. - NUL-terminated string of per-dir exclude file name. This usually @@ -285,13 +287,13 @@ The remaining data of each directory block is grouped by type: - An ewah bitmap, the n-th bit records "check-only" bit of read_directory_recursive() for the n-th directory. - - An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data + - An ewah bitmap, the n-th bit indicates whether hash and stat data is valid for the n-th directory and exists in the next data. - An array of stat data. The n-th data corresponds with the n-th "one" bit in the previous ewah bitmap. - - An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit + - An array of hashes. The n-th hash corresponds with the n-th "one" bit in the previous ewah bitmap. - One NUL. @@ -330,12 +332,12 @@ The remaining data of each directory block is grouped by type: - 32-bit offset to the end of the index entries - - 160-bit SHA-1 over the extension types and their sizes (but not + - Hash over the extension types and their sizes (but not their contents). E.g. if we have "TREE" extension that is N-bytes long, "REUC" extension that is M-bytes long, followed by "EOIE", then the hash would be: - SHA-1("TREE" + + + Hash("TREE" + + "REUC" + ) == Index Entry Offset Table From 0756e61078b32686dc995df768e50725aba13b7e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Martin=20=C3=85gren?= Date: Sat, 15 Aug 2020 18:06:01 +0200 Subject: [PATCH 3/4] protocol-capabilities.txt: clarify "allow-x-sha1-in-want" re SHA-256 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two of our capabilities contain "sha1" in their names, but that's historical. Clarify that object names are still to be given using whatever object format has been negotiated using the "object-format" capability. Signed-off-by: Martin Ågren Signed-off-by: Junio C Hamano --- Documentation/technical/protocol-capabilities.txt | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt index 36ccd14f97..124d716807 100644 --- a/Documentation/technical/protocol-capabilities.txt +++ b/Documentation/technical/protocol-capabilities.txt @@ -324,15 +324,19 @@ allow-tip-sha1-in-want ---------------------- If the upload-pack server advertises this capability, fetch-pack may -send "want" lines with SHA-1s that exist at the server but are not -advertised by upload-pack. +send "want" lines with object names that exist at the server but are not +advertised by upload-pack. For historical reasons, the name of this +capability contains "sha1". Object names are always given using the +object format negotiated through the 'object-format' capability. allow-reachable-sha1-in-want ---------------------------- If the upload-pack server advertises this capability, fetch-pack may -send "want" lines with SHA-1s that exist at the server but are not -advertised by upload-pack. +send "want" lines with object names that exist at the server but are not +advertised by upload-pack. For historical reasons, the name of this +capability contains "sha1". Object names are always given using the +object format negotiated through the 'object-format' capability. push-cert= ----------------- From 8afa50aabcf18ab1904218e8019d623d11e36ed9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Martin=20=C3=85gren?= Date: Sat, 15 Aug 2020 18:06:02 +0200 Subject: [PATCH 4/4] shallow.txt: document SHA-256 shallow format MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Similar to recent commits, document that we list object names rather than SHA-1s. Signed-off-by: Martin Ågren Signed-off-by: Junio C Hamano --- Documentation/technical/shallow.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/technical/shallow.txt b/Documentation/technical/shallow.txt index 01dedfe9ff..f3738baa0f 100644 --- a/Documentation/technical/shallow.txt +++ b/Documentation/technical/shallow.txt @@ -13,7 +13,7 @@ pretend as if they are root commits (e.g. "git log" traversal stops after showing them; "git fsck" does not complain saying the commits listed on their "parent" lines do not exist). -Each line contains exactly one SHA-1. When read, a commit_graft +Each line contains exactly one object name. When read, a commit_graft will be constructed, which has nr_parent < 0 to make it easier to discern from user provided grafts.