Merge branch 'es/config-hooks' into pu

The "hooks defined in config" topic.

Ready???

* es/config-hooks:
  hook: add --porcelain to list command
  hook: add list command
  hook: scaffolding for git-hook subcommand
  doc: propose hooks managed by the config
This commit is contained in:
Junio C Hamano
2020-06-19 14:52:45 -07:00
11 changed files with 640 additions and 0 deletions

1
.gitignore vendored
View File

@ -75,6 +75,7 @@
/git-grep
/git-hash-object
/git-help
/git-hook
/git-http-backend
/git-http-fetch
/git-http-push

View File

@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
TECH_DOCS += MyFirstContribution
TECH_DOCS += MyFirstObjectWalk
TECH_DOCS += SubmittingPatches
TECH_DOCS += technical/config-based-hooks
TECH_DOCS += technical/hash-function-transition
TECH_DOCS += technical/http-protocol
TECH_DOCS += technical/index-format

View File

@ -0,0 +1,63 @@
git-hook(1)
===========
NAME
----
git-hook - Manage configured hooks
SYNOPSIS
--------
[verse]
'git hook' list [--porcelain] <hook-name>
DESCRIPTION
-----------
You can list, add, and modify hooks with this command.
This command parses the default configuration files for sections "hook" and
"hookcmd". "hook" is used to describe the commands which will be run during a
particular hook event; commands are run in config order. "hookcmd" is used to
describe attributes of a specific command. If additional attributes don't need
to be specified, a command to run can be specified directly in the "hook"
section; if a "hookcmd" by that name isn't found, Git will attempt to run the
provided value directly. For example:
Global config
----
[hook "post-commit"]
command = "linter"
command = "~/typocheck.sh"
[hookcmd "linter"]
command = "/bin/linter --c"
----
Local config
----
[hook "prepare-commit-msg"]
command = "linter"
[hook "post-commit"]
command = "python ~/run-test-suite.py"
----
COMMANDS
--------
list [--porcelain] <hook-name>::
List the hooks which have been configured for <hook-name>. Hooks appear
in the order they should be run, and note the config scope where the relevant
`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+
If `--porcelain` is specified, instead print the commands alone, separated by
newlines, for easy parsing by a script.
OPTIONS
-------
--porcelain::
With `list`, print the commands in the order they should be run,
separated by newlines, for easy parsing by a script.
GIT
---
Part of the linkgit:git[1] suite

View File

@ -0,0 +1,320 @@
Configuration-based hook management
===================================
== Motivation
Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
the only source of hooks to execute, in a way which is friendly to users with
multiple repos which have similar needs.
Redefine "hook" as an event rather than a single script, allowing users to
perform unrelated actions on a single event.
Take a step closer to safety when copying zipped Git repositories from untrusted
users.
Make it easier for users to discover Git's hook feature and automate their
workflows.
== User interfaces
=== Config schema
Hooks can be introduced by editing the configuration manually. There are two new
sections added, `hook` and `hookcmd`.
==== `hook`
Primarily contains subsections for each hook event. These subsections define
hook command execution order; hook commands can be specified by passing the
command directly if no additional configuration is needed, or by passing the
name of a `hookcmd`. If Git does not find a `hookcmd` whose subsection matches
the value of the given command string, Git will try to execute the string
directly. Hooks are executed by passing the resolved command string to the
shell. Hook event subsections can also contain per-hook-event settings.
Also contains top-level hook execution settings, for example,
`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.
----
[hook "pre-commit"]
command = perl-linter
command = /usr/bin/git-secrets --pre-commit
[hook "pre-applypatch"]
command = perl-linter
error = ignore
[hook]
runHookDir = interactive
----
==== `hookcmd`
Defines a hook command and its attributes, which will be used when a hook event
occurs. Unqualified attributes are assumed to apply to this hook during all hook
events, but event-specific attributes can also be supplied. The example runs
`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
include this config, the hook command will be skipped for all events to which
it's normally subscribed _except_ `pre-commit`.
----
[hookcmd "perl-linter"]
command = /usr/bin/lint-it --language=perl
skip = true
pre-commit-skip = false
----
=== Command-line API
Users should be able to view, reorder, and create hook commands via the command
line. External tools should be able to view a list of hooks in the correct order
to run.
*`git hook list <hook-event>`*
*`git hook list (--system|--global|--local|--worktree)`*
*`git hook edit <hook-event>`*
*`git hook add <hook-command> <hook-event> <options...>`*
=== Hook editor
The tool which is presented by `git hook edit <hook-command>`. Ideally, this
tool should be easier to use than manually editing the config, and then produce
a concise config afterwards. It may take a form similar to `git rebase
--interactive`.
== Implementation
=== Library
`hook.c` and `hook.h` are responsible for interacting with the config files. In
the case when the code generating a hook event doesn't have special concerns
about how to run the hooks, the hook library will provide a basic API to call
all hooks in config order with an `argv_array` provided by the code which
generates the hook event:
*`int run_hooks(const char *hookname, struct argv_array *args)`*
This call includes the hook command provided by `run-command.h:find_hook()`;
eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
config is checked against a number of cases:
- "no": the legacy hook will not be run
- "interactive": Git will prompt the user before running the legacy hook
- "warn": Git will print a warning to stderr before running the legacy hook
- "yes" (default): Git will silently run the legacy hook
In case this list is expanded in the future, if a value for `hook.runHookDir` is
given which Git does not recognize, Git should discard that config entry. For
example, if "warn" was specified at system level and "junk" was specified at
global level, Git would resolve the value to "warn"; if the only time the config
was set was to "junk", Git would use the default value of "yes".
If the caller wants to do something more complicated, the hook library can also
provide a callback API:
*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
Finally, to facilitate the builtin, the library will also provide the following
APIs to interact with the config:
----
int set_hook_commands(const char *hookname, struct string_list *commands,
enum config_scope scope);
int set_hookcmd(const char *hookcmd, struct hookcmd options);
int list_hook_commands(const char *hookname, struct string_list *commands);
int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
----
`struct hookcmd` is expected to grow in size over time as more functionality is
added to hooks; so that other parts of the code don't need to understand the
config schema, `struct hookcmd` should contain logical values instead of string
pairs.
----
struct hookcmd {
const char *name;
const char *command;
/* for illustration only; not planned at present */
int parallelizable;
const char *hookcmd_before;
const char *hookcmd_after;
enum recovery_action on_fail;
}
----
=== Builtin
`builtin/hook.c` is responsible for providing the frontend. It's responsible for
formatting user-provided data and then calling the library API to set the
configs as appropriate. The builtin frontend is not responsible for calling the
config directly, so that other areas of Git can rely on the hook library to
understand the most recent config schema for hooks.
=== Migration path
==== Stage 0
Hooks are called by running `run-command.h:find_hook()` with the hookname and
executing the result. The hook library and builtin do not exist. Hooks only
exist as specially named scripts within `.git/hooks/`.
==== Stage 1
`git hook list --porcelain <hook-event>` is implemented. Users can replace their
`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
output. Modifier commands like `git hook add` and `git hook edit` can be
implemented around this time as well.
==== Stage 2
`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
opt-in to config-based hooks simply by creating some in their config; otherwise
users should remain unaffected by the change.
==== Stage 3
The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
`hook.runHookDir`. Users can opt into managing their hooks completely via the
config this way.
==== Stage 4
`.git/hooks` is removed from the template and the hook directory is considered
deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
not changed, and `find_hook()` is not removed.
== Caveats
=== Security and repo config
Part of the motivation behind this refactor is to mitigate hooks as an attack
vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
however, as the design stands, users can still provide hooks in the repo-level
config, which is included when a repo is zipped and sent elsewhere. The
security of the repo-level config is still under discussion; this design
generally assumes the repo-level config is secure, which is not true yet. The
goal is to avoid an overcomplicated design to work around a problem which has
ceased to exist.
=== Ease of use
The config schema is nontrivial; that's why it's important for the `git hook`
modifier commands to be usable. Contributors with UX expertise are encouraged to
share their suggestions.
== Alternative approaches
A previous summary of alternatives exists in the
archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
=== Status quo
Today users can implement multihooks themselves by using a "trampoline script"
as their hook, and pointing that script to a directory or list of other scripts
they wish to run.
=== Hook directories
Other contributors have suggested Git learn about the existence of a directory
such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
=== Comparison table
.Comparison of alternatives
|===
|Feature |Config-based hooks |Hook directories |Status quo
|Supports multiple hooks
|Natively
|Natively
|With user effort
|Safer for zipped repos
|A little
|No
|No
|Previous hooks just work
|If configured
|Yes
|Yes
|Can install one hook to many repos
|Yes
|No
|No
|Discoverability
|Better (in `git help git`)
|Same as before
|Same as before
|Hard to run unexpected hook
|If configured
|No
|No
|===
== Future work
=== Execution ordering
We may find that config order is insufficient for some users; for example,
config order makes it difficult to add a new hook to the system or global config
which runs at the end of the hook list. A new ordering schema should be:
1) Specified by a `hook.order` config, so that users will not unexpectedly see
their order change;
2) Either dependency or numerically based.
Dependency-based ordering is prone to classic linked-list problems, like a
cycles and handling of missing dependencies. But, it paves the way for enabling
parallelization if some tasks truly depend on others.
Numerical ordering makes it tricky for Git to generate suggested ordering
numbers for each command, but is easy to determine a definitive order.
=== Parallelization
Users with many hooks might want to run them simultaneously, if the hooks don't
modify state; if one hook depends on another's output, then users will want to
specify those dependencies. If we decide to solve this problem, we may want to
look to modern build systems for inspiration on how to manage dependencies and
parallel tasks.
=== Securing hookdir hooks
With the design as written in this doc, it's still possible for a malicious user
to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
zip their repo and send it to another user. It may be necessary to teach Git to
only allow one-line hooks like this if they were configured outside of the local
scope; or another approach, like a list of safe projects, might be useful. It
may also be sufficient (or at least useful) to teach a `hook.disableAll` config
or similar flag to the Git executable.
=== Submodule inheritance
It's possible some submodules may want to run the identical set of hooks that
their superrepo runs. While a globally-configured hook set is helpful, it's not
a great solution for users who have multiple repos-with-submodules under the
same user. It would be useful for submodules to learn how to run hooks from
their superrepo's config, or inherit that hook setting.
== Glossary
*hook event*
A point during Git's execution where user scripts may be run, for example,
_prepare-commit-msg_ or _pre-push_.
*hook command*
A user script or executable which will be run on one or more hook events.

View File

@ -892,6 +892,7 @@ LIB_OBJS += grep.o
LIB_OBJS += hashmap.o
LIB_OBJS += help.o
LIB_OBJS += hex.o
LIB_OBJS += hook.o
LIB_OBJS += ident.o
LIB_OBJS += interdiff.o
LIB_OBJS += json-writer.o
@ -1079,6 +1080,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
BUILTIN_OBJS += builtin/grep.o
BUILTIN_OBJS += builtin/hash-object.o
BUILTIN_OBJS += builtin/help.o
BUILTIN_OBJS += builtin/hook.o
BUILTIN_OBJS += builtin/index-pack.o
BUILTIN_OBJS += builtin/init-db.o
BUILTIN_OBJS += builtin/interpret-trailers.o

View File

@ -157,6 +157,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
int cmd_grep(int argc, const char **argv, const char *prefix);
int cmd_hash_object(int argc, const char **argv, const char *prefix);
int cmd_help(int argc, const char **argv, const char *prefix);
int cmd_hook(int argc, const char **argv, const char *prefix);
int cmd_index_pack(int argc, const char **argv, const char *prefix);
int cmd_init_db(int argc, const char **argv, const char *prefix);
int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);

77
builtin/hook.c Normal file
View File

@ -0,0 +1,77 @@
#include "cache.h"
#include "builtin.h"
#include "config.h"
#include "hook.h"
#include "parse-options.h"
#include "strbuf.h"
static const char * const builtin_hook_usage[] = {
N_("git hook list <hookname>"),
NULL
};
static int list(int argc, const char **argv, const char *prefix)
{
struct list_head *head, *pos;
struct hook *item;
struct strbuf hookname = STRBUF_INIT;
int porcelain = 0;
struct option list_options[] = {
OPT_BOOL(0, "porcelain", &porcelain,
"format for execution by a script"),
OPT_END(),
};
argc = parse_options(argc, argv, prefix, list_options,
builtin_hook_usage, 0);
if (argc < 1) {
usage_msg_opt("a hookname must be provided to operate on.",
builtin_hook_usage, list_options);
}
strbuf_addstr(&hookname, argv[0]);
head = hook_list(&hookname);
if (!head) {
printf(_("no commands configured for hook '%s'\n"),
hookname.buf);
return 0;
}
list_for_each(pos, head) {
item = list_entry(pos, struct hook, list);
if (item) {
if (porcelain)
printf("%s\n", item->command.buf);
else
printf("%s:\t%s\n",
config_scope_name(item->origin),
item->command.buf);
}
}
clear_hook_list();
strbuf_release(&hookname);
return 0;
}
int cmd_hook(int argc, const char **argv, const char *prefix)
{
struct option builtin_hook_options[] = {
OPT_END(),
};
if (argc < 2)
usage_with_options(builtin_hook_usage, builtin_hook_options);
if (!strcmp(argv[1], "list"))
return list(argc - 1, argv + 1, prefix);
usage_with_options(builtin_hook_usage, builtin_hook_options);
}

1
git.c
View File

@ -522,6 +522,7 @@ static struct cmd_struct commands[] = {
{ "grep", cmd_grep, RUN_SETUP_GENTLY },
{ "hash-object", cmd_hash_object },
{ "help", cmd_help },
{ "hook", cmd_hook, RUN_SETUP },
{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
{ "init", cmd_init_db },
{ "init-db", cmd_init_db },

90
hook.c Normal file
View File

@ -0,0 +1,90 @@
#include "cache.h"
#include "hook.h"
#include "config.h"
static LIST_HEAD(hook_head);
void free_hook(struct hook *ptr)
{
if (ptr) {
strbuf_release(&ptr->command);
free(ptr);
}
}
static void emplace_hook(struct list_head *pos, const char *command)
{
struct hook *to_add = malloc(sizeof(struct hook));
to_add->origin = current_config_scope();
strbuf_init(&to_add->command, 0);
strbuf_addstr(&to_add->command, command);
list_add_tail(&to_add->list, pos);
}
static void remove_hook(struct list_head *to_remove)
{
struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
list_del(to_remove);
free_hook(hook_to_remove);
}
void clear_hook_list(void)
{
struct list_head *pos, *tmp;
list_for_each_safe(pos, tmp, &hook_head)
remove_hook(pos);
}
static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
{
const char *hook_key = hook_key_cb;
if (!strcmp(key, hook_key)) {
const char *command = value;
struct strbuf hookcmd_name = STRBUF_INIT;
struct list_head *pos = NULL, *tmp = NULL;
/* Check if a hookcmd with that name exists. */
strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
git_config_get_value(hookcmd_name.buf, &command);
if (!command)
BUG("git_config_get_value overwrote a string it shouldn't have");
/*
* TODO: implement an option-getting callback, e.g.
* get configs by pattern hookcmd.$value.*
* for each key+value, do_callback(key, value, cb_data)
*/
list_for_each_safe(pos, tmp, &hook_head) {
struct hook *hook = list_entry(pos, struct hook, list);
/*
* The list of hooks to run can be reordered by being redeclared
* in the config. Options about hook ordering should be checked
* here.
*/
if (0 == strcmp(hook->command.buf, command))
remove_hook(pos);
}
emplace_hook(pos, command);
}
return 0;
}
struct list_head* hook_list(const struct strbuf* hookname)
{
struct strbuf hook_key = STRBUF_INIT;
if (!hookname)
return NULL;
strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
git_config(hook_config_lookup, (void*)hook_key.buf);
return &hook_head;
}

15
hook.h Normal file
View File

@ -0,0 +1,15 @@
#include "config.h"
#include "list.h"
#include "strbuf.h"
struct hook
{
struct list_head list;
enum config_scope origin;
struct strbuf command;
};
struct list_head* hook_list(const struct strbuf *hookname);
void free_hook(struct hook *ptr);
void clear_hook_list(void);

69
t/t1360-config-based-hooks.sh Executable file
View File

@ -0,0 +1,69 @@
#!/bin/bash
test_description='config-managed multihooks, including git-hook command'
. ./test-lib.sh
test_expect_success 'git hook rejects commands without a mode' '
test_must_fail git hook pre-commit
'
test_expect_success 'git hook rejects commands without a hookname' '
test_must_fail git hook list
'
test_expect_success 'setup hooks in global, and local' '
git config --add --local hook.pre-commit.command "/path/ghi" &&
git config --add --global hook.pre-commit.command "/path/def"
'
test_expect_success 'git hook list orders by config order' '
cat >expected <<-\EOF &&
global: /path/def
local: /path/ghi
EOF
git hook list pre-commit >actual &&
test_cmp expected actual
'
test_expect_success 'git hook list dereferences a hookcmd' '
git config --add --local hook.pre-commit.command "abc" &&
git config --add --global hookcmd.abc.command "/path/abc" &&
cat >expected <<-\EOF &&
global: /path/def
local: /path/ghi
local: /path/abc
EOF
git hook list pre-commit >actual &&
test_cmp expected actual
'
test_expect_success 'git hook list reorders on duplicate commands' '
git config --add --local hook.pre-commit.command "/path/def" &&
cat >expected <<-\EOF &&
local: /path/ghi
local: /path/abc
local: /path/def
EOF
git hook list pre-commit >actual &&
test_cmp expected actual
'
test_expect_success 'git hook list --porcelain prints just the command' '
cat >expected <<-\EOF &&
/path/ghi
/path/abc
/path/def
EOF
git hook list --porcelain pre-commit >actual &&
test_cmp expected actual
'
test_done