Commit Graph

13 Commits

Author SHA1 Message Date
4af159a30a Merge pull request #12259 from alvistack/master-aio_graceful_reboot
`etcd.service`: Define explicit dependencies of systemd etcd service
2021-01-31 23:58:12 +01:00
b0e2c70c71 contrib/systemd: add a sysusers entry
This adds a sysusers.d file, in order to create a system user/group
which matches the one used by the service unit.

Ref: https://www.freedesktop.org/software/systemd/man/sysusers.d.html
2020-12-09 13:59:46 +00:00
17ceed9b47 etcd.service: Support Graceful Reboot for AIO Node
Currently our sample systemd service file `contrib/systemd/etcd.service`
have startup/shutdown dependency as below:

    [Unit]
    After=network.target

For some rare condition, e.g. bare matel deployment with slow network
startup, IP could not be assigned e arly enough before etcd default
`ETCD_HEARTBEAT_INTERVAL="100"` and `ETCD_ELECTION_TIMEOUT="1000"` get
timeouted, after graceful system reboot.

This cause etcd false negative classify itself use unhealthy, therefore
stop rejoining the remaining online cluster members.

This PR introduce:

  - `etcd.service`: Ensure startup after `network-online.target` and
    `time-sync.target`, so effective network connectivity and synced
    time is available.

The logic is concept proof by
<https://github.com/alvistack/ansible-role-etcd/tree/develop>; also
works as expected with Ceph + Kubernetes deployment by
<https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>.
No more deadlock happened during graceful system reboot, both AIO
single/multiple node with loopback mount.

Also see:

  - <https://github.com/ceph/ceph/pull/36776>
  - <https://github.com/etcd-io/etcd/pull/12259>
  - <https://github.com/cri-o/cri-o/pull/4128>
  - <https://github.com/kubernetes/release/pull/1504>

Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
2020-09-17 16:59:12 +08:00
bcde798fdd *: path changes for moving to github/etcd-io/etcd 2018-09-03 21:57:23 +05:30
c0f6597c02 contrib/systemd: remove v2 service files
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 18:35:01 -07:00
ca47aab373 *: fix typos in markdown docs 2018-03-07 10:32:07 -05:00
c8a2c7f64f *: eschew you from documentation
Removed line wrapping in affected files as well.
2017-03-06 11:40:46 -08:00
19d30fd4a7 contrib: add etcd cluster deploy on systemd docs
Fix https://github.com/coreos/etcd/issues/5971
2017-01-26 16:56:55 -08:00
17377f5642 example .service file: Order after network.target
From (systemd NetworkTarget description)[https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/]:
```
[...]since the shutdown ordering of units in systemd is the reverse of the startup ordering, any unit that is order After=network.target can be sure that it is stopped before the network is shut down if the system is powered off. This allows services to cleanly terminate connections before going down, instead of abruptly losing connectivity for ongoing connections, leaving them in an undefined state.[...]
```
2016-09-08 23:11:01 +02:00
ef44f71da9 *: update LICENSE header 2016-05-12 20:51:48 -07:00
fb85da92e8 *: fix based on gosimple and unused 2016-04-07 23:16:37 -07:00
85cd4d9647 contrib/systemd: etcd2-backup package and docs
multi-node backup and restore procedures for etcd2 clusters, presented as systemd jobs.
2015-12-22 14:52:10 -08:00
652d3f1974 contrib: add example systemd unit file 2015-11-06 17:50:19 +01:00