fe35b5130e
Fix code scanning alert: This log write receives unsanitized user input
2022-04-19 13:49:08 +02:00
3152dc8174
contrib/raftexample: Save snapshot and WAL before hard state
...
Update raftexample to save the snapshot file and WAL snapshot entry
before hardstate to ensure the snapshot exists during recovery.
Otherwise, if there is a failure after storing the hard state, there may
be a reference to a non-existent snapshot.
This PR introduces the fix from #10219 to the raftexample.
2022-04-11 23:44:54 +00:00
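The ordering described in the commit above can be sketched in Go as follows. This is a minimal illustration only, not the raftexample code: the `recorder` type and its method names are hypothetical stand-ins for the snapshotter and WAL, and they merely record the order in which the persistence steps run.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for the snapshotter and the WAL.
// It only records the order in which persistence steps happen.
type recorder struct{ ops []string }

func (r *recorder) saveSnapshotFile() { r.ops = append(r.ops, "snapshot file") }
func (r *recorder) saveWALSnapshot()  { r.ops = append(r.ops, "WAL snapshot entry") }
func (r *recorder) saveHardState()    { r.ops = append(r.ops, "hard state") }

// persist writes the snapshot file and the WAL snapshot entry first, and
// only then the hard state, so a crash between steps never leaves a hard
// state that refers to a snapshot that was not written.
func persist(r *recorder, hasSnapshot bool) {
	if hasSnapshot {
		r.saveSnapshotFile()
		r.saveWALSnapshot()
	}
	r.saveHardState()
}

func main() {
	r := &recorder{}
	persist(r, true)
	fmt.Println(r.ops)
}
```

If the process stops after any step in this order, recovery finds either no reference to the snapshot or a snapshot that already exists on disk.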
5cf6ba48de
Add a unit test for the processMessages method
2022-03-08 09:38:23 +08:00
793218ed2b
update the confstate before sending snapshot
...
When there is a `raftpb.EntryConfChange` after the snapshot is created,
the confState included in the snapshot is out of date, so we need
to update the confState before sending a snapshot to a follower.
2022-03-07 12:18:29 +08:00
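A rough Go sketch of the idea in the commit above: before outgoing messages are sent, any snapshot message gets the node's current confState. The `confState`, `snapshot`, and `message` types and the `refreshConfState` function are hypothetical simplifications written for this illustration; they are not the `raftpb` types or the raftexample implementation.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the raft message types.
type confState struct{ Voters []uint64 }

type snapshot struct {
	Index     uint64
	ConfState confState
}

type message struct {
	IsSnapshot bool
	Snapshot   snapshot
}

// refreshConfState overwrites the confState carried by every outgoing
// snapshot message with the current one, so a conf change applied after
// the snapshot was created is not lost on the follower.
func refreshConfState(msgs []message, current confState) []message {
	for i := range msgs {
		if msgs[i].IsSnapshot {
			msgs[i].Snapshot.ConfState = current
		}
	}
	return msgs
}

func main() {
	stale := confState{Voters: []uint64{1, 2}}
	current := confState{Voters: []uint64{1, 2, 3}}
	msgs := []message{
		{IsSnapshot: true, Snapshot: snapshot{Index: 10, ConfState: stale}},
		{IsSnapshot: false},
	}
	msgs = refreshConfState(msgs, current)
	fmt.Println(msgs[0].Snapshot.ConfState.Voters)
}
```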
72c33d8b05
contrib/mixin: Generate rules, fix tests
...
* Add Makefile
* Make tests runnable
* Add generated rule manifest file
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-02-10 16:17:03 +01:00
7460379bad
contrib/mixin: add missing summary to alerts
...
To avoid alert messages being templated with undefined values, let's
set a summary for alerts that are currently missing one.
2022-01-19 19:55:40 +01:00
2a151c8982
*: move from io/ioutil to io and os packages
...
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-28 00:05:28 +08:00
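For readers unfamiliar with the io/ioutil migration above, the following Go sketch shows the kind of replacement involved, using the standard-library functions introduced in Go 1.16. The helper names `roundTrip` and `readAll` and the file name are made up for this example and do not come from the commit.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes data to a file in a fresh temporary directory and reads
// it back, using the os replacements for the deprecated ioutil functions.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-migration") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "example.txt")
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}
	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// readAll drains a reader with io.ReadAll, the replacement for ioutil.ReadAll.
func readAll(s string) (string, error) {
	b, err := io.ReadAll(strings.NewReader(s))
	return string(b), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
	all, err := readAll("world")
	fmt.Println(all, err)
}
```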
5991da1534
Merge pull request #13388 from grafana/mixin-rate-interval
...
contrib/mixin: Update dashboard promql to use $__rate_interval.
2021-10-21 08:14:25 -04:00
fead3be933
Grafana datasource template should be labelled 'Data Source'.
...
Signed-off-by: Tom Wilkie <tom@grafana.com>
2021-10-20 13:42:43 +01:00
aef9131c81
contrib/mixin/mixin.libsonnet: Include gRPC method in alert description
...
This makes it easier for an admin to determine the cause of the alert.
2021-10-15 15:10:52 +02:00
0eb72bde2c
contrib/mixin: omit Defragment method from etcdGRPCRequestsSlow
...
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-10-08 16:21:46 -04:00
98427d2bed
contrib/mixin: Update dashboard queries to use $__rate_interval
...
A global query variable was introduced in Grafana 7.2 which is "almost always right" for `rate`, `irate`, and `increase` function calls in promql.
2021-10-04 15:02:11 -07:00
b448daa698
Merge pull request #13275 from lilic/add-peer-dashboard
...
contrib/mixin/mixin.libsonnet: Add dashboard for peer round trip time
2021-08-05 08:27:38 -04:00
55b697c528
contrib/mixin/mixin.libsonnet: Add dashboard for peer round trip time
...
This helps users debug firing alerts.
2021-08-05 13:15:34 +02:00
44b8ae145b
etcdserver: Move datadir and wal to storage package
2021-08-03 12:47:37 +02:00
7885f2a951
Mixin: Support configuring cluster label
2021-07-29 17:54:14 +02:00
85f7b3c406
contrib/mixin/mixin.libsonnet: Unify alerting description
2021-07-16 15:25:53 +02:00
36bb8d293c
Use method const in package http instead of literal
2021-07-08 20:00:03 +08:00
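As an illustration of the change named in the commit above, this Go sketch builds a request with the `http.MethodPut` constant from `net/http` rather than the string literal `"PUT"`. The `newPutRequest` helper and the URL are hypothetical and chosen only for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newPutRequest builds a PUT request using the http.MethodPut constant
// instead of the literal "PUT", so a typo in the method name becomes a
// compile error rather than a runtime surprise.
func newPutRequest(url, value string) (*http.Request, error) {
	return http.NewRequest(http.MethodPut, url, strings.NewReader(value))
}

func main() {
	// The URL below is a placeholder for illustration only.
	req, err := newPutRequest("http://127.0.0.1:12380/my-key", "hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method)
}
```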
f00231951d
contrib/mixin/mixin.libsonnet: Adjust gRPC failed requests
...
OK is not the only code that is allowed; previously this also captured
context canceled, NotFound, and other non-error requests.
2021-06-21 11:47:53 +02:00
ffea1537d4
ClientV3 tests use integration.NewClient that configures proper logger.
2021-04-29 18:18:34 +02:00
562d645ac9
Fix the mixin.
...
Signed-off-by: Tom Wilkie <tom@grafana.com>
2021-04-13 19:38:55 +01:00
bad0b4d513
Merge pull request #12823 from mtulio/chore/dash-var-refresh
...
chore/dash-var-refresh: change default refresh to 2(time range)
2021-04-08 15:14:53 +02:00
aeeecc06cf
fix/dash-var-refresh: add const and description
2021-04-08 10:12:41 -03:00
816d332d81
Merge pull request #12830 from ptabor/20210405-split-pkg
...
Split client/pkg as dedicated low-dependencies module for client
2021-04-08 00:48:41 +02:00
3bb7acc8cf
Migrate dependencies pkg/foo -> client/pkg/foo
2021-04-07 00:38:47 +02:00
2ba69de281
Contrib lock example
2021-04-06 15:21:01 -04:00
d2bc5343fb
chore/dash-var-refresh: change default refresh to 2(time range)
2021-04-01 00:06:57 -03:00
fce0c192eb
Regenerate protos.
2021-03-25 00:31:44 +01:00
3976d68ed3
raftExample: Allow closing raftexample node when snapshotting.
...
Fix race that made the raftExample test fail.
2021-02-26 08:56:12 +01:00
5ae3f879c9
raftexample: Return an appropriate applyDoneC
2021-02-24 21:28:18 +09:00
cb0d256a18
raftexample: Add test for adding new node to existing cluster
2021-02-22 13:44:33 +09:00
1b1be43d65
raftexample: Newly joined nodes have to start with RestartNode
2021-02-22 09:45:44 +09:00
cc2b039817
raftexample: Explicitly notify all committed entries are applied
2021-02-19 19:26:36 +09:00
2d25f7f3da
raftexample: Implement ReportUnreachable and ReportSnapshot
2021-02-17 11:59:32 +09:00
1302e1edb2
raftexample: Save snapshot file before writing to wal
2021-02-16 13:30:15 +09:00
1395a1a795
Migrate back mixin to contrib/
...
The mixin was moved out together with documentation.
This broke kube-prometheus: https://github.com/etcd-io/etcd/issues/12685#issuecomment-777264143
2021-02-11 09:28:30 +01:00
e8ba375032
Merge pull request #11889 from mrkm4ntr/example-recover-from-snap
...
raftexample: Fix recovery from snapshot
2021-02-10 12:18:30 +01:00
be2167ebab
Wait until all committed entries are applied
...
To take a snapshot
2021-02-05 19:05:41 +09:00
cb14cdd774
raftexample: Fix recovery from snapshot
...
* If there is a snapshot, HTTP server won't start.
* Restoring from snapshot occurs after replaying WAL.
* When taking a snapshot, the last change is not applied to the state machine yet.
2021-02-05 09:34:34 +09:00
b5d11723d1
Merge pull request #12393 from viviyww/contrib-doc
...
contrib: del systemd/etcd2-backup-coreos in docs
2021-02-01 15:33:44 +01:00
4af159a30a
Merge pull request #12259 from alvistack/master-aio_graceful_reboot
...
`etcd.service`: Define explicit dependencies of systemd etcd service
2021-01-31 23:58:12 +01:00
b0e2c70c71
contrib/systemd: add a sysusers entry
...
This adds a sysusers.d file, in order to create a system user/group
which matches the one used by the service unit.
Ref: https://www.freedesktop.org/software/systemd/man/sysusers.d.html
2020-12-09 13:59:46 +00:00
aaf423e962
server: Update imports.
...
find -name '*.go' | xargs sed -i --follow-symlinks 's|etcd/v3/|etcd/server/v3/|g'
2020-10-26 13:02:32 +01:00
45b007b8b4
contrib,clientv3: Move contrib/recipes to clientv3/experimental/recipes/...
...
Recipes is a set of pattern / primitive implementations on top of clientv3.
It's used by integration tests. It shouldn't be considered "server" code.
2020-10-22 11:10:07 +02:00
e33c6dd9df
client/v3: Rename of imports
2020-10-20 10:13:06 +02:00
e62417297d
*: Rename of imports of raft (as it's now a module)
...
% find -name '*.go' -o -name '*.md' -o -name '*.sh' | xargs sed -i --follow-symlinks 's|etcd/v3/raft|etcd/raft/v3|g'
2020-10-16 13:58:18 +02:00
5d3609e3cf
contrib: del systemd/etcd2-backup-coreos in docs
...
Delete systemd/etcd2-backup-coreos in docs
2020-10-14 09:49:56 +08:00
de55bb6331
pkg: Rename imports after making 'pkg' a module
...
find -name '*.go' | xargs sed --follow-symlinks -i 's|go.etcd.io/etcd/v3/pkg/|go.etcd.io/etcd/pkg/v3/|g'
go fmt ./...
2020-10-13 00:09:27 +02:00
28f2b07623
*: Update references to code moved to the api/ dir.
...
Follow up to file-moves done in the previous commit.
The commit contains purely mechanical consequences of executing the
following commands (apart from scripts/genproto.sh):
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/etcdserver/api/v3rpc/rpctypes|v3/api/v3rpc/rpctypes|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/version|v3/api/version|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/mvcc/mvccpb|v3/api/mvccpb|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/etcdserver/etcdserverpb|v3/api/etcdserverpb|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/etcdserver/api/membership/membershippb|v3/api/membershippb|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/auth/authpb|v3/api/authpb|g'
% find ./ -name '*.proto' -o -name '*.md' | xargs -L 1 sed --follow-symlinks -i 's|/mvcc/mvccpb/kv.proto|/api/mvccpb/kv.proto|g'
% find ./ -name '*.proto' -o -name '*.md' | xargs -L 1 sed --follow-symlinks -i 's|/auth/authpb/auth.proto|/api/authpb/auth.proto|g'
% find ./ -name '*.proto' -o -name '*.md' | xargs -L 1 sed --follow-symlinks -i 's|/etcdserver/api/membership/membershippb/membership.proto|/api/membershippb/membership.proto|g'
I also manually modified paths in scripts/genproto.sh.
% go fmt ./...
2020-10-06 11:56:16 +02:00
17ceed9b47
`etcd.service`: Support Graceful Reboot for AIO Node
...
Currently our sample systemd service file `contrib/systemd/etcd.service`
has the following startup/shutdown dependency:
[Unit]
After=network.target
In some rare conditions, e.g. a bare metal deployment with slow network
startup, an IP address may not be assigned early enough, so etcd's default
`ETCD_HEARTBEAT_INTERVAL="100"` and `ETCD_ELECTION_TIMEOUT="1000"` time
out after a graceful system reboot.
This causes etcd to falsely classify itself as unhealthy and therefore
stop rejoining the remaining online cluster members.
This PR introduces:
- `etcd.service`: Ensure startup after `network-online.target` and
`time-sync.target`, so that effective network connectivity and synced
time are available.
The logic was proven as a concept in
<https://github.com/alvistack/ansible-role-etcd/tree/develop>; it also
works as expected with a Ceph + Kubernetes deployment in
<https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>.
No more deadlocks occurred during graceful system reboot, for both AIO
single-node and multi-node deployments with loopback mounts.
Also see:
- <https://github.com/ceph/ceph/pull/36776>
- <https://github.com/etcd-io/etcd/pull/12259>
- <https://github.com/cri-o/cri-o/pull/4128>
- <https://github.com/kubernetes/release/pull/1504>
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
2020-09-17 16:59:12 +08:00