Commit Graph

511 Commits

Author SHA1 Message Date
40f22e9319 Merge pull request #17176 from YaoC/fix-learner-metric
server: fix learner metric incorrect issue
2024-01-12 17:37:46 +01:00
f7ab7adf29 server: fix learner metric incorrect issue
Signed-off-by: YaoC <chengyao09@hotmail.com>
2024-01-12 09:36:33 +00:00
28f4c6fef6 dependency: bump github.com/grpc-ecosystem/grpc-gateway/v2 from 2.18.1 to 2.19.0
Signed-off-by: Sharath Sivakumar <mailssr9@gmail.com>
2024-01-09 16:26:47 +01:00
a2eb17c809 Merge pull request #17199 from serathius/dont-flock
Don't flock snapshot files
2024-01-08 15:03:29 +01:00
3471ef133d Add an e2e test and robustness failpoint around recovering from snapshot backend
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-04 15:25:24 +01:00
7f8346b3f2 Don't flock snapshot files
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-04 14:53:44 +01:00
d39d86a214 Improve logs around recovering from snapshot backend
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-20 16:26:27 +01:00
1e8d66ef95 Add beforeOpenSnapshotBackend failpoint
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-20 15:36:54 +01:00
67f17166bf Safeguard lease operations by double checking the leadership
1. ignore old leader's leases revoking request
2. double check current member's leadership before perform lease renew request
3. etcdserve: ensure current member's leadership before performing lease checkpoint request

Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-15 17:53:36 +00:00
36b2523669 added some log messages for better diagnosis
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-13 18:43:22 +00:00
9f82390ae9 server: refine TestProcessIgnoreMismatchMessage
Signed-off-by: Neil Shen <overvenus@gmail.com>
2023-12-07 20:32:43 +08:00
fb769c4306 server: ignore raft messages if member id mismatch
Ignore Raft messages when the `To` field mismatches the local member ID.
In cases where incorrect Raft messages are dispatched, potentially due
to a malfunctioning switch, this proactive check prevents panics,
such as "tocommit is out of range".

Signed-off-by: Neil Shen <overvenus@gmail.com>
2023-12-07 11:57:45 +08:00
8578e07117 server: disable redirects in peer communication
Disable following redirects from peer HTTP communication on the client's side.
Etcd server may run into SSRF (Server-side request forgery) when adding a new
member. If users provide a malicious peer URL, the existing etcd members may be
redirected to another unexpected internal URL when getting the new member's
version.

Signed-off-by: Ivan Valdes <ivan@vald.es>
2023-12-04 13:53:28 -08:00
bc697bc26e Revert "Switch to validating v3 when v2 and v3 are synchronized"
This reverts commit 4fe46f9203.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-03 18:12:09 +01:00
03d551243b Merge pull request #17015 from serathius/extract-membership-applier
Extract membership applier
2023-11-27 19:59:21 +01:00
62b772c321 Merge pull request #17021 from serathius/test-applyconfstate
Test ApplyConfState after restart
2023-11-27 13:19:50 +01:00
fbdf65f101 Test v3 storage configuration validation
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-27 12:00:47 +01:00
e1d79097d5 Merge pull request #17017 from serathius/switch-validation-v3
Switch to validating v3 when v2 and v3 are synchronized
2023-11-27 10:50:14 +00:00
e192a05193 Test ApplyConfState after restart
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-26 17:40:28 +01:00
a97052acf4 remove unused method and functions
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-11-25 16:16:52 +00:00
4fe46f9203 Switch to validating v3 when v2 and v3 are synchronized
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 17:46:33 +01:00
2ad21558ac Remove shouldApplyV3 from the v3 applier
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 16:13:25 +01:00
d22c00ccee Extract membership applier
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 15:57:15 +01:00
7fdb33065d Move duplicated shouldApplyV3 logic up into apply method
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 10:21:14 +01:00
093666f450 Cleanup v2 applier
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-23 15:41:13 +01:00
c72ff1e69c Remove syncing the v2 store TTLs
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-23 14:55:01 +01:00
ed3375e076 Remove v2 apply logic
v2 store is no longer available in v3.6.
We can remove apply logic for it as they will never be used.

Only v2 PUT is neeeded as it applies to v3 storage and etcd v3.5 uses it for setting member
attributes and cluster version.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-23 14:13:07 +01:00
28d9564962 Merge pull request #16994 from serathius/tests-v2-cluster-version
Add tests for setting cluster version using v2 request
2023-11-22 21:11:18 +01:00
2f30760b37 Add tests for setting cluster version using v2 request
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-22 18:05:19 +01:00
29dd025b84 Stop using v2 requests in server tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-22 14:41:53 +01:00
fd0882b67e Merge pull request #16984 from siyuanfoundation/lin-read
etcdserver: add linearizable_read check to readyz.
2023-11-21 19:51:12 +00:00
12b640523a etcdserver: add linearizable_read check to readyz.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-21 11:13:20 -08:00
ec6147cd04 Merge pull request #16967 from serathius/remove-v2-proposals
Remove v2 proposals code
2023-11-21 15:35:51 +00:00
3b37afec7b Don't follow redirects when checking peer urls.
It's possible that etcd server may run into SSRF situation when adding a new member. If users provide a malicious peer URL, the existing etcd members may be redirected to other unexpected internal URL when getting the new member's version.

Signed-off-by: James Blair <mail@jamesblair.net>
2023-11-21 10:25:20 +13:00
dd7a4d28a8 Remove code used to make v2 proposals
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-19 22:39:33 +01:00
b4fd31f254 Remove code for setting cluster version via V2 API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-19 15:28:52 +01:00
3897103b77 etcdserver: add metric counters for livez/readyz health checks.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-14 09:26:00 -08:00
2b7e1c6f82 fix scripts
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
2023-11-08 00:27:13 -06:00
8a6c1335e2 v3rpc: run health notifier to listen on online defrag state change
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-10-28 17:49:24 -07:00
ea035471ce online defrag notifies gRPC health server to expose NOT_SERVING status
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-10-25 08:58:33 -07:00
34382006db test: implement method ForgetLeader for struct nodeRecorder
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-10-24 10:19:13 +01:00
1324f03254 add existing http health check handler e2e test
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-10-18 12:42:23 -07:00
aea1cd0077 feat: enable unparam lint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-10-17 21:24:13 +08:00
7a57e06eca etcdserver: add livez and ready http endpoints for etcd.
Add two separate probes, one for liveness and one for readiness. The liveness probe would check that the local individual node is up and running, or else restart the node, while the readiness probe would check that the cluster is ready to serve traffic. This would make etcd health-check fully Kubernetes API complient.

Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-10-14 22:32:16 -07:00
867faa1924 etcdserver: remove redundant len check in health check
From the Go specification [1]:

  "1. For a nil slice, the number of iterations is 0."

`len` returns 0 if the slice or map is nil [2]. Therefore, checking
`len(v) > 0` around a loop is unnecessary.

[1]: https://go.dev/ref/spec#For_range
[2]: https://pkg.go.dev/builtin#len

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-10-13 18:39:38 +08:00
c25f1dff82 http health check bug fixes
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-10-12 16:59:34 -07:00
70a3205506 fix broken unit test in server_test.go
Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
2023-09-28 20:07:06 +01:00
9c9804399e do not update RaftCluster.members and RaftCluster.removed if the v3store is ahead of the current replayed WAL entry index
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-09-28 20:06:12 +01:00
628b45c099 test: add a test case to verify consistent memberlist on bootstrap
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-09-28 20:04:47 +01:00
4704a5af3a *: fix unused issue
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-25 19:37:18 +08:00