21fb173f76
server: Implement compaction hash checking
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
a56ec0be4b
tests: Cover periodic check in tests
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
4a75e3d52d
server: Refactor compaction checker
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
8d4ca10ece
tests: Move CorruptBBolt to testutil
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
a8020a0320
tests: Rename corruptHash to CorruptBBolt
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
037a898ba0
tests: Unify TestCompactionHash and extend it to also Delete keys and Defrag
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
00bc8da0ef
tests: Add tests for HashByRev HTTP API
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
d3db3bc454
tests: Add integration tests for compact hash
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
1200b1006d
server: Cache compaction hash for HashByRev API
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
7358362c99
server: Extract hasher to separate interface
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
631107285a
server: Remove duplicated compaction revision
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
a3f609d742
server: Return revision range that hash was calculated for
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
1ff59923d6
server: Store real rv range in hasher
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
991b429336
server: Move adjusting revision to hasher
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
2b8dd0de4e
server: Pass revision as int
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
21e5d5d2b6
server: Calculate hash during compaction
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
f1a759a2c8
server: Fix range in mock not returning same number of keys and values
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
ea684db535
server: Move reading KV index inside scheduleCompaction function
...
Makes it easier to test hash match between scheduleCompaction and
HashByRev.
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
22d3e4ebd7
server: Return error from scheduleCompaction
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
679e327d5e
server: Refactor hasher
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
f5ed371885
server: Extract kvHash struct
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
3f26995f99
server: Move unsafeHashByRev to new hash.go file
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
bc592c7b01
server: Extract unsafeHashByRev function
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
336fef4ce2
server: Test HashByRev values to make sure they don't change
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
78a6f387cb
server: Cover corruptionMonitor with tests
...
Get 100% coverage on InitialCheck and PeriodicCheck functions to avoid
any mistakes.
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
35cbdf3961
server: Extract corruption detection to dedicated struct
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
d32de2c410
server: Extract triggerCorruptAlarm to function
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
204c0319a1
Merge pull request #14429 from ahrtr/alarm_list_ci_3.5
...
[3.5] Move consistent_index forward when executing alarmList operation
2022-09-06 15:17:13 +08:00
5c8aa08e2c
move consistent_index forward when executing alarmList operation
...
Cherry pick https://github.com/etcd-io/etcd/pull/14419 to 3.5.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-06 12:48:06 +08:00
747bf5ceff
Merge pull request #14424 from serathius/one_member_data_loss_raft_3_5
...
[release-3.5] fix the potential data loss for clusters with only one member
2022-09-06 03:28:24 +08:00
7eb696dfcd
fix the potential data loss for clusters with only one member
...
For a cluster with only one member, raft always sends identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually only partially) the
apply workflow.
When the client receives the response, it doesn't mean etcd has already
successfully saved the data to both BoltDB and the WAL, because:
1. etcd commits the BoltDB transaction periodically instead of on each request;
2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, data may be lost if etcd crashes immediately after
responding to the client but before BoltDB and the WAL have
successfully saved the data to disk.
Note that this issue can only happen for clusters with only one member.
For clusters with multiple members this isn't an issue, because etcd will
not commit and apply the data before it has been replicated to a majority of members.
When the client receives the response, the data must have been applied,
which in turn means it must have been committed.
Note: for clusters with multiple members, raft never sends identical
unstable entries and committed entries to etcdserver.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-05 14:26:24 +02:00
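The single-member condition described in 7eb696dfcd can be illustrated with a small check for overlap between the unstable and committed entries of one batch. This is only a sketch under stated assumptions, not the code from the commit: the `entry` type and the `mustPersistFirst` helper are hypothetical names, and both slices are assumed to be ordered by increasing term and index.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a raft log entry.
type entry struct {
	Term  uint64
	Index uint64
}

// mustPersistFirst reports whether the last committed entry is also among the
// unstable entries (those not yet written to the WAL). When it is, a caller
// would need to finish the WAL save before applying and answering the client.
// Assumption: both slices are ordered by increasing term and index.
func mustPersistFirst(unstable, committed []entry) bool {
	if len(unstable) == 0 || len(committed) == 0 {
		return false
	}
	first := unstable[0]
	last := committed[len(committed)-1]
	return last.Term > first.Term ||
		(last.Term == first.Term && last.Index >= first.Index)
}

func main() {
	// Single member: the same entry is both unstable and committed.
	batch := []entry{{Term: 2, Index: 7}}
	fmt.Println(mustPersistFirst(batch, batch)) // true

	// Multiple members: committed entries precede the unstable ones.
	fmt.Println(mustPersistFirst([]entry{{Term: 2, Index: 8}}, []entry{{Term: 2, Index: 7}})) // false
}
```

When the check returns true, applying the entries before the WAL write completes is exactly the window in which a crash could lose acknowledged data.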
fbb14f91bf
Merge pull request #14397 from biosvs/backport-grpc-proxy-endpoints-autosync
...
Backport of pull/14354 to release-3.5
2022-09-01 10:14:22 +02:00
67e4c59e01
Backport of pull/14354 to 3.5.5
...
Signed-off-by: Vitalii Levitskii <vitalii@uber.com>
2022-08-29 15:58:17 +03:00
74aa38ec10
Merge pull request #14366 from ahrtr/keepalive_3.5_20220820
...
[3.5] Refactor the keepAliveListener and keepAliveConn
2022-08-24 10:14:26 +08:00
9ea5b1ba22
Refactor the keepAliveListener and keepAliveConn
...
Only `net.TCPConn` supports `SetKeepAlive` and `SetKeepAlivePeriod`
by default, so if you want to wrap multiple layers of net.Listener,
the `keepaliveListener` should be the one which is closest to the
original `net.Listener` implementation, namely `TCPListener`.
Also refer to https://github.com/etcd-io/etcd/pull/14356
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-20 15:03:15 +08:00
6bab3677eb
Merge pull request #14361 from amdprophet/3.5-close-keepalive-stream
...
[3.5] clientv3: close streams after use in lessor keepAliveOnce method
2022-08-20 05:33:40 +08:00
eab0b999a8
clientv3: close streams after use in lessor keepAliveOnce method
...
Streams are now closed after being used in the lessor `keepAliveOnce` method.
This prevents the "failed to receive lease keepalive request from gRPC stream"
message from being logged by the server after the context is cancelled by the
client.
Signed-off-by: Justin Kolberg <amd.prophet@gmail.com>
2022-08-18 09:54:12 -07:00
9e95685d0a
Merge pull request #14312 from ahrtr/3.5_bump_otl
...
[3.5] etcdserver: bump OpenTelemetry to 1.0.1 and gRPC to 1.41.0
2022-08-09 04:03:21 +08:00
8fdca41cd8
Change default sampling rate from 100% to 0%
...
Refer to https://github.com/etcd-io/etcd/pull/14318
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:19:30 +08:00
8c5f110b59
Fix the failure in TestEndpointSwitchResolvesViolation
...
Refer to a0bdfc4fc9
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:17:27 +08:00
2751c61f24
update all related dependencies
...
Upgrade grpc to 1.41.0;
Run ./script/fix.sh to fix all related issues.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:17:27 +08:00
5a86ae2c33
move setupTracing into a separate file config_tracing.go
...
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:17:27 +08:00
2d7e49002c
etcdserver: bump OpenTelemetry to 1.0.1
...
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:16:08 +08:00
6145831683
Merge pull request #14318 from damemi/3.5-tracing-sample
...
Change default sampling rate from 100% to 0%
2022-08-07 07:14:35 +08:00
4c013c91e9
Change default sampling rate from 100% to 0%
...
This changes the default parent-based trace sampling rate from
100% to 0%. Due to the high QPS etcd can handle, having 100% trace
sampling leads to very high resource usage. Defaulting to 0% means
that only already-sampled traces will be sampled in etcd.
Fixes #14310
Signed-off-by: Mike Dame <mikedame@google.com>
2022-08-05 15:00:40 +00:00
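The effect of a 0% parent-based rate in 4c013c91e9 can be shown with a toy decision function. It only models the behaviour described in the commit message and is not the OpenTelemetry sampler; `shouldSample` and its parameters are hypothetical names, and `draw` stands in for a uniform random value in [0, 1).

```go
package main

import "fmt"

// shouldSample models a parent-based sampler with a ratio fallback:
// a span with a parent follows the parent's decision, and a root span is
// sampled only when draw falls below ratio.
func shouldSample(hasParent, parentSampled bool, ratio, draw float64) bool {
	if hasParent {
		return parentSampled
	}
	return draw < ratio
}

func main() {
	// With ratio 0, etcd never starts sampling a trace on its own.
	fmt.Println(shouldSample(false, false, 0, 0.0)) // false

	// A trace already sampled upstream is still sampled.
	fmt.Println(shouldSample(true, true, 0, 0.9)) // true

	// The previous default of 100% sampled every root span.
	fmt.Println(shouldSample(false, false, 1, 0.9)) // true
}
```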
9d7e10863e
Merge pull request #14227 from mitake/perm-cache-lock-3.5
...
server/auth: protect rangePermCache with a RW lock
2022-07-20 10:36:00 +02:00
e15c005fef
server/auth: protect rangePermCache with a RW lock
...
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-07-19 15:56:12 +09:00
3237289fff
Merge pull request #14222 from Jille/backport-14203
...
[3.5] clientv3: Fix parsing of ETCD_CLIENT_DEBUG
2022-07-15 08:27:07 +08:00
cbedaf90fe
Improve error message for incorrect values of ETCD_CLIENT_DEBUG
...
Signed-off-by: Jille Timmermans <jille@quis.cx>
2022-07-14 09:43:54 +02:00
fb71790611
Merge pull request #14219 from ahrtr/3.5_backport_maxstream
...
[3.5] Support configuring `MaxConcurrentStreams` for http2
2022-07-13 16:57:48 +08:00