Commit Graph

1580 Commits

Author SHA1 Message Date
722ec487df server: Split metrics and health code 2022-05-05 09:52:14 +02:00
600ee13ac0 server: Cover V3 health with tests 2022-05-05 09:52:14 +02:00
0096d2ecdb server: Remove unused NewClientHandler 2022-05-05 09:52:13 +02:00
755e2e7755 fix data race in testWatchOverlapContextCancel #14003 2022-05-01 21:12:38 +08:00
00d6d40aea Merge pull request #13940 from lavacat/leases-list-test
tests/common/lease: don't use revision to wait for leases
2022-04-29 10:12:18 +02:00
52ea811af8 Merge pull request #13966 from VladSaioc/blocking-readyc
Fixed potential goroutine leak due to p.Ready() receive in pkg/proxy and dependents.
2022-04-28 16:12:02 +02:00
7ebebb4ff6 Merge pull request #13969 from cmurphy/update-crypto
*: Update golang.org/x/crypto to latest
2022-04-28 12:35:14 +02:00
24002fb099 allocate unique port for each member in test cases 2022-04-26 12:32:00 +08:00
67f3bb3ad4 tests/common/lease: don't use revision to wait for leases
Problem: TestLeaseGrantAndList is flaky because lists won't match at the end.
Test uses revision to verify that all members got leases. But checking for revision isn't enough.

Solution: use size of the list to stop polling.
2022-04-25 10:24:32 -07:00
27bd78f6ab Update golang.org/x/crypto to latest
Update crypto to address CVE-2022-27191.

The CVE fix is added in 0.0.0-20220315160706-3147a52a75dd but this
change updates to latest.
2022-04-25 09:52:12 -07:00
887f95d0d3 Merge pull request #13963 from ptabor/20220412-verify-assert
Add verification consistent index is (nearly) never decreasing
2022-04-25 10:13:44 +02:00
d6368a20cf etcdctl: Remove V2 API commands 2022-04-24 12:08:42 +02:00
d69e07dd3a Verification framework and check whether cindex is not decreasing. 2022-04-22 12:32:05 +02:00
00ca558167 Fixed goroutine leak on NewServer 2022-04-22 08:57:19 +02:00
6eef7ede40 Update conssitent_index when applying fails
When clients have no permission to perform whatever operation, then
the applying may fail. We should also move consistent_index forward
in this case, otherwise the consitent_index may smaller than the
snapshot index.
2022-04-20 21:44:48 +08:00
842fed52c7 use lineariziable range request in TestKVGet 2022-04-18 08:56:19 +08:00
0dae4b3b1e rollback the opentelemetry bumpping to recover the pipeline failures 2022-04-14 16:13:28 +08:00
eab1e0c5d5 go.mod: upgrade opentelemetry deps
Downstream users of etcd experience build issues when using dependencies
which require more recent (incompatible) versions of opentelemetry. This
commit upgrades the dependencies so that downstream users stop
experiencing these issues.
2022-04-13 07:14:10 -07:00
6cb7372e6d clientv3: disable mirror auth test with proxy 2022-04-12 13:42:13 +00:00
e324cc1cbe cv3/mirror: Fetch the most recent prefix revision
When a user sets up a Mirror with a restricted user that doesn't have
access to the `foo` path, we will fail to get the most recent revision
due to permissions issues.

With this change, when a prefix is provided we will get the initial
revision from the prefix rather than /foo. This allows restricted users
to setup sync.
2022-04-11 13:42:03 +00:00
fe3a57976e support linearizable renew lease
When etcdserver receives a LeaseRenew request, it may be still in
progress of processing the LeaseGrantRequest on exact the same
leaseID. Accordingly it may return a TTL=0 to client due to the
leaseID not found error. So the leader should wait for the appliedID
to be available before processing client requests.
2022-04-10 14:44:55 +08:00
12bdd1c5e4 Merge pull request #13910 from serathius/crypto
*: update golang.org/x/crypto
2022-04-09 09:41:49 +02:00
1bb59adb1e *: update golang.org/x/crypto 2022-04-08 16:27:52 +02:00
05e6527d26 Merge pull request #13756 from serathius/test-snapshot
tests: Add tests for snapshot compatibility and recovery between versions
2022-04-08 14:14:19 +02:00
7cc00ec981 tests/framework/integration: Fail nesting early
Currently there are a handful of tests within etcd that silently fail
because LeakDetection will skip the test before it manages to hit this
check.

Here we move the check to the beginning of the process to highlight
these cases earlier, and to avoid them accidentally presenting as leaks.
2022-04-07 13:10:15 +00:00
f0f77fc14e go.mod: Bump prometheus/client_golang to v1.12.1
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-04-06 19:03:24 +02:00
3ffa253516 tests: Add tests for snapshot compatibility and recovery between versions 2022-04-06 16:10:38 +02:00
c4d055fe7b Merge pull request #13819 from endocrimes/dani/auth_test.go
migrate e2e/users tests to common framework
2022-04-06 16:02:46 +02:00
73fc864247 tests: Pass logger to backend 2022-04-05 15:53:38 +02:00
f71196d113 tests/common/lease: Wait for correct lease list response
We don't consistently reach the same etcd server during the lifetime of
a test and in some cases, this means that this test will flake if an
etcd server was slow to update its state and the test hits the outdated
server.

Here we switch to using an `Eventually` case which will wait upto a
second for the expected result before failing - with a 10ms gap between
invocations.

```
[tests(dani/leasefix)] $ gotestsum -- ./common -tags integration -count 100 -timeout 15m -run TestLeaseGrantAndList
✓  common (2m26.71s)

DONE 1600 tests in 147.258s
```
2022-04-04 15:43:17 +02:00
6c974a3e31 Merge pull request #13867 from serathius/logs-test
tests: Use zaptest.NewLogger in tests
2022-04-04 14:47:04 +02:00
804fddf921 tests: Use zaptest.NewLogger in tests 2022-04-04 13:03:15 +02:00
d4dcd3061d Fix flakes in TestV3LeaseCheckpoint/Checkpointing_disabled,_lease_TTL_is_reset
I think strong (not-equal) relationship was too restrictive when expressed with 1s granularity.

```
        logger.go:130: 2022-04-03T22:15:15.242+0200	WARN	m1	leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk	{"member": "m1", "to": "cb785755eb80ac1", "heartbeat-interval": "10ms", "expected-duration": "20ms", "exceeded-duration": "24.666613ms"}
        logger.go:130: 2022-04-03T22:15:15.262+0200	INFO	m-1	published local member to cluster through raft	{"member": "m-1", "local-member-id": "e2dd9f523aa7be87", "local-member-attributes": "{Name:m-1 ClientURLs:[unix://127.0.0.1:2196386040]}", "cluster-id": "b4b8e7e41c23c8b5", "publish-timeout": "5.2s"}
        v3_lease_test.go:415: Expected lease ttl (4m58s) to be greather than (4m58s)
```
2022-04-03 23:13:01 +02:00
ed1bc447c7 Flakes: Additional logging and timeouts to understand common flakes. 2022-04-03 14:48:36 +02:00
68f2cb8c77 Fix ExampleAuth from integration/clientv3/examples (on OsX)
The code now ensures that each of the test is running in its own directory as opposed to shared os.tempdir.
```
$  (cd tests && env go test -timeout=15m --race go.etcd.io/etcd/tests/v3/integration/clientv3/examples -run ExampleAuth)
2022/04/03 10:24:59 Running tests (examples): ...
2022/04/03 10:24:59 the function can be called only in the test context. Was integration.BeforeTest() called ?
2022/04/03 10:24:59 2022-04-03T10:24:59.462+0200	INFO	m0	LISTEN GRPC	{"member": "m0", "grpcAddr": "localhost:m0", "m.Name": "m0"}
```
2022-04-03 14:16:45 +02:00
d57f8dba62 Deflaking: Make WaitLeader (and WaitMembersForLeader) aggressively (30s) wait for leader being established.
Nearly none of the tests was checking the value... just assuming WaitLeader success.

```
    maintenance_test.go:277: Waiting for leader...
    logger.go:130: 2022-04-03T08:01:09.914+0200	INFO	m0	cluster version differs from storage version.	{"member": "m0", "cluster-version": "3.6.0", "storage-version": "3.5.0"}
    logger.go:130: 2022-04-03T08:01:09.915+0200	WARN	m0	leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk	{"member": "m0", "to": "2acc3d3b521981", "heartbeat-interval": "10ms", "expected-duration": "20ms", "exceeded-duration": "103.756219ms"}
    logger.go:130: 2022-04-03T08:01:09.916+0200	INFO	m0	updated storage version	{"member": "m0", "new-storage-version": "3.6.0"}
    ...
    logger.go:130: 2022-04-03T08:01:09.926+0200	INFO	grpc	[[roundrobin] roundrobinPicker: Build called with info: {map[0xc002630ac0:{{unix:localhost:m0 localhost <nil> 0 <nil>}} 0xc002630af0:{{unix:localhost:m1 localhost <nil> 0 <nil>}} 0xc002630b20:{{unix:localhost:m2 localhost <nil> 0 <nil>}}]}]
    logger.go:130: 2022-04-03T08:01:09.926+0200	WARN	m0	apply request took too long	{"member": "m0", "took": "114.661766ms", "expected-duration": "100ms", "prefix": "", "request": "header:<ID:12658633312866157316 > cluster_version_set:<ver:\"3.6.0\" > ", "response": ""}
    logger.go:130: 2022-04-03T08:01:09.927+0200	INFO	m0	cluster version is updated	{"member": "m0", "cluster-version": "3.6"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m2.raft	9f96af25a04e2ec3 [logterm: 2, index: 8, vote: 9903a56eaf96afac] ignored MsgVote from 2acc3d3b521981 [logterm: 2, index: 8] at term 2: lease is not expired (remaining ticks: 10)	{"member": "m2"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	9903a56eaf96afac [logterm: 2, index: 8, vote: 9903a56eaf96afac] ignored MsgVote from 2acc3d3b521981 [logterm: 2, index: 8] at term 2: lease is not expired (remaining ticks: 5)	{"member": "m0"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	9903a56eaf96afac [term: 2] received a MsgAppResp message with higher term from 2acc3d3b521981 [term: 3]	{"member": "m0"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	9903a56eaf96afac became follower at term 3	{"member": "m0"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	raft.node: 9903a56eaf96afac lost leader 9903a56eaf96afac at term 3	{"member": "m0"}
    maintenance_test.go:279: Leader established.
```

Tmp
2022-04-03 12:23:09 +02:00
2fab3f3ae5 Make naming of test-nodes consistent and positive: m0, m1, m2
The nodes used to be named: m-1, m0, m1, that was generating very confusing logs
in integration tests.
2022-04-03 09:16:55 +02:00
f85cd0296f Merge pull request #13872 from ptabor/20220402-osx-unit-test-pass
Fix TestauthTokenBundleOnOverwrite on OsX:
2022-04-02 20:03:38 +02:00
8cd8a1ea10 Flakes in integration/clientv3/examples/...
The tests sometimes flaked due to already existing socket-files.
Now each execution works in a tempoarary directory.
2022-04-02 16:16:25 +02:00
3b589fb3b2 Fix TestauthTokenBundleOnOverwrite on OsX:
```
% (cd client/v3 && env go test -short -timeout=3m --race ./...)
--- FAIL: TestAuthTokenBundleNoOverwrite (0.00s)
    client_test.go:210: listen unix /var/folders/t1/3m8z9xz93t9c3vpt7zyzjm6w00374n/T/TestAuthTokenBundleNoOverwrite3197524989/001/etcd-auth-test:0: bind: invalid argument
FAIL
FAIL	go.etcd.io/etcd/client/v3	4.270s
```

The reason was that the path exceeded 108 chars (that is too much for socket).
In the mitigation we first change chroot (working directory) to the tempDir... such the path is 'local'.
2022-04-02 16:12:02 +02:00
63346bfead server: Use default logging configuration instead of zap production one
This fixes problem where logs json changes format of timestamp.
2022-04-01 10:23:42 +02:00
e5bf23037a tests: Keeps log in expect to allow their analysis 2022-03-31 21:02:36 +02:00
27e222e2d7 Merge pull request #13802 from yankay/fix-the-api-dependency-in-pkg-and-update-cobra-to-1.4.0
Fix the etcd api dependency in pkg. And Update Cobra Version to1.4.0
2022-03-28 10:40:24 +02:00
dcc226491f Merge pull request #13836 from kkkkun/set-etcdutl-default
test: set etcdutl to default
2022-03-25 20:13:21 -04:00
afecd3139c fix the api dependency in pkg, and update cobra to 1.4.0
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2022-03-25 17:18:56 +08:00
62641d3385 set etcdutl to default 2022-03-24 16:20:28 +08:00
c544b2a2a5 Update go to 1.17.8 2022-03-23 20:11:12 +01:00
3254125e6c Merge pull request #13809 from kkkkun/cleanup-deprecated-commands
Removing deprecated commands in etcdctl & etcdutl
2022-03-23 11:21:01 -04:00
1649c9dfc2 Removing deprecated commands in etcdctl & etcdutl 2022-03-23 11:05:52 +08:00
5fe4d551df Merge pull request #13754 from serathius/common-noquorum
tests: Migrate noquorum kv tests to common framework
2022-03-22 10:54:38 +01:00