Commit Graph

66 Commits

Author SHA1 Message Date
e90504fe62 Unify shared code (and constants) with cindex package. 2021-04-29 11:51:25 +02:00
f53b70facb Embed: In case KVStoreHash verification fails, close the backend.
In case of failed verification, the server used to keep opened backend
(so the file was locked on OS level).
2021-04-29 11:51:25 +02:00
911204cd76 Fix ETCDCTL_API=2 etcdctl backup --with-v3 consistent index consistency
Prior to this CL, `ETCDCTL_API=2 etcdctl backup --with-v3` was readacting WAL log
(by removal of some entries), but was NOT updating consistent_index in the backend.
Also the WAL editing logic was buggy, as it didn't took in consideration the fact
that when TERM changes, there can be entries with duplicated indexes in
the log. So its NOT sufficient to subtract number of removed entries to
get accurate log indexes.

The PR replaces removing and shifting of WAL entries with replacing them with an no-op entries.
Thanks to this consistent-index references are staying up to date.

The PR also:
  - updates 'verification' logic to check whether consistent_index does not lag befor last snapshot
  - env-gated execution of verification framework in `etcdctl backup`.

Tested with:
```
(./build.sh && cd tests && EXPECT_DEBUG=TRUE 'env' 'go' 'test' '-timeout=300m' 'go.etcd.io/etcd/tests/v3/e2e' -run=TestCtlV2Backup --count=1000 2>&1 | tee TestCtlV2BackupV3.log)
```
2021-04-29 11:51:24 +02:00
067521981e v2 etcdctl backup: producing consistent state of membership 2021-04-27 19:34:34 +02:00
a70386a1a4 Simplify membership interface: Does not pass the 'unused' token. 2021-04-27 17:17:31 +02:00
7ae3d25f91 Membership: Add additional methods to trim/manage membership data in backend. 2021-04-27 17:17:31 +02:00
768da490ed sever: v2store deprecation: Fix etcdctl snapshot restore to restore
correct 'backend' (bbolt) context in aspect of membership.

Prior to this change the 'restored' backend used to still contain:
  - old memberid (mvcc deletion used, why the membership is in bolt
bucket, but not mvcc part):
    ```
	mvs := mvcc.NewStore(s.lg, be, lessor, ci, mvcc.StoreConfig{CompactionBatchLimit: math.MaxInt32})
	defer mvs.Close()
	txn := mvs.Write(traceutil.TODO())
	btx := be.BatchTx()
	del := func(k, v []byte) error {
		txn.DeleteRange(k, nil)
		return nil
	}

	// delete stored members from old cluster since using new members
	btx.UnsafeForEach([]byte("members"), del)
    ```
  - didn't get new members added.
2021-04-27 17:17:30 +02:00
9a4b2bdccc Errors: context cancelled or context deadline exceeded are exposed as codes.Canceled, codes.DeadlineExceeded instead of 'codes.Unknown' 2021-04-22 14:35:24 +02:00
d7d110b5a8 mvcc/backend tests: Refactor: Do not mix testing&prod code. 2021-04-21 09:43:13 +02:00
ea287dd9f8 Merge pull request #12854 from ptabor/20210410-shouldApplyV3
(no)StoreV2 (Part 3): Applying consistency fix: ClusterVersionSet (and co) might get not applied on v2store
2021-04-21 09:31:38 +02:00
0f2c940f64 Merge pull request #12880 from chaochn47/exclude_alarms_from_health_check
etcdhttp/metrics.go: exclude alarms from health check conditionally with `?exclude=NOSPACE`
2021-04-20 21:18:15 -04:00
140ea4fa29 etcdhttp/metrics.go: exclude alarms from health check conditionally with ?exclude=NOSPACE 2021-04-20 13:17:09 -07:00
06ba0fc5a2 Merge pull request #12846 from pyiyun/fix-snaptmpfile-bug
etcdserver: remove temp files in snap dir when etcdserver starting
2021-04-17 12:58:46 +02:00
80586c5b47 etcdserver/util.go: reduce memory when logging range requests (#12871)
* etcdserver/util.go: reduce memory when logging range requests

Fixes #12835

* Update CHANGELOG-3.5.md
2021-04-16 16:39:34 -07:00
17b982382e Fix TestSnapshotV3RestoreMultiMemberAdd flakes (leaks)
- most important: unix's socket transport should not keep idle
connections. For top-level Transport we close them using:
    f3c518025e/server/etcdserver/api/rafthttp/transport.go (L226)
    but currently we don't have access to close them witing the nest (unix) transport. Short idle deadline is good enough.

  - Use dialContext (instead of dial) to make sure context is passed down the stack
  - Make sure Context is cancelled as soon as the operation is done in pipeline
  - nit: use dedicated method to yeld goroutines.

Tested with:
```
d=$(date +"%Y%m%d_%H%M")
(cd tests && go test --timeout=60m ./integration/snapshot -run TestSnapshotV3RestoreMultiMemberAdd -v --count=180 2>&1 | tee log_${d}.log)
```

There were transports & cmux leaked:

```
   leak.go:118: Test appears to have leaked a Transport:
        internal/poll.runtime_pollWait(0x7f6c5c3784c8, 0x72, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
        internal/poll.(*pollDesc).wait(0xc003296298, 0x72, 0x0, 0x18, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
        internal/poll.(*pollDesc).waitRead(...)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
        internal/poll.(*FD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
        net.(*netFD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x18, 0xc0009056e2, 0x203000)
        	/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
        net.(*conn).Read(0xc000010258, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/net/net.go:183 +0x91
        github.com/soheilhy/cmux.(*bufferedReader).Read(0xc0003d24e0, 0xc0031f60a8, 0x18, 0x18, 0xc0003d24d0, 0xc0009056e2, 0xc000278400)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/buffer.go:53 +0x12d
        github.com/soheilhy/cmux.hasHTTP2Preface(0x1367e20, 0xc0003d24e0, 0x7f6c5c699f40)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/matchers.go:195 +0x8a
        github.com/soheilhy/cmux.matchersToMatchWriters.func1(0x7f6c5c699f40, 0xc000010258, 0x1367e20, 0xc0003d24e0, 0xc000010258)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:128 +0x39
        github.com/soheilhy/cmux.(*cMux).serve(0xc003228690, 0x138c410, 0xc000010258, 0xc00327f740, 0xc0059ba860)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:192 +0x1e7
        created by github.com/soheilhy/cmux.(*cMux).Serve
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:179 +0x191

        internal/poll.runtime_pollWait(0x7f6c5c60f3f0, 0x72, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
        internal/poll.(*pollDesc).wait(0xc000d53018, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
        internal/poll.(*pollDesc).waitRead(...)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
        internal/poll.(*FD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
        net.(*netFD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x3, 0x3, 0x1000000000001)
        	/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
        net.(*conn).Read(0xc00031a570, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/net/net.go:183 +0x91
        net/http.(*persistConn).Read(0xc00093b320, 0xc000cfd000, 0x1000, 0x1000, 0x577750, 0x60, 0x0)
        	/usr/lib/google-golang/src/net/http/transport.go:1933 +0x77
        bufio.(*Reader).fill(0xc005702fc0)
        	/usr/lib/google-golang/src/bufio/bufio.go:101 +0x108
        bufio.(*Reader).Peek(0xc005702fc0, 0x1, 0xc00077c660, 0xc003b082a0, 0xc000d08de0, 0x5ae586, 0x11dd6c0)
        	/usr/lib/google-golang/src/bufio/bufio.go:139 +0x4f
        net/http.(*persistConn).readLoop(0xc00093b320)
        	/usr/lib/google-golang/src/net/http/transport.go:2094 +0x1a8
        created by net/http.(*Transport).dialConn
        	/usr/lib/google-golang/src/net/http/transport.go:1754 +0xdaa

        net/http.(*persistConn).writeLoop(0xc00093b320)
        	/usr/lib/google-golang/src/net/http/transport.go:2393 +0xf7
        created by net/http.(*Transport).dialConn
        	/usr/lib/google-golang/src/net/http/transport.go:1755 +0xdcf

        sync.runtime_Semacquire(0xc0059ba868)
        	/usr/lib/google-golang/src/runtime/sema.go:56 +0x45
        sync.(*WaitGroup).Wait(0xc0059ba860)
        	/usr/lib/google-golang/src/sync/waitgroup.go:130 +0x65
        github.com/soheilhy/cmux.(*cMux).Serve.func1(0xc003228690, 0xc0059ba860)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:158 +0x56
        github.com/soheilhy/cmux.(*cMux).Serve(0xc003228690, 0x13698c0, 0xc00377a0f0)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:173 +0x115
        go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func1(0xc0007cc360, 0x122b75f)
        	/home/ptab/corp/etcd/server/embed/etcd.go:518 +0x2b9
        go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func3(0xc00036d080, 0xc0059330a0)
        	/home/ptab/corp/etcd/server/embed/etcd.go:549 +0x182
        created by go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers
        	/home/ptab/corp/etcd/server/embed/etcd.go:543 +0x73a
--- FAIL: TestSnapshotV3RestoreMultiMemberAdd (17.74s)
```
2021-04-16 20:17:28 +02:00
28a490b09c etcdserver: remove temp files in snap dir when etcdServer starting
When etcd exits abnormally, tmp files will remain in snap dir, so clean up tmp files in snap dir when etcdserver starting.

Fixes #12837
2021-04-16 20:30:04 +08:00
d69e46ea47 Make ShouldApplyV3 an enum - not bool 2021-04-13 23:01:03 +02:00
b1c04ce043 Applying consistency fix: ClusterVersionSet (and co) might get no applied on v2store
ClusterVersionSet, ClusterMemberAttrSet, DowngradeInfoSet functions are
writing both to V2store and backend. Prior this CL there were
in a branch not executed if shouldApplyV3 was false,
e.g. during restore when Backend is up-to-date (has high
consistency-index) while v2store requires replay from WAL log.

The most serious consequence of this bug was that v2store after restore
could have different index (revision) than the same exact store before restore,
so potentially different content between replicas.

Also this change is supressing double-applying of Membership
(ClusterConfig) changes on Backend (store v3) - that lackilly are not
part of MVCC/KeyValue store, so they didn't caused Revisions to be
bumped.

Inspired by jingyih@ comment:
https://github.com/etcd-io/etcd/pull/12820#issuecomment-815299406
2021-04-12 09:43:48 +02:00
3991a8c9fa etcdserver: replace forceVersionC with FirstCommitInTermNotify 2021-04-09 11:30:42 +02:00
3d485faac5 etcdserver: resend ReadIndex request on empty apply request
Empty apply indicates first commit in current term. It is first time when follower is sure, that it's ReadIndex request can be processed.
2021-04-09 11:30:42 +02:00
cc0f812f51 server_test.go: Use context.Background() instead of TODO in tests.
Suggected in: https://golang.org/pkg/context/#Background
2021-04-08 01:15:16 +02:00
931af493cf Merge pull request #12830 from ptabor/20210405-split-pkg
Split client/pkg as dedicated low-dependencies module for client
2021-04-08 01:12:17 +02:00
3bb7acc8cf Migrate dependencies pkg/foo -> client/pkg/foo 2021-04-07 00:38:47 +02:00
55ccbe62a2 membership/cluster_test: Use zaptest logger. 2021-03-26 13:54:59 +01:00
03f55eeb2c Make NewTmpBackend use testing tmp location (so cleanup). 2021-03-26 13:54:55 +01:00
4d4c84e014 server: Written Snapshot's to WAL contains populated ConfState.
This will (among others) allow etcd-3.6 to not depend on store_v2 .snap files at all,
as WAL + db file will be self-sufficient.
2021-03-25 00:31:44 +01:00
7d7a9c6f23 codec.go: should use google runtime
golang-proto runtime can deal with both: gogo & golang generated
protobufs. It does not work vice-versa with protobuf-1.5.1.
2021-03-24 22:06:47 +01:00
19f7c6ef3e *: Update gogo/protobuf to v1.3.2, rerun ./scripts/genproto.sh
While it appears that etcd is not vulnerable to CVE-2021-3121,
it is a good idea to update to the new generator so that new
vulnerable code isn't generated in any future APIs. Also, this
lays the issue to rest of whether there is any issue with
etcd and CVE-2021-3121.
2021-03-23 11:48:06 -06:00
30ce6067da Merge pull request #12780 from wpedrak/read_index_retry
Read index retry
2021-03-23 13:47:41 +01:00
dac6e37ea1 *: over 20 staticcheck fixes 2021-03-18 15:06:17 +01:00
a84bd093b0 Integration with grpc-settable logger. 2021-03-16 22:50:41 +01:00
e9779231ec server: add 500ms retries to ReadIndex requests for l-reads
It is second approach (with first being #12762) to solve #12680
2021-03-16 16:34:15 +01:00
4b21e38381 refactored l-read loop in v3_server.go 2021-03-16 11:03:45 +01:00
1e7c1805d8 Unify logic of building raft-loggers for etcd.
1. We had the same code copied 3 times.
2. For no good reason the code was not reusing existing logger if this one is given.
2021-03-14 16:02:50 +01:00
44bd22307e Merge get_logger() & Logger() method. 2021-03-14 14:05:17 +01:00
fd7fed1511 Move config (ServerConfig) out of etcdserver package.
Motivation:
  - ServerConfig is part of 'embed' public API, while etcdserver is more 'internal'
  - EtcdServer is already too big and config is pretty wide-spread leaf
if we were to split etcdserver (e.g. into pre & post-apply part).
2021-03-11 20:56:22 +01:00
3ead91ca3e Merge pull request #12739 from LeoYang90/optimization_watch_prevkv
create event do not need prevkv range
2021-03-10 09:48:42 -08:00
fb1d48e98e Integration tests: Use BeforeTest(t) instead of defer AfterTest().
Thanks to this change, a single method BeforeTest(t) can handle
before-test logic as well as registration of cleanup code
(t.Cleanup(func)).
2021-03-09 18:19:51 +01:00
94a371acd7 Merge pull request #12750 from ptabor/20210306-mlock
--experimental-memory-mlock support
2021-03-09 09:13:40 -08:00
6fd85af641 Merge pull request #12702 from hexfusion/add-so
*: add support for socket options
2021-03-09 09:02:24 -08:00
5b49fb41c8 fixup: add ListenerOptions
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-03-08 11:27:03 -05:00
a46a358577 --experimental-memory-mlock support
The flag protects etcd memory from being swapped out to disk.
This can happen in memory constrained systems where mmaped bbolt
area is natural condidate for swapping out.

This flag should provide better tail latency on the cost of higher RSS
ram usage. If the experiment is successful, the logic should get moved
into bbolt layer, where we can protect specific bbolt instances
(e.g. avoid protecting both during defragmentation).
2021-03-07 12:32:57 +01:00
d70f35f8d1 create event do not need prevkv range 2021-03-02 17:43:24 +08:00
3d7aac948b Merge pull request #12196 from ironcladlou/metrics-watch-error-fix
etcdserver: fix incorrect metrics generated when clients cancel watches
2021-02-19 12:46:49 -08:00
49078c683b *: add support for socket options
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-02-19 13:31:23 -05:00
7e38cfcc8d raft: makes 'ConnReadTimeout/ConnWriteTimeout' customizable 2021-02-10 10:36:50 +07:00
2ae3e82f07 etcdserver/api/etcdhttp: log successful etcd server side health check in debug level
When we have an external component that checks /health periodically, the
etcd server logs can be quite verbose (e.g., DDOS-ing against insure
etcd health check can lead to disk space full due to large log files).

This change was introduced in #11704.

While we keep the warning logs for etcd health check failures, the
success (or OK) log level should be set to DEBUG.

Fixes #12676
2021-02-08 17:15:43 -08:00
6d82778a4e etcdserver: export method EtcdServer.leaderChangedNotify (#12378) 2021-02-02 18:13:32 +08:00
d6d03beaea Merge pull request #12538 from lzhfromustc/12_9_GoroutineLeak
test: change channel operations to avoid potential goroutine leaks
2021-02-01 21:16:43 +01:00
69e99e80fa Merge pull request #12465 from spacewander/fdoc
chore: update the documentation link in the comment
2021-01-14 00:39:25 -05:00